Monthly Archives: March 2017

Detection time from minutes to microseconds

Terahertz spectroscopy, which uses the band of electromagnetic radiation between microwaves and infrared light, is a promising security technology because it can extract the spectroscopic “fingerprints” of a wide range of materials, including chemicals used in explosives.

But traditional terahertz spectroscopy requires a radiation source that’s heavy and about the size of a large suitcase, and it takes 15 to 30 minutes to analyze a single sample, rendering it impractical for most applications.

In the latest issue of the journal Optica, researchers from MIT’s Research Laboratory of Electronics and their colleagues present a new terahertz spectroscopy system that uses a quantum cascade laser, a source of terahertz radiation that’s the size of a computer chip. The system can extract a material’s spectroscopic signature in just 100 microseconds.

The device is so efficient because it emits terahertz radiation in what’s known as a “frequency comb,” meaning a set of discrete frequencies that are perfectly evenly spaced.

“With this work, we answer the question, ‘What is the real application of quantum-cascade laser frequency combs?’” says Yang Yang, a graduate student in electrical engineering and computer science and first author on the new paper. “Terahertz is such a unique region that spectroscopy is probably the best application. And QCL-based frequency combs are a great candidate for spectroscopy.”

Different materials absorb different frequencies of terahertz radiation to different degrees, giving each of them a unique terahertz-absorption profile. Traditionally, however, terahertz spectroscopy has required measuring a material’s response to each frequency separately, a process that involves mechanically readjusting the spectroscopic apparatus. That’s why the method has been so time consuming.

Because the frequencies in a frequency comb are evenly spaced, however, it’s possible to mathematically reconstruct a material’s absorption fingerprint from just a few measurements, without any mechanical adjustments.
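As a toy illustration of why even spacing matters, the sketch below (plain Python, with an invented Lorentzian absorption line and made-up frequency values, not the actual device parameters) samples a material’s absorption at evenly spaced comb lines in one shot and reconstructs the full profile by interpolating between them, with no retuning step:

```python
# Toy illustration: reconstructing an absorption profile from
# measurements at evenly spaced frequency-comb lines.
# (Illustrative numbers only, not real QCL or material parameters.)

def absorption(f_thz):
    """Model absorption as a single Lorentzian line at 2.0 THz."""
    center, width = 2.0, 0.15
    return 1.0 / (1.0 + ((f_thz - center) / width) ** 2)

# One-shot "measurement" at evenly spaced comb lines, 1.0-3.0 THz, 0.05 apart
comb = [1.0 + 0.05 * i for i in range(41)]
measured = [absorption(f) for f in comb]

def interpolate(f, xs, ys):
    """Linear interpolation between neighboring comb lines."""
    for i in range(len(xs) - 1):
        if xs[i] <= f <= xs[i + 1]:
            t = (f - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + t * (ys[i + 1] - ys[i])
    raise ValueError("frequency outside comb range")

# Reconstruct the profile on a fine grid and locate the absorption peak
grid = [1.0 + 0.001 * i for i in range(2000)]
profile = [interpolate(f, comb, measured) for f in grid]
peak = grid[profile.index(max(profile))]
print(f"reconstructed absorption peak near {peak:.2f} THz")
```

Because every comb line is measured simultaneously and the spacing is uniform, locating the fingerprint reduces to arithmetic on one batch of readings rather than a mechanical sweep.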

Getting even

The trick is evening out the spacing in the comb. Quantum cascade lasers, like all electrically powered lasers, bounce electromagnetic radiation back and forth through a “gain medium” until the radiation has enough energy to escape. They emit radiation at multiple frequencies that are determined by the length of the gain medium.

But those frequencies are also dependent on the medium’s refractive index, which describes the speed at which electromagnetic radiation passes through it. And the refractive index varies for different frequencies, so the gaps between frequencies in the comb vary, too.
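The effect of dispersion on comb spacing can be sketched numerically. The snippet below uses the standard cavity-mode relation f = m·c / (2·n(f)·L) with an invented, mildly frequency-dependent refractive index (the numbers are illustrative, not real quantum-cascade-laser values) and shows that the gaps between adjacent modes come out unequal:

```python
# Toy cavity-mode calculation showing why dispersion makes comb
# teeth unevenly spaced. All values are illustrative.
C = 3.0e8        # speed of light (m/s)
L = 3.0e-3       # cavity length: 3 mm

def n(freq):
    """Hypothetical refractive index rising slightly with frequency."""
    return 3.6 + 1.0e-13 * freq

def mode_freq(m):
    """Solve f = m*c / (2*n(f)*L) by fixed-point iteration."""
    f = m * C / (2 * 3.6 * L)          # dispersionless starting guess
    for _ in range(50):
        f = m * C / (2 * n(f) * L)
    return f

modes = [mode_freq(m) for m in range(200, 205)]       # modes near 2.6 THz
spacings = [b - a for a, b in zip(modes, modes[1:])]
print(["%.4f GHz" % (s / 1e9) for s in spacings])     # gaps are not equal
```

With a constant refractive index the gaps would all equal c/(2nL); the frequency dependence of n is what skews them, and it is exactly this skew that comb engineering must cancel.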

Search engine enables English monolingual analysts to search foreign language documents

“About 6,000 languages are currently spoken in the world today,” says Elizabeth Salesky of MIT Lincoln Laboratory’s Human Language Technology (HLT) Group. “Within the law enforcement community, there are not enough multilingual analysts who possess the necessary level of proficiency to understand and analyze content across these languages,” she continues.

This problem of too many languages and too few specialized analysts is one Salesky and her colleagues are now working to solve for law enforcement agencies, but their work has potential application for the Department of Defense and Intelligence Community. The research team is taking advantage of major advances in language recognition, speaker recognition, speech recognition, machine translation, and information retrieval to automate language processing tasks so that the limited number of linguists available for analyzing text and spoken foreign languages can be used more efficiently. “With HLT, an equivalent of 20 times more foreign language analysts are at your disposal,” says Salesky.

One area in which Lincoln Laboratory researchers are focusing their efforts is cross-language information retrieval (CLIR). The Cross-LAnguage Search Engine, or CLASE, is a CLIR tool developed by the HLT Group for the Federal Bureau of Investigation (FBI). CLASE is a fusion of laboratory research in language identification, machine translation, information retrieval, and query-biased summarization. CLASE enables English monolingual analysts to help search for and filter foreign language documents — tasks that have traditionally been restricted to foreign language analysts.

Laboratory researchers considered three algorithmic approaches to CLIR that have emerged in the HLT research community: query translation, document translation, and probabilistic CLIR. In query translation, an English-speaking analyst queries foreign language documents for an English phrase; that query is translated into a foreign language via machine translation. The most relevant foreign language documents containing the translated query are then translated into English and returned to the analyst. In document translation, foreign language documents are translated into English; an analyst then queries the translated documents for an English phrase, and the most relevant documents are returned to the analyst. Probabilistic CLIR, the approach that researchers within the HLT Group are taking, is based on machine translation lattices (graphs in which edges connect related translations).
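To make the query-translation approach concrete, here is a minimal sketch in Python. The translation dictionary, documents, and scoring are invented for illustration; a real system would use statistical machine translation and a proper retrieval model rather than term counting:

```python
# Toy sketch of the query-translation approach to CLIR.
# Dictionary and documents are invented for illustration only.
translation = {"explosive": "explosif", "detection": "détection"}

foreign_docs = {
    "doc1": "rapport sur la détection d'un explosif dans le métro",
    "doc2": "prévisions météo pour la semaine prochaine",
}

def query_translation_search(english_query):
    # 1. Translate the English query into the foreign language.
    foreign_terms = [translation.get(w, w) for w in english_query.lower().split()]
    # 2. Rank foreign documents by how many translated terms they contain.
    scored = []
    for doc_id, text in foreign_docs.items():
        score = sum(term in text for term in foreign_terms)
        if score:
            scored.append((score, doc_id))
    # 3. Return the most relevant documents; these would then be
    #    machine-translated into English for the analyst.
    return [doc_id for _, doc_id in sorted(scored, reverse=True)]

print(query_translation_search("explosive detection"))  # → ['doc1']
```

Document translation inverts steps 1 and 3 (translate the whole collection once, then search in English), while probabilistic CLIR keeps multiple candidate translations alive in a lattice instead of committing to one.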

Making symbolic execution practical for programs that import huge swaths of code

Symbolic execution is a powerful software-analysis tool that can be used to automatically locate and even repair programming bugs. Essentially, it traces out every path that a program’s execution might take.

But it tends not to work well with applications written using today’s programming frameworks. An application might consist of only 1,000 lines of new code, but it will generally import functions — such as those that handle virtual buttons — from a programming framework, which includes huge libraries of frequently reused code. The additional burden of evaluating the imported code makes symbolic execution prohibitively time consuming.
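The core idea of tracing every path can be shown in miniature. The sketch below, a deliberately simplified stand-in for a real symbolic-execution engine, enumerates every branch decision of a three-path toy program, records each path’s constraint on the symbolic input, and uses brute force in place of a real constraint solver to find a witness input for each path:

```python
# Minimal sketch of symbolic execution for a toy program:
#
#     def classify(x):
#         if x > 10:
#             if x % 2 == 0:
#                 return "big-even"
#             return "big-odd"
#         return "small"
#
# Instead of running on one concrete input, enumerate every branch
# decision, collecting the path constraint for each path.
paths = []
for gt10 in (True, False):
    if gt10:
        for even in (True, False):
            paths.append(([("x > 10", True), ("x % 2 == 0", even)],
                          "big-even" if even else "big-odd"))
    else:
        paths.append(([("x > 10", False)], "small"))

def solve(constraints):
    """Tiny 'solver': brute-force a witness input satisfying the path."""
    for x in range(100):
        if all(eval(cond, {"x": x}) == want for cond, want in constraints):
            return x
    return None

for constraints, result in paths:
    print(f"path -> {result!r}, example input: {solve(constraints)}")
```

Three paths is trivial; the trouble described above is that every framework call the application makes drags thousands of additional branches into this enumeration, and the path count grows combinatorially.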

Computer scientists address this problem by creating simple models of the imported libraries, which describe their interactions with new programs but don’t require line-by-line evaluation of their code. Building the models, however, is labor-intensive and error-prone, and the models require regular updates, as programming frameworks are constantly evolving.
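A rough sense of what such a model looks like: instead of analyzing a framework’s real button class (rendering, event loops, native calls), the analyzer substitutes a tiny stand-in that captures only the behavior application code can observe. The class and method names below are invented for illustration, not taken from any real framework:

```python
# Sketch of a hand-written framework "model": a stand-in for a GUI
# button that does no drawing, only callback bookkeeping.

class ButtonModel:
    """Model of a framework button, reduced to observable behavior."""
    def __init__(self, label):
        self.label = label
        self.on_click = None

    def set_on_click(self, callback):
        # The real framework would register this with an event loop;
        # the model simply remembers it.
        self.on_click = callback

    def simulate_click(self):
        # Hook for the analyzer: explore what happens when the
        # environment delivers a click event.
        if self.on_click is not None:
            self.on_click()

# Application code under analysis
clicks = []
button = ButtonModel("Submit")
button.set_on_click(lambda: clicks.append("submitted"))

# The analyzer can now drive the callback without executing any
# framework internals.
button.simulate_click()
print(clicks)  # → ['submitted']
```

Writing such stand-ins by hand for an entire framework, and keeping them current as the framework changes, is the burden the new system aims to automate away.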

Researchers at MIT’s Computer Science and Artificial Intelligence Laboratory, working with colleagues at the University of Maryland, have taken an important step toward enabling symbolic execution of applications written using programming frameworks, with a system that automatically constructs models of framework libraries.

The researchers compared a model generated by their system with a widely used model of Java’s standard library of graphical-user-interface components, which had been laboriously constructed over a period of years. They found that their new model plugged several holes in the hand-coded one.

They described their results in a paper they presented last week at the International Conference on Software Engineering. Their work was funded by the National Science Foundation’s Expeditions Program.

“Forty years ago, if you wanted to write a program, you went in, you wrote the code, and basically all the code you wrote was the code that executed,” says Armando Solar-Lezama, an associate professor of electrical engineering and computer science at MIT, whose group led the new work. “But today, if you want to write a program, you go and bring in these huge frameworks and these huge pieces of functionality that you then glue together, and you write a little code to get them to interact with each other. If you don’t understand what that big framework is doing, you’re not even going to know where your program is going to start executing.”