DARPA TIDES program

Translingual Information Detection, Extraction and Summarization (TIDES) is a technology development program funded by the U.S. Defense Advanced Research Projects Agency (DARPA), focused on the automated processing and understanding of language data. The primary goal of the program is to enable English speakers to locate and interpret required information quickly and effectively regardless of the original language.

Components
The four component capabilities of the technology being developed by TIDES are:
 * Detection – Locating required information.
 * Extraction – Pulling out key facts.
 * Summarization – Condensing the information to a readable length.
 * Translation – Converting text from another language into English.

Tools for detection, extraction, and summarization must work both within a language (monolingually) and across languages (translingually) so that they can be used by people who speak only English. In addition to developing the technology, TIDES is researching methods to adapt it quickly and cheaply to other languages, including languages with limited linguistic resources. TIDES aims to integrate the component capabilities with one another and with other technologies to produce tools for real-world applications.

Investigative Data Warehouse
The FBI's Investigative Data Warehouse includes an open-source news library of material gathered by the TIDES program. The news is collected from public websites around the world, including Ha'aretz, Pravda, the Jordan Times, The People's Daily, and The Washington Post. The library uses the MITRE Text and Audio Processing (MiTAP) system.