The Translingual Information Detection, Extraction and Summarization (TIDES) program is developing advanced language processing technology to enable English speakers to find and interpret critical information in multiple languages without requiring knowledge of those languages.
Conduct research to develop effective algorithms for detection, extraction, summarization, and translation, where the source data may be large volumes of naturally occurring speech or text in multiple languages.
Measure accuracy in rigorous, objective evaluations. Outside groups are invited to participate in the annual Information Retrieval, Topic Detection and Tracking, Automatic Content Extraction, and Machine Translation evaluations run by NIST.
Integrate core capabilities to form effective text and audio processing (TAP) systems. Experiment with those systems on real data with real users, then refine and iterate.