EuroMatrix

EuroMatrix was a research project that ran from September 2006 to February 2009. It aimed to develop and improve machine translation (MT) systems between all official languages of the European Union (EU).

EuroMatrix was followed by a successor project, EuroMatrixPlus, which ran from March 2009 to February 2012.

Approach to translation
EuroMatrix explored the use of linguistic knowledge in statistical machine translation. Statistical techniques were combined with rule-based approaches, resulting in a hybrid MT architecture. The project experimented with combining methods and resources from statistical MT, rule-based MT, shallow language processing, computational lexicography, and morphology.

Project objectives
EuroMatrix focused on high-quality translation for the publication of technical, social, legal and political documents. It applied advanced MT technologies to all pairs of EU languages; the languages of new and prospective EU member states were also taken into account.

Annual international evaluation
Annual international meetings built around a competitive evaluation of machine translation (“MT marathons”) were organized to bring together MT researchers. Participants translated test sets with their systems, and the resulting translations were then evaluated both manually and with automatic metrics.

MT marathons were multi-day events that combined a summer school, lab lessons, research talks, workshops, open source conventions, and research showcases.

Outcome
Several tools and resources were created or supported by the project:


 * Moses, an open source statistical machine translation engine
 * Europarl Corpus, version 3
 * Results from Workshops on Statistical Machine Translation (2007, 2008, 2009)
 * CzEng Corpus, version 0.7

Funding
The EuroMatrix project was sponsored by the EU Information Society Technology program.

The total cost of the project was 2 358 747 €, of which the European Union contributed 2 066 388 €.

Project members
The project brought together experienced, internationally recognized machine translation research groups and relevant industrial partners. The consortium included the University of Edinburgh (United Kingdom), Charles University (Czech Republic), Saarland University (Germany), the Center for the Evaluation of Language and Communication Technologies (Italy), MorphoLogic (Hungary), and GROUP Technologies AG (Germany).

The project was coordinated by Hans Uszkoreit, a professor of Computational Linguistics at Saarland University.