User:Noelo121/Sandbox

Fuzzy Matching

“Fuzzy matching” is a technique used in localisation to speed up the translation process. It is a feature of translation memory (TM) tools: when no exact match can be found in the TM database for a piece of text (usually a sentence), the tool offers matches that correspond to that text by less than 100%. The translator sets the threshold beforehand: for example, with a threshold of 60%, the database returns every stored segment that matches the text to be translated by at least 60%.
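The threshold behaviour described above can be illustrated with a minimal Python sketch. It is only an approximation under stated assumptions: the similarity score comes from the standard library's difflib.SequenceMatcher, which stands in for whatever scoring a real TM tool uses, and the function names (similarity, fuzzy_lookup) and the small English-French memory are hypothetical examples, not taken from any actual product.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0-100 similarity score between two segments.

    Assumption: a character-based ratio, ignoring case. Real TM tools
    use their own (often word-based) scoring.
    """
    return 100.0 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fuzzy_lookup(segment: str, memory: dict, threshold: float = 60.0) -> list:
    """Return (score, source, target) tuples scoring at or above the
    threshold, best match first."""
    hits = []
    for source, target in memory.items():
        score = similarity(segment, source)
        if score >= threshold:
            hits.append((score, source, target))
    return sorted(hits, key=lambda hit: hit[0], reverse=True)


# Hypothetical translation memory: source segment -> stored translation.
memory = {
    "Click the Save button.": "Cliquez sur le bouton Enregistrer.",
    "Click the Cancel button.": "Cliquez sur le bouton Annuler.",
    "The file could not be opened.": "Le fichier n'a pas pu être ouvert.",
}

# Print every proposal for a new segment that has no exact match.
for score, source, target in fuzzy_lookup("Click the Save button now.", memory):
    print(f"{score:.0f}%  {source}  ->  {target}")
```

Each proposal pairs a stored source segment with its existing translation, which the translator can then accept or edit. Raising the threshold returns fewer but closer matches; lowering it returns more proposals that need heavier editing.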

Background:

Because of the polymorphous and dynamic nature of language, particularly English (which accounts for 90% of all source texts undergoing translation in the localisation industry), methods are always being sought to make the translation process easier and faster. Since the late 1980s, translation memory tools have been developed to increase the translator's productivity. Fuzzy matching is now a feature of most popular TM tools. Its main function is to assist the translator (not, as many people mistakenly think of TMs, to replace the translator) and to maintain a level of consistency in the process.

A fuzzy match is a partial match: a match of less than 100% between a new segment and a segment stored in the TM. The TM tool searches the database to locate segments that approximately match a segment in the new source text. The TM, in effect, “proposes” the match to the translator; it is then up to the translator to accept the proposal as it stands or to edit it so that it corresponds more fully to the source text being translated. In this way, fuzzy matching can speed up the translation process and increase productivity.

This raises questions about the quality of the resulting translations. A translator under pressure to deliver on time may feel almost compelled to accept a fuzzy match proposal without checking its suitability and context. TM databases are also built up from the input of numerous translators working on a variety of texts. There is a danger that sentences extracted from this word “tapestry” will form a stitched-together hodgepodge of styles, the antithesis of the consistency being sought, which some critics have dubbed “sentence salad”. How much faith to place in the TM’s proposals is therefore a real problem when trying to strike a balance between a faster translation process and the quality of the translation.