User:PK2/Google Translate 2

Google Translate is a free multilingual machine translation service developed by Google to translate text. It offers a website interface, a mobile app for Android and iOS, and an application programming interface that helps developers build browser extensions and software applications. As of July 2024, Google Translate supports languages at various levels and claims over 500 million total users, with more than 100 billion words translated daily.

Launched in April 2006 as a statistical machine translation service, it used United Nations and European Parliament transcripts to gather linguistic data. Rather than translating languages directly, it first translates text to English and then pivots to the target language in most of the language combinations it offers, with a few exceptions including Catalan-Spanish. During a translation, it looks for patterns in millions of documents to help decide which words to choose and how to arrange them in the target language. Its accuracy, which has been criticized and ridiculed on several occasions, has been measured to vary greatly across languages. In November 2016, Google announced that Google Translate would switch to a neural machine translation engine - Google Neural Machine Translation (GNMT) - which translates "whole sentences at a time, rather than just piece by piece. It uses this broader context to help it figure out the most relevant translation, which it then rearranges and adjusts to be more like a human speaking with proper grammar". Originally only enabled for a few languages in 2016, GNMT is used in all languages in the Google Translate roster as of 2024, except for Kyrgyz, Latin, and pairs from Belarusian, Maltese and Sundanese to other languages.

History
Google Translate is a complimentary translation service launched by Google in April 2006. It translates multiple forms of text and media, such as words, phrases and webpages.

Originally Google Translate was released as a statistical machine translation service. Translating the required text into English before translating into the selected language was a mandatory step that it had to take. Since SMT uses predictive algorithms to translate text, it had poor grammatical accuracy. However, Google initially did not hire experts to resolve this limitation due to the ever-evolving nature of language.

In January 2010, Google introduced an Android app, followed by an iOS version in February 2011, to serve as a portable personal interpreter. As of February 2010, it was integrated into browsers such as Chrome and was able to pronounce text, automatically recognize words in pictures, and detect unfamiliar text and languages.

In May 2014, Google acquired Word Lens to improve the quality of visual and voice translation. It is able to scan text or a picture with one's device and have it translated instantly. Moreover, the system automatically identifies foreign languages and translates speech without requiring individuals to tap the microphone button whenever speech translation is needed.

In November 2016, Google transitioned its translating method to a system called neural machine translation. It uses deep learning techniques to translate whole sentences at a time, which it has measured to be more accurate between English and French, German, Spanish, and Chinese. No measurement results have been provided by Google researchers for GNMT from English to other languages, other languages to English, or between language pairs that do not include English. As of 2018, it translates more than 100 billion words a day.

Functions
Google Translate can translate multiple forms of text and media, including text, speech, and text within still or moving images. Specifically, its functions include:


 * Written Words Translation: a function that translates written words or text to a foreign language.
 * Website Translation: a function that translates a whole webpage to selected languages.
 * Document Translation: a function that translates a document uploaded by the user to selected languages. The document should be in one of the following formats: .doc, .docx, .odf, .pdf, .ppt, .pptx, .ps, .rtf, .txt, .xls, .xlsx.
 * Speech Translation: a function that instantly translates spoken language into the selected foreign language.
 * Mobile App Translation: in 2018, Google Translate introduced a feature called “Tap to Translate,” which made instant translation accessible inside any app without exiting or switching apps.
 * Image Translation: a function that identifies text in a picture taken by the user and instantly translates it on the screen.
 * Handwritten Translation: a function that translates language that is handwritten on the phone screen or drawn on a virtual keyboard, without the support of a keyboard.

For most of its features, Google Translate provides pronunciation, dictionary information, and audio playback of the translation. Additionally, Google Translate has introduced its own Translate app, so translation is available on a mobile phone in offline mode.

Features
Google Translate produces approximations across languages of multiple forms of text and media, including text, speech, websites, or text on display in still or live video images. For some languages, Google Translate can synthesize speech from text, and in certain pairs it is possible to highlight specific corresponding words and phrases between the source and target text. Results are sometimes shown with dictionary-style information below the translation box, but it is not a dictionary and has been shown to invent translations in all languages for words it does not recognize. If "Detect language" is selected, text in an unknown language can be automatically identified. If a user enters a URL in the source text, Google Translate will produce a hyperlink to a machine translation of the website. Users can save translation proposals in a "phrasebook" for later use. For some languages, text can be entered via an on-screen keyboard, through handwriting recognition, or speech recognition.

Browser integration
Google Translate is available in some web browsers as an optional downloadable extension that can run the translation engine. In February 2010, Google Translate was integrated into the Google Chrome browser by default, for optional automatic webpage translation.

Mobile app
The Google Translate app for Android and iOS supports more than 100 languages and can propose translations for 37 languages via photo, 32 via voice in "conversation mode", and 27 via live video imagery in "augmented reality mode".

The Android app was released in January 2010, and the iOS app on February 8, 2011.

A January 2011 Android version experimented with a "Conversation Mode" that aims to allow users to communicate fluidly with a nearby person in another language. Originally limited to English and Spanish, the feature received support for 12 new languages, still in testing, the following October.

In January 2015, the apps gained the ability to propose translations of physical signs in real time using the device's camera, as a result of Google's acquisition of the Word Lens app. The original January launch only supported seven languages, but a July update added support for 20 new languages, and also enhanced the speed of Conversation Mode translations. The feature was subsequently renamed Instant Camera. The technology underlying Instant Camera combines image processing and optical character recognition, then attempts to produce cross-language equivalents using standard Google Translate estimations for the text as it is perceived.

API
In May 2011, Google announced that the Google Translate API for software developers had been deprecated and would cease functioning. The Translate API page stated the reason as "substantial economic burden caused by extensive abuse" with an end date set for December 1, 2011. In response to public pressure, Google announced in June 2011 that the API would continue to be available as a paid service.

Because the API was used in numerous third-party websites and apps, the original decision to deprecate it led some developers to criticize Google and question the viability of using Google APIs in their products.

Google Assistant
Google Translate also provides translations for Google Assistant and the devices that Google Assistant runs on such as Google Home and Pixel Buds.

Supported languages
The following languages are supported by Google Translate as of July 2024. Chinese (Simplified) and Chinese (Traditional) refer to two different writing systems for the same language, so the actual total number of languages in the roster is one fewer than the number of entries listed.


 * 1) Afrikaans
 * 2) Albanian
 * 3) Amharic
 * 4) Arabic
 * 5) Armenian
 * 6) Azerbaijani
 * 7) Basque
 * 8) Belarusian
 * 9) Bengali
 * 10) Bosnian
 * 11) Bulgarian
 * 12) Burmese
 * 13) Catalan
 * 14) Cebuano
 * 15) Chichewa
 * 16) Chinese (Simplified)
 * 17) Chinese (Traditional)
 * 18) Corsican
 * 19) Croatian
 * 20) Czech
 * 21) Danish
 * 22) Dutch
 * 23) English
 * 24) Esperanto
 * 25) Estonian
 * 26) Filipino
 * 27) Finnish
 * 28) French
 * 29) Galician
 * 30) Georgian
 * 31) German
 * 32) Greek
 * 33) Gujarati
 * 34) Haitian Creole
 * 35) Hausa
 * 36) Hawaiian
 * 37) Hebrew
 * 38) Hindi
 * 39) Hmong
 * 40) Hungarian
 * 41) Icelandic
 * 42) Igbo
 * 43) Indonesian
 * 44) Irish
 * 45) Italian
 * 46) Japanese
 * 47) Javanese
 * 48) Kannada
 * 49) Kazakh
 * 50) Khmer
 * 51) Korean
 * 52) Kurdish (Kurmanji)
 * 53) Kyrgyz
 * 54) Lao
 * 55) Latin
 * 56) Latvian
 * 57) Lithuanian
 * 58) Luxembourgish
 * 59) Macedonian
 * 60) Malagasy
 * 61) Malay
 * 62) Malayalam
 * 63) Maltese
 * 64) Maori
 * 65) Marathi
 * 66) Mongolian
 * 67) Nepali
 * 68) Norwegian (Bokmål)
 * 69) Pashto
 * 70) Persian
 * 71) Polish
 * 72) Portuguese
 * 73) Punjabi
 * 74) Romanian
 * 75) Russian
 * 76) Samoan
 * 77) Scots Gaelic
 * 78) Serbian
 * 79) Sesotho
 * 80) Shona
 * 81) Sindhi
 * 82) Sinhala
 * 83) Slovak
 * 84) Slovenian
 * 85) Somali
 * 86) Spanish
 * 87) Sundanese
 * 88) Swahili
 * 89) Swedish
 * 90) Tajik
 * 91) Tamil
 * 92) Telugu
 * 93) Thai
 * 94) Turkish
 * 95) Ukrainian
 * 96) Urdu
 * 97) Uzbek
 * 98) Vietnamese
 * 99) Welsh
 * 100) West Frisian
 * 101) Xhosa
 * 102) Yiddish
 * 103) Yoruba
 * 104) Zulu


 * 1st stage
   * English to and from German
   * English to and from Spanish
   * English to and from French
 * 2nd stage
   * English to and from Portuguese
 * 3rd stage
   * English to and from Italian
 * 4th stage
   * English to and from Chinese (Simplified)
   * English to and from Japanese
   * English to and from Korean
 * 5th stage (launched April 28, 2006)
   * English to and from Arabic
 * 6th stage (launched December 16, 2006)
   * English to and from Russian
 * 7th stage (launched February 9, 2007)
   * English to and from Chinese (Traditional)
   * Chinese (Simplified to and from Traditional)
 * 8th stage (all 25 language pairs use Google's machine translation system) (launched October 22, 2007)
   * English to and from Dutch
   * English to and from Greek
 * 9th stage
   * English to and from Hindi
 * 10th stage (as of this stage, translation can be done between any two languages, using English as an intermediate step, if needed) (launched May 8, 2008)
   * Bulgarian
   * Croatian
   * Czech
   * Danish
   * Finnish
   * Norwegian
   * Polish
   * Romanian
   * Swedish
 * 11th stage (launched September 25, 2008)
   * Catalan
   * Filipino
   * Hebrew
   * Indonesian
   * Latvian
   * Lithuanian
   * Serbian
   * Slovak
   * Slovene
   * Ukrainian
   * Vietnamese
 * 12th stage (launched January 30, 2009)
   * Albanian
   * Estonian
   * Galician
   * Hungarian
   * Maltese
   * Thai
   * Turkish
 * 13th stage (launched June 19, 2009)
   * Persian
 * 14th stage (launched August 24, 2009)
   * Afrikaans
   * Belarusian
   * Icelandic
   * Irish
   * Macedonian
   * Malay
   * Swahili
   * Welsh
   * Yiddish
 * 15th stage (launched November 19, 2009)
   * The Beta stage is finished. Users can now choose to have the romanization written for Chinese, Japanese, Korean, Russian, Ukrainian, Belarusian, Bulgarian, Greek, Hindi and Thai. For translations from Arabic, Persian and Hindi, the user can enter a Latin transliteration of the text and the text will be transliterated to the native script for these languages as the user is typing. The text can now be read by a text-to-speech program in English, Italian, French and German.
 * 16th stage (launched January 30, 2010)
   * Haitian Creole
 * 17th stage (launched April 2010)
   * Speech program launched in Hindi and Spanish.
 * 18th stage (launched May 5, 2010)
   * Speech program launched in Afrikaans, Albanian, Catalan, Chinese (Mandarin), Croatian, Czech, Danish, Dutch, Finnish, Greek, Hungarian, Icelandic, Indonesian, Latvian, Macedonian, Norwegian, Polish, Portuguese, Romanian, Russian, Serbian, Slovak, Swahili, Swedish, Turkish, Vietnamese and Welsh (based on eSpeak)
 * 19th stage (launched May 13, 2010)
   * Armenian
   * Azerbaijani
   * Basque
   * Georgian
   * Urdu
 * 20th stage (launched June 2010)
   * Provides romanization for Arabic.
 * 21st stage (launched September 2010)
   * Allows phonetic typing for Arabic, Greek, Hindi, Persian, Russian, Serbian and Urdu.
   * Latin
 * 22nd stage (launched December 2010)
   * Romanization of Arabic removed.
   * Spell check added.
   * For some languages, Google replaced text-to-speech synthesizers from eSpeak's robot voice to native speaker's nature voice technologies made by SVOX (Chinese, Czech, Danish, Dutch, Finnish, Greek, Hungarian, Norwegian, Polish, Portuguese, Russian, Swedish, Turkish). Also the old versions of French, German, Italian and Spanish. Latin uses the same synthesizer as Italian.
   * Speech program launched in Arabic, Japanese and Korean.
 * 23rd stage (launched January 2011)
   * Choice of different translations for a word.
 * 24th stage (launched June 2011)
   * 5 new Indic languages (in alpha) and a transliterated input method:
   * Bengali
   * Gujarati
   * Kannada
   * Tamil
   * Telugu
 * 25th stage (launched July 2011)
   * Translation rating introduced.
 * 26th stage (launched January 2012)
   * Dutch male voice synthesizer replaced with female.
   * Elena by SVOX replaced the Slovak eSpeak voice.
   * Transliteration of Yiddish added.
 * 27th stage (launched February 2012)
   * Speech program launched in Thai.
   * Esperanto
 * 28th stage (launched September 2012)
   * Lao
 * 29th stage (launched October 2012)
   * Transliteration of Lao added. (alpha status)
 * 30th stage (launched October 2012)
   * New speech program launched in English.
 * 31st stage (launched November 2012)
   * New speech program in French, Spanish, Italian and German.
 * 32nd stage (launched March 2013)
   * Phrasebook added.
 * 33rd stage (launched April 2013)
   * Khmer
 * 34th stage (launched May 2013)
   * Bosnian
   * Cebuano
   * Hmong
   * Javanese
   * Marathi
 * 35th stage (launched May 2013)
   * 16 additional languages can be used with camera-input: Bulgarian, Catalan, Croatian, Danish, Estonian, Finnish, Hungarian, Indonesian, Icelandic, Latvian, Lithuanian, Norwegian, Romanian, Slovak, Slovenian and Swedish.
 * 36th stage (launched December 2013)
   * Hausa
   * Igbo
   * Maori
   * Mongolian
   * Nepali
   * Punjabi
   * Somali
   * Yoruba
   * Zulu
 * 37th stage (launched June 2014)
   * Definition of words added.
 * 38th stage (launched December 2014)
   * Burmese
   * Chewa
   * Kazakh
   * Malagasy
   * Malayalam
   * Sinhalese
   * Sotho
   * Sundanese
   * Tajik
   * Uzbek
 * 39th stage (launched October 2015)
   * Transliteration of Arabic restored.
 * 40th stage (launched November 2015)
   * Aurebesh
 * 41st stage (launched February 2016)
   * Aurebesh removed.
   * Speech program launched in Bengali.
   * Amharic
   * Corsican
   * Hawaiian
   * Kurdish (Kurmanji)
   * Kyrgyz
   * Luxembourgish
   * Pashto
   * Samoan
   * Scottish Gaelic
   * Shona
   * Sindhi
   * West Frisian
   * Xhosa
 * 42nd stage (launched September 2016)
   * Speech program launched in Ukrainian.
 * 43rd stage (launched December 2016)
   * Speech program launched in Khmer and Sinhala.
 * 44th stage (launched June 2018)
   * Speech program launched in Malayalam, Telugu, Marathi, and Myanmar (Burmese).
 * 45th stage (launched September 2019)
   * Speech program launched in Urdu, Kannada, and Gujarati.
 * 46th stage (launched February 28, 2020)
   * Abaza
   * Cantonese
   * Crimean Tatar
   * Kabardian
   * Karachay
   * Nogai
 * 47th stage (launched March 4, 2020)
   * Speech program launched in Luxembourgish, Basque, Bulgarian, Hebrew, Hmong, Irish, Lao, Kazakh, Malay, Kurdish (Kurmanji), Kyrgyz, Lithuanian, Scots Gaelic, Persian, Punjabi, Sindhi, Slovenian, Uzbek, Cantonese and Haitian Creole

Method of translation
In April 2006, Google Translate launched with a statistical machine translation engine.

Google Translate does not apply grammatical rules, since its algorithms are based on statistical or pattern analysis rather than traditional rule-based analysis. The system's original creator, Franz Josef Och, has criticized the effectiveness of rule-based algorithms in favor of statistical approaches. Original versions of Google Translate were based on a method called statistical machine translation, and more specifically, on research by Och, who won the DARPA contest for speed machine translation in 2003. Och was the head of Google's machine translation group until leaving to join Human Longevity, Inc. in July 2014.

According to Och, a solid base for developing a usable statistical machine translation system for a new pair of languages from scratch would consist of a bilingual text corpus (or parallel collection) of more than 150-200 million words, and two monolingual corpora each of more than a billion words. Statistical models from these data are then used to translate between those languages.

To acquire this huge amount of linguistic data, Google used United Nations and European Parliament transcripts.

Google Translate usually does not translate directly from one language to another (L1 → L2). Instead, it often translates first to English and then to the target language (L1 → EN → L2).
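The pivoting step can be illustrated with a minimal Python sketch. It is not Google's implementation: the lexicons, the `translate_direct` and `translate_via_pivot` helpers, and the word-for-word lookup are all hypothetical simplifications chosen for illustration. The example shows one known cost of pivoting: French distinguishes informal "tu" from formal "vous", but both collapse to English "you", so the distinction cannot be recovered in Spanish.

```python
from typing import Dict, Optional, Tuple

# Hypothetical toy lexicons (illustration only): each maps one source word
# to one target word for a given (source, target) language pair.
LEXICONS: Dict[Tuple[str, str], Dict[str, str]] = {
    ("fr", "en"): {"tu": "you", "vous": "you"},
    ("en", "es"): {"you": "tú"},
}


def translate_direct(word: str, src: str, tgt: str) -> Optional[str]:
    """Look the word up in a direct src -> tgt lexicon, if one exists."""
    return LEXICONS.get((src, tgt), {}).get(word)


def translate_via_pivot(word: str, src: str, tgt: str,
                        pivot: str = "en") -> Optional[str]:
    """Translate directly when possible, otherwise go src -> pivot -> tgt."""
    direct = translate_direct(word, src, tgt)
    if direct is not None:
        return direct
    intermediate = translate_direct(word, src, pivot)
    if intermediate is None:
        return None  # the word is unknown even in the pivot lexicon
    return translate_direct(intermediate, pivot, tgt)


if __name__ == "__main__":
    # Both French pronouns end up as the same Spanish word after pivoting.
    print(translate_via_pivot("tu", "fr", "es"))
    print(translate_via_pivot("vous", "fr", "es"))
```

Because no French-to-Spanish lexicon exists in this toy data, both lookups fall through to the English pivot, which is where the formal/informal distinction is lost.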

When Google Translate generates a translation proposal, it looks for patterns in hundreds of millions of documents to help decide on the best translation. By detecting patterns in documents that have already been translated by human translators, Google Translate makes informed guesses as to what an appropriate translation should be.

Before October 2007, for languages other than Arabic, Chinese and Russian, Google Translate was based on SYSTRAN, a software engine which is still used by several other online translation services such as Babel Fish (now defunct). From October 2007, Google Translate used proprietary, in-house technology based on statistical machine translation instead, before transitioning to neural machine translation.

Google Translate Community
Google has crowdsourcing features for volunteers to be a part of its “Translate Community”, intended to help improve Google Translate's accuracy. In August 2016, a Google Crowdsource app was released for Android users, in which translation tasks are offered. There are three ways to contribute. First, Google will show a phrase that one should type in the translated version. Second, Google will show a proposed translation for a user to agree, disagree, or skip. Third, users can suggest translations for phrases where they think they can improve on Google's results. Tests in 44 languages show that the "suggest an edit" feature led to an improvement in a maximum of 40% of cases over four years, while analysis across the board shows that Google's crowd procedures often lock in erroneous translations.

Statistical machine translation
Although Google deployed a new system called neural machine translation for better-quality translation, there are languages that still use the traditional translation method called statistical machine translation. Rather than applying hand-written grammatical rules, it uses predictive algorithms to guess ways to translate texts in foreign languages. It aims to translate whole phrases rather than single words, then gathers overlapping phrases for translation. It also analyzes bilingual text corpora to generate a statistical model that translates texts from one language to another.
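The phrase-based idea can be sketched in a few lines of Python. This is a heavily simplified illustration, not the production system: the aligned phrase pairs are invented, the function names `build_phrase_table` and `translate_greedy` are hypothetical, and a real statistical engine scores many candidate segmentations with translation and language models rather than taking a greedy longest match.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical phrase pairs, standing in for alignments extracted from
# documents that human translators have already translated.
ALIGNED_PAIRS = [
    ("bank", "banque"), ("bank", "banque"), ("bank", "rive"),
    ("river bank", "rive"),
    ("the", "la"),
]


def build_phrase_table(pairs: Iterable[Tuple[str, str]]) -> Dict[str, Counter]:
    """Count how often each source phrase was translated as each target phrase."""
    table: Dict[str, Counter] = defaultdict(Counter)
    for source, target in pairs:
        table[source][target] += 1
    return table


def translate_greedy(sentence: str, table: Dict[str, Counter],
                     max_len: int = 3) -> str:
    """Greedy longest-match decoding: prefer whole phrases over single words,
    and pick the most frequent translation seen for each matched phrase."""
    words = sentence.lower().split()
    output = []
    i = 0
    while i < len(words):
        for n in range(min(max_len, len(words) - i), 0, -1):
            phrase = " ".join(words[i:i + n])
            if phrase in table:
                output.append(table[phrase].most_common(1)[0][0])
                i += n
                break
        else:
            output.append(words[i])  # unknown word: pass it through unchanged
            i += 1
    return " ".join(output)


if __name__ == "__main__":
    table = build_phrase_table(ALIGNED_PAIRS)
    print(translate_greedy("the bank", table))
    print(translate_greedy("the river bank", table))
```

In this toy data, "bank" alone resolves to its most frequent translation, while the longer phrase "river bank" is matched as a unit, which is the advantage of translating phrases rather than single words.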

Google Neural Machine Translation
In September 2016, a research team at Google led by the software engineer Harold Gilchrist announced the development of the Google Neural Machine Translation system (GNMT) to increase fluency and accuracy in Google Translate and in November announced that Google Translate would switch to GNMT.

Google Translate's neural machine translation system uses a large end-to-end artificial neural network that attempts to perform deep learning, in particular, long short-term memory networks. GNMT improves the quality of translation over SMT in some instances because it uses an example-based machine translation (EBMT) method in which the system "learns from millions of examples." According to Google researchers, it translates "whole sentences at a time, rather than just piece by piece. It uses this broader context to help it figure out the most relevant translation, which it then rearranges and adjusts to be more like a human speaking with proper grammar". GNMT's "proposed architecture" of "system learning" has been implemented on over a hundred languages supported by Google Translate. With the end-to-end framework, Google states but does not demonstrate for most languages that "the system learns over time to create better, more natural translations." The GNMT network attempts interlingual machine translation, which encodes the "semantics of the sentence rather than simply memorizing phrase-to-phrase translations", and the system did not invent its own universal language, but uses "the commonality found in between many languages". GNMT was first enabled for eight languages: to and from English and Chinese, French, German, Japanese, Korean, Portuguese, Spanish and Turkish. In March 2017, it was enabled for Hindi, Russian and Vietnamese languages, followed by Indonesian, Bengali, Gujarati, Kannada, Malayalam, Marathi, Punjabi, Tamil and Telugu languages in April.

Accuracy
Google Translate is not as reliable as human translation. When text is well-structured, written using formal language, with simple sentences, relating to formal topics for which training data is ample, it often produces conversions similar to human translations between English and a number of high-resource languages. Accuracy decreases for those languages when fewer of those conditions apply, for example when sentence length increases or the text uses familiar or literary language. For many other languages vis-à-vis English, it can produce the gist of text in those formal circumstances. Human evaluation from English to all 102 languages shows that the main idea of a text is conveyed more than 50% of the time for 35 languages. For 67 languages, a minimally comprehensible result is not achieved 50% of the time or greater. A few studies have evaluated Chinese, French, German, and Spanish to English, but no systematic human evaluation has been conducted from most Google Translate languages to English. Speculative language-to-language scores extrapolated from English-to-other measurements indicate that Google Translate will produce translation results that convey the gist of a text from one language to another more than half the time in about 1% of language pairs, where neither language is English.

When used as a dictionary to translate single words, Google Translate is highly inaccurate because it must guess between polysemic words. Among the top 100 words in the English language, which make up more than 50% of all written English, the average word has more than 15 senses, which makes the chance of a correct translation about 1 in 15 if each sense maps to a different word in the target language. Most common English words have at least two senses, which produces 50/50 odds in the likely case that the target language uses different words for those different senses. The odds are similar from other languages to English. Google Translate makes statistical guesses that raise the likelihood of producing the most frequent sense of a word, with the consequence that an accurate translation will be unobtainable in cases that do not match the majority or plurality corpus occurrence. The accuracy of single-word predictions has not been measured for any language. Because almost all non-English language pairs pivot through English, the odds against obtaining accurate single-word translations from one non-English language to another can be estimated by multiplying the number of senses in the source language by the number of senses each of those terms has in English. When Google Translate does not have a word in its vocabulary, it makes up a result as part of its algorithm.
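The arithmetic in this paragraph can be made concrete with a short Python sketch. It rests on two simplifying assumptions that are not claims about the real system: every sense maps to a different target word, and the guess is uniform across senses (Google Translate instead favors the most frequent sense). The function name `chance_correct` is hypothetical.

```python
def chance_correct(senses_source: int, senses_pivot: int = 1) -> float:
    """Chance of picking the intended sense under uniform guessing.

    senses_source: number of senses of the word in the source language.
    senses_pivot:  number of senses of the intermediate English term when
                   the pair pivots through English (1 means no pivot step).
    """
    if senses_source < 1 or senses_pivot < 1:
        raise ValueError("sense counts must be at least 1")
    # Pivoting multiplies the sense counts, as described in the text.
    return 1.0 / (senses_source * senses_pivot)


if __name__ == "__main__":
    print(chance_correct(2))       # word with two senses, direct pair
    print(chance_correct(15))      # average top-100 English word
    print(chance_correct(2, 2))    # two senses, pivoting via a two-sense English term
```

Under these assumptions, a two-sense word has a 1 in 2 chance directly, but only a 1 in 4 chance once it passes through an English term that itself has two senses.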

Limitations
Due to differences between languages in investment, research, and the extent of digital resources, the accuracy of Google Translate varies greatly among languages. Some languages produce better results than others. Most languages from Africa, Asia, and the Pacific tend to score poorly in relation to the scores of many well-financed European languages, with Afrikaans and Chinese being the high-scoring exceptions from their continents. No languages indigenous to Australia or the Americas are included within Google Translate. Higher scores for European languages can be partially attributed to the Europarl Corpus, a trove of documents from the European Parliament that have been professionally translated by the mandate of the European Union into as many as 21 languages.

In its Written Words Translation function, there is a word limit on the amount of text that can be translated at once. Therefore, long text should be transferred to a document form and translated through its Document Translate function.

Moreover, like all machine translation programs, Google Translate struggles with polysemy (the multiple meanings a word may have) and multiword expressions (terms that have meanings that cannot be understood or translated by analyzing the individual word units that compose them).

Additionally, grammatical errors remain a major limitation to the accuracy of Google Translate.

Reviews
Shortly after launching the translation service for the first time, Google won an international competition for English–Arabic and English–Chinese machine translation.

Translation mistakes and oddities
Since Google Translate used statistical matching to translate, translated text can often include apparently nonsensical and obvious errors, sometimes swapping common terms for similar but nonequivalent common terms in the other language, or inverting sentence meaning. Novelty websites like Bad Translator and Translation Party have utilized the service to produce humorous text by translating back and forth between multiple languages, similar to the children's game telephone.

Court usage
In 2017, Google Translate was used during a court hearing when court officials at Teesside Magistrates' Court failed to book an interpreter for a Chinese defendant.