Machine translation (MT) systems are now ubiquitous. This ubiquity is due to a combination of increased need for translation in today's global marketplace, and an exponential growth in computing power that has made such systems viable. And under the right circumstances, MT systems are a powerful tool. They offer low-quality translations in situations where low-quality translation is better than no translation at all, or where a rough translation of a large document delivered in seconds or minutes is more useful than a good translation delivered in three weeks' time.

Unfortunately, despite the widespread accessibility of MT, it is clear that the purpose and limitations of such systems are frequently misunderstood, and their capability widely overestimated. In this article, I want to give a brief overview of how MT systems work and thus how they can be put to best use. Then, I'll present some data on how Internet-based MT is being used right now, and show that there is a chasm between the intended and actual use of such systems, and that users still need educating on how to use MT systems effectively.

How machine translation works

You might have expected that a computer translation program would use grammatical rules of the languages in question, combining them with some kind of in-memory "dictionary" to produce the resulting translation. And indeed, that's essentially how some earlier systems worked. But most modern MT systems actually take a statistical approach that is quite "linguistically blind". Essentially, the system is trained on a corpus of example translations. The result is a statistical model that incorporates information such as:

- "when the words (a, b, c) occur in succession in a sentence, there is an X% chance that the words (d, e, f) will occur in succession in the translation" (N.B. there don't have to be the same number of words in each pair);
- "given two successive words (a, b) in the target language, if word (a) ends in -X, there is a Z% chance that word (b) will end in -Y".

Given a huge body of such observations, the system can then translate a sentence by considering various candidate translations -- made by stringing words together almost at random (in reality, via some 'naive selection' process) -- and choosing the statistically most likely option.

On hearing this high-level description of how MT works, most people are surprised that such a "linguistically blind" approach works at all. What's even more surprising is that it typically works better than rule-based systems. This is partly because relying on grammatical analysis itself introduces errors into the equation (automated analysis is not completely accurate, and humans don't always agree on how to analyse a sentence). And training a system on "bare text" allows you to base a system on far more data than would otherwise be possible: corpora of grammatically analysed texts are small and few and far between; pages of "bare text" are available in their trillions.

However, what this approach does mean is that the quality of translations is very dependent on how well elements of the source text are represented in the data originally used to train the system. If you accidentally type "he will returned" or "vous avez demander" (instead of "he will return" or "vous avez demandé"), the system will be hampered by the fact that sequences such as "will returned" are unlikely to have occurred many times in the training corpus (or worse, may have occurred with a completely different meaning, as in "they needed his will returned to the solicitor"). And since the system has little notion of grammar (to work out, for example, that "returned" is a form of "return", and that the infinitive is likely after "he will"), it in effect has little to go on.

Similarly, you may ask the system to translate a sentence that is perfectly grammatical and common in everyday use, but which includes features that happen not to have been common in the training corpus. MT systems are typically trained on the types of text for which human translations are readily available, such as technical or business documents, or transcripts of meetings of multilingual parliaments and conferences. This gives MT systems a natural bias towards certain types of formal or technical text. And even if everyday vocabulary is still covered by the training corpus, the grammar of everyday speech (such as using "tú" instead of "usted" in Spanish, or using the present tense instead of the future tense in various languages) may not be.

MT systems in practice

Researchers and developers of computer translation systems have always been aware that one of the biggest dangers is public misperception of their purpose and limitations. Somers (2003)[1], observing the use of MT on the web and in chat rooms, comments that: "This increased visibility of MT has had a number of side effects. [...] There is certainly a need to educate the general public about the low quality of raw MT, and, importantly, why the quality is so low." Observing MT in use in 2009, I see sadly little evidence that users' awareness of these issues has improved.

As an illustration, I'll present a small sample of data from a Spanish-English MT service that I make available at the Español-Inglés web site. The service works by taking the user's input, applying some "cleanup" processes (such as correcting some common orthographical errors and decoding common instances of "SMS-speak"), and then looking for translations in (a) a bank of examples from the site's Spanish-English dictionary, and (b) an MT engine. Currently, Google Translate is used for the MT engine, although a custom engine may be used in the future. The figures I present here are from an analysis of 549 Spanish-English queries presented to the system from machines in Mexico[2] -- in other words, we assume that most users are translating from their native language.

First, what are people using the MT system for? For each query, I attempted a "best guess" at the user's purpose for translating the query. In many cases, the purpose is quite obvious; in a few cases, there is clearly ambiguity. With that caveat, I judge that in about 88% of cases, the intended use is fairly clear-cut, and categorise these uses as follows:

- Looking up a single word or term: 38%
- Translating a formal text: 23%
- Internet chat session: 18%
- Homework: 9%

A surprising (if not alarming!) observation is that in such a large proportion of cases, users are using the translator to look up a single word or term. In fact, 30% of queries consisted of a single word. The finding is a little surprising given that the site in question also has a Spanish-English dictionary, and suggests that users confuse the purpose of dictionaries and translators. Although not represented in the raw figures, there were clearly some cases of consecutive searches where it appeared that a user was deliberately splitting up a sentence or phrase that would probably have been better translated if left together. Perhaps as a consequence of student over-drilling on dictionary usage, we see, for example, a query for "cuarto para" ("quarter to") followed immediately by a query for a number. There is clearly a need to educate students and users in general on the difference between the electronic dictionary and the machine translator[3]: in particular, that a dictionary will guide the user to choosing the appropriate translation given the context, but requires single-word or single-phrase lookups, whereas a translator generally works best on whole sentences and, given a single word or term, will simply report the statistically most common translation.

I estimate that in less than a quarter of cases, users are using the MT system for its "trained-for" purpose of translating or gisting a formal text (and are entering an entire sentence, or at least a partial sentence, rather than an isolated noun phrase). Of course, it's impossible to know whether any of these translations were then intended for publication without further proofreading, which definitely isn't the purpose of the system.

The use for translating formal texts is now almost rivalled by the use to translate informal on-line chat sessions -- a context for which MT systems are typically not trained. The on-line chat context poses particular problems for MT systems, since features such as non-standard spelling, lack of punctuation and presence of colloquialisms not found in other written contexts are common. Translating chat sessions effectively would probably require a dedicated system trained on a more suitable (and possibly custom-built) corpus.

It's not too surprising that students are using MT systems to do their homework. But it's interesting to note to what extent and how. In fact, use for homework includes a mixture of "fair use" (understanding an exercise) with an attempt to "get the computer to do their homework" (with predictably dire results in some cases). Queries categorised as homework include sentences which are obviously instructions to exercises, plus certain sentences explaining trivial generalities that would be uncommon in a text or conversation, but which are typical in beginners' homework exercises.

Whatever the use, an issue for system users and designers alike is the frequency of errors in the source text which are liable to hamper the translation. In fact, over 40% of queries contained such errors, with some queries containing several. The most common errors were the following (queries for single words and terms were excluded in calculating these figures):

- Missing accents: 14% of queries
- Missing punctuation: 13%
- Other orthographical error: 8%
- Grammatically incomplete sentence: 8%

Bearing in mind that in the majority of cases, users were translating from their native language, users appear to underestimate the importance of using standard orthography to give the best chance of a good translation. More subtly, users do not always understand that the translation of one word can depend on another, and that the translator's job is more difficult if grammatical constituents are incomplete, so that queries such as "hoy es día de" are not uncommon. Such queries hamper translation because the chance of the training corpus containing a sentence with, say, a "dangling" preposition like this is slim.

Lessons to be learnt...?

At present, there's still a mismatch between the performance of MT systems and the expectations of users. I see responsibility for closing this gap as lying in the hands both of developers and of users and educators. Users need to think more about making their source sentences "MT-friendly" and learn how to assess the output of MT systems. Language courses need to address these issues: learning to use computer translation tools effectively needs to be seen as a relevant part of learning to use a language. And developers, including myself, need to think about how we can make the tools we offer better suited to language users' needs.

Notes

[1] Somers (2003), "Machine Translation: the Latest Developments" in The Oxford Handbook of Computational Linguistics, OUP.
[2] This odd number is simply because queries matching the selection criteria were captured with random probability within a fixed time frame. It should be noted that the system for deducing a machine's country from its IP address is not completely accurate.
[3] If the user enters a single word into the system in question, a message is displayed beneath the translation suggesting that the user would get a better result by using the site's dictionary.
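
As a rough illustration of the statistical approach described under "How machine translation works", the following Python sketch scores a handful of candidate translations and picks the most likely one. It is a toy under stated assumptions, not how any real engine is implemented: the names PHRASE_TABLE, BIGRAM, UNSEEN, score and translate are hypothetical, every probability is invented for the example, and candidates are built word by word in source order, whereas real systems also handle multi-word phrases and reordering.

```python
from itertools import product
from math import log

# Hypothetical "phrase table": P(target phrase | source word).
# All figures are made up for illustration.
PHRASE_TABLE = {
    "hoy": {"today": 0.7, "nowadays": 0.3},
    "es": {"is": 0.6, "it is": 0.4},
    "lunes": {"Monday": 1.0},
}

# Hypothetical target-language statistics: P(next word | previous word).
BIGRAM = {
    ("<s>", "today"): 0.2,
    ("today", "is"): 0.3,
    ("is", "Monday"): 0.1,
    ("<s>", "nowadays"): 0.05,
    ("today", "it"): 0.05,
    ("it", "is"): 0.4,
}
UNSEEN = 1e-4  # floor for word pairs never observed in "training"


def score(source_words, target_phrases):
    """Log-probability of one candidate translation."""
    total = 0.0
    # How likely is each target phrase as a translation of its source word?
    for src, tgt in zip(source_words, target_phrases):
        total += log(PHRASE_TABLE[src][tgt])
    # How likely is the resulting word sequence in the target language?
    words = ["<s>"] + " ".join(target_phrases).split()
    for pair in zip(words, words[1:]):
        total += log(BIGRAM.get(pair, UNSEEN))
    return total


def translate(sentence):
    """Try every combination of phrase options and keep the best-scoring one."""
    source_words = sentence.lower().split()
    options = [PHRASE_TABLE[w] for w in source_words]  # KeyError on unknown words
    best = max(product(*options), key=lambda cand: score(source_words, cand))
    return " ".join(best)
```

With these invented figures, translate("hoy es lunes") selects "today is Monday". Note how any word pair missing from BIGRAM falls back to the tiny UNSEEN value: this is the toy counterpart of the problem described above, where input such as "will returned" is poorly represented in the training data and so gives the system little to go on.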