Machine Translation - How it Works, What Users Expect, and What They Get

Machine translation (MT) systems are now ubiquitous. This ubiquity is due to a combination of increased need for translation in today's global marketplace, and an exponential growth in computing power that has made such systems viable. And under the right circumstances, MT systems are a powerful tool. They offer low-quality translations in situations where low-quality translation is better than no translation at all, or where a rough translation of a large document delivered in seconds or minutes is more useful than a good translation delivered in three weeks' time.

Unfortunately, despite the widespread accessibility of MT, it is clear that the purpose and limitations of such systems are frequently misunderstood, and their capability widely overestimated. In this article, I want to give a brief overview of how MT systems work and thus how they can be put to best use. Then, I'll present some data on how Internet-based MT is being used right now, and show that there is a chasm between the intended and actual use of such systems, and that users still need educating on how to use MT systems effectively.

How machine translation works

You might have expected that a computer translation program would use grammatical rules of the languages in question, combining them with some kind of in-memory "dictionary" to produce the resulting translation. And indeed, that's essentially how some earlier systems worked. But most modern MT systems actually take a statistical approach that is quite "linguistically blind". Essentially, the system is trained on a corpus of example translations. The result is a statistical model that incorporates information such as:
- "when the words (a, b, c) occur in succession in a sentence, there is an X% chance that the words (d, e, f) will occur in succession in the translation" (N.B. there don't have to be the same number of words in each pair);
- "given two successive words (a, b) in the target language, if word (a) ends in -X, there is an X% chance that word (b) will end in -Y".

Given a huge body of such observations, the system can then translate a sentence by considering various candidate translations -- made by stringing words together almost at random (in reality, via some 'naive selection' process) -- and choosing the statistically most likely option.

On hearing this high-level description of how MT works, most people are surprised that such a "linguistically blind" approach works at all. What's even more surprising is that it typically works better than rule-based systems. This is partly because relying on grammatical analysis itself introduces errors into the equation (automated analysis is not completely accurate, and humans don't always agree on how to analyse a sentence). And training a system on "bare text" allows you to base a system on far more data than would otherwise be possible: corpora of grammatically analysed texts are small and few and far between; pages of "bare text" are available in their trillions.

However, what this approach does mean is that the quality of translations is very dependent on how well elements of the source text are represented in the data originally used to train the system. If you accidentally type he will returned or vous avez demander (instead of he will return or vous avez demandé), the system will be hampered by the fact that sequences such as will returned are unlikely to have occurred many times in the training corpus (or worse, may have occurred with a completely different meaning, as in they needed his will returned to the solicitor). And since the system has little notion of grammar (to work out, for example, that returned is a form of return, and "the infinitive is likely after he will"), it in effect has little to go on.

Similarly, you may ask the system to translate a sentence that is perfectly grammatical and common in everyday use, but which includes features that happen not to have been common in the training corpus. MT systems are typically trained on the types of text for which human translations are readily available, such as technical or business documents, or transcripts of meetings of multilingual parliaments and conferences. This gives MT systems a natural bias towards certain types of formal or technical text. And even if everyday vocabulary is still covered by the training corpus, the grammar of everyday speech (such as using tú instead of usted in Spanish, or using the present tense instead of the future tense in various languages) may not be.

MT systems in practice

Researchers and developers of computer translation systems have always been aware that one of the biggest dangers is public misperception of their purpose and limitations. Somers (2003)[1], observing the use of MT on the web and in chat rooms, comments that: "This increased visibility of MT has had a number of side effects. [...] There is certainly a need to educate the general public about the low quality of raw MT, and, importantly, why the quality is so low." Observing MT in use in 2009, there's sadly little evidence that users' awareness of these issues has improved.

As an illustration, I'll present a small sample of data from a Spanish-English MT service that I make available at the Español-Inglés web site. The service works by taking the user's input, applying some "cleanup" processes (such as correcting some common orthographical errors and decoding common instances of "SMS-speak"), and then looking for translations in (a) a bank of examples from the site's Spanish-English dictionary, and (b) an MT engine. Currently, Google Translate is used for the MT engine, although a custom engine may be used in the future. The figures I present here are from an analysis of 549 Spanish-English queries presented to the system from machines in Mexico[2] -- in other words, we assume that most users are translating from their native language.

First, what are people using the MT system for? For each query, I attempted a "best guess" at the user's purpose for translating the query. In many cases, the purpose is quite obvious; in a few cases, there is clearly ambiguity. With that caveat, I judge that in about 88% of cases, the intended use is fairly clear-cut, and categorise these uses as follows:
- Looking up a single word or term: 38%
- Translating a formal text: 23%
- Internet chat session: 18%
- Homework: 9%

A surprising (if not alarming!) observation is that in such a large proportion of cases, users are using the translator to look up a single word or term. In fact, 30% of queries consisted of a single word. The finding is a little surprising given that the site in question also has a Spanish-English dictionary, and suggests that users confuse the purpose of dictionaries and translators. Although not represented in the raw figures, there were clearly some cases of consecutive searches where it appeared that a user was deliberately splitting up a sentence or phrase that would probably have been better translated if left together. Perhaps as a consequence of student over-drilling on dictionary usage, we see, for example, a query for cuarto para ("quarter to") followed immediately by a query for a number. There is clearly a need to educate students and users in general on the difference between the electronic dictionary and the machine translator[3]: in particular, that a dictionary will guide the user to choosing the appropriate translation given the context, but requires single-word or single-phrase lookups, whereas a translator generally works best on whole sentences and, given a single word or term, will simply report the statistically most common translation.

I estimate that in less than a quarter of cases, users are using the MT system for its "trained-for" purpose of translating or gisting a formal text (and are entering an entire sentence, or at least a partial sentence rather than an isolated noun phrase). Of course, it's impossible to know whether any of these translations were then intended for publication without further proofreading, which definitely isn't the purpose of the system.

The use for translating formal texts is now almost rivalled by the use to translate informal on-line chat sessions -- a context for which MT systems are typically not trained. The on-line chat context poses particular problems for MT systems, since features such as non-standard spelling, lack of punctuation and presence of colloquialisms not found in other written contexts are common. Translating chat sessions effectively would probably require a dedicated system trained on a more suitable (and possibly custom-built) corpus.

It's not too surprising that students are using MT systems to do their homework. But it's interesting to note to what extent and how. In fact, use for homework includes a mixture of "fair use" (understanding an exercise) with an attempt to "get the computer to do their homework" (with predictably dire results in some cases). Queries categorised as homework include sentences which are obviously instructions to exercises, plus certain sentences explaining trivial generalities that would be uncommon in a text or conversation, but which are typical in beginners' homework exercises.

Whatever the use, an issue for system users and designers alike is the frequency of errors in the source text which are liable to hamper the translation. In fact, over 40% of queries contained such errors, with some queries containing several. The most common errors were the following (queries for single words and terms were excluded in calculating these figures):
- Missing accents: 14% of queries
- Missing punctuation: 13%
- Other orthographical error: 8%
- Grammatically incomplete sentence: 8%

Bearing in mind that in the majority of cases, users were translating from their native language, users appear to underestimate the importance of using standard orthography to give the best chance of a good translation. More subtly, users do not always understand that the translation of one word can depend on another, and that the translator's job is more difficult if grammatical constituents are incomplete, so that queries such as hoy es día de are not uncommon. Such queries hamper translation because the chance of a sentence in the training corpus with, say, a "dangling" preposition like this will be slim.

Lessons to be learnt...?

At present, there's still a mismatch between the performance of MT systems and the expectations of users. I see responsibility for closing this gap as lying in the hands both of developers and of users and educators. Users need to think more about making their source sentences "MT-friendly" and learn how to assess the output of MT systems. Language courses need to address these issues: learning to use computer translation tools effectively needs to be seen as a relevant part of learning to use a language. And developers, including myself, need to think about how we can make the tools we offer better suited to language users' needs.

Notes

[1] Somers (2003), "Machine Translation: the Latest Developments" in The Oxford Handbook of Computational Linguistics, OUP.
[2] This odd number is simply because queries matching the selection criteria were captured with random probability within a fixed time frame. It should be noted that the system for deducing a machine's country from its IP address is not completely accurate.
[3] If the user enters a single word into the system in question, a message is displayed beneath the translation suggesting that the user would get a better result by using the site's dictionary.