Tuesday 12 September 2017

DeepL: Tool or threat for translators?

The end of August saw the launch of DeepL, a new machine translation tool developed by Cologne-based start-up DeepL GmbH (formerly Linguee GmbH). It was born from Linguee, a translation tool that has been around for some years and is a popular resource amongst us translators.

DeepL apparently performs better than any of its rivals’ products because it’s based on the relatively new Neural Machine Translation (NMT) approach, in which the processing of data is modelled on thought processes as they occur in the human brain. Its makers also claim to have created one of the world’s most powerful supercomputers, conveniently located in Iceland (where electricity costs are lower than elsewhere).

Neural Machine Translation (NMT) is modelled on thought processes in the human brain

Curious about these latest developments in machine translation (MT), I incorporated DeepL into my own work last week so I could familiarise myself with it. Since I’d heard it is supposedly excellent at what it does, I started my experiment with a bit of a feeling of dread in my stomach. I was soon relieved, though, when I realised it’s basically just another tool. However, unlike many of its predecessors, it produces some output that is actually usable!

Having said that, I also encountered severe (in some text types potentially even dangerous!) issues in the DeepL MT output. They may seem minor or insignificant if you don’t work with language professionally; yet in translation for the commercial world they do matter. They do, in fact, matter very much!

I’m going to list a handful of these issues from the patent I was translating with DeepL’s assistance. (Note that for this article I’ve deliberately picked only shorter sentences, or terms from shorter sentences, as DeepL couldn’t cope with longer sentences or with shorter sentences that have more convoluted syntax.)

“In one embodiment, the guide tube 106 includes an opening 105 on a first end which receives the medications.”
DeepL supplied a grammatically perfect German sentence, so at first sight there seemed to be nothing wrong with it. However, DeepL had incorrectly assumed that the relative pronoun refers back to “a first end”, whereas its actual antecedent is “an opening”.

“treatment of the surface of the guide tube 106 that comes in contact with the pill”
Here we have the same issue as above: The antecedent of the relative pronoun “that” in this particular context is “surface”, i.e. not “guide tube”, because the surface comes into contact with the pill. How can a computer decide what the antecedent of a relative pronoun is? It can’t.

“The shape of the guide tube 106, the orientation of the guide tube 106 to the force of gravity or other source of force, and the coefficients of friction and drag can be specifically designed to orient the axis of each pill in the direction of travel or with the axis of the tube 106.”
“direction of travel” was nonsensically translated by DeepL as “Fahrtrichtung”, which would, of course, be the correct term in automotive contexts, whereas here it simply means the pill is moving in a particular direction.

“ridges”
Translated by DeepL as “Rillen”. Further down in the text, though, and especially when I looked at the technical drawings, it became clear that “Erhöhungen” or a synonymous term is more appropriate because the ridges described are on the internal (i.e. not the external) surface.

“low-distortion transparent material”
Translated by DeepL as “verzerrungsarmes transparentes Material”, which does not make sense here since “low-distortion” in this particular context simply means the material in question isn’t prone to becoming deformed. (Also, DeepL omitted the important comma between the two adjectives in German.)

“cameras with fast shutters”
Translated by DeepL as “Kameras mit schnellen Shuttern”; however, people working in this field tend to call them “Ultrakurzzeitkameras”.

“System 700 includes an image analyzer 704 and includes or has access to an image database 706.”
Translated by DeepL as “Das System 700 verfügt über einen Bildanalysator 704 und eine Bilddatenbank 706”. Although the sentence is correct grammatically and sort of conveys the meaning, leaving out parts of a sentence is a no-go, especially in patent translation.

“In one embodiment, the light sources are continuous.”
Translated by DeepL as “In einer Ausführungsform sind die Lichtquellen durchgehend”. The grammar is impeccable, yet the sentence sounds odd. A human translator would likely opt for a more technical-sounding translation such as “In einer Ausführungsform sind die Lichtquellen Dauerlichtquellen.”

Translated by DeepL as “Optiken” in the plural. Difficult for a computer to get right, but Germans tend to use the term in the singular here to refer to an assembly of optical elements.

“electrophoresis (e.g., capillary)”
Translated by DeepL as “Elektrophorese (z. B. Kapillare)”. A human translator would likely elaborate a bit and render the whole phrase as “Elektrophorese (z. B. Kapillarelektrophorese)” as otherwise it all somehow doesn’t fit together.

“limit the invention to the precise forms disclosed”
“forms” was translated by DeepL literally as “Formen”. In this particular sentence, however, its meaning in the patent is “embodiments” or “forms of embodiment”, so it really should have been output as “Ausführungsformen” (or “Ausführungen”, which is even more common in patents originally drafted in German).
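
As an aside, some of the recurring term problems above could in principle be caught mechanically during post-editing. What follows is only a minimal Python sketch of that idea, not something DeepL or any CAT tool offers: the names TermRule, RULES and flag_terms are hypothetical, and the rules are simply the term pairs from the examples in this article, valid for this one patent. It cannot detect wrong antecedents or omitted sentence parts; those still need a human reader.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class TermRule:
    source_term: str  # English wording in the source text
    suspect: str      # German rendering that should be double-checked
    suggestion: str   # rendering preferred in this particular patent


# Hypothetical rules, taken from the examples discussed in this article.
RULES = [
    TermRule("forms", "Formen", "Ausführungsformen"),
    TermRule("capillary", "Kapillare", "Kapillarelektrophorese"),
    TermRule("continuous", "durchgehend", "Dauerlichtquellen"),
]


def flag_terms(source, target, rules=RULES):
    """Return one warning per rule whose source term and suspect rendering both occur."""
    warnings = []
    for rule in rules:
        # Whole-word matching, so "Formen" does not fire inside "Ausführungsformen".
        in_source = re.search(
            rf"\b{re.escape(rule.source_term)}\b", source, re.IGNORECASE
        )
        in_target = re.search(rf"\b{re.escape(rule.suspect)}\b", target)
        if in_source and in_target:
            warnings.append(
                f"'{rule.suspect}' for '{rule.source_term}': "
                f"consider '{rule.suggestion}'"
            )
    return warnings


if __name__ == "__main__":
    # Source/target pair from the electrophoresis example above.
    for warning in flag_terms(
        "electrophoresis (e.g., capillary)",
        "Elektrophorese (z. B. Kapillare)",
    ):
        print(warning)
```

A check like this only tells the translator where to look; whether the flagged rendering is actually wrong in context remains a human decision.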

Following my experiment, I can confirm DeepL is indeed more precise and nuanced than any of the other machine translations that I’ve previously seen floating around the internet. So should we translators see DeepL as a threat? Will it disrupt the translation industry? I don’t believe it will. Machine translation is becoming more and more widespread, but I am convinced human input will always be required for many text types.

Any change that looks potentially disruptive brings both threat and opportunity. It’s ultimately all about how we respond to such changes! It’s also worth remembering there is a shortage of translators (read: good translators) across the board, while translation volumes are increasing year by year. So there is no alternative to additionally employing machine translation for all the easier-to-handle texts that need to be translated.

Machine translation or MT (also often referred to as instant, automated or automatic translation) was pioneered in the 1950s, and although this has taken a very long time, machines are gradually becoming better at translating. We have to acknowledge they are now no longer producing the hopeless gibberish of the early days of MT.

Until recently I have been sceptical about the viability of post-editing machine translations as a new field of work in professional translation, simply because the MT output has typically been poor. But following these latest developments, I wonder whether it is now worth exploring a bit more. Although DeepL hasn’t set out its vision yet, I wouldn’t mind if DeepL were made available to professionals at some stage, perhaps as a plug-in for the CAT software that we use.

If computers are indeed becoming more and more capable of taking over the boring bits of our work, then this can only be a welcome move forward. For it’ll mean we will at last be able to concentrate and spend more time on the bits in our texts that are actually interesting, that are blissfully complex and therefore worth getting our teeth into!