Does In-Ear Speech-to-Speech Translation Technology Really Work?

A Review of the Lingmo Translate One2One

I’ve been testing and writing about interpreting technology since 2015. Until now, all the technologies I’ve reviewed have been software. This review, however, takes us into new territory. We’re going to take a look at one of the first wearable speech-to-speech translation devices available on the consumer electronics market: the Lingmo Translate One2One [1].

You may wonder why a professional interpreter would take the time to review a wearable translation device designed for use by travelers. The reason is simple. No one, to my knowledge, has given one of these new devices a thorough review to see what they can do. Do they live up to their claims? Will they actually help people communicate? Do they live up to all the hype about them in the media? And perhaps most importantly, will they put professional interpreters out of business? (That is the question lurking in the back of many an interpreter’s mind.)

I wanted to find out for myself. So, in June of last year, I plunked down $229 of my hard-earned cash and ordered a pair of Lingmo Translate One2One headsets. They were supposed to ship within a few weeks, but they didn’t arrive until late December.

This is the first wearable translator I’ve reviewed, but I hope to review a couple more in the coming months [2] because I want to be able to say authoritatively, based on my own experience, what these devices can and can’t do and whether they truly perform as promised. As they say, knowledge is power. Let’s get started!

How Does It Work?

The Translate One2One concept goes like this: you wear an earpiece over one ear while participating in an interpreted dialog. When you want to say something, you tap the screen on the earpiece with your finger and speak normally. Once you finish speaking you tap the screen again and the IBM Watson artificial intelligence processes the speech, converts it to text, runs it through a machine translation algorithm, produces the translated text in the target language, and then uses speech synthesis to pronounce the translated text. The person who speaks the other language wears a similar earpiece and goes through the same steps when she or he wants to reply. Currently, the One2One can translate between any combination of Arabic, Chinese (Mandarin), English (U.S. and U.K.), French, German, Italian, Japanese, Portuguese (Brazilian), and Spanish.
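
As a rough illustration of the chain just described (speech recognition, then machine translation, then speech synthesis), here is a minimal Python sketch. Everything in it is a hypothetical stand-in: the stage functions, the `TranslationPipeline` name, and the one-entry phrasebook are assumptions made for illustration, not Lingmo’s or IBM Watson’s actual APIs. Audio is faked as UTF-8 bytes so the example runs without any speech service.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TranslationPipeline:
    """Chains the three stages of a speech-to-speech translator."""
    recognize: Callable[[bytes, str], str]     # audio, source language -> text
    translate: Callable[[str, str, str], str]  # text, source, target -> text
    synthesize: Callable[[str, str], bytes]    # text, target language -> audio

    def run(self, audio: bytes, source: str, target: str) -> bytes:
        text = self.recognize(audio, source)                # speech to text
        translated = self.translate(text, source, target)   # machine translation
        return self.synthesize(translated, target)          # text to speech


# Hypothetical stand-ins: "audio" is just UTF-8 text in this sketch.
def fake_recognize(audio: bytes, language: str) -> str:
    return audio.decode("utf-8")


# A one-entry phrasebook playing the role of the translation engine.
PHRASEBOOK = {("en", "es", "where is the hotel"): "dónde está el hotel"}


def fake_translate(text: str, source: str, target: str) -> str:
    # Unknown phrases pass through untranslated.
    return PHRASEBOOK.get((source, target, text.lower()), text)


def fake_synthesize(text: str, language: str) -> bytes:
    return text.encode("utf-8")


pipeline = TranslationPipeline(fake_recognize, fake_translate, fake_synthesize)
spoken = pipeline.run(b"Where is the hotel", "en", "es")
print(spoken.decode("utf-8"))
```

Each stage only sees the output of the one before it, so an error in recognition or translation carries straight through to what the listener hears.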

How do the earpieces accomplish this? Like every other speech-to-speech translator I’ve seen, the Lingmo device requires an internet connection, either over Wi-Fi or a mobile data network. One of the things that attracted me to the Translate One2One earpiece is that it doesn’t have to be tethered to a cell phone (like the Google Pixel Buds or the Waverly Labs Pilot) to work. To make that possible, Lingmo simply built the mobile phone into the earpiece. So, in essence, the process is the same, but the presentation is different.

One of the earpieces is designed for use with mobile data networks in the Americas (1900 MHz) and the other for Europe, Africa, Asia, and Oceania (2100 MHz). The catch is that you have to purchase a micro SIM card that works with the mobile data networks in the part of the world you’re in, insert it into the matching earpiece, and pay for the mobile data the earpiece uses once connected. If you live in the Americas and travel to Asia, you’ll have to purchase a micro SIM card in Asia and insert it into the appropriate earpiece for data network access. So, you still have to pay for access to a mobile data network even though it doesn’t go through a mobile phone.

The earpieces also work with Wi-Fi, but Lingmo notes that this will slow their reaction time. Lingmo uses Bluetooth technology to connect the mobile or Wi-Fi-enabled earpiece to the second earpiece, which is not directly connected to the internet. The second earpiece sends all its data back to the first, which in turn sends it to the Lingmo IBM Watson servers for processing. Once processed, the data is sent back to the first earpiece and then relayed to the second through the Bluetooth connection. Wait times for translations were usually between four and six seconds. That is definitely not real time, but it is speedy nonetheless considering all the steps the data has to go through.

Product Design, Fit, and Finish

The Translate One2One earpieces are designed to fit over your right ear, and are quite bulky by today’s minimalist earbud standards. Each earpiece has a square touchscreen (three centimeters square) and two mechanical buttons at the top (the on/off switch and the back button). Upon closer inspection, I noticed that the folks at Lingmo had repurposed a smartwatch form factor to create the earpieces. The watchband lugs (where the wristband usually attaches to the watch) are secured to a plastic frame that allows the earpiece to rest over the top of your right ear once it’s attached to a soft rubber headband that comes with the earpiece. That’s right, you are basically hanging a smartwatch over your ear.

All this gives the earpiece a thrown-together feel. The on/off and back buttons are so close to the plastic frame that they’re not easy to find and press without looking at the earpiece (especially for someone with big fingers), which is hard to do when it’s hanging on your ear. The one-size-fits-all headband is not very comfortable. It didn’t fit my head very well and didn’t keep the earpiece in the right place over my ear when I turned my head. I constantly felt like I needed to readjust it so it didn’t slip off the back of my head. Also, only being able to wear the earpiece over the right ear is problematic. What happens if you prefer your left ear or are deaf in your right? These observations aside, the earpiece generally stayed over my ear, and the earpiece and headband were comfortable enough to wear for 10- to 15-minute stints. On the positive side, each earpiece comes with magnetic charging points that make it easy to connect the charging cable.

User Experience

User experience is the single most important factor of any piece of consumer technology, especially speech-to-speech translators. If the technology hopes to wow the end user, it has to work right out of the box, be extremely simple to use, and require almost no training to make it work. What’s more, the device needs to turn on quickly and require as few steps as possible to provide the service. Otherwise, users will seldom have the patience to use the technology.

Unfortunately, trying to use the One2One earpieces to communicate was an exercise in frustration, but not for the reasons you might think. Out of the box, it took me several attempts before I could get the correct earpiece connected to the internet and then connected to the other earpiece by Bluetooth. One complicating factor is that each earpiece is a full-fledged miniature smartphone running the Android operating system, complete with 24 pre-loaded apps and the possibility to purchase and download even more from Google Play.

Connecting the main earpiece to Wi-Fi was a challenge. I had to type in a long and complicated network password on a microscopic keyboard on the square screen (remember, it’s only three centimeters). I think it took me at least five attempts before I finally got my extra-large fingers to tap the right keys on that tiny screen. The size of the text on the screen appeared to be between six- and eight-point type—not easy for my middle-aged eyes to read.

I never did get the 1900 MHz earpiece for the Americas working with a micro SIM card, in spite of the fact that I tried three different ones and even went to a local AT&T store to get technical support. The technician couldn’t get the earpiece to connect to the mobile data network either. So, all my tests were conducted using Wi-Fi.

Once you’ve connected the earpieces and pre-selected your preferred language combination, it only takes three taps to begin a conversation. You touch the screen, hear a beep, and begin speaking. In theory, I was told by a Lingmo representative, you should be able to speak as long as necessary to complete your thought and then touch the screen again to begin the translation process. In all my tests, however, the earpiece would not let me talk for more than four seconds at a time before it beeped again and began the translation process. So, at best, the One2One will really only work for simple conversations in short chunks. No complex sentences.

One other oversight that makes the earpieces difficult to use is the lack of volume control. When I first tested them with a conversation partner, the initial beep to begin talking for the Spanish speaker was uncomfortably loud, so much so that she immediately pulled the earpiece off after wincing in pain. Everyone’s hearing is different and not providing a simple way to adjust the volume is a serious design flaw.

Software Performance

Let me start with the positives. The speech recognition in the languages I was able to check (English, German, Portuguese, and Spanish) was very good. It isn’t perfect, but I didn’t expect it to be. That said, it’s more than adequate for the application. The speech synthesis was also quite impressive in the languages I tested. It was clear and easy to understand. The U.S. English had a standard Midwestern accent, the Spanish a strong Peninsular accent, and the Portuguese was notably Brazilian (even though the earpiece displays the Portuguese flag, which is something I’m sure any Portuguese user will note immediately).

The IBM Watson Language Translator [3] was spotty at best. Although it did produce an accurate sentence from time to time, most of the translations were off, many terribly so. The software has no ability to recognize and process intonation for meaning. As a result, most questions are translated as declarative sentences. This is problematic because the most likely use case will be asking questions to solicit information. Polysemic words are frequently mistranslated. In addition, the four-second time limitation allows for almost zero context, so the entire setup just isn’t very practical. Lingmo claims that Translate One2One is 85% accurate, but what does that even mean? Does that mean that 85% of each sentence is accurate and 15% is totally wrong, or that the translations are completely accurate 85% of the time? I went through the Lingmo website and the instruction manual, but found no basis for that statistic. It’s unclear whether the 85% accuracy rate is based on BLEU (Bilingual Evaluation Understudy) scores [4] or some other company-specific metric.
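
To show why an accuracy figure needs a definition, here is a small Python sketch that scores the same output under the two readings mentioned above: the share of words that are right versus the share of sentences that are entirely right. The four sentence pairs are invented for illustration and are not measurements from the One2One, and the word-by-word comparison is a deliberately crude assumption of mine, not BLEU and not whatever metric Lingmo uses.

```python
def word_level_accuracy(pairs):
    """Share of words that match the reference, compared position by position."""
    correct = total = 0
    for reference, output in pairs:
        ref_words, out_words = reference.split(), output.split()
        correct += sum(r == o for r, o in zip(ref_words, out_words))
        total += len(ref_words)
    return correct / total


def sentence_level_accuracy(pairs):
    """Share of sentences that match the reference exactly."""
    exact = sum(reference == output for reference, output in pairs)
    return exact / len(pairs)


# Made-up (reference, machine output) pairs, five words each.
SAMPLE = [
    ("where is the train station", "where is the train station"),
    ("i need a doctor now", "i need a doctor today"),
    ("how much is the room", "how much is the bedroom"),
    ("please call a taxi now", "please call a taxi now"),
]

print(f"word-level: {word_level_accuracy(SAMPLE):.0%}")          # 18 of 20 words
print(f"sentence-level: {sentence_level_accuracy(SAMPLE):.0%}")  # 2 of 4 sentences
```

The same output scores 90% by one reading and 50% by the other, which is why a bare percentage says little without the method behind it.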

Summing It All Up

This is not the review I wanted to write. I spent several weeks testing the One2One earpieces, learning to use them, and creating test dialogs to see if they would deliver as promised. Dialogs included conversations with doctors and taxi drivers, hotel check-ins, and being pulled over by police for speeding. The One2One earpieces struggled with every scenario, not necessarily because of translation problems, but often because of technical problems like the Bluetooth connection failing between the headsets or the app crashing. Even so, the IBM Watson machine translation was still comically poor when all other technical processes worked appropriately. You can watch the Translate One2One earpieces in action and see for yourself how they performed [5].

The target market for the One2One is the world traveler who needs to communicate as she or he goes from one country to the next. That’s the right target market for this kind of consumer electronics. Unfortunately, Lingmo severely underdelivers for its own target market, mainly because of product design flaws and a frustrating user experience, not necessarily because of the machine translation software.

What’s the bottom line for professional interpreters working in legal, medical, and conference settings? Don’t expect this technology to take away your job anytime soon. If anything, the proliferation of these translation devices reflects a huge demand for interpreting services—a demand that we, as a profession, still haven’t figured out how to meet adequately.

Remember, if you have any ideas and/or suggestions regarding helpful resources or tools you would like to see featured, please e-mail Jost Zetzsche at jzetzsche@internationalwriters.com.

Notes
  1. Check out the Lingmo Translate One2One website at http://bit.ly/TranslateOne2One.
  2. I recently obtained a “Travis the Translator,” a handheld speech-to-speech translation device designed in the Netherlands, and am preparing a review of this device as well.
  3. For a demo of the IBM Watson Language Translator, check out http://bit.ly/Watson-demo.
  4. BLEU (Bilingual Evaluation Understudy) scores, https://en.wikipedia.org/wiki/BLEU.
  5. “The Tech-Savvy Interpreter: A First Look at the Lingmo Translate One2One Earpiece,” http://bit.ly/One2One-demo.

Barry Slaughter Olsen is a veteran conference interpreter and technophile with 25 years of experience interpreting, training interpreters, and organizing language services. He is an associate professor at the Middlebury Institute of International Studies at Monterey, the founder and co-president of InterpretAmerica, and general manager of multilingual operations at ZipDX. He is also a member of the International Association of Conference Interpreters. For updates on interpreting, technology, and training, follow him on Twitter @ProfessorOlsen. Contact: bsolsen@middlebury.edu.

The ATA Chronicle © 2018 All rights reserved.