Skip to content
FacebookTwitterLinkedinYoutubeInstagram
  • Join ATA
  • Renew
  • Shop ATAware
  • Contact Us
  • Log In Welcome, My Account
American Translators Association (ATA)
Find a Language Professional
  • Certification
    • Certification
      • Guide to ATA Certification
      • What is a Certified Translation?
      • How the Exam is Graded
      • Review and Appeal Process
      • Looking for more information?
    • Taking the Exam
      • About the Exam
      • How to Prepare
      • Practice Test
      • Exam Schedule
    • Already Certified?
      • Put Your Credentials To Work
      • Continuing Education Requirement
    • Register Buttons
      • Register for Exam
         
      • Order Practice Test
  • Career and Education
    • For Newcomers
      • Student Resources
      • Starting Your Career
      • The Savvy Newcomer Blog
    • For Professionals
      • Growing Your Career
      • Business Strategies
      • Next Level Blog
      • Client Outreach Kit
      • Mentoring
    • Resources
      • For Educators and Trainers
      • Tools and Technology
      • Publications
      • School Outreach
    • Event Buttons
      • Visit ATA66
      • Upcoming Webinars
  • Client Assistance
    • Client Resources
      • Why Should I Hire a Professional?
      • Translator vs. Interpreter
      • Buying Language Services
    • More Client Resources
      • Need a Certified Translation?
      • What is Machine Translation?
      • The ATA Compass Blog
    • Find a Translator Button
      • Find a Language Professional
  • Events
    • Events
      • Annual Conference
      • Free Events for ATA Members
      • Certification Exam Schedule
    • More Events
      • Virtual Workshops and Events
      • Live and On-Demand Webinars
      • Calendar of Events
    • Event Buttons
      • Visit ATA66
      • Upcoming Webinars
         
  • News
    • Industry News
    • Advocacy and Outreach
    • The ATA Chronicle
    • The ATA Podcast
    • ATA Newsbriefs
    • Press Releases
  • Member Center
    • Member Resources
      • Join ATA
      • Renew Your Membership
      • Benefits of Membership
      • Divisions & Special Interest Groups
      • Chapters, Affiliates, Partners, and Other Groups
      • Get Involved
      • Member Discounts
      • Shop ATAware
    • Already a Member?
      • Member Login
      • Connect with Members
      • Credentialed Interpreter Designation
      • Become a Voting Member
      • Submit Member News
      • Submit Your Event
      • Contact Us
    • Member Buttons
  • About Us
    • About ATA
      • Who We Are
      • Honors and Awards Program
      • Advertise with Us
      • Media Kit
    • How ATA Works
      • Board of Directors
      • Committees
      • Policies & Procedures
      • Code of Ethics
      • ATA Team
    • Contact Button
      • Contact ATA
  • Join ATA
  • Renew Your Membership
  • Contact Us
  • Log In
  • Find a Language Professional
March 21, 2017

Morphing into the Promised Land

Resources, Tools and Technology
Source: The ATA Chronicle

I’ve been very interested in morphology in translation technology. No, let me put that differently: I’ve been very frustrated that the translation environment tools we use don’t offer morphology. There are some exceptions—such as SmartCat, Star Transit, Across, and OmegaT—that offer some morphology support. But all of them are limited to a small number of languages, and any effort to expand these would require painful and manual coding.

Other tools (e.g., memoQ) have decided that they’re better off with fuzzy recognition rather than specific morphological language rules, but that clearly is not the best possible answer either.

So, what’s the problem? And what’s morphology in translation environment tools about in the first place?

Well, wouldn’t it be nice to have all inflected forms of any given word in your source text be associated automatically with the uninflected form that’s located in your termbase or glossary, and have that displayed in your terminology search results? And does it feel a little silly to even have to ask that question at a point when it should be a no-brainer to have any given tool provide that service? In case you wondered, the answer to these questions is “Yes, yes, resoundingly yes!”

On the other hand, there’s a reason why we’re stuck where we are. It happens to be cost. If you really have to manually enter morphology rules for all languages, it quickly becomes a Sisyphean exercise (starting with: “What exactly are all languages?”). If you do it just for the “important” languages (which, at least in the eyes of technology vendors, means “profitable”), you end up with the situation we already have with the tools mentioned above.

A few years ago, a group of folks, including myself, had the idea to crowdsource the collection of morphology rules for and with each language-specific group of translators. Once the rules were collected, they could then be integrated into the various technologies. It sounded good, but it was hard to get the project started due to a lack of funds to build the necessary infrastructure and/or the time it would have taken to raise funds, among other issues.

Enter translation environment tool Lilt with a very cool proposal that may very well be the solution. Lilt’s latest version introduces a “neural morphology” engine for all presently supported languages minus Chinese (so: English, Danish, Dutch, French, German, Italian, Norwegian, Polish, Portuguese, Russian, Spanish, and Swedish).

Here is the honest truth, though. When I first read the press release some time ago, I rolled my eyes fondly and thought to myself that the folks from Lilt were just thinking it was wise to throw a little “neural” around while it’s hot.

It turns out I was mistaken, however, as I found out when I talked with Lilt’s John DeNero, who is the architect of this part of Lilt’s system. John tried to explain to me what the system does and why it can make a big difference. It was not so hard to understand the second part, but my feeble untechnical mind had a hard time with the first part.

(By the way, we always assume that it’s us, the less-technically-inclined, who are to be pitied when we don’t understand technology. But can you imagine how pitiful life is for the more-technically-inclined who have to speak baby talk when communicating to us?)

An article by Radu Soricut and Franz Och, “Unsupervised Morphology Induction Using Word Embeddings,” provides a good summary of the system.1 It essentially analyzes large monolingual corpora, detects morphological modifications (in theory, they could be any kind of modification; in practice, Lilt focusses on suffixes right now), and classifies them. Since any word is evaluated and also classified within a context, the system is able to distinguish between the adverbial ending -ly in English when it encounters “gladly” versus “only.” Using the same contextual analysis, the system is also able to make very educated guesses about the morphological transformation of unknown words. (For instance, it might never have encountered “loquacious,” but chances are it would assume—correctly—that the adverbial transformation would be “loquaciously.”)

This works with every language that uses morphology (therefore excluding Chinese, for instance), provided there is enough corpus material to train the system. The time it takes for a new language to be trained is about two and a half days (on very powerful computers). That’s it.

Now, it’s not perfect (what is??). John was very open in his assessment about where the system fails. It tends to fail with irregular morphology (it might not recognize “geese” as the plural of “goose” or “well” as the adverbial form of “good”), and there are about 5% of all cases where John felt that the engine should have made a correct judgment but did not.

On the other hand, terminology hits have increased by a third for its users since Lilt introduced the system two weeks ago.

I consider this a quantum leap—in particular because it will not only benefit the large European and Asian languages (where applicable), but the long tail end of other languages as well.

You might say, Lilt covers only a handful of languages, so doesn’t that end up being the same thing? The answer to that is (a two-fold) “no.” First, you can expect Lilt to continue to add languages, and—even more importantly—the module used to build these neural morphology engines is open-source and available for every translation technology developer online.2

Here is what John said about the available engine and its usability:

Here’s our open-source release of the morphology system. It’s released as an academic project and doesn’t have any formal support, so it’s not a product. If someone wanted to use it, they would have to figure it out on their own (though, of course, I’m happy to answer questions).

So, get on it Kilgray, SDL, Atril, Wordfast and, and, and….

It’s also very promising that there are other areas where morphological knowledge can be used by a translation system. How about actively changing the inflection of a term that is automatically inserted based on its usage in the source? Or how about changing that inflection when repairing fuzzy matches? Or when repairing machine translation suggestions?

The sky’s the limit with this. Be creative!

Notes
  1. Radu Soricut, Radu, and Franz Och. “Unsupervised Morphology Induction Using Word Embeddings” https://www.aclweb.org/anthology/N/N15/N15-1186.pdf
  2. Oscii Lexicon,https://github.com/oscii-lab/lex.

Jost Zetzsche is the co-author of Found in Translation: How Language Shapes Our Lives and Transforms the World, a robust source for replenishing your arsenal of information about how human translation and machine translation each play an important part in the broader world of translation. Contact: jzetzsche@internationalwriters.com.

This column has two goals: to inform the community about technological advances and at the same time encourage the use and appreciation of technology among translation professionals.

Share this

Posts navigation

← Is Twitter Stupid?
Treasurer's Report for the First Five Months of FY2016–17 →

Latest Posts

  • ATA Statement on Artificial Intelligence May 20, 2025
  • Pennsylvania Recruiting Bilingual Workers with a Pay Incentive Pilot Program May 5, 2025
  • A County in Illinois Rolls Out “I Speak” Cards as Part of April’s “Language Access Month” May 5, 2025
  • Trump Administration Cuts Funding for Ukrainian Literature Translations at Harvard May 5, 2025
  • Washington State Senate Passes Bill Enhancing Court Interpreting Services for Non-English Speakers May 5, 2025

Topics

  • Advocacy & Outreach
  • Annual Conference
  • Book Reviews
  • Business Strategies
  • Certification Exam
  • Certification Program
  • Client Assistance
  • Educators and Trainers
  • Growing Your Career
  • Industry News
  • Interpreting
  • Member Benefits
  • Member News
  • Mentoring
  • Networking
  • Public Outreach
  • Publications
  • Resources
  • School Outreach
  • Specializations
  • Starting Your Career
  • Student Resources
  • Tools and Technology
  • Translation
Language Services Directory
ata_logo_footer

American Translators Association
211 N. Union Street, Suite 100
Alexandria, VA 22314

Phone +1-703-683-6100
Fax +1-703-778-7222

  • Certification
  • Career and Education
  • Client Assistance
  • Events
  • News
  • Member Center
  • About Us
  • Member Login
  • Contact Us
  • Sitemap
  • Privacy Policy
  • Accessibility Statement
  • Submit Feedback

© 2025 - American Translators Association

Find a Language Professional
Scroll To Top
By clicking accept or closing this message and continuing to use this site, you agree to our use of cookies.I AcceptPrivacy Policy