Sunday, December 01, 2024

The Power of Large Language Models for Translations

One of the earliest use cases of large language models, i.e. text-based generative AI, was translating text from one language into another - a literal transformation of source to target, if you will. Greetings, almighty transformer model.

Before that, translation was often based on rule-based and/or statistical models such as Hidden Markov models, and later on variants of recurrent neural networks (RNNs) such as LSTMs. But then the transformer came and blew everything that came before out of the water.

While many more powerful, novel and interesting applications of generative AI for text have taken center stage in the last months and years, the translation of text from one language into another still feels magical. Just a few years back we would have searched for individual words in our offline or online dictionaries and translated between German and English, or any other language, word by word. We would have depended on our own or someone else's expertise in forming natural and grammatical sentences. We would have looked up similar uses of a word, reference texts and the like whenever we wanted to know more or simply weren't sure.

Now we enter our words, phrases, sentences, paragraphs or even whole documents into a translation service like Google Translate or DeepL's translator. We can give guidance in terms of tone, grammaticality, verbosity and target audience, and generally steer the generation towards our intended use and recipients.

You may have experienced PowerPoint's live translation of speech. While we browse through our slides, talking in whatever language, English subtitles are pushed out almost synchronously. Yes, sometimes it goes wrong; the right word is not always used. But just a few years back, live translation was a service reserved for high-profile events or exceptional settings. Today, it is available within a consumer application for a moderate price! We simply press a button and in a matter of seconds - Hola amigo. There you go: understandable utterances. Sure, this might not be at the level required to win the Pulitzer Prize, but hey! It comes in the language of your choice! Yes, the target language needs to be a somewhat actively spoken one, otherwise only limited training data is available. But you get my point.

Pair this with generative models' power to take a few samples of our voice and recreate our speech bit by bit. Pair this with generative AI's power to create somewhat realistic images and avatars of us - et voilà. You have a virtual replica of yourself. Explore the countries of the world. Even if you do not speak the language of the country you are visiting at that moment, there is a good chance that your avatar does! These are magical times indeed.
