Text to Speech accuracy in LingQ

Hi everyone,

How accurate would you say the LingQ Text to Speech voice is for learning single-word pronunciation?

After a few reads of a transcript, I go through it in sentence mode. Then I click on the LingQs below to get the audio and shadow the pronunciation. The Text to Speech voice (Carla) sounds pretty accurate. I believe it is a real person rather than an AI voice, unless I’m mistaken.

Any thoughts on the accuracy of the TTS voice for pronunciation practice?

It’s baseline accurate, but TTS is a (supplementary) tool for listeners who don’t have access to native audio rather than something to be used as a model of how the words are to be pronounced.

Thanks Gigusek - good to know it sounds fairly accurate to someone else’s ears. I have the native audio since I import from YouTube. However, I can only play this audio in whole sentences, and I like to home in on single words so I can concentrate on how each vowel is pronounced. I use native audio to get the intonation and melody of the sentence.

I think it also depends on the language being learned. In Chinese, I think the tone of a word can change depending on the tone of the next word (for example, 不 bù changes to bú when it is followed by a 4th-tone character). I think TTS doesn’t always account for this, but I could be wrong.
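
To make the 不 rule concrete, here is a minimal Python sketch. It only illustrates the rule described above and says nothing about how any TTS engine handles it internally. The function name `apply_bu_sandhi` and the input format (character paired with numbered pinyin such as `bu4`) are assumptions made for this example.

```python
def apply_bu_sandhi(syllables):
    """Apply the 不 tone-sandhi rule to a list of (character, numbered_pinyin) pairs.

    Hypothetical helper: 不 (bu4) becomes bu2 when the next syllable is 4th tone.
    """
    result = []
    for i, (char, pinyin) in enumerate(syllables):
        # Check whether a following syllable exists and carries the 4th tone.
        next_is_fourth_tone = (
            i + 1 < len(syllables) and syllables[i + 1][1].endswith("4")
        )
        if char == "不" and pinyin == "bu4" and next_is_fourth_tone:
            pinyin = "bu2"
        result.append((char, pinyin))
    return result


print(apply_bu_sandhi([("不", "bu4"), ("是", "shi4")]))
# [('不', 'bu2'), ('是', 'shi4')]
print(apply_bu_sandhi([("不", "bu4"), ("好", "hao3")]))
# [('不', 'bu4'), ('好', 'hao3')]
```

Comparing what a TTS voice says for 不是 against the expected bú shì is one quick way to check whether a given voice applies the change.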

On the word level I’d say it’s generally very good. The real challenge, however, starts at the sentence level, and here I find the performance to be rather poor, or at least below par. Personally, I find Microsoft’s Azure voices to be much better across the board.
An evaluation is quite difficult because availability varies among languages and platforms (Android, iOS, web). The most common TTS provider on LingQ seems to be AWS Polly (Text to Speech Software – Amazon Polly – Amazon Web Services), which is unfortunately a little on the robotic side. AWS offers a higher-quality version, codenamed ‘neural’, but that comes at 4x the price, which I assume is the reason we don’t hear it on LingQ. On iOS you can try the system’s built-in TTS instead of the standard online one; those voices can be pretty good as well and come with the responsiveness of an offline system.


I have to say I’m a huge TTS fan. I’ve used this technology extensively in conjunction with LingQ; for example, I went from zero to reading in Romanian. Because I was unable to find any suitable content, I just imported the news every day and read the articles while listening to the TTS, using the ‘print lesson’ feature together with the ‘read aloud’ feature in the MS Edge browser. This has been discussed a couple of times already [1][2].
Unfortunately, this approach is broken for Chinese languages because LingQ decided to insert spaces between the characters (Chinese doesn’t use spaces). This understandably confuses the TTS, which assumes the user meant to indicate a pause, resulting in a stuttery mess. But I definitely used this feature before they broke it. Also, I’m pretty sure Azure’s TTS observes tone sandhi (@Alicia05).

[1] Text To Speech (Turkish) - Language Forum @ LingQ
[2] Argentinian Tts - Language Forum @ LingQ

Ah, good tip! I just tried using the Edge/Azure voices for Chinese and they work like a charm!

You are also right about the spaces being added in Chinese LingQ lessons. After “printing” the lesson, I removed the spaces using an online tool (www.dcode.fr/spaces-remover) and then copied and pasted that text into a notepad. I saved that text file and then opened it in Edge.
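
For anyone who would rather do the space-removal step locally, here is a rough Python sketch. It is not what the online tool does internally; the function name `remove_cjk_spaces` and the character ranges are assumptions for this example. It removes only spaces and tabs that sit between two Chinese characters (or Chinese punctuation), so spaces inside any Latin text are left alone.

```python
import re

# Assumed character ranges: the basic CJK Unified Ideographs block, CJK
# punctuation, and fullwidth forms. Extend these if your text uses rarer characters.
CJK = r"\u4e00-\u9fff\u3001-\u303f\uff01-\uff5e"

# Spaces or tabs that are both preceded and followed by a CJK character.
SPACE_BETWEEN_CJK = re.compile(rf"(?<=[{CJK}])[ \t]+(?=[{CJK}])")


def remove_cjk_spaces(text):
    """Delete whitespace between CJK characters, leaving other spacing intact."""
    return SPACE_BETWEEN_CJK.sub("", text)


print(remove_cjk_spaces("我 不 是 学 生 。"))
# 我不是学生。
print(remove_cjk_spaces("I use LingQ 每 天"))
# I use LingQ 每天
```

The cleaned text can then be saved to a plain text file and opened in Edge as before.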

Thanks!

I’m actually a fan of TTS too now. I basically use it every day for every language at the moment. It’s just very practical to use for any content I decide to upload, without much hassle.

Unfortunately, it’s not always good. In English it’s definitely better, but in other languages you need to test different voices to see which is best.

I haven’t completely understood whether there is a substantial difference between the web voices and the device voices, an option that exists on iOS.

I wish there were some sort of update to this feature and that LingQ would consider upgrading to more natural, better-performing voices.

But it’s a great tool to have.

Thanks, I ain’t tried it yet but after reading your post, now I will try it.

Those Microsoft Azure TTS voices are on another level!

I started using a Chrome extension called Language Reactor recently. It shows subtitles for videos. I clicked on the TTS voice for a word, then listened to the pronunciation of the YouTube presenter, and they sounded the same! I kept trying again and again with the same result. This Edge / Azure voice sounds so accurate that I decided to use it for single-word pronunciation practice.

I used to use the TTS in LingQ, but its voices don’t seem anywhere near as good.

Thanks for the info Bamboozled!