Best practice for creating courses with audio track?

Hi,

What is the best practice these days to have a course created where the lessons are split as the creator prefers, instead of arbitrarily, and each lesson is matched with the respective audio track?

Previously I did the following:

  1. Take an Ebook, convert to txt
  2. Split it into chapters, one text file per chapter
  3. Create mp3 files matching the chapters
  4. Upload the text files using the LingQ Bulk Upload script
  5. Go back to the UI and manually merge the lessons that LingQ split
  6. Upload mp3 files manually for each merged lesson.

I tried to update the LingQ Bulk Upload script (originally developed by Shaun Patterson and Colin Johnstone) to work with the current LingQ website, to no avail. The uploader doesn’t work because the UI has changed, and the new editor is implemented in a way that makes it quite hard to automate with Selenium (it is awkward enough for humans).

So, how do you folks do this nowadays?

It seems the “REST API Documentation” page of the LingQ 1.0 documentation covers the API, and the API key can be found on the “Login - LingQ” page, so with a little bit of shell scripting and “curling” it’s easy to achieve step 4 above.

However, the lessons are still split into segments, and merging split lessons seems harder than before (because of the new editor).

BTW, has anyone figured out how to “embed” an mp3 file into the JSON file that is used to create a lesson?

Hi,
LingQ splits lessons at about 2000 words (± 10% or so). You can’t really create lessons that exceed this limit. This used to be different before LingQ5. Technically it is possible to override this limit by removing all whitespace; I discovered this by accident. So I possibly have the longest LingQ lesson at 13764 words… But it goes without saying that LingQ doesn’t work properly in this scenario.
Long story short: I believe you need to content yourself with 2000-word lessons.
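
If you would rather choose the split points yourself than let LingQ cut at an arbitrary place, one option is to pre-split each chapter at paragraph boundaries before uploading. The following is only a rough sketch: the function name is made up, the 2000-word default is the approximate figure mentioned above, and counting words by whitespace is an assumption that may not match how LingQ counts them.

```python
def split_into_lessons(text: str, max_words: int = 2000) -> list[str]:
    """Group paragraphs into chunks of at most max_words words each.

    Paragraphs are separated by blank lines. A single paragraph that is
    longer than max_words is kept whole in a chunk of its own.
    """
    chunks: list[str] = []
    current: list[str] = []
    count = 0
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        n = len(para.split())  # whitespace-based word count (assumption)
        # Start a new chunk if adding this paragraph would exceed the limit.
        if current and count + n > max_words:
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(para)
        count += n
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

Each returned chunk could then be uploaded as its own lesson; the matching audio would have to be cut at the same points separately.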

Regarding the API: the documentation is old and incomplete. LingQ5 introduced version 3 of the API, and I still don’t see any documentation for it. I communicated this to Mark Kaufmann some time ago, but I believe addressing it is low priority.
My recommendation is to open the Developer tools of your browser, go to the network tab and inspect the requests when importing a lesson using the web interface. You can then use this as a starting point.

I just tried to upload a lesson with audio:
curl --location --request POST 'https://www.lingq.com/api/v3/it/lessons/' \
  --header 'Authorization: Token 12345' \
  --form 'collection="1015050"' \
  --form 'title="Example"' \
  --form 'text="Lorem ipsum."' \
  --form 'language="it"' \
  --form 'status="private"' \
  --form 'audio=@"/Users/bamboo/1.mp3"'

This worked for me. I’m no expert so I don’t think I can be of further help. But @ColinJohnstonov is still active on LingQ. Since you seem to be using some of his code, maybe you can get in touch?
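
For a whole book, the form fields from the curl call above could be prepared per chapter by pairing each text file with the mp3 of the same name. This is just a sketch under assumptions: the function name, the folder layout (`chapter_01.txt` next to `chapter_01.mp3`) and the `audio_path` key are made up for illustration; the other keys are the form fields used in the curl command.

```python
from pathlib import Path


def pair_lessons(folder: str, language: str, collection: str) -> list[dict]:
    """Pair each chapter .txt with the .mp3 that shares its file stem."""
    lessons = []
    for txt in sorted(Path(folder).glob("*.txt")):
        mp3 = txt.with_suffix(".mp3")
        if not mp3.exists():
            raise FileNotFoundError(f"no audio found for {txt.name}")
        lessons.append({
            "collection": collection,
            "title": txt.stem,
            "text": txt.read_text(encoding="utf-8"),
            "language": language,
            "status": "private",
            # Hypothetical key: path of the file to send as the "audio" field.
            "audio_path": str(mp3),
        })
    return lessons
```

Each dictionary then corresponds to one upload request, sent with whatever HTTP client you prefer.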

I’ll bump this since I’m also curious about how people fare with version 3 of the API.

As a side note, has anyone managed to bulk-create “text to speech” audio using the new automated feature on the website? Either once the course/lesson is created or while creating it through the API.
I was looking at it the other day but I couldn’t find anything in my uploaded lesson’s JSON attributes that would trigger the TTS generation.
I don’t think it’s possible atm but I might as well ask in case someone found a way to do it.

The TTS function on the website seems to send a bunch of requests, each containing one line of text, so it doesn’t actually send the complete lesson at once. Maybe that’s why it sounds so robotic? In most languages LingQ uses Amazon’s Polly service, but frankly other TTS services sound better to me, e.g. https://azure.microsoft.com/en-us/services/cognitive-services/text-to-speech/
(which is available for free in the MS Edge browser).

This is how it looks for me in Dutch: (you would have to change the name of the voice and the provider if it doesn’t come from Amazon)
curl 'https://www.lingq.com/api/v2/tts/?language=nl&voice=Lotte&app_name=polly&text=Lorem+ipsum+dolor+sit+amet.' -H 'Authorization: Token 123'

After all the snippets are sent and the small mp3 files are received, LingQ also requests timestamps:
https://www.lingq.com/api/v3/nl/lessons/123/timestamps/
There doesn’t seem to be an API call that would trigger a complete lesson to be converted to TTS; it all seems to happen in the JS code.
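
To illustrate the per-line pattern, here is a small sketch that only builds one request URL per line of text, in the same shape as the curl call above. The function name is made up, the defaults are the Dutch values from that call, and whether the endpoint accepts anything beyond what the website itself sends is unknown to me.

```python
from urllib.parse import urlencode

# Endpoint as observed in the browser's network tab (see the curl call above).
TTS_ENDPOINT = "https://www.lingq.com/api/v2/tts/"


def tts_urls(lines, language="nl", voice="Lotte", app_name="polly"):
    """Build one TTS request URL per non-empty line of lesson text."""
    urls = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        query = urlencode({
            "language": language,
            "voice": voice,
            "app_name": app_name,
            "text": line,  # spaces are encoded as '+', as in the curl example
        })
        urls.append(f"{TTS_ENDPOINT}?{query}")
    return urls
```

Each URL would still need the `Authorization: Token ...` header when it is actually requested.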

Btw, trying to generate TTS in Chinese results in lots of errors. Something seems wrong, because LingQ makes all sorts of requests to other TTS providers that don’t seem to be available (gCloud, msspeak?).

Just last week I brought up the API topic in a conversation I had with a LingQ employee. But the most I can do is raise awareness; I’m somewhat skeptical that anything will actually change. I’m not sure if this message even reaches the LingQ web developer. I feel it could be so easy: the whole API documentation must exist, how else would the Android and iOS developers work with the API?

There is not much we can do on the API side at the moment, I’m afraid, but if you upload any text with matching audio, the lesson generator should split the text and audio file to match, i.e. upload a chapter at a time and it should all split up accordingly. You don’t really want to merge split lessons together, since longer lessons do have performance issues.

I can confirm that it works fine except for the audio on the first lesson, which is kept untrimmed instead of being split. That is, I uploaded a lesson with audio (53 mins) that was split into 4 parts, and the respective audio lengths were 53, 15, 15, and 8 mins. The audio matches the text perfectly, though. I did it with v3 of the API.

Well, that seems like too much for me, and since I didn’t intend to listen to the TTS but rather to have it there in case I share the course in the future, I guess my solution is going to be to click it manually while I read each lesson the first time. Thanks for answering nevertheless.

If that’s happening, it sounds like a bug and should be reported. Can you send an email to support with all the details, along with the audio file and text? Did you do this import using our built-in functionality on the Import page, or using the API?

Actually I gave it a try and I see now what you mean… At any rate, have you tried/managed to upload an audio file for an existing lesson? I know how to create a lesson with audio & text in a new course, but I’m unable to post only the audio if the text for said lesson was already uploaded.

curl --location --request PATCH 'https://www.lingq.com/api/v3/ro/lessons/123/' \
  --header 'Authorization: Token 123' \
  --header 'Content-Type: multipart/form-data' \
  --form 'audio=@"/Users/bamboo/1.mp3"'

Works perfectly, thanks! I don’t know why I couldn’t make it work with the multipart encoder in Python, but oh well.
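
For anyone who wants to stay in Python, a rough equivalent of the PATCH call above can be built with the standard library alone. This is a sketch, not a tested client: the function name is made up, and the only things taken from this thread are the URL shape, the `Token` header and the `audio` field name. One thing worth checking in hand-rolled multipart code is that the `Content-Type` header carries the same boundary as the body.

```python
import mimetypes
import uuid
from pathlib import Path
from urllib.request import Request


def build_audio_patch(url: str, token: str, audio_path: str) -> Request:
    """Build (but do not send) a multipart PATCH request with one audio file."""
    boundary = uuid.uuid4().hex
    audio = Path(audio_path)
    file_type = mimetypes.guess_type(audio.name)[0] or "application/octet-stream"
    # One form part named "audio", followed by the closing boundary.
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="audio"; filename="{audio.name}"\r\n'
        f"Content-Type: {file_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return Request(
        url,
        data=head + audio.read_bytes() + tail,
        method="PATCH",
        headers={
            "Authorization": f"Token {token}",
            # The boundary here must match the one used in the body.
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would actually send it.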

Hi, just a few updates / clarifications:

  • the TTS providers themselves probably only accept small snippets of text via their APIs
  • so it’s probably not even possible to hit those with 1000s of characters at once
  • it’s convenient for time stamping to work at the sentence level
  • the voice sounds robotic because LingQ uses the “standard” tier of AWS Polly
  • I created a version using Polly “neural”
  • a file is attached, first LingQ, then neural, 10 seconds each
  • it’s probably not used by LingQ because it’s 4x more expensive (see the Amazon Polly Pricing page)

| … text with matching audio
Where can I find the instructions to do this properly? I do have courses with matching text and audio files, but I don’t know how to upload audio and text so that they remain in sync in the course.

You should be able to go to the Import Lesson page, upload the audio, paste the text and then click Generate Lesson. It automatically compares the text and audio and adds timestamps in most languages.
