Extension merges some words together when importing from Netflix

Hi,

When I import the show Superstore in Turkish from Netflix, many words end up attached to the words next to them.

Example (from Superstore S01E01):
The sentence should appear as follows:
“Anladım. Enayileri çekmek için ucuz mücevher reklamı yapıp”

However, it instead appears as follows:
“Anladım. Enayileri çekmek içinucuz mücevher reklamı yapıp”

The main issue is that the two words “için” and “ucuz” are joined together and appear as one word.

There were many instances of this issue in the show I imported. I think it mainly occurs when one word is at the end of a subtitle line and the next word starts a new line.

I hope you can resolve this, as it causes obvious problems with lookups and stats in LingQ.

Thank you

Update:
I just tried importing a different piece of content (Shrek 1) and got the same issue. Note that I’m using the Chrome extension.

Are you sure the same issues don’t appear in the subtitles when you watch the show on Netflix?

The subtitles look fine when viewed in Netflix (e.g., Superstore, Shrek 1), though the instances of word merging seem to correspond to places where one word ends a line and the next word starts a new line.

I didn’t encounter the issue when I imported a video from YouTube.

UPDATE:
I just tried to import another video from Netflix and encountered the issue right on the first sentence in the show. The show is Kung Fu Panda (1):
In LingQ, the first sentence appears as:
“Efsanelerde anlatılanefsanevi bir savaşçı”

rather than:
“Efsanelerde anlatılan efsanevi bir savaşçı”

The merged words correspond to the last word on the first line and the first word on the second line when you check the caption in Netflix itself.

I hadn’t encountered this word-merging issue before the past couple of days.

Thanks. Our developers are looking into it. We will get it fixed.

Hi Zoran,
This is happening in German as well, so it must be across the board for all languages… I thought it could just be the captions themselves and hadn’t bothered to look.

As monorris points out, it looks like a carriage return/line feed is being removed but no space is left to separate the two words.
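To illustrate that guess, here is a minimal Python sketch. It is not the extension’s actual code (which I haven’t seen); the function names are made up for the example. It just shows how deleting the line break fuses the two words, while replacing it with a space keeps them apart, using the Kung Fu Panda caption quoted above.

```python
import re

def join_caption_lines_buggy(caption: str) -> str:
    # Hypothetical reproduction of the suspected bug: the line break is
    # deleted outright, so the words on either side of it are fused.
    return re.sub(r"\r?\n", "", caption)

def join_caption_lines(caption: str) -> str:
    # Replace each line break (plus any whitespace around it) with one space.
    return re.sub(r"\s*\r?\n\s*", " ", caption).strip()

# Two-line caption as it is displayed in Netflix.
caption = "Efsanelerde anlatılan\nefsanevi bir savaşçı"

print(join_caption_lines_buggy(caption))  # Efsanelerde anlatılanefsanevi bir savaşçı
print(join_caption_lines(caption))        # Efsanelerde anlatılan efsanevi bir savaşçı
```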

An example is the recap of Babylon Berlin, Season 1, Episode 3. This is very near the beginning, and you can see two cases of it in quick succession. The sentences:

" Warum ist der bei der Sitte? Warum nichtwas Anständiges? " (“nicht” and “was” are concatenated into “nichtwas”. The words are on two separate lines in the subtitles.)

" Der Oberbürgermeister von Kölnwird erpresst, Herr Dr. Adenauer. " (“Köln” and “wird” are concatenated into “Kölnwird”. These two words are on separate lines).

Hope that helps as some additional examples.

Thanks! Yes, it affects all languages. We will fix it.

Can you guys please post a link to these imported lessons with the issue here? Also, are you using the production or beta version?

It’s private though…can you access it (or the developers)?

I’ve not used the beta web version much, but after doing the LingQ import, if I respond “yes” to open the lesson, it takes me to the beta site (although it fails to open the lesson… it merely drops me at the lesson feed, if I remember correctly).

@ericb100 Thanks, we can access it. We are investigating the issue.

Note that I’m still running into this issue today when using the LingQ extension. My current workaround is to use the subadub extension (though I think there are others) to download the SRT (subtitles file) for a given piece of content on Netflix, and then import it into LingQ using the “Import ebook” feature. When done this way, the word-merging issue doesn’t show up.
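If anyone wants to check the downloaded SRT before importing it, below is a rough Python sketch that flattens a standard SRT file into plain text, joining the lines of each cue with a space. I don’t know how LingQ’s own import handles SRT files, so this is only an illustration: `srt_to_text` is a name I made up, and the timestamps in the sample are invented (only the German sentence comes from the example earlier in this thread).

```python
import re

# Standard SRT timing line, e.g. "00:00:01,000 --> 00:00:03,500".
TIMESTAMP = re.compile(r"\d{2}:\d{2}:\d{2},\d{3}\s*-->\s*\d{2}:\d{2}:\d{2},\d{3}")

def srt_to_text(srt: str) -> str:
    """Return one line of text per cue, joining wrapped lines with a space."""
    cues = []
    # Cues in an SRT file are separated by blank lines.
    for block in re.split(r"\r?\n\s*\r?\n", srt.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        if lines and lines[0].isdigit():        # numeric cue index
            lines = lines[1:]
        if lines and TIMESTAMP.match(lines[0]):  # timing line
            lines = lines[1:]
        if lines:
            cues.append(" ".join(lines))
    return "\n".join(cues)

# Sample cue: the timestamps are invented for illustration.
sample = (
    "1\n"
    "00:00:01,000 --> 00:00:03,500\n"
    "Der Oberbürgermeister von Köln\n"
    "wird erpresst, Herr Dr. Adenauer.\n"
)

print(srt_to_text(sample))
```

If the printed text has a space between “Köln” and “wird”, the subtitle file itself is fine and the merging is happening somewhere in the extension import.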

We will push a fix to production soon.