Python - uploading audio via API

There was never an answer for the last post on this thread: Python Example For Creating A Lesson For A Course? - Lang...

How do we upload audio?
How do we resolve the “The submitted data was not a file. Check the encoding type on the form.” error message?

My script thus far:

This is in my .env file:

APIKey="[the access code from Login - LingQ]"
mp3DIR="/Users/paulgamble/Desktop/mp3splitter-js/test.mp3"
status="private"
collectionID="495095"

import requests
import os
from dotenv import load_dotenv
load_dotenv()

key = os.getenv("APIKey")
mp3DIR = os.getenv("mp3DIR")
status = os.getenv("status")
collectionID = os.getenv("collectionID")

def printenvironment():
    print(f'The API key is: {key}')
    print(f'The mp3 dir is: {mp3DIR}')

if __name__ == "__main__":
    printenvironment()

data = {"title": "Lorem Ipsum",
        "text": "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut "
                "labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi "
                "ut aliquip ex ea commodo consequat.",
        "status": status,
        "tags": ["Test"],
        "collection": collectionID,
        "audio": mp3DIR}  # only the path string is sent here, not the file itself
response = requests.post('https://www.lingq.com/api/v2/fr/lessons/',
                         data=data, headers={'Authorization': 'Token ' + key})
print(response.status_code)
print(response.text)

THIS IS THE OUTPUT:

The API key is: d57b6431d575db71fc754b5e5527dd2e422a2f94

The mp3 dir is: /Users/paulgamble/Desktop/mp3splitter-js/test.mp3

400

{"audio": ["The submitted data was not a file. Check the encoding type on the form."]}


I thought maybe I found a quick solution here: audio - How to extract the raw data from a mp3 file using python? - Stack Overflow

Using this library: GitHub - jiaaro/pydub: Manipulate audio with a simple and easy high level interface


from pydub import AudioSegment
sound = AudioSegment.from_mp3("test.mp3")

Getting:

400

Bad Request (400)


I could really use confirmation, or an example, from someone who has been able to get audio upload working through the API.

So the reason audio upload does not work the intuitive way is a limitation of Django REST Framework (DRF): a file field has to arrive as an actual file in a multipart form, not as a path string. See: python - Django REST Framework upload image: "The submitted data was not a file" - Stack Overflow

Here’s a great article the Django developers at LingQ could reference: Good Code - File upload with Django REST Framework
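The error quoted earlier is what DRF returns when a file field arrives as a plain string. Here is a minimal sketch of the alternative, assuming the endpoint accepts an ordinary multipart form: hand the open file to the `files=` argument of requests so that requests builds the multipart body itself. The helper name `build_lesson_request` and the token value are made up for illustration, and the request is only prepared, not sent.

```python
import io

import requests

LESSONS_URL = "https://www.lingq.com/api/v2/fr/lessons/"


def build_lesson_request(token, title, text, audio_name, audio_file):
    """Prepare (but do not send) a multipart POST with the audio attached.

    Hypothetical helper: audio_file is any binary file object,
    e.g. open("test.mp3", "rb") or an io.BytesIO.
    """
    data = {"title": title, "text": text, "status": "private"}
    # A (filename, file object, content type) tuple makes requests emit a
    # real file part instead of a plain string field.
    files = {"audio": (audio_name, audio_file, "audio/mpeg")}
    request = requests.Request(
        "POST",
        LESSONS_URL,
        data=data,
        files=files,
        headers={"Authorization": "Token " + token},
    )
    # prepare() encodes the body and sets Content-Type with the boundary.
    return request.prepare()
```

To actually send it, pass the same `data`, `files` and `headers` to `requests.post(...)` with the mp3 opened in binary mode. Do not set Content-Type by hand in that case, since requests adds it together with the boundary.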

I have successfully uploaded using POSTMAN (ref: https://chrisbartos.com/articles/uploading-images-drf/)

The following is the generated Python code from POSTMAN:

import requests

url = "https://www.lingq.com/api/v2/fr/lessons/"

payload = "------WebKitFormBoundary7MA4YWxkTrZu0gW\r\nContent-Disposition: form-data; name=\"title\"\r\n\r\nLorem Ipsum - POSTMAN\r\n------WebKitFormBoundary7MA4YWxkTrZu0gW\r\nContent-Disposition: form-data; name=\"text\"\r\n\r\nLorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut \\n labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi\\n ut aliquip ex ea commodo consequat.\n\r\n------WebKitFormBoundary7MA4YWxkTrZu0gW\r\nContent-Disposition: form-data; name=\"status\"\r\n\r\nprivate\r\n------WebKitFormBoundary7MA4YWxkTrZu0gW\r\nContent-Disposition: form-data; name=\"tags\"\r\n\r\n[\"TEST\"]\r\n------WebKitFormBoundary7MA4YWxkTrZu0gW\r\nContent-Disposition: form-data; name=\"collection\"\r\n\r\n495095\r\n------WebKitFormBoundary7MA4YWxkTrZu0gW\r\nContent-Disposition: form-data; name=\"audio\"; filename=\"test2.mp3\"\r\nContent-Type: audio/mpeg\r\n\r\n\r\n------WebKitFormBoundary7MA4YWxkTrZu0gW--"
headers = {
    'Authorization': "Token [api code]",
    'User-Agent': "PostmanRuntime/7.11.0",
    'Accept': "*/*",
    'Cache-Control': "no-cache",
    'Postman-Token': "a4307d63-56fb-458d-b977-1cf589132ac2,b55c1d2f-dee6-44ab-aecb-09c0a99e600c",
    'Host': "www.lingq.com",
    'accept-encoding': "gzip, deflate",
    'content-type': "multipart/form-data; boundary=----WebKitFormBoundary7MA4YWxkTrZu0gW",
    'content-length': "496748",
    'Connection': "keep-alive",
    'cache-control': "no-cache"
}

response = requests.request("POST", url, data=payload, headers=headers)

print(response.text)

Obviously this needs some work. Copying and pasting it as is results in: {"audio":["The submitted file is empty."]}, presumably because the generated payload carries no bytes in the audio part.

But I think there are enough clues here to get it working in combination with something like this: Uploading Data — requests_toolbelt 1.0.0 documentation


Testing continues.

That’s similar to the approach I took when I wanted to import Assimil content en masse. Here’s what I ended up doing.

#!/usr/bin/env python

import eyed3
import sys
import requests
from requests_toolbelt.multipart.encoder import MultipartEncoder

# get your apikey from Login - LingQ and paste it in
APIKEY = 'Token xxxxxxxxxxxxxxxxxxxx'

def main():
    # every command-line argument is the path of one mp3 file
    for args in sys.argv[1:]:
        assimil = eyed3.load(args)
        lyrics = assimil.tag.lyrics[0].text
        title = args + ' - '
        transcript = ''
        course = assimil.tag.artist
        # split lyrics string to an array of lines
        lyricsArr = lyrics.splitlines()
        # iterate through the lyrics to generate a transcript and title
        for x in lyricsArr:
            y = x.split(' : ')
            if y[0][0] == 'N':
                title = title + y[1] + ' - '
            elif y[0] == 'S00-TITLE':
                title = title + y[1]
            else:
                transcript = transcript + y[1] + "\r\n"

        # we have the necessary data now use the lingq api
        m = MultipartEncoder([
            ('title', title),
            ('text', transcript),
            ('status', 'private'),
            ('audio', (args, open(args, 'rb'), 'audio/mpeg'))]
        )
        h = {'Authorization': APIKEY,
             'Content-Type': m.content_type}

        r = requests.post('https://www.lingq.com/api/v2/es/lessons/', data=m, headers=h)
        print(r.text)

if __name__ == "__main__":
    main()


Did this work as intended?


Yeah, the code at the bottom is all you need. Specify the params in the MultipartEncoder tuples, set your Authorization and Content-Type headers, and send the post. You’ll need requests-toolbelt, though. Otherwise you can do it manually by piecing together the multipart form data like you did in your partial example…
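For the manual route mentioned above, here is a rough sketch that pieces the multipart form data together with nothing but the standard library. The function name `encode_multipart` and its argument layout are invented for this example and are not part of any LingQ or requests API. The point is that the raw file bytes have to sit inside the audio part, which is exactly what was missing from the pasted Postman payload.

```python
import uuid


def encode_multipart(fields, files):
    """Build a multipart/form-data body by hand.

    fields: dict of field name -> string value
    files:  dict of field name -> (filename, file bytes, content type)
    Returns (body as bytes, value for the Content-Type header).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        part = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
        parts.append(part.encode("utf-8"))
    for name, (filename, content, content_type) in files.items():
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"\r\n'
            f"Content-Type: {content_type}\r\n\r\n"
        )
        # The raw bytes go between the part header and the closing CRLF.
        parts.append(header.encode("utf-8") + content + b"\r\n")
    # The final boundary is marked with a trailing "--".
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"
```

The returned body and header value could then go into `requests.post(url, data=body, headers={'Authorization': APIKEY, 'Content-Type': content_type})`, with the file read via `open(path, 'rb').read()`.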


Had to tweak a few things but finally got it working!

Thanks a ton!


For those that are trying to accomplish the same thing I’ve created a simple Python script to upload audio: GitHub - paulywill/lingq_upload: Use Lingq.com API to upload audio books with text faster

In a future version (and ideal world) it would be nice to scrape my ebooks with this script as well rather than copy and paste. If I have time in the future perhaps I will find a way.

In the meantime, this resolved my original problem with audio upload.


Awesome work!

Some fine-tuning:

Below the APIKEY line, add these lines:

cover = r'AbsPath/to/your/cover.png'
collectionID = '12345'  # create the collection, then take the number from its URL

Then add the entries ('collection', collectionID) and ('image', (cover, open(cover, 'rb'), 'image/png')) to the MultipartEncoder, so that it reads:

m = MultipartEncoder([
    ('title', title),
    ('text', transcript),
    ('status', 'private'),
    ('collection', collectionID),
    ('image', (cover, open(cover, 'rb'), 'image/png')),
    ('audio', (args, open(args, 'rb'), 'audio/mpeg'))]
)

Anyway, this is optional.
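Since the cover and the collection are optional, one way to handle them is to add those entries only when they are set. This is just a sketch: `build_fields` is a hypothetical helper, the field names are the ones used in this thread, and unlike the script above it sends only the base file name rather than the whole path. The returned list is meant to be handed to MultipartEncoder.

```python
import mimetypes
import os


def build_fields(title, text, audio_path, collection_id=None, cover_path=None):
    """Return a list of (name, value) tuples for MultipartEncoder.

    The file objects are opened here and left open so the encoder can
    stream them; the caller should close them once the request is done.
    """
    fields = [("title", title), ("text", text), ("status", "private")]
    if collection_id:
        fields.append(("collection", str(collection_id)))
    if cover_path:
        # Guess image/png, image/jpeg, ... from the file extension.
        mime = mimetypes.guess_type(cover_path)[0] or "application/octet-stream"
        fields.append(
            ("image", (os.path.basename(cover_path), open(cover_path, "rb"), mime))
        )
    fields.append(
        ("audio", (os.path.basename(audio_path), open(audio_path, "rb"), "audio/mpeg"))
    )
    return fields
```

With that in place the call becomes `m = MultipartEncoder(build_fields(title, transcript, args, collectionID, cover))`, and leaving out the last two arguments gives the original behaviour.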
