Python - IBM Watson - speech to text

frostieff

New Member
May 12, 2021
21
3
Hi,

Aside from the pyscribe tool, I'm trying to use IBM Watson to transcribe audio, following this tutorial.


This is the code he used:

from ibm_watson import SpeechToTextV1, LanguageTranslatorV3
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

ltapikey = 'YOUR LANGUAGE TRANSLATOR APIKEY'
lturl = 'YOUR LANGUAGE TRANSLATOR URL'
sttapikey = 'YOUR STT API KEY'
stturl = 'YOUR STT URL'

ltauthenticator = IAMAuthenticator(ltapikey)
lt = LanguageTranslatorV3(version='2018-05-01', authenticator=ltauthenticator)
lt.set_service_url(lturl)

sttauthenticator = IAMAuthenticator(sttapikey)
stt = SpeechToTextV1(authenticator=sttauthenticator)
stt.set_service_url(stturl)

with open('YOURFILENAME.mp3', 'rb') as f:
    res = stt.recognize(audio=f, content_type='audio/mp3', model='en-AU_NarrowbandModel', continuous=True).get_result()
voicetext = res['results'][0]['alternatives'][0]['transcript']
voicetext

greek = 'en-el'
chinese = 'en-zh'
hindi = 'en-hi'

translation = lt.translate(text=voicetext, model_id=hindi).get_result()
translatedtext = translation['translations'][0]['translation']
translatedtext

with open('result.txt', 'w') as f:
    f.write(translatedtext)



I get the following error:
TypeError: request() got an unexpected keyword argument 'continuous'

Any help would be greatly appreciated :)
 

SamKook

Grand Wizard
Staff member
Super Moderator
Uploader
May 10, 2009
3,562
4,935
Just remove ", continuous=True". It doesn't exist in the definition of recognize (select the text "recognize" and press Ctrl+I to bring up the contextual help and see all the options), at least with what I assume is the default install of all that stuff. It's a really bad tutorial for someone not familiar with this, and he doesn't explain much.

Seems to work fine without it, at least up to that point:
SpeechToText_Tutorial.jpg
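
For reference, here is a minimal sketch of the same call with the "continuous" argument dropped. The helper name transcribe_first_line is made up for illustration; "stt" is assumed to be the SpeechToTextV1 client built earlier in the tutorial script, and the model and content type are just the ones from that script. It also returns an empty string instead of crashing when the service sends back no results.

```python
def transcribe_first_line(stt, path, model, content_type="audio/mp3"):
    """Send one audio file to Watson STT and return the first transcript line.

    `stt` is assumed to be the SpeechToTextV1 client from the tutorial script;
    this helper itself is hypothetical and not part of the SDK.
    """
    with open(path, "rb") as f:
        # Same call as the tutorial, minus `continuous=True`, which the
        # installed recognize() rejects.
        res = stt.recognize(
            audio=f, content_type=content_type, model=model
        ).get_result()

    results = res.get("results", [])
    if not results:
        # Nothing was recognized; avoid an IndexError on results[0].
        return ""
    return results[0]["alternatives"][0]["transcript"]
```

With the tutorial's objects this would be called as transcribe_first_line(stt, 'YOURFILENAME.mp3', 'en-AU_NarrowbandModel').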


Edit: After finishing the whole script, the one obvious issue is that he doesn't iterate through the results, since his example has only one line. You'd have to implement that yourself or manually change the script for every line, which would be insane if your plan is to translate a whole movie.

I only get the translation for "ええ " from the example I used in the picture above, which translates to "-Yeah. ". Not sure why there's a "-" there.

Changing the first 0 to a 1 in the line "voicetext = res['results'][0]['alternatives'][0]['transcript']" gives me the second line: "日本 全国 の 皆さん こんにちは " -> "Hello, everyone in Japan. "

My Python knowledge is very basic and I had never heard of JupyterLab before today, so I'm not sure how you'd automate iterating through all the results. You need to put everything after my screenshot in a loop and make sure the file writing appends the result to the text file (rather than overwriting it, which this example does), or put all the translated results in a list and output that somehow.
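
A rough sketch of that loop, in case it helps. The helper names iter_transcripts and translate_all are made up for illustration, and "res" is assumed to be the dict returned by stt.recognize(...).get_result() in the tutorial. The translation step is passed in as a function so the loop can be tried without Watson credentials. Blank transcripts are skipped, and the output file is opened in append mode.

```python
def iter_transcripts(res):
    """Yield the top transcript of every entry in an STT response dict."""
    for result in res.get("results", []):
        alternatives = result.get("alternatives", [])
        if not alternatives:
            continue
        text = alternatives[0]["transcript"].strip()
        if text:  # skip blank lines
            yield text


def translate_all(res, translate_line, out_path="result.txt"):
    """Translate every transcript line and append each one to out_path.

    `translate_line` is any function taking a string and returning a string.
    Returns the translated lines as a list.
    """
    translated_lines = []
    # "a" appends to the file instead of overwriting it like "w" does.
    with open(out_path, "a", encoding="utf-8") as f:
        for text in iter_transcripts(res):
            translated = translate_line(text)
            f.write(translated + "\n")
            translated_lines.append(translated)
    return translated_lines


# Assumed usage with the tutorial's `lt` client and `hindi` model id:
# translate_all(
#     res,
#     lambda t: lt.translate(text=t, model_id=hindi)
#                 .get_result()["translations"][0]["translation"],
# )
```

Because the file is opened in append mode, running it twice adds the lines twice, so delete result.txt between runs if you want a fresh file.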
 

frostieff

New Member
May 12, 2021
21
3
Thanks a lot, you're still better than me though.

Now my results are coming back blank. Weird... Also, you can clean it up, but I forgot how.