akiba resident JAV subtitlers & subtitle talk★NOT A SUB REQUEST THREAD★

kingsepol420 · Sep 30, 2022

KingofBugs said:
Question for some more experienced subbers. I wanted to clean up some subtitles I have collected over the years and then post them here. Is there a program y'all use to streamline this or just open the srt file in notepad and edit it there?

have you tried "Subtitle Edit"?

KingofBugs · Sep 30, 2022

kingsepol420 said:
have you tried "Subtitle Edit"?

Nope since I am a complete noob when it comes to this. Did just look at it though and looks like what I need. Thanks for the recommendation.

mei2 · Oct 1, 2022

Non_Entity said:
A big hurdle towards fine-tuning Whisper (or any other model) is a lack of Japanese training data. OpenAI's dataset has 15,914 hours of JP audio (7054 hours with Japanese transcripts, 8860 with English ones), and even that dwarfs the publicly available ones I'm aware of.

Another big thank you @Non_Entity! The Whisper with Silero VAD produces the best subs I have seen so far. Great work!

I just upgraded my Colab to check out the results with faster GPUs

Some obsevations so far:
- I dialed down the chunk_threshold to 2, and it has been working well. I will try with 1 later on.
- I get "Thanks for watching" ghost lines every now and then. I wonder if that line appears when there are many Hhmm,Hhm moans. I noticed Vosk spits out a lot of those at the same time frame of "thanks for watching"

- I have set the switch to no-translatation. I got DeepL error code 413 (too long request). May be it would be an idea to make the translation to be a cell of its own after Whisper --so it produces the original srt as well as a translation if cell executed?

Out of curiosity, would anyone know what VAD/tokenizer/transformer SubtitleEdit uses (for Vosk)? Somehow I think that tool finds the sub timings more accurately --exact time of dialogue detection I guess.

Non_Entity · Oct 2, 2022

The 413 error should be fixed now. It'll also save the original Japanese output in a separate file when using DeepL (not possible when Whisper translates it).

mei2 said:
Out of curiosity, would anyone know what VAD/tokenizer/transformer SubtitleEdit uses (for Vosk)? Somehow I think that tool finds the sub timings more accurately --exact time of dialogue detection I guess.

Vosk is based on Kaldi. It uses an acoustic model that looks at each fraction of a second, and predicts individual phonemes. Then a language model guesses which words those phonemes form. Since you know where each phoneme starts, it's easy to create accurate timestamps.
Whisper doesn't have an acoustic model. It takes an entire 30-second slice of audio, and tries to predict the subtitles, timestamps included. It doesn't actually understand what timestamps are; it's just been trained on lots of text that contains them.

javjod · Oct 2, 2022

maybe someone can compare my MIAA-698 subtitle from CAPCUT windows version link of capcut, the problem of this subtitle only the sync of the sub because I split the movie into 60 minutes and use deepl for translate from Japan to English. please check it

mei2 · Oct 2, 2022

Non_Entity said:
The 413 error should be fixed now. It'll also save the original Japanese output in a separate file when using DeepL (not possible when Whisper translates it).

Vosk is based on Kaldi. It uses an acoustic model that looks at each fraction of a second, and predicts individual phonemes. Then a language model guesses which words those phonemes form. Since you know where each phoneme starts, it's easy to create accurate timestamps.
Whisper doesn't have an acoustic model. It takes an entire 30-second slice of audio, and tries to predict the subtitles, timestamps included. It doesn't actually understand what timestamps are; it's just been trained on lots of text that contains them.

Thanks!!! Will give the new version a try. Meanwhile, I upgraded my colab to pay-as-you-go, I get T4 GPU, but I don't see any viisble improvement in speed/performance. Your code to check on GPU before running was quite helpful.

Quick question: would changing the chunk-threshold make any difference in quality of the output?

SUNBO · Oct 2, 2022

mei2 said:
Quick question: would changing the chunk-threshold make any difference in quality of the output?

Yes it does, it changes the translation. I tested it out and found chunk2 or chunk3 to be slightly better.

You should be aware it gives random translation each run with speech that are not very clear. I am only looking at the sentences that don't change between each run. Chunk2 and chunk3 gives the same result, while chunk1 changes. A few sentences are not the same and not as good as chunk2 and 3.

mei2 · Oct 5, 2022

Non_Entity said:
The 413 error should be fixed now. It'll also save the original Japanese output in a separate file when using DeepL (not possible when Whisper translates it).

Thanks @Non_Entity. The new changes work very well. Juts in case, if you feel like adding more features, here are couple of thoughts:

- would be nice to be able to process multiple files -batch process;
- would be nice if DeepL API is called only for the text of the subs--reduce cost of DeepL by reducing the number of characters sent;

cheers

PS. by pure luck Colabpro gave me an A100 GPU during one of my runs. However, in my experience Whisper performance didn't change at all. It seems that more power/VRAM doesn't change the speed. It seems that a T4 GPU is an optimal cost/performance setup

SUNBO · Oct 5, 2022

mei2 said:
Thanks @Non_Entity. The new changes work very well. Juts in case, if you feel like adding more features, here are couple of thoughts:

- would be nice to be able to process multiple files -batch process;
- would be nice if DeepL API is called only for the text of the subs--reduce cost of DeepL by reducing the number of characters sent;

cheers

PS. by pure luck Colabpro gave me an A100 GPU during one of my runs. However, in my experience Whisper performance didn't change at all. It seems that more power/VRAM doesn't change the speed. It seems that a T4 GPU is an optimal cost/performance setup

Is it possible to run it offline without needing to upload the mp3 every time

mei2 · Oct 5, 2022

SUNBO said:
Is it possible to run it offline without needing to upload the mp3 every time

Do you mean to package the code to run locally? The large Japanese model seems to need more than 12GB of GPU VRAM. At least I was not able to run it locally --I got out of memory runtime error. It would be nice if OpenAI would optimise the model to work on lower specs.

In case of colab, I found it easier to link my Drive to colab than to upload local files. The bad thing is that wih every new session I have to reconnect my Drive (I undertsand there a way to automate it but I haven't tried). The good thing is that the transfers to Colab are much faster, I believe.

SUNBO · Oct 6, 2022

mei2 said:
Do you mean to package the code to run locally? The large Japanese model seems to need more than 12GB of GPU VRAM. At least I was not able to run it locally --I got out of memory runtime error. It would be nice if OpenAI would optimise the model to work on lower specs.

In case of colab, I found it easier to link my Drive to colab than to upload local files. The bad thing is that wih every new session I have to reconnect my Drive (I undertsand there a way to automate it but I haven't tried). The good thing is that the transfers to Colab are much faster, I believe.

Oh i see. Yea I'm not sure how it works. Not a programmer.

Also not sure how google drive works, don't you have to upload to google anyways?

mei2 · Oct 6, 2022

SUNBO said:
Oh i see. Yea I'm not sure how it works. Not a programmer.

Also not sure how google drive works, don't you have to upload to google anyways?

Yes the upload will be to Drive instead of colab. The way I do it is (and I hope others can review and suggest easier ways please): I extract and save mp3s of movies that I want to sub into one folder. I keep that folder in sync with my drive {Google Drive App). This way I can run Whisper with different parameters on the same file without needing to upload again between sessions.

Non_Entity · Oct 8, 2022

I've noticed the way you split lines makes a difference with DeepL. Translating each line separately removes context, but running them together makes it blend the lines and repeat itself. See avatarthe's tests:

Because eight years ago, my wife left me for a young man she worked with part-time.
She left me for a young man she was working with.

Since then, I've been the mainstay of Araike for three years.
I've been the mainstay of Araike's household, raising our three children.

The best fix I've found is putting quotes between the lines. 「」 and "" work about equally well, and remove most (not all) of the repetitions.
I've added this to the notebook. My earlier fix was using a sliding window, which also worked, but tripled the API cost.

SUNBO · Oct 8, 2022

Non_Entity said:
I've noticed the way you split lines makes a difference with DeepL. Translating each line separately removes context, but running them together makes it blend the lines and repeat itself. See avatarthe's tests:

The best fix I've found is putting quotes between the lines. 「」 and "" work about equally well, and remove most (not all) of the repetitions.
I've added this to the notebook. My earlier fix was using a sliding window, which also worked, but tripled the API cost.

I love that you keep updating this. Really appreciate it.

Is there any chance you can add a queue feature to it? For example, giving us more input fields for the audio path, maybe 4 more. So when it finishes with one, it will move to the next one automatically.

maelstrom9999 · Oct 9, 2022

So I downloaded OYC-036 years ago, must have been at least 5 years back. Was watching it today and noticed it is a hardsub with Asian language subtitles. Odd as that's the only one I've run into. Anyone know which Asian language this is?

Also, anyone know what software can be used to OCR this and turn it into a soft sub?

SUNBO · Oct 9, 2022

maelstrom9999 said:
So I downloaded OYC-036 years ago, must have been at least 5 years back. Was watching it today and noticed it is a hardsub with Asian language subtitles. Odd as that's the only one I've run into. Anyone know which Asian language this is?

View attachment 3062637

Also, anyone know what software can be used to OCR this and turn it into a soft sub?

It's chinese (traditional).

And the best way to OCR this into soft sub is using VideoSubFinder. see tutorial here

frostieff · Oct 11, 2022

How are y'all able to get accurate subtitle translations from Pytranscribe? Is there a certain site you download the movies from? I upload the entire movie or the mp3 audio file of the movie and i get inaccurate translations. Is it something i'm doing wrong ?

maload · Oct 11, 2022

frostieff said:
How are y'all able to get accurate subtitle translations from Pytranscribe? Is there a certain site you download the movies from? I upload the entire movie or the mp3 audio file of the movie and i get inaccurate translations. Is it something i'm doing wrong ?

accurate translations ??
oh.
if you talking about how come people can got more line of subtitle and upload it on the net then so i can tell you a little bit

1 some of subtitle is not translating ,,, its just kind of .... add more line in the files. ( their imagine , guessing )

2 how the Pytranscribe?

if the dialog in movies are clear enough ,,,, then its work ....you will got more lines in the file

the main problem is some of the movies have so many sound ,,, music ,people ,,,and its not loud enough

you need some audio-program for booting the sound s level ..(audacity is free )

accurate translations ?? (for me i said its better than nothing ))

clear sound , increase the volume of the dialogs ... just try to do it ..
thats all i know now

Taako · Oct 12, 2022

frostieff said:
How are y'all able to get accurate subtitle translations from Pytranscribe? Is there a certain site you download the movies from? I upload the entire movie or the mp3 audio file of the movie and i get inaccurate translations. Is it something i'm doing wrong ?

It's like Maload said. The clearer the movie dialog, the better pyTranscriber works. And along with Audacity it's great. Remember pytranscriber will give you timing code and some subs at the very least. The rest is up you

You can use both the .mp4(audio/video file) or mp3(audio file) with pyTranscriber. I think mp3 works best because with Audacity you can make the audio a little better.

And remember, 90% of JAV dialog is the same words/phrases.

You should be prepare to guess some scenes as the music and other people talking is just too hard to understand.. unless you know Japanese.

If it's girls-- and they're at school-- and it's just background talk-- then just say, Girl A: How was the exam? Or something like that. Girl B: Um, I didn't score well. Or something like that.

But look/listen for the main star(s) to talk louder OR whoever talks the loudest. That's your focus when everyone talking at once. OR just skip it. Sometimes its just better than stressing

Imscully · Oct 12, 2022

Taako said:
It's like Maload said. The clearer the movie dialog, the better pyTranscriber works. And along with Audacity it's great. Remember pytranscriber will give you timing code and some subs at the very least. The rest is up you
You can use both the .mp4(audio/video file) or mp3(audio file) with pyTranscriber. I think mp3 works best because with Audacity you can make the audio a little better.

And remember, 90% of JAV dialog is the same words/phrases.

You should be prepare to guess some scenes as the music and other people talking is just too hard to understand.. unless you know Japanese.

If it's girls-- and they're at school-- and it's just background talk-- then just say, Girl A: How was the exam? Or something like that. Girl B: Um, I didn't score well. Or something like that.

But look/listen for the main star(s) to talk louder OR whoever talks the loudest. That's your focus when everyone talking at once. OR just skip it. Sometimes its just better than stressing

Good, sound advice. (Pun intended.) Well done. Thanks for sharing.

akiba resident JAV subtitlers & subtitle talk★NOT A SUB REQUEST THREAD★

kingsepol420

Active Member

Well-Known Member

New Member

Member

Attachments

Well-Known Member

Active Member

Well-Known Member

Active Member

Well-Known Member

Active Member

Well-Known Member

New Member

Active Member

Well-Known Member

Active Member

New Member

Active Member

Akiba Citizen

Well-Known Member