Post your JAV subtitle files here - JAV Subtitle Repository (JSP)★NOT A SUB REQUEST THREAD★

Makkdom

Well-Known Member
Mar 4, 2019
157
385
OMG Whisper is going to kill me...I just used it to translate DVDES-794 A&B. The good news is it looks pretty good, but like all Whisper translations, it still requires a little clean up and occasionally a little re-interpretation to make sense. The bad news is it's an 8 hr movie! So I may be off-line for a while...I guess be careful what you wish for! lol

Status update (2/2/23) for anyone interested: just finished editing/cleaning DVDES-794A and will begin editing/cleaning DVDES-794B tomorrow. Thought I would wait to post until both parts have been edited/cleaned, unless anyone would like DVDES-794A earlier.
Fast work on the first part. Good job!
 

soloporhoy666

Active Member
Nov 29, 2021
118
124
B

bro what is VAD?
Hello. As I understand it, VAD (voice activity detection) is a program that pre-processes the audio and picks out the parts that contain speech before Whisper sees them. There is an online version that combines both, Whisper+VAD, and it greatly improves the subtitles for our movies. Whisper without VAD works fine, but you get a lot of repeated dialogue (you get some with VAD too, but it is moderate). For me the result is much better. I leave you the link.
 

ericf

Well-Known Member
Jan 13, 2007
234
528
If I use Colab and set the VAD threshold to 0, or remove the number there, will it stop using VAD? I have heard some online programs' attempts at isolating voices, and a lot of them do a very bad job, making the dialog less clear. I don't see a version without VAD linked anywhere, so I suppose I have to use it?
And should I leave the chunk threshold at 0.3? Do the chunks increase in size with higher numbers or lower numbers?
 

Nokel

Member
May 18, 2019
13
36
Wow, I just tried that Colab link for the first time and the process is super easy. I was getting really confused trying to set up everything myself via Python or whatever, so that's a lifesaver.

Could somebody advise the best way to rip audio from vids? I think it was mentioned further back in the thread, but I'm having trouble finding it.
 

SamKook

Grand Wizard
Staff member
Super Moderator
Uploader
May 10, 2009
3,536
4,897
Could somebody advise the best way to rip audio from vids? I think it was mentioned further back in the thread, but I'm having trouble finding it.
Demux it. Google how for your specific extension or use mkvtoolnix to put just the audio in an mka.
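
A minimal sketch of the demux step in Python, assuming ffmpeg is installed and on the PATH (ffmpeg is swapped in here for mkvtoolnix, and the function names are made up for illustration). It copies the first audio track into an .mka file without re-encoding:

```python
import subprocess
from pathlib import Path


def build_demux_command(video_path, audio_path):
    """Return an ffmpeg command that copies the first audio track untouched."""
    return [
        "ffmpeg",
        "-i", str(video_path),  # input video file
        "-map", "0:a:0",        # select only the first audio track
        "-c", "copy",           # copy the stream, no re-encoding
        str(audio_path),        # output, e.g. movie.mka
    ]


def demux_audio(video_path, audio_path=None):
    """Write the audio next to the video as .mka (assumes ffmpeg is on PATH)."""
    video_path = Path(video_path)
    if audio_path is None:
        audio_path = video_path.with_suffix(".mka")
    subprocess.run(build_demux_command(video_path, audio_path), check=True)
    return Path(audio_path)
```

Calling `demux_audio("movie.mp4")` should leave a `movie.mka` beside the video. Because the stream is copied rather than converted, it is fast and loses no quality.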
 

Electromog

Akiba Citizen
Dec 7, 2009
4,436
2,705
Is VAD only for the online version or is there something like it for when you run whisper on your own computer?
 

SamKook

Grand Wizard
Staff member
Super Moderator
Uploader
May 10, 2009
3,536
4,897
You can see all the code they use to run it, what they install, etc. so if you reproduce that, you can use it on your own pc too.

It's all for Linux though, so you'd have to convert it to Windows (I'm assuming).
 

porgate55555

Active Member
Jul 24, 2021
51
163
Is anyone using a script to auto-translate srt files with DeepL (free version)? Mine broke and I can't seem to fix it, so I am looking for an alternative, or for someone more technical than me who knows how to fix it.
 

porgate55555

Active Member
Jul 24, 2021
51
163
If anyone is interested, I have a large number of Whisper-generated subtitles. They are NOT cleaned, because I am only interested in understanding the story, not in having perfect subs. Duplicates are removed as well as possible and timings are slightly fixed.
 

Attachments

  • Whisper_03.02.23.zip
    4.1 MB · Views: 977

SamKook

Grand Wizard
Staff member
Super Moderator
Uploader
May 10, 2009
3,536
4,897
Is anyone using a script to auto-translate srt files with DeepL (free version)? Mine broke and I can't seem to fix it, so I am looking for an alternative, or for someone more technical than me who knows how to fix it.

Post your script and I can see if it's easy to fix or not.
 

porgate55555

Active Member
Jul 24, 2021
51
163
Post your script and I can see if it's easy to fix or not.
Thanks a lot! Everything seems to be working, but DeepL changed the input CSS selector. I fixed that, but then for some reason it messed up the merge of the chunks into one file. Every srt that has only one chunk works, but as soon as more than one is required, it fails.
 

noirzmonster

New Member
Dec 27, 2021
4
13
Use the online Whisper+VAD version (it doesn't need your computer's resources). Use MP3 files (I recommend M4A). If you have an audio enhancement program, use it to turn up and adjust the volume (I use MOVAVI). From page 218 onwards you can find some discussion of using Whisper (a tutorial), including my posts where I use images to guide you. Good luck.

Some (complete) movies take about 15 minutes to 1 hour, but the result is much better than the other options that already exist (that's my opinion, and I'm sure someone will disagree). You can use the program up to 6 times a day, sometimes a little less (the free tier has a limit).
Thanks for explaining how to do it. I tried to translate one. Although Whisper's translation is much better than other auto translators, there were some obviously wrong translations. I tried to clean it up as much as I could. There were a few 15-20 second intervals without any translation; maybe the sound wasn't loud enough for Whisper to pick up. Do you have any suggestions on that? Anyway, I tried to fill those intervals as best I could. Here is my first attempt at translating a JAV. I would appreciate feedback.

RCT-925 Studio ROCKET The Cum Swallowing Game Try To Guess Which Shot Of Cum Belongs To Your Boyfriend!


1rct925pl.jpg
 

Attachments

  • RCT-925-EN.zip
    26.9 KB · Views: 236

Makkdom

Well-Known Member
Mar 4, 2019
157
385
If anyone is interested, I have a large number of Whisper-generated subtitles. They are NOT cleaned, because I am only interested in understanding the story, not in having perfect subs. Duplicates are removed as well as possible and timings are slightly fixed.
Wow, that is indeed a large number of files. Thanks!

Edited to add: I admire your taste in porn actresses.
 

soloporhoy666

Active Member
Nov 29, 2021
118
124
Thanks for explaining how to do it. I tried to translate one. Although Whisper's translation is much better than other auto translators, there were some obviously wrong translations. I tried to clean it up as much as I could. There were a few 15-20 second intervals without any translation; maybe the sound wasn't loud enough for Whisper to pick up. Do you have any suggestions on that? Anyway, I tried to fill those intervals as best I could. Here is my first attempt at translating a JAV. I would appreciate feedback.

RCT-925 Studio ROCKET The Cum Swallowing Game Try To Guess Which Shot Of Cum Belongs To Your Boyfriend!


1rct925pl.jpg
This usually happens. When I'm really interested in knowing what happens in a scene, I run the movie through Whisper+VAD again with different VAD values (the default is 0.4; I've tried 0.3, 0.5 and 0.6). The text lines are usually the same, but each run detects some lines that the earlier ones missed. Then I open the .srt files in Subtitle Edit and add those lines to whichever srt file I consider the most complete. (I only do this with movies that are really worth the effort, ones I like a lot.)
The other option I do is increase the volume of the file and balance the audio (there are programs that do this automatically)
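
The manual merge described above can also be roughed out in Python. This is only a sketch under assumptions: it treats any cue from a second run as "new" when its time range overlaps no cue in the base file, and the function names (`parse_srt`, `merge_runs`, `write_srt`) are made up for illustration. It does not replace checking the result by eye in Subtitle Edit.

```python
import re

TIME = r"(\d+):(\d+):(\d+),(\d+)"
CUE_RE = re.compile(TIME + r" --> " + TIME)


def to_ms(h, m, s, ms):
    """Convert an SRT timestamp split into parts to milliseconds."""
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)


def parse_srt(text):
    """Return a list of (start_ms, end_ms, timestamp_line, text) cues."""
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            m = CUE_RE.search(line)
            if m:
                g = m.groups()
                cues.append((to_ms(*g[:4]), to_ms(*g[4:]), m.group(0),
                             "\n".join(lines[i + 1:])))
                break
    return cues


def merge_runs(base, extra):
    """Add cues from `extra` that overlap no cue in `base`, sorted by start."""
    merged = list(base)
    for cue in extra:
        if all(cue[1] <= b[0] or cue[0] >= b[1] for b in base):
            merged.append(cue)
    return sorted(merged, key=lambda c: c[0])


def write_srt(cues):
    """Renumber the cues from 1 and return the SRT text."""
    return "\n".join(f"{n}\n{c[2]}\n{c[3]}\n" for n, c in enumerate(cues, 1))
```

Usage would be along the lines of `write_srt(merge_runs(parse_srt(best_text), parse_srt(other_text)))`, repeated for each extra run. Lines that two runs timed slightly differently count as overlapping, so they are not duplicated.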
 

Dom047

New Member
May 5, 2016
9
4
This usually happens. When I'm really interested in knowing what happens in a scene, I run the movie through Whisper+VAD again with different VAD values (the default is 0.4; I've tried 0.3, 0.5 and 0.6). The text lines are usually the same, but each run detects some lines that the earlier ones missed. Then I open the .srt files in Subtitle Edit and add those lines to whichever srt file I consider the most complete. (I only do this with movies that are really worth the effort, ones I like a lot.)
The other option I do is increase the volume of the file and balance the audio (there are programs that do this automatically)
I see u use movavi to balance audio and boost volume, does that make a big diff in dialogue output? are you gettin way more text?
 

SamKook

Grand Wizard
Staff member
Super Moderator
Uploader
May 10, 2009
3,536
4,897
I see u use movavi to balance audio and boost volume, does that make a big diff in dialogue output? are you gettin way more text?
Be aware that Whisper is random: in my experience it won't give you the same number of lines if you rerun the exact same unmodified audio, so comparing like this is mostly meaningless.
 

SamKook

Grand Wizard
Staff member
Super Moderator
Uploader
May 10, 2009
3,536
4,897
Thanks a lot! Everything seems to be working, but DeepL changed the input CSS selector. I fixed that, but then for some reason it messed up the merge of the chunks into one file. Every srt that has only one chunk works, but as soon as more than one is required, it fails.

I had to replace
Code:
"css selector"
with
Code:
By.CSS_SELECTOR
in 3 places for it to work for me.

What do you mean exactly by it fails?
I've tried it with a 5-chunk one and it did produce a combined SRT, but it does seem to skip the last line of each chunk (hard to say for sure though), unless something else caused those gaps in the one long srt I tested. So I'm wondering whether your issue is the same and I should look further into it, or whether it was something else.

Edit: I see that it does throw an error when combining, but it still mostly creates the file (minus a few lines). I don't know how it acted before, though.

Edit2: If you get a similar result to mine, basically the following portion of code (which merges the chunks together) throws some kind of error, but I don't know Python well enough to debug it, so it's hard to tell what exactly it does:
Code:
    with open(mysrt, 'r', encoding='utf-8') as f:
        srt = f.read()
        match = re.findall(r'\d+:\d+:\d+,\d+ --> \d+:\d+:\d+,\d+', srt)

    linerList = []
    liner = ""
    with open(wordtxt, "r", encoding="utf-8", errors='ignore') as wordfile:
        lines = wordfile.readlines()
        for line in lines:
            if line != '\n' and line is not lines[-1]:
                liner += line
            elif line != '\n' and len(linerList) == len(match)-1:
                liner += line
                linerList.append(liner)
                break
            else:
                linerList.append(liner)
                liner = ""

    count = 0
    with open(finalsrt, 'w', encoding='utf-8') as resfile:
        for timeline in match:
            resfile.write(f"{count+1}\n")
            resfile.write(timeline+'\n')
            resfile.write(linerList[count])
            resfile.write("\n")
            count += 1
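
One possible reading of the code above, without having seen the rest of the script: when the last line of the text file is not blank and the block count is off by even one, the `else` branch runs and that last line is dropped, and `linerList[count]` raises an IndexError as soon as there are fewer text blocks than timestamps. `line is not lines[-1]` also compares by identity rather than by position. The sketch below is a hedged alternative, not the script's actual fix: it assumes the translated text has one block per cue separated by blank lines, and the names `group_blocks` and `rebuild_srt` are made up for illustration.

```python
import re

TIMESTAMP = re.compile(r'\d+:\d+:\d+,\d+ --> \d+:\d+:\d+,\d+')


def group_blocks(lines):
    """Group non-blank lines into text blocks separated by blank lines."""
    blocks, current = [], []
    for line in lines:
        if line.strip():
            current.append(line.rstrip('\n'))
        elif current:
            blocks.append('\n'.join(current))
            current = []
    if current:  # keep the last block even without a trailing blank line
        blocks.append('\n'.join(current))
    return blocks


def rebuild_srt(srt_text, translated_lines):
    """Pair each timestamp in srt_text with one translated text block."""
    timelines = TIMESTAMP.findall(srt_text)
    blocks = group_blocks(translated_lines)
    if len(timelines) != len(blocks):
        # Fail loudly instead of writing a file with shifted or missing lines.
        raise ValueError(
            f"{len(timelines)} timestamps but {len(blocks)} text blocks")
    return ''.join(
        f"{n}\n{timeline}\n{block}\n\n"
        for n, (timeline, block) in enumerate(zip(timelines, blocks), start=1))
```

With the original variable names, the call would be something like `rebuild_srt(srt, lines)`, with the returned string written to `finalsrt`. A count mismatch then shows up as a clear error message rather than a subtitle file that is quietly a few lines short.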

Edit3: And to get rid of various warnings and errors, you can replace:
Code:
# Start a Selenium driver
driver_path=r'C:\Program Files (x86)\chromedriver.exe'
driver = webdriver.Chrome(driver_path)
with
Code:
s = Service(r"C:\Program Files (x86)\chromedriver.exe")
chrome_options = Options()
chrome_options.add_argument("--ignore-ssl-errors")
# The following 2 are probably unnecessary but don't hurt
chrome_options.add_argument("--ignore-certificate-errors")
chrome_options.accept_insecure_certs = True
# Gets rid of USB error
chrome_options.add_experimental_option('excludeSwitches', ['enable-logging'])
# Start a Selenium driver with chosen chrome options
driver = webdriver.Chrome(service=s, options=chrome_options)
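And add the following at the beginning (I did it after line 3):
Code:
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
# Only needed if the script does not already import By (used by By.CSS_SELECTOR)
from selenium.webdriver.common.by import By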
 

porgate55555

Active Member
Jul 24, 2021
51
163
Thanks for looking into it. That was exactly the issue. It merges the file, but combines the last 2 rows of a chunk together, so the whole thing is not usable. Thanks anyways :)

Edit: Was able to fix it after 3h of reverse engineering. Man, was that a pain, but it was worth the effort. Your improvements also make it better. Thanks again!
 

mei2

Well-Known Member
Dec 6, 2018
217
354
If anyone is interested, I have a large number of Whisper-generated subtitles. They are NOT cleaned, because I am only interested in understanding the story, not in having perfect subs. Duplicates are removed as well as possible and timings are slightly fixed.
Wow, thanks for the great collection!
 

amnscfnt

Active Member
Apr 28, 2008
102
83
I am having problems with Whisper on the Colab. It might be my computer, which has been weird lately. I set up Whisper, upload a file, run Whisper and... nothing. I get the "executing" text at the bottom but the clock just runs. After a while I get this message:

Screen Shot 2023-02-04 at 11.20.06 AM.png

What does this mean? If I change to standard runtime, I get a time out error after a while. Any ideas? Thanks in advance!!