Post your JAV subtitle files here - JAV Subtitle Repository (JSP)★NOT A SUB REQUEST THREAD★

panop857 · Feb 13, 2023

For some reason many subs generated via Whisper will have a bunch of subtitles start exactly at 30 seconds, and totally rushed through and offset. Most of the time the timing is great, but I've seen this initial rush a few times and I can't tell how/why it is happening like this.

For the most part, the timings are pretty good compared to most JAV subtitles I've seen. They aren't anime quality, but for stuff that is automatically generated they seem pretty good.

I think I'm in a spot where I can write a Whisper intro thread, but I think I need to understand logprob_threshold. I understand the no_speech_threshold mechanic just fine, but logprob_threshold I do not have a good sense for. Also, compression_ratio_threshold is confusing as well and I don't know when to adjust it higher or lower than the default 2.4.
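A minimal sketch of what compression_ratio_threshold is measuring, assuming Whisper scores each decoded segment as raw UTF-8 size divided by zlib-compressed size (that is how the reference implementation appears to do it). The two sample strings are made up for illustration. Text stuck in a repetition loop compresses very well, so its ratio shoots past the default 2.4, while ordinary dialogue stays well under it.

```python
import zlib


def compression_ratio(text: str) -> float:
    """Raw UTF-8 size divided by zlib-compressed size; repetitive text scores high."""
    raw = text.encode("utf-8")
    return len(raw) / len(zlib.compress(raw))


# Made-up sample lines, not taken from any real subtitle file.
varied = "I can't believe you came all the way here. Sit down, I'll make some tea."
looped = "Thank you. " * 30  # the kind of loop Whisper sometimes falls into

for label, text in (("varied", varied), ("looped", looped)):
    print(f"{label}: {compression_ratio(text):.2f}")
```

Read this way, raising the threshold above 2.4 would tolerate more repetitive output before Whisper rejects a segment, and lowering it would make rejection more aggressive.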

For condition_on_previous_text I think I have settled on just setting it to False and keeping it there. You'll get scattered, totally left-field lines, but they can be deleted or replaced in the edit. The upside is that it gets stuck in loops way less often, and it is more willing to resort to descriptive emotes like "(moaning)", "(crying)", "(Heavy breathing)", "(gagging)" or "*panting*".

When you are conditioning to previous text, it will fit to a specific style of transcriptions, but with it set to False, if it sounds like some woman is gagging on something, it will reference some subtitle in those 680k hours of data where some woman sounds like she's gagging and will use the corresponding subtitle that noted (gagging).

r00g · Feb 14, 2023

@panop857 - have you checked this out? https://blog.deepgram.com/exploring-whisper/ She goes through compression ratio and log probability. Her examples are not very thorough or deep, but it's something.

And this discussion post describes the interrelation between compression_ratio_threshold, logprob_threshold and temperature - https://github.com/openai/whisper/discussions/406
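A rough sketch of how those three settings seem to interact, assuming the behaviour described in that discussion: a segment is decoded at the lowest temperature first, and is retried at the next temperature if it looks too repetitive (compression ratio above the threshold) or too uncertain (average log probability below the threshold). The Decoded class and the fake_decode function are hypothetical stand-ins, not Whisper's own types; the defaults shown (2.4, -1.0, 0.6 and temperatures from 0.0 to 1.0 in steps of 0.2) are the documented defaults as far as I know.

```python
from dataclasses import dataclass


@dataclass
class Decoded:
    """Hypothetical stand-in for the statistics of one decoded segment."""
    avg_logprob: float
    compression_ratio: float
    no_speech_prob: float


def needs_fallback(r, compression_ratio_threshold=2.4, logprob_threshold=-1.0):
    too_repetitive = r.compression_ratio > compression_ratio_threshold
    too_unsure = r.avg_logprob < logprob_threshold
    return too_repetitive or too_unsure


def is_silence(r, no_speech_threshold=0.6, logprob_threshold=-1.0):
    # A segment is skipped only if it looks like non-speech AND the text is low confidence.
    return r.no_speech_prob > no_speech_threshold and r.avg_logprob < logprob_threshold


def pick_temperature(decode, temperatures=(0.0, 0.2, 0.4, 0.6, 0.8, 1.0)):
    """Try each temperature in order; stop at the first result that passes both checks."""
    for t in temperatures:
        result = decode(t)
        if not needs_fallback(result):
            break
    return t, result


def fake_decode(t):
    # Pretend the model loops at low temperature and recovers once sampling gets noisier.
    if t < 0.4:
        return Decoded(avg_logprob=-0.3, compression_ratio=3.1, no_speech_prob=0.1)
    return Decoded(avg_logprob=-0.5, compression_ratio=1.6, no_speech_prob=0.1)


chosen_t, chosen = pick_temperature(fake_decode)
print(chosen_t, chosen)
```

Under these assumptions, logprob_threshold does double duty: it triggers a retry when confidence is low, and it also gates whether a high no_speech probability is allowed to drop the segment.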

panop857 · Feb 14, 2023

r00g said:
@panop857 - have you checked this out? https://blog.deepgram.com/exploring-whisper/ She goes through compression ratio and log probability. Her examples are not very thorough or deep, but it's something.

And this discussion post describes the interrelation between compression_ratio_threshold, logprob_threshold and temperature - https://github.com/openai/whisper/discussions/406

Thank you, I had seen the Temperature discussion but not that optimization blog.

Zephlol · Feb 14, 2023

SamKook said:
Their timing is shit too, or at least Whisper's timings are; I haven't looked at too many others.

Usually you want just a little bit of lead-in, to respect scene changes as much as possible, and to not display the subs for too long or too short a time. What Whisper does is pretty much display all subs for the same amount of time, mostly start on the second, and hold that pattern (it does stray a bit over time).
Might seem fine if you're not picky, but after timing anime for years, I personally can't stand it. I'll notice even if it's just one frame off.

Haven't used Whisper yet, but I've made 50+ srts using Adobe Premiere/YouTube as a starting template for timings, and it has been bang on every time, with the occasional line or two staying on for too long. Adobe has been awesome at capturing dialogue while ignoring background noise. YouTube not so much, as it has a lot of subtitles like (music) or (applause), but they can be easily deleted.

Prinsipe · Feb 14, 2023

Zephlol said:
Haven't used Whisper yet, but I've made 50+ srts using Adobe Premiere/YouTube as a starting template for timings, and it has been bang on every time, with the occasional line or two staying on for too long. Adobe has been awesome at capturing dialogue while ignoring background noise. YouTube not so much, as it has a lot of subtitles like (music) or (applause), but they can be easily deleted.

Can you post an example of an Adobe Premiere transcription of Japanese audio that has already been translated into English? I think you should run Whisper too, to compare their results, because in my opinion Whisper is by far the best at machine transcription.

If Adobe Premiere produces transcriptions comparable to Whisper's, then it will be another application that helps JAV subbers make their subtitles more accurate.

Thank you very much.

Zephlol · Feb 14, 2023

Prinsipe said:
Can you post an example of an Adobe Premiere transcription of Japanese audio that has already been translated into English? I think you should run Whisper too, to compare their results, because in my opinion Whisper is by far the best at machine transcription.

If Adobe Premiere produces transcriptions comparable to Whisper's, then it will be another application that helps JAV subbers make their subtitles more accurate.

Thank you very much.

This is the srt for RKI-606. It's my most recent completed raw srt. I ran the video through Premiere with Japanese detection and translated it to English using Subtitle Edit. Zero touch-up. Can someone with Whisper run the same video and post it here? I'm curious how they compare.
Link to the video

SamKook · Feb 14, 2023

Zephlol said:
Can someone with Whisper run the same video and post it here? I'm curious how they compare.
Link to the video

Did you use the 4.8GB download or the 1GB or so stream to make the subs?

It could make a difference and I'm unable to get the downloaded version since they don't provide free links. I am downloading the stream though.

Zephlol · Feb 14, 2023

SamKook said:
Did you use the 4.8GB download or the 1GB or so stream to make the subs?

It could make a difference and I'm unable to get the downloaded version since they don't provide free links. I am downloading the stream though.

The 4.8GB. Are you going to run it through Whisper? If so, wait a bit until I get home and I can upload the 4.8GB version to gdrive for a better comparison.

SamKook · Feb 14, 2023

Zephlol said:
The 4.8GB. Are you going to run it through Whisper? If so, wait a bit until I get home and I can upload the 4.8GB version to gdrive for a better comparison.

It will probably take me roughly 24h until I can do that since I'm going to sleep soon and then work, but if nobody else does it, I will. I found a torrent for the 4.8GB version and started it so I might be good to get it, shows at least one seed so good sign.

On a slightly different note, I found the old timing tutorial I made to teach new members back when I was doing anime fansubbing so if anyone is interested, I made a post in the tutorial section with it unedited so some stuff might be irrelevant or outdated but the principle is still good: https://www.akiba-online.com/thread...ike-an-anime-fansubber-using-aegisub.2114315/

Zephlol · Feb 14, 2023

SamKook said:
It will probably take me roughly 24h until I can do that since I'm going to sleep soon and then work, but if nobody else does it, I will. I found a torrent for the 4.8GB version and started it so I might be good to get it, shows at least one seed so good sign.

On a slightly different note, I found the old timing tutorial I made to teach new members back when I was doing anime fansubbing so if anyone is interested, I made a post in the tutorial section with it unedited so some stuff might be irrelevant or outdated but the principle is still good: https://www.akiba-online.com/thread...ike-an-anime-fansubber-using-aegisub.2114315/

I'm in no hurry. I'll upload it to this post when I get home. It will be a lot faster than the torrent, I reckon.

edit: Google Drive for RKI-606 I'll keep the link for a few days only.

mei2 · Feb 14, 2023

It is that time of the month: Queen Minami time


IPX-998 Teacher...Can I Stay The Night? During The Training Period, I, Who Lives In A Hotel, Was Forced By A Student To Share A Room With Me. Minami Aizawa

Like many of you here, I have been experimenting with various Whisper workflows and parameters, and I think I have come up with something that creates decent subs. If anyone here speaks Japanese, please take a look at this one and review it. Any spot check of accuracy and quality will be helpful. I appreciate it. I plan to write up my workflow once I am more sure of the quality.

Thanks in advance.

PS. How does one link Javlibrary here with large screenshots?

Zephlol · Feb 14, 2023

Don't make it an attachment. Copy and paste the cover as is.

On a side note. I don't know how you can stand the begging chooser requests @ scanlover, they bug me a lot.

mei2 · Feb 14, 2023

Zephlol said:
On a side note. I don't know how you can stand the begging chooser requests @ scanlover, they bug me a lot.

Yeah, it gets annoying sometimes. There are quite a few of them there, aren't there? But there are also a few genuine contributors too.

Gokkun Punch · Feb 14, 2023

superman4207 · Feb 14, 2023

Zephlol said:
I'm in no hurry. I'll upload it to this post when I get home. It will be a lot faster than the torrent, I reckon.

edit: Google Drive for RKI-606 I'll keep the link for a few days only.

@Zephlol @SamKook Hey guys, I went ahead and ran RKI-606 (the version from Gdrive) through Whisper. Here's the Whisper version of the sub so you can compare and contrast, Zephlol.

Anything for SCIENCE (and JAV)!

SamKook · Feb 14, 2023

Zephlol said:
Don't make it an attachment. Copy and paste the cover as is.

Don't hotlink pictures; make them attachments on the forum instead. Otherwise we end up attracting unwanted attention, since hotlinking steals bandwidth from those other websites, and the pictures end up as dead links down the road if that website dies or blocks hotlinking.

When you hover over your attached pictures, there are 2 options for how to insert them into your post. You can simply choose full size (or something like that, I'm not 100% sure of the term from memory) and it'll add the picture at full size where the cursor is in your post.

SamKook · Feb 14, 2023

panop857 said:
For some reason many subs generated via Whisper will have a bunch of subtitles start exactly at 30 seconds, and totally rushed through and offset. Most of the time the timing is great, but I've seen this initial rush a few times and I can't tell how/why it is happening like this.

For the most part, the timings are pretty good compared to most JAV subtitles I've seen. They aren't anime quality, but for stuff that is automatically generated they seem pretty good.

superman4207 said:
@Zephlol @SamKook Hey guys, I went ahead and ran RKI-606 (the version from Gdrive) through Whisper. Here's the Whisper version of the sub so you can compare and contrast, Zephlol.

Anything for SCIENCE (and JAV)!

Zephlol said:
This is the srt for RKI-606. It's my most recent completed raw srt. I ran the video through Premiere with Japanese detection and translated it to English using Subtitle Edit. Zero touch-up. Can someone with Whisper run the same video and post it here? I'm curious how they compare.
Link to the video

If you look at the subs generated by Whisper for RKI-606, you'll see that they almost always last for a whole number of seconds (the milliseconds at the beginning and end of a line are almost always the same), and the starting millisecond value stays the same across many lines until it changes, and then that repeats.

There's no way that's even remotely accurate to what's being said in any video. Having that happen once would be a rarity, but it's happening all over the file, which is why I call the timing shit. It's in the right general spot but it's not very good at all.

If you look at Zephlol's srt made with Premiere, you can see the milliseconds are pretty much never the same for the beginning and end of a line, and the next line also pretty much never starts at the same milliseconds, like normal subtitles would.
I haven't looked closely enough at it yet to say if it's good or not, but at first glance it should be much better than Whisper at least.
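The pattern described above can be checked mechanically. Here is a minimal sketch, assuming standard SRT timing lines, that measures what share of cues last a whole number of seconds. The two sample snippets are invented for illustration and are not taken from the RKI-606 files; cue_times_ms and whole_second_share are hypothetical helper names.

```python
import re

# Matches an SRT timing line such as "00:01:02,500 --> 00:01:04,500".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2}),(\d{3}) --> (\d+):(\d{2}):(\d{2}),(\d{3})"
)


def cue_times_ms(srt_text):
    """Return a list of (start_ms, end_ms) for every timing line found."""
    cues = []
    for match in TIMING.finditer(srt_text):
        h1, m1, s1, ms1, h2, m2, s2, ms2 = map(int, match.groups())
        start = ((h1 * 60 + m1) * 60 + s1) * 1000 + ms1
        end = ((h2 * 60 + m2) * 60 + s2) * 1000 + ms2
        cues.append((start, end))
    return cues


def whole_second_share(cues):
    """Fraction of cues whose duration is an exact multiple of one second."""
    if not cues:
        return 0.0
    hits = sum(1 for start, end in cues if (end - start) % 1000 == 0)
    return hits / len(cues)


# Invented samples: one with the rigid pattern, one with irregular timings.
WHISPER_LIKE = """1
00:00:30,000 --> 00:00:32,000
Line one

2
00:00:32,000 --> 00:00:35,000
Line two
"""

HAND_TIMED = """1
00:00:30,417 --> 00:00:31,902
Line one

2
00:00:32,155 --> 00:00:34,780
Line two
"""

print(whole_second_share(cue_times_ms(WHISPER_LIKE)))
print(whole_second_share(cue_times_ms(HAND_TIMED)))
```

A share close to 1.0 across a whole file would match the rigid pattern described for the Whisper output, while timings that follow the actual speech should land near 0.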

Zephlol · Feb 14, 2023

SamKook said:
If you look at the subs generated by Whisper for RKI-606, you'll see that they almost always last for a whole number of seconds (the milliseconds at the beginning and end of a line are almost always the same), and the starting millisecond value stays the same across many lines until it changes, and then that repeats.

There's no way that's even remotely accurate to what's being said in any video. Having that happen once would be a rarity, but it's happening all over the file, which is why I call the timing shit. It's in the right general spot but it's not very good at all.

If you look at Zephlol's srt made with Premiere, you can see the milliseconds are pretty much never the same for the beginning and end of a line, and the next line also pretty much never starts at the same milliseconds, like normal subtitles would.
I haven't looked closely enough at it yet to say if it's good or not, but at first glance it should be much better than Whisper at least.

Agree. Whisper timing is fixed, and gets annoying if the lines are either too short or too long. Premiere captures the duration of dialogue a little better, but is imperfect and tends to make some glaring errors (like a line lasting for faaaar too long).

For the translation itself, whisper wins. Premiere spews out nonsense from time to time.

I do find following the dialogue from Whisper to be a lot easier however

Mei's IPX-998 is pretty damn good in terms of translation, was this manually edited?

SamKook · Feb 14, 2023

Zephlol said:
Agree. Whisper timing is fixed, and gets annoying if the lines are either too short or too long. Premiere captures the duration of dialogue a little better, but is imperfect and tends to make some glaring errors (like a line lasting for faaaar too long).

For the translation itself, whisper wins. Premiere spews out nonsense from time to time.

I do find following the dialogue from Whisper to be a lot easier however

Yeah, I had a look at the Premiere subs on the video and they're not great either. They often start and end too early, and Premiere does miss the end of a line sometimes, like that line about 12 minutes in that lasts for over 4 minutes.

You can see why Gokkun Punch insists on getting a human timer: neither Whisper nor Premiere produces anything even remotely close to quality timing. Good enough to follow if you don't want to put in hours of time doing it yourself, though.
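Lines like that 4-minute one are easy to catch automatically before a human timer ever looks at the file. A minimal sketch, assuming cues are already available as (start, end) pairs in seconds; the 0.7 and 7.0 second limits are placeholder assumptions rather than any agreed standard, and flag_durations is a hypothetical helper name.

```python
def flag_durations(cues, min_s=0.7, max_s=7.0):
    """Return (cue_number, reason, duration_s) for cues outside the allowed range.

    cues is a list of (start_s, end_s) pairs; the limits are assumptions to tune.
    """
    flagged = []
    for number, (start, end) in enumerate(cues, start=1):
        duration = end - start
        if duration < min_s:
            flagged.append((number, "too short", duration))
        elif duration > max_s:
            flagged.append((number, "too long", duration))
    return flagged


# Invented example: a normal line, a flash of a line, and one that starts
# 12 minutes in and hangs on screen for over 4 minutes.
sample = [(5.0, 7.5), (10.0, 10.3), (720.0, 975.0)]

for number, reason, duration in flag_durations(sample):
    print(f"cue {number}: {reason} ({duration:.1f}s)")
```

This only catches the glaring errors. It says nothing about lead-in or scene changes, which still need a human eye.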

Zephlol · Feb 14, 2023

SamKook said:
Yeah, I had a look at the Premiere subs on the video and they're not great either. They often start and end too early, and Premiere does miss the end of a line sometimes, like that line about 12 minutes in that lasts for over 4 minutes.

You can see why Gokkun Punch insists on getting a human timer: neither Whisper nor Premiere produces anything even remotely close to quality timing. Good enough to follow if you don't want to put in hours of time doing it yourself, though.

Yep. AI is not ready yet
