The first pack has about 500 new subtitles since last time; more than 70% of them are probably from the last 2 months. This pack is
ordered by label. You can see all the new files in the output.txt below the post.
I've fixed some mistakes in the naming of HODV. If you find more big mistakes, let me know.
pixeldrain.com
The second pack is ordered by actress name.
IMPORTANT: Almost all actress folder names are in Japanese, except for a few like Julia and Aika. But each folder contains a .txt with the English/romanized names and all their aliases. So just search for the romanized name and then open the folder that holds the matching .txt.
TIP: Get a program called Everything. It's a freeware desktop search utility for Windows that can rapidly find files and folders by name. It's a godsend for finding something really fast. I wouldn't be able to live without it.
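For anyone not on Windows, the same lookup can be roughly sketched in Python. This is only a minimal sketch: it assumes each actress folder sits directly under the pack root and has a UTF-8 .txt inside, and the function name `find_actress_folders` is made up for illustration. It checks both the .txt file name and its contents, since the exact layout of the .txt is an assumption here.

```python
from pathlib import Path


def find_actress_folders(root, romanized_name):
    """Return folders under root whose .txt file mentions romanized_name."""
    needle = romanized_name.casefold()
    matches = []
    # Assumption: layout is <root>/<folder>/<something>.txt
    for txt in sorted(Path(root).glob("*/*.txt")):
        # Assumption: the alias files are UTF-8; undecodable bytes are replaced.
        text = txt.read_text(encoding="utf-8", errors="replace")
        if needle in txt.stem.casefold() or needle in text.casefold():
            if txt.parent not in matches:
                matches.append(txt.parent)
    return matches
```

Note that this is a plain substring match, so a short name can also hit longer names that contain it. Calling `find_actress_folders("path/to/pack", "some name")` returns a list of matching folders, which may be empty.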
pixeldrain.com
Workflow of the 2nd pack, for people who are interested in it
1. I used JAVstash as the main resource for this (last time it was the r18.dev DB). I chose it because it's probably the most accurate database that's publicly available. It's way better in that there are no duplicate actresses under different names. It also has all aliases and the most commonly used romanization. There's a .json database in the pack as well, with all the names and aliases.
2. Out of 23.5k subtitles, about 1,000 were left that weren't found. This is where I used the r18.dev DB. Any time I found a code and its actress on r18, I cross-referenced it with JAVstash to see if that actress was on JAVstash, so I could pull all the aliases as well. Any actress who wasn't found on JAVstash but is on r18 is marked with "R18_NAME" in the .json DB.
3. I now have about 500 subtitles left, and just doing those last 500 probably took me double the scripting time of the first 22.5k. The logic of the script became crazy complex to avoid duplication. I could also scrape jav.guru and javmost for the remaining subtitles like I did with V1 of the pack, but I'm not going to do that. All leftover files are old or from small labels that 99% of people won't care about.
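The lookup order in the steps above can be sketched roughly like this. It is not the actual script, just a minimal illustration of "primary database first, fallback second, cross-reference the fallback name against the primary aliases". The function name, the dict shapes, and the way the "R18_NAME" marker is stored are all assumptions made for the example.

```python
def resolve_actress(code, primary_by_code, primary_aliases, fallback_by_code):
    """Resolve a video code to an actress record, preferring the primary source.

    All arguments besides code are hypothetical in-memory dicts:
      primary_by_code:  code -> canonical name in the primary database
      primary_aliases:  canonical name -> list of aliases
      fallback_by_code: code -> name as listed in the fallback database
    """
    name = primary_by_code.get(code)
    if name is not None:
        return {"name": name, "aliases": primary_aliases.get(name, [])}

    fallback_name = fallback_by_code.get(code)
    if fallback_name is None:
        return None  # not found anywhere; stays in the leftover pile

    # Cross-reference: is the fallback name known to the primary database,
    # either as a canonical name or as one of the aliases?
    for canonical, aliases in primary_aliases.items():
        if fallback_name == canonical or fallback_name in aliases:
            return {"name": canonical, "aliases": aliases}

    # Only the fallback knows this actress; mark the record accordingly.
    # (The exact shape of the marker in the real .json is a guess.)
    return {"R18_NAME": fallback_name, "aliases": []}
```

Matching the fallback name against aliases is what keeps the same actress from ending up in two folders under different names, which is the duplication problem mentioned in step 3.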
I also have a script already made to convert all the Japanese folder names to the English names. The problem with JAVstash is that it has all the romanized names, but you don't know which one is currently used; sometimes there are 5 romanized names. I can query StashDB with the Japanese name and then get (mostly) the correct romanized version. StashDB is missing a lot of JAVs, but its actress list is quite up to date.
After I made it (what a waste of time) I decided not to run it because I realized that it's meaningless. If there are really requests for it, I can do it.
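If you want to rename the folders yourself, the rename step could look something like the sketch below. It assumes you already have a Japanese-to-romanized mapping as a plain dict (how you build it, for example from the .txt files or from a database lookup, is left out), and the function names `plan_renames` and `apply_renames` are made up for this example. It builds the plan first so you can check it before anything is touched.

```python
from pathlib import Path


def plan_renames(root, jp_to_romanized):
    """Return (old_path, new_path) pairs for folders that can be renamed safely."""
    root = Path(root)
    # Names already present, compared case-insensitively to be safe on Windows.
    taken = {p.name.casefold() for p in root.iterdir()}
    plan = []
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        new_name = jp_to_romanized.get(folder.name)
        if not new_name or new_name == folder.name:
            continue  # no mapping, or already named correctly
        if new_name.casefold() in taken:
            continue  # target name exists or is already planned; skip it
        taken.add(new_name.casefold())
        plan.append((folder, root / new_name))
    return plan


def apply_renames(plan):
    """Perform the renames produced by plan_renames."""
    for old, new in plan:
        old.rename(new)
```

Print the result of `plan_renames(...)` and look it over before calling `apply_renames(...)`. Folders that would collide with an existing name are skipped and keep their Japanese name.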