I've cleaned up some of the Chinese subtitle packs that were posted here; most were from before 2021-2022, I think.
I've added tons of new subtitles from after 2021-2022 that I found in the RUNBKK packs, and I scraped
https://www.786551.xyz/ for all their subs. Combined, that's probably more than 5000 new ones, but I never checked the exact number.
If people know more sources for good Chinese subtitles, feel free to link them to me and I can scrape them (if possible). Or if they are big packs, I can sort and add them through a script I made.
There were tons of duplicates in those older Chinese packs. Most were deleted, but there will still be more than 1000 that remain. I removed most of them by re-encoding all files to UTF-8 without BOM and then checking whether the file size is the same. But obviously, if someone puts an advertisement in there, the file size is different and the duplicate slips through.
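For anyone who wants to try the same kind of cleanup, here is a rough sketch of that dedup step. It is not the script from the pack. It hashes the normalized content instead of comparing file sizes (a substitution, so that two different files that happen to have the same size are not treated as duplicates). The fallback encoding list and the `*.srt` pattern are assumptions, and like the file size check it will still miss copies with an advertisement inserted.

```python
import codecs
import hashlib
from collections import defaultdict
from pathlib import Path

# Assumption: encodings to try, in order. Adjust for your own packs.
FALLBACK_ENCODINGS = ("utf-8-sig", "gb18030", "big5")


def to_utf8_no_bom(raw: bytes) -> bytes:
    """Decode subtitle bytes and re-encode them as UTF-8 without a BOM."""
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        text = raw.decode("utf-16", errors="replace")
    else:
        for enc in FALLBACK_ENCODINGS:
            try:
                text = raw.decode(enc)  # "utf-8-sig" strips a BOM if present
                break
            except UnicodeDecodeError:
                continue
        else:
            text = raw.decode("utf-8", errors="replace")
    # Normalize line endings so CRLF and LF copies compare equal.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    return text.encode("utf-8")


def find_duplicates(root, pattern="*.srt"):
    """Return lists of paths whose normalized content is identical."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob(pattern)):
        if path.is_file():
            digest = hashlib.sha256(to_utf8_no_bom(path.read_bytes())).hexdigest()
            groups[digest].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

Calling `find_duplicates("subs")` only reports groups of identical files; it deletes nothing, so you can review the list before removing anything.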
There's also a Python script in there that I used to format all subtitles, remove all junk from the filenames, and sort them into folders. Be advised: never use the script without backing up your subtitle files first.
There are still about 500 unsorted subtitles in one folder that my script didn't catch. It's just too much of a hassle to fix them all, and they were all labels I didn't recognize anyway.
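To give an idea of how sorting by label can work, here is a minimal sketch. It is not the script in the pack. It assumes filenames contain an ID of the form letters plus digits (for example `abc-123`), which is a guess at the naming scheme, and `ID_PATTERN`, `plan_target`, and `known_labels` are hypothetical names made up for this example. It only computes where a file would go and touches nothing on disk.

```python
import re
from pathlib import Path

# Assumption: an ID is 2-6 letters, an optional separator, then 2-5 digits.
ID_PATTERN = re.compile(r"(?<![A-Za-z])([A-Za-z]{2,6})[-_ ]?(\d{2,5})")


def plan_target(filename, known_labels):
    """Return the relative path a subtitle would be moved to (dry run)."""
    p = Path(filename)
    for match in ID_PATTERN.finditer(p.stem):
        label = match.group(1).upper()
        if label in known_labels:
            # Junk around the ID is dropped; the file gets a clean name.
            return Path(label) / f"{label}-{match.group(2)}{p.suffix.lower()}"
    # No recognized label: this is the case that ends up in the unsorted folder.
    return Path("unsorted") / p.name
```

For example, `plan_target("[site.com] abc-123 chs.SRT", {"ABC"})` gives `ABC/ABC-123.srt`, while a file with an unknown label goes to `unsorted/` under its original name. Before doing any real moves, back up the whole folder first, for instance with `shutil.copytree`.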
These are great sources to put through an LLM and translate to English. I've seen DeepL mentioned a lot, but I honestly think DeepL sucks. DeepSeek does a way better job, but it's really slow. Gemini is a good middle ground: its translation is worse than DeepSeek's and better than DeepL's, but it's ultra fast. Gemini comes with something like 200€ in free credits as well, I think? I also find DeepL quite expensive if you go past their free monthly tokens.
This file has been shared with you on pixeldrain