Gokkun

Thank you very much. Our file lists don't match up exactly, and many of my file names have extra characters, which are easy to delete, but when I keep things simple (:)) things work just fine on VLC.
It's very much appreciated.
I posted earlier about the filenames. As you can see, I am doing a massive project of creating subs for about 25K videos. I have spent a lot of time trying to tweak Whisper to give the best possible result, but there's a tradeoff: the more exact you try to make it, the more hallucinations you get. So while the results are not perfect, they are reasonably close.
With regard to the name shortening: when doing a batch of files, let's say 100, Whisper ignored maybe a dozen, and I eventually figured out it was because of the names. It didn't even have to be a long name. You could have ABC-123 gokkun-.ts and that may be enough to cause Whisper to ignore it. For me the solution was to rename all of my files with only the code.
My renaming program allows removing all characters after a certain point, but you have to define where you want the characters removed. For example, if the file is named ABC-123 and I tell my program to remove all characters after the 8th character, then that would work fine. The problem is that if a filename is, say, 15 characters, like FC2-ppv-1234556, then my parameters would rename that file to FC2-ppv- and remove all the necessary code. To combat that I have set my parameters to remove from the 11th character for most files. This will yield filenames like ABC-123 wh.ts. It is a bit cumbersome but it is the best I can do. If I were dealing with 5 or 10 files, it would be easy to do them individually, but with 25 thousand it is just not practical. Net result: the end-user will have to do a little file-renaming to make the srt and video file names match.
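For anyone who wants to script the renaming instead of cutting at a fixed character position, here is a rough Python sketch. It is only an illustration, not the renaming program described above: the pattern CODE_RE and the helper names are made up for this example, and it assumes every code looks like either "letters-digits" (ABC-123) or "FC2-ppv-digits". Anything that doesn't match is left alone.

```python
import re
from pathlib import PurePath

# Hypothetical pattern (an assumption, not from the post): a code is either
# "FC2-PPV-<digits>" or 2-6 letters, a hyphen, then digits, at the start of the name.
CODE_RE = re.compile(r"^(FC2-PPV-\d+|[A-Za-z]{2,6}-\d+)", re.IGNORECASE)


def fixed_truncate(stem: str, keep: int) -> str:
    """Fixed-position approach: keep only the first `keep` characters."""
    return stem[:keep]


def extract_code(stem: str) -> str:
    """Pattern approach: return the leading code, or the stem unchanged if no match."""
    match = CODE_RE.match(stem)
    return match.group(1) if match else stem


def new_name(filename: str) -> str:
    """Build the shortened file name, keeping the original extension."""
    path = PurePath(filename)
    return extract_code(path.stem) + path.suffix


if __name__ == "__main__":
    for name in ["ABC-123 gokkun-.ts", "FC2-ppv-1234556 extra text.ts"]:
        print(name, "->", new_name(name))
```

The same new_name function could be run over the .srt files as well as the videos so both end up with the same base name. It only computes the new name; the actual renaming (for example with os.rename) is left out on purpose so the result can be checked first.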
On the VLC front, I had nothing but nightmares trying to get VLC to recognize subs for files with the .ts extension. I have finally resorted to using PotPlayer as my default for .ts, and that works great. Ultimately I like VLC better, but dems the breaks.
 
Hi Fellows and Fellerettes and any other non-binary sorts who identify as a Clam. Here are the R files. There are about 1000 srts in this one. S is next and is an enormous batch; I think there are about 3000 files there. I have some 'housecleaning' to do with my files, so I will be taking a few days before I start with S, and S will likely take several days to process, so look for the S's later next week. In the meantime, here is R: https://filejoker.net/3rgriddrmhs4/R.rar
Your hard work is very much appreciated. Thank you very much!
 