A lot of these are tricky files. The Agemix files, for example, are almost always compilation videos that edit out all of the bits with dialog and just get to the... oh, you have to forgive me for this awful pun, but they just get to the climax. So you could have a 4-hour, 8-hour, etc. video with almost no dialog, and even the most finely tuned Whisper is still going to end up with those damned hallucinations. I'm always tweaking Whisper, and I know that you know the story, but without knowing Japanese and with over 20,000 files to deal with, I just have to take the best that I can get.
Presently I'm mostly happy with the results I'm getting. One little issue is that occasionally Whisper/VAD will completely ignore dialog that is obvious, clearly spoken, and without any background noise. I have tweaked and tweaked and tweaked my config files to try to resolve this, but the trade-off is more hallucinations. I hate hallucinations, so my current configs are set to aggressively remove them, and if bits of dialog get missed, shrug...
Oh, one other thing: it would be interesting to re-evaluate these files after my re-scan of everything. If you're up to it, see what you find after I've run everything through again. I expect to be done with 'A' maybe tomorrow if all goes well.
I have a ton of other background work to do, such as renaming all my files. I'm stripping out all of the useless text after the code in the filename. One problem is that I am blanket-removing all text past 11 characters. I picked 11 because some files have longer codes than others and I don't want to cut off part of a code, but the end result is that I'm getting a lot of files named like "ASW-234 de". The "de" is a result of that 11-character cutoff. Most of the new files will have this, but at least all of that idiotic long text will be gone.
In any case, I've babbled too long. Thanks for the excellent feedback, Mei. It's good that you caught it.
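On the 11-character cutoff, one possible alternative is to match the code itself and drop everything after it, falling back to the fixed cutoff only when no code is found. This is only a rough sketch: it assumes every code looks like letters/digits, a hyphen, then digits (as in "ASW-234"), and the pattern and the function names `clean_stem` and `clean_name` are made up for illustration, not anything from the actual rename script.

```python
import re
from pathlib import Path

# Assumed code shape: letters/digits, a hyphen, then digits (e.g. "ASW-234").
# This pattern is a guess and may need adjusting for other naming schemes.
CODE_RE = re.compile(r"^([A-Za-z0-9]+-\d+)")


def clean_stem(stem: str, fallback_len: int = 11) -> str:
    """Return just the leading code, or a trimmed fixed-length prefix if none matches."""
    match = CODE_RE.match(stem)
    if match:
        return match.group(1).upper()
    # No recognizable code: fall back to the blanket cutoff, minus trailing spaces.
    return stem[:fallback_len].rstrip()


def clean_name(filename: str) -> str:
    """Clean the name part of a filename while keeping its extension."""
    path = Path(filename)
    return clean_stem(path.stem) + path.suffix


if __name__ == "__main__":
    print(clean_name("ASW-234 description text here.srt"))
```

Because the match stops at the last digit of the code, a name that was already cut to "ASW-234 de" would also come out as "ASW-234", so it could be run over files that have been renamed once already.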
I think you might also have a situation where some audio extractions have failed. I was just running stats on the A set and got the result below. Many of the files are just filled with repetitions. I attribute that to audio failure, I guess.
View attachment 3680909
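For anyone wanting to flag such files themselves, here is a minimal sketch of one way to measure repetition. It is not the method behind the stats in the attachment. It assumes the subtitle text has already been pulled out as a list of lines (timestamps removed), and the names `repetition_ratio` and `looks_repetitive` and the default thresholds are arbitrary choices for illustration.

```python
from collections import Counter


def repetition_ratio(lines):
    """Share of non-empty lines taken up by the single most frequent line."""
    cleaned = [line.strip() for line in lines if line.strip()]
    if not cleaned:
        return 0.0
    most_common_count = Counter(cleaned).most_common(1)[0][1]
    return most_common_count / len(cleaned)


def looks_repetitive(lines, threshold=0.5, min_lines=10):
    """Flag a file when one line dominates; very short files are never flagged."""
    cleaned = [line.strip() for line in lines if line.strip()]
    return len(cleaned) >= min_lines and repetition_ratio(cleaned) >= threshold
```

A file where half or more of the lines are the same text would be flagged under these defaults. Whether that points to a failed audio extraction or to a hallucination loop would still need a manual check.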