Thanks for the suggestion. I may do that. But for now, I'll point out that I'm going through the ZIP of 28,000+ Chinese subtitle files, and it's easy to find more examples.
For example, just under SNIS, after finding SNIS-035 with the gibberish inside, I started looking through the files one by one. About half of them contain Chinese characters and are translatable. The other half contain nonsense, and if you try to translate them you get only ���ַ���, which is not hard to predict, since what is in those files is not language.
Then I wondered if it's an issue just with the SNIS series, so I checked another common series, MIAD. The first two I looked at, MIAD128 and MIAD458, were fine. Then I checked MIAD501, and here are the first three entries as a sample.
1
00:00:01,160 --> 00:00:10,660
°f3P®õ?¯D¤Ñ?
2
00:01:20,030 --> 00:01:21,530
§A¦n
3
00:01:21,800 --> 00:01:23,930
§Ú¬O¯u?¡A§Ú¬O¤s¤â?
I have no idea what percentage of those Chinese soft subs are like this, but I'm beginning to think it's a crap ton. Like maybe a third or half of them. I can't possibly be the first person to notice this.
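One thing that may be worth ruling out is an encoding mismatch. The second entry in the MIAD501 sample, §A¦n, is exactly what the Big5 bytes for 你好 look like when they are read as Latin-1. Below is a minimal Python sketch of that check. It assumes the files are Big5 and were merely opened with the wrong encoding, which is only a guess based on this one sample. The function name `redecode` is made up for illustration, and any character shown as ? in the sample has already lost its original byte and cannot be recovered this way.

```python
def redecode(text, wrong="latin-1", right="big5"):
    """Undo a suspected mis-decoding.

    Assumption: `text` is what you see when bytes in the `right`
    encoding were decoded with the `wrong` one. Both defaults are
    guesses based on the MIAD501 sample, not confirmed facts.
    """
    # Turn the displayed characters back into the raw bytes.
    raw = text.encode(wrong, errors="replace")
    # Decode those bytes with the encoding we suspect is the real one.
    return raw.decode(right, errors="replace")


if __name__ == "__main__":
    # Second subtitle entry from the sample above.
    print(redecode("§A¦n"))  # 你好
```

If most of the "gibberish" files turn into readable Chinese this way, the problem is how the files are being opened rather than what is in them. If they don't, trying other encodings such as GBK in place of Big5 would be the next guess.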