Ohh now I got you. I believe README should be improved in this part then.
Sorry, I forgot to answer your last question. It turns out that, because of how the algorithm works, the number of blocks shouldn't matter. It discretizes time into 10 ms windows, so the granularity of the "effective blocks" is fine enough that putting two separate large blocks on screen, each for a shorter period of time, is roughly equivalent (for synchronization purposes, that is) to putting a single large block on screen for twice as long.
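To make that concrete, here is a minimal sketch (not the project's actual code, and `rasterize` is a hypothetical helper) of the discretization idea: each subtitle file is turned into a binary "subtitle on screen" signal sampled every 10 ms, and alignment operates on that signal rather than on the blocks themselves.

```python
SAMPLE_MS = 10  # discretization step, as described above

def rasterize(blocks, total_ms):
    """Turn a list of (start_ms, end_ms) subtitle intervals into a
    binary signal sampled every SAMPLE_MS milliseconds."""
    signal = [0] * (total_ms // SAMPLE_MS)
    for start, end in blocks:
        for i in range(start // SAMPLE_MS, end // SAMPLE_MS):
            signal[i] = 1
    return signal

# One Chinese-style block that stays on screen from 1 s to 5 s ...
one_block = rasterize([(1000, 5000)], 6000)
# ... versus the same span split into two English-style blocks.
two_blocks = rasterize([(1000, 3000), (3000, 5000)], 6000)

print(one_block == two_blocks)  # prints True
```

Since both files rasterize to the same signal, splitting text across more blocks doesn't change what the aligner sees, which is why block counts in different languages don't hurt accuracy.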
It reads:
====
Although it can usually work if all you have is the video file, it will be faster (and potentially more accurate) if you have a correctly synchronized "reference" srt file, in which case you can do the following:
subsync reference.srt -i unsynchronized.srt -o synchronized.srt
====
I believe you should explain that if you have a reference file in another language that is correctly synchronized with the video, you can use that file instead of the video: its timestamps will serve as reference points when synchronizing the target .srt file.
This raises a question, though: what if the reference file has a different number of blocks? For example, some languages (like Chinese or Japanese) can say a lot in fewer characters than English, so a Chinese subtitle may stay on screen for a long time while the corresponding English text is split into two or more blocks. Wouldn't that make synchronization less accurate?
BTW that's a cool project. Thanks for sharing!