> False positives from SHA-1 hash collisions are detected after object retrieval from the disk by comparison with the requested key
Is that check really necessary?
To have a 1 in a trillion chance of an accidental SHA-1 collision they'd have to store 1.7*10^18 keys, and a mere key index of that size would require 54000 petabytes of RAM.
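For anyone who wants to check the arithmetic, here is a rough sketch using the usual birthday approximation p ≈ n^2 / (2 * 2^160). The function name and the 32 bytes per index entry are my assumptions; 32 bytes is simply the entry size that reproduces the 54000 PB figure, while the bare 20-byte digest alone would come to roughly 34000 PB.

```python
import math

def keys_for_collision_probability(p, hash_bits=160):
    """Approximate number of random keys at which the chance of any
    collision reaches p (birthday approximation, valid for small p)."""
    # p ~= n^2 / (2 * 2^bits)  =>  n ~= sqrt(2 * p * 2^bits)
    return math.sqrt(2.0 * p * 2.0 ** hash_bits)

BYTES_PER_ENTRY = 32  # assumption: 20-byte digest plus some bookkeeping

n = keys_for_collision_probability(1e-12)
index_petabytes = n * BYTES_PER_ENTRY / 1e15  # decimal petabytes

print(f"keys needed: {n:.2e}")
print(f"index size:  {index_petabytes:,.0f} PB")
```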
Comparing the SHA-1 is much faster than reading the SSD, so it's almost free.
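A minimal sketch of what such a check might look like, assuming the full key is stored next to the value on disk. The class and method names are hypothetical, a dict stands in for the SSD, and nothing here is taken from the actual system being discussed.

```python
import hashlib

class DigestIndexedStore:
    """Toy store indexed by a key digest (hypothetical, for illustration)."""

    def __init__(self, digest_fn=None):
        # Default to SHA-1; a weaker digest can be injected to force collisions.
        self._digest = digest_fn or (lambda key: hashlib.sha1(key).digest())
        self._disk = {}  # stand-in for the SSD: digest -> (full key, value)

    def put(self, key: bytes, value: bytes) -> None:
        self._disk[self._digest(key)] = (key, value)

    def get(self, key: bytes):
        record = self._disk.get(self._digest(key))
        if record is None:
            return None
        stored_key, value = record
        # The check under discussion: one comparison after the read
        # turns a digest collision into an ordinary miss.
        if stored_key != key:
            return None
        return value

# Force a collision with a deliberately weak one-byte "digest".
store = DigestIndexedStore(digest_fn=lambda key: key[:1])
store.put(b"alice:creditcard", b"secret")
print(store.get(b"alice:creditcard"))  # the real owner gets the value
print(store.get(b"attacker:anything"))  # same digest, different key: miss
```

The comparison costs a few dozen bytes of memcmp against a disk read that has already happened, which is why it is close to free.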
An accidental SHA-1 collision is probably not a problem, but in a few years [1] it will be possible to create SHA-1 collisions and use them as an attack. It looks difficult, but suppose that with the right string an attacker could retrieve another user's cached information, for example sha1("joedoe:creditcard")=sha1("atacker:hc!?!=u?ee&f%g#jo").
I don't know whether they are using randomization; without it, collisions could be used (in a few years) for a DoS attack [2].
Your example describes a preimage attack on SHA-1, not a collision attack. Even with a working collision attack you are probably still far away from taking "some.other.input" and creating a sha1("some.other.input") = sha1("johndoe:creditcard").
For instance, MD5 collisions are really easy to create, but for preimage attacks on MD5 there is still nothing practically better than brute force.
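To make the distinction concrete, here is a toy sketch on a hash truncated to 16 bits (my assumption, so both searches finish quickly). It only shows the generic costs: a collision lets you pick both inputs and falls to a birthday search, while matching a victim's fixed digest means scanning on the order of the whole output space. Real MD5 collision attacks exploit the function's structure and are not what this code does. All names are hypothetical.

```python
import hashlib
from itertools import count

BITS = 16  # toy digest size, chosen so both searches finish quickly

def toy_hash(data: bytes) -> int:
    """First BITS bits of SHA-1, as an integer."""
    return int.from_bytes(hashlib.sha1(data).digest()[:4], "big") >> (32 - BITS)

def find_collision():
    """Birthday search: any two distinct inputs with the same toy digest."""
    seen = {}
    for i in count():
        msg = b"msg:%d" % i
        h = toy_hash(msg)
        if h in seen:
            return seen[h], msg, i + 1
        seen[h] = msg

def find_second_preimage(victim: bytes):
    """Brute force: an input matching the digest of a fixed victim input."""
    target = toy_hash(victim)
    for i in count():
        msg = b"attacker:%d" % i
        if toy_hash(msg) == target:
            return msg, i + 1

first, second, collision_tries = find_collision()
forged, preimage_tries = find_second_preimage(b"johndoe:creditcard")
print("collision found after", collision_tries, "tries")
print("second preimage found after", preimage_tries, "tries")
```

On a 16-bit digest the collision search is expected to take a few hundred tries and the second-preimage search tens of thousands; at 160 bits the same gap is roughly 2^80 versus 2^160.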
Smells like a bad idea, I agree. You are more likely to hit a hardware or system error that causes your branch to be executed incorrectly than to see an accidental SHA-1 collision. And if you're worried about SHA-1 collisions, use SHA-2 or even SHA-3.