Hacker News

Denoising algorithms are always lossy. An LLM (or, y'know, Markov chain) could do this job by exploiting statistical regularities in the English language, but a hex dump isn't quite the English language, so it'd be completely useless. Even if this text were English, though, the LLM would make opinionated edits (e.g. twiddling the punctuation): you'd be unlikely to get a faithful reproduction out the other end.


Of course, use search and replace to change 0 to "zero", 1 to "one", etc. before printing. The OCR will (should) work better on whole words than on single characters.
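Rough sketch of what I mean, in Python. The digit words come from the suggestion above; the words for a-f and the function names are just my own picks, nothing standard. It drops whitespace and lowercases the hex on the way through.

```python
# Hypothetical word list: "zero".."nine" as suggested; the a-f words are an arbitrary choice.
HEX_WORDS = [
    "zero", "one", "two", "three", "four", "five", "six", "seven",
    "eight", "nine", "alpha", "bravo", "charlie", "delta", "echo", "foxtrot",
]
# Reverse lookup: word -> single lowercase hex digit.
WORD_TO_DIGIT = {word: format(i, "x") for i, word in enumerate(HEX_WORDS)}


def hex_to_words(hex_text: str) -> str:
    """Spell out each hex digit as a word, ignoring whitespace in the dump."""
    return " ".join(HEX_WORDS[int(ch, 16)] for ch in hex_text if not ch.isspace())


def words_to_hex(word_text: str) -> str:
    """Turn the spelled-out words back into a lowercase hex string."""
    return "".join(WORD_TO_DIGIT[word] for word in word_text.lower().split())
```

If the OCR mangles a word into something that isn't in the list, `words_to_hex` raises a `KeyError`, so you at least notice. A misread that lands on another valid word still slips through.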


You might as well just use an error-correction code: same result, less overhead.
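To make that concrete, here's a toy Hamming(7,4) sketch in Python: 4 data bits plus 3 parity bits, and any single flipped bit in the 7 gets corrected. The function names are mine. Assumption worth flagging: OCR tends to swap whole characters rather than flip single bits, so for a real printout you'd probably want a symbol-level code (Reed-Solomon or similar) instead; this only shows the idea.

```python
def hamming74_encode(nibble: int) -> list[int]:
    """Encode a 4-bit value as 7 bits laid out as [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = [(nibble >> shift) & 1 for shift in (3, 2, 1, 0)]
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(bits: list[int]) -> int:
    """Fix at most one flipped bit, then return the original 4-bit value."""
    b = list(bits)
    s1 = b[0] ^ b[2] ^ b[4] ^ b[6]
    s2 = b[1] ^ b[2] ^ b[5] ^ b[6]
    s3 = b[3] ^ b[4] ^ b[5] ^ b[6]
    # The syndrome, read as a binary number, is the 1-based position of the bad bit.
    error_pos = s1 + 2 * s2 + 4 * s3
    if error_pos:
        b[error_pos - 1] ^= 1
    d1, d2, d3, d4 = b[2], b[4], b[5], b[6]
    return (d1 << 3) | (d2 << 2) | (d3 << 1) | d4
```

Unlike the spell-it-out trick, this recovers the exact original bits rather than just making errors less likely. Two errors in the same 7-bit block will still decode to the wrong value.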


> hex dump

ah, missed that, was just skipping through


Still would not solve the problem of copying data without changing it.



