
Tesseract[0] is the classic example. There's a bunch of advice for improving your accuracy with it, like making your images larger (literally just scale it up x2 or x4).

It would be interesting to see the benchmark from the article repeated with different scaling options (or other preprocessing, depending on platform).

[0]: https://github.com/tesseract-ocr/tesseract
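A rough sketch of what such a comparison could look like in Python, assuming Pillow and pytesseract are installed (neither is mentioned in the thread, and the factor list and function names are just illustrative choices):

```python
def scaled_size(size, factor):
    """Return (width, height) multiplied by an integer scaling factor."""
    width, height = size
    return (width * factor, height * factor)


def ocr_at_scales(path, factors=(1, 2, 4)):
    """Run Tesseract on one image at several upscaling factors.

    Assumption: Pillow and pytesseract are available. They are imported
    lazily so that scaled_size() stays usable without them.
    """
    from PIL import Image
    import pytesseract

    results = {}
    with Image.open(path) as img:
        for factor in factors:
            # Upscale with a smooth resampling filter before recognition.
            resized = img.resize(scaled_size(img.size, factor), Image.LANCZOS)
            results[factor] = pytesseract.image_to_string(resized)
    return results
```

Comparing the returned strings per factor against a known transcript would give a crude accuracy number for each scaling option.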



Running tesseract (4.0.0 using the LSTM engine) on the same images leaves a lot to be desired for handwriting, but does well on the (non-handwriting) website image (the source images are linked in the "OCR Image Processing Results" section).
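For anyone wanting to reproduce this, here is one way the LSTM engine can be selected from Python via the command-line tool. This is only a sketch: it assumes a `tesseract` 4.x binary on the PATH, the file name and language are placeholders, and it is not necessarily how the run above was done.

```python
import subprocess


def build_tesseract_command(image_path, lang="eng"):
    """Build a Tesseract 4 command that uses the LSTM engine only.

    "--oem 1" selects the LSTM engine; "stdout" as the output base
    makes Tesseract print the recognised text instead of writing a file.
    """
    return ["tesseract", image_path, "stdout", "--oem", "1", "-l", lang]


def run_tesseract(image_path, lang="eng"):
    """Run the command and return the recognised text (needs tesseract installed)."""
    completed = subprocess.run(
        build_tesseract_command(image_path, lang),
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout
```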


From the Tesseract FAQ:

"Can I use Tesseract for handwriting recognition?

You can, but it won’t work very well, as Tesseract is designed for printed text. Look for projects focused on handwriting recognition."

https://github.com/tesseract-ocr/tesseract/wiki/FAQ#can-i-us...


Tesseract works really well for literal OCR, but I haven't had much luck with more common work (like documents with tables and such). Has anything changed lately?


Since October there is a new version, v4: “Added a new OCR engine that uses neural network system based on LSTMs, with major accuracy gains. [..] Added trained data that includes LSTM models to 123 languages.”

No idea how well it does for structured content like tables.

There seems to be a recent v2 of a javascript port (i.e. Tesseract v4 compiled to wasm), if anyone wants to do OCR in their browser:

https://observablehq.com/@tmcw/tesseract-js-v2-alpha

https://github.com/naptha/tesseract.js


> (literally just scale it up x2 or x4).

Just to clarify, the input image should be 300 dpi.
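In other words, the scale factor depends on the resolution you start from. A small sketch of the arithmetic, assuming you know the source image's dpi (the function names are made up for illustration):

```python
TARGET_DPI = 300  # the figure quoted in the comment above


def scale_for_dpi(current_dpi, target_dpi=TARGET_DPI):
    """Return the factor by which to resize an image to reach target_dpi."""
    if current_dpi <= 0:
        raise ValueError("current_dpi must be positive")
    return target_dpi / current_dpi


def target_size(size, current_dpi, target_dpi=TARGET_DPI):
    """Return the (width, height) in pixels after rescaling to target_dpi."""
    factor = scale_for_dpi(current_dpi, target_dpi)
    width, height = size
    return (round(width * factor), round(height * factor))
```

So "x2" is the right factor for a 150 dpi source and "x4" for a 75 dpi one; a 96 dpi screenshot would need about 3.1x.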



