
I don't know about orders of magnitude left, but we're definitely not close yet. This is just 5 languages (and frankly not even the 5 with the most text) and, just as importantly, only what is crawlable from the web. There's tons of stuff in popular ebook archives that you can't crawl from the web.

This is also relatively scant on code and scientific corpora.

We're just getting started.


