
Yes, the dream is to fully automate the entire pipeline, then let it loose on a massive collection of scanned manuscripts and come back in a couple of days to perfect Markdown-formatted copies. I wish they would run my project on all the books on Archive.org, because the current OCR output is generally not usable.

