16 March 2017: Tested a Trial Copy of ABBYY FineReader
Experiments
- To better judge what is possible with OCR, we are sampling both proprietary and open-source software
- We installed ABBYY FineReader 12.1.x on the dorkroom Mac mini
- We asked it to convert the same images used in the previous experiment with Tesseract
- It produced a PDF containing only the first three images (a limit of the trial version), with the following results:
  * Positive
    * Pages were automatically oriented for English LRTB
    * Pages were automatically straightened
    * It produced an indexed, searchable PDF
    * It indexes scientific terminology
    * It recognizes images, tables and diagrams, and paginates them in the resulting file
  * Neutral
    * The resulting PDF contains not only text, diagrams and images, but also the entire original scanner image
  * Negative
    * None of the pages was automatically cropped, so the scanner platen occupies most of the image
    * One of the pages (page 3) was cropped by FineReader, but incorrectly, removing all text content
  * Vis-a-vis Tesseract