16 March 2017: Tested a Trial Copy of ABBYY FineReader

Experiments

To better judge what is possible with OCR, we are sampling both proprietary and open-source software.
We installed ABBYY FineReader 12.1.x on the dorkroom Mac mini.
We asked it to convert the images from the previous experiment with Tesseract.
It produced a PDF containing only the first three images (a limit of the trial version), with the following observations:

* Positive
 * Pages were automatically oriented for English LRTB
 * Pages were automatically straightened
 * It produced an indexed, searchable PDF
 * It indexes scientific terminology
 * It recognizes images, tables and diagrams, and paginates them in the resulting file
* Neutral
 * The resulting PDF contains not only the text, diagrams and images but also the entire original scanner image
* Negative
 * None of the pages was automatically cropped, so the scanner platen occupies most of the image
 * One page (page 3) was cropped by FineReader, but incorrectly, removing all text content
* Vis-a-vis Tesseract
 *
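
The cropping problem noted under Negative could in principle be handled as a pre-processing step before OCR. The following is only a minimal sketch, not FineReader's or Tesseract's method: it assumes the scan is available as a grid of grayscale values (0 to 255) and that the empty platen is a roughly uniform shade that differs from the paper. The function names, the `background` value and the `tolerance` threshold are all hypothetical and would need tuning against real scans.

```python
def content_bbox(pixels, background=255, tolerance=16):
    """Return (top, left, bottom, right) of the area that differs from the platen.

    `pixels` is a list of rows of grayscale values. `background` and
    `tolerance` are assumed values for illustration, not measured ones.
    Returns None if every pixel looks like platen.
    """
    def is_content(value):
        return abs(value - background) > tolerance

    rows = [r for r, row in enumerate(pixels) if any(is_content(v) for v in row)]
    if not rows:
        return None
    width = len(pixels[0])
    cols = [c for c in range(width) if any(is_content(row[c]) for row in pixels)]
    # bottom and right are exclusive, so they can be used directly in slices
    return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1


def crop(pixels, bbox):
    """Cut the bounding box out of the pixel grid."""
    top, left, bottom, right = bbox
    return [row[left:right] for row in pixels[top:bottom]]


if __name__ == "__main__":
    # Toy 6x8 "scan": white platen (255) with a light grey page (200)
    # and a single dark text pixel (0).
    scan = [[255] * 8 for _ in range(6)]
    for r in range(1, 4):
        for c in range(2, 6):
            scan[r][c] = 200
    scan[2][3] = 0
    print(content_bbox(scan))  # (1, 2, 4, 6)
```

A simple bounding box like this would fail on a skewed page or a dirty platen, which is why it is shown only as a starting point for comparison with the automatic cropping behaviour described above.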
