Explore projects
Machine learning applied to deskewing and autocropping books and microfilm.
Contains tools to manage datasets as well as the actual ML scripts, which use TensorFlow and dhSegment.
Tesserotate (aka tesseract-baselines-rotate)
Calculates the rotation needed to bring images to their proper orientation with Tesseract, using the baselines found by its OSD (orientation and script detection).
Merlijn Wajer / dhSegment
GNU General Public License v3.0 only
Christian Clauss / morituri
GNU General Public License v3.0 only
www / Tesseract
GNU Affero General Public License v3.0
Tesseract deriver module used to OCR items with tesseract. Outputs hOCR and various metadata keys.
Merlijn Wajer / archive-pdf-tools
GNU Affero General Public License v3.0
Merlijn Wajer / hocr-tools
Apache License 2.0
Merlijn Wajer / archive-hocr-tools
GNU Affero General Public License v3.0
Merlijn Wajer / Python Derivermodule
GNU Affero General Public License v3.0
Shared code for Python deriver modules
See https://git.archive.org/www/tesseract/ instead
IA serverless deriving of .mp4 files from Popcorn projects. Logical mirror pull from https://github.com/Laurian/popcorn-exporter plus a .gitlab-ci.yml
Merlijn Wajer / ocr-dataset
GNU Affero General Public License v3.0
Temporary repository for implementations of https://arxiv.org/pdf/1905.13038.pdf
Will contain at least a Cython implementation and a C implementation (using leptonica to read images)