Explore projects
Arthur Milliken / morituri
GNU General Public License v3.0 only
Extract structured metadata and content from article PDFs; use this to match against databases of known identifiers.
ansible-roles-contrib / ansible-role-ferm
MIT License (archived)
Automatic extraction of data (e.g. content, title, etc.) from archived news pages
archivecd / pagenet
BSD 3-Clause "New" or "Revised" License
Merlijn Wajer / dhSegment
GNU General Public License v3.0 only
Merlijn Wajer / Python Derivermodule
GNU Affero General Public License v3.0
Shared code for Python deriver modules
www / Tesseract
GNU Affero General Public License v3.0
Tesseract deriver module used to OCR items with Tesseract. Outputs hOCR and various metadata keys.
Temporary repository for implementations of https://arxiv.org/pdf/1905.13038.pdf
Will contain at least a Cython implementation and a C implementation (using Leptonica to read images).
IA serverless deriving of .mp4 from Popcorn projects. Logical mirror pull from https://github.com/Laurian/popcorn-exporter plus a .gitlab-ci.yml
Merlijn Wajer / archive-hocr-tools
GNU Affero General Public License v3.0
Aram Verstegen / archive-hocr-tools
GNU Affero General Public License v3.0