Explore projects
Tesseract deriver module that runs OCR on items with Tesseract. Outputs hOCR and various metadata keys.
This repository contains utilities for working with the 78rpm collection.
Lightweight, slimmed-down, JS-only prototype of the archive.org website.
IA serverless deriving of .mp4 files from Popcorn projects. Logical mirror pulled from https://github.com/Laurian/popcorn-exporter, plus a .gitlab-ci.yml.
Extract structured metadata and content from article PDFs; use this to match against databases of known identifiers.
Combines audio and unaligned captions into aligned captions.
Machine learning applied to deskewing and autocropping books and microfilm.
Contains tools to manage datasets as well as the ML scripts themselves, which use TensorFlow and dhSegment.
Builds an ffmpeg binary into a Docker-runnable NVIDIA runtime container that can run on GPU-enabled VMs and bare-metal machines.