#Tag · Bonfire Dinteg Labs

What's working:

✅ Download JSON of each page from Amazon.
✅ Deobfuscate the SVG "DRM".
✅ Draw each letter on the page with the correct indent, placement, and font (italics, etc).

What's mostly working:

🚧 OCR. Tesseract recognises most of the text, but makes some errors.

What's not working:

❌ OCR doesn't output italics.
❌ Linebreaks are hardcoded.
❌ Doesn't integrate into the original ePub code, so no chapters etc.
❌ No idea how to handle footnotes, images, etc.
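
On the hardcoded linebreaks: since each letter is already drawn at a known position, one possible approach is to infer line breaks from the vertical coordinates instead. The sketch below is only an illustration under assumptions. The `Glyph` record, its `x`/`y` fields, and the `tolerance` value are hypothetical and do not reflect the actual JSON layout. Spaces are assumed to arrive as ordinary glyphs.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    # Hypothetical record: one drawn character and its page position.
    char: str
    x: float  # horizontal offset of the glyph
    y: float  # baseline height of the glyph


def group_into_lines(glyphs, tolerance=2.0):
    """Group glyphs into text lines by baseline, then order each line by x.

    Glyphs whose y differs from the first glyph of the current line by at
    most `tolerance` are treated as belonging to that line.
    """
    lines = []
    for glyph in sorted(glyphs, key=lambda g: (g.y, g.x)):
        if lines and abs(glyph.y - lines[-1][0].y) <= tolerance:
            lines[-1].append(glyph)
        else:
            lines.append([glyph])
    # Re-sort each line left to right, since small y jitter can disturb x order.
    return [
        "".join(g.char for g in sorted(line, key=lambda g: g.x))
        for line in lines
    ]


page = [
    Glyph("i", 10, 100.5),
    Glyph("H", 0, 100.0),
    Glyph("o", 0, 120.0),
    Glyph("k", 10, 120.3),
]
print(group_into_lines(page))
```

A fixed tolerance would likely break on superscripts or mixed font sizes, so a real version would probably need to scale it by the font size.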

#Kindle #DRM #OCR

Screenshot of a page of text. Indents and italics all work.

Music Channel boosted

Oblomov

@oblomov@sociale.network · 7 days ago

#AskFedi is there an #OCR for #music notation? Something that can convert scanned sheet music in some standardized music notation format that can be typeset with appropriate programs?
