Surya : OCR and line detection in 90+ languages
Date : 2024-01-10
Description
This summary was drafted with mixtral-8x7b-instruct-v0.1.Q5_K_M.gguf
Surya is an open-source document OCR toolkit developed by Vik Paruchuri. It offers accurate OCR in 90+ languages and line-level text detection in any language. It accepts images, PDFs, and folders of images or PDFs; table and chart detection is listed as coming soon. The toolkit includes a Streamlit app for trying Surya interactively on your own images or PDF files. Surya is named after the Hindu sun god, who has universal vision.
GitHub repo here
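To make the supported inputs concrete (a single image, a single PDF, or a folder of either), here is a minimal sketch of how a script might gather files before handing them to an OCR tool. It does not call Surya itself: the function name `collect_documents` and the extension list `SUPPORTED_SUFFIXES` are assumptions made for illustration, not part of the toolkit.

```python
from pathlib import Path

# Hypothetical extension list for illustration; the real toolkit may accept others.
SUPPORTED_SUFFIXES = {".png", ".jpg", ".jpeg", ".tiff", ".bmp", ".pdf"}


def collect_documents(path):
    """Return the image/PDF files found at `path` (a single file or a folder).

    Hypothetical helper: it only gathers candidate inputs and runs no OCR.
    """
    p = Path(path)
    if p.is_file():
        # Single file: keep it only if the extension looks like an image or PDF.
        return [p] if p.suffix.lower() in SUPPORTED_SUFFIXES else []
    if p.is_dir():
        # Folder: keep matching files at the top level, in a stable order.
        return sorted(
            f for f in p.iterdir()
            if f.is_file() and f.suffix.lower() in SUPPORTED_SUFFIXES
        )
    # Path does not exist.
    return []
```

The resulting list could then be passed to whichever OCR entry point is in use; check the GitHub repo for the actual commands and options.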