OdiaOCR

About

This initiative builds on Odia Lipi, a focused effort to address the longstanding challenges of digitizing Odia text from images, scanned documents, palm leaves, manuscripts, newspapers, and handwritten pages.

The goal is to host open OCR datasets, models, tools, and benchmarks that help researchers, developers, linguists, and archivists extract machine‑readable text from complex Indic scripts. Machine‑readable text is essential for education, cultural preservation, digital accessibility, and downstream AI applications.


Vision

To build robust, open, and community‑driven Odia OCR datasets and models that can accurately recognize both printed and handwritten Odia script, overcoming limitations of existing OCR tools and making Odia text fully searchable, editable, and usable in modern AI workflows.


Problem Statement

Odia, like many other Indic languages, is underserved by existing OCR systems, which struggle with:


What We Work On

This project aims to make Odia text searchable, editable, and machine‑interpretable, enabling downstream language technologies such as translation, summarization, and speech‑to‑text.


How to Contribute

We welcome contributions from researchers, students, linguists, and developers for:

Feel free to open issues, share data sources, or propose collaborations.


🧩 Visit the org page: https://huggingface.co/OdiaGenAIOCR