Translate Scanned Filings From Image to Text
Upload a scanned document and Niyam reads the text with OCR, then translates it into the Indian language you choose.
What changed
Translation now handles scanned and image-based documents as well as plain text. When you upload a photograph of a court order, a scanned notice, a faxed filing, or any other document that only exists as an image, Niyam first extracts the text using optical character recognition and then translates it into the Indian language you select. The two steps run as a single pipeline, so you upload once and receive a translation without having to type out the source text yourself. The pipeline connects to the same side-by-side review interface you use for text-based translations, so the experience is consistent regardless of how the source document arrived.
How to use it
- Open Translation and choose to upload a document.
- Select your scanned file or photograph of the filing.
- Set the source language if it is not detected automatically, then choose the target language.
- Start the job. Niyam extracts the text using OCR and then passes it through translation. Both steps run in sequence without any input from you.
- When the job completes, review the result using the side-by-side interface. Compare the translated text to the original image, paying particular attention to names, case numbers, dates, and numerical figures.
- Export or copy the translated text once you are satisfied with the result.
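For readers who want a mental model of the steps above, the sequence can be sketched as a two-stage function. This is only an illustration, not Niyam's implementation: `run_pipeline`, `TranslationJob`, `fake_ocr`, and `fake_translate` are hypothetical names, and the two stand-in stages do no real OCR or translation.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TranslationJob:
    """Result of one pass through the hypothetical two-stage pipeline."""
    source_text: str      # text recovered from the image by the OCR stage
    translated_text: str  # translation of source_text


def run_pipeline(
    image_bytes: bytes,
    target_language: str,
    ocr: Callable[[bytes], str],
    translate: Callable[[str, str], str],
) -> TranslationJob:
    """Run OCR first, then feed its output straight into translation."""
    source_text = ocr(image_bytes)
    translated_text = translate(source_text, target_language)
    return TranslationJob(source_text, translated_text)


# Stand-in stages for illustration only (assumptions, not real services).
def fake_ocr(image_bytes: bytes) -> str:
    return image_bytes.decode("utf-8")


def fake_translate(text: str, target_language: str) -> str:
    return f"[{target_language}] {text}"


job = run_pipeline(b"Notice dated 12/03/2021", "hi", fake_ocr, fake_translate)
print(job.translated_text)
```

Keeping the extracted source text alongside the translation mirrors the side-by-side review step: any OCR misread is visible in `source_text` before it is blamed on the translation.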
Why it matters
Many legal documents in India still circulate as physical papers and scans. Court orders are photographed, notices arrive by post and are scanned into folders, case files from older matters exist only on paper, and documents from district courts frequently come as images rather than digital text. Until now, translating any of these required typing out the source text first, an extra step that could take as long as the translation itself for a dense or lengthy document.
Folding OCR into the translation pipeline removes that barrier entirely. A scanned notice in one Indian language becomes translated, readable text you can respond to, analyze, or file in your matter, all in a single pass through the tool. For advocates who regularly work across languages, or for in-house counsel handling filings from courts in different linguistic jurisdictions, this ends the delay that every image-only document used to incur.
It also makes working through a volume of documents manageable. When a batch of scanned materials arrives and every item needs translating, running them through the pipeline one by one is a realistic workflow. Transcribing each one before you can start is a different undertaking entirely.
Good to know
- OCR accuracy depends on the quality of the source image. Clear, straight, well-lit scans produce good results. Faint, skewed, or low-resolution images lead to more OCR errors, and those errors carry through to the translation output.
- Always review the result against the original image, especially for names, case reference numbers, dates, and monetary figures. OCR can misread individual characters, and those details matter in a legal document.
- This pipeline is for documents that arrive as images. For text you are actively writing or editing, the in-editor translation feature in Draft remains the right tool.
- Side-by-side review works the same way for OCR-based jobs as it does for text-based ones, so your review process does not change.
- Translation carries a beta label while we continue to refine OCR handling for less standard document formats and improve quality across all supported language pairs.
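If you export the translated text and want a quick checklist for the review step recommended above, a small script can pull out every token that contains a digit, which covers most case numbers, dates, and monetary figures. This is a minimal sketch that assumes you have the exported text as a plain string; `tokens_to_verify` is a hypothetical helper, not part of Niyam, and it does not catch misread names.

```python
import re


def tokens_to_verify(text: str) -> list[str]:
    """Return tokens containing at least one digit, in order, without duplicates."""
    found: list[str] = []
    # \S*\d\S* matches any whitespace-delimited token that contains a digit.
    # For str patterns, \d also matches non-Latin decimal digits such as Devanagari.
    for token in re.findall(r"\S*\d\S*", text):
        cleaned = token.strip(".,;:()[]")  # drop surrounding punctuation
        if cleaned and cleaned not in found:
            found.append(cleaned)
    return found


sample = "Order in W.P. No. 1234/2021 dated 12.03.2021 directs payment of Rs. 50,000."
print(tokens_to_verify(sample))
# → ['1234/2021', '12.03.2021', '50,000']
```

Each item in the resulting list is something to compare character by character against the original image.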