Using OCR (experimental)
Recognize text from scans locally with Tesseract.js.
Updated May 8, 2026
OCR extracts text from scanned pages using Tesseract.js, which runs entirely in your browser. It's labeled experimental because accuracy depends heavily on scan quality, language, and layout.
The first run downloads a language model to your browser, so it can take a moment. Only the model is downloaded; your document stays on your device and is never uploaded.
Getting better results
- Use clean, high-contrast scans.
- Straighten skewed pages before running OCR.
- Expect to proofread the output.
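
If you prepare scans in code before running OCR, the "high-contrast" tip above can be approximated with a simple black-and-white threshold. The sketch below is illustrative only: `binarizeRgba` and its default threshold of 128 are hypothetical choices, not part of Tesseract.js, and the function assumes pixels arrive in the RGBA layout that a canvas `ImageData` object uses. Whether this improves recognition depends on the scan, so treat it as something to experiment with.

```typescript
// Hypothetical helper (not part of Tesseract.js): turns RGBA pixel data
// into pure black and white, modifying the array in place.
function binarizeRgba(
  pixels: Uint8ClampedArray,
  threshold: number = 128 // assumed default; tune per scan
): Uint8ClampedArray {
  for (let i = 0; i + 3 < pixels.length; i += 4) {
    // Perceived brightness using the common Rec. 601 luma weights.
    const luma =
      0.299 * pixels[i] + 0.587 * pixels[i + 1] + 0.114 * pixels[i + 2];

    // Pixels at or above the threshold become white, the rest black.
    const value = luma >= threshold ? 255 : 0;
    pixels[i] = value;
    pixels[i + 1] = value;
    pixels[i + 2] = value;
    // The alpha channel (pixels[i + 3]) is left unchanged.
  }
  return pixels;
}

// Example: one light-gray pixel and one dark-gray pixel.
const sample = new Uint8ClampedArray([200, 200, 200, 255, 50, 50, 50, 255]);
const result = binarizeRgba(sample);
console.log(Array.from(result));
```

In a browser, the pixel array would typically come from a canvas 2D context via `getImageData(...).data` and be written back with `putImageData` before the canvas is handed to OCR. A fixed threshold is the simplest option; unevenly lit scans may need a different value or an adaptive method.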