Using OCR (experimental)

Recognize text from scans locally with Tesseract.js.

Updated May 8, 2026

OCR extracts text from scanned pages using Tesseract.js, which runs entirely in your browser. It's labeled experimental because accuracy depends heavily on scan quality, language, and layout.

The first run downloads a language model to your browser, so it can take a moment. Your document is never sent anywhere.
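For readers curious what an in-browser run might look like, here is a minimal sketch. It is not this tool's actual code. The names `OcrWorker`, `WorkerFactory`, and `ocrPages` are hypothetical, and the worker shape is an assumed subset of the Tesseract.js worker API (`recognize` and `terminate`). Pages are assumed to be image URLs or data URLs.

```typescript
// Assumed minimal subset of a Tesseract.js-style worker (hypothetical interface).
interface OcrWorker {
  recognize(image: string): Promise<{ data: { text: string } }>;
  terminate(): Promise<unknown>;
}

// Hypothetical factory type: creates a worker for one language code, e.g. "eng".
type WorkerFactory = (lang: string) => Promise<OcrWorker>;

// Runs OCR on each page in order and returns one text string per page.
async function ocrPages(
  createWorker: WorkerFactory,
  pages: string[],
  lang: string = "eng",
): Promise<string[]> {
  // Creating the worker is the step where language data would be fetched
  // on a first run, which is why the first run can take a moment.
  const worker = await createWorker(lang);
  try {
    const texts: string[] = [];
    for (const page of pages) {
      const { data } = await worker.recognize(page);
      texts.push(data.text);
    }
    return texts;
  } finally {
    // Always release the worker, even if recognition fails on a page.
    await worker.terminate();
  }
}
```

With Tesseract.js installed, its exported `createWorker` could probably be passed as the factory, for example `ocrPages(createWorker, [pageUrl])`. Check that against the version you use.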

Getting better results

Use clean, high-contrast scans.
Straighten skewed pages before running OCR.
Expect to proofread the output.
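If you prepare scans yourself, the first tip can be approximated in code. The sketch below is a simple linear contrast stretch on grayscale pixel values, which may help with faint scans. It is an illustration only and not something this tool is known to do. The function name `stretchContrast` is hypothetical, and the input is assumed to hold one 0-255 gray value per pixel.

```typescript
// Linearly rescales gray values so the darkest pixel becomes 0 and the
// brightest becomes 255. Returns a new array; the input is not modified.
function stretchContrast(gray: Uint8ClampedArray): Uint8ClampedArray {
  let min = 255;
  let max = 0;
  gray.forEach((v) => {
    if (v < min) min = v;
    if (v > max) max = v;
  });

  // A flat (single-valued) or empty image has no contrast to stretch.
  if (max <= min) {
    return gray.slice();
  }

  const range = max - min;
  return gray.map((v) => Math.round(((v - min) * 255) / range));
}
```

For example, `stretchContrast(new Uint8ClampedArray([60, 80, 100, 160]))` yields the values 0, 51, 102, 255. A stretch like this does not fix skew or noise, so proofreading the output is still needed.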

Related tools

OCR PDF (Text Recognition)

experimental

Recognize text from scanned pages with an in-browser engine.

Runs in browser · PDF