Convert Scanned PDF to Word — Editable Text via OCR
Scanned PDFs are images of pages: there's no actual text inside, just pixels. Converting one directly to Word produces a document full of image objects, not editable text. The fix is OCR (optical character recognition): turn the images of text into real text first, then convert. Both steps run in the browser, with nothing to install.
- Works in your browser — no install
- Files private and isolated to your workspace
- Free tier covers most everyday use
What you should know
How to tell if your PDF is scanned
Open the PDF and try to select text with your cursor. If the cursor only selects whole pages (not individual letters), it's scanned. If you can highlight specific words, it's already text.
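The same check can be approximated in code: if almost no text can be extracted from the pages, the file is probably a scan. This is a minimal sketch, not how our tool decides; the function name and the 20-character threshold are assumptions made up for illustration.

```python
def looks_scanned(page_texts, min_chars_per_page=20):
    """Guess whether a PDF is a scan from the text extracted from each page.

    page_texts: one string per page, as returned by a PDF text extractor.
    min_chars_per_page: hypothetical threshold; a page with fewer
    non-whitespace characters than this counts as having no text layer.
    """
    if not page_texts:
        return True  # nothing extractable at all: treat as scanned

    empty_pages = sum(
        1
        for text in page_texts
        if len("".join(text.split())) < min_chars_per_page
    )
    # Call the file scanned when more than half the pages carry no usable text.
    return empty_pages > len(page_texts) / 2
```

If the pypdf library is installed, the page strings could come from `[p.extract_text() or "" for p in PdfReader("scan.pdf").pages]`. Note that a scan that has already been through OCR will pass this check, because it carries an invisible text layer.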
OCR language matters
Tesseract (the OCR engine) needs the right language model to read your document. We support 25 languages including English, Spanish, French, German, Chinese (Simplified + Traditional), Japanese, Korean, Arabic, Hindi, Russian, Portuguese, and more. Pick the language(s) your document is written in.
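For the curious: Tesseract identifies each language model by a short code, and on its command line several codes are joined with `+` (for example `-l eng+spa`). The sketch below maps some of the languages listed above to their Tesseract codes and builds that string. How our tool passes your selection to Tesseract internally is not described here, so treat the helper name and structure as illustrative assumptions.

```python
# Tesseract language-model codes for some of the languages listed above.
TESSERACT_CODES = {
    "English": "eng",
    "Spanish": "spa",
    "French": "fra",
    "German": "deu",
    "Chinese (Simplified)": "chi_sim",
    "Chinese (Traditional)": "chi_tra",
    "Japanese": "jpn",
    "Korean": "kor",
    "Arabic": "ara",
    "Hindi": "hin",
    "Russian": "rus",
    "Portuguese": "por",
}


def tesseract_lang_arg(languages):
    """Build a Tesseract language string such as 'eng+spa'.

    Hypothetical helper: raises ValueError for an unknown or empty selection.
    """
    codes = []
    for name in languages:
        if name not in TESSERACT_CODES:
            raise ValueError(f"no Tesseract code known for {name!r}")
        code = TESSERACT_CODES[name]
        if code not in codes:  # ignore a language picked twice
            codes.append(code)
    if not codes:
        raise ValueError("pick at least one language")
    return "+".join(codes)
```

Calling `tesseract_lang_arg(["English", "Spanish"])` gives `"eng+spa"`, the value you would pass after `-l` when running Tesseract yourself.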
Quality of the scan affects accuracy
Clean, high-resolution scans (300 DPI) give 99%+ OCR accuracy. Phone-camera scans of paper can be messy (skew, lighting, fingers in frame) and drop to 90–95%. For best results, scan flat with good lighting or use a scanner app like Adobe Scan or CamScanner.
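To see whether a phone photo reaches 300 DPI, divide its pixel dimensions by the paper size in inches. The sketch below assumes the image is cropped to the page edges and defaults to US Letter (8.5 x 11 in); the function name is made up for illustration.

```python
def effective_dpi(pixel_width, pixel_height, page_width_in=8.5, page_height_in=11.0):
    """Return the lower of the horizontal and vertical pixels-per-inch.

    Assumes the image shows exactly one page, cropped to its edges.
    Defaults are US Letter; pass 8.27 and 11.69 for A4.
    """
    if min(pixel_width, pixel_height, page_width_in, page_height_in) <= 0:
        raise ValueError("dimensions must be positive")
    return min(pixel_width / page_width_in, pixel_height / page_height_in)
```

A Letter page captured at 2550 x 3300 pixels works out to exactly 300 DPI. If the page fills only part of the photo, the real figure is lower, which is one reason uncropped phone shots OCR worse.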
Two-step process
Step 1: OCR PDF turns the scan into a searchable PDF (image + invisible text layer). Step 2: PDF-to-Word converts that searchable PDF to .docx. Some users only need step 1 — a searchable PDF is editable in our editor and acceptable for most workflows.
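The decision behind the two steps can be sketched as a few lines of logic: OCR only when the PDF is a scan, convert only when you actually need a .docx. The step labels below are placeholders for illustration, not real API names.

```python
def plan_conversion(is_scanned, want_docx=True):
    """Return the ordered list of steps for a given PDF.

    is_scanned: True when the PDF has no selectable text.
    want_docx: False when a searchable PDF is all you need.
    Step labels are hypothetical placeholders, not tool or API identifiers.
    """
    steps = []
    if is_scanned:
        steps.append("ocr-pdf")      # step 1: add the invisible text layer
    if want_docx:
        steps.append("pdf-to-word")  # step 2: convert the text PDF to .docx
    return steps
```

A scan that only needs to be searchable stops after step 1, while a PDF that already contains real text skips straight to step 2.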
Tips that actually help
- If your scan has multiple languages (English + Spanish, etc.), select both — Tesseract handles multilingual OCR well.
- For receipts, IDs, and forms with small text, scan at 300 DPI minimum. 600 DPI is overkill for most cases.
- If OCR consistently misreads specific words (a custom name, a technical term), correct them manually in Word after conversion.
- For sensitive scans (tax docs, IDs), consider redacting in our edit-pdf tool before OCR if you're sharing with others.
OCR + convert your scan.
No install, no signup wall, no watermark on paid plans.
Frequently asked questions
How accurate is the OCR?
99%+ for clean printed text at 300 DPI, 90–95% for phone-camera scans. Handwriting accuracy is much lower (roughly 60–80% on neat handwriting) and depends heavily on the writer's clarity.
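To put those percentages in perspective, here is a rough calculation of how many characters you would need to fix per page. It assumes the figures are character-level accuracy and that a full page holds about 2,000 characters; both are assumptions for illustration, not measurements.

```python
def expected_errors(char_count, accuracy_percent):
    """Estimate how many characters OCR gets wrong.

    Assumes accuracy_percent is a character-level accuracy between 0 and 100.
    """
    if not 0 <= accuracy_percent <= 100:
        raise ValueError("accuracy_percent must be between 0 and 100")
    return round(char_count * (100 - accuracy_percent) / 100)
```

On a 2,000-character page, 99% accuracy leaves about 20 characters to correct, 90% leaves about 200, and 70% leaves about 600. That gap is why a clean scan saves so much cleanup time.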
Can I OCR a handwritten document?
Yes, with caveats. Tesseract's standard language models are trained on printed text, so it works best on neat, print-style handwriting. Cursive is harder. Expect 60–80% accuracy on neat handwriting, lower on messy writing.
What languages do you support?
25 languages including English, Spanish, French, German, Italian, Portuguese, Dutch, Russian, Polish, Turkish, Arabic, Hindi, Bengali, Chinese (Simplified + Traditional), Japanese, Korean, Vietnamese, Thai, and more.
Will the Word document look like the scan?
Layout is approximated — paragraphs reflow into normal Word formatting. Headers and bullets are usually detected. Heavy graphic layouts (newspaper-style) may need manual cleanup.
Can I OCR multi-page PDFs?
Yes. OCR runs on every page. The Free plan accepts files up to 25 MB and the Pro plan up to 250 MB.
Related scenarios
PDF to Word
PDF to Word Without Losing Formatting
The dream conversion: PDF in, perfectly editable Word doc out, fonts intact, tables aligned, no ASCII salad. Reality is messier: some PDFs never started life as Word documents (they're scans, or they were generated from InDesign), and even the cleanest PDFs require layout judgment to convert. Our converter aims for the best possible round trip: text stays as text, paragraphs stay as paragraphs, tables stay as tables when possible, and fonts are substituted with the closest available match.
PDF to Word
Convert PDF Resume to Word
You have a PDF resume that you want to update — maybe add a new job, tweak a bullet, or tailor for a specific role. Editing PDF directly is awkward; round-tripping to Word is the obvious move. The trick is keeping the layout intact: bullets aligned, headers bold, contact info on top. Our PDF-to-Word converter is tuned for resume layouts.
Edit PDF
Redact a PDF
Redacting a PDF means permanently removing sensitive information — not just covering it with a black rectangle that anyone can move out of the way in Acrobat. Real redaction strips the underlying text and the visible glyphs, so even copy-paste, search, and OCR-on-the-redacted-file can't recover the hidden data. This is what you need before sharing contracts, court filings, financial records, or anything covered by HIPAA, GDPR, or attorney-client privilege.