PDF → Text / Markdown
Extract plain text
About this tool
Extract plain text or Markdown from a PDF. Reading order is reconstructed by sorting text items by their position on each page, so the output reads like the document - not like a random shuffle of words.
Use it to feed PDFs into ChatGPT, build a searchable archive, copy content into another document, or quickly scan the contents of a long report. Privvert runs the extraction locally with pdf.js - confidential PDFs never leave your machine.
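The position-based reading-order reconstruction described above can be sketched roughly as follows. This is a minimal illustration, not Privvert's actual code: the `TextItem` shape with flattened `x`/`y` fields, the `toReadingOrder` name, and the `lineTolerance` default are all assumptions. In pdf.js, the text items returned by `getTextContent()` carry a `str` and a `transform` matrix from which a position can be read.

```typescript
// Hypothetical, simplified text item. The flat x/y fields are an assumption
// made for clarity; real pdf.js items expose position via a transform matrix.
interface TextItem {
  str: string;
  x: number; // left edge, in PDF units
  y: number; // baseline, in PDF units (origin bottom-left, y grows upward)
}

interface Line {
  y: number; // baseline of the first item placed on this line
  items: TextItem[];
}

// Group items into lines by baseline, order lines top-to-bottom,
// then order items left-to-right within each line.
function toReadingOrder(items: TextItem[], lineTolerance = 2): string {
  // Larger y is higher on the page, so sort descending by y.
  const sorted = [...items].sort((a, b) => b.y - a.y);

  const lines: Line[] = [];
  for (const item of sorted) {
    const current = lines[lines.length - 1];
    // Items whose baselines are within the tolerance share a line.
    if (current !== undefined && Math.abs(current.y - item.y) <= lineTolerance) {
      current.items.push(item);
    } else {
      lines.push({ y: item.y, items: [item] });
    }
  }

  return lines
    .map((line) =>
      line.items
        .sort((a, b) => a.x - b.x)
        .map((item) => item.str)
        .join(" ")
    )
    .join("\n");
}
```

A sort like this treats every page as a single column, which is one reason multi-column layouts can come out interleaved.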
Features
- Plain text or Markdown output
- Preserves natural reading order
- Per-page text or single concatenated document
- Works on any text-based PDF
- Detects bullets and headings heuristically for Markdown output
- Browser-only - files never uploaded
- Free and unlimited
- Optionally preserves paragraph breaks based on PDF text layout
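One plausible way to derive paragraph breaks from text layout, as in the last feature above, is to compare the vertical gap between consecutive lines with the typical line spacing. The sketch below is an assumption about how such a step might look, not a description of the tool's internals. `TextLine`, `joinParagraphs`, and the `gapFactor` default of 1.5 are hypothetical.

```typescript
// Hypothetical line record: text plus baseline position (y grows upward).
interface TextLine {
  text: string;
  y: number;
}

// Join lines (already ordered top-to-bottom) into paragraphs.
// A gap clearly larger than the median gap is treated as a paragraph break.
function joinParagraphs(lines: TextLine[], gapFactor = 1.5): string {
  if (lines.length === 0) return "";

  // Vertical distance between each line and the one above it.
  const gaps: number[] = [];
  for (let i = 1; i < lines.length; i++) {
    gaps.push(lines[i - 1].y - lines[i].y);
  }

  // The median gap approximates the normal line spacing.
  const sortedGaps = [...gaps].sort((a, b) => a - b);
  const median =
    sortedGaps.length > 0 ? sortedGaps[Math.floor(sortedGaps.length / 2)] : 0;

  let out = lines[0].text;
  for (let i = 1; i < lines.length; i++) {
    const gap = lines[i - 1].y - lines[i].y;
    // Blank line for a paragraph break, single space otherwise.
    out += gap > median * gapFactor ? "\n\n" : " ";
    out += lines[i].text;
  }
  return out;
}
```

Using the median rather than the mean keeps one unusually large gap from distorting the estimate of normal spacing.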
How to use it
- Drop in your PDF.
- Pick plain text or Markdown.
- Click Extract.
- Download the .txt or .md file.
Everything happens inside your browser with JavaScript and WebAssembly. Your files are never uploaded, never stored, and never seen by us.
Frequently asked questions
No - scanned pages are images of text, not actual text. Run the OCR tool first to produce a text layer, then extract.
Very accurate for normal single-column documents. Multi-column layouts (newspapers, academic papers) sometimes interleave columns; manual cleanup is occasionally needed.
The Markdown converter uses font-size heuristics to guess heading levels. It's good for most documents but not perfect - review and adjust as needed.
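A font-size heuristic of this kind might look like the sketch below. It assumes the body size is the font size covering the most characters, and that a line noticeably larger than body text is a heading. The `StyledLine` type, the function names, and the ratio thresholds (1.6, 1.3, 1.15) are illustrative assumptions, not the converter's real values.

```typescript
// Hypothetical line record with the font size used to render it.
interface StyledLine {
  text: string;
  fontSize: number;
}

// Estimate the body font size as the size that covers the most characters.
function bodyFontSize(lines: StyledLine[]): number {
  const weightBySize: Record<string, number> = {};
  for (const line of lines) {
    const key = String(line.fontSize);
    weightBySize[key] = (weightBySize[key] || 0) + line.text.length;
  }

  let best = 0;
  let bestWeight = -1;
  for (const key of Object.keys(weightBySize)) {
    if (weightBySize[key] > bestWeight) {
      best = Number(key);
      bestWeight = weightBySize[key];
    }
  }
  return best;
}

// Prefix lines with Markdown heading markers based on size relative to body.
// The thresholds are assumed values chosen for illustration only.
function toMarkdown(lines: StyledLine[]): string {
  const body = bodyFontSize(lines);
  return lines
    .map((line) => {
      const ratio = body > 0 ? line.fontSize / body : 1;
      if (ratio >= 1.6) return "# " + line.text;
      if (ratio >= 1.3) return "## " + line.text;
      if (ratio >= 1.15) return "### " + line.text;
      return line.text;
    })
    .join("\n");
}
```

Because the levels are guessed from relative size alone, a document that marks headings only with bold text at body size would get no heading markers from this sketch, which is why the output is worth reviewing.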
Yes - switch to per-page mode and copy only the page you want.
PDF stores text in absolute positions on the page, not as a linear stream. Text in columns, footnotes or text boxes can come out in an unexpected order. The tool tries to reconstruct natural reading order but multi-column layouts are a known hard case.
No - it extracts text that's already inside the PDF. Pure image scans return nothing. Run them through the OCR tool first to get real text underneath the images.