If I use Print to PDF on a Word document, is the text underneath a black rectangle still there?

Almost always, yes. The black rectangle you drew in Word is a drawing object that sits on top of the text in the visual stack. When Word renders the page for printing, the rectangle is painted over the text, but the text is still a separate object underneath. The PDF print driver receives both - the text run and the rectangle on top of it - and writes both into the PDF in their original layers. The visual result looks identical to a redaction, but anyone who opens the file in Acrobat, Preview, or a browser PDF viewer can click-drag across the rectangle, copy what is selected, and paste the original text. This is the same mistake that has produced public-facing incidents at major US law firms, intelligence agencies, and government departments. The fix is a real redaction tool that removes the underlying content stream before saving, not a drawing on top of it.

Does Print to PDF strip the author name and editing history from a Word or Pages document?

It strips some of it and preserves more than people expect. The visible track-changes markup and comments are usually flattened into the rendered page (if they were displayed when you printed) or omitted (if they were hidden). The PDF metadata that the print driver writes, however, typically inherits the document's title and author fields from the source application's print metadata. On Windows and macOS the PDF /Producer field records the name of the PDF engine ('Microsoft: Print To PDF', a string ending in 'Quartz PDFContext', 'Adobe PDF Library 23.0'), and the /Creator field usually names the application that authored the source document. The original document's last-modified timestamp can leak through as the PDF /ModDate, and on Word documents the print path sometimes preserves the RSID (revision save identifier) values inside embedded XML. None of this is visible in the rendered page. All of it is visible to anyone who opens the PDF in a metadata viewer or runs a one-line exiftool command on it.

What about hidden layers, optional content groups, and form fields - do those survive Print to PDF?

Optional content groups (PDF's name for layers, the things you toggle in CAD drawings, multi-language documents, and engineering plans) are a property of the PDF itself, so a Print-to-PDF pass through a PDF source will often collapse them into a single rendered page - which is good if you wanted to flatten, bad if you needed the layers preserved. Form fields are a different story: a fillable PDF re-printed via Print to PDF in many viewers preserves the form fields as live, fillable fields rather than baking the filled-in answers into the page. If your goal is to send a completed form that the recipient cannot edit, Print to PDF is not enough; you need an explicit flatten step (in Acrobat, a form-flattening step such as the Preflight fixup for annotations and form fields; in pdftk, the flatten output option; in qpdf, --flatten-annotations). The same applies to digital signatures - re-printing a signed PDF removes the signature's cryptographic binding without warning, which silently invalidates the trust without changing the visible page.

Why does a screenshot of a PDF leak less than the PDF itself, and when is the screenshot the wrong choice?

A screenshot is a flat raster bitmap. It has no selectable text, no layers, no form fields, no embedded fonts, no JavaScript, no attached files, and no source-document metadata - the only metadata it carries is the screenshot tool's own (timestamp, sometimes the OS version, and on iOS/macOS the screen geometry, which is far less revealing than a PDF's full document history). For sharing a single piece of evidence - one paragraph, one diagram, one page of a contract - a screenshot is often the right answer because it cannot leak anything that is not visible in the image. The trade-off is everything you give up: no copyable text for the recipient, no accessibility for screen readers, larger file size for multi-page documents, no print quality for hard-copy use, and no machine-readable structure for downstream processing. The rule of thumb: send a screenshot when the recipient only needs to see, send a properly cleaned PDF when they need to use the document.

Does macOS 'Save as PDF' from the print dialog produce the same file as Adobe's 'Save As PDF'?

They are different PDFs from different engines and they leak different things. macOS's Save as PDF goes through Quartz PDFContext, which is a system-level PDF generator that writes a /Producer string of the form 'macOS Version 14.x (Build ...) Quartz PDFContext' (or the current version) into the metadata, preserves the source application's title and author by default, and uses a relatively compact PDF feature set that strips most interactive elements. Adobe Acrobat's Save As PDF (whether from the print dialog or from Acrobat's own export) uses Adobe PDF Library, writes /Producer: 'Adobe PDF Library' with the version, and tends to preserve more source-document structure - tagged PDF for accessibility, bookmarks from headings, sometimes the full editing thread from Word. Neither one is a redaction tool. Both encode the source application's identity into the file in a way that an inspector can read. If your threat model includes 'do not let the recipient know which OS and application produced this document', you need an explicit metadata-stripping pass after the export.

How do I actually inspect what a PDF is carrying before I send it?

Three quick checks cover most cases. First, open the PDF in any viewer and try Select All - if you can drag-select across what you thought were images or redactions, the underlying text is still there. Second, run exiftool on the file (exiftool report.pdf) - it reads out the /Title, /Author, /Producer, /Creator, /CreationDate, and /ModDate fields and tells you what the source app embedded. In many files those fields are also readable as plaintext in a hex editor, though their position in the file is not fixed and newer PDFs can store them inside compressed streams. Third, examine the PDF structure with a tool like pdfinfo, qpdf, or Acrobat's Preflight - this is where optional content groups (layers), attached files, embedded JavaScript, form fields, and digital signatures show up, none of which would be caught by a casual visual inspection. If you do not have those tools, the in-browser Privvert PDF metadata tool reads the same fields without uploading the file. Once you know what is in there, decide whether to strip it, redact properly, or flatten.

What is the cleanest way to produce a 'safe' PDF for sending - one that has no hidden text, no metadata, no layers, no editing history?

The reliable recipe in 2026 is a two-step pipeline. Step one is to do the real work in the source application: do all real redactions in a tool that strips the underlying content stream (Acrobat's Redact, or a local browser-based redaction tool), not by drawing rectangles; accept all tracked changes; remove comments; delete any hidden text frames; turn off 'Save preview thumbnails' if your app offers it. Step two is to re-export the PDF through a flattening pass - the most thorough is to render every page to an image and then re-assemble those images into a fresh PDF, which guarantees that no text layer, no metadata, no form field, and no layer survives. The Privvert PDF tools can do step two locally without uploading. The lower-effort version is to use Acrobat's Sanitize Document feature (it removes metadata, embedded files, JavaScript, and hidden layers in one pass) followed by a metadata wipe. For documents where the visible content really is everything you want to send - a one-page invoice, a single screenshot of a chart - a flat PNG or JPEG is shorter, easier, and guaranteed to leak nothing more than the pixels.

The 'Print to PDF' trap: what your exported PDF still contains - and what a screenshot leaves out

There is a comforting fiction about the Print to PDF button. The fiction goes like this: you have a document in Word or Pages or a browser tab, you choose File > Print > Save as PDF, and the result is a clean, sealed, paper-like artefact - a snapshot of the visible page with no editing history, no author name, no hidden layers, and nothing the recipient can dig into. It is the digital equivalent of pressing the document onto a sheet of paper. What you see is what they get.

The fiction is wrong in almost every detail. A PDF produced by Print to PDF in 2026 is not a snapshot of the page; it is a structured file with the text as selectable text, the images as embedded images, the layers as separate layers, the form fields as live form fields, and a metadata block that names the source application, the operating system, the document title, the author, the creation timestamp, and the last-modified timestamp. Every black rectangle you drew over a sensitive name is still sitting on top of the original text, which is still selectable underneath. Every comment you thought you removed may still be there in the document structure. The PDF /Producer and /CreationDate fields tell the recipient that you saved it on a Mac running Sonoma at 11:47 last Tuesday.

None of this is a bug. PDFs are designed to preserve all of that information, because most of the time you want them to. The problem is that the Print to PDF button looks and feels like a flattening operation, and it is not. This piece walks through what Print to PDF actually preserves, the categories of leakage people most often miss, why a flat screenshot leaks much less in some real situations, and the two-step recipe for producing a PDF that is genuinely safe to send.

What 'Print to PDF' actually does under the hood

On every modern operating system, the Print to PDF button routes the document through the same machinery the OS uses to send pages to a physical printer. The application generates a print job as a stream of high-level drawing operations (PostScript on older systems, Quartz drawing operations on macOS, XPS on Windows, Cairo or Skia on Linux and in browsers), the operating system hands that job to a virtual printer driver named something like 'Save as PDF' or 'Microsoft Print to PDF', and the virtual driver translates the drawing operations into PDF content streams instead of sending them to a physical device.

The crucial detail is what 'translation' means. PDF is not a bitmap format - it is a structured page description language with its own text, vector, image, font, and metadata primitives. When the print driver encounters a run of text in the print job, it does not rasterise the text into pixels and then write the pixels to the PDF. It writes the text as text, with the font, the position, and the character codes preserved, because that is the format PDF natively expects. The same is true for vector shapes (preserved as vectors), embedded images (preserved as compressed image objects), and overlaid drawing layers (preserved in their original z-order with the layer below still present).
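To make the translation concrete, here is a minimal sketch of what a page content stream can look like once a text run and a black rectangle have both been written to it. The stream is hand-written for illustration (real streams are usually compressed and use font-specific encodings), and the helper names are hypothetical rather than part of any PDF library.

```python
import re

# A hand-written, uncompressed content stream (illustrative only).
# Operators are painted in order, so the rectangle drawn last covers the text.
content_stream = b"""
BT
/F1 12 Tf
72 700 Td
(Settlement amount: 4,200,000 USD) Tj
ET
0 0 0 rg
70 695 220 16 re
f
"""


def literal_strings_shown(stream: bytes) -> list[str]:
    """Return the literal strings passed to the Tj (show text) operator.

    Simplified: ignores escaped parentheses, hex strings, and TJ arrays.
    """
    return [m.decode("latin-1") for m in re.findall(rb"\(([^()]*)\)\s*Tj", stream)]


def has_filled_rectangle(stream: bytes) -> bool:
    """True if the stream builds a rectangle path (re) and then fills it (f)."""
    return re.search(rb"\bre\s+f\b", stream) is not None


print(literal_strings_shown(content_stream))
print(has_filled_rectangle(content_stream))
```

The point of the sketch is that the text operator and the rectangle operator are independent entries in the same stream: painting the second does nothing to remove the first.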

This is the right behavior for the case Print to PDF was designed for: producing a high-quality, searchable, accessible, printable artefact of a document. It is the wrong behavior for the case people most often use it for: producing a sealed, flattened, redacted, anonymised artefact for safe sharing. The same machinery that makes the PDF searchable also makes every black rectangle transparent to a determined reader.

The five categories of leakage

Across the documents that end up causing public-facing incidents, the same five categories of leakage recur. None of them are visible in a normal page view.

1. Selectable text under visual redactions

This is the famous one, and it is covered in detail in the dedicated piece on PDF redaction, but it is worth restating because Print to PDF makes it particularly easy to do wrong. A black rectangle drawn in Word, Pages, Google Docs, Preview's markup tools, Acrobat's Comment tools, or the highlighter in a browser PDF viewer is a drawing object. The text underneath is a text object. Both objects survive Print to PDF unchanged, in their original z-order. The visible page shows the rectangle on top. The PDF content stream contains both, and any standard PDF viewer will let the recipient select-all, copy, and paste the underlying text. The list of well-documented public incidents involving exactly this mistake is long enough to fill a separate article - it includes US federal courts, the NSA, the TSA, several major law firms in New York and London, and at least one acquisition disclosure that re-priced a deal.
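A rough programmatic counterpart to the select-all test is to look for text-showing operators inside the file's streams. The sketch below uses only the Python standard library and makes several simplifying assumptions: it handles only uncompressed and Flate-compressed streams, it ignores encryption, and it treats any Tj or TJ operator as evidence of a text layer. The function name is hypothetical, and a proper PDF parser would be more reliable.

```python
import re
import zlib

# Matches the raw bytes between a "stream" keyword and the next "endstream".
STREAM_RE = re.compile(rb"(?<!end)stream\r?\n(.*?)endstream", re.DOTALL)
# Tj shows a single string, TJ shows an array of strings.
TEXT_OP_RE = re.compile(rb"[)>]\s*Tj\b|\]\s*TJ\b")


def count_streams_with_text(pdf_bytes: bytes) -> int:
    """Count streams that appear to contain text-showing operators."""
    count = 0
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            # decompressobj tolerates the end-of-line bytes after the data
            data = zlib.decompressobj().decompress(raw)
        except zlib.error:
            data = raw  # not Flate-compressed, so scan the bytes as they are
        if TEXT_OP_RE.search(data):
            count += 1
    return count


# Usage (hypothetical file name):
# with open("report.pdf", "rb") as fh:
#     print(count_streams_with_text(fh.read()))
```

A result of zero is not proof that the file is clean, because other filters and encrypted files are not handled here, but a non-zero result on a page that was meant to be fully redacted or flattened is a strong hint that text survived.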

2. Document metadata

Almost every PDF has a metadata dictionary, often called the Document Information Dictionary, that the file's trailer points to; its position in the file is not fixed. The standard fields are /Title, /Author, /Subject, /Keywords, /Creator (the application that authored the source document - 'Microsoft Word for Mac', 'Pages 14.0', 'LibreOffice 7.6'), /Producer (the engine that wrote the PDF itself - 'Adobe PDF Library 23.0', 'macOS Version 14.4 (Build ...) Quartz PDFContext', 'Microsoft: Print To PDF'), /CreationDate, and /ModDate. On documents produced via Print to PDF, the /Title and /Author fields are typically inherited from the source application's document properties without being shown to you, and the /Producer field records the engine and, in most cases, its version. A one-line exiftool command on the file reads all of this out in plain text.

On top of that, many PDFs carry a second metadata block in XMP (Extensible Metadata Platform) format - an XML document embedded in the file as a metadata stream that can include the original document's instance ID, a history of editing actions, the application versions used at each stage, and on some pipelines the original file path on the author's machine. XMP is what photo editors use to attach edit history to JPEG and TIFF files, and it travels in PDFs the same way.
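For a sense of how exposed these fields are, the following sketch pulls the standard Info dictionary fields out of raw PDF bytes with a regular expression and checks for an XMP packet. It is deliberately naive and assumes the fields are stored as unencrypted literal strings outside any compressed object stream; it will also pick up the first matching key anywhere in the file (a bookmark's /Title, for example). The function names are hypothetical, and exiftool or pdfinfo remain the more dependable route.

```python
import re

INFO_KEYS = ("Title", "Author", "Subject", "Keywords",
             "Creator", "Producer", "CreationDate", "ModDate")


def read_info_fields(pdf_bytes: bytes) -> dict[str, str]:
    """Return the first literal-string value found for each standard key."""
    fields: dict[str, str] = {}
    for key in INFO_KEYS:
        # /Key (value) with backslash escapes allowed inside the parentheses
        pattern = rb"/" + key.encode("ascii") + rb"\s*\(((?:\\.|[^\\()])*)\)"
        match = re.search(pattern, pdf_bytes)
        if match:
            # latin-1 never fails; UTF-16 values will look garbled here
            fields[key] = match.group(1).decode("latin-1")
    return fields


def has_xmp_packet(pdf_bytes: bytes) -> bool:
    """True if an uncompressed XMP packet marker is present in the file."""
    return b"<?xpacket begin=" in pdf_bytes or b"<x:xmpmeta" in pdf_bytes


# Usage (hypothetical file name):
# with open("report.pdf", "rb") as fh:
#     data = fh.read()
# print(read_info_fields(data), has_xmp_packet(data))
```

An empty result from this sketch does not mean the file has no metadata, only that none was stored in the simplest possible form.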

3. Layers, form fields, and optional content

PDFs support optional content groups - the spec's name for layers - which let a single PDF carry multiple toggleable visual states. CAD exports, GIS maps, multi-language documents, and engineering drawings routinely have ten or twenty layers, only one of which is visible at a time in the default view. When you Print to PDF from a viewer showing such a document, the behavior is viewer-dependent: some viewers flatten to the visible state, some preserve all the layers in the output, and some preserve the layers but hide all the non-visible ones in a way that a casual reader will not notice but an inspector will. The safe assumption is that layers survive unless you explicitly flatten.

Form fields are a related trap. A fillable PDF that you have completed in Acrobat or Preview, and then re-printed via Print to PDF, often comes out the other side with the form fields still live - the recipient can click into the field and edit your answer, or worse, can read the form's default value and any JavaScript validation attached to the field. If your intent is to send a filled-in form that the recipient cannot edit, you need an explicit Flatten Form Fields step, not a re-print.

3a. Digital signatures (which Print to PDF silently breaks)

A digital signature on a PDF is a cryptographic hash over the file's contents, signed with the signer's private key. The signature is valid only as long as the file's bytes match the hash. Re-printing a signed PDF via Print to PDF produces a new PDF that is visually identical but is byte-for-byte different; the new file does not contain the signature object at all, and any signature blocks that survive as visible page elements are now just decorative rectangles with no cryptographic meaning. The recipient sees a 'signed' document that is not actually signed, and the original signer's name and timestamp are still printed on the page. This is a real risk for contracts, audit reports, and regulatory filings.
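The byte-dependence can be sketched in a few lines. A PDF signature dictionary carries a /ByteRange array of offset and length pairs that marks which bytes of the file are covered by the digest, typically everything except the signature value itself. The example below shows only the hashing step, on made-up bytes; real signatures wrap the digest in a signed container and are verified against the signer's certificate, which this sketch does not attempt. The function name and sample data are hypothetical.

```python
import hashlib


def digest_over_byte_range(data: bytes, byte_range: list[int]) -> str:
    """SHA-256 over the (offset, length) pairs listed in a /ByteRange array."""
    digest = hashlib.sha256()
    for offset, length in zip(byte_range[0::2], byte_range[1::2]):
        digest.update(data[offset:offset + length])
    return digest.hexdigest()


# Made-up stand-in for a signed file: covered bytes, signature slot, covered bytes.
signed = b"HEAD" + b"<SIG>" + b"TAIL"
byte_range = [0, 4, 9, 4]  # covers "HEAD" and "TAIL", skips the 5-byte slot

# A re-print regenerates the file, so the covered bytes no longer match.
reprinted = b"head" + b"<SIG>" + b"tail"

print(digest_over_byte_range(signed, byte_range) ==
      digest_over_byte_range(reprinted, byte_range))
```

In practice the re-printed file usually has no signature dictionary left to check at all, so a viewer simply reports the document as unsigned.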

4. Comments, tracked changes, and hidden text

Word, Pages, and Google Docs all keep tracked changes and comments as a parallel structure to the document text. The print pipeline's behavior depends on the source application's print-time setting: if 'Print markup' is on, the comments are rendered to the visible page (and become part of the PDF the recipient sees); if it is off, the comments are typically dropped from the rendered page. What is less reliable is whether the comments survive in the PDF structure when they are not rendered. In some pipelines they are dropped entirely. In others - particularly the Acrobat plugin chain on Windows - they are preserved as PDF annotation objects that do not show on the page but are visible in the Comments pane of any PDF viewer that supports them.

Hidden text is a related category. Word documents often contain text in headers, footers, hidden text runs (Format > Font > Hidden), text in white-on-white, text inside collapsed outline sections, and text in document properties that the author never realised was being saved. The print pipeline's treatment of each of these is application-specific and version-specific, and the only reliable way to know what survived is to open the resulting PDF and inspect it.

5. Embedded fonts, attached files, and JavaScript

PDFs can carry embedded font subsets, which on rare occasions can leak the corporate font license text or the foundry's licensing metadata. They can carry attached files - the PDF spec lets you bolt an entire Excel workbook or a folder of images onto a PDF as an attachment, and some invoicing and EDI workflows do this routinely. They can carry JavaScript, which can fire on open and do things like populate form fields, validate input, or make outbound HTTP requests (the latter is blocked or gated behind a prompt by most modern viewers, but the JavaScript is still in the file as evidence of intent). Print to PDF from most viewers will strip attachments and JavaScript, and Print to PDF from a source application (Word, Pages) usually has nothing to strip because the source application does not generate them in the first place. The safer assumption is that any PDF whose origin you do not control may carry any of the above.

Why a screenshot leaks less, and where it does not

A screenshot is a flat raster bitmap. The format (PNG, JPEG, HEIC) has none of PDF's structure: no text layer, no optional content layers, no form fields, no annotations, no embedded JavaScript, no attached files, no XMP edit history. The only metadata it carries is the screenshot tool's own - a creation timestamp, sometimes an OS version, and on iOS/macOS the screen dimensions and the device model. That metadata exists, and on a photo it can include GPS coordinates and the camera's serial number (covered in the EXIF article), but a software-generated screenshot of a window has nothing comparable to leak.

For sharing a single piece of evidence - one paragraph of a long document, one chart, one error message, one diagram - a screenshot is often the right answer. Whatever is visible in the image is what the recipient gets. Nothing in the image can be selected, edited, or inspected for more than is already visible. The original document's full text, comments, metadata, and editing history simply do not travel with the picture.

The trade-offs are real, and they are the reason a screenshot is not the universal answer. Screenshots cost the recipient: no selectable text means no copy-paste, no machine-readable structure, no accessibility for screen readers, larger files for multi-page content, and lower quality if the recipient prints. They cost the sender on bulk documents - a 40-page contract as 40 PNGs is unwieldy. They lose vector quality on diagrams, which look fine on screen but pixellate when zoomed or printed. The trade-off discussion lives in the dedicated PDF or image piece, but the short version is: screenshot for evidence, cleaned PDF for documents the recipient needs to use.

The two-step recipe for a PDF that is safe to send

A PDF that is genuinely safe to send is one where the visible content is the only content - no underlying selectable text under redactions, no author metadata, no layers, no form fields, no attachments, no comments. There is a reliable recipe.

Step one: clean at the source

Accept or reject all tracked changes and remove all comments in the source application before exporting. In Word: Review > Accept All Changes, then Delete All Comments. In Pages: Edit > Track Changes > Accept All, then Insert > Comment > Delete All. In Google Docs: File > Version history > See version history will tell you what is in there; File > Make a copy creates a clean version.
Do real redactions in a tool that removes the underlying text from the content stream, not by drawing rectangles. Acrobat's Redact tool removes the text itself and leaves a black box with nothing underneath it, rather than a rectangle over text. The Privvert PDF redact tool does the same locally in the browser. The full reasoning and the list of well-known incidents from doing it wrong are in the PDF redaction article.
Empty the document's metadata fields in the source application before exporting. In Word: File > Info > Check for Issues > Inspect Document, which opens the Document Inspector, then remove document properties and personal information. In Pages: File > Advanced > Remove Pages metadata (the exact name varies by version). In Google Docs, the metadata is server-side and the cleanest path is to copy the text into a new document.
Turn off 'Print markup' / 'Print comments' / 'Print hidden text' in the print dialog before saving. If your application offers a 'Save preview thumbnail' option, turn it off - some older export paths embedded the source application's preview in the PDF itself.

Step two: flatten through a render pass

Even with a clean source, the export still produces a structured PDF with metadata, fonts, and potentially layers. The reliable way to flatten everything in one pass is to render every page to an image and then re-assemble those images into a fresh PDF. The result is a PDF where every page is a single image - no text layer, no metadata beyond the new PDF's own, no layers, no form fields, no annotations, no JavaScript, and no attachments. The cost is that the recipient cannot select text, and the file is larger.

The render-and-reassemble pass can be done locally in the browser with the Privvert PDF-to-images tool followed by the images-to-PDF tool. No upload, no server in the middle. The same effect is achievable with command-line tools (pdftoppm followed by img2pdf). The lower-effort alternative is Acrobat's Sanitize Document feature, which in one click strips metadata, attachments, JavaScript, hidden layers, and form fields - it does not flatten the text layer, but for most threat models a Sanitize pass followed by a metadata wipe is enough.
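The command-line route can be sketched as a small Python wrapper around pdftoppm and img2pdf. This assumes both tools are installed and on the PATH; the function names, the 150 dpi default, and the temporary 'page' prefix are illustrative choices, not recommendations from either tool.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def render_command(src: str, prefix: str, dpi: int = 150) -> list[str]:
    """pdftoppm call that writes one PNG per page as <prefix>-<page>.png."""
    return ["pdftoppm", "-r", str(dpi), "-png", src, prefix]


def assemble_command(images: list[str], dst: str) -> list[str]:
    """img2pdf call that wraps the page images, in order, into one PDF."""
    return ["img2pdf", *images, "-o", dst]


def page_number(path: Path) -> int:
    """Extract the page number from a name like page-07.png."""
    return int(path.stem.rsplit("-", 1)[1])


def flatten_pdf(src: str, dst: str, dpi: int = 150) -> None:
    """Render every page of src to an image and rebuild dst from the images."""
    for tool in ("pdftoppm", "img2pdf"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} not found on PATH")
    with tempfile.TemporaryDirectory() as tmp:
        prefix = str(Path(tmp) / "page")
        subprocess.run(render_command(src, prefix, dpi), check=True)
        pages = sorted(Path(tmp).glob("page-*.png"), key=page_number)
        subprocess.run(assemble_command([str(p) for p in pages], dst), check=True)


# Usage (hypothetical file names):
# flatten_pdf("contract.pdf", "contract-flat.pdf", dpi=200)
```

A higher dpi gives sharper print output at the cost of a larger file. The rebuilt PDF will still carry whatever metadata the assembling tool writes about itself, so it is worth running the inspection checks on the output as well.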

For documents where the visible content really is everything you want to send - a one-page invoice, a single screenshot of a chart, a copy of an ID - skip the PDF entirely and send a flat PNG or JPEG. Strip its EXIF first if it came from a phone or camera (covered in the photo metadata piece), and you have a file that cannot leak anything beyond the pixels you see.

How to inspect a PDF before you send it

Before any meaningful document leaves your machine, three quick checks catch most of the categories above.

Select All, then drag-copy across what you think is redacted or flattened. If the cursor changes to a text cursor over your black boxes, the text underneath is selectable. If paste produces the original text, the redaction is cosmetic.
Read the metadata. The Privvert local PDF metadata tool reads /Title, /Author, /Subject, /Keywords, /Creator, /Producer, /CreationDate, /ModDate, and the XMP block without uploading the file. On the command line, exiftool report.pdf or pdfinfo report.pdf does the same. If any of those fields contain a name, an email, a file path, an OS version, or a timestamp you do not want the recipient to see, strip them before sending.
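Inspect the structure. Acrobat's Preflight, command-line tools such as pdfinfo and qpdf, or any of the open-source PDF structure viewers will show optional content groups (layers), attached files, embedded JavaScript, form fields, and digital signatures. Note that qpdf --check on its own only validates the file's structure; pdfinfo reports whether the file contains JavaScript and forms, and qpdf --list-attachments lists attached files. If anything in that list should not be in the file, remove it before sending.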
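If none of those tools is to hand, a crude first pass over the structure is to search the file for the dictionary keys that announce each feature. The sketch below does that with the standard library only, looking in the raw bytes and in any Flate-compressed streams it can decompress. It assumes an unencrypted file and plain (not hex-escaped) names, and the labels and function names are hypothetical. Treat a hit as a prompt to look closer, and a miss as inconclusive.

```python
import re
import zlib

# Feature label -> name tokens that normally accompany it in a PDF.
MARKERS = {
    "layers (optional content)": (b"/OCProperties",),
    "form fields": (b"/AcroForm",),
    "attached files": (b"/EmbeddedFiles",),
    "JavaScript": (b"/JavaScript", b"/JS"),
    "digital signature": (b"/ByteRange",),
}
STREAM_RE = re.compile(rb"(?<!end)stream\r?\n(.*?)endstream", re.DOTALL)


def searchable_bytes(pdf_bytes: bytes) -> bytes:
    """Raw file bytes plus the contents of any Flate streams that decompress."""
    parts = [pdf_bytes]
    for match in STREAM_RE.finditer(pdf_bytes):
        try:
            parts.append(zlib.decompressobj().decompress(match.group(1)))
        except zlib.error:
            continue  # other filters are ignored in this sketch
    return b"\n".join(parts)


def find_markers(pdf_bytes: bytes) -> list[str]:
    """Return the labels whose tokens appear anywhere in the searchable bytes."""
    haystack = searchable_bytes(pdf_bytes)
    return [label for label, tokens in MARKERS.items()
            if any(token in haystack for token in tokens)]


# Usage (hypothetical file name):
# with open("report.pdf", "rb") as fh:
#     print(find_markers(fh.read()))
```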

The checks take a minute. The cost of skipping them is a recipient who can paste the redacted name into a chat, read the author's identity in the metadata, and tell their colleague which version of Word was used to write the document.

Where this fits

Print to PDF sits at the intersection of two recurring patterns: a convenient default that does the opposite of what most people assume, and a document format that was designed to preserve everything because most users want everything preserved. The redaction-specific failure mode is in the PDF redaction piece; the broader question of when to send a PDF and when to send an image is in PDF or image; the equivalent metadata problem on photos is in removing photo metadata; and the larger question of what 'delete' actually accomplishes when the file lives in a backup, a cache, or a counterpart's mailbox is in the delete piece.

Privvert's PDF tools all run locally in the browser - the file never leaves the device, which matters most for exactly the kind of document you are thinking about cleaning. The reasoning is on the privacy page, and the rest of the practical guides are on the blog.