A single engine for text, images, documents, audio and video.

PolyRedact exposes a unified redaction engine. Different content types all go through the same policy and logging layer, so you do not need five different tools for five different modalities.

Multimodal · API-first /proxy · Signed audit · Run in your cloud

End-to-end privacy layer

An end-to-end privacy layer for your AI data

Put PolyRedact across the full lifecycle: prep your data, gate live traffic, then hand auditors a tamper-evident trail.

Data lifecycle · Multimodal · Audit-ready

Before training

Redact high-risk fields in tickets, chat logs, documents and transcripts before they are used for fine-tuning or retrieval.

During inference

Route live traffic through PolyRedact’s multimodal gateway before it reaches LLMs, analytics tools or external vendors.

Afterward

Keep signed audit trails and simple reports.

Let security and privacy teams plug them into existing review and approval workflows.

Each phase (before training, during inference, afterward) gets a clear action, so teams stay aligned on how data moves.

Clean training corpora with automated redaction.
Live gateway for screenshots, audio, video and text.
Signed audit logs that slot into existing reviews.

See how it works

📝 Text redaction

Detect emails, phone numbers, addresses, IDs, credit cards, API keys, internal IDs and custom patterns. Return redacted text and a structured list of findings.
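To make "redacted text plus a structured list of findings" concrete, here is a minimal sketch of that shape. It is not PolyRedact's implementation: the `Finding` and `RedactionResult` types, the `redactText` function and the two regular expressions are assumptions chosen for illustration, and a real detector covers far more entities.

```typescript
// Hypothetical shapes for illustration; the real response format may differ.
interface Finding {
  entity: string; // entity label, e.g. "email"
  start: number;  // offset of the match in the original text
  end: number;    // exclusive end offset
  value: string;  // the matched text
}

interface RedactionResult {
  redactedText: string;
  findings: Finding[];
}

// Deliberately simple patterns; production detectors need far more care.
const PATTERNS: Record<string, RegExp> = {
  email: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/,
  ipv4: /\b(?:\d{1,3}\.){3}\d{1,3}\b/,
};

function redactText(text: string): RedactionResult {
  const findings: Finding[] = [];
  for (const entity of Object.keys(PATTERNS)) {
    // Fresh global regex per scan so lastIndex state never leaks between calls.
    const re = new RegExp(PATTERNS[entity].source, "g");
    let m: RegExpExecArray | null;
    while ((m = re.exec(text)) !== null) {
      findings.push({ entity, start: m.index, end: m.index + m[0].length, value: m[0] });
    }
  }
  // Apply replacements right to left so earlier offsets stay valid.
  // This sketch assumes matches do not overlap.
  findings.sort((a, b) => b.start - a.start);
  let redactedText = text;
  for (const f of findings) {
    const placeholder = `[REDACTED_${f.entity.toUpperCase()}]`;
    redactedText = redactedText.slice(0, f.start) + placeholder + redactedText.slice(f.end);
  }
  findings.reverse(); // report findings in document order
  return { redactedText, findings };
}
```

Calling `redactText("Contact john.doe@example.com from 10.0.0.5.")` returns the text `Contact [REDACTED_EMAIL] from [REDACTED_IPV4].` together with two findings that carry the entity label and character offsets.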

🖼️ Screenshot & image redaction

Handle screenshots of consoles and dashboards. Combine OCR and visual detection to find PII and secrets, apply blur or black boxes, and return safe images plus findings.

📄 Document redaction

Process PDFs and DOCX files by extracting text and embedded images. Run both text and image engines and produce per-page reports or fully redacted exports.

🎙️ Audio redaction

Transcribe audio with timestamps, detect PII in the transcript and map findings back to time ranges. Return redacted transcripts and lists of sensitive segments.
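Mapping transcript findings back to time ranges can be sketched as follows. This assumes the transcription step yields word-level timestamps; the `Word` and `TimedFinding` types and the `mapFindingsToTime` helper are hypothetical names for this example and not part of any documented PolyRedact API.

```typescript
// Hypothetical word-level transcript entry produced by a transcription step.
interface Word {
  text: string;
  startSec: number;
  endSec: number;
}

interface TimedFinding {
  entity: string;
  value: string;
  startSec: number;
  endSec: number;
}

function mapFindingsToTime(
  words: Word[],
  pattern: RegExp,
  entity: string,
): { transcript: string; segments: TimedFinding[] } {
  // Build the transcript and remember the character span of every word.
  const spans: { start: number; end: number }[] = [];
  let transcript = "";
  for (const w of words) {
    if (transcript.length > 0) transcript += " ";
    spans.push({ start: transcript.length, end: transcript.length + w.text.length });
    transcript += w.text;
  }

  const segments: TimedFinding[] = [];
  const re = new RegExp(pattern.source, "g");
  let m: RegExpExecArray | null;
  while ((m = re.exec(transcript)) !== null) {
    if (m[0].length === 0) { re.lastIndex++; continue; } // skip empty matches
    const start = m.index;
    const end = m.index + m[0].length;
    // Every word whose span overlaps the match contributes to the time range.
    const hit = words.filter((_, i) => spans[i].start < end && spans[i].end > start);
    segments.push({
      entity,
      value: m[0],
      startSec: hit[0].startSec,
      endSec: hit[hit.length - 1].endSec,
    });
  }
  return { transcript, segments };
}
```

The returned segments are the time ranges a downstream step could mute or bleep, while the transcript itself can be passed through the text engine.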

🎬 Video redaction

Extract audio and key frames from video, run the audio and image engines, and generate a redaction report with preview frames – with the option to output fully redacted videos.

🗄️ Structured & database data redaction

Scan CSVs, SQL tables and JSON payloads to detect sensitive columns, mask PII consistently across rows, and return clean datasets plus a findings report.
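One way to read "mask PII consistently across rows" is that the same input value always maps to the same placeholder, so joins and counts still work on the clean dataset. The sketch below illustrates that reading for email columns only. The `detectEmailColumns` and `maskColumns` helpers, the 0.8 threshold and the `[EMAIL_n]` placeholder format are all assumptions for this example.

```typescript
type Row = Record<string, string>;

// Simple whole-value email check, for illustration only.
const EMAIL = /^[^@\s]+@[^@\s]+\.[^@\s]+$/;

// Flag a column as sensitive when most of its values look like emails.
function detectEmailColumns(rows: Row[], threshold = 0.8): string[] {
  if (rows.length === 0) return [];
  return Object.keys(rows[0]).filter((col) => {
    const hits = rows.filter((r) => EMAIL.test(r[col] || "")).length;
    return hits / rows.length >= threshold;
  });
}

// Replace values in the given columns; equal inputs get equal placeholders.
function maskColumns(rows: Row[], columns: string[]): Row[] {
  const placeholders = new Map<string, string>();
  return rows.map((row) => {
    const clean: Row = { ...row }; // never mutate the caller's rows
    for (const col of columns) {
      const value = row[col];
      if (value === undefined) continue;
      if (!placeholders.has(value)) {
        placeholders.set(value, `[EMAIL_${placeholders.size + 1}]`);
      }
      clean[col] = placeholders.get(value) as string;
    }
    return clean;
  });
}
```

Because the placeholder map lives only for one call, consistency here holds within a single dataset; keeping it stable across runs would need a deterministic scheme such as tokenization.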

Supported PII entities

PolyRedact detects a wide range of sensitive entities out of the box.

👤 Personal Identity

Person Name (person_name)
Email Address (email)
Phone Number (phone)
Physical Address (address)
Postal Code (postal_code)
National ID, SSN, Passport
Driving License (driving_license)
ID Document (id_document)
Face in images (face)
Date (date)

💳 Financial

Credit Card Number (credit_card)
CVV / Security Code (cvv)
Bank Account Number (bank_account)
IBAN (iban)
Routing Number (routing_number)
VAT / Tax ID (vat_id)

🏥 Healthcare

Medical Record Number (mrn)
NHS Number (nhs_number)
Health Insurance ID (health_insurance_id)
National Provider Identifier (npi)
Medicare Beneficiary ID (mbi)
National Drug Code (ndc)
Health Measurement (health_measurement)
PPSN (ppsn)

🔐 Security & Technical

IP Address (ipv4, ipv6)
MAC Address (mac_address)
API Key (api_key)
AWS Secret Key (aws_secret)
Private Key (private_key)
Crypto Wallet Address (crypto_wallet)
Crypto Seed Phrase (crypto_seed)
Password (password)
TOTP Secret (totp_secret)
Recovery Key/Code
Access Code
Bitlocker Key
URL (url)

📦 Other / Business

Company Registration Number (company_reg_number)
License Plate (license_plate)
Customer ID (cid)
Reference ID (reference_id)
Other (other)

Supported PII Actions

Choose how to process detected PII with flexible action types that fit your privacy and compliance needs.

🎭 mask (default)

Replaces the entire PII value with a labeled placeholder.

john.doe@example.com → [REDACTED_EMAIL]

⚡ partial_mask

Preserves the first two and last two characters and masks everything in between with asterisks.

john.doe@example.com → jo****************om

🗑️ remove

Completely removes the PII from the text (replaces with empty string).

My email is john.doe@example.com → My email is

📈 DP-like (differential-privacy-like)

Adds calibrated noise while preserving the format of the value, such as a number format or an email domain. Well suited to data intended for AI/LLM training.

alice@example.com → emily.carter@example.com (preserve_domain)

📋 log_only

Keeps the original value unchanged but logs that PII was detected.

john.doe@example.com → john.doe@example.com

🎲 tokenize

Replaces PII with a deterministic token built from the first 16 characters of the value's SHA-256 hash.

john.doe@example.com → TKN_a1b2c3d4e5f6g7h8

🔒 hash

Replaces PII with a labeled SHA-256 hash for cryptographic security.

john.doe@example.com → HASH_9f86d081...

🔄 FPE (Format-Preserving Encryption)

Maintains character type and length while masking the value.

john.doe@example.com → kpim*epf@fybnqmf*dpn
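Several of the actions above are simple enough to sketch in a few lines. The following is an illustration built from the descriptions on this page, not PolyRedact's code: the `applyAction` function and `Action` type are hypothetical, hex encoding of the SHA-256 digest is assumed, one asterisk per hidden character is assumed for `partial_mask`, and the DP-like and FPE actions are left out because their algorithms are not specified here.

```typescript
import { createHash } from "node:crypto";

type Action = "mask" | "partial_mask" | "remove" | "log_only" | "tokenize" | "hash";

// Hex-encoded SHA-256 digest of a UTF-8 string (hex encoding is an assumption).
function sha256Hex(value: string): string {
  return createHash("sha256").update(value, "utf8").digest("hex");
}

function applyAction(value: string, entity: string, action: Action): string {
  switch (action) {
    case "mask":
      // Whole value becomes a labeled placeholder.
      return `[REDACTED_${entity.toUpperCase()}]`;
    case "partial_mask":
      // Assumption: values of four characters or fewer are fully masked,
      // since keeping two characters at each end would hide nothing.
      if (value.length <= 4) return "*".repeat(value.length);
      return value.slice(0, 2) + "*".repeat(value.length - 4) + value.slice(-2);
    case "remove":
      return "";
    case "log_only":
      // Value passes through unchanged; logging happens elsewhere.
      return value;
    case "tokenize":
      // Deterministic: the same input always yields the same token.
      return "TKN_" + sha256Hex(value).slice(0, 16);
    case "hash":
      return "HASH_" + sha256Hex(value);
  }
}
```

For instance, `applyAction("john.doe@example.com", "email", "mask")` returns `[REDACTED_EMAIL]`, and calling `tokenize` twice on the same value returns the same token, which is what lets tokenized datasets stay joinable.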

Gateway and API: plug into your existing flows.

Two modes: Proxy mode forwards to OpenAI, Azure OpenAI or your own models after redaction. Redaction-only mode lets you preprocess content and handle forwarding and storage yourself.

Before

const result = await openai.chat.completions.create({ ... });

After (via PolyRedact)

const result = await polyredact.proxy({ target: 'openai', model: 'gpt-4o', messages, images });
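The difference between the two modes can be sketched with plain functions. This is a conceptual illustration only: `redactOnly`, `proxy`, `Redactor` and `Forwarder` are hypothetical names for this example and do not describe the PolyRedact SDK.

```typescript
type Message = { role: string; content: string };
type Redactor = (text: string) => string;
type Forwarder = (messages: Message[]) => Promise<string>;

// Redaction-only mode: return cleaned messages and let the caller
// decide where to send or store them.
function redactOnly(messages: Message[], redact: Redactor): Message[] {
  return messages.map((m) => ({ ...m, content: redact(m.content) }));
}

// Proxy mode: redact first, then forward on the caller's behalf,
// so the vendor only ever sees the redacted payload.
async function proxy(
  messages: Message[],
  redact: Redactor,
  forward: Forwarder,
): Promise<string> {
  return forward(redactOnly(messages, redact));
}
```

In both cases redaction runs before anything leaves your application; the modes differ only in who performs the forwarding step.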

Why a gateway

Drop-in /proxy endpoint sits between your app and vendors.
Redaction-only mode lets you keep forwarding under your control.
Consistent audit logging across every outbound payload.
Consistent policy model across text, screenshots, documents, audio and video.

One place to manage what is allowed to leave.

Enable/disable categories like PII, secrets, internal IDs and custom regex patterns.
Decide per category whether to block, mask, partially mask or allow but log.
Use different policies per tenant or environment.
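A policy along these lines could be expressed as plain data plus a small lookup. The sketch below is an assumption about shape only: the category names, the `tenant:environment` key format, the `decide` function and the fail-closed default are hypothetical and not PolyRedact's configuration schema.

```typescript
type Decision = "block" | "mask" | "partial_mask" | "log_only";
type Category = "pii" | "secrets" | "internal_ids" | "custom_regex";

interface CategoryRule {
  enabled: boolean;   // whether this category is checked at all
  decision: Decision; // what to do when something is found
}

type Policy = Record<Category, CategoryRule>;

// One policy per tenant and environment (hypothetical key format).
const policies: Record<string, Policy> = {
  "acme:prod": {
    pii: { enabled: true, decision: "mask" },
    secrets: { enabled: true, decision: "block" },
    internal_ids: { enabled: true, decision: "partial_mask" },
    custom_regex: { enabled: false, decision: "log_only" },
  },
  "acme:dev": {
    pii: { enabled: true, decision: "log_only" }, // allow but log
    secrets: { enabled: true, decision: "block" },
    internal_ids: { enabled: false, decision: "log_only" },
    custom_regex: { enabled: false, decision: "log_only" },
  },
};

function decide(policyKey: string, category: Category): Decision | "allow" {
  const policy = policies[policyKey];
  // Assumption: fail closed when no policy exists for the caller.
  if (!policy) return "block";
  const rule = policy[category];
  return rule.enabled ? rule.decision : "allow";
}
```

Keeping the policy as data makes it easy to review in a pull request and to diff between environments.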

Signed audit trail for every redaction call.

Every request generates an audit record with hashes of the original and redacted payloads, a summary of findings, timestamps and caller identity. Records can be signed with a service key so you can verify that logs have not been tampered with and export them to your SIEM.
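A tamper-evident record of this kind can be sketched with Node's built-in `crypto` module. The field names, the `buildRecord`, `sign` and `verify` helpers, and the choice of HMAC-SHA256 with a shared service key are assumptions for this example; the page does not say which signature scheme PolyRedact uses, and an asymmetric signature would work equally well.

```typescript
import { createHash, createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical record layout based on the fields listed above.
interface AuditRecord {
  timestamp: string;
  caller: string;
  originalSha256: string;
  redactedSha256: string;
  findingCounts: Record<string, number>;
}

interface SignedAuditRecord {
  record: AuditRecord;
  signature: string; // hex-encoded HMAC-SHA256
}

const sha256 = (data: string): string =>
  createHash("sha256").update(data, "utf8").digest("hex");

function buildRecord(
  caller: string,
  original: string,
  redacted: string,
  findingCounts: Record<string, number>,
  now: Date = new Date(),
): AuditRecord {
  // Only hashes are stored, never the payloads themselves.
  return {
    timestamp: now.toISOString(),
    caller,
    originalSha256: sha256(original),
    redactedSha256: sha256(redacted),
    findingCounts,
  };
}

// Serialize with a fixed field order so signer and verifier hash identical bytes.
function canonical(r: AuditRecord): string {
  const counts = Object.keys(r.findingCounts).sort().map((k) => [k, r.findingCounts[k]]);
  return JSON.stringify([r.timestamp, r.caller, r.originalSha256, r.redactedSha256, counts]);
}

function sign(record: AuditRecord, serviceKey: string): SignedAuditRecord {
  const signature = createHmac("sha256", serviceKey).update(canonical(record)).digest("hex");
  return { record, signature };
}

function verify(signed: SignedAuditRecord, serviceKey: string): boolean {
  const expected = createHmac("sha256", serviceKey).update(canonical(signed.record)).digest();
  const actual = Buffer.from(signed.signature, "hex");
  // Constant-time comparison; lengths must match before calling timingSafeEqual.
  return actual.length === expected.length && timingSafeEqual(actual, expected);
}
```

Any change to a stored record, such as editing the caller or a finding count, makes `verify` return `false`, which is the property an auditor or SIEM rule can check.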

FAQ

What data types can PolyRedact process today?

Text, screenshots, images, documents, audio and video are available now.

Can I run PolyRedact in my own cloud?

Yes. PolyRedact is built to run in your environment for teams that need to keep sensitive traffic inside their own cloud.

How does PolyRedact integrate with LLMs?

Use the /proxy endpoint to forward to OpenAI, Azure OpenAI or your own models after redaction, or call the redact-only APIs.

🚀 Ready to try it?

🔒 Need to convince security?

Share the signed audit trail approach and optional private cloud deployment.
