Document Analysis API
Analyze PDFs, images, and scanned documents for tampering, forgery, and AI generation. Detect pixel-level manipulation, validate digital signatures, verify metadata integrity, and extract text via OCR. One API call returns a complete authenticity report.
How it works
Submit a document file or URL and SignalStack runs forensic analyses including PDF metadata extraction, image EXIF forensics, and OCR verification, returning a comprehensive authenticity report.
1. Submit document
Upload a file (PDF, image, Office document) or provide a URL. SignalStack securely downloads and sandboxes the document for analysis.
2. Forensic analysis
The analysis engine runs tamper detection, metadata forensics, signature validation, OCR verification, and AI-generation checks in parallel. Each check produces an independent signal.
3. Authenticity report
Receive a trust score, per-check breakdown with detailed findings, and evidence artifacts. Results include highlighted tamper regions, signature validation details, and metadata analysis.
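The fan-out in step 2 and the scoring in step 3 can be pictured with a small sketch. This is not SignalStack's engine: the `Signal` type, the two toy checks, and the pass-fraction score are assumptions made purely for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable


@dataclass
class Signal:
    name: str
    passed: bool


# Hypothetical stand-ins for real forensic checks. Each one inspects the
# raw bytes on its own and returns an independent signal.
def has_pdf_header(data: bytes) -> Signal:
    return Signal("pdf_header", data.startswith(b"%PDF-"))


def has_eof_marker(data: bytes) -> Signal:
    return Signal("eof_marker", b"%%EOF" in data)


def run_checks(data: bytes, checks: list[Callable[[bytes], Signal]]) -> list[Signal]:
    # Run every check concurrently; map() returns results in input order.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda check: check(data), checks))


def trust_score(signals: list[Signal]) -> float:
    # Assumed aggregation: the fraction of checks that passed.
    return sum(s.passed for s in signals) / len(signals)


signals = run_checks(b"%PDF-1.7\n...", [has_pdf_header, has_eof_marker])
score = trust_score(signals)  # one of the two toy checks passes
```

Because each check only reads the input and returns its own signal, the checks need no coordination, which is what makes running them in parallel straightforward.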
from signalstack import SignalStack

client = SignalStack(api_key="sk_...")

result = client.verify.document(
    file="invoice_1432.pdf",
    checks=[
        "tamper_detection",
        "signature_validation",
        "metadata_forensics",
        "ai_generated",
    ],
)

print(f"Trust score: {result.score}")
# Trust score: 0.92

for check in result.checks:
    print(f"{check.name}: {check.passed}")
    if check.artifacts:
        for a in check.artifacts:
            print(f"  → {a.description}")

Comprehensive forensic analysis
Each document undergoes 14 specialized checks covering visual, structural, cryptographic, and content-based analysis.
Tamper detection
Detect pixel-level manipulation, metadata inconsistencies, embedded object anomalies, and hidden layer alterations in PDFs and images using forensic analysis algorithms.
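One structural signal of this kind is easy to illustrate: a PDF that has been saved incrementally normally carries one `%%EOF` marker per revision. The sketch below is a rough heuristic under that assumption, not SignalStack's algorithm; the function names are hypothetical.

```python
def count_pdf_revisions(data: bytes) -> int:
    # The original save and each incremental update normally end with %%EOF.
    return data.count(b"%%EOF")


def was_incrementally_updated(data: bytes) -> bool:
    # More than one marker suggests content was appended after the first save.
    return count_pdf_revisions(data) > 1


original = b"%PDF-1.7\n1 0 obj\n<<>>\nendobj\ntrailer\n<<>>\n%%EOF\n"
edited = original + b"2 0 obj\n<<>>\nendobj\ntrailer\n<<>>\n%%EOF\n"
```

An extra marker is a prompt for closer inspection rather than proof of tampering: signing, form filling, and linearized PDFs all legitimately produce more than one `%%EOF`, and the byte sequence can also occur inside stream data.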
Digital signature verification
Validate PAdES, CAdES, XAdES, and PKCS#7 signatures. Verify certificate chains, revocation status, and embedded signature timestamps against trusted CA roots.
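Before any cryptography runs, a PDF signature's `/ByteRange` already says which bytes were signed. The sketch below checks only whether the last `/ByteRange` in the file reaches the end of the file. It is an illustrative pre-check with hypothetical function names and performs no cryptographic validation.

```python
import re
from typing import Optional

BYTE_RANGE = re.compile(rb"/ByteRange\s*\[\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*\]")


def signature_covers_whole_file(data: bytes) -> Optional[bool]:
    """None if no /ByteRange is found; otherwise whether the last one spans the file."""
    matches = BYTE_RANGE.findall(data)
    if not matches:
        return None
    start1, len1, start2, len2 = (int(v) for v in matches[-1])
    # The two signed ranges should start at byte 0, leave a gap only for the
    # signature contents, and end exactly at the last byte of the file.
    return start1 == 0 and start1 + len1 <= start2 and start2 + len2 == len(data)


def make_stub(extra: bytes = b"") -> bytes:
    # Build a tiny PDF-like byte string with a consistent ByteRange, then
    # append `extra` to simulate bytes added after signing.
    fmt = "%PDF-1.7\n/ByteRange [0 {:010d} {:010d} {:010d}]\n/Contents "
    contents = b"<" + b"00" * 8 + b">"
    tail = b"\n%%EOF\n"
    head_len = len(fmt.format(0, 0, 0))
    head = fmt.format(head_len, head_len + len(contents), len(tail)).encode("ascii")
    return head + contents + tail + extra


intact = signature_covers_whole_file(make_stub())
appended = signature_covers_whole_file(make_stub(b"2 0 obj\n<<>>\nendobj\n%%EOF\n"))
```

A `False` result means some bytes fall outside the signed ranges. That can be legitimate (for example, a later revision that is not itself signed), so it is a signal to examine, not a verdict.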
Metadata forensics
Extract and analyze embedded metadata including author, creation tool, revision history, geolocation, device fingerprints, and edit trails across document formats.
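A simple date-integrity check illustrates the idea: parse the PDF `CreationDate` and `ModDate` strings and flag a modification date that precedes creation. This is a sketch under stated assumptions: it handles only full-precision dates of the form `D:YYYYMMDDHHmmSS` with an optional offset, treats a missing offset as UTC, and uses hypothetical function names.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import Optional

PDF_DATE = re.compile(
    r"D:(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})"
    r"(?:([+\-Z])(?:(\d{2})'?(\d{2})?'?)?)?"
)


def parse_pdf_date(value: str) -> Optional[datetime]:
    m = PDF_DATE.fullmatch(value.strip())
    if m is None:
        return None
    year, month, day, hour, minute, second = (int(g) for g in m.groups()[:6])
    offset = timedelta(hours=int(m.group(8) or 0), minutes=int(m.group(9) or 0))
    if m.group(7) == "-":
        offset = -offset
    # Assumption: a missing offset or "Z" is treated as UTC.
    return datetime(year, month, day, hour, minute, second, tzinfo=timezone(offset))


def date_anomalies(info: dict[str, str]) -> list[str]:
    created = parse_pdf_date(info.get("CreationDate", ""))
    modified = parse_pdf_date(info.get("ModDate", ""))
    findings = []
    if created is None:
        findings.append("CreationDate missing or unparseable")
    if modified is None:
        findings.append("ModDate missing or unparseable")
    if created is not None and modified is not None and modified < created:
        findings.append("ModDate precedes CreationDate")
    return findings


findings = date_anomalies({
    "CreationDate": "D:20240105093000+02'00'",
    "ModDate": "D:20231220180000Z",
})
# findings == ["ModDate precedes CreationDate"]
```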
Content consistency analysis
Cross-reference document content against known templates, detect inconsistent fonts and anomalous formatting, and verify that embedded data matches visible content.
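Font inconsistency can be approximated by looking for fonts that cover only a sliver of the text, as often happens when a single value is edited. The sketch below assumes the text spans have already been extracted as `(font_name, text)` pairs; that input shape, the function name, and the 5% threshold are all illustrative choices.

```python
from collections import Counter


def rare_fonts(spans: list[tuple[str, str]], threshold: float = 0.05) -> list[str]:
    """Return fonts that cover less than `threshold` of all characters."""
    chars: Counter = Counter()
    for font, text in spans:
        chars[font] += len(text)
    total = sum(chars.values())
    if total == 0:
        return []
    return sorted(font for font, n in chars.items() if n / total < threshold)


spans = [
    ("Helvetica", "Payment is due within thirty days of the invoice date. " * 4),
    ("ArialMT", "9,400.00"),  # a lone amount set in a different font
]
suspicious = rare_fonts(spans)
# suspicious == ["ArialMT"]
```

A rare font is only a lead: headings, footnotes, and logos legitimately use different typefaces, so a finding like this is best weighed together with the other checks.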
Optical character verification
Verify that document text extracted via OCR matches the rendered content, detecting character substitution attacks, hidden text overlays, and invisible layer manipulation.
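The comparison at the heart of this check can be sketched with the standard library: normalize both strings, then diff the embedded text layer against what OCR reads from the rendered page. The function names are hypothetical, and real OCR output is noisy, so a production check would need tolerance that this sketch omits.

```python
import difflib
import unicodedata


def normalize(text: str) -> str:
    # NFKC folds compatibility forms (ligatures, full-width digits);
    # runs of whitespace are collapsed to single spaces.
    return " ".join(unicodedata.normalize("NFKC", text).split())


def text_mismatches(embedded: str, ocr: str) -> list[tuple[str, str]]:
    """Return (embedded_fragment, ocr_fragment) pairs where the texts differ."""
    a, b = normalize(embedded), normalize(ocr)
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [
        (a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# The text layer says 1,250.00 but the rendered page shows 7,250.00.
diffs = text_mismatches("Total due: 1,250.00 USD", "Total due:  7,250.00 USD")
# diffs == [("1", "7")]
```

Text that exists only in the embedded layer, such as a hidden overlay, shows up as a pair whose second element is empty.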
AI-generated document detection
Identify documents fully or partially generated by AI using statistical analysis of language patterns, embedding artifacts, and generative model fingerprints.
Frequently asked questions
What file formats are supported?
We support PDF, JPEG, PNG, GIF, BMP, and WebP. PDF support includes metadata extraction and forensic analysis. Additional formats are in development.
What checks does the analysis run?
Our analysis runs 14 forensic checks, including PDF metadata extraction and date integrity, image EXIF forensics with edit history detection, and OCR text verification. The trust score combines tamper flag analysis, metadata consistency, and document age heuristics.
Can you detect AI-generated documents?
Yes. Our AI-generated document detection module analyzes language model artifacts, statistical token distributions, embedding vector anomalies, and generative model watermarks to flag AI-synthesized content, including documents created by GPT, Claude, Gemini, and image generators.
Which digital signature formats do you validate?
We validate PAdES (PDF), CAdES (CMS), XAdES (XML), and PKCS#7 detached signatures. Certificate chain validation includes CRL and OCSP checks against 200+ trusted CA roots. We also detect embedded timestamp signatures per RFC 3161.
How are API calls counted?
Each document submission counts as one API call regardless of file size (up to 50MB). Multi-page documents up to 100 pages are counted as a single verification. Larger documents may be split into multiple verifications.
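For budgeting, the counting rule can be turned into a rough estimator. The sketch below rests on two assumptions that the text does not confirm: that "50MB" means 50 MiB, and that documents over 100 pages are always split into 100-page verifications (the text only says they may be). The function name is hypothetical.

```python
import math

MAX_FILE_BYTES = 50 * 1024 * 1024  # assumption: "50MB" read as 50 MiB
PAGES_PER_VERIFICATION = 100


def estimate_verifications(page_count: int, size_bytes: int) -> int:
    """Estimate billed verifications for one document (upper-bound sketch)."""
    if size_bytes > MAX_FILE_BYTES:
        raise ValueError("file exceeds the 50MB limit")
    if page_count < 1:
        raise ValueError("page_count must be at least 1")
    # Assumed splitting rule: one verification per started block of 100 pages.
    return math.ceil(page_count / PAGES_PER_VERIFICATION)


small = estimate_verifications(page_count=12, size_bytes=2_000_000)   # 1
large = estimate_verifications(page_count=250, size_bytes=9_000_000)  # 3
```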