What is Quilldoc?
Quilldoc extracts structured data from any document — invoices, receipts, contracts, bank statements, medical reports, and more. Upload a PDF, get back clean JSON with every field extracted, verified, and grounded to the source.
Any document type
Works with any document type — no templates or pre-configuration needed. Quilldoc understands the document first, then extracts.
Built for developers
Accurate, fast document processing at scale. Simple REST API, structured JSON output, confidence scores on every field.
# Upload a document, get structured JSON
curl -X POST https://api.quilldoc.studio/extract \
-H "X-API-Key: sk_live_..." \
-F "file=@invoice.pdf"
{
"doc_type": "invoice",
"confidence": 0.96,
"data": {
"vendor_name": { "value": "Acme Corp", "confidence": 0.98 },
"total_amount": { "value": 1250.00, "confidence": 0.96 },
"invoice_date": { "value": "2026-02-15", "confidence": 0.92 }
}
}
Key Concepts
Core terms and ideas you will encounter throughout the Quilldoc platform and API.
Document Processing Pipeline
The 11-stage AI pipeline that ingests, understands, and extracts data from documents. Each stage — from PDF parsing and OCR through extraction and verification — runs automatically and produces auditable intermediate results.
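Because an uploaded document is queued and then moves through the pipeline stage by stage, a client usually has to wait before results are available. The sketch below shows one way to poll for that. `fetch_status` is a hypothetical callable standing in for a request to /documents/{id}/status, and treating "completed" and "failed" as the terminal statuses is an assumption inferred from the status values and webhook event names that appear elsewhere in this reference.

```python
import time
from typing import Callable, Dict

# Assumed terminal statuses: "completed" appears in the document list
# response; "failed" is inferred from the retry endpoint and webhook events.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_document(
    fetch_status: Callable[[], Dict],
    interval_s: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Poll until the status payload reports a terminal status.

    fetch_status is a hypothetical callable returning the parsed JSON body
    of the status endpoint; sleep is injectable so the loop can be tested.
    """
    for attempt in range(max_attempts):
        payload = fetch_status()
        if payload.get("status") in TERMINAL_STATUSES:
            return payload
        # Skip the pause after the final attempt; we are about to give up.
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError(f"document not finished after {max_attempts} polls")
```

For event-driven integrations, the webhooks described later in this reference avoid polling altogether.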
Knowledge Boards
Collections of documents you can chat with. Upload a set of related documents to a board, then ask questions and get cited answers grounded in the source material.
Schemas
Define what fields to extract from a document type (e.g., invoice_number, total_amount, line_items). Quilldoc can also auto-suggest schemas for unknown document types based on its understanding of the content.
Confidence Scores
Every extracted field gets a confidence score from 0 to 1. High confidence (≥0.85) means the field is auto-approved. Medium (0.60 to below 0.85) means the field needs review. Low (<0.60) means it is flagged for human verification.
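The three bands can be expressed as a small client-side helper. This is a minimal sketch: the tier names are hypothetical labels chosen for illustration, and a score of exactly 0.85 is treated as high because the high band is stated as ≥0.85.

```python
def confidence_tier(score: float) -> str:
    """Map a field confidence score to one of the three documented bands.

    The returned labels are illustrative names, not values from the API.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"confidence must be between 0 and 1, got {score}")
    if score >= 0.85:
        return "auto_approved"
    if score >= 0.60:
        return "needs_review"
    return "human_verification"
```

For example, the `invoice_date` confidence of 0.92 from the sample response falls in the auto-approved band, while a field scored 0.62 would need review.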
Grounding
Every extracted value is linked back to its exact location in the source document — bounding boxes, page numbers, and source text. You can verify any field by clicking through to where it appears in the original PDF.
Document Understanding
Before extraction, AI analyzes each document to identify its purpose, structure, key entities, and logical sections. This enables zero-config processing of any document type without pre-defined templates.
Review Queue
Documents with fields below the confidence threshold are automatically routed to a review queue. Reviewers can approve, reject, or correct individual extracted fields, and corrections feed back into the system to improve future accuracy.
API Reference
The Quilldoc REST API lets you upload documents, extract structured data, manage schemas, and integrate with your workflows.
Base URL: https://api.quilldoc.studio
Getting Started
All API requests require an API key passed via the X-API-Key header. Create an API key from the dashboard or the API Keys endpoint.
curl -X POST https://api.quilldoc.studio/documents/upload \
-H "X-API-Key: your-api-key" \
-F "file=@invoice.pdf"
Authentication
All endpoints require an API key in the X-API-Key header. Keys can be created, listed, and revoked via the API Keys endpoints below.
# Include in every request
-H "X-API-Key: sk_live_..."
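The same authenticated upload can be sketched in Python with only the standard library. The function below builds the request without sending it; `build_upload_request` and its parameters are hypothetical names, and the hand-assembled multipart body is one reasonable encoding of the `file` form field from the curl example, not a statement of how an official client does it.

```python
import urllib.request
import uuid

BASE_URL = "https://api.quilldoc.studio"


def build_upload_request(
    pdf_bytes: bytes, filename: str, api_key: str
) -> urllib.request.Request:
    """Build a POST to /documents/upload with a multipart "file" field."""
    boundary = uuid.uuid4().hex
    disposition = (
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
    )
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        disposition.encode(),
        b"Content-Type: application/pdf\r\n\r\n",
        pdf_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        f"{BASE_URL}/documents/upload",
        data=body,
        method="POST",
        headers={
            # The API key travels in the X-API-Key header on every request.
            "X-API-Key": api_key,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Keeping construction separate from sending makes the header and body easy to inspect in tests.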
Documents API
Upload, process, and retrieve documents and their extraction results.
/documents/upload
Upload a PDF document for processing. Optionally specify a schema to use.
Request Body
Content-Type: multipart/form-data
file: <binary PDF>
schema: "invoice" # optional
priority: "high" # optional: low | normal | high
Response
{
"document_id": "doc_abc123",
"status": "queued",
"created_at": "2026-03-06T10:00:00Z"
}
/documents
List all documents with optional filtering and pagination.
Response
{
"documents": [
{
"id": "doc_abc123",
"filename": "invoice_001.pdf",
"status": "completed",
"doc_type": "invoice",
"confidence": 0.94,
"created_at": "2026-03-06T10:00:00Z"
}
],
"total": 142,
"page": 1,
"per_page": 20
}
/documents/{id}/status
Get the current processing status and stage of a document.
Response
{
"document_id": "doc_abc123",
"status": "processing",
"current_stage": "extraction",
"stage_number": 8,
"total_stages": 11,
"started_at": "2026-03-06T10:00:05Z"
}
/documents/{id}/result
Get the extracted data, confidence scores, and grounding information.
Response
{
"document_id": "doc_abc123",
"doc_type": "invoice",
"confidence": 0.94,
"data": {
"vendor_name": { "value": "Acme Corp", "confidence": 0.98 },
"total_amount": { "value": 1250.00, "confidence": 0.96 },
"invoice_date": { "value": "2026-02-15", "confidence": 0.92 }
},
"grounding": { ... }
}
/documents/{id}/pdf
Download the original uploaded PDF file.
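The /documents/{id}/result payload pairs every value with its own confidence, so client code often wants the plain values plus a list of fields that fall below a threshold. A minimal sketch, assuming the `data` shape shown in the result example; `split_by_confidence` is a hypothetical helper, and the default threshold reuses the 0.85 auto-approval boundary from Key Concepts.

```python
from typing import Any, Dict, List, Tuple


def split_by_confidence(
    result: Dict[str, Any], threshold: float = 0.85
) -> Tuple[Dict[str, Any], List[str]]:
    """Return (plain field values, names of fields below the threshold)."""
    values: Dict[str, Any] = {}
    below_threshold: List[str] = []
    for name, field in result.get("data", {}).items():
        values[name] = field["value"]
        if field["confidence"] < threshold:
            below_threshold.append(name)
    return values, below_threshold
```

With the sample result above and the default threshold, every field passes; raising the threshold to 0.95 would single out `invoice_date`, whose confidence is 0.92.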
/documents/{id}/correct
Submit manual corrections for extracted fields. Used by the review queue.
Request Body
{
"corrections": {
"vendor_name": "Acme Corporation",
"total_amount": 1250.50
}
}
Response
{
"document_id": "doc_abc123",
"status": "corrected",
"updated_fields": ["vendor_name", "total_amount"]
}
/documents/{id}/retry
Retry processing for a failed document.
Response
{
"document_id": "doc_abc123",
"status": "queued"
}
/documents/{id}/chat
Ask a natural language question about the document contents.
Request Body
{
"message": "What is the payment due date?"
}
Response
{
"answer": "The payment due date is March 15, 2026.",
"confidence": 0.91,
"source_page": 1
}
/documents/{id}/accept-schema
Accept a suggested schema for this document type.
Response
{
"schema_name": "invoice_v2",
"status": "accepted"
}
/documents/{id}/related
Find related documents via cross-document matching.
Response
{
"related": [
{ "id": "doc_def456", "relation": "same_vendor", "similarity": 0.89 }
]
}
Export API
Export extraction results in multiple formats.
/documents/{id}/export/{format}
Export extracted data. Supported formats: json, csv, excel.
Response
# format = json
{
"vendor_name": "Acme Corp",
"total_amount": 1250.00,
"line_items": [...]
}
# format = csv
Content-Type: text/csv
# format = excel
Content-Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
Schemas API
Manage document schemas that define which fields to extract.
/schemas
List all available schemas including built-in and custom ones.
Response
{
"schemas": [
{ "name": "invoice", "type": "built-in", "field_count": 12 },
{ "name": "receipt", "type": "built-in", "field_count": 8 },
{ "name": "custom_po", "type": "custom", "field_count": 15 }
]
}
/schemas/{name}
Get the full schema definition including field types and descriptions.
Response
{
"name": "invoice",
"fields": [
{ "name": "vendor_name", "type": "string", "required": true },
{ "name": "total_amount", "type": "number", "required": true },
{ "name": "line_items", "type": "array", "required": false }
]
}
/schemas
Create a custom schema for a new document type.
Request Body
{
"name": "purchase_order",
"fields": [
{ "name": "po_number", "type": "string", "required": true },
{ "name": "vendor", "type": "string", "required": true },
{ "name": "line_items", "type": "array", "required": true }
]
}
Response
{
"name": "purchase_order",
"status": "created"
}
/schemas/{name}
Delete a custom schema. Built-in schemas cannot be deleted.
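A custom schema body like the purchase order example can be assembled and sanity-checked before it is sent. The sketch below is illustrative only: `make_field` and `make_schema` are hypothetical helpers, and `KNOWN_TYPES` lists just the field types that appear in this reference, so the API may well accept others.

```python
from typing import Any, Dict, List

# Only the field types shown in this reference; an assumption, not a full list.
KNOWN_TYPES = {"string", "number", "array"}


def make_field(name: str, type_: str, required: bool = False) -> Dict[str, Any]:
    """Build one field entry in the shape used by the schema examples."""
    if not name:
        raise ValueError("field name must not be empty")
    if type_ not in KNOWN_TYPES:
        raise ValueError(f"unrecognized field type: {type_!r}")
    return {"name": name, "type": type_, "required": required}


def make_schema(name: str, fields: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Build a schema creation body, rejecting duplicate field names."""
    names = [f["name"] for f in fields]
    if len(names) != len(set(names)):
        raise ValueError("field names must be unique")
    return {"name": name, "fields": fields}
```

Serializing the result with `json.dumps` yields a body in the same shape as the request example above.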
Batch API
Upload and process multiple documents in a single batch.
/batch/upload
Upload multiple PDFs for batch processing.
Request Body
Content-Type: multipart/form-data
files: [<binary PDF>, <binary PDF>, ...]
schema: "invoice" # optional
Response
{
"batch_id": "batch_xyz789",
"document_count": 5,
"status": "processing"
}
/batch/{id}
Get batch processing status and summary.
Response
{
"batch_id": "batch_xyz789",
"status": "processing",
"total": 5,
"completed": 3,
"failed": 0,
"pending": 2
}
/batch/{id}/documents
List all documents in a batch with their individual statuses.
Response
{
"documents": [
{ "id": "doc_001", "filename": "inv_1.pdf", "status": "completed" },
{ "id": "doc_002", "filename": "inv_2.pdf", "status": "processing" }
]
}
Webhooks API
Register webhook URLs to receive real-time processing notifications.
/webhooks
Register a new webhook endpoint.
Request Body
{
"url": "https://your-app.com/webhook",
"events": ["document.completed", "document.failed"]
}
Response
{
"webhook_id": "wh_abc123",
"url": "https://your-app.com/webhook",
"events": ["document.completed", "document.failed"],
"status": "active"
}
/webhooks
List all registered webhooks.
Response
{
"webhooks": [
{
"id": "wh_abc123",
"url": "https://your-app.com/webhook",
"events": ["document.completed", "document.failed"],
"status": "active"
}
]
}
/webhooks/{id}
Delete a registered webhook.
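Before registering a webhook, it can help to validate the body on the client side. The sketch below builds the registration payload in the shape shown above. `build_webhook_registration` is a hypothetical helper, the HTTPS requirement is a client-side policy assumed here rather than a documented rule, and `DOCUMENTED_EVENTS` contains only the two event names that appear in this reference, so other events may exist.

```python
from typing import Any, Dict, Iterable
from urllib.parse import urlparse

# The only event names shown in this reference; others may exist.
DOCUMENTED_EVENTS = {"document.completed", "document.failed"}


def build_webhook_registration(url: str, events: Iterable[str]) -> Dict[str, Any]:
    """Build a webhook registration body after basic client-side checks."""
    parsed = urlparse(url)
    # Assumed policy: only accept absolute HTTPS callback URLs.
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError(f"webhook URL must be an absolute https URL: {url!r}")
    event_list = list(events)
    unknown = sorted(set(event_list) - DOCUMENTED_EVENTS)
    if unknown:
        raise ValueError(f"events not listed in this reference: {unknown}")
    return {"url": url, "events": event_list}
```

Loosening either check is a one-line change if the service turns out to support plain HTTP callbacks or additional event types.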
Review Queue
Manage documents flagged for human review due to low confidence.
/review-queue
Get all documents in the review queue, sorted by priority.
Response
{
"items": [
{
"document_id": "doc_abc123",
"reason": "low_confidence",
"confidence": 0.62,
"flagged_fields": ["total_amount", "tax"],
"created_at": "2026-03-06T10:05:00Z"
}
],
"total": 8
}
/review-queue/{id}
Resolve a review queue item by approving or correcting the extraction.
Request Body
{
"action": "approve",
"corrections": {}
}
Response
{
"document_id": "doc_abc123",
"status": "resolved"
}
API Keys
Create and manage API keys for authentication.
/api-keys
Create a new API key. The full key is only shown once.
Request Body
{
"name": "production-key"
}
Response
{
"key": "sk_live_abc123def456...",
"prefix": "sk_live_abc",
"name": "production-key",
"created_at": "2026-03-06T10:00:00Z"
}
/api-keys
List all API keys (prefix only, full key is never shown again).
Response
{
"keys": [
{ "prefix": "sk_live_abc", "name": "production-key", "created_at": "2026-03-06T10:00:00Z" }
]
}
/api-keys/{prefix}
Revoke an API key by its prefix.
Health Checks
Monitor service health and readiness.
/health
Full health check including database, Redis, and MinIO status.
Response
{
"status": "healthy",
"version": "2.3.0",
"database": "connected",
"redis": "connected",
"minio": "connected"
}
/health/live
Kubernetes liveness probe. Returns 200 if the service is running.
/health/ready
Kubernetes readiness probe. Returns 200 if the service can accept traffic.
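A monitoring script can reduce the /health response to a list of dependencies that need attention. A minimal sketch, assuming the three dependency keys from the example response and that "connected" is the only healthy value; `failing_dependencies` is a hypothetical helper.

```python
from typing import Any, Dict, List

# Dependency keys taken from the example /health response.
DEPENDENCIES = ("database", "redis", "minio")


def failing_dependencies(health: Dict[str, Any]) -> List[str]:
    """Return dependencies that are missing or not reported as connected."""
    return [name for name in DEPENDENCIES if health.get(name) != "connected"]
```

An empty list means every dependency in the payload reported "connected". A missing key is treated as a failure so that a truncated response does not pass silently.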
Utility Endpoints
Direct extraction and parsing without document storage.
/extract
Run extraction directly on an uploaded file without storing it. Useful for testing.
Request Body
Content-Type: multipart/form-data
file: <binary PDF>
schema: "invoice" # optional
Response
{
"doc_type": "invoice",
"confidence": 0.93,
"data": { ... }
}
/parse
Parse a document into structured markdown without extraction.
Request Body
Content-Type: multipart/form-data
file: <binary PDF>
Response
{
"pages": 3,
"markdown": "# Invoice\n\nVendor: Acme Corp\n...",
"tables": [...]
}