HTTP API (container service)
DocTranslater ships an optional HTTP API for running the engine inside a container and driving it from other systems. It wraps the stable Python surface (doctranslate.api, doctranslate.schemas) and does not shell out to the CLI.
Install
pip install "DocTranslater[full,api]"
# or minimal API stack on top of an existing PDF install:
pip install "DocTranslater[pdf,cli,llm,tm,vision,api]"
Development lockfile (this repo):
(fastapi / uvicorn are included in the dev group for tests.)
Run locally
- OpenAPI UI: http://127.0.0.1:8000/docs
- OpenAPI JSON: http://127.0.0.1:8000/openapi.json
The repository keeps a checked-in OpenAPI snapshot at openapi/dist/openapi.json (regenerate with uv run python scripts/export_openapi.py). A generated Python client lives under clients/http-python/; see HTTP API — Python client.
Authentication
The OSS HTTP API uses a single shared secret (opaque string, not a JWT). By default authentication is disabled so local scripts and tests keep working unchanged.
| DOCTRANSLATE_API_AUTH_MODE | Behavior |
|---|---|
| disabled (default) | No credentials required on any route. |
| required | Protected routes need Authorization: Bearer <token> or X-API-Key: <token> (header name overridable via DOCTRANSLATE_API_AUTH_HEADER_API_KEY_NAME). DOCTRANSLATE_API_AUTH_TOKEN must be set at process start. |
Public when auth_mode=required (default): GET /v1/health/live and GET /v1/health/ready stay unauthenticated so orchestrators can probe liveness/readiness without a secret. Set DOCTRANSLATE_API_AUTH_ALLOW_UNAUTHENTICATED_PROBE_PATHS=false to require the token on those URLs as well.
Also protected when auth_mode=required: all other /v1/* routes, plus GET /metrics (when enabled), /docs, /openapi.json, and /redoc (unless DOCTRANSLATE_API_DOCS_ENABLED=false). Treat interactive docs and metrics as operator-only in production.
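The routing rules above can be restated as a small predicate, which is handy when writing smoke tests against a deployment. This is only a sketch of the documented behavior, not the service's middleware; `requires_token` and `PROBE_PATHS` are hypothetical names introduced for illustration.

```python
# Probe paths that stay public by default when auth_mode=required.
PROBE_PATHS = frozenset({"/v1/health/live", "/v1/health/ready"})


def requires_token(path: str, *, auth_mode: str = "disabled",
                   allow_unauthenticated_probes: bool = True) -> bool:
    """Return True when the documented rules say `path` needs the shared secret.

    Assumption: every route other than the probe paths is protected, which
    matches the list above (/v1/*, /metrics, /docs, /openapi.json, /redoc).
    """
    if auth_mode != "required":
        return False  # auth disabled: no credentials on any route
    if allow_unauthenticated_probes and path in PROBE_PATHS:
        return False  # orchestrators may probe without a secret
    return True
```

A smoke test could call each listed route without credentials and compare the observed 401 or non-401 outcome with this predicate.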
Local development (auth off)
uv run doctranslate serve --host 127.0.0.1 --port 8000
# no Authorization header needed
curl -sS "http://127.0.0.1:8000/v1/runtime"
Production-style (Bearer)
export DOCTRANSLATE_API_AUTH_MODE=required
export DOCTRANSLATE_API_AUTH_TOKEN='your-long-random-secret'
uv run doctranslate serve --host 127.0.0.1 --port 8000
curl -sS -H "Authorization: Bearer your-long-random-secret" \
"http://127.0.0.1:8000/v1/runtime"
API key header (alternative)
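As a rough illustration of the API-key variant, the sketch below builds the same GET /v1/runtime request with the secret in the API-key header instead of Authorization. It assumes the default header name X-API-Key and the local address used in the examples above; `build_runtime_request` is a hypothetical helper, not part of DocTranslater.

```python
import urllib.request


def build_runtime_request(token: str,
                          base_url: str = "http://127.0.0.1:8000",
                          header_name: str = "X-API-Key") -> urllib.request.Request:
    """Build a GET /v1/runtime request carrying the shared secret in the API-key header.

    Pass a different `header_name` if DOCTRANSLATE_API_AUTH_HEADER_API_KEY_NAME
    was overridden on the server.
    """
    request = urllib.request.Request(f"{base_url}/v1/runtime", method="GET")
    # urllib normalizes header capitalization; HTTP header names are case-insensitive.
    request.add_header(header_name, token)
    return request


# Usage (needs a running server, so it is left as a comment):
#   with urllib.request.urlopen(build_runtime_request("your-long-random-secret"), timeout=10) as r:
#       print(r.status, r.read().decode("utf-8"))
```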
OpenAPI / Swagger UI
When auth_mode=required, open /docs only with a valid token (same middleware as /openapi.json). In Swagger UI use Authorize and enter the token for the HTTP Bearer scheme, or send the API key header via your client.
Reverse proxies and gateways
- Forward Authorization and/or your configured API key header to the API process.
- Strip any untrusted incoming identity headers at the edge; this OSS mode does not trust X-Forwarded-User or similar.
- Prefer TLS termination at the proxy and keep the token out of query strings and logs.
Troubleshooting (401)
| Symptom | Check |
|---|---|
| 401 with ok: false and “Authentication required” | Missing Authorization: Bearer … or API key header on a protected route. |
| 401 with “Invalid authentication credentials” | Wrong token; compare with DOCTRANSLATE_API_AUTH_TOKEN. |
| 401 on /docs or /metrics | Expected when auth is required; supply the same headers as for /v1/*. |
| Process exits at startup | With auth_mode=required, DOCTRANSLATE_API_AUTH_TOKEN must be non-empty. |
CORS
CORS is driven by DOCTRANSLATE_API_CORS_ALLOW_ORIGINS (comma-separated), DOCTRANSLATE_API_CORS_ALLOW_METHODS, DOCTRANSLATE_API_CORS_ALLOW_HEADERS, and DOCTRANSLATE_API_CORS_ALLOW_CREDENTIALS. Default remains permissive (*) for local use; for production, set an explicit origin allowlist that includes any browser app origin that must call the API.
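A deployment preflight check along these lines could catch a permissive CORS configuration before it reaches production. It is a sketch only: how the service itself parses these variables is not shown here, so the comma splitting and the boolean parsing below are assumptions, and `parse_csv` and `cors_warnings` are hypothetical helpers.

```python
from typing import List, Mapping


def parse_csv(value: str) -> List[str]:
    """Split a comma-separated setting, dropping blanks and surrounding whitespace."""
    return [item.strip() for item in value.split(",") if item.strip()]


def cors_warnings(env: Mapping[str, str]) -> List[str]:
    """Return human-readable warnings for risky CORS settings (empty list means none)."""
    origins = parse_csv(env.get("DOCTRANSLATE_API_CORS_ALLOW_ORIGINS", "*"))
    # Assumption: the flag is written as true/false, as in the defaults table.
    credentials = env.get("DOCTRANSLATE_API_CORS_ALLOW_CREDENTIALS", "false").strip().lower() == "true"

    warnings: List[str] = []
    if "*" in origins:
        warnings.append("wildcard origin: set an explicit allowlist for production")
        if credentials:
            warnings.append("wildcard origin combined with credentials: use explicit origins")
    for origin in origins:
        if origin != "*" and not origin.startswith(("http://", "https://")):
            warnings.append(f"origin without scheme: {origin}")
    return warnings
```

Running `cors_warnings(os.environ)` in a CI step and failing on a non-empty result is one way to enforce the allowlist recommendation.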
Serverless and multi-instance behavior
Job records live in SQLite under DOCTRANSLATE_API_DATA_ROOT by default (with optional dual-write to legacy meta.json per job). Execution mode is selected with DOCTRANSLATE_API_QUEUE_BACKEND:
| Mode | Behavior |
|---|---|
| inprocess (default) | A bounded-concurrency JobManager runs jobs inside the API process (asyncio tasks + semaphore). |
| arq | The API enqueues work to Redis; a separate doctranslate worker process runs translations/warmups. Use shared DOCTRANSLATE_API_DATA_ROOT (and DB path) across API + workers. See HTTP API workers. |
Implications:
- POST /v1/jobs returns 202 with a job_id; GET /v1/jobs/{id} polls SQLite-backed state. With inprocess and multiple API replicas, clients may need stickiness unless only one replica accepts jobs. With arq, any replica can serve status as long as metadata/artifacts are shared.
- Horizontal scaling: inprocess replicas do not share an in-memory queue. For production multi-replica APIs, prefer arq (or an external pattern) plus object storage for inputs/outputs when appropriate.
- Restarts: completed/failed jobs remain readable from SQLite (and disk blobs if DOCTRANSLATE_API_DATA_ROOT persists). inprocess: in-flight tasks do not survive API restart. arq: queued jobs remain in Redis; a running worker may be restarted independently.
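Because POST /v1/jobs only returns a job_id, clients typically poll GET /v1/jobs/{id} until the job settles. The loop below is a minimal sketch of that pattern with capped exponential backoff. The response field name `status` and the terminal state names are assumptions, not taken from the API schema; check openapi/dist/openapi.json for the real values. `fetch_status` is a caller-supplied function that performs the HTTP GET and returns the decoded JSON.

```python
import time
from typing import Any, Callable, Mapping

# Assumed terminal state names; confirm them against the OpenAPI snapshot.
TERMINAL_STATES = frozenset({"succeeded", "failed", "cancelled"})


def poll_until_terminal(fetch_status: Callable[[], Mapping[str, Any]], *,
                        interval_seconds: float = 1.0,
                        max_interval_seconds: float = 10.0,
                        timeout_seconds: float = 600.0,
                        sleep: Callable[[float], None] = time.sleep,
                        clock: Callable[[], float] = time.monotonic) -> Mapping[str, Any]:
    """Call `fetch_status` until the job reports a terminal state or the timeout expires."""
    deadline = clock() + timeout_seconds
    delay = interval_seconds
    while True:
        payload = fetch_status()
        if payload.get("status") in TERMINAL_STATES:
            return payload
        if clock() >= deadline:
            raise TimeoutError("job did not reach a terminal state in time")
        sleep(delay)
        # Double the wait after each poll, up to the configured cap.
        delay = min(delay * 2, max_interval_seconds)
```

With inprocess and several replicas, the same loop only works reliably if every poll reaches a replica that can see the job's metadata, which is the stickiness caveat described above.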
For platform guidance, see Serverless containers and Deploy on Cloud Run.
For blob mirrors (S3/GCS), presigned downloads, TTL cleanup, and migration from legacy meta.json, see HTTP API storage and metadata.
Docker
Build the API image (extends the CPU translate stack + api extra):
docker build --target runtime-api -t doctranslater:api .
docker run --rm -p 8000:8000 \
-e OPENAI_API_KEY \
-v doctranslate-cache:/home/doctranslater/.cache/doctranslate \
-v "$PWD/examples/ci:/in:ro" \
doctranslater:api
Hardened example (shared secret + explicit CORS origin):
docker run --rm -p 8000:8000 \
-e OPENAI_API_KEY \
-e DOCTRANSLATE_API_AUTH_MODE=required \
-e DOCTRANSLATE_API_AUTH_TOKEN \
-e DOCTRANSLATE_API_CORS_ALLOW_ORIGINS='https://app.example.com' \
-v doctranslate-cache:/home/doctranslater/.cache/doctranslate \
-v "$PWD/examples/ci:/in:ro" \
doctranslater:api
Then create a job (mounted PDF under /in, one of the default allow-prefixes and the mount point used in the docker run commands above):
curl -sS -X POST 'http://127.0.0.1:8000/v1/jobs' \
-H "Authorization: Bearer $DOCTRANSLATE_API_AUTH_TOKEN" \
-F 'translation_request={"schema_version":"1","input_pdf":"/in/test.pdf","lang_in":"en","lang_out":"zh","translator":{"mode":"openai","openai":{"model":"gpt-4o-mini"}},"options":{"skip_translation":true}}'
Multipart upload variant:
curl -sS -X POST 'http://127.0.0.1:8000/v1/jobs' \
-H "Authorization: Bearer $DOCTRANSLATE_API_AUTH_TOKEN" \
-F 'translation_request={"schema_version":"1","lang_in":"en","lang_out":"zh","translator":{"mode":"openai","openai":{"model":"gpt-4o-mini"}},"options":{"skip_translation":true}}' \
-F "input_pdf=@examples/ci/test.pdf;type=application/pdf"
Poll status and fetch the result:
JOB_ID=...
curl -sS -H "Authorization: Bearer $DOCTRANSLATE_API_AUTH_TOKEN" \
"http://127.0.0.1:8000/v1/jobs/$JOB_ID"
curl -sS -H "Authorization: Bearer $DOCTRANSLATE_API_AUTH_TOKEN" \
"http://127.0.0.1:8000/v1/jobs/$JOB_ID/result"
When DOCTRANSLATE_API_AUTH_MODE is disabled (default), omit the -H "Authorization: …" lines.
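Hand-quoting the translation_request JSON inside a shell string is easy to get wrong. The sketch below builds the same form-field value in Python. It uses only the fields that appear in the curl examples above; `build_translation_request` is a hypothetical helper, and the full schema lives in doctranslate.schemas.

```python
import json
from typing import Optional


def build_translation_request(lang_in: str, lang_out: str, *,
                              input_pdf: Optional[str] = None,
                              model: str = "gpt-4o-mini",
                              skip_translation: bool = False) -> str:
    """Serialize the translation_request form field used by POST /v1/jobs.

    Pass `input_pdf` for the mounted-path variant; omit it when the PDF is
    sent as a separate multipart file part.
    """
    request = {
        "schema_version": "1",
        "lang_in": lang_in,
        "lang_out": lang_out,
        "translator": {"mode": "openai", "openai": {"model": model}},
        "options": {"skip_translation": skip_translation},
    }
    if input_pdf is not None:
        request["input_pdf"] = input_pdf
    # Compact separators keep the value short enough to paste into curl -F.
    return json.dumps(request, separators=(",", ":"))
```

The returned string can be passed as the `translation_request` field of any multipart client, or printed and pasted into the curl commands shown earlier.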
Endpoints (v1)
| Method | Path | Purpose |
|---|---|---|
| GET | /v1/health/live | Liveness |
| GET | /v1/health/ready | Readiness (dirs, optional assets, job capacity, Redis when arq) |
| GET | /v1/runtime | Version and schema versions |
| GET | /v1/assets/status | Font/model cache presence |
| POST | /v1/assets/warmup | Async warmup job (202 + job id) |
| POST | /v1/config/validate | Validate TranslationRequest JSON and/or router/local TOML |
| POST | /v1/inspect | PDF inspection (inspect_input) |
| POST | /v1/jobs | Create translation job (202, multipart or mounted path) |
| POST | /v1/jobs/json | Create translation job (202, JSON body: typed TranslationRequest, optional input_pdf_base64, optional webhook) |
| GET | /v1/jobs/{id} | Job status / last progress event (includes progress_seq) |
| GET | /v1/jobs/{id}/events | Replayable progress history (after_seq, limit) |
| GET | /v1/jobs/{id}/stream | SSE progress stream (text/event-stream; supports Last-Event-ID reconnect; full_events=1 includes full finish payload) |
| POST | /v1/jobs/{id}/cancel | Best-effort cancel |
| GET | /v1/jobs/{id}/result | Result + artifact URLs |
| GET | /v1/jobs/{id}/manifest | Artifact manifest with resolved download URLs + metadata |
| GET | /v1/jobs/{id}/artifacts/{kind} | Download one artifact (supports Range, optional redirect to presigned URL) |
| HEAD | /v1/jobs/{id}/artifacts/{kind} | Artifact size / type without downloading the body |
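For clients that consume GET /v1/jobs/{id}/stream without an SSE library, the text/event-stream framing can be parsed in a few lines. The sketch below follows the generic Server-Sent Events format (id, event, and data fields, with a blank line ending each event). It makes no assumption about the event names or payloads this API emits, and `iter_sse_events` is a hypothetical helper.

```python
from typing import Dict, Iterable, Iterator, List, Optional


def iter_sse_events(lines: Iterable[str]) -> Iterator[Dict[str, Optional[str]]]:
    """Yield one dict per SSE event from an iterable of decoded text lines."""
    last_id: Optional[str] = None  # persists across events, per the SSE format
    event_type = "message"         # default event type when no event: field is sent
    data_lines: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the buffered event (if it carried data).
            if data_lines:
                yield {"id": last_id, "event": event_type, "data": "\n".join(data_lines)}
            event_type, data_lines = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space after the colon is not part of the value
        if field == "data":
            data_lines.append(value)
        elif field == "event":
            event_type = value
        elif field == "id":
            last_id = value
```

On reconnect, a client would send the `id` of the last event it processed in the Last-Event-ID request header so the stream can resume from that point.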
Progress streams and webhooks
- Polling remains supported: GET /v1/jobs/{id} is unchanged aside from optional progress_seq.
- Replay: GET /v1/jobs/{id}/events?after_seq=N returns ordered {seq, event} rows from SQLite (job_events table).
- SSE: curl -N "http://127.0.0.1:8000/v1/jobs/$JOB_ID/stream". Behind nginx, set proxy_buffering off for the location so chunks flush promptly.
- Webhooks (optional): add multipart form field webhook with JSON {"url":"https://example.com/hook","secret":"..."} or {"url":"...","secret_env":"MY_ENV_VAR"} (secret read at delivery time). On terminal states the API POSTs a compact JSON body signed with Standard Webhooks-style headers webhook-id, webhook-timestamp, and webhook-signature (v1,<hmac>). Delivery uses retries with exponential backoff until success or max attempts.
- Set DOCTRANSLATE_API_PUBLIC_BASE_URL when consumers need absolute result_url / manifest_url values inside webhook payloads (otherwise paths are rooted at /v1/... relative to the configured origin).
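A webhook receiver should check the signature before trusting the payload. The sketch below follows the Standard Webhooks convention: HMAC-SHA256 over `id.timestamp.body`, base64-encoded, carried as `v1,<signature>`. Two points are assumptions rather than facts from this page: that the secret is used as raw bytes (Standard Webhooks libraries usually base64-decode a `whsec_` secret first), and that the header names arrive lowercased. Verify both against a real delivery before relying on it. `verify_webhook` is a hypothetical helper.

```python
import base64
import hashlib
import hmac
import time
from typing import Mapping, Optional


def verify_webhook(secret: bytes, headers: Mapping[str, str], body: bytes, *,
                   tolerance_seconds: int = 300,
                   now: Optional[float] = None) -> bool:
    """Return True when the webhook-signature header matches the raw request body."""
    msg_id = headers.get("webhook-id", "")
    timestamp = headers.get("webhook-timestamp", "")
    signature_header = headers.get("webhook-signature", "")
    if not (msg_id and timestamp and signature_header):
        return False
    try:
        sent_at = int(timestamp)
    except ValueError:
        return False
    # Reject stale or future-dated deliveries to limit replay attacks.
    current = time.time() if now is None else now
    if abs(current - sent_at) > tolerance_seconds:
        return False

    signed = msg_id.encode("utf-8") + b"." + timestamp.encode("utf-8") + b"." + body
    expected = base64.b64encode(hmac.new(secret, signed, hashlib.sha256).digest())

    # The header may carry several space-separated signatures (for key rotation).
    for candidate in signature_header.split():
        version, _, value = candidate.partition(",")
        if version == "v1" and hmac.compare_digest(value.encode("utf-8"), expected):
            return True
    return False
```

Always pass the exact bytes received on the wire as `body`; re-serializing parsed JSON usually changes the bytes and breaks the comparison.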
Environment variables
| Variable | Default | Meaning |
|---|---|---|
| DOCTRANSLATE_API_AUTH_MODE | disabled | disabled or required (shared secret on protected routes) |
| DOCTRANSLATE_API_AUTH_TOKEN | (unset) | Shared secret when auth_mode=required (use a strong random value) |
| DOCTRANSLATE_API_AUTH_HEADER_API_KEY_NAME | X-API-Key | Header name for API-key style authentication |
| DOCTRANSLATE_API_AUTH_ALLOW_UNAUTHENTICATED_PROBE_PATHS | true | When false, liveness/readiness also require auth |
| DOCTRANSLATE_API_DOCS_ENABLED | true | Expose /docs, /redoc, /openapi.json |
| DOCTRANSLATE_API_CORS_ALLOW_ORIGINS | * | Comma-separated allowed origins (see Authentication) |
| DOCTRANSLATE_API_CORS_ALLOW_CREDENTIALS | false | CORS Access-Control-Allow-Credentials |
| DOCTRANSLATE_API_CORS_ALLOW_METHODS | * | Comma-separated methods or * |
| DOCTRANSLATE_API_CORS_ALLOW_HEADERS | * | Comma-separated headers or * |
| DOCTRANSLATE_API_DATA_ROOT | system temp + /doctranslate-api | Job workspaces and metadata |
| DOCTRANSLATE_API_TMP_ROOT | (unset) | Optional separate temp root |
| DOCTRANSLATE_API_MOUNT_ALLOW_PREFIXES | /work,/in,/data | Allowed path prefixes for input_pdf without upload |
| DOCTRANSLATE_API_ALLOW_MOUNTED_PATHS | true | Disable mounted paths when false |
| DOCTRANSLATE_API_MAX_UPLOAD_BYTES | 256000000 | Multipart upload limit |
| DOCTRANSLATE_API_QUEUE_BACKEND | inprocess | inprocess or arq (Redis workers) |
| DOCTRANSLATE_API_REDIS_URL | redis://127.0.0.1:6379/0 | Redis for ARQ (queue_backend=arq) |
| DOCTRANSLATE_API_ARQ_QUEUE_NAME | arq:queue | ARQ queue name (API + worker must match) |
| DOCTRANSLATE_API_MAX_CONCURRENT_JOBS | 2 | Semaphore for in-process running jobs; per-worker concurrency hint for arq |
| DOCTRANSLATE_API_MAX_QUEUED_JOBS | 32 | Max jobs in queued+running (SQLite count for arq) |
| DOCTRANSLATE_API_JOB_TIMEOUT_SECONDS | 0 | Per-job wall clock (0 = off) |
| DOCTRANSLATE_API_REQUIRE_ASSETS_READY | false | If true, readiness requires warmed assets |
| DOCTRANSLATE_API_WARMUP_ON_STARTUP | none | none, lazy, or eager (only eager is implemented: run assets.warmup at startup) |
| DOCTRANSLATE_API_ARTIFACT_RETENTION_SECONDS | 86400 | After terminal job states, schedule workspace + metadata deletion (0 disables) |
| DOCTRANSLATE_API_TTL_CLEANUP_INTERVAL_SECONDS | 300 | Background sweep interval for expired jobs |
| DOCTRANSLATE_API_METADATA_SQLITE_PATH | (unset) | Override SQLite DB path (default <DATA_ROOT>/http_api_metadata.db) |
| DOCTRANSLATE_API_DUAL_WRITE_JSON_META | true | Also write jobs/<id>/meta.json |
| DOCTRANSLATE_API_READ_JSON_META_FALLBACK | true | If SQLite misses a row, read legacy meta.json |
| DOCTRANSLATE_API_ARTIFACT_STORAGE | local | local or remote (fsspec mirror) |
| DOCTRANSLATE_API_ARTIFACT_REMOTE_ROOT | (unset) | e.g. s3://bucket/prefix (requires [api-s3]) |
| DOCTRANSLATE_API_FSSPEC_STORAGE_OPTIONS_JSON | (empty) | JSON object for fsspec / s3fs / gcsfs options |
| DOCTRANSLATE_API_ARTIFACT_DOWNLOAD_MODE | proxy | proxy or redirect (presigned URLs when available) |
| DOCTRANSLATE_API_PRESIGN_EXPIRES_SECONDS | 3600 | Presigned URL TTL |
| DOCTRANSLATE_API_JOB_SSE_POLL_INTERVAL_SECONDS | 0.25 | Poll interval for SSE when not using in-process fan-out (arq workers) |
| DOCTRANSLATE_API_PUBLIC_BASE_URL | (unset) | Optional public origin for webhook payload URLs (https://api.example.com) |
| DOCTRANSLATE_API_WEBHOOK_HTTPS_REQUIRED | false | When true, reject non-HTTPS webhook URLs at job creation |
| DOCTRANSLATE_API_WEBHOOK_ALLOW_HOSTS | (empty) | Optional comma-separated host allowlist for webhook URLs |
| DOCTRANSLATE_API_WEBHOOK_MAX_ATTEMPTS | 10 | Max delivery attempts before abandoning a webhook |
| DOCTRANSLATE_API_WEBHOOK_DELIVERY_BATCH | 5 | Max concurrent webhook deliveries per sweep |
| DOCTRANSLATE_API_WEBHOOK_HTTP_TIMEOUT_SECONDS | 30 | Outbound webhook HTTP timeout |
| DOCTRANSLATE_API_WEBHOOK_SWEEP_INTERVAL_SECONDS | 2 | Background sweep interval for pending webhook deliveries |
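To catch misconfiguration before the container starts, a deploy script can read a few of these variables with the documented defaults and apply the documented startup rule (a non-empty token when auth is required). This is a sketch of such a preflight check, not the service's own settings loader; `ApiSettings` and `load_settings` are hypothetical names, and the boolean parsing is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Tuple

PREFIX = "DOCTRANSLATE_API_"


@dataclass(frozen=True)
class ApiSettings:
    auth_mode: str
    auth_token: str
    queue_backend: str
    max_concurrent_jobs: int
    max_upload_bytes: int
    mount_allow_prefixes: Tuple[str, ...]
    docs_enabled: bool


def load_settings(env: Mapping[str, str] = os.environ) -> ApiSettings:
    """Read a subset of the variables above, falling back to the documented defaults."""
    def get(name: str, default: str) -> str:
        return env.get(PREFIX + name, default)

    prefixes = get("MOUNT_ALLOW_PREFIXES", "/work,/in,/data")
    settings = ApiSettings(
        auth_mode=get("AUTH_MODE", "disabled"),
        auth_token=get("AUTH_TOKEN", ""),
        queue_backend=get("QUEUE_BACKEND", "inprocess"),
        max_concurrent_jobs=int(get("MAX_CONCURRENT_JOBS", "2")),
        max_upload_bytes=int(get("MAX_UPLOAD_BYTES", "256000000")),
        mount_allow_prefixes=tuple(p.strip() for p in prefixes.split(",") if p.strip()),
        # Assumption: booleans are written as true/false, as in the table.
        docs_enabled=get("DOCS_ENABLED", "true").strip().lower() == "true",
    )
    if settings.auth_mode not in {"disabled", "required"}:
        raise ValueError(f"unknown auth mode: {settings.auth_mode}")
    if settings.auth_mode == "required" and not settings.auth_token:
        raise ValueError("DOCTRANSLATE_API_AUTH_TOKEN must be non-empty when auth is required")
    if settings.queue_backend not in {"inprocess", "arq"}:
        raise ValueError(f"unknown queue backend: {settings.queue_backend}")
    return settings
```

Calling `load_settings()` in an entrypoint or CI step surfaces the same class of error as the "Process exits at startup" row in the troubleshooting table, but earlier.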
Production notes
- Prefer horizontal scaling (several single-worker replicas) over many Uvicorn workers per replica: the layout ONNX model and PDF work are memory-heavy.
- Put DOCTRANSLATE_API_DATA_ROOT on a writable volume; mount ~/.cache/doctranslate for persistent fonts/models/TM.
- Configure reverse-proxy client_max_body_size to match DOCTRANSLATE_API_MAX_UPLOAD_BYTES.
- Enable DOCTRANSLATE_API_AUTH_MODE=required before exposing the service on the internet; use a long random DOCTRANSLATE_API_AUTH_TOKEN from a secret manager.
- Tighten CORS to explicit origins; avoid * for browser-accessible production APIs.
- Optionally set DOCTRANSLATE_API_DOCS_ENABLED=false on internet-facing services if you do not want the interactive OpenAPI UI to be reachable.
Serverless deployment (short checklist)
- Choose image runtime-api and container port 8000 (see Docker image profiles).
- Set DOCTRANSLATE_API_DATA_ROOT (and optionally DOCTRANSLATE_API_TMP_ROOT) on fast writable storage.
- Decide warmup strategy: baked warm image, DOCTRANSLATE_API_WARMUP_ON_STARTUP=eager, or POST /v1/assets/warmup after deploy.
- Set DOCTRANSLATE_API_JOB_TIMEOUT_SECONDS when the platform needs a hard wall-clock bound.
- For multi-replica services, prefer DOCTRANSLATE_API_QUEUE_BACKEND=arq with shared storage, or read Serverless and multi-instance behavior and enable session affinity only as a best-effort mitigation for inprocess.
Full matrix and image tags: Serverless runtime & image reference.
Observability
Structured JSON logs, Prometheus (GET /metrics), request correlation (X-Request-ID), and optional OpenTelemetry tracing are documented in Observability.
ASGI import
Uvicorn can load the pre-built app: