# Multi-provider routing
DocTranslater can route LLM calls through a synchronous `TranslatorRouter` that selects among several LiteLLM-backed providers using a TOML-defined profile (an ordered provider list, a routing strategy, and a failure policy).
## When to use `--translator local`
Use local mode when you want a single-machine or LAN Ollama, vLLM, or other OpenAI-compatible server without editing nested `profiles` / `providers` tables. It expands internally to the same router + LiteLLM path as multi-provider mode. See Local translation.
## When to use `--translator router`
Use router mode when you want:
- Failover across OpenAI, Anthropic, OpenRouter, OpenAI-compatible gateways, or local Ollama
- Separate translation vs term extraction profiles (e.g. JSON-only backends for glossary extraction)
- Per-provider metrics (requests, latency, tokens, estimated cost) at end of run
Requirements:
- `--config` pointing to a TOML file with `[doctranslate]`, `profiles`, and `providers` tables (see Configuration).
- Valid API keys in the environment for providers that need them.
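A minimal sketch of that layout, using only keys this page names (`routing_strategy`, `providers`, `fallback_on`, `max_attempts`); the sub-table names under `profiles` / `providers` and the provider-level `model` and `api_key_env` keys are illustrative assumptions, not documented schema:

```toml
[doctranslate]
routing_strategy = "failover"   # global default strategy

[profiles.default]
providers = ["openai_main", "anthropic_backup"]   # ordered: tried first to last
fallback_on = ["rate_limit", "timeout"]           # failure categories that trigger failover
max_attempts = 3

[providers.openai_main]
model = "openai/gpt-4o-mini"      # ASSUMED key name
api_key_env = "OPENAI_API_KEY"    # ASSUMED key name

[providers.anthropic_backup]
model = "anthropic/claude-3-5-haiku-20241022"   # ASSUMED key name
api_key_env = "ANTHROPIC_API_KEY"               # ASSUMED key name
```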
Legacy workflows keep using `--translator openai` (the default) with `--openai` and related flags.
Translation memory (`--tm-mode`, etc.) applies to any translator mode once enabled; cache keys still include the router fingerprint (profile, strategy, per-provider fields). See Translation memory.
## Strategies
| Strategy | Behavior (simplified) |
|---|---|
| `failover` | Walk the profile's `providers` order; on recoverable failures in `fallback_on`, try the next provider until success or `max_attempts`. |
| `round_robin` | Rotate the starting provider per request. |
| `least_loaded` | Prefer the provider with the fewest concurrent in-flight calls when possible. |
| `cost_aware` | Prefer the cheapest estimated next-token cost when usage and price hints are available. |
The global `routing_strategy` in TOML can be overridden per profile or via `--routing-strategy`.
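For example, a sketch of that override (the per-profile placement of the key is an assumption based on the rule above):

```toml
[doctranslate]
routing_strategy = "failover"      # global default

[profiles.bulk]
routing_strategy = "round_robin"   # ASSUMED placement: per-profile override
```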
## Failure handling
Failures are classified (e.g. `rate_limit`, `timeout`, `authentication`, `malformed_response`, `content_filter`, `unknown`). The router only fails over when the category is listed in the profile's `fallback_on`. Auth-style failures typically mark the provider unhealthy for the run so it is not selected again.
Content filtering: by default, profiles do not silently fall back to another provider on content-filter errors; enable `allow_content_filter_fallback` in a profile to allow it.
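As a sketch, extending the earlier profile table:

```toml
[profiles.default]
fallback_on = ["rate_limit", "timeout", "content_filter"]
allow_content_filter_fallback = true   # opt in to failover on content-filter errors
```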
## Term extraction profile
Automatic glossary term extraction issues JSON-shaped prompts. Define a profile with `require_json_mode = true` and list providers that support JSON mode. Set `term_extraction_profile` in TOML (or pass `--term-extraction-profile`) to that profile name. Paragraph translation can use a different `routing_profile` with cheaper or faster models.
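A sketch of that split, using the keys named above (profile names and contents are otherwise illustrative):

```toml
[doctranslate]
term_extraction_profile = "terms_json"

[profiles.terms_json]
require_json_mode = true              # only JSON-mode-capable providers belong here
providers = ["openai_main"]

[profiles.default]                    # used for paragraph translation
providers = ["openai_main", "anthropic_backup"]
```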
Structured outputs: with `--translator openai`, term extraction and batched LLM paragraph translation can use OpenAI `chat.completions.parse` when `supports_structured_outputs` is true (the default for common providers in router config). Router / LiteLLM paths continue to use `response_format: json_object` until schema support is added in provider configs.
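For instance (the placement of this flag inside a provider table is assumed from the description above):

```toml
[providers.openai_main]
supports_structured_outputs = true   # allow chat.completions.parse for JSON-shaped calls
```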
## Metrics
With `metrics_output = "json"` (or `"both"`) and `metrics_json_path` set (or overridden by the CLI), the router can write structured usage summaries. Logs always include a human-readable summary when `"log"` or `"both"` is selected.
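A sketch with those two keys (the path value is illustrative):

```toml
[doctranslate]
metrics_output = "both"                        # human-readable log summary + JSON file
metrics_json_path = "out/router_metrics.json"  # illustrative path
```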
When Prometheus metrics are enabled for the process (`DOCTRANSLATE_METRICS_ENABLED`; see Observability), the same router also emits labeled service counters and histograms (outcomes, latency, tokens, estimated cost) alongside these end-of-run summaries. Use one or both, depending on whether you need files/logs or scrape targets.
## Example
```bash
doctranslate -c doctranslate.toml translate input.pdf \
  --translator router \
  --lang-in en --lang-out zh \
  -o ./out
```
Validate config without translating:
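A hypothetical invocation; the `--dry-run` flag here is an assumption, not a documented flag (check `doctranslate --help` for the actual spelling):

```bash
# HYPOTHETICAL flag: parse and validate doctranslate.toml, then exit without translating.
doctranslate -c doctranslate.toml translate input.pdf --translator router --dry-run
```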
## Programmatic use
For application embedding, prefer `doctranslate.api` with a `TranslationRequest` whose `translator.mode` is `router` or `local`, or use `doctranslate.api.build_translators` when you need low-level `BaseTranslator` instances (see Stable library API).
For tests or advanced composition, you may call `doctranslate.translator.factory.build_translators` with `translator_mode="router"` and a config path, or construct `TranslatorRouter` with pre-built `LiteLLMProviderExecutor` instances.
The public PDF pipeline still calls `translate` / `llm_translate` synchronously on the translator instance; there is no requirement to use async in your own code for translation itself. Progress reporting uses `doctranslate.api.async_translate` around thread-pooled work (see Async Translation API).
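A minimal sketch of the factory path; only `translator_mode="router"`, the config path, and the synchronous `translate` call are documented above, so the config keyword name, the returned container shape, and the argument to `translate()` are assumptions:

```python
# Sketch only: keyword names and return shape are ASSUMED, not documented.
from doctranslate.translator.factory import build_translators

translators = build_translators(
    translator_mode="router",     # documented mode name
    config="doctranslate.toml",   # ASSUMED keyword for "a config path"
)

translator = translators[0]       # ASSUMED: first of the built BaseTranslator instances
# translate() is invoked synchronously, as the PDF pipeline does;
# the argument shape here is illustrative.
print(translator.translate("Hello, world"))
```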