Documentation Index
Fetch the complete documentation index at: https://docs.barndoor.ai/llms.txt
Use this file to discover all available pages before exploring further.
Overview
The Barndoor LLM Gateway is an OpenAI-compatible (and Anthropic Messages-compatible) proxy that sits between your AI tools and one or more upstream LLM providers (OpenAI, Anthropic, Azure OpenAI, AWS Bedrock, Google Vertex AI, and other compatible providers). Pointing your client at the gateway gives you a single endpoint that:- Centralizes provider credentials — provider API keys stay in Barndoor, never on developer machines.
- Enforces governance — rate limits, token budgets, and per-user / per-team model access policies.
- Routes intelligently — model aliases, multi-provider failover, and automatic retries on transient 429s.
- Records usage — per-request token, cost, and latency telemetry surfaced in the Barndoor portal.
base_url. Clients that prefer Anthropic’s Messages API (for example Claude Code) can talk to the gateway directly as well.Architecture at a Glance
Before You Begin
You’ll need:- A Barndoor account with access to the LLM Configuration area (admin) and/or the My Models settings page (any user).
- The base URL of your Barndoor portal. Throughout this guide we use
https://app.barndoor.ai(the Barndoor SaaS host). The authoritative value for your tenant is shown in the LLM Gateway Endpoint card on Settings → My Models — copy it from there. - For admins setting up the gateway for the first time: an upstream provider account (OpenAI, Anthropic, etc.) and credentials for it.
app.barndoor.ai for your organization’s portal hostname everywhere in this guide. The path (/api/llm-gateway/v1) is the same.Quickstart
If your admin has already configured a provider and at least one model, you can be sending requests in two minutes.Get your API key
bd-… token immediately — you won’t be able to see it again.
Find your endpoint
https://app.barndoor.ai/api/llm-gateway/v1.
Admin: Configure the Gateway
| Tab | What it configures |
|---|---|
| Credentials | The actual upstream credentials (API key, AWS role, Vertex service account, etc.). |
| Providers | A named upstream that references one set of credentials and a model family. |
| Models | The catalog of upstream models exposed via this provider. |
| Model Pricing | Per-million-token input/output costs used for usage and budget reporting. |
| Model Routes | The client-facing alias (for example gpt-4o-mini) and ordered list of providers / upstream models it resolves to. |
| API Keys | Org-wide bd-… keys (per-user keys live under Settings → My Models). |

Step 1: Add Provider Credentials
Most credentials live in LLM Configuration → Credentials, where you save reusable secrets that one or more providers can share. A few setups have their own flow:- Anthropic OAuth passthrough is configured exclusively on the Providers tab (there are no Credentials to save) — see Anthropic OAuth passthrough below.
- AWS Bedrock and Google Vertex AI can be set up from either tab — pre-create reusable Credentials and attach them on the Providers tab, or configure them inline when you create the Provider. The Providers-tab flow walks you through a guided checklist and is the easiest first-time path.

OpenAI / Anthropic / Azure OpenAI (API key)
OpenAI / Anthropic / Azure OpenAI (API key)
- Pick the provider from the catalog.
- Paste the API key (and any provider-specific settings, such as Azure
endpoint_modeandapi_version). - Save. The key is stored in Barndoor’s encrypted secret store and is never returned to the browser.
AWS Bedrock
AWS Bedrock
| Method | When to use |
|---|---|
| AWS IAM Role | Production. Barndoor assumes a role in your AWS account at request time — no long-lived secrets stored. |
| Static AWS Keys | Pilot or fallback. Access key + secret key (+ optional session token). |
| Bedrock API Key | If your team prefers AWS’s first-party Bedrock API key feature. |
- AWS Region — where Barndoor sends Bedrock Runtime requests (free text, e.g.
us-east-1). Make sure the models you want are enabled in that region in AWS. - Bedrock API Family — pick Claude / Anthropic Messages for Claude models on Bedrock, or Bedrock Converse for the broader Converse-API model set (Llama, Titan, etc.).
Setting up IAM Role federation
The Providers tab walks you through this end-to-end with a Bedrock setup checklist — recommended for first-time setup.- Open Providers → Add Provider → AWS Bedrock → Connect.
- Fill in AWS Region, Customer IAM Role ARN, and Bedrock API Family. Barndoor auto-generates an External ID and shows you its own Barndoor Principal ARN for this environment in the same dialog.
- Click Copy trust policy — Barndoor builds the JSON template for you, pre-filled with its principal ARN and your external ID.
- In AWS IAM, create (or update) the role you named in step 2 and paste the copied trust policy into its trust relationship.
- Grant the role permission to invoke Bedrock — at minimum
bedrock:InvokeModelandbedrock:InvokeModelWithResponseStream. (The AWS-managedAmazonBedrockReadOnly-style policies do not include these actions.) For a POC,Resource: "*"is fine; for production, narrowResourceto the specific Bedrock model ARNs / inference profiles and regions your team actually routes to. - Back in Barndoor, click Validate role access. The portal attempts the exact STS assume-role chain it uses at runtime and surfaces the AWS error verbatim if something’s off — iterate on the trust policy / role permissions until it validates.
- Click Save. The credentials are stored and the AWS Trust Info dialog opens; you can reopen it anytime from the credential’s row menu to recheck the principal or external ID.
Setting up Static AWS Keys
Paste the AWS Access Key ID, AWS Secret Access Key, and an optional AWS Session Token (if you’re using temporary credentials). Region and Bedrock API Family work the same as the role flow. There’s no Validate button for this mode — failures show up on the first real request.Setting up a Bedrock API Key
Paste the Bedrock API Key issued by the AWS Bedrock console. Region and Bedrock API Family work the same as above.Google Vertex AI
Google Vertex AI
| Method | When to use |
|---|---|
| Workload Identity / ADC | The easiest path. Barndoor uses its platform-managed Google identity to authenticate; you grant that identity Vertex access on your project. No keys to paste. |
| Service Account Impersonation | Recommended for production. Create a service account in your GCP project with Vertex permissions, then grant Barndoor’s runtime principal roles/iam.serviceAccountTokenCreator on it. Barndoor impersonates your service account at request time. |
| Service Account Key | Pilot or fallback. Paste a service-account email + private key (the contents of a downloaded JSON key file). |
- GCP Project ID — your project.
- Vertex AI Location — region (default
us-central1). - Vertex API Family — Gemini Generate Content for Gemini models, or Claude / Anthropic Messages for Claude on Vertex.
Setup steps
- In the portal, pick the auth method and fill in the fields.
- In your GCP project, enable the Vertex AI API and any partner-model access (e.g. Claude on Vertex requires explicit model enablement).
- Grant
roles/aiplatform.user(or a more specific Vertex role) on the project to whichever principal will actually call Vertex — Barndoor’s runtime identity (Workload Identity / ADC), your target service account (impersonation), or the key’s service account. - For Service Account Impersonation only: grant
roles/iam.serviceAccountTokenCreatoron your target service account to Barndoor’s runtime principal. - Click Validate Vertex Access. Barndoor fetches a token via your chosen auth method and asks Vertex about a model in your chosen API family — confirming end-to-end that both auth and model access work.
Claude / Anthropic Messages family) works normally.Anthropic OAuth passthrough (Claude Code subscribers)
Anthropic OAuth passthrough (Claude Code subscribers)
bd-… gateway API key and forwards each caller’s Claude OAuth token (sk-ant-oat…) to Anthropic on the upstream request.Setup
- Providers → Add Provider and pick the Anthropic OAuth card (the second of the two Anthropic catalog cards).
- Name the provider (e.g.
Anthropic via Claude OAuth) and click Save — no API key field appears. - Add this provider to a Model Route alongside the Claude models you want to expose (typically
claude-sonnet-4-5,claude-opus-4, etc.).
What callers send
Every request carries two headers:| Header | Value | Purpose |
|---|---|---|
x-api-key | bd-… | Authenticates the caller to the Barndoor Gateway. |
Authorization | Bearer sk-ant-oat… | The caller’s Claude OAuth token, forwarded to Anthropic verbatim. |
Smoke-test with curl
To verify the dual-header path end-to-end without Claude Code:200 with a normal Anthropic response body confirms both headers flowed correctly. A 401 typically means one of the two headers is missing or malformed.Limitations to know about
- Endpoint scope: OAuth tokens are forwarded only on
POST /v1/messages(and Claude Code’s pre-flight/v1/messages/count_tokens). Hitting/v1/chat/completionsagainst an Anthropic OAuth provider returns401with a message asking the caller to use the Messages endpoint. - Token management: Barndoor never stores or refreshes Claude OAuth tokens. Token expiry and refresh are handled entirely by Claude Code on the caller’s machine.
- Failover: Missing-OAuth-token errors are treated as caller errors (
401), not upstream failures, so they don’t trigger failover to other routes for the same alias.
Pairing OAuth with a fallback (worked example)
A common production setup is to put OAuth passthrough first and an AWS Bedrock provider second on the same alias, so Claude OAuth absorbs everyday traffic and Bedrock takes over if Anthropic is degraded.- Create the Anthropic Claude OAuth provider as described above.
- Create an AWS Bedrock provider — see the Bedrock accordion.
- In Model Routes, make a single alias (e.g.
claude-opus-4-7) with two entries:- Entry 1 — provider: Anthropic Claude OAuth, upstream model: the Anthropic public API model id (e.g.
claude-opus-4-7). - Entry 2 — provider: AWS Bedrock, upstream model: the Bedrock model id (e.g.
us.anthropic.claude-opus-4-7).
- Entry 1 — provider: Anthropic Claude OAuth, upstream model: the Anthropic public API model id (e.g.
- Callers send requests for
claude-opus-4-7. The gateway tries OAuth first, and falls over to Bedrock on a failover-eligible upstream error. The response’smodelfield tells you which route served any given request without inspecting headers.
Step 2: Create a Provider
A Provider is the named upstream that uses one credential. From the Providers tab, click Add Provider, then:- Pick the Catalog Entry (for example OpenAI, Anthropic, Azure OpenAI East-US, Bedrock Claude).
- Select the Credential you created above.
- Give it a friendly Name (this is what shows up in audit logs and headers).
- Optionally override the Base URL (for self-hosted or proxy deployments).

Step 3: Enable Models
Open the Models tab and toggle on every upstream model your org should be able to call (for examplegpt-4o-mini, claude-sonnet-4-5, text-embedding-3-large).
These are the upstream model identifiers. They’re the names that go on the wire to the provider, not necessarily the names your clients will use.

Step 4: Configure Model Pricing (optional but recommended)
In the Model Pricing tab, set input and output cost per million tokens for each model you’ve enabled. Pricing drives every cost-aware feature on the platform:- Spending budgets in LLM Controls
estimated_costvalues recorded on every per-request audit event- Per-team / per-user / per-model attribution in the Reporting → LLM Usage dashboards
Step 5: Define Model Routes
This is the most important step. A Model Route maps a client-facing alias to one or more (provider, upstream model) pairs in priority order. It’s where you turn a logical name likegpt-4o-mini into a routing decision.
- Open Model Routes → New Route.
- Set the Alias clients will use, for example
gpt-4o-miniorteam-coding-model. - Add one or more route entries, each pointing at a provider and upstream model. The first entry is the primary; subsequent entries are failover targets used if the primary returns a 5xx, network error, or exhausted-429.
- (Optional) Configure per-route 429 retry behavior.
- Mark the entry as Bare alias if clients should be able to call the plain alias (
gpt-4o-mini). Otherwise it’s only reachable via the prefixed form (openai/gpt-4o-mini).

Step 6: (Optional) Add Governance
Once routes are live, you’ll likely want to layer on usage policies — token budgets, rate limits, and model access. All of these live in LLM Controls (a separate top-level area in the portal) and are enforced inside the gateway on every request, so they apply uniformly to every client and SDK.- Token Budgets — daily / weekly / monthly token (and optional spending) caps with alert thresholds, scoped to org, group, role, or user.
- Rate Limits — RPM / TPM caps scoped to org, group, role, user, API key, or model.
- Model Access Policies — allowlist or denylist specific models, providers, or upstream-model combinations for an org, group, role, user, or API key.
Step 7: Create an Org API Key (optional)
If you need a service-account-style key not tied to a single user, use LLM Configuration → API Keys → Create Key. Org keys can be scoped, named, and revoked, and they appear in the same usage reports as user keys.User: Get Your Endpoint and API Key
If you’re a developer who just wants to call the gateway from your app or editor:
Open My Models
Create an API key
Cursor on laptop), and click Create. The dialog shows a bd-… token — copy it now. Barndoor only stores a one-way hash of the key, so the raw value cannot be shown again.Pick a model
- Alias models (
gpt-4o-mini,claude-sonnet-4-5, …) — use the plain name in themodelfield of your request. - Standalone models — use the
provider/modelform (for exampleopenai/gpt-4o-mini).
Sending Requests
The gateway speaks two API dialects on the same port:| Endpoint | Use it from |
|---|---|
POST /v1/chat/completions | OpenAI SDKs, LangChain, LlamaIndex, most chatbots |
POST /v1/completions | Legacy OpenAI text-completion clients |
POST /v1/embeddings | OpenAI / LangChain embeddings clients |
POST /v1/responses | OpenAI Responses API clients |
POST /v1/messages | Anthropic Messages clients (Claude Code, Anthropic SDK) |
POST /v1/messages/count_tokens | Anthropic clients that pre-flight token counts |
GET /v1/models | Any client that lists models on startup |
Authorization: Bearer <key> and x-api-key: <key> headers are accepted; the gateway prefers x-api-key so Anthropic-compatible clients can still forward provider-side OAuth in Authorization.
List Available Models
{"object": "list", "data": [...]}), filtered by the policies that apply to your key.
Non-Streaming Chat Completion
Streaming Chat Completion
Set"stream": true and the gateway forwards SSE chunks back exactly as the upstream emits them.
Embeddings
Anthropic Messages (Claude Code)
The gateway also accepts native Anthropic-style requests:Connect Your Tools
OpenAI Python SDK
OpenAI Python SDK
OpenAI Node SDK
OpenAI Node SDK
Anthropic SDK
Anthropic SDK
/v1/messages to base_url itself, so the base URL should not include /v1.Claude Code
Claude Code
ANTHROPIC_BASE_URL. The exact env-var setup depends on which kind of Anthropic provider your admin configured — a shared API key provider (everyone bills against one Anthropic key) or an OAuth passthrough provider (each developer bills against their personal Claude subscription). See Step 1 → Anthropic OAuth passthrough for the admin side./v1/messages to ANTHROPIC_BASE_URL itself, so the base URL must not include /v1. If you copied the URL from Settings → My Models, strip the trailing /v1.`jq: Could not open file ~/.claude/.credentials.json`
`jq: Could not open file ~/.claude/.credentials.json`
jq over that path to extract an OAuth token, remove it — Claude Code handles the OAuth token internally and your only job is to set the env vars in the tabs above.Cursor IDE
Cursor IDE
- Toggle Custom OpenAI API Key.
- Set the base URL to
https://app.barndoor.ai/api/llm-gateway/v1. - Paste your
bd-…key as the API key. - Add the model names (for example
gpt-4o-mini,claude-sonnet-4-5) that you want Cursor to be able to select.

LangChain
LangChain
ChatAnthropic, OpenAIEmbeddings, and the rest of the OpenAI-style integrations behave the same way — set base_url/anthropic_api_url and api_key.cURL / scripts
cURL / scripts
Model Naming
The string you put in themodel field is what the gateway uses to choose a route. Two forms are valid:
| Form | Example | What the gateway does |
|---|---|---|
| Bare alias | gpt-4o-mini | Resolves the alias against your org’s configured routes, picks one in priority order, and fails over if the first choice errors. |
| Provider-prefixed alias | openai/gpt-4o-mini | Pins the request to the named provider, with no failover. Useful when more than one provider backs the same alias and you want to force a specific one. |
GET /v1/models is a discovery endpoint that returns every identifier valid for the calling key — the bare aliases, plus the provider-prefixed forms for any alias that has more than one provider behind it. Call it once on startup to see your options; you don’t need to call it on every request.
Observability Headers
Successful responses onPOST /v1/chat/completions and POST /v1/messages include a small set of headers you can log to understand routing — useful for client-side monitoring, A/B comparisons across providers, and post-incident analysis.
| Header | Meaning |
|---|---|
x-bd-provider-route-attempt | 1-indexed position of the route that succeeded (e.g. 1 = primary, 2 = first failover). |
x-bd-provider-route-count | Total number of routes configured for this alias. |
x-bd-provider-failover-count | Number of routes that failed before one succeeded. |
x-bd-provider-retry-count | Same-route 429 retries the gateway performed. |
x-bd-provider-retry-sleep-ms | Time spent waiting between 429 retries. |
/v1/completions, /v1/embeddings, or /v1/responses. For full per-request usage and cost reporting, head to Reporting → LLM Usage in the Barndoor portal.
Failover, Retries, and Streaming
- If the primary provider returns a 5xx, network error, or exhausts its 429-retry budget, the gateway transparently retries the next route in the alias’s priority list and surfaces the chosen route via the response headers above.
- Same-route 429 handling is configured per route (count + max wait). The gateway retries with exponential backoff, then either succeeds or fails over.
- Streaming responses (
stream: true) preserve provider SSE framing and pass through token usage where the provider emits it.
Troubleshooting
`401 invalid API key` / `missing or invalid Authorization header`
`401 invalid API key` / `missing or invalid Authorization header`
`404 model not found` / `no route configured for alias`
`404 model not found` / `no route configured for alias`
- Call
GET /v1/modelsto see exactly which aliases your key can use. - If you expected to see an alias, verify with an admin that it has at least one enabled route in Model Routes and that the model is toggled on in the Models tab.
- If you’re using a bare alias and two providers back it, try the prefixed form (
provider/alias) instead.
`403 Forbidden` (model access denied)
`403 Forbidden` (model access denied)
- Model-access denials look like
model 'foo' is not allowed for this caller. Check LLM Controls → Model Access for the policy that applies to your role/group.
`429 rate limit exceeded` or `token budget exhausted`
`429 rate limit exceeded` or `token budget exhausted`
- Barndoor’s rate-limit and token-budget checks can return 429 before the request reaches the upstream provider. The response body explains which limit fired.
- Upstream provider 429s are retried in-place per the route’s retry policy and may eventually fail over. Check the
x-bd-provider-retry-countandx-bd-provider-failover-countheaders to see what happened.
Streaming connection closes early
Streaming connection closes early
- Make sure your HTTP client supports SSE and doesn’t buffer (
curl -N,requests.post(..., stream=True),fetch(..., { ... })with manual reader, etc.). - The gateway will close the stream cleanly with a final
[DONE]chunk. A premature close usually means the upstream provider closed first — check the response headers and audit logs for the failover trail.
Anthropic SDK / Claude Code can't reach the gateway
Anthropic SDK / Claude Code can't reach the gateway
- The Anthropic SDK appends
/v1/messagesitself; set the base URL tohttps://app.barndoor.ai/api/llm-gateway(without/v1). - Use
x-api-keyfor the Barndoor token. If your client only knowsANTHROPIC_API_KEY, the gateway still accepts it through the same header.
Frequently Asked Questions
Do I have to use OpenAI's SDK?
Do I have to use OpenAI's SDK?
/v1/messages — will work. SDKs are a convenience, not a requirement.Where do upstream API keys live?
Where do upstream API keys live?
Can I see how many tokens / how much money my agents are using?
Can I see how many tokens / how much money my agents are using?
What's the difference between user keys and org keys?
What's the difference between user keys and org keys?
- User keys (Settings → My Models) are tied to the calling user. Usage and audit events attribute back to that user.
- Org keys (LLM Configuration → API Keys) are typically used for service accounts and CI. They have an org scope but no individual user.
What happens when an upstream provider is down?
What happens when an upstream provider is down?
502 with details for each attempt.Need Help?
Reach out to [email protected] with:- The endpoint you hit, the HTTP status code, and any
error.messagecontent. - The approximate time of the request and the model name you sent in the
modelfield. - A redacted request body if the issue reproduces consistently.
