Every layer on GitHub. Every byte in the EU.
Anthropic / OpenAI / Operator all share one shape: a US opaque-box runtime at the bottom of the stack. We built the opposite — every component is open-source software your engineers can read, hosted on German infrastructure your security review can reach, with cross-tenant learning gated by a scrubbing pipeline you can audit per tenant. Sovereignty is the architecture, not a checkbox.
From your Chrome down to the metal — nothing proprietary, nothing offshore.
Each layer is a known open-source primitive, deployed on Hetzner Cloud or in your VPC. The diagram below is the actual deployment shape — not a marketing simplification. License badges link to the repos; the audit trail starts there.
Every line of LiteLLM, Langfuse, pgvector, and the SiteBridge MCP server is on GitHub under a permissive license: MIT, Apache-2.0, or (for pgvector) the PostgreSQL License. You don't need our permission to read it. You don't need our permission to host it.
How a single agent action traverses the stack.
A browser action — clicking Save in HubSpot, dropping an HTTP node in n8n, rescuing a stuck Base44 chat — flows through the stack in a fixed shape. The interesting part is how often it short-circuits before reaching the model.
Most browser-agent calls are repeats. The first time the agent works out 'click the green Save in HubSpot's stage dropdown', it pays for a full LLM call. The next 4,200 times, across all tenants, the same action costs ~€0 and about 50 ms.
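The short-circuit can be pictured as a lookup that runs before any model is called. The sketch below is a minimal illustration, not the SiteBridge implementation: `RecipeCache`, `resolve`, and `fake_planner` are hypothetical names, and an exact-match dictionary stands in for whatever similarity lookup the pgvector-backed skill library actually performs.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple


@dataclass
class RecipeCache:
    """Hypothetical cache mapping (site, normalized intent) to a learned recipe."""

    recipes: Dict[Tuple[str, str], str] = field(default_factory=dict)
    hits: int = 0
    misses: int = 0

    def resolve(self, site: str, intent: str,
                plan_with_llm: Callable[[str, str], str]) -> str:
        # Normalize the intent so trivially different phrasings share a key.
        key = (site, intent.strip().lower())
        cached = self.recipes.get(key)
        if cached is not None:
            # Short-circuit: no model call is made on a hit.
            self.hits += 1
            return cached
        # Miss: pay for one model call, then remember the result.
        self.misses += 1
        recipe = plan_with_llm(site, intent)
        self.recipes[key] = recipe
        return recipe


def fake_planner(site: str, intent: str) -> str:
    # Stand-in for the expensive LLM planning step (assumption for the demo).
    return f"click[{site}]::{intent.strip().lower()}"


cache = RecipeCache()
first = cache.resolve("hubspot", "Click Save in stage dropdown", fake_planner)
second = cache.resolve("hubspot", "click save in stage dropdown", fake_planner)
```

Here the second call is served from the cache, so only the first one reaches the planner.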
Every LiteLLM call writes to Langfuse via a 2-line callback. PII masking runs before the trace lands. Datasets feed the offline distiller that turns successful runs into recipes. One trace store, per-tenant by design.
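To make "masking runs before the trace lands" concrete, here is a small stdlib-only sketch. It does not use the LiteLLM or Langfuse APIs; `mask_pii` and `TraceStore` are hypothetical, and the two regular expressions are deliberately simple examples rather than a complete PII detector.

```python
import re
from typing import Any, Dict, List

# Illustrative patterns only; real PII detection needs far broader coverage.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def mask_pii(text: str) -> str:
    """Replace e-mail addresses and phone-like digit runs with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


class TraceStore:
    """Hypothetical in-memory trace store, keyed per tenant."""

    def __init__(self) -> None:
        self.rows: List[Dict[str, Any]] = []

    def write(self, tenant_id: str, prompt: str, completion: str) -> None:
        # Masking happens here, before anything is persisted.
        self.rows.append({
            "tenant_id": tenant_id,
            "prompt": mask_pii(prompt),
            "completion": mask_pii(completion),
        })

    def for_tenant(self, tenant_id: str) -> List[Dict[str, Any]]:
        # Per-tenant reads never return another tenant's rows.
        return [r for r in self.rows if r["tenant_id"] == tenant_id]


store = TraceStore()
store.write("acme", "Mail anna@example.com or call +49 30 1234567", "done")
store.write("globex", "Open the pipeline view", "done")
acme_rows = store.for_tenant("acme")
```

The raw address and number never reach storage; only the masked strings do.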
pgvector holds two namespaces: tenant overlay (your selectors, your conventions, your data) and pattern library (cross-tenant, scrubbed, k-anonymity ≥ 5). Patterns flow up. Data stays down. Audit log shows every contribution.
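One way to read the k-anonymity gate is "a pattern enters the shared library only once at least five distinct tenants have produced it." The sketch below implements that reading; the interpretation, the function name `promotable_patterns`, and the pattern strings are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

K_MIN = 5  # threshold taken from the "k-anonymity >= 5" statement above


def promotable_patterns(observations: Iterable[Tuple[str, str]],
                        k: int = K_MIN) -> List[str]:
    """Return patterns seen in at least k distinct tenants.

    Each observation is a (tenant_id, scrubbed_pattern) pair. Repeats from
    the same tenant do not count twice.
    """
    tenants_by_pattern: Dict[str, Set[str]] = defaultdict(set)
    for tenant_id, pattern in observations:
        tenants_by_pattern[pattern].add(tenant_id)
    return sorted(p for p, tenants in tenants_by_pattern.items()
                  if len(tenants) >= k)


observations = [(f"t{i}", "hubspot:save-stage") for i in range(1, 6)]
observations += [("t1", "acme-crm:custom-field"), ("t1", "acme-crm:custom-field")]
shared = promotable_patterns(observations)
```

The tenant-specific pattern stays in the overlay because only one tenant has contributed it, no matter how often.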
Models + context + MCP — delivered as one product.
Anthropic's pitch is "subscribe to Claude, install our extension." That sounds simple until you multiply it by every user, every team, every procurement cycle, every regional billing constraint. SiteBridge through this stack lets you sell the outcome — a working browser agent — instead of three components each with its own contract.
Without us
- User signs up for Anthropic Pro / Max / Team / Enterprise (per-user, per-tier).
- Procurement signs Anthropic DPA, reviews EU data-residency clauses, files SCCs anyway because the data plane is US.
- Org admins manage Anthropic console: usage, members, billing.
- Outage = downtime. No second provider in the path.
- Model gets deprecated mid-quarter — the team scrambles.
- No insight into how the agent is reasoning — Anthropic owns the trace store.
With SiteBridge
- User signs in via your SSO. Done.
- Procurement signs one Agiliton contract. EU data residency by construction — no SCCs.
- Admins use one dashboard for users, usage, budgets, audit.
- Outage on any single provider = transparent failover.
- Model deprecation = config swap on our side. Customers don't notice.
- Every tool call traced in your Langfuse instance — you own the data, not us.
What this stack actually buys you.
1. Open source end-to-end
2. Hosted in EU · no US data plane
3. No third-party subscription per user
4. Every major model, one gateway
5. One contract, one bill, one DPA
6. Cost: complexity-routed
7. Failover built in
8. Privacy: route to self-hosted
9. Patterns up · data down
10. Per-team budgets & quotas
11. Single-pane observability
12. Future-proof model layer
Per-user / per-month — and why the bundled price beats Anthropic-direct.
Anthropic charges per user, per tier. Agiliton via this stack charges per outcome, with the spread between routed-cheap inference and a reasonable seat price as the margin. For a moderately active user (~50 browser tasks/day, 30k tokens each, 22 working days/month):
| Configuration | What user pays | Underlying inference cost | Vendor exposure |
|---|---|---|---|
| Claude for Chrome (Pro) | $20/mo per seat | (Anthropic captures all margin) | Anthropic only · per-seat sub required · US data plane |
| Claude for Chrome (Max) | $100–200/mo per seat | (Anthropic captures all margin) | Anthropic only · per-seat sub required · US data plane |
| SiteBridge · bundled routed | Customer's negotiated seat / metered price | ~$5–10/seat/mo (MiniMax/GLM majority + bypass minority) | Multi · failover ready · we manage · EU-hosted |
| SiteBridge · self-hosted models | Customer's negotiated price | ~$0 marginal (their compute) | Zero · in customer VPC · open-source verified |
Numbers illustrative as of May 2026. The ratio is the point: on the figures in the table, the spread between routed-cheap inference and Anthropic list price runs from roughly 2–4× against Claude Pro to 10× or more against Max, and the customer never sees the spread. They see a single bundled price for "browser agent that works."
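The workload assumption behind the table can be checked with a few lines of arithmetic. The sketch below only restates the stated inputs (50 tasks/day, 30k tokens each, 22 working days) and derives what blended per-million-token rate the ~$5–10/seat/mo figure would imply. It assumes every task reaches a model; the implied rate is a derived illustration, not a quoted price from any provider.

```python
TASKS_PER_DAY = 50
TOKENS_PER_TASK = 30_000
WORKING_DAYS = 22

# Total tokens one moderately active user generates per month.
monthly_tokens = TASKS_PER_DAY * TOKENS_PER_TASK * WORKING_DAYS


def implied_price_per_million(monthly_cost_usd: float) -> float:
    """Blended USD per million tokens implied by a monthly inference cost."""
    return monthly_cost_usd / (monthly_tokens / 1_000_000)


low = implied_price_per_million(5.0)    # lower end of the table's range
high = implied_price_per_million(10.0)  # upper end of the table's range

print(f"{monthly_tokens:,} tokens/month")
print(f"implied blended rate: ${low:.2f} to ${high:.2f} per million tokens")
```

That works out to 33 million tokens per user per month, so the table's inference cost corresponds to a blended rate of roughly $0.15 to $0.30 per million tokens. Any tasks served from the recipe cache would reduce the token volume and loosen that constraint.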
This isn't roadmap — it's how Agiliton runs today.
The full stack (LiteLLM gateway, Langfuse trace store, pgvector skill library, SiteBridge MCP server) is core infrastructure at Agiliton. Routing decisions, virtual keys, complexity routing, the bypass-flag mechanism, and the scrubbing pipeline all run in production behind multiple internal agents, including the Matrix bot, customer-facing automations, and SiteBridge sessions themselves. The integration is battle-tested, not theoretical.
- Complexity router live. Auto-downgrades sonnet/haiku/opus aliases to MiniMax/GLM when prompt complexity allows.
- Bypass flag proven. Per-key bypass_complexity_router=true shipped after morphology errors in non-English replies; matrix-ai-agent uses it daily.
- Virtual keys with metadata. Per-tenant, per-agent keys with spend caps and audit metadata are the norm, not the exception.
- Self-hosted path validated. LiteLLM routes to Anthropic, OpenAI, xAI, Google, and self-hosted Ollama-served models in production.
- Langfuse trace ingest live. Every model call + every MCP tool call captured, PII-masked, queryable per-tenant.
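The interaction between the complexity router and the per-key bypass flag can be sketched as follows. This is an illustration under stated assumptions: the scoring heuristic, the threshold, the `cheap-tier` alias, and the `VirtualKey` class are all hypothetical. Only the flag name `bypass_complexity_router` comes from the list above.

```python
from dataclasses import dataclass

CHEAP_MODEL = "cheap-tier"  # hypothetical alias for the MiniMax/GLM pool


@dataclass(frozen=True)
class VirtualKey:
    """Hypothetical per-tenant key carrying routing metadata."""

    tenant_id: str
    bypass_complexity_router: bool = False


def complexity_score(prompt: str) -> int:
    """Toy heuristic: word count plus a penalty for multi-step wording."""
    lowered = f" {prompt.lower()} "
    steps = sum(lowered.count(w) for w in (" then ", " after ", " unless "))
    return len(prompt.split()) + 20 * steps


def route(requested_model: str, prompt: str, key: VirtualKey,
          threshold: int = 40) -> str:
    # Keys with the bypass flag always get the model they asked for.
    if key.bypass_complexity_router:
        return requested_model
    # Simple prompts are downgraded to the cheaper pool.
    if complexity_score(prompt) < threshold:
        return CHEAP_MODEL
    return requested_model


default_key = VirtualKey("tenant-a")
bypass_key = VirtualKey("matrix-ai-agent", bypass_complexity_router=True)

simple = route("sonnet", "Click Save", default_key)
forced = route("sonnet", "Click Save", bypass_key)
complex_ = route("sonnet", "word " * 50, default_key)
```

Putting the flag on the key rather than on the request means a whole agent can opt out of downgrades without changing its call sites.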
Anticipated pushback.
“Adding a proxy adds latency and a failure point.”
LiteLLM adds ~5–20 ms of routing overhead. For a browser task that costs 800–3000 ms of model latency, this is in the noise. The failure-point concern is real but inverted: without LiteLLM, your single point of failure is Anthropic. With it, the failover chain is N providers deep. You trade one critical dependency for a routable mesh.
When LiteLLM is self-hosted, the gateway runs on the same network as your agents, with no extra public-internet hop.
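A failover chain "N providers deep" reduces to trying providers in order and returning the first success. The sketch below shows that shape with stand-in callables; it is not LiteLLM's router, and `call_with_failover`, `AllProvidersFailed`, and the provider names are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the chain has errored."""


def call_with_failover(
    providers: Sequence[Tuple[str, Callable[[str], str]]],
    prompt: str,
) -> Tuple[str, str]:
    """Try each (name, call) pair in order; return (name, response) on success."""
    errors: List[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # any provider error triggers the next hop
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


def provider_down(prompt: str) -> str:
    # Stand-in for a provider that is having an outage.
    raise TimeoutError("upstream timeout")


def provider_up(prompt: str) -> str:
    # Stand-in for a healthy provider.
    return f"ok: {prompt}"


served_by, response = call_with_failover(
    [("primary", provider_down), ("secondary", provider_up)],
    "click save",
)
```

The caller sees one response and, if it cares, which provider served it. The chain fails only when every provider fails.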
“My team standardized on Anthropic. Why complicate things?”
You don't have to use it differently. SiteBridge speaks MCP; day-1 config can be "everything to Claude." The option to route is the value — not having it means you can't react when Anthropic raises prices, deprecates a model mid-migration, or has a multi-hour outage. With LiteLLM in the path, you keep the option. Without it, you're vendor-coupled and the vendor knows it.
The flexibility is there when you need it — and procurement still signs one contract.
“We already have an LLM gateway / observability stack.”
Good, bring it. LiteLLM is one component: if you already run Portkey, Helicone, or your own gateway, point SiteBridge at it. Langfuse is one component: if you already pipe traces to Datadog, OpenTelemetry, or Grafana, the LiteLLM callback supports those too. The stack is composable, not coupled.
What you can't bring: the cross-tenant skill library, the pre-click guards, the Chrome-MCP toolkit. Those are SiteBridge proper.