Agiliton CRM — Architecture
What the AI reads, when a customer writes
Layers
- Input
- 1. Customer messages
- 2. Customer profile & narrative
- 3. Health data & lab values
- 4. Knowledge chunks
- 5. Coach voice & prompts
- 6. Coach role & rules
- 7. Cross-tenant patterns
Scope
- Single customer
- Tenant-wide
- Anonymised cross-tenant
- Input: The message — Scope: input. What the customer just sent. The five seconds the AI spends on this single sentence are why every layer around it exists.
- Layer 1: Customer messages — Scope: single customer. Recent messages plus semantic-search hits in this customer's full message history. Every read is filtered by customerId in the database — never reaches another customer.
- Layer 2: Customer profile & narrative — Scope: single customer. Lifecycle status, enrolment, due date, plus an AI-maintained narrative summary of this customer's journey. Regenerates as new data arrives — and stays inside this one record.
- Layer 3: Health data & lab values — Scope: single customer. Per-person blood panels, hair mineral analyses, supplements, symptoms, treatment plans. Phone numbers are AES-256 encrypted at rest. Every read is row-level filtered.
- Layer 4: Knowledge chunks — Scope: tenant-wide · mode-filtered · PII-redacted. Retrieved chunks from the coach's library — programme materials, protocols, references. PII is stripped before chunks ever enter the index, and the customer's lifecycle status pre-filters by sales vs. coaching.
- Layer 5: Coach voice & prompts — Scope: tenant-wide. The coach's tone and the mode-specific system prompts (sales for leads, coaching for active care). No customer data; identical across every reply this coach generates.
- Layer 6: Coach role & rules — Scope: tenant-wide. The hard rules the AI inherits: what role it plays, what it must never say, how it must cite sources, deontological constraints. Tenant-wide; the customer cannot see or edit it.
- Layer 7: Cross-tenant patterns — Scope: anonymised cross-tenant. Coaching patterns extracted daily from anonymised feedback signals. No names, no specific lab values, no conversation snippets; only general claims weighted by confidence. This is the one cross-tenant artefact, and by construction it cannot be inverted back to a person.
Seven layers of context, each with its own scope rule. The geometry is the proof.
Customer messages
context-assembler.ts:863–884
Customer profile & narrative
context-assembler.ts:771–832
Health data & lab values
context-assembler.ts:833–862
Knowledge chunks
context-assembler.ts:751–769
Coach voice & prompts
context-assembler.ts:726–749
Coach role & rules
context-assembler.ts:712–724
Cross-tenant patterns
context-assembler.ts:886–896
Boundaries enforced in the query, not the prompt.
What you just scrolled through is not a metaphor. Every layer's scope is a WHERE clause on a database query — a misbehaving prompt cannot widen the boundary. Below, the same rules made explicit: how the mode is selected, how PII is stripped before indexing, how cross-tenant learning happens without cross-tenant data.
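To make "a WHERE clause, not a prompt" concrete, here is a minimal sketch of a query builder that cannot produce an unscoped read. The `scopedQuery` helper, the `Scope` type, and the table and column names are hypothetical and are not taken from context-assembler.ts; only the idea of filtering by tenantId and customerId comes from the text above.

```typescript
// Hypothetical sketch: the scope travels as bound parameters, never as prompt text.
type Scope =
  | { kind: "customer"; tenantId: string; customerId: string }
  | { kind: "tenant"; tenantId: string };

interface SqlQuery {
  text: string;
  values: string[];
}

// Builds a parameterised SELECT whose WHERE clause is derived from the scope.
// Table and column names are illustrative assumptions.
function scopedQuery(table: string, scope: Scope): SqlQuery {
  // Reject anything that is not a plain identifier before interpolating it.
  if (!/^[a-z_]+$/.test(table)) {
    throw new Error("unexpected table name: " + table);
  }
  if (scope.kind === "customer") {
    return {
      text: `SELECT * FROM ${table} WHERE tenant_id = $1 AND customer_id = $2`,
      values: [scope.tenantId, scope.customerId],
    };
  }
  return {
    text: `SELECT * FROM ${table} WHERE tenant_id = $1`,
    values: [scope.tenantId],
  };
}

// Example: a layer-1 read for one customer.
const messageQuery = scopedQuery("messages", {
  kind: "customer",
  tenantId: "t1",
  customerId: "c42",
});
```

Because the model's output never reaches `scopedQuery`, nothing the model writes can change which rows are eligible; it only ever sees what the query returned.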
Mode selection
The mode is picked from data, not from the LLM.
A pre-purchase lead and an active coaching client need fundamentally different conversations. We don't ask the model to decide which one this is — we read the customer's lifecycle status from the database and pick the mode there.
Mode: Sales
Pre-enrolment touchpoints
Only sales-tagged knowledge chunks are searchable (consultation scripts, programme description, objection handling). The AI uses the sales prompt from coach settings.
Mode: Coaching
Active care
Only coaching chunks are searchable (nutrition, supplements, thyroid, cycle, etc.). The AI uses the coaching prompt — different voice, different scope.
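The selection rule can be sketched as a pure function of the stored lifecycle status. The status values, the `ModeConfig` shape, and the prompt keys below are assumptions made for illustration; the source only states that leads get the sales mode and active clients the coaching mode.

```typescript
// Hypothetical lifecycle values; the real status names are not given in the text.
type LifecycleStatus = "lead" | "consultation_booked" | "enrolled" | "active";
type Mode = "sales" | "coaching";

interface ModeConfig {
  mode: Mode;
  chunkTag: Mode; // only knowledge chunks carrying this tag are searchable
  promptKey: "salesPrompt" | "coachingPrompt"; // assumed keys in coach settings
}

// The mode is derived from data alone; no LLM call is involved.
function selectMode(status: LifecycleStatus): ModeConfig {
  const preEnrolment = status === "lead" || status === "consultation_booked";
  return preEnrolment
    ? { mode: "sales", chunkTag: "sales", promptKey: "salesPrompt" }
    : { mode: "coaching", chunkTag: "coaching", promptKey: "coachingPrompt" };
}
```

Keeping this deterministic means the same customer record always yields the same mode, which makes the behaviour testable without involving a model.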
Personal data protection
Personal data never enters the knowledge base in the first place.
The textbook approach is to encrypt PII at rest and rely on access control. We go further: by the time a chunk is embedded, every name, address, phone number and date of birth has been replaced with a placeholder. The original is never indexed.
Regex layer (fast)
Email addresses, phone numbers (DE/AT/CH formats), street addresses, dates of birth, insurance numbers — all replaced with placeholders like [EMAIL], [PHONE], [ADDRESS].
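A minimal sketch of such a regex pass is shown below. The patterns are deliberately simplified and are assumptions, not the production rules: they cover e-mail addresses, phone numbers with a +49/+43/+41 prefix or a leading 0, and dates written as DD.MM.YYYY. The `[DOB]` placeholder name is also assumed.

```typescript
// Simplified, illustrative rules; real coverage would need to be broader.
const REDACTION_RULES: { placeholder: string; pattern: RegExp }[] = [
  {
    placeholder: "[EMAIL]",
    pattern: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g,
  },
  {
    // +49 / +43 / +41 or a national leading 0, then 7 to 14 digits
    // with optional space, slash or hyphen separators.
    placeholder: "[PHONE]",
    pattern: /(?:\+4[139]|0)[\s\/-]?\d(?:[\s\/-]?\d){6,13}/g,
  },
  {
    // DD.MM.YYYY
    placeholder: "[DOB]",
    pattern: /\b\d{1,2}\.\d{1,2}\.\d{4}\b/g,
  },
];

// Applies every rule in order and returns the redacted text.
function redact(text: string): string {
  return REDACTION_RULES.reduce(
    (acc, rule) => acc.replace(rule.pattern, rule.placeholder),
    text
  );
}

const redacted = redact(
  "Call +49 170 1234567 or mail anna@example.com, born 03.11.1985."
);
// redacted === "Call [PHONE] or mail [EMAIL], born [DOB]."
```

Regex rules are cheap enough to run on every chunk, which is why they sit in front of the slower name-detection step.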
AI name detection
A name-extraction model finds personal names (first and last names, with titles such as Dr. or Prof.) and replaces them with [NAME]. It catches names the regex layer misses, including German compound names and academic titles.
Filename sanitisation
A document called 'Application Plan Maria Schmidt.pdf' is renamed to 'Application Plan [NAME].pdf' before its title even reaches the chunk index.
Chat-export awareness
WhatsApp transcripts are recognised on import and sender names are redacted automatically. The text the AI sees never carries the original sender's name.
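Sender redaction for chat exports might look like the sketch below. It assumes the Android-style export line format `12.03.24, 14:05 - Sender Name: message`; other export formats (for example the bracketed iOS variant) would need their own pattern. The function names and the 50% detection threshold are hypothetical.

```typescript
// Assumed line format: "DD.MM.YY, HH:MM - Sender Name: message"
const WHATSAPP_LINE = /^(\d{1,2}\.\d{1,2}\.\d{2,4}, \d{1,2}:\d{2} - )([^:]+)(: )/;

// Treats a file as a chat export when at least half of its lines match.
function looksLikeWhatsAppExport(lines: string[]): boolean {
  const hits = lines.filter((line) => WHATSAPP_LINE.test(line)).length;
  return lines.length > 0 && hits / lines.length >= 0.5;
}

// Replaces the sender part of every matching line with [NAME].
// Names inside the message body are left to the name-detection layer.
function redactSenders(transcript: string): string {
  const lines = transcript.split("\n");
  if (!looksLikeWhatsAppExport(lines)) {
    return transcript;
  }
  return lines
    .map((line) => line.replace(WHATSAPP_LINE, "$1[NAME]$3"))
    .join("\n");
}
```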
Cross-tenant learning
The system learns from every coach. No coach learns from another coach's customers.
Coaches benefit from system-wide experience patterns. A pattern is only ever extracted from anonymised feedback signals — never raw conversations, never customer names, never specific health values.
What patterns contain
- A general claim (e.g. 'Iron bisglycinate is better tolerated than ferrous sulphate')
- Confidence (0–1, derived from feedback signal count)
- Evidence count (how many anonymised events support the pattern)
- Recency weight (newer patterns are preferred)
- Conflict group (so contradictory patterns can't both surface)
What patterns never contain
- Customer names or any identifier
- Specific lab values or dosages
- Conversation snippets
- Tenant identifiers — patterns are stored once, queried by every tenant
- Anything that could be inverted to a person
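The two lists above translate naturally into a record type. The sketch below is an assumption about how such a pattern could be shaped and how conflict groups might be resolved; the field names and the `confidence * recencyWeight` score are illustrative, not the system's actual schema or ranking rule. Note that the type has no tenant or customer field at all.

```typescript
// Hypothetical shape mirroring the listed fields; names are assumptions.
interface CoachingPattern {
  claim: string;
  confidence: number; // 0..1
  evidenceCount: number; // number of supporting anonymised events
  recencyWeight: number; // higher means newer
  conflictGroup: string;
}

// Keeps at most one pattern per conflict group: the one with the highest
// confidence * recencyWeight (an assumed scoring rule).
function resolveConflicts(patterns: CoachingPattern[]): CoachingPattern[] {
  const best: Record<string, CoachingPattern> = {};
  for (const p of patterns) {
    const current = best[p.conflictGroup];
    const score = p.confidence * p.recencyWeight;
    if (!current || score > current.confidence * current.recencyWeight) {
      best[p.conflictGroup] = p;
    }
  }
  return Object.keys(best).map((group) => best[group]);
}
```

Leaving identifiers out of the type, rather than filtering them at read time, means there is no field in which a name or lab value could be stored by accident.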
Data pipeline
From document upload to AI suggestion, one path.
Every document a coach drops into Google Drive runs through the same pipeline. Every step is observable, reversible, and bounded by the redaction layer at the front.
Sync
The Google Drive folder is polled every 4 hours (or on demand). A content-hash diff ensures that unchanged files are not re-indexed.
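A content-hash diff of this kind can be sketched as a comparison of two maps from file id to hash. The `planSync` function and its types are hypothetical, the hashes are assumed to be computed elsewhere (for example as SHA-256 digests), and removing index entries for files that disappeared is an assumption the text does not confirm.

```typescript
// fileId -> content hash, computed during sync (hash algorithm assumed).
type HashIndex = Record<string, string>;

interface SyncPlan {
  toIndex: string[]; // new files or files whose content changed
  toRemove: string[]; // files that no longer exist (assumed behaviour)
  unchanged: string[]; // skipped entirely
}

function planSync(previous: HashIndex, current: HashIndex): SyncPlan {
  const plan: SyncPlan = { toIndex: [], toRemove: [], unchanged: [] };
  for (const id of Object.keys(current)) {
    if (previous[id] === current[id]) {
      plan.unchanged.push(id);
    } else {
      plan.toIndex.push(id);
    }
  }
  for (const id of Object.keys(previous)) {
    if (!(id in current)) {
      plan.toRemove.push(id);
    }
  }
  return plan;
}
```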
Extract
PDF, DOCX, Google Docs, Sheets, CSV. Scanned PDFs are routed through OCR (Gemini Flash). Empty pages are dropped.
Redact
The two-layer PII pass described above (regex first, then AI name detection). The filename is sanitised here too.
Chunk
500-token chunks with 50-token overlap, cut at sentence boundaries. Section headings are preserved as metadata on each chunk.
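The sketch below illustrates sentence-aligned chunking with overlap. It makes two loud simplifications: a "token" is approximated as a whitespace-separated word, whereas the real pipeline presumably counts model tokens, and sentences are split naively on ., ! and ?. Function names are hypothetical, and heading metadata is omitted.

```typescript
// Approximation: one whitespace-separated word counts as one token.
function countTokens(s: string): number {
  const t = s.trim();
  return t === "" ? 0 : t.split(/\s+/).length;
}

// Naive sentence splitter; abbreviations and decimals will be split too.
function splitSentences(text: string): string[] {
  const parts = text.match(/[^.!?]+[.!?]+|[^.!?]+$/g) || [];
  return parts.map((s) => s.trim()).filter((s) => s.length > 0);
}

function chunkText(text: string, maxTokens = 500, overlapTokens = 50): string[] {
  const chunks: string[] = [];
  let current: string[] = [];
  let currentTokens = 0;

  for (const sentence of splitSentences(text)) {
    const n = countTokens(sentence);
    if (current.length > 0 && currentTokens + n > maxTokens) {
      chunks.push(current.join(" "));
      // Carry over whole trailing sentences worth at most overlapTokens.
      const carried: string[] = [];
      let carriedTokens = 0;
      for (let i = current.length - 1; i >= 0; i--) {
        const m = countTokens(current[i]);
        if (carriedTokens + m > overlapTokens) break;
        carried.unshift(current[i]);
        carriedTokens += m;
      }
      current = carried;
      currentTokens = carriedTokens;
    }
    // A single sentence longer than maxTokens is kept whole in this sketch.
    current.push(sentence);
    currentTokens += n;
  }
  if (current.length > 0) chunks.push(current.join(" "));
  return chunks;
}
```

Cutting only at sentence boundaries means the overlap is usually a little under 50 tokens rather than exactly 50, which is the price of never splitting a sentence.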
Embed
text-embedding-3-large (1,536-dim) via OpenRouter. Embeddings are stored in pgvector, alongside a German-language full-text index (tsvector).
Retrieve
Per message: scope filter → vector search + full-text search → Reciprocal Rank Fusion → optional cross-encoder rerank → top 5–10 chunks to the AI.
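The fusion step in that chain can be illustrated with a short Reciprocal Rank Fusion sketch. Each result list contributes 1 / (k + rank) per chunk, and the contributions are summed. The constant k = 60 is the conventional default for RRF and is an assumption here, as are the function name and the use of plain chunk ids; the scope filter and the optional rerank are not shown.

```typescript
// rankings: one array of chunk ids per retriever, best hit first.
function reciprocalRankFusion(
  rankings: string[][],
  k = 60
): { id: string; score: number }[] {
  const scores: Record<string, number> = {};
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      // Ranks are 1-based, so the top hit contributes 1 / (k + 1).
      scores[id] = (scores[id] || 0) + 1 / (k + index + 1);
    });
  }
  return Object.keys(scores)
    .map((id) => ({ id, score: scores[id] }))
    .sort((a, b) => b.score - a.score);
}

// Example: vector search and full-text search disagree on the order.
const fused = reciprocalRankFusion([
  ["a", "b", "c"], // vector search
  ["b", "d", "a"], // full-text search
]);
// fused ids, best first: b, a, d, c
```

RRF only uses ranks, so the vector distances and full-text scores never need to be put on a common scale.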
Infrastructure
Where the data lives, and where it doesn't go.
EU-only datacenters
- Hosted at Hetzner Cloud (Frankfurt / Helsinki)
- No US datacenters in the request path
- PostgreSQL, pgvector, Redis — all in-region
- Backups encrypted, EU-resident
Encryption & access
- Phone numbers AES-256 encrypted at rest
- Per-tenant credentials via OpenBao (HashiCorp Vault fork)
- Row-level filtering by tenantId + customerId on every query
- Audit log of every read/write, retained 7 years
LLM access
- Routed through OpenRouter under enterprise DPA
- No-training-on-customer-data terms
- Default model: Claude Sonnet 4.6 (Anthropic)
- No human at any provider routinely reviews data
Compliance
GDPR you can run, not just talk about.
Most privacy promises are policies. These are endpoints — they exist in the code, can be invoked by the customer, and produce a verifiable result.
Art. 17
Right to erasure
A single request anonymises every personal-data field for the customer. Identifiers are replaced with ANONYMIZED, message bodies are deleted. Audit log entries remain — required for compliance — but carry no path back to the person.
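As a rough illustration of the erasure step, the sketch below overwrites identifying fields with the literal ANONYMIZED and clears message bodies for one customer. The record shapes, the field list, and the choice to null the body rather than drop the row are all assumptions; the actual endpoint and schema are not shown in the text.

```typescript
// Hypothetical record shapes for illustration only.
interface CustomerRecord {
  id: string;
  name: string;
  email: string;
  phone: string;
  lifecycleStatus: string;
}

interface StoredMessage {
  customerId: string;
  body: string | null;
}

// Returns anonymised copies; the inputs are not mutated.
function eraseCustomer(
  customer: CustomerRecord,
  messages: StoredMessage[]
): { customer: CustomerRecord; messages: StoredMessage[] } {
  const anonymised: CustomerRecord = {
    ...customer,
    name: "ANONYMIZED",
    email: "ANONYMIZED",
    phone: "ANONYMIZED",
  };
  // Only this customer's messages are touched (assumed: body set to null).
  const scrubbed = messages.map((m) =>
    m.customerId === customer.id ? { ...m, body: null } : m
  );
  return { customer: anonymised, messages: scrubbed };
}
```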
Art. 20
Data portability
A full export of profile, messages, sessions, health data, insights and audit log entries — generated on demand, delivered as a structured archive.
Limited Use
Google API compliance
Customer Drive data is used only for the feature the customer enabled. Not for advertising, not for training general-purpose models, not subject to routine human review.
The full legal text — including OAuth scopes, sub-processor list, and retention periods — lives in the Agiliton CRM privacy policy.
Want to talk?
We're happy to walk a technical buyer or a DPO through the architecture, run live queries against a sandbox tenant, or share the schema diagram.
Reach us at service@agiliton.eu.