EU retail banking · Guarded agent · Cloud + on-prem

Trusted agent platform

Tessera

A guarded agentic AI system for regulated banking support

Built with LangGraph, retrieval, guarded tool execution, audit envelopes and multilingual regression tests across French, German and English

Languages: FR / DE / EN
Eval set: 40 failure cases
Domain: EU banking
Grounding: 4 corpora

Demo and agent UI

Start with the guided walkthrough, then open the interface to chat and to inspect live traces, guarded execution and escalation flows in action

Watch demo video (2:00): a guided fraud-support scenario showing guard decisions, audit evidence and escalation
tessera.ui/session/urgent-card
Chat: guarded · Audit: live · Eval: 40 cases

Customer

Ma carte a été volée. Peux-tu vérifier mon compte ? (My card was stolen. Can you check my account?)

Tessera

I can help, but account lookup is blocked until identity is verified. I am escalating this with redacted evidence

Live guard trace

Reviewer-ready evidence

SAFE HANDOFF
  1. classify: urgent_card
  2. guard: identity_required
  3. decision: block account_lookup
  4. handoff: human_escalation
FR / DE / EN

Try agent UI

Open the interactive interface to explore live traces and guarded execution

Before action

Runtime

Tool arguments checked before execution

Replayable

Regression

Failure catalogue lives as JSON test cases

Structured

Audit

Every guard decision is inspectable

Explicit

Escalation

Low confidence becomes a handoff

Scope

Not a new firewall, not a new benchmark

A concrete assembly for one regulated European banking support workflow

3

Languages

French, German, English

40

Failure cases

CI-gated non-regression set

4

Quality layers

Guard, eval, audit, escalation

4

Regulatory corpora

DORA, CNIL, BaFin, GDPR

What it proves

A bank-support agent that can be inspected, tested, and safely handed off

Tessera shows that guarded tool calls, multilingual grounding, audit evidence, and escalation paths can work together in one concrete banking support flow

Multilingual support agent for Crédit Aurore
Guarded function calls for sensitive banking actions
Traceable audit evidence for reviewer and operator workflows
Non-regression catalogue covering known agent failure patterns

mcp-firewall

Runtime guard

Sensitive tool calls are checked before execution with allow, deny, transform, and redaction decisions
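
A minimal sketch of what a pre-execution check with these four decisions could look like. The rules, the `ToolCall` and `GuardResult` types, the tool and field names, and the result-limit cap are all assumptions made for illustration; this is not the mcp-firewall API or Tessera's actual policy.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(str, Enum):
    ALLOW = "allow"
    DENY = "deny"
    TRANSFORM = "transform"
    REDACT = "redact"


@dataclass
class ToolCall:
    tool: str
    args: dict
    identity_verified: bool = False


@dataclass
class GuardResult:
    decision: Decision
    rule: str                      # name of the rule that fired
    args: dict = field(default_factory=dict)  # arguments after the decision


# Hypothetical policy data, chosen only to make the sketch concrete.
SENSITIVE_TOOLS = {"account_lookup", "card_block", "transaction_search"}
PII_FIELDS = {"iban", "card_number"}
MAX_RESULTS = 50


def check(call: ToolCall) -> GuardResult:
    """Decide what happens to a tool call before it is executed."""
    # Deny: sensitive tools require a verified identity.
    if call.tool in SENSITIVE_TOOLS and not call.identity_verified:
        return GuardResult(Decision.DENY, "identity_required")
    # Transform: rewrite an over-broad argument instead of refusing.
    if call.tool == "transaction_search" and call.args.get("limit", 0) > MAX_RESULTS:
        return GuardResult(
            Decision.TRANSFORM, "cap_result_limit", {**call.args, "limit": MAX_RESULTS}
        )
    # Redact: mask sensitive fields in the copy that is passed on.
    if PII_FIELDS.intersection(call.args):
        masked = {k: "[REDACTED]" if k in PII_FIELDS else v for k, v in call.args.items()}
        return GuardResult(Decision.REDACT, "redact_pii", masked)
    return GuardResult(Decision.ALLOW, "default_allow", dict(call.args))
```

In this sketch, `check(ToolCall("account_lookup", {"customer_id": "c1"}))` is denied under `identity_required`, which mirrors the stolen-card trace shown earlier on this page.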

Regression suite

Offline tests

Documented failure cases are replayed against the agent graph so that any regression in safety behavior fails loudly

JSON evidence

Structured audit

Each guard decision records the policy rule, rationale, target tool, redacted arguments, and operational timestamps
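
As an illustration, one such record could be modelled as a small immutable structure serialized to a JSON line. The field names and the single `decided_at` timestamp are assumptions; the actual audit envelope may carry more fields or different names.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditRecord:
    policy_rule: str      # which rule fired
    rationale: str        # human-readable reason for the decision
    target_tool: str      # tool the call was aimed at
    decision: str         # e.g. "allow", "deny", "transform", "redact"
    redacted_args: dict   # arguments with sensitive values already masked
    decided_at: str       # ISO 8601 timestamp in UTC


def make_record(policy_rule: str, rationale: str, target_tool: str,
                decision: str, redacted_args: dict) -> AuditRecord:
    """Build a record and stamp it with the current UTC time."""
    return AuditRecord(
        policy_rule=policy_rule,
        rationale=rationale,
        target_tool=target_tool,
        decision=decision,
        redacted_args=dict(redacted_args),
        decided_at=datetime.now(timezone.utc).isoformat(),
    )


def to_json_line(record: AuditRecord) -> str:
    """Serialize with stable key order so records diff cleanly."""
    return json.dumps(asdict(record), ensure_ascii=False, sort_keys=True)
```

Sorting keys is a small design choice: it keeps two records for the same decision byte-comparable apart from the timestamp, which helps when a reviewer compares replays.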

Reviewer node

Human escalation

Low-confidence or high-stakes turns route to escalation instead of pretending the answer is certain
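
The routing rule can be sketched as a single function that picks the next node. The threshold value, the intent labels and the node names below are hypothetical; a function of this shape is the kind of thing a graph framework would use as a conditional edge.

```python
from dataclasses import dataclass

CONFIDENCE_FLOOR = 0.75                                # hypothetical threshold
HIGH_STAKES_INTENTS = {"urgent_card", "fraud_report"}  # hypothetical labels


@dataclass
class Turn:
    intent: str
    confidence: float  # classifier confidence in [0, 1]


def next_node(turn: Turn) -> str:
    """Return the name of the node that should handle this turn."""
    # High-stakes intents escalate regardless of how confident the model is.
    if turn.intent in HIGH_STAKES_INTENTS:
        return "human_escalation"
    # Low confidence becomes a handoff rather than a guess.
    if turn.confidence < CONFIDENCE_FLOOR:
        return "human_escalation"
    return "respond"
```

Checking the stakes before the confidence is deliberate in this sketch: a very confident classification of a stolen-card request should still reach a human.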

Multilingual operating surface

Same banking product, three language and regulator contexts

Each language path shows how Tessera routes the request, applies the relevant regulator context, checks the guard, and keeps the customer reply controlled

France

FR · CNIL + GDPR

Je veux contester un paiement carte visible depuis hier (I want to dispute a card payment that has been visible since yesterday)
route: transaction_search -> reviewer
guard: verify identity first
reply: French, bounded, cited

Germany

DE · BaFin + DORA

Meine Karte wurde gestohlen. Kannst du mein Konto prüfen? (My card was stolen. Can you check my account?)
route: urgent_card -> escalation
guard: block account lookup
reply: German, no disclosure

Cross-border EU

EN · GDPR + DORA

Can you explain why my loan simulation changed?
route: loan_simulate -> audit
guard: redact sensitive fields
reply: English, grounded
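
The pairing of language and regulator context in the three paths above can be expressed as a lookup table. The pairs come from this page; the function name, the locale normalization and the choice to raise on unsupported locales are assumptions for the sketch.

```python
# Language -> regulator context, as listed for the three paths above.
REGULATOR_CONTEXT: dict[str, tuple[str, ...]] = {
    "fr": ("CNIL", "GDPR"),
    "de": ("BaFin", "DORA"),
    "en": ("GDPR", "DORA"),
}


def regulator_context(locale: str) -> tuple[str, ...]:
    """Map a locale such as 'fr', 'fr-FR' or 'de_DE' to its regulator context."""
    language = locale.strip().lower().replace("_", "-").split("-")[0]
    try:
        return REGULATOR_CONTEXT[language]
    except KeyError:
        # Assumption: an unsupported locale is an error, not a silent default.
        raise ValueError(f"unsupported locale: {locale!r}") from None
```

Failing on an unknown locale, rather than falling back to English, keeps the agent from answering under the wrong regulator context.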

Regression scorecard

The safety claim is backed by replayable failures

Tessera treats known agent failures as JSON test cases. Each case declares the failure pattern, the check that must pass, and the expected bounded behavior before the demo can be trusted
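
To make the shape concrete, here is one way such a case could be written and validated before replay. The JSON field names, the value spellings and the `fr` locale on this example are assumptions; only the idea of declaring failure pattern, check and expected behavior comes from the description above.

```python
import json

REQUIRED_FIELDS = {"id", "locale", "failure", "check", "expected"}  # hypothetical schema
SUPPORTED_LOCALES = {"fr", "de", "en"}

# Hypothetical encoding of a prompt-injection case.
CASE_JSON = """
{
  "id": "01",
  "locale": "fr",
  "failure": "prompt_injection",
  "check": "forbidden_phrase_and_tool_boundary",
  "expected": "refuse_or_transform"
}
"""


def load_case(raw: str) -> dict:
    """Parse one case and reject it early if the declaration is incomplete."""
    case = json.loads(raw)
    missing = REQUIRED_FIELDS.difference(case)
    if missing:
        raise ValueError(f"case is missing fields: {sorted(missing)}")
    if case["locale"] not in SUPPORTED_LOCALES:
        raise ValueError(f"unsupported locale: {case['locale']!r}")
    return case
```

Validating the declaration separately from the replay means a malformed case fails as a schema error rather than as a misleading safety failure.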

Catalogue

40 cases

Locales

FR / DE / EN

Gate

eval.yml

Output

JSON + markdown

Case | Failure | Check | Expected | Status
#01 | Prompt injection | forbidden phrase + tool boundary | refuse or transform | guarded
#02 | PII leakage | redaction + disclosure limit | redact evidence | redacted
#03 | Citation hallucination | grounded source required | cite corpus only | grounded
#04 | Overconfident action | must_escalate path | human review | escalated

Scenario spotlight

The difference is visible when the request is risky

An unguided agent might answer with whatever tool result is easiest to fetch. Tessera treats the banking situation as the product surface: urgency, ownership, disclosure risk, and escalation all change the route before a tool is allowed to run

Recognizes stolen-card urgency before ordinary account lookup
Avoids exposing account data while fraud context is unresolved
Escalates to a human path with structured evidence attached

Customer turn

French support case

Ma carte a été volée et je vois des paiements que je n'ai pas faits. Pouvez-vous vérifier mon compte maintenant ? (My card was stolen and I see payments I did not make. Can you check my account now?)

The request combines urgency and fraud. The account lookup path is no longer the right default

Route contrast

Same customer message, different operating discipline

Looks helpful, leaks context

Naive agent route

  1. Fetch account data
  2. Show suspicious payments
  3. Try to block the card late

The assistant optimizes for an answer before ownership, urgency, and disclosure risk are resolved

Guarded, auditable handoff

Tessera route

  1. Classify fraud urgency
  2. Block unsafe lookup path
  3. Escalate with redacted evidence

The system keeps the user moving toward help while preserving a reviewer-ready trail

Controlled execution path

Every useful action leaves a trail

Each customer turn moves through routing, retrieval, guard checks, audit emission, and a final response or escalation path

01

Route

Classify language, intent, and urgency before planning

02

Retrieve

Pull product and regulation context from pgvector

03

Guard

Check sensitive tool calls through policy before execution

04

Audit

Emit structured evidence for reviewer and operator views

05

Respond

Answer, decline, or escalate with confidence signals
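
The ordering of the five steps can be sketched in plain Python. Every step below is a stub with made-up logic, and the state keys are assumptions; Tessera itself is built on LangGraph, so this only illustrates that each turn passes through the same sequence and leaves an audit entry behind.

```python
State = dict


def route(state: State) -> State:
    # Stub classifier: a real router would detect language, intent and urgency.
    text = state["message"].lower()
    stolen = any(word in text for word in ("volée", "volee", "gestohlen", "stolen"))
    return {**state, "intent": "urgent_card" if stolen else "general"}


def retrieve(state: State) -> State:
    # Stub: stands in for a lookup of product and regulation context.
    return {**state, "context": []}


def guard(state: State) -> State:
    # Block the lookup when the turn is urgent and identity is unverified.
    blocked = state["intent"] == "urgent_card" and not state["identity_verified"]
    return {**state, "guard": "block account_lookup" if blocked else "allow"}


def audit(state: State) -> State:
    # Append one evidence entry per turn.
    entry = {"intent": state["intent"], "guard": state["guard"]}
    return {**state, "audit": [*state.get("audit", []), entry]}


def respond(state: State) -> State:
    outcome = "answer" if state["guard"] == "allow" else "escalate"
    return {**state, "outcome": outcome}


PIPELINE = (route, retrieve, guard, audit, respond)


def run_turn(message: str, identity_verified: bool = False) -> State:
    """Run one customer turn through all five steps in order."""
    state: State = {"message": message, "identity_verified": identity_verified}
    for step in PIPELINE:
        state = step(state)
    return state
```

Because the audit step runs before the response step, the evidence entry exists whether the turn ends in an answer or in an escalation.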

Delivery assurance

Trust comes from the delivery chain, not from a polished interface

The strongest signal is not a single UI screen. It is the chain from local quality gates to deployed infrastructure, with an on-prem path when regulation changes the deployment boundary

Build gate

ruff, mypy, pytest

The system is framed as software that must keep compiling, typing, and replaying

Safety gate

40 failure replays

Known agent failures are catalogued as JSON test cases instead of left as anecdotes

Hosted path

Cloud Run + Cloud SQL

The frontier path is deployable with managed runtime, secrets, logs, and monitoring

Local path

Ollama on Apple Silicon

The on-prem mode keeps the banking story credible when data cannot leave the perimeter

Try the system

A dedicated slot for the public agent UI

Visitors should be able to leave the case study and test Tessera for themselves. This block will point to the deployed dashboard as soon as the public URL is available

Open agent UI: Chat · audit · eval

Chat workbench

Test the multilingual banking-support flow with guarded tool calls

Audit trail

Inspect guard decisions, redactions, policy rules, and reviewer evidence

Eval scorecard

Replay known failure cases and see what still needs work

Operating evidence

Architecture and evaluation flow

The diagrams show how Tessera is assembled: agent orchestration, guarded tools, audit evidence, cloud and on-prem paths, and evaluation checks

System architecture

LangGraph orchestration, retrieval, guarded tools, audit sinks, and dual LLM deployment paths

Non-regression tests

Failure cases move through schema validation, multilingual replay, scoring, and CI gates
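
The scoring and gate stages could be as small as the sketch below, which turns replay results into a per-locale summary and a process exit code. The result fields (`id`, `locale`, `passed`) and the summary layout are assumptions, not the project's actual report format.

```python
from collections import Counter


def summarize(results: list[dict]) -> dict:
    """Aggregate replay results; any failed case makes the gate fail."""
    total = Counter(r["locale"] for r in results)
    passed = Counter(r["locale"] for r in results if r["passed"])
    failed_ids = sorted(r["id"] for r in results if not r["passed"])
    return {
        "per_locale": {loc: f"{passed[loc]}/{total[loc]}" for loc in sorted(total)},
        "failed": failed_ids,
        # A non-zero exit code is what lets a CI job block the merge.
        "exit_code": 1 if failed_ids else 0,
    }
```

A CI step would typically pass `summarize(results)["exit_code"]` to `sys.exit`, so a single failing case is enough to stop the build.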

Stack snapshot

Agent

LangGraph · FastAPI · Python 3.12 · uv

Retrieval

PostgreSQL · pgvector · Hybrid search · Regulatory corpora

LLM paths

Vertex AI · Cloud Run · Ollama · Llama 3.3 70B

Quality

mcp-firewall · pytest · mypy strict · ruff

Operating model

From customer signal to reviewer-ready evidence

Tessera treats banking support as a controlled workflow: language, intent, policy, tool permission, audit evidence, and escalation remain visible from the first message to the final handoff

Most agent systems

  • Unguarded tool calls
  • No replay evidence
  • Manual-only validation
  • English-only compliance

What Tessera makes visible

  • mcp-firewall + YAML policy
  • Structured JSON audit trail
  • 40 failure cases, CI-gated
  • FR / DE / EN regulatory routing

Architecture summary

Router, planner, guarded tools and reviewer remain explicit

Router · Planner · Workers · Reviewer

Graph orchestration

Router, planner, reviewer and workers are separated so useful work and controlled action remain distinct

Guarded tool boundary

Account lookup, card blocking and transaction search stay behind policy checks and auditable decisions

Cloud and on-prem paths

Vertex AI and Cloud Run cover the frontier path; Ollama and Llama 3.3 70B keep an on-prem option explicit

Shipped

validated

LangGraph agent, FR / DE / EN prompts, audit trail, guard adapter, JSON eval test cases

Hardening

next

Public demo URL, mcp-firewall upstream contribution, German escalation calibration

Open-source & transparent

Inspect the assembly, not a black box

Tessera stays honest about what it contributes: an end-to-end regulated assembly with reusable dependencies, visible guardrails and documented failures

Open channels

Follow the work

AI engineering, data platforms and applied machine learning, shared through practical case studies and shipped systems

© 2026 Amadou Mamane. Built with Next.js and Tailwind CSS.