STAGING: Non production environment. Data branched from prod, crons disabled.
Orkestra

Trust

How the Orkestra engine knows what it knows.

We classify regulatory publications across nine regimes. We score posture against an entity’s licence perimeter. We say what we do not know. Every claim is sourced. The numbers below are refreshed weekly from our held out evaluation set.

What the engine does

The Orkestra engine harvests publications from sixteen EU and national regulatory authorities twice daily. Each publication is classified by regime and severity. Items are scoped to the licence perimeter of each tenant. A posture engine produces gap verdicts (NEW_GAP, WIDENS_GAP, TIGHTENS, NO_CHANGE) against the tenant’s evidence corpus, with a confidence band per verdict. An AI Analyst returns grounded answers with mandatory citations to the source text. Every prepared artefact is logged in the audit trail with the engine version, the confidence per claim, and the citation list, and is signed under the Customer Co Sign mechanic.
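One way to picture the audit record described above is the following minimal sketch. The four verdict names come from this page. The class names, field names, and validation rules are illustrative assumptions, not the engine's actual schema.

```python
from dataclasses import dataclass
from enum import Enum


class GapVerdict(Enum):
    # The four verdict labels named on this page.
    NEW_GAP = "NEW_GAP"
    WIDENS_GAP = "WIDENS_GAP"
    TIGHTENS = "TIGHTENS"
    NO_CHANGE = "NO_CHANGE"


@dataclass(frozen=True)
class Claim:
    # Hypothetical shape: one claim, its confidence, and its citations.
    text: str
    confidence: float
    citations: tuple

    def __post_init__(self):
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be in [0, 1]")
        if not self.citations:
            raise ValueError("every claim needs at least one citation")


@dataclass(frozen=True)
class AuditEntry:
    # Hypothetical audit trail row for one prepared artefact.
    engine_version: str
    regime: str
    verdict: GapVerdict
    claims: tuple

    def citation_list(self):
        # Citations across all claims, de-duplicated, in first-seen order.
        seen = []
        for claim in self.claims:
            for citation in claim.citations:
                if citation not in seen:
                    seen.append(citation)
        return seen
```

Rejecting a claim with no citations at construction time mirrors the "mandatory citations" rule: an uncited claim cannot reach the audit trail in this sketch.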

What the engine does not do

The engine does not provide legal advice and does not act as a professional advisor. The engine flags low confidence verdicts and routes them to the tenant’s accountable officer for human judgement. If a publication addresses a regime, jurisdiction, or scenario not covered by our corpus, the AI Analyst declines and flags the question for officer attention. We publish our weekly evaluation results below: precision, recall, calibration of the confidence bands, and the false negative rate on a held out set, evaluated per regime.
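The routing rule above can be sketched as a small decision function. This page does not state the confidence cutoff, so the `LOW_CONFIDENCE_CUTOFF` value, the `Verdict` fields, and the route names are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical cutoff; the page does not publish the actual value.
LOW_CONFIDENCE_CUTOFF = 0.70


@dataclass
class Verdict:
    regime: str
    label: str
    confidence: float
    in_corpus: bool = True  # False when the regime or scenario is not covered


def route(verdict, cutoff=LOW_CONFIDENCE_CUTOFF):
    # Out-of-corpus questions are declined and flagged for the officer.
    if not verdict.in_corpus:
        return "DECLINE_AND_FLAG"
    # Low confidence verdicts go to the accountable officer for judgement.
    if verdict.confidence < cutoff:
        return "OFFICER_REVIEW"
    return "STANDARD"
```

The out-of-corpus check runs first so that a confident answer on an uncovered regime is still declined.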

Latest weekly eval

What the classifier scored on the latest held out batch.

The contractual quality threshold per Service Schedule X.7 is precision at least 0.95, recall at least 0.92, Brier at most 0.05. The aggregate batch accuracy below is what the eval pipeline records today. Per regime precision and recall ship with the V1.7.x eval extension.
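For readers who want the three threshold metrics spelled out, here is a minimal sketch using the standard definitions. It treats one regime as a binary one-vs-rest problem, which is an assumption: this page does not describe how the eval pipeline computes the figures. The function names are illustrative.

```python
def precision_recall(y_true, y_pred):
    # y_true and y_pred are sequences of 0/1 labels for one regime.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def brier_score(y_true, probabilities):
    # Mean squared difference between predicted probability and outcome.
    # Lower is better; 0.0 is a perfect score.
    n = len(y_true)
    return sum((p - t) ** 2 for t, p in zip(y_true, probabilities)) / n


def meets_thresholds(precision, recall, brier):
    # Thresholds as stated on this page.
    return precision >= 0.95 and recall >= 0.92 and brier <= 0.05
```

Precision and recall have floors while Brier has a ceiling, so the comparison direction differs for the third metric.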

Aggregate accuracy

1.000

4 of 4 fixtures

Drift

0

Failed cases logged with expected and actual

Suite

classification-v161

Classifier v1.6.1 on claude-sonnet

Batch

2026-05-24

Refreshes every Sunday from the cron pipeline

Per regime breakdown

Per regime precision, recall, and Brier ship with the V1.7.x eval extension. Until then, the aggregate accuracy above is the only published figure and stands in for every regime. The contractual quality threshold above applies per regime once the breakdown lands.

MiCA: pending
AMLR: pending
DORA: pending
GDPR: pending
NIS2: pending
PSD2: pending
EMD2: pending
MiFID II: pending
IFR/IFD: pending

Confidence calibration

How well our confidence numbers match reality.

X axis is the confidence the engine assigned to a verdict, in ten bins from zero to one. Y axis is the empirical accuracy on the held out set per bin. The diagonal is the ideal calibration line. The engine curve is overlaid once the latest weekly batch lands.
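The chart described above can be computed along these lines. This is a minimal sketch of standard equal-width binning. The function name and return shape are assumptions, not the eval pipeline's actual code.

```python
def calibration_bins(confidences, correct, n_bins=10):
    # confidences: engine confidence per verdict, each in [0, 1].
    # correct: whether each verdict matched the held out label.
    counts = [0] * n_bins
    conf_sums = [0.0] * n_bins
    hits = [0] * n_bins
    for conf, ok in zip(confidences, correct):
        if not 0.0 <= conf <= 1.0:
            raise ValueError("confidence must be in [0, 1]")
        # A confidence of exactly 1.0 falls into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        counts[idx] += 1
        conf_sums[idx] += conf
        hits[idx] += 1 if ok else 0
    curve = []
    for i in range(n_bins):
        if counts[i] == 0:
            curve.append(None)  # empty bin, nothing to plot
        else:
            mean_conf = conf_sums[i] / counts[i]
            accuracy = hits[i] / counts[i]
            curve.append((mean_conf, accuracy, counts[i]))
    return curve
```

Each non-empty bin yields one point (mean confidence, empirical accuracy). A well calibrated engine produces points close to the diagonal. The per-bin count is returned because sparsely populated bins give noisy accuracy estimates.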

Chart: predicted confidence (X axis) against empirical accuracy (Y axis). Engine curve refreshes from /admin/eval every Sunday.

Quality threshold

A service level control on the eval set, not a deliverable warranty.

The thresholds are precision 0.95, recall 0.92, and Brier 0.05, measured per regime on the documented held out evaluation set. They are service level controls on the platform, not warranties that any individual prepared Deliverable is complete or fit for filing without your officer’s review.

If a weekly evaluation falls below an applicable threshold for a regime, Orkestra pauses Deliverable generation for that regime, notifies affected customers in writing within twenty four hours, conducts a Practitioner of Record review, and resumes only after documented remediation. Generation in unaffected regimes continues normally.
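The pause rule above can be sketched as a per regime decision over one weekly batch. The threshold values and the twenty four hour window come from this page. The function names, the result shape, and the choice to count the window from the evaluation timestamp are assumptions.

```python
from datetime import datetime, timedelta, timezone

THRESHOLDS = {"precision": 0.95, "recall": 0.92, "brier": 0.05}
NOTIFY_WITHIN = timedelta(hours=24)


def passes(metrics):
    # Precision and recall are floors; Brier is a ceiling.
    return (
        metrics["precision"] >= THRESHOLDS["precision"]
        and metrics["recall"] >= THRESHOLDS["recall"]
        and metrics["brier"] <= THRESHOLDS["brier"]
    )


def weekly_decision(results, evaluated_at):
    # results maps regime name to its metrics for the weekly batch.
    paused = sorted(r for r, m in results.items() if not passes(m))
    active = sorted(r for r in results if r not in paused)
    return {
        "paused": paused,    # Deliverable generation stops for these regimes
        "active": active,    # unaffected regimes continue normally
        # Assumption: the notification window starts at evaluation time.
        "notify_customers_by": evaluated_at + NOTIFY_WITHIN if paused else None,
        "practitioner_review_required": bool(paused),
    }
```

Because the decision is made per regime, one failing regime never pauses the others. Resumption is deliberately left out of the sketch, since it depends on documented remediation rather than on a metric.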

Methodology and threshold changes follow the change control procedure in the Service Schedule. The /trust page is informational unless expressly incorporated by reference in the applicable Service Schedule. This section is reproduced in the AI Governance and Quality Schedule of every tenant agreement.

Practitioner of Record

Independent practitioner oversight over our quality governance.

The Practitioner of Record provides independent practitioner oversight over Orkestra’s quality governance. The role covers weekly sample reviews of prepared outputs, signing the weekly model card refresh, escalation review when a quality threshold is not met for a regime, and adjudication of held out set labelling disputes. Each review is sample based.

The Practitioner of Record signs Orkestra’s quality governance records. The Practitioner of Record does not sign customer compliance artefacts, does not approve customer regulatory decisions, and does not assume any customer’s regulated accountability.

This review does not constitute legal advice, audit assurance, regulatory approval, supervisory authority certification, customer specific compliance approval, or certification of any individual customer deliverable. The Practitioner of Record is not a director, employee, officer, MLRO, Compliance Officer, DPO, auditor, legal counsel, regulated function holder, or authorised representative of Orkestra or of any customer.

No customer, customer affiliate, investor, counterparty, regulator, or other third party shall have any direct claim against the Practitioner of Record arising out of this listing or any Practitioner of Record signature, except where such exclusion is prohibited by applicable law.

Practitioner of Record

Gerardo Abattan

ACAMS

Seven years of EU regulated compliance experience across AMLR, MiFID II, and DORA. Independent consultant running his own compliance bureau. Engaged under a separate retainer agreement with Orkestra. Public listing on this page during the engagement.

Routine cadence two to four hours per week, escalation reserve scoped and compensated separately. Signs Orkestra’s quality governance records only.

Methodology

How the held out set works.

Construction

Five percent of harvested publications, randomly sampled and labelled by two senior compliance practitioners. Disagreements adjudicated by the Practitioner of Record. Inter annotator agreement published quarterly.
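The construction steps above can be sketched as follows. The five percent sampling rate is from this page. The page does not say which agreement statistic is published, so Cohen's kappa is used here as an assumed, common choice for two annotators. All function names are illustrative.

```python
import random
from collections import Counter


def sample_held_out(publication_ids, fraction=0.05, seed=0):
    # Uniform random sample without replacement; seeded for repeatability.
    ids = list(publication_ids)
    k = max(1, round(len(ids) * fraction))
    return random.Random(seed).sample(ids, k)


def cohens_kappa(labels_a, labels_b):
    # Agreement between two annotators, corrected for chance agreement.
    n = len(labels_a)
    observed = sum(1 for a, b in zip(labels_a, labels_b) if a == b) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label
    return (observed - expected) / (1.0 - expected)


def disagreements(labels_a, labels_b):
    # Indices of items that would go to adjudication.
    return [i for i, (a, b) in enumerate(zip(labels_a, labels_b)) if a != b]
```

Kappa discounts the agreement two annotators would reach by chance, which matters when one regime dominates the sample and raw percent agreement would look flattering.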

Refresh

The held out set is refreshed weekly. The classifier eval pipeline runs every Sunday and writes results to the internal admin surface, then publishes the per regime numbers and the calibration curve to this page.

Growth

The labelled set has roughly twelve hundred publications at the time of writing. We grow it at roughly four hundred new labelled publications per quarter. Target five thousand by Q3 2026.

Regulatory accountability

What Orkestra prepares, what only the entity can sign.

The regulated entity remains responsible for its regulatory obligations, governance decisions, filings, internal controls, and supervisory interactions. Orkestra prepares the artefact. Where applicable to the entity, its management body, responsible regulatory function, compliance function, MLRO, DPO, risk function, internal audit function, legal counsel, and statutory auditor remain in place. Orkestra does not replace any of them.