Proposal for the Valdosta DCOM initiative — follow-up to the 23 April call with Mathieu Hélin and Stéphane Largeau.
Prepared by Monce AI · April 2026 · Live at saft.aws.monce.ai
Your April KPI deck shows the DCOM project hitting — and beating — every target: lead time below 48 h (target 3 days), 97 % field extraction (target 95 %), 100 % archiving, all on 1,800 orders. The quantitative bar is already passed. The question is scale.
That is a single point of failure between you and 7,000 orders. The next 3,000 orders don't need better extraction; they need a matcher that scales to every customer layout without re-tuning, and a review workflow that keeps the CSR on exceptions only.
A 97 % field-extraction number is a ceiling for known layouts. Growth from 4k → 7k orders is not more Verizon volume: it's new aviation distributors, new rail customers, new military primes, each with a different PO format and its own manufacturer-PN conventions.
Every new customer means a new template. Extraction accuracy on unseen layouts drops the day the CSR has to teach the system.
Verizon orders 80-94890-02. Your ERP knows it as a different SKU. The cross-reference lives in a CSR's head, and in QAD comments, not in the matcher.
9,425 SKUs in OM-Material; most account for only 3 % of volume. A matcher that over-fits on the hot 2 % collapses on the long tail, and the long tail is where growth comes from.
Scaling means a matcher that doesn't need per-customer engineering. It needs to be deterministic, auditable, and trained in minutes when the master data changes.
Between the 23 April call and this deck: full pipeline running on the Valdosta master data you sent us. One URL, end-to-end.
80-94890-02 · Verizon Ariba PO 3002630800 → VERIZON + layout ariba_sap + SAFT confirmed, confidence 0.92. Zero LLM calls.
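A deterministic stage-0 hit like the one above can be sketched in a few lines. This toy lookup (the normalization rule, alias table, and field names are illustrative assumptions, not Monce's actual implementation) shows why a hit costs zero LLM calls:

```python
import re

def normalize(pn: str) -> str:
    """Canonicalize a part number: uppercase, strip separators."""
    return re.sub(r"[^A-Z0-9]", "", pn.upper())

# Illustrative alias table; in the deck's terms the bootstrap source
# would be QAD comments and CSR knowledge, not a hard-coded dict.
ALIASES = {
    normalize("80-94890-02"): {"customer": "VERIZON", "layout": "ariba_sap"},
}

def stage0_match(raw_pn: str):
    """Deterministic stage-0 lookup: a hit needs zero LLM calls."""
    hit = ALIASES.get(normalize(raw_pn))
    if hit is None:
        return None  # fall through to the trained matcher (stage 1)
    return {**hit, "llm_calls": 0}
```

Because the lookup is an exact match on a canonical form, it is trivially auditable: either the alias is in the table or the line falls through to the next stage.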
500 MB SAT classifier trained in < 3 min, inference < 10 ms/query, no GPU. Auditable line by line.
Catalog integrity, stage 0/1 regex, trained Snake, live HTTPS smoke test: run on every deploy.
🔗 saft.aws.monce.ai · /ui drag-and-drop upload · /snake matching playground · /paper, /architecture, /economics.
Your 9,425 SKUs don't need embeddings or a fine-tuned model. They need a polynomial-time SAT classifier that proves its answer by construction. That's exactly what Snake does; it was designed for this.
Any indicator function over a finite discrete domain can be encoded as a SAT instance in polynomial time. Decision-tree bucketing reduces it to linear in the sample count.
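The flavor of that encoding can be shown with a toy construction (this is an illustrative encoder, not Snake itself): each negative sample of the indicator function becomes one blocking clause, so the CNF size is linear in the number of samples, and satisfiability of an assignment reproduces the function by construction.

```python
from itertools import product

def indicator_to_cnf(f, n):
    """Encode f: {0,1}^n -> {0,1} as CNF over variables 1..n by emitting
    one blocking clause per negative point (linear in the sample count).
    A clause is a list of ints: +i means x_i, -i means NOT x_i."""
    cnf = []
    for point in product([0, 1], repeat=n):
        if not f(point):
            # This clause is falsified only by this exact assignment.
            cnf.append([-(i + 1) if bit else (i + 1)
                        for i, bit in enumerate(point)])
    return cnf

def satisfies(cnf, point):
    """Evaluate a full assignment against the CNF."""
    return all(any((lit > 0) == bool(point[abs(lit) - 1]) for lit in clause)
               for clause in cnf)
```

The satisfying assignments of the CNF are exactly the positive points of `f`, which is the "proves its answer by construction" property; decision-tree bucketing, per the theorem, replaces the per-point clauses with per-bucket ones.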
Charles Dana's thesis (Ecole Polytechnique, supervised by E. Le Pennec). Independently validated against XGBoost / RF / DL on an NIH-funded dataset — published in a Springer-accepted paper on mitochondrial classification (2025).
LLM-only: non-deterministic. Hallucinates SKUs. Costs $/query. Can't explain the answer.
Embedding retrieval: needs a vector DB + re-indexing. Silent failure on long-tail SKUs. No audit trail.
Per-customer templates: brittle on new layouts. Per-customer ops cost. That's the DCOM ceiling today.
INSEAD. Building the future of industrial commerce. 6,800 LinkedIn followers, deep network across EU industrial accounts (TotalEnergies, aerospace primes, automotive tier-1s). Owns the commercial relationship.
FH Oberösterreich. AI-native industrial commerce operator. Customer-facing delivery — pilot scoping, master-data onboarding, CSR workflow integration. The person Saft CSRs will actually work with day-to-day.
X-HEC Entrepreneurs. Author of Snake and the Dana Theorem (2024). Published in Springer (2025, NIH-funded). Making AI trustworthy · CPU-based AI. Owns the pipeline end-to-end. Shipped this deck's live demo in 4 hours.
Why it matters for Saft: the engineer who wrote the matching theorem is the engineer who will wire it into your ERP. No handoff, no "it works on the data scientist's laptop." Direct line from research to production.
| Per-PO cost | Monce pipeline | Manual entry |
|---|---|---|
| LLM (Haiku + Sonnet + Haiku) | $ 0.05 | — |
| Matcher (Snake, in-process) | $ 0.00 | — |
| CSR touch time | 30 s spot-check | 8–15 min data entry |
| Loaded CSR cost | $ 0.38 | $ 9.00 |
| Per PO all-in | $ 0.43 | $ 9.00 |
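At the 7,000-order ambition, the table's figures compound quickly. A back-of-the-envelope check, using only the deck's own per-PO numbers:

```python
# Per-PO figures from the table above; order volume is the deck's 7k target.
AUTO_PER_PO = 0.43    # $ = LLM 0.05 + Snake 0.00 + CSR spot-check 0.38
MANUAL_PER_PO = 9.00  # $ loaded cost of 8-15 min manual data entry

orders = 7_000
delta = MANUAL_PER_PO - AUTO_PER_PO
savings = orders * delta
print(f"${delta:.2f} saved per PO -> ${savings:,.0f} at {orders:,} orders")
```

Roughly $60k per 7,000-order cycle, before counting the CSR hours freed for exception handling.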
Three model_mode knobs let Saft trade cost for accuracy per customer: cheap ($0.01), balanced ($0.05, default), accurate ($0.12). Expensive models fire only when Snake's top-1 confidence falls below θ_auto.
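The routing rule is simple enough to sketch. The mode prices below are the deck's; the θ_auto threshold values and the function shape are illustrative assumptions, not Monce's actual configuration:

```python
# Illustrative cost-vs-accuracy router. Snake answers for free when it is
# confident; an LLM tier fires only below the mode's auto-accept threshold.
MODEL_MODES = {
    "cheap":    {"llm_cost": 0.01, "theta_auto": 0.80},
    "balanced": {"llm_cost": 0.05, "theta_auto": 0.90},  # default
    "accurate": {"llm_cost": 0.12, "theta_auto": 0.95},
}

def route(snake_top1_confidence: float, mode: str = "balanced") -> dict:
    """Accept Snake's answer above theta_auto; otherwise escalate to the LLM."""
    cfg = MODEL_MODES[mode]
    if snake_top1_confidence >= cfg["theta_auto"]:
        return {"engine": "snake", "marginal_cost": 0.00}
    return {"engine": "llm", "marginal_cost": cfg["llm_cost"]}
```

Note how the same 0.92-confidence match is auto-accepted in balanced mode but escalated in accurate mode: the knob only moves the threshold, never the audit trail.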
OM-Material ingestion pipeline (weekly or on-change).
Hit all four and we extend: Saft pays for production. Miss any and Monce walks, no lock-in, Saft keeps every artifact produced.
Per-field exact match, per-line ok/ko, or CSR-accepted-as-is? Determines our pilot comparator.
CSR headcount, new layouts, error recovery, master-data drift? Defines phase-2 scope.
Daily batch, weekly CSV, change-feed? We build the ingestion to match.
QAD comments, CSR head, XLS sidecar? That's the bootstrap source for Snake aliases.
Direct QAD EE API, S3/SFTP queue, or existing DCOM post-processing hook? All three buildable, pick one.
Bedrock eu-west-3 default. US-region or SageMaker/on-prem if TotalEnergies policy requires.
Poitiers CSE was mentioned. Single site first, or platform play from day 1?
Stéphane: can you forward it? That keeps us aligned on what was actually agreed on the call.
Full list: saft.aws.monce.ai → QUESTIONS.md in the repo (17 questions).
We built the proof in four hours on the dataset you sent. Give us 60 days on the Valdosta feed and we meet the 7k ambition, or we walk.
saft.aws.monce.ai/ui — drag a Valdosta PO, see the full pipeline.
Monce-AI/saft.aws.monce.ai (private, 13/13 checks green).
Mathieu Hélin · CEO, Monce · mathieu@monce.ai
Charles Dana · AI/ML · charles@monce.ai