The Arthur Platform
Product & Design Lead
Series B · Enterprise AI · New York, NY · 2024–2026

Defined and shipped the Arthur Platform from 0 to 1: a unified AI discovery, governance, and performance evaluation framework that monitors models and agents across the entire AI development lifecycle. Built for any AI and any use case, it is deployed inside multi-billion-dollar enterprises and now safeguards 1B+ production tokens with real-time evals.
Challenge
Enterprises were shipping AI into production without a clear way to discover what was running, measure performance, catch regressions, or trust outputs at inference time. Existing observability tooling was built for traditional ML, not for the open-ended, agentic, multi-modal systems teams were actually deploying. There was no single layer that spanned discovery, governance, development, testing, and production monitoring.
Solution
Architected and led product on the Arthur Platform: a unified discovery, governance, and AI performance evaluation framework that works across any model, any modality, and any use case. Owned discovery-to-GA on the AI Governance Framework, shaped directly with regulated-industry design partners. Designed continuous evaluation primitives that run from pre-production testing through live inference, surfacing hallucinations, drift, PII leakage, and prompt injection in real time. Closed strategic AWS and GCP partnerships and launched native cloud integrations to accelerate enterprise adoption.
Impact
- Shipped the unified AI Discovery & Governance platform for multi-billion-dollar enterprises, unlocking $7M+ in ARR
- Built enterprise-grade systems monitoring 10,000+ models and hundreds of agentic AI systems in production
- Safeguarded 1B+ production tokens with real-time evals and regression detection across live AI deployments
- Owned discovery-to-GA for the AI Governance Framework, shaped with regulated-industry design partners
- Closed strategic AWS and GCP partnerships, shipping native integrations that accelerated enterprise adoption
- Shipped the guardrails enterprises actually needed: hallucination, PII, toxicity, and prompt injection detection at inference time
Results
- $7M+ in net-new ARR from the unified Discovery & Governance platform
- 10,000+ models and hundreds of agentic systems monitored in production across enterprise customers
- 1B+ production tokens evaluated in real time with continuous regression detection
- Native AWS and GCP integrations live, opening enterprise distribution at hyperscaler scale
- Platform positioned Arthur as the category-defining AI control plane for regulated enterprises

