Weaving AI into human systems — one loop at a time.
The operating system of the AI-augmented organisation — how policy encoded as code and intent expressed as prompt together shape how teams work with AI at scale.
How do you measure an agentic system when the “correct” output isn't known in advance? On evaluation without ground truth — labelled suites, LLM-as-judge, regression, and drift — and why measurement is what separates a demo from production.
More essays in progress.
Short fiction, coming soon.
Applied AI leader — ideating, building, and shipping AI systems for real-world use cases.
Focused on bringing research from controlled environments into messy reality. I've worked across customer, product, and research; the writing here comes from that vantage point.