AI Lab
Experiments and prototypes.
A working portfolio of small, opinionated AI products. Each one starts with a product question I wanted to answer about how this technology should be packaged, and is built as the cheapest possible artifact that makes the answer legible. Each has a clickable demo and a writeup of what the experiment was testing and what fell out.
-
Case study
Tax Policy Navigator
A citation-grounded UK tax navigator over the full HMRC Employment Income Manual, evaluated end-to-end on Mistral.
Two architectural approaches to regulated-domain retrieval-augmented generation, measured on a 50-question Claude-authored eval. Of the 50 questions, 47 reach the right verdict. Across 188 cited claims, every claim is factually correct against UK tax law and 89% are directly grounded in the cited HMRC paragraph. Functional gaps remain that block public end-user use.
RAG · Citations · Graph-RAG · Evaluation · Mistral
-
Live
AgentScope
An interface prototype for a developer-facing harness that makes a multi-agent run legible at a glance, including when it fails.
A clickable prototype that visualises a tree of agent runs on one screen, each with its own steps, context window, and cost share. Three switchable mock runs demonstrate the design across different shapes.
Interface prototype · Developer tools · Multi-agent · Observability
-
Live
Data Discovery Agent
An interface prototype for a financial-data catalogue where every recommendation arrives with its lineage, entitlement status, and monthly cost.
A clickable prototype that answers a question most chat-with-data demos refuse to ask: how should a regulated-data platform expose an agent surface when picking the wrong dataset can mean wrong answers, compliance breaches, or runaway cost?
Interface prototype · Product design · Data catalogues · Compliance