Eval-Driven Development for Multi-Agent AI Systems: A Practitioner's Framework
By Francis Beeson and Pam Portman
April 2026
Framing the Problem
Most AI products rely on anecdotal quality assessment. Someone tries the system, decides it "feels good" or "feels off," and adjustments are made based on intuition. For a prototype, this is fine. For a system that serves vulnerable workers navigating career crossroads, emotional distress, or complex life constraints, it is not.
SkillUp’s AI Coach (SUAI) is a conversational AI career coach built for non-degree workers. It coordinates multiple specialized AI agents to deliver empathetic, personalized career guidance through voice and text, with persistent memory of each user across sessions and life changes. The system makes consequential decisions on every user message: Is this person in crisis? Should the conversation shift from career exploration to job search? Is the user's emotional state shifting from engagement to frustration? Are we respecting the constraints they told us about: childcare schedule, transport limitations, budget?
These decisions are made by AI agents, and each one must be verifiable. As AI systems move beyond single-model chatbots to architectures where multiple specialized agents coordinate to serve a user, the evaluation challenge changes fundamentally: it is no longer enough to test one model's output in isolation; you need to verify that the agents, working together across turns, produce the right outcome. When the stakes are this high, "it seems to work" is not an engineering standard. This paper describes the evaluation framework we built to replace intuition with evidence: what it measures, how it works, and what we learned building it, as we prepare to scale the system to thousands of SkillUp users later this year.

Francis Beeson is Principal AI Engineer at SkillUp Coalition and Director of Engineering at Medius Cortex. Pam Portman is SVP of Product at SkillUp Coalition. They work together to evolve SkillUp’s AI-powered career navigation platform.