About
I study computer science at the University of Washington. I spend most of my time thinking about how AI agents remember, reason, and communicate, and building the infrastructure that makes that possible.
Previously, I researched privacy-preserving vector search at UW's HPDIC Lab. I'm currently a founding engineer at Agentnomics, and have interned at UKG and Toyota Racing Development.
Work
JudgeCalibrator→
Open-source auditing tool that measures LLM judge reliability. Runs four diagnostic probes to detect bias and miscalibration in AI evaluators, useful for anyone building or using LLM-as-judge evaluation pipelines.
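One well-known failure mode such probes can catch is positional bias: a reliable judge should pick the same answer no matter the order in which the two candidates are shown. A minimal sketch of that idea (names are illustrative, not JudgeCalibrator's actual API):

```python
# Hypothetical positional-bias probe: ask the judge twice with the candidate
# answers swapped, and flag the judge if its verdict doesn't invert.
# Illustrative only; not JudgeCalibrator's actual interface.

def positional_bias_probe(judge, prompt, answer_a, answer_b):
    """Return True if the judge's verdict changes when answer order is swapped."""
    verdict_ab = judge(prompt, first=answer_a, second=answer_b)
    verdict_ba = judge(prompt, first=answer_b, second=answer_a)
    # A consistent judge's "first"/"second" verdicts invert under swapping.
    consistent = (verdict_ab == "first") == (verdict_ba == "second")
    return not consistent

# A toy judge that always prefers whichever answer it sees first:
biased_judge = lambda prompt, first, second: "first"
print(positional_bias_probe(biased_judge, "Which is better?", "A", "B"))  # → True (bias detected)
```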
Engram→
Hybrid persistent memory layer for AI agents combining knowledge graphs, vector search, and temporal versioning. Enables agents to store, retrieve, and reason over structured and unstructured memories across sessions.
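The hybrid retrieval idea can be sketched in a few lines: vector similarity finds entry points, graph edges pull in related memories, and timestamps order the result. Everything below is an illustrative toy, not Engram's actual design:

```python
# Hypothetical hybrid memory store: vector search for entry points, graph
# expansion for related memories, insertion-order "timestamps" for recency.
# Illustrative only; not Engram's actual data model.
import math

class MemoryStore:
    def __init__(self):
        self.entries = {}   # id -> (embedding, text, logical timestamp)
        self.edges = {}     # id -> set of related memory ids
        self._clock = 0     # logical clock, so ordering is deterministic

    def add(self, mid, embedding, text, related=()):
        self._clock += 1
        self.entries[mid] = (embedding, text, self._clock)
        self.edges.setdefault(mid, set()).update(related)

    def recall(self, query_embedding, k=1):
        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            return dot / (math.hypot(*a) * math.hypot(*b))

        # 1) vector search: the k entries most similar to the query
        ranked = sorted(self.entries,
                        key=lambda m: cosine(self.entries[m][0], query_embedding),
                        reverse=True)
        seeds = ranked[:k]
        # 2) graph expansion: follow edges out of the seed memories
        hits = set(seeds)
        for s in seeds:
            hits |= self.edges.get(s, set())
        # 3) temporal ordering: newest memories first
        return sorted(hits, key=lambda m: self.entries[m][2], reverse=True)

store = MemoryStore()
store.add("a", [1.0, 0.0], "fact A", related=["b"])
store.add("b", [0.0, 1.0], "fact B")
print(store.recall([1.0, 0.0], k=1))  # → ['b', 'a']
```

The point of the three stages is that each retrieval signal compensates for the others: similarity alone misses structurally related facts, and graph traversal alone has no notion of relevance or recency.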
TrajAI→
Open-source testing framework for AI agents. Mock tools and assert on agent behavior rather than raw outputs — filling the gap that standard unit testing frameworks leave for agentic systems.
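The behavior-over-output idea can be shown with a generic mocked tool: the test asserts on which tool the agent invoked and with what arguments, not on the agent's text. A minimal sketch (names are illustrative, not TrajAI's actual API):

```python
# Hypothetical agent test that mocks a tool and asserts on behavior.
# Illustrative only; not TrajAI's actual interface.
from unittest.mock import MagicMock

def run_agent(user_query, tools):
    """Toy stand-in for an agent loop: routes weather questions to a tool."""
    if "weather" in user_query.lower():
        return tools["get_weather"](city="Seattle")
    return "I can't help with that."

def test_agent_calls_weather_tool():
    # Mock the tool so the test checks *behavior* (which tool was invoked,
    # and with which arguments) instead of matching raw text output.
    weather = MagicMock(return_value="62°F and cloudy")
    run_agent("What's the weather like?", tools={"get_weather": weather})
    weather.assert_called_once_with(city="Seattle")

test_agent_calls_weather_tool()
```

Asserting on tool calls rather than final text keeps tests stable even when the agent's phrasing changes between model versions.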