Predictive Validity: New LLM Agent Evaluation
Channel: AI Research Roundup
32 views • 2 days ago