Agentic Evaluations Workshop Deep Dive

Agentic Evaluations Workshop - Deep Dive on the Future of Evals for Agents.

As agents evolve from text conversations to autonomous agents capable of multi-step reasoning, tool use, and real-world task ...

Ship Real Agents: Hands-On Evals for Agentic Applications — Laurie Voss, Arize

Most agents get tested by running a few queries and checking whether the output looks right. Laurie calls this the vibes problem: it doesn't catch ...

Agentic AI Engineering: Complete 4-Hour Workshop feat. MCP, CrewAI and OpenAI Agents SDK

In this comprehensive hands-on

Amazon Bedrock AgentCore Deep dive series: AgentCore Evaluations | AWS Show and Tell

In this episode of "AWS Show and Tell", we will

How Agentic AI Transforms Maintenance and Asset Decisions

Learn more about Asset Lifecycle Management here → https://ibm.biz/~xM9tMWHdt "Unplanned outages and breakdowns can ...

Agentic Automation for Testers – A Hands-On Deep Dive

As AI continues to reshape software

Evals SDK: How to Evaluate Enterprise-Grade Agentic AI

In this episode of VectorLab, we sit down with Vishnu, Forward Deployed Engineer at OpenAI, to

Agent Evaluation & Benchmarks - Agentic AI MOOC 2025 Lecture 4 Summary

This lecture discusses the critical shift from evaluating static LLMs to complex AI agents that take action. It explores the vital role of ...

How to Evaluate AI Agents: Comprehensive Strategies for Reliable, High‑Quality Agentic Systems

Evaluating AI agents in 2025 goes beyond simply checking outputs. As agents take on multi-step, autonomous workflows, ...

End-to-End Evaluation of Agentic Workflows with Deepchecks and CrewAI

In this session, we walked through how Deepchecks evaluates end-to-end

How to Systematically Setup LLM Evals (Metrics, Unit Tests, LLM-as-a-Judge)

Want to learn real AI Engineering? Go here: https://go.datalumina.com/iIO93Ps Want to start freelancing? Let me help: ...

AWS re:Invent 2025 - Improve agent quality in production with Bedrock AgentCore Evaluations (AIM3348)

Amazon Bedrock AgentCore

AI Evals 101: How to Evaluate LLMs, Agentic AI & GenAI Systems (Step by Step)

FREE

How to secure your AI Agents: A Technical Deep-dive

AI agents introduce unique security challenges like prompt injection, data leakage, and excessive agency. This

Agentic AI: The Complete Masterclass (A 55-Minute Deep Dive)

This is the complete 55-minute masterclass on