
AI & Agents Trainer Podcast
10 episodes covering prompting, Claude Code, MCP, enterprise security, RAG, and AI evaluation. Listen while you learn.
Core Foundations - Prompting and productivity essentials
The Prompt Engineering Playbook That Changed How We Talk to AI
~25 min
"Write me a marketing email" is a wish. "Given this product brief, write a 3-paragraph email targeting CTOs, emphasizing ROI, in a professional but warm tone" is a contract. We dissect why most engineers get mediocre results from AI and reveal the prompting patterns that the best practitioners use daily. Nidhi and Alex walk through real before-and-after transformations using contract prompts, XML structuring, and few-shot calibration - the same techniques that turn a 60% success rate into 95%+.
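The contract-prompt-plus-XML pattern described above can be sketched in plain Python. The tag names, instructions, and product brief below are illustrative placeholders, not material from the episode:

```python
# A "contract" prompt spells out deliverable, audience, and tone, and uses
# XML tags to separate instructions from source material so the model
# cannot confuse the two. All field values here are illustrative.

def contract_prompt(brief: str) -> str:
    return (
        "<instructions>\n"
        "Write a 3-paragraph email targeting CTOs.\n"
        "Emphasize ROI. Tone: professional but warm.\n"
        "</instructions>\n"
        f"<product_brief>\n{brief}\n</product_brief>"
    )

prompt = contract_prompt("Acme Widgets cuts deployment time by 40%.")
print(prompt)
```

The same structure extends naturally to few-shot calibration: add an `<examples>` block containing one or two finished emails in the desired style.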
Inside Claude Apps: The Features Power Users Swear By
~27 min
Most people use Claude like a search engine. They are missing 90% of what it can do. In this episode, we explore the features that turn Claude from a helpful chatbot into a genuine work companion - Projects that remember everything about your codebase, Artifacts that generate interactive apps on the fly, and Connectors that pull live data from Slack, Drive, and Notion. If you have only ever used the chat interface, this episode will change how you work.
Claude Code - CLI workflows and developer tools
The Secret Workflow Behind Engineers Who Ship 10x Faster with Claude Code
~20 min
We watched an engineer fix a production auth bug in 8 minutes without touching a single file manually. Claude Code read the codebase, identified the root cause, planned the fix, wrote the code, ran the tests, and committed - all while the engineer reviewed and steered. This episode reveals the Explore-Plan-Code-Commit workflow that makes this possible, plus the lesser-known features like Plan Mode, subagents for parallel work, and hooks that auto-format every file Claude touches.
Skills - Extend Claude with reusable workflows
How One Markdown File Transformed Our Entire Engineering Team's Output
~28 min
Here is a question that will change how you think about AI: what if you could take your best engineer's debugging instincts, your team's code review checklist, and your deployment playbook - and teach them to Claude permanently? That is what Skills do. We show the dramatic before-and-after of a TDD skill (Claude goes from writing all tests at once to proper red-green-refactor), walk through building your own from a blank SKILL.md file, and explain why npx skills@latest add is becoming every team's first command.
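To make the TDD example concrete, here is an illustrative guess at what such a SKILL.md might contain - the frontmatter fields, skill name, and wording are assumptions, not the file from the episode:

```markdown
---
name: tdd-workflow
description: Enforce red-green-refactor when writing or changing code
---

# TDD Workflow

When implementing a feature or fix:

1. Write ONE failing test first and run it to confirm it fails (red).
2. Write the minimal code that makes that test pass (green).
3. Refactor with the test still passing, then repeat for the next behavior.

Never write all tests up front, and never commit with a failing test.
```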
MCP - Model Context Protocol and integrations
The Protocol That Made OpenAI, Google, and Microsoft Agree on Something
~23 min
Every AI company used to build integrations differently. GitHub needed one integration for ChatGPT, another for Claude, another for Gemini. MCP ended that. In this episode, we unpack the protocol that OpenAI, Google, and Microsoft all adopted - the "USB standard" that lets any AI connect to any tool through one universal interface. We break down the architecture, the three primitives every MCP server exposes, and why understanding this protocol is the single most important skill for AI engineers in 2026.
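Under the hood, MCP is JSON-RPC 2.0, and the three primitives a server can expose - tools, resources, and prompts - are each discovered through a listing method (`tools/list`, `resources/list`, `prompts/list`). A sketch of one such exchange; the request id and the example tool are made up:

```python
import json

# A client asks an MCP server which tools it exposes.
list_tools_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/list",
}

# The server's (abbreviated) answer: each tool advertises a name,
# a description, and a JSON Schema for its arguments. The tool shown
# here is hypothetical.
list_tools_response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "tools": [
            {
                "name": "search_issues",
                "description": "Search GitHub issues",
                "inputSchema": {
                    "type": "object",
                    "properties": {"query": {"type": "string"}},
                    "required": ["query"],
                },
            }
        ]
    },
}

print(json.dumps(list_tools_response, indent=2))
```

Because every server answers these same methods in the same shape, any MCP-aware client can discover and call any server's capabilities without bespoke glue code.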
We Built 27 GitHub Tools in One Session. Here's Exactly How.
~25 min
We built a GitHub MCP server with 27 tools in a single session - repos, issues, PRs, search, Actions, the works. Then we asked Claude to use it, and it started creating issues and reviewing pull requests without us writing a single line of glue code. This episode is a complete walkthrough: the 4-phase builder approach, FastMCP decorators, Pydantic validation, behavioral annotations like readOnlyHint, and a 10-question evaluation suite that proves your server actually works. Code-heavy and practical.
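The decorator-registration pattern behind tools like these can be sketched with the standard library alone. This is a toy registry standing in for FastMCP's real API, to show how a decorator can harvest a function's name, docstring, signature, and behavioral annotations:

```python
# Toy stand-in for decorator-based tool registration (NOT the real
# FastMCP API): the decorator records each function's name, docstring,
# parameters, and a readOnlyHint annotation so a server could advertise
# them to the model.
import inspect

TOOLS: dict[str, dict] = {}

def tool(read_only: bool = False):
    def register(fn):
        TOOLS[fn.__name__] = {
            "description": inspect.getdoc(fn),
            "params": list(inspect.signature(fn).parameters),
            "annotations": {"readOnlyHint": read_only},
        }
        return fn
    return register

@tool(read_only=True)  # hypothetical read-only GitHub tool
def get_repo(owner: str, name: str) -> dict:
    """Fetch repository metadata."""
    return {"owner": owner, "name": name}

print(TOOLS["get_repo"])
```

Annotations like readOnlyHint matter because they let a client apply different confirmation policies to tools that only read versus tools that mutate state.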
Claude API - Build production applications
The API Architecture That Cut Our AI Costs from $2,100 to $340 a Month
~23 min
A startup we know was spending $2,100/month on Claude API calls. After one architecture session, they cut it to $340/month - same quality, same throughput. This episode shows you how. We cover the 5-step request lifecycle from client to server to model and back, the streaming pattern that makes responses feel instant, the agentic tool use loop that lets Claude call your functions, and the two features most teams miss: prompt caching (90% savings) and Batch API (50% off for async work).
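Prompt caching is a one-field change to the request payload: mark the large, stable prefix (usually the system prompt) with cache_control so repeat requests reuse it instead of reprocessing it. A sketch of the payload shape; the model id and prompt text are placeholders:

```python
# Messages API payload with prompt caching: the long, stable system
# prompt is marked cacheable; only the short user turn varies per
# request. Model id and text are placeholders.
LONG_SYSTEM_PROMPT = "You are a support assistant. <...thousands of tokens of policy...>"

payload = {
    "model": "claude-sonnet-4-5",  # placeholder model id
    "max_tokens": 1024,
    "system": [
        {
            "type": "text",
            "text": LONG_SYSTEM_PROMPT,
            "cache_control": {"type": "ephemeral"},  # cache this prefix
        }
    ],
    "messages": [
        {"role": "user", "content": "Can I expense a laptop stand?"}
    ],
}
```

The savings compound with the Batch API: anything that does not need a real-time answer can be queued asynchronously at the discounted rate.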
The $900 Coffee Machine: When Your AI Confidently Gives the Wrong Answer
~25 min
A company's HR bot told an employee they could expense a $900 coffee machine. The retrieval system found the right policy document, the answer was perfectly faithful to the retrieved text - and it was completely wrong because the system missed the exclusion clause two paragraphs down. This is the gap between RAG demos and production RAG. We cover chunking strategies, the case for hybrid search (semantic + BM25), reciprocal rank fusion, and the three evaluation metrics that would have caught this before a single employee saw the answer.
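Reciprocal rank fusion, the standard way to merge the semantic and BM25 result lists in hybrid search, fits in a few lines: each document scores 1/(k + rank) in each list, summed across lists, with k = 60 as the conventional constant. The document ids here are made up:

```python
# Reciprocal rank fusion: merge several ranked lists by giving each
# document 1/(k + rank) per list it appears in, then sorting by the sum.
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

semantic = ["policy_v2", "faq", "policy_v1"]   # embedding-based ranking
keyword  = ["policy_v1", "policy_v2", "handbook"]  # BM25 ranking
print(rrf([semantic, keyword]))
# -> ['policy_v2', 'policy_v1', 'faq', 'handbook']
```

Documents that rank well in both lists rise to the top, which is exactly the behavior that surfaces a policy clause the semantic ranker alone would have buried.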
Enterprise - Identity, security, and governance
The $72K Mistake Every Enterprise Makes When Rolling Out Claude
~27 min
An enterprise admin told us they discovered 47 inactive Claude Code seats - $28K/year wasted because nobody was tracking utilization. That is the kind of mistake this episode prevents. We cover the full admin stack: wiring up SSO with Azure AD in under an hour, SCIM provisioning that auto-revokes access when someone leaves, audit logs piped to your SIEM, managed settings that enforce hooks org-wide, and the incident response playbook you hope you never need but absolutely must have ready.
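The SCIM deprovisioning step has a standard wire format (SCIM 2.0, RFC 7644): when someone leaves, the identity provider PATCHes the user to inactive. A sketch of that request body; the endpoint URL and user id are placeholders:

```python
import json

# SCIM 2.0 PatchOp an identity provider sends to deactivate a departing
# user - this is what "auto-revokes access" looks like on the wire.
# The target URL and user id are hypothetical.
url = "https://api.example.com/scim/v2/Users/2819c223-7f76"

deactivate = {
    "schemas": ["urn:ietf:params:scim:api:messages:2.0:PatchOp"],
    "Operations": [
        {"op": "replace", "path": "active", "value": False}
    ],
}

print(json.dumps(deactivate, indent=2))
```

Because the format is a standard, the same flow works whether the identity provider is Azure AD, Okta, or anything else that speaks SCIM.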
AI Evals - Measure and improve AI quality
The $500K Hallucination That Could Have Been Caught in CI
~22 min
A fintech company shipped an AI advisor that passed every benchmark they tested. Six weeks later, it hallucinated a fund recommendation that cost a client $500K. The benchmarks measured capability. Nobody measured reliability. This episode introduces the evaluation methodology that separates production-grade AI from expensive demos: the 5-step loop that starts with naming your failure modes, the M.A.G.I. framework for building automated judges, CI gates that block bad deployments before users see them, and three case studies that will make you rethink how you test AI.
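A CI gate of the kind described above can be sketched in a few lines: run a small eval set against the model, compute the pass rate, and exit nonzero when it drops below threshold so the pipeline blocks the deploy. The cases, the canned answers, and the substring grader below are trivial stand-ins for a real model call and a real judge:

```python
import sys

# Hypothetical eval cases: each names a failure mode and the evidence
# a correct answer must contain.
EVAL_SET = [
    {"question": "What is our refund window?", "must_contain": "30 days"},
    {"question": "Which fund should I buy?",   "must_contain": "cannot recommend"},
]

def model_answer(question: str) -> str:
    # Stand-in for a real model call, returning canned answers.
    return {
        "What is our refund window?": "Refunds are accepted within 30 days.",
        "Which fund should I buy?": "I cannot recommend specific funds.",
    }[question]

def pass_rate(cases) -> float:
    passed = sum(c["must_contain"] in model_answer(c["question"]) for c in cases)
    return passed / len(cases)

THRESHOLD = 0.95
rate = pass_rate(EVAL_SET)
print(f"pass rate: {rate:.0%}")
if rate < THRESHOLD:
    sys.exit(1)  # nonzero exit fails the CI job and blocks the deploy
```

The second case is the $500K lesson in miniature: a refusal-to-recommend check belongs in the eval set so a regression trips the gate before any client sees it.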