
Models & Agents

Daily AI models, agents, and practical developments.

Daily ~15 min

Daily AI briefing covering new model releases, agent frameworks, and practical developments. From GPT and Claude to OpenClaw and Agent Zero — stay on top of the most exciting developments of our generation.

For developers building with AI, professionals adopting AI tools, and anyone who wants to stay ahead of the most transformative technology of our generation.


Recent Episodes

Ep 33: MetaComp just released the world's first dedicated AI agent governance framework built specifically for regulated financial services.
Tue, Apr 21, 2026
Ep 32: Qwen3.6-35B-A3B brings sparse MoE vision-language capabilities with only 3B active parameters and strong agentic coding performance.
Fri, Apr 17, 2026
Ep 31: Google DeepMind's Gemini Robotics-ER 1.6 upgrade delivers enhanced embodied reasoning and instrument reading for real-world robot control.
Wed, Apr 15, 2026
Ep 30: Aaron Levie declares the enterprise AI shift from chatbots to agents is now underway, moving beyond the "Chat Era."
Mon, Apr 13, 2026
Ep 29: Knowledge distillation now compresses full ensembles into single deployable models while preserving their collective intelligence.
Sat, Apr 11, 2026
Ep 28: Meta’s Muse Spark and a production-grade compiler-as-a-service approach for agents headline a day heavy on practical agent infrastructure.
Thu, Apr 09, 2026
Ep 27: Gemma 4 delivers massive gains across European languages while a 25.6M Rust model achieves 50× faster inference via hybrid attention.
Tue, Apr 07, 2026
Ep 26: AutoAgent autonomously optimizes its own harness using the same model to reach #1 on Terminal-Bench and financial modeling in under 24 hours.
Sun, Apr 05, 2026
Ep 25: Google drops Gemma 4, claiming the strongest small multimodal open model yet with dramatic gains across every benchmark compared to Gemma 3.
Fri, Apr 03, 2026
Models & Agents - Episode 24 - April 01, 2026
Wed, Apr 01, 2026
Ep 23: Alibaba Qwen just dropped Qwen3.5-Omni, a native end-to-end multimodal model built for text, audio, video, and realtime interaction.
Tue, Mar 31, 2026
Ep 22: Naver's Seoul World Model grounds video generation in real Street View geometry from over a million images and generalizes to other cities without fine-tuning.
Sun, Mar 29, 2026
View All Episodes

About Models & Agents


Hosted by Patrick in Vancouver.

Resources & Further Reading

Key Concepts

What is an LLM?
A Large Language Model (LLM) is an AI system trained on vast amounts of text data to understand and generate human language. Models like GPT-4, Claude, Gemini, and Llama are LLMs. They work by predicting the most likely next token (word piece) in a sequence, but this simple mechanism produces remarkably capable systems that can write code, analyze documents, reason through problems, and hold conversations.
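To make the "predict the next token" mechanism concrete, here is a deliberately tiny sketch: a bigram counter that predicts the most likely next word from observed frequencies. Real LLMs use neural networks over subword tokens and billions of parameters, but the prediction objective is the same in spirit. All names here (`train_bigram`, `predict_next`) are illustrative, not a real library API.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each token, which tokens tend to follow it."""
    tokens = text.split()
    follows = defaultdict(Counter)
    for cur, nxt in zip(tokens, tokens[1:]):
        follows[cur][nxt] += 1
    return follows

def predict_next(follows, token):
    """Greedy decoding: return the most frequent observed follower."""
    return follows[token].most_common(1)[0][0]

corpus = "the model predicts the next token and the next token again"
model = train_bigram(corpus)
print(predict_next(model, "the"))  # "next" (follows "the" twice; "model" once)
```

Repeatedly feeding the model its own prediction is, at this toy scale, the same autoregressive loop an LLM runs when it generates text.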
What is an AI agent?
An AI agent is a system that uses an LLM as its 'brain' to autonomously plan and execute multi-step tasks. Unlike a simple chatbot that responds to one message at a time, an agent can break down complex goals, use tools (web search, code execution, APIs), observe results, and iterate. Examples include coding agents (Cursor, Claude Code), research agents (Perplexity), and browser agents that navigate websites on your behalf.
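The plan–act–observe loop described above can be sketched in a few lines. This is a minimal illustration with a stubbed "LLM" and a single toy tool, not any real agent framework; the names (`run_agent`, `fake_llm`, the action dict shape) are assumptions made for the example.

```python
def calculator(expr):
    # Toy tool: evaluate arithmetic. Never eval untrusted input in real code.
    return str(eval(expr, {"__builtins__": {}}))

tools = {"calculator": calculator}

def fake_llm(goal, history):
    """Stand-in for a real model: plan one tool call, then finish."""
    if not history:
        return {"action": "calculator", "input": goal}
    return {"action": "finish", "input": history[-1]}

def run_agent(goal, llm, tools, max_steps=5):
    history = []
    for _ in range(max_steps):
        step = llm(goal, history)
        if step["action"] == "finish":
            return step["input"]
        observation = tools[step["action"]](step["input"])
        history.append(observation)  # agent observes the result and iterates
    return None

print(run_agent("2 + 3 * 4", fake_llm, tools))  # "14"
```

Swap `fake_llm` for a real model call and `tools` for web search, code execution, or APIs, and this is the skeleton most agent frameworks elaborate on.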
What is RAG?
Retrieval-Augmented Generation (RAG) is a technique that gives LLMs access to external knowledge by retrieving relevant documents before generating a response. Instead of relying solely on training data, a RAG system searches a database (using vector embeddings), finds relevant passages, and includes them in the LLM's context. This reduces hallucinations and lets you build AI that can answer questions about your own documents, codebase, or data.
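The retrieve-then-generate flow can be sketched with a crude stand-in for embeddings: bag-of-words vectors and cosine similarity. A production RAG system would use learned vector embeddings, a vector database, and an actual LLM call; everything below is a self-contained toy.

```python
import math
from collections import Counter

def embed(text):
    """Crude 'embedding': a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

docs = [
    "MCP standardizes tool connectivity for AI agents",
    "RAG retrieves documents before the model generates an answer",
]

def retrieve(query, docs):
    """Return the document most similar to the query."""
    q = embed(query)
    return max(docs, key=lambda d: cosine(q, embed(d)))

query = "how does RAG retrieve documents"
context = retrieve(query, docs)
# The retrieved passage is placed in the prompt so the model answers from it.
prompt = f"Context: {context}\n\nQuestion: {query}"
print(context)
```

The final `prompt` string is what gets sent to the LLM; grounding the answer in retrieved text is what reduces hallucinations.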
What is fine-tuning?
Fine-tuning is the process of further training a pre-trained LLM on a specific dataset to specialize it for a particular task or domain. For example, you might fine-tune Llama on medical literature to create a healthcare-specific model. It's more expensive than RAG but can teach the model new behaviors, styles, or domain expertise that prompt engineering alone can't achieve. Most developers start with RAG and only fine-tune when necessary.
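A toy analogy for fine-tuning, reusing a count-based next-token model: "pre-train" on general text, then continue training a copy on domain text and watch the predictions shift. Real fine-tuning updates neural-network weights with gradient descent; here the counts stand in for weights, purely for illustration.

```python
import copy
from collections import Counter, defaultdict

def train(model, text):
    """Update next-token counts from text (stands in for gradient updates)."""
    tokens = text.lower().split()
    for cur, nxt in zip(tokens, tokens[1:]):
        model[cur][nxt] += 1
    return model

def predict(model, token):
    return model[token].most_common(1)[0][0]

# "Pre-train" on general text.
base = train(defaultdict(Counter), "the dog runs and the dog sleeps")

# "Fine-tune" a copy on domain-specific (here, medical-flavored) text.
tuned = train(copy.deepcopy(base),
              "the patient rests and the patient sleeps and the patient waits")

print(predict(base, "the"))   # "dog"     (general behavior)
print(predict(tuned, "the"))  # "patient" (domain counts now dominate)
```

The tuned copy now behaves differently on the same input, which is the essence of specialization; the cost is that you had to run additional training rather than just changing the prompt, which is why RAG is usually tried first.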
What is MCP (Model Context Protocol)?
MCP is an open protocol (created by Anthropic) that standardizes how AI models connect to external data sources and tools. Think of it as a USB-C port for AI — instead of building custom integrations for every tool, MCP provides a universal interface. An MCP server can expose databases, APIs, file systems, or any tool, and any MCP-compatible AI client can use them. It's rapidly becoming the standard for agent tool connectivity.
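Under the hood, MCP messages are JSON-RPC 2.0. The sketch below builds a tool-call request; the method and parameter names (`tools/call`, `name`, `arguments`) reflect the MCP spec as I understand it, and the tool name and arguments are hypothetical — consult the official schema before relying on the exact shape.

```python
import json

def make_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request for an MCP-style tool invocation."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })

# Hypothetical tool exposed by some MCP server.
msg = make_tool_call(1, "query_database", {"sql": "SELECT 1"})
print(msg)
parsed = json.loads(msg)
```

Because every server speaks this same message shape, an MCP-compatible client can call a database tool, a filesystem tool, or an API wrapper without custom integration code for each one.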
