🚀 OpenAI Launches o3 + o4-mini

PLUS: Claude Reads Your Gmail

Welcome back!

OpenAI is leveling up again, this time with o3 and o4-mini, models that blend code, web search, and images into how they think. It’s the closest we’ve seen yet to an AI that reasons in the real world, with visual input and analytical power. Let’s unpack…

Today’s Summary:

  • 🚀 OpenAI launches o3 + o4-mini

  • 🎬 Google unveils AI video tools

  • 🔎 Claude reads Gmail, researches web

  • 📈 OpenAI eyes $3B Windsurf buy

  • 🗣️ Voice features coming to Claude

  • 👥 OpenAI may build social network

  • 🛠️ 2 new tools

TOP STORY

OpenAI launches o3 and o4-mini

The Summary: OpenAI has released o3 and o4-mini, reasoning models that work across text, code, and images while calling web search, code, and image generation in the same answer. For the first time, they can “think with images”, integrating visuals and tools directly into reasoning chains. Both models are live for ChatGPT subscribers, with o4-mini also available to free users under “Think”.

Key details:

  • o3 is OpenAI’s new flagship reasoning model, outperforming o1

  • o4-mini scores 99.5% on the AIME 2025 math benchmark when given access to Python

  • Both models now combine all tools inside ChatGPT, including web, vision, code, and image generation

  • Users can upload blurry or reversed images, and the model will flip, zoom, or enhance them mid-reasoning to extract meaning

  • OpenAI also released Codex CLI, a lightweight local coding agent seen as a reply to Anthropic’s Claude Code

  • Both models are available now to ChatGPT Plus, Pro, and Team users, and free-tier users can access o4-mini via the “Think” option

Source: OpenAI

Why it matters: o3 and o4-mini can research, interpret and build across multiple formats, deciding which tool to fire mid-task. They handle real-world problems with minimal prompting, figuring out more on their own. The bar for useful, general-purpose AI just moved again.

GOOGLE

Google releases AI video tools in Gemini and Whisk

The Summary: Google has rolled out its Veo 2 video model to the Gemini Advanced app and the Whisk creative tool. Subscribers to Google One AI Premium can now generate eight-second, 720p videos from text prompts or animate still images. Veo 2 handles complex motion, physics, and scene styles with cinematic quality. The feature is now live for users in 60+ countries.

Key details:

  • Veo 2 generates 720p videos, capped at 8 seconds, from text prompts in Gemini and from image inputs in Whisk

  • Whisk Animate can turn still images into short videos, blending and animating user-uploaded visuals

  • SynthID watermarks are added to all AI-generated content

Why it matters: Google keeps shipping across the entire AI stack, from models to consumer tools, and Veo 2 is another piece falling into place. With video now integrated into Gemini and Whisk, Google is outpacing rivals by turning its research edge into polished mass-market products.

ANTHROPIC

Claude can now read your Gmail and do its own research

The Summary: Anthropic’s AI assistant now links to Gmail, Google Calendar, and Docs, and gains a new Research mode that performs multi-step web searches to answer complex questions. This update lets Claude combine personal context with internet sources to deliver insights.

Key details:

  • The Google Workspace integration is available on all paid plans and gives Claude direct access to Gmail, Docs, and Calendar

  • Claude provides inline citations from Workspace and the web, giving users a clear trail of sources, even across long chains of emails or massive docs

  • Claude’s Research tool acts autonomously, chaining web searches and refining its focus mid-query to answer complex questions in under a minute

  • Research is a beta feature available only to Max, Team, and Enterprise users in the US, Japan, and Brazil

  • Anthropic plans to expand the number of connected content sources and deepen Claude’s research capabilities in the coming weeks

Why it matters: Claude now joins OpenAI, Google, xAI, and Perplexity in pushing agent-like research assistants. Its integration with Google Workspace aims for depth, with contextual answers built from your actual inbox and docs. It’s a meaningful differentiator in a space where everyone’s starting to look the same.

TOOLS

🥇 New tools

  • Kling AI 2.0 - Next-level AI video generation + editing

  • Aqua - Super fast AI dictation for Mac and Windows

That’s all for today!

If you liked the newsletter, share it with your friends and colleagues by sending them this link: https://thesummary.ai/