🚀 OpenAI Releases GPT-5.4

PLUS: Anthropic Tracks Real AI Job Impact

Welcome back!

OpenAI just released GPT-5.4, with native computer control, 1 million tokens of context, and major gains in real-world task performance. It has fewer hallucinations and allows live interruption while thinking. Let’s unpack…

Today’s Summary:

  • 🚀 OpenAI launches GPT-5.4 and 5.3 Instant

  • 🎨 Google Flow unifies AI creation

  • 🔎 Anthropic measures real AI job impact

  • ⚡ Gemini 3.1 Flash-Lite boosts API speed

  • 🎙️ Claude Code adds Voice Mode

  • 📱 Claude becomes top app

  • 🛠️ 2 new tools

TOP STORY

OpenAI releases GPT-5.4

The Summary: OpenAI launched GPT-5.4 with native computer control and 1 million tokens of context. The model scores 83% on professional work tasks and includes a new "tool search" feature that reduces token consumption by 47%. OpenAI also rolled out GPT-5.3 Instant, maintaining an aggressive shipping cadence.

Key details:

  • High GDPval scores on presentations and spreadsheets

  • ARC-AGI-2 score jumps from 52.9% to 83.3% (GPT-5.4 Pro)

  • Hits 75.0% success rate in UI desktop navigation

  • New ChatGPT for Excel add-in scores 87.3% on investment banking modeling tasks (GPT-5.2 had 68.4%)

  • GPT-5.3 Instant becomes ChatGPT's default model, reducing hallucinations by 26.8% when using web search

  • You can now interrupt and add details while the model is thinking

  • Available for ChatGPT Plus, Team, and Pro

Why it matters: The GDPval results deserve special attention because the benchmark measures whether AI can produce deliverable work, not just flashy demos and abstract benchmark scores. The convergence of reasoning and coding into a single model also reveals where this is headed: instead of model specialization, we're moving toward unified models that think through problems across domains. This improvement is happening faster than most people expected.

GOOGLE

Google Flow merges three AI tools into one creative platform

The Summary: Google relaunched Flow as a unified AI creative platform, merging Whisk and ImageFX into a single workspace where creators can generate, edit, and animate content. The update also adds a lasso tool for precision editing, collection-based asset management, and transitions between Nano Banana image generation and Veo video generation.

Key details:

  • Lasso tool lets users select image regions and edit with language commands like "remove the man" or "add fish in the water"

  • Nano Banana image generation now free inside Flow

  • All Whisk and ImageFX projects migrate into Flow libraries

Why it matters: Google is betting that creators want integrated tools. Flow brings three separate tools into a single platform where creators can move quickly between images and video. The free tier for image generation lowers the barrier to entry. This consolidation also positions Flow against competitors like Runway and Midjourney by offering text-to-image-to-video in a single session.

ANTHROPIC

Anthropic measures real AI job impact

The Summary: Anthropic developed "observed exposure", a measure tracking real AI usage against theoretical automation potential across 800 occupations. The findings reveal a massive gap: while LLMs could theoretically accelerate 94% of computer tasks, only 33% show real usage. Programmers top the exposure list at 75%.

Key details:

  • Computer programmers (75%), customer service reps (70%), and data entry specialists (67%) face the highest exposure; 30% of workers show zero AI task coverage

  • Workers in top exposed occupations earn 47% more and are 4x more likely to hold graduate degrees than unexposed workers

  • Young workers (ages 22-25) experienced a 0.5 percentage point drop in monthly job-finding rates for exposed occupations starting in 2024

Why it matters: This creates the first employment measure that updates as adoption spreads, instead of relying on static forecasts. The 33% real usage against the 94% theoretical ceiling suggests either that massive deployment runway remains, or that the assessments overestimate what organizations can realistically automate.
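
To put the two headline figures side by side, here is a minimal Python sketch of the arithmetic. It is only an illustration: the function name `adoption_gap` and the "realized share" ratio are our own assumptions, not Anthropic's definition of observed exposure. The only inputs are the 94% and 33% figures quoted above.

```python
def adoption_gap(theoretical_pct: float, observed_pct: float) -> tuple[float, float]:
    """Return (gap in percentage points, observed share of the theoretical ceiling).

    Hypothetical helper for illustration; not Anthropic's methodology.
    """
    gap_points = theoretical_pct - observed_pct
    realized_share = observed_pct / theoretical_pct
    return gap_points, realized_share


# Figures quoted in the summary: 94% theoretical, 33% observed.
gap, share = adoption_gap(94.0, 33.0)
print(f"Gap: {gap:.0f} points; realized share of ceiling: {share:.0%}")
# prints "Gap: 61 points; realized share of ceiling: 35%"
```

Read this way, observed usage covers roughly a third of what the theoretical estimate says is possible, which is the gap the two interpretations above are trying to explain.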

TOOLS

🥇 New tools

That’s all for today!

If you liked the newsletter, share it with your friends and colleagues by sending them this link: https://thesummary.ai/