OpenAI Launches GPT-5.5 Instant
PLUS: DeepSeek Slashes AI Prices

Welcome back!
OpenAI just upgraded the ChatGPT base model. The company launched GPT-5.5 Instant as the new default model, providing shorter, faster, and smarter replies. At the same time, OpenAI unveiled a new wave of GPT-5-level voice agents. Let’s unpack…
Today’s Summary:
🚀 GPT-5.5 Instant becomes ChatGPT default
🔥 DeepSeek V4 cuts frontier AI pricing
🎙️ OpenAI launches realtime voice agents
⚡ OpenAI ends Microsoft cloud exclusivity
🎤 xAI launches Grok 4.3 API
🔥 Anthropic doubles Claude usage limits
🛠️ 2 new tools

TOP STORY
OpenAI launches GPT-5.5 Instant as ChatGPT’s new default
The Summary: OpenAI launched GPT-5.5 Instant, the new base model without thinking mode, replacing GPT‑5.3 Instant as the default ChatGPT model. It features a major reduction in hallucinations and stronger performance on expert-level tasks. Responses use 30% fewer words without losing depth. The model better recalls past conversations and personalizes answers. Free users also get access to these improvements.
Key details:
Produces 52% fewer hallucinations on medical, legal, and financial topics
GPQA (PhD-level science) rose from 78.5% to 85.6%; AIME 2025 math rose from 65.4% to 81.2%
Responses use 30% fewer words while maintaining substance
A new memory sources view shows which past chats, files, or Gmail context created a memory, with full control to edit or delete
Rolling out to all ChatGPT users including free
Why it matters: The changes are both useful improvements and cost optimizations, as a 30% drop in response length will save compute and reduce latency. GPT-5.5 Instant also has a better ability to recover mid-answer from initial mistakes, catching weak reasoning before users intervene.

FROM OUR PARTNERS
Effortless Voice Dictation
Your prompts are leaving out 80% of what you're thinking.
When you type a prompt, you summarize. When you speak one, you explain. Wispr Flow captures your full reasoning — constraints, edge cases, examples, tone — and turns it into clean, structured text you paste into ChatGPT, Claude, or any AI tool. The difference shows up immediately. More context in, fewer follow-ups out.
89% of messages sent with zero edits. Used by teams at OpenAI, Vercel, and Clay. Try Wispr Flow free — works on Mac, Windows, and iPhone.

DEEPSEEK
DeepSeek V4 cuts frontier AI prices
The Summary: DeepSeek just released V4-Pro and V4-Flash, two open-weight AI models with up to 1.6 trillion parameters and a native 1-million-token context window. V4-Pro comes close to GPT-5.5 and Claude Opus 4.7 on benchmarks at a much lower price, under a free license.
Key details:
LMArena ranked DeepSeek V4-Pro as the #3 open model
Activates only 49B of its 1.6T parameters per token, keeping inference costs very low
Context length is expanded to 1 million tokens
Cuts KV cache usage to 10%, a massive reduction for agents
Scored 83.4% on BrowseComp, nearly matching GPT-5.5 at 84.4%, while its API is almost 7x cheaper
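A quick back-of-the-envelope sketch, using only the parameter figures reported above, shows why sparse (mixture-of-experts) activation keeps serving costs low:

```python
# Back-of-the-envelope: sparse (MoE) activation vs. a dense model of the
# same size, using the figures reported above. Illustrative only.

TOTAL_PARAMS = 1.6e12   # 1.6T total parameters
ACTIVE_PARAMS = 49e9    # ~49B parameters activated per token

# Fraction of the model doing work on each token
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
print(f"Active fraction per token: {active_fraction:.2%}")  # ~3.06%

# Per-token inference FLOPs scale with *active* parameters, so a dense
# 1.6T model would need roughly this many times more compute per token:
dense_to_sparse_ratio = TOTAL_PARAMS / ACTIVE_PARAMS
print(f"Dense model of equal size: ~{dense_to_sparse_ratio:.0f}x more compute per token")
```

Roughly 3% of the network runs per token, which is the main reason an open 1.6T-parameter model can undercut frontier API pricing so sharply.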
Why it matters: This launch reinforces a growing industry view that open models are competing near the frontier on practical coding and reasoning tasks. Once models are this close in quality, cost per task matters more than a marginal improvement on the leaderboards. AI competition is shifting toward ecosystem control: distribution, agents, memory, developer tooling, and infrastructure ownership look like the real moat.

FROM OUR PARTNERS
A Smarter Way to Read the News
Tired of news that feels like noise?
Every day, 4.5 million readers turn to 1440 for their factual news fix. We sift through 100+ sources to bring you a complete summary of politics, global events, business, and culture — all in a brief 5-minute email. No spin. No slant. Just clarity.

OPENAI
OpenAI launches GPT-5-level voice agents
The Summary: OpenAI launched three new realtime voice models that can reason, translate, and transcribe live while people speak. The flagship model, GPT-Realtime-2, brings GPT-5 class reasoning into live conversations and can use tools, recover from mistakes, and manage interruptions without breaking flow. OpenAI also introduced live translation and a low-latency transcription model built for meetings, support, and voice agents.
Key details:
GPT-Realtime-2 expands the context window from 32K to 128K tokens, enabling much longer live conversations with memory
GPT-Realtime-Translate handles 70 input languages and 13 output languages in live conversation
GPT-Realtime-Whisper streams speech-to-text
Why it matters: Voice AI has struggled with one core problem: models could talk fast, but they could not think as deeply as in text mode. OpenAI is trying to fix that with this new model. The most interesting part is how the voice model behaves under pressure. Older voice systems would freeze, restart, or lose context. GPT-Realtime-2 can say “one moment while I check that…”, keeping the conversation alive while tools run or while it recovers seamlessly from failures.

QUICK NEWS
Quick news

TOOLS
🥇 New tools
ChatGPT for Google Sheets - Build and update sheets
OpenAI Privacy Filter - Remove private data from text

That’s all for today!
If you liked the newsletter, share it with your friends and colleagues by sending them this link: https://thesummary.ai/



