🔥 Google Veo 3 Videos Can Talk
PLUS: Jony Ive Teams With OpenAI

Welcome back!
It’s Google’s world this week. At its I/O conference, Google unveiled Veo 3, the first AI video model to generate impressive synchronized dialogue. Add smarter reasoning in Gemini 2.5 and real-time voice translation in Meet, and Google is currently setting the pace of the AI race. Let’s unpack…
Today’s Summary:
🔥 Google Veo 3 video AI can speak
🧠 Gemini 2.5 gets smarter reasoning
📞 Google Meet translates in real time
🧑‍🎨 OpenAI buys Jony Ive’s “io”
🔍 Google Search gets AI Mode
💳 Google unveils AI Ultra VIP tier
🛠️ 2 new tools

Google Veo 3 gives AI video a voice
The Summary: Google has launched Veo 3, the first generative video model capable of producing native audio and dialogue. This leap moves AI video from silent cinema to full audio-visual storytelling. Launched alongside Imagen 4, a new image model, and Flow, an AI filmmaking tool, the update rewrites the creative playbook for video storytelling.
Key details:
Veo 3 generates synchronized dialogue, sound effects, and ambient audio directly from text prompts
The model uses physics-aware rendering for realism and accurate lip-sync with speech
It accepts multiple images of a scene, a character, or an object as input to guide the generation, plus camera controls
Available through the Flow filmmaking tool and through Google’s Gemini app for Ultra subscribers (US-only, more countries coming soon)
Why it matters: Video generation has been mostly silent until now; Veo 3 breaks this limit. It compresses into a single prompt the work that once took entire production teams: writing, directing, and scoring. With synchronized voices and motion, AI videos can now speak for themselves. Expect ripple effects across the film, advertising, and game design industries.

Google Gemini 2.5 now thinks in parallel
The Summary: Google has updated its Gemini 2.5 AI models with a suite of new capabilities, led by Deep Think, an experimental reasoning mode that allows Gemini Pro to consider multiple ideas at once. Gemini 2.5 Flash, the lighter sibling, gets faster and smarter with better efficiency and multimodal reasoning.
Key details:
Gemini 2.5 Pro Deep Think mode leads LiveCodeBench (competition-level coding) and MMMU (multimodal reasoning) benchmarks
Gemini 2.5 Flash is now 20-30% more efficient in token usage, with improved performance at affordable prices
Gemini gets native multimodal audio output with emotion detection, two-voice support, and whisper-like expressiveness
Developers can control how long the AI “thinks” and see structured thought summaries in API responses
Why it matters: More than a simple upgrade, this release shows Google testing ideas once reserved for science fiction. The new Gemini models now offer voice interfaces with emotional nuance and speaker dynamics. Google is also giving developers a powerful platform to build and deploy AI products, with detailed control over performance and cost.

GOOGLE
Google Meet gets real-time voice translation
The Summary: Google Meet now offers real-time speech translation from English to Spanish, using AI to preserve the speaker’s voice and tone. Available to AI Pro subscribers, this beta feature works like a fluent interpreter, with no subtitles needed. Google plans to expand language support soon and roll out enterprise testing.
Key details:
Available in beta on the Google AI Pro plan ($20/month) and on the Ultra plan
Translates English to Spanish in real-time, using the speaker’s own synthesized voice
The AI preserves tone, cadence, and emotion
The original voice plays in the background while the translated voice plays at normal volume
Why it matters: Real-time voice translation can redefine remote communication, not just for business, but also for family, travel, and education. As speech synthesis gets better, expect conversations between people who don’t share a language to feel less like translations and more like perfectly normal conversations.

QUICK NEWS
OpenAI acquires Jony Ive’s “io” to build AI devices
Google introduces AI Mode in Search in the US
Google announces AI Ultra VIP subscription plan

TOOLS
🥇 New tools

That’s all for today!
If you liked the newsletter, share it with your friends and colleagues by sending them this link: https://thesummary.ai/