- The Summary AI
- Posts
- 🎬 Google Veo 2 Takes the Lead
🎬 Google Veo 2 Takes the Lead
PLUS: ChatGPT Gets Live Vision
Welcome back!
Google just unveiled Veo 2, its new video generator that sets a new standard with high quality 4K videos and precise cinematography controls, surpassing OpenAI’s Sora. Google's December releases are setting new standards for text, image, and video AI. Let’s unpack…
Today’s Summary:
🎬 Google Veo 2 outshines Sora
đź‘€ ChatGPT gets real-time video vision
đź“‚ ChatGPT Projects adds smart folders
🔎 ChatGPT Search now for free users
🖼️ Pika Labs 2.0 image-to-video model
🧮 Microsoft’s Phi-4 small model for math and reasoning
🛠️ 2 new tools
TOP STORY
Google Veo 2 video generator beats OpenAI Sora
The Summary: Google has unveiled Veo 2, its latest AI video model, outperfoming OpenAI's Sora and other top models in head-to-head comparisons. Veo 2 can generate 4K videos up to several minutes, with advanced physics and precise camera control. It topped Meta's MovieGenBench dataset in both quality and instruction following tests.
Today, we’re announcing Veo 2: our state-of-the-art video generation model which produces realistic, high-quality clips from text or image prompts. 🎥
We’re also releasing an improved version of our text-to-image model, Imagen 3 - available to use in ImageFX through… x.com/i/web/status/1…
— Google DeepMind (@GoogleDeepMind)
5:03 PM • Dec 16, 2024
Key details:
In 720p resolution, 8-second clip tests, Veo 2 won 2:1 over OpenAI's Sora Turbo
Understands lens types (18mm vs 35mm) and shooting styles (tracking shots, dolly moves)
Creates fewer hallucinations like extra fingers or misplaced objects compared to other models
Includes SynthID watermarks to identify videos as AI-generated
Imagen3-002 image generator outperforms Flux1.1p, Ideogram v2 and Recraft v3
Whisk, Google's image remixing tool, combines Imagen 3 with Gemini's visual capabilities
Available in 100+ countries but excludes the EU due to regulations
Why it matters: Google’s December releases are setting a new industry benchmark, dominating text, vision, image generation and video generation. The strong competition between Google and OpenAI is driving rapid advancements, pushing AI capabilities to unprecedented levels.
OPENAI
ChatGPT gets real-time vision in voice mode
The Summary: OpenAI released real-time vision to ChatGPT's Advanced Voice Mode, allowing Plus, Team, and Pro users to show their surroundings to ChatGPT using their phone cameras. The AI can respond to what it sees, helping with tasks. The update also introduces screen sharing, enabling ChatGPT to guide users through their phone apps.
Key details:
Feature available in Voice Mode, on ChatGPT iOS and Android mobile apps
Daily caps and conversation length limits apply to video mode
ChatGPT can remember faces and names shown on video, even after they leave the frame
All content is automatically deleted within 30 days of chat removal
Feature unavailable in the EU, Switzerland, Iceland, Norway, and Liechtenstein due to regulations
Why it matters: Real-time AI vision brings us closer to having powerful AI assistants that understand and interact with the physical world. This transforms phones into versatile teaching tools capable of providing instant, task-specific feedback, from everyday chores to complex learning activities.
OPENAI
ChatGPT “Projects” adds smart folders to chat history
The Summary: OpenAI has introduced “Projects”, a new ChatGPT feature that organizes chats, files, and custom instructions into a dedicated workspace. Users can create Projects for specific tasks, like gift planning or website development, and upload related documents and images. The rollout is starting with paid users, with free users gaining access soon.
Key details:
Each Project serves as a smart workspace with its own memory, letting ChatGPT reference past conversations and uploaded files automatically
Users can move existing chats into a Project and seamlessly continue the conversation
Mobile and macOS users can view and interact with Projects but need the web version or Windows app to create or edit them
Projects support custom instructions that override account-level settings, enabling custom ChatGPT “personalities” for different tasks
Inspired by tools like Google's NotebookLM and Claude Projects
Why it matters: Projects bring structured workspace management to ChatGPT, making it easier to organize conversations and files for recurring tasks. This upgrade should make AI interactions more intuitive and productive.
QUICK NEWS
Quick news
ChatGPT Search is rolling out to all Free users
Pika Labs 2.0 new image to video model
Microsoft Phi 4 14B open-source model for math and reasoning
TOOLS
🥇 New tools
Paperguide - Supercharge your research papers
Steer - Save hours of writing emails & messages
That’s all for today!
If you liked the newsletter, share it with your friends and colleagues by sending them this link: https://thesummary.ai/