🚀 ChatGPT Voice Mode Begins
PLUS: Meta SAM 2: The GPT-4 of Computer Vision
Welcome back!
ChatGPT Advanced Voice Mode is finally here — but only for the lucky alpha users. The rest of the Plus subscribers still need to wait until the rollout completes in the fall. In the meantime, Meta is dropping release after release at an incredible pace. Let's unpack...
Today’s Summary:
🎙️ OpenAI launches ChatGPT voice mode Alpha
🕶️ Meta releases SAM 2 model
🤖 Meta's AI Studio for custom characters
🖼️ Canva acquires Leonardo AI
🧥 Zuckerberg, Huang chat at SIGGRAPH 2024
📱 Apple unveils iOS 18 AI models tech details
2 new tools
TOP STORY
OpenAI launches ChatGPT Advanced Voice Mode Alpha
The Summary: OpenAI has begun rolling out Advanced Voice Mode for ChatGPT to a select group of Plus subscribers. This upgraded version offers more natural conversations with real-time responses and emotional awareness. OpenAI plans to expand access gradually, with all Plus users getting the feature by fall 2024. The release comes after months of safety testing and refinements since its initial demo.
Key details:
Users in this alpha will get an email with instructions and a message in their mobile app
More people will join the alpha on a rolling basis until all Plus users have access in the fall
Advanced Voice Mode processes audio natively in a single model instead of chaining separate speech-to-text and text-to-speech models, reducing latency
Understands emotional tones like sadness or excitement
Why it matters: This long-awaited upgrade marks a leap in AI conversational abilities, bringing chatbots closer to human-like interactions. The gradual rollout reflects a cautious, safety-first approach to deploying the new technology.
META
Meta unveils SAM 2: The GPT-4 of Computer Vision
The Summary: Meta has released Segment Anything 2 (SAM 2), an advanced AI model that can identify and isolate objects in real time in both images and videos. Building on the original SAM, this new version can track and segment objects across video frames. Meta is open-sourcing the model, code, and a large video dataset called SA-V. SAM 2 achieves state-of-the-art performance.
Key details:
Trained on 50,900 videos with 642,600 mask annotations
Uses a new "memory attention" mechanism to track objects across frames
The model runs at 44 frames per second for real-time performance
Meta created a new data annotation system that is 8.4x faster than previous methods
Why it matters: SAM 2 represents a breakthrough in computer vision capabilities, especially for video. Its ability to isolate objects across video frames could enable many new applications in augmented reality, robotics, and video editing. By open-sourcing the model and dataset, Meta is accelerating research in this important area of AI.
META
Meta launches AI Studio to create custom AI characters
The Summary: Meta is releasing AI Studio, allowing anyone to create and share custom AI characters on Instagram, Messenger, WhatsApp, and the web. Users can design AIs for various purposes, from generating memes to offering travel advice. Instagram creators can now use AI to respond to common fan questions. The tool is powered by Meta Llama 3.1 model and is currently available in the US.
Key details:
Users can customize AI name, personality, tone, avatar, and tagline
Creator AIs can auto-reply to DMs and story comments
AI responses are clearly labeled for transparency
Meta Llama 3.1 model powers the AI characters
Why it matters: This move brings AI character creation to social media. It helps popular Instagram creators manage high message volumes. However, the effectiveness and user adoption of AI characters in social media remain to be seen.
QUICK NEWS
Quick news
Canva acquires Leonardo image AI platform
Apple releases technical details of its foundational models for iOS 18
Mark Zuckerberg and Jensen Huang chat at SIGGRAPH 2024
TOOLS
🥇 New tools
That’s all for today!
If you liked the newsletter, share it with your friends and colleagues by sending them this link: https://thesummary.ai/