🚀 ChatGPT Voice Mode Begins
PLUS: Meta SAM 2: The GPT-4 of Computer Vision
Welcome back!
ChatGPT Advanced Voice Mode is finally here — but only for the lucky alpha users. The rest of the Plus subscribers still need to wait until the rollout completes in the fall. In the meantime, Meta is dropping release after release at an incredible pace. Let's unpack...
Today’s Summary:
🎙️ OpenAI launches ChatGPT voice mode Alpha
🕶️ Meta releases SAM 2 model
🤖 Meta's AI Studio for custom characters
🖼️ Canva acquires Leonardo AI
🧥 Zuckerberg, Huang chat at SIGGRAPH 2024
📱 Apple unveils iOS 18 AI models tech details
2 new tools
TOP STORY
OpenAI launches ChatGPT Advanced Voice Mode Alpha
The Summary: OpenAI has begun rolling out Advanced Voice Mode for ChatGPT to a select group of Plus subscribers. This upgraded version offers more natural conversations with real-time responses and emotional awareness. OpenAI plans to expand access gradually, with all Plus users getting the feature by fall 2024. The release comes after months of safety testing and refinements since its initial demo.
Key details:
Users in this alpha will get an email with instructions and a message in their mobile app
More people will join the alpha on a rolling basis until all Plus users have access in the fall
Advanced Voice Mode processes audio natively in a single model instead of chaining separate speech-to-text and text-to-speech models, reducing latency
Understands emotional tones like sadness or excitement
Why it matters: This long-awaited upgrade marks a leap in AI conversational abilities, bringing chatbots closer to human-like interactions. The gradual rollout reflects a cautious, safety-first approach to deploying the new technology.
META
Meta unveils SAM 2: The GPT-4 of Computer Vision
The Summary: Meta has released Segment Anything 2 (SAM 2), an advanced AI model that can identify and isolate objects in real time in both images and videos. Building on the original SAM, this new version can track and segment objects across video frames. Meta is open-sourcing the model, code, and a large video dataset called SA-V. SAM 2 achieves state-of-the-art performance.
Key details:
Trained on 50,900 videos with 642,600 mask annotations
Uses a new "memory attention" mechanism to track objects across frames
The model runs at 44 frames per second for real-time performance
Meta created a new data annotation system that is 8.4x faster than previous methods
Why it matters: SAM 2 represents a breakthrough in computer vision capabilities, especially for video. Its ability to isolate objects across video frames could enable many new applications in augmented reality, robotics, and video editing. By open-sourcing the model and dataset, Meta is accelerating research in this important area of AI.
META
Meta launches AI Studio to create custom AI characters
The Summary: Meta is releasing AI Studio, allowing anyone to create and share custom AI characters on Instagram, Messenger, WhatsApp, and the web. Users can design AIs for various purposes, from generating memes to offering travel advice. Instagram creators can now use AI to respond to common fan questions. The tool is powered by Meta Llama 3.1 model and is currently available in the US.
Key details:
Users can customize AI name, personality, tone, avatar, and tagline
Creator AIs can auto-reply to DMs and story comments
AI responses are clearly labeled for transparency
Meta Llama 3.1 model powers the AI characters
Why it matters: This move brings AI character creation to social media. It helps popular Instagram creators manage high message volumes. However, the effectiveness and user adoption of AI characters in social media remain to be seen.
QUICK NEWS
Quick news
Canva acquires Leonardo image AI platform
Apple releases technical details of its foundational models for iOS 18
Mark Zuckerberg and Jensen Huang chat at SIGGRAPH 2024
TOOLS
🥇 New tools
That’s all for today!
If you liked the newsletter, share it with your friends and colleagues by sending them this link: https://thesummary.ai/