🌟 GPT-4o "Free AI for Everyone"

PLUS: Voice Mode Coming Soon

Welcome back!

Today's issue is dedicated entirely to the launch of OpenAI's GPT-4o! We dive into three summaries: its enhanced features, its performance on evaluations and benchmarks, and details on the rollout of features by user tier. Let's unpack...

Today’s Summary:

  • OpenAI launches GPT-4o

  • The Mysterious "gpt2" Model Revealed to be GPT-4o

  • Details on Availability, API, Free version, macOS desktop app

  • 3 new tools

TOP STORY

OpenAI Unveils GPT-4o "Free AI for Everyone"

The Summary: OpenAI announced the launch of GPT-4o (“o” for “omni”), their new flagship AI model. GPT-4o brings GPT-4 level intelligence to everyone, including free users. It has improved capabilities across text, vision, audio, and real-time natural voice interaction. OpenAI aims to reduce friction and make AI freely available to everyone.

Key details:

  • May remind some of the AI character Samantha from the movie "Her"

  • Unified Processing Model: GPT-4o can handle audio, vision, and text inputs and outputs seamlessly.

  • GPT-4o provides GPT-4 level intelligence but is much faster and enhances text, vision, and audio capabilities

  • Allows for natural dialogue and real-time, interruptible conversational speech without lag

  • Perceives emotion from audio and facial expressions

  • Generates expressive speech across styles like storytelling

  • Understands visuals like documents, real-time video, charts, equations in conversations

  • Offers multilingual support with real-time translation

  • Free users get GPT-4 level access; paid users get higher limits: 80 messages every 3 hours on GPT-4o and up to 40 messages every 3 hours on GPT-4 (may be reduced during peak hours)

  • GPT-4o available on API for developers to build apps at scale

  • 2x faster, 50% cheaper, and 5x higher rate limits than the previous GPT-4 Turbo model

  • A new ChatGPT desktop app for macOS was announced, with multimodal and voice capabilities

  • An iterative rollout of capabilities is planned, starting immediately with text mode for Plus users.

  • The new Voice Mode will be available in alpha in the coming weeks, initially accessible to Plus users, with plans to expand availability to Free users.

  • Progress towards the "next big thing" will be announced later.

Why it matters: GPT-4o brings advanced multimodal AI capabilities to the masses for free. With natural voice interaction, visual understanding, and the ability to collaborate seamlessly across modalities, it can redefine human-machine interaction.

“It feels like AI from the movies; and it’s still a bit surprising to me that it’s real. Getting to human-level response times and expressiveness turns out to be a big change.”

Sam Altman, OpenAI

TECHNICAL DETAILS

The Mysterious "gpt2" Model Revealed to be GPT-4o

The Summary: OpenAI has confirmed that the AI model recently seen outperforming GPT-4 Turbo in the LMSys arena under the name "gpt2" was indeed GPT-4o. It achieved an impressive Elo rating of 1310, significantly surpassing GPT-4 Turbo's 1253 and placing it at the top of the leaderboard.

Source: OpenAI

Key details:

  • Elo ratings map non-linearly to win probability, so differences matter more than they look: GPT-4o's 57-point lead over GPT-4 Turbo is a substantial performance gap (see the quick calculation after this list).

  • GPT-4o achieves human-like audio response times, with an average lag of 320 ms.

  • Matches GPT-4 Turbo performance on English text and coding tasks; outperforms GPT-4 Turbo on non-English languages, vision, and audio tasks.

  • Sets new state-of-the-art records on benchmarks like speech recognition, translation, and multilingual exams.

  • Utilizes a new tokenizer for better compression of non-English languages (a token-count sketch follows at the end of this section).

  • Incorporates new safety measures like filtered training data and output refinement.

  • Extensively evaluated for risks like cybersecurity threats and misinformation.
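
For context on the ratings cited above, here is a minimal sketch of the standard Elo expected-score formula applied to the two numbers quoted (1310 vs. 1253). The function name and the head-to-head framing are our own illustration, not OpenAI's or LMSys's methodology.

```python
# Standard Elo expected-score formula: E = 1 / (1 + 10^((R_b - R_a) / 400)).
# Used here purely to illustrate what a 57-point gap means in practice.

def elo_expected_score(r_a: float, r_b: float) -> float:
    """Probability that a player rated r_a beats a player rated r_b (ties ignored)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

if __name__ == "__main__":
    gpt_4o, gpt_4_turbo = 1310, 1253  # LMSys arena Elo ratings quoted above
    p = elo_expected_score(gpt_4o, gpt_4_turbo)
    print(f"Expected head-to-head win rate for GPT-4o vs GPT-4 Turbo: {p:.1%}")
    # A 57-point gap works out to roughly a 58% expected win rate.
```

In other words, on head-to-head prompts the 57-point gap means GPT-4o would be preferred roughly 58% of the time, a sizable margin at the top of the leaderboard.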

Source: OpenAI (Speech Recognition Performance)

Source: OpenAI (Vision Understanding Performance)
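
As a rough illustration of the tokenizer bullet above, the sketch below compares token counts for the same non-English sentence under the older cl100k_base encoding (GPT-4 / GPT-4 Turbo) and the newer o200k_base encoding used by GPT-4o. It assumes a recent version of the tiktoken library that ships o200k_base; the sample sentence is our own, and the exact savings vary by language and text.

```python
# Requires: pip install tiktoken (a recent release that includes o200k_base).
import tiktoken

old_enc = tiktoken.get_encoding("cl100k_base")  # GPT-4 / GPT-4 Turbo tokenizer
new_enc = tiktoken.get_encoding("o200k_base")   # GPT-4o tokenizer

# Sample Hindi sentence ("Hello, how are you?"), chosen only as an illustration.
text = "नमस्ते, आप कैसे हैं?"

print("cl100k_base tokens:", len(old_enc.encode(text)))
print("o200k_base tokens: ", len(new_enc.encode(text)))
# For many non-English texts, o200k_base yields noticeably fewer tokens,
# which translates into lower cost and more usable context.
```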

ROLLOUT

Details on availability, API, Free version, macOS desktop app

The Summary: GPT-4o is rolling out to ChatGPT's paid tiers first, with free users subject to usage limits. The model is also available via the API. A ChatGPT desktop app for macOS featuring voice interactions is launching as well.

Key details:

  • Rolling out first to ChatGPT Plus starting today, and later to ChatGPT Free users with usage limits (chats switch back to GPT-3.5 once the GPT-4o limit is reached). The new Voice Mode will launch soon; the existing voice model is used until then.

  • The Free tier will also get limited access to advanced tools like data analysis, file uploads, vision and GPTs.

  • Available in the OpenAI API as the gpt-4o model for the Chat Completions, Assistants, and Batch APIs, with a 128k-token context window (a short API sketch follows at the end of this section).

  • ChatGPT Plus users get up to 80 messages/3 hrs. Higher limits for Team users. An Enterprise plan is coming soon.

  • The macOS desktop app is launching for Plus users first, with Free users to follow; a Windows desktop app is coming later in 2024.

  • Redesigned ChatGPT interface with new home screen.

  • GPT-4o's knowledge cutoff is October 2023.

Source: OpenAI (macOS desktop app)
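
To make the API bullet above concrete, here is a minimal sketch of a multimodal Chat Completions call against the gpt-4o model using the OpenAI Python SDK. It assumes an OPENAI_API_KEY environment variable is set; the prompt and image URL are placeholders of our own, not part of the announcement.

```python
# Requires: pip install openai (v1.x SDK); expects OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

# A single request can mix text and image inputs, reflecting GPT-4o's
# unified handling of modalities. The image URL below is a placeholder.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What trend does this chart show?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/chart.png"}},
            ],
        }
    ],
    max_tokens=300,
)

print(response.choices[0].message.content)
```

The same gpt-4o model name applies to the Assistants and Batch APIs mentioned above.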

TOOLS

🥇 New tools

That’s all for today!

If you liked this newsletter, share it with your friends and colleagues by sending them your invite link: {{rp_refer_url}}.