🌟 GPT-4o "Free AI for Everyone"

PLUS: Voice Mode Coming Soon

Welcome back!

Today's issue is dedicated entirely to the launch of OpenAI's GPT-4o! We dive into three summaries: its enhanced features, its performance on evaluations and benchmarks, and details on the rollout of features by user tier. Let's unpack...

Today’s Summary:

  • OpenAI launches GPT-4o

  • The Mysterious "gpt2" Model Revealed to be GPT-4o

  • Details on Availability, API, Free version, macOS desktop app

  • 3 new tools

TOP STORY

OpenAI Unveils GPT-4o "Free AI for Everyone"

The Summary: OpenAI announced the launch of GPT-4o (“o” for “omni”), their new flagship AI model. GPT-4o brings GPT-4 level intelligence to everyone, including free users. It has improved capabilities across text, vision, audio, and real-time natural voice interaction. OpenAI aims to reduce friction and make AI freely available to everyone.

Key details:

  • May remind some of the AI character Samantha from the movie "Her"

  • Unified Processing Model: GPT-4o can handle audio, vision, and text inputs and outputs seamlessly.

  • GPT-4o provides GPT-4 level intelligence but is much faster and enhances text, vision, and audio capabilities

  • Allows for natural dialogue and real-time, interruptible conversational speech without lag

  • Perceives emotion from audio and facial expressions

  • Generates expressive speech across styles like storytelling

  • Understands visuals like documents, real-time video, charts, equations in conversations

  • Offers multilingual support with real-time translation

  • Free users get GPT-4 level access; paid users get higher limits: 80 messages every 3 hours on GPT-4o and up to 40 messages every 3 hours on GPT-4 (may be reduced during peak hours)

  • GPT-4o available on API for developers to build apps at scale

  • 2x faster, 50% cheaper, and 5x higher rate limits than the previous GPT-4 Turbo model

  • A new ChatGPT desktop app for macOS was announced, with multimodal and voice capabilities

  • An iterative rollout of capabilities is planned, starting immediately with text mode for Plus users.

  • The new Voice Mode will be available in alpha in the coming weeks, initially accessible to Plus users, with plans to expand availability to Free users.

  • Progress towards the "next big thing" will be announced later.

Why it matters: GPT-4o brings advanced multimodal AI capabilities to the masses for free. With natural voice interaction, visual understanding, and the ability to collaborate seamlessly across modalities, it can redefine human-machine interaction.

“It feels like AI from the movies; and it’s still a bit surprising to me that it’s real. Getting to human-level response times and expressiveness turns out to be a big change.”

Sam Altman, OpenAI

TECHNICAL DETAILS

The Mysterious "gpt2" Model Revealed to be GPT-4o

The Summary: OpenAI has confirmed that the AI model recently seen outperforming GPT-4 Turbo in the LMSys arena under the name "gpt2" was indeed GPT-4o. It achieved an impressive Elo rating of 1310, significantly surpassing GPT-4 Turbo's 1253 and placing it at the top of the leaderboard.

Source: OpenAI

Key details:

  • Elo ratings map non-linearly to win probability, so differences matter more than they look: GPT-4o's 57-point lead over GPT-4 Turbo is a substantial performance gap (see the quick calculation after this list).

  • GPT-4o achieves human-like audio response times, with an average lag of 320 ms.

  • Matches GPT-4 Turbo performance on English text and coding tasks; outperforms GPT-4 Turbo on non-English languages, vision, and audio tasks.

  • Sets new state-of-the-art records on benchmarks like speech recognition, translation, and multilingual exams.

  • Utilizes a new tokenizer for better compression of non-English languages (a token-count sketch follows at the end of this section).

  • Incorporates new safety measures like filtered training data and output refinement.

  • Extensively evaluated for risks like cybersecurity threats and misinformation.
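
For context on the ratings cited above, here is a minimal sketch of the standard Elo expected-score formula applied to the two numbers quoted (1310 vs. 1253). The function name and the head-to-head framing are our own illustration, not OpenAI's or LMSys's methodology.

```python
# Standard Elo expected-score formula: E = 1 / (1 + 10^((R_b - R_a) / 400)).
# Used here purely to illustrate what a 57-point gap means in practice.

def elo_expected_score(r_a: float, r_b: float) -> float:
    """Probability that a player rated r_a beats a player rated r_b (ties ignored)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

if __name__ == "__main__":
    gpt_4o, gpt_4_turbo = 1310, 1253  # LMSys arena Elo ratings quoted above
    p = elo_expected_score(gpt_4o, gpt_4_turbo)
    print(f"Expected head-to-head win rate for GPT-4o vs GPT-4 Turbo: {p:.1%}")
    # A 57-point gap works out to roughly a 58% expected win rate.
```

In other words, on head-to-head prompts the 57-point gap means GPT-4o would be preferred roughly 58% of the time, a sizable margin at the top of the leaderboard.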

Source: OpenAI (Speech Recognition Performance)

Source: OpenAI (Vision Understanding Performance)
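
As a rough illustration of the tokenizer bullet above, the sketch below compares token counts for the same non-English sentence under the older cl100k_base encoding (GPT-4 / GPT-4 Turbo) and the newer o200k_base encoding used by GPT-4o. It assumes a recent version of the tiktoken library that ships o200k_base; the sample sentence is our own, and the exact savings vary by language and text.

```python
# Requires: pip install tiktoken (a recent release that includes o200k_base).
import tiktoken

old_enc = tiktoken.get_encoding("cl100k_base")  # GPT-4 / GPT-4 Turbo tokenizer
new_enc = tiktoken.get_encoding("o200k_base")   # GPT-4o tokenizer

# Sample Hindi sentence ("Hello, how are you?"), chosen only as an illustration.
text = "नमस्ते, आप कैसे हैं?"

print("cl100k_base tokens:", len(old_enc.encode(text)))
print("o200k_base tokens: ", len(new_enc.encode(text)))
# For many non-English texts, o200k_base yields noticeably fewer tokens,
# which translates into lower cost and more usable context.
```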

ROLLOUT

Details on availability, API, Free version, macOS desktop app

The Summary: GPT-4o is rolling out to ChatGPT's paid tiers first, with free users subject to usage limits. The model is also available via the API. A ChatGPT desktop app for macOS featuring voice interactions is launching as well.

Key details:

  • Rolling out first to ChatGPT Plus starting today, and later to ChatGPT Free users with usage limits (chats switch back to GPT-3.5 once the GPT-4o limit is reached). The new Voice Mode will launch soon; the existing voice model is used until then.

  • The Free tier will also get limited access to advanced tools like data analysis, file uploads, vision and GPTs.

  • Available in the OpenAI API as the gpt-4o model for the Chat Completions, Assistants, and Batch APIs, with a 128k-token context window (a short API sketch follows at the end of this section).

  • ChatGPT Plus users get up to 80 messages/3 hrs. Higher limits for Team users. An Enterprise plan is coming soon.

  • The macOS desktop app is launching for Plus users first, with Free users to follow; a Windows desktop app is coming later in 2024.

  • Redesigned ChatGPT interface with new home screen.

  • GPT-4o's knowledge cutoff is October 2023.

Source: OpenAI (macOS desktop app)
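
To make the API bullet above concrete, here is a minimal sketch of a multimodal Chat Completions call against the gpt-4o model using the OpenAI Python SDK. It assumes an OPENAI_API_KEY environment variable is set; the prompt and image URL are placeholders of our own, not part of the announcement.

```python
# Requires: pip install openai (v1.x SDK); expects OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

# A single request can mix text and image inputs, reflecting GPT-4o's
# unified handling of modalities. The image URL below is a placeholder.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What trend does this chart show?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/chart.png"}},
            ],
        }
    ],
    max_tokens=300,
)

print(response.choices[0].message.content)
```

The same gpt-4o model name applies to the Assistants and Batch APIs mentioned above.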

TOOLS

🥇 New tools

That’s all for today!

If you liked this newsletter, share it with your friends and colleagues by sending them your invite link: {{rp_refer_url}}.