The Summary AI
🚀 OpenAI Reveals o3: AGI in Sight?

PLUS: Dial 1-800-CHATGPT for Free

The Summary AI
December 20, 2024

Welcome back!

OpenAI announced o3, a new breakthrough in AI with jaw-dropping performance on advanced benchmarks, edging closer to AGI-like capabilities. Could this be the AGI leap we’ve been waiting for? Let’s unpack…

Today’s Summary:

🚀 OpenAI reveals o3 breakthrough
📞 ChatGPT now available by phone
🧠 Google debuts "Gemini Flash Thinking"
🎞️ Meta unveils Apollo for video understanding
🖥️ OpenAI integrates ChatGPT into new macOS applications
🔍 Google tests new AI Search Mode
🛠️ 2 new tools

TOP STORY

OpenAI unveils o3 with mind-bending results

The Summary: OpenAI's new o3 model family marks a dramatic leap in AGI-like capabilities, with scores that dwarf previous results. The model achieves near-human performance on the ARC-AGI test and solves graduate-level math problems using a "private chain of thought" system that adjusts reasoning time based on task complexity. OpenAI plans a public release in January after completing safety testing.

Today, we shared evals for an early version of the next model in our o-model reasoning series: OpenAI o3
— OpenAI (@OpenAI)
7:16 PM • Dec 20, 2024

Key details:

Scored 87.5% on ARC-AGI benchmarks, tripling o1 performance
Achieved a 2727 Codeforces rating, placing it among the top 175 competitive programmers worldwide
Solved 25.2% of problems on the FrontierMath benchmark, where no other model surpasses 2%
Base version may cost $17-20/task, with high-compute mode significantly more expensive
Features a "private chain of thought" system, reasoning through problems step by step before answering

Why it matters: OpenAI o3 challenges perceptions about AI limits, showing new scaling paths. But is it AGI? Experts, including ARC Prize founder François Chollet, say not yet, as it still struggles with some simple tasks. By that standard, true AGI will exist when no tasks remain that are easy for humans but difficult for AI. The o3 model’s impressive capabilities seem to mainly target high-stakes domains like advanced science, math, and programming, where its higher costs may be justified.

"This is not merely incremental improvement, but a genuine breakthrough, marking a qualitative shift in AI capabilities."

François Chollet - ARC Prize

OPENAI

OpenAI launches phone access for ChatGPT

The Summary: OpenAI has launched 1-800-CHATGPT, a phone service that lets users in the US and Canada call and talk to ChatGPT for free, up to 15 minutes per month. International users can also message ChatGPT via WhatsApp. This initiative aims to make AI more accessible, reducing the need for smartphones or high-speed internet.

Source: OpenAI

Key details:

US and Canada users can dial 1-800-CHATGPT (1-800-242-8478) for 15 minutes of free AI interaction per month
Global users can access ChatGPT via WhatsApp, powered by GPT-4o mini
Developed during OpenAI’s Hack Week, the service even works with rotary phones
OpenAI confirmed that no user voice data will be used to train its models

Why it matters: Phone access removes barriers like apps and accounts, making AI more approachable and practical for everyday use. This expansion reaches new audiences, including those without reliable internet or advanced devices.

GOOGLE

Google debuts “Gemini 2.0 Flash Thinking” reasoning model

The Summary: Google announced “Gemini 2.0 Flash Thinking”, an early-stage model that can solve complex tasks by reasoning step by step and thinking longer before giving its final answer. The model builds on the Gemini 2.0 Flash architecture and arrives amid rising interest in AI systems capable of multi-step reasoning.

Source: Google DeepMind

Key details:

Processes up to 32,000 tokens (50-60 pages) of input and generates 8,000 tokens per response
LM Arena ranks it #1 across all LLM categories, though testing didn’t include the full OpenAI o1 model
Developed with input from Transformer paper co-author Noam Shazeer, brought back by Google for this project
Prioritizes runtime reasoning, using additional compute at query time instead of relying solely on pre-training

Why it matters: “Gemini 2.0 Flash Thinking” is Google’s response to OpenAI’s o1, emphasizing runtime reasoning to solve complex tasks. This approach reflects a broader industry pivot to overcome the diminishing returns of scaling training data, focusing instead on smarter problem-solving methods.

Try it in AI Studio

QUICK NEWS

OpenAI enhances ChatGPT integration with macOS applications
Meta unveils Apollo, AI model aimed at fast video understanding
Google is working on a new AI Mode for Google Search

TOOLS

🥇 New tools

Findr - Unlock infinite digital memory with your AI second brain
Nora - 24/7 mental health companion without revealing your identity

That’s all for today!

If you liked the newsletter, share it with your friends and colleagues by sending them this link: https://thesummary.ai/