🚀 OpenAI Reveals o3: AGI in Sight?
PLUS: Dial 1-800-CHATGPT for Free
Welcome back!
OpenAI announced o3, a new breakthrough in AI with jaw-dropping performance on advanced benchmarks, edging closer to AGI-like capabilities. Could this be the AGI leap we’ve been waiting for? Let’s unpack…
Today’s Summary:
🚀 OpenAI reveals o3 breakthrough
📞 ChatGPT now available by phone
🧠 Google debuts "Gemini 2.0 Flash Thinking"
🎞️ Meta unveils Apollo for video understanding
🖥️ OpenAI integrates ChatGPT into new macOS applications
🔍 Google tests new AI Search Mode
🛠️ 2 new tools
TOP STORY
OpenAI unveils o3 with mind-bending results
The Summary: OpenAI's new o3 model family marks a dramatic leap in AGI-like capabilities, with scores that dwarf previous results. The model achieves near-human performance on the ARC-AGI test and solves graduate-level math problems using a "private chain of thought" system that adjusts reasoning time based on task complexity. OpenAI plans a public release in January after completing safety testing.
Today, we shared evals for an early version of the next model in our o-model reasoning series: OpenAI o3
— OpenAI (@OpenAI)
7:16 PM • Dec 20, 2024
Key details:
Scored 87.5% on the ARC-AGI benchmark, tripling o1's performance
Achieved a 2727 Codeforces rating, placing it among the top 175 competitive programmers worldwide
Solved 25.2% of problems on the FrontierMath benchmark, while no other model surpasses 2%
Base version may cost $17-20/task, with high-compute mode significantly more expensive
Features a "private chain of thought" system, reasoning through problems step by step before answering
Why it matters: OpenAI o3 challenges perceptions about AI limits, showing new scaling paths. But is it AGI? Experts, including ARC Prize founder François Chollet, say not yet, as it still struggles with some simple tasks. By that standard, true AGI will exist when no tasks remain that are easy for humans but difficult for AI. The o3 model’s impressive capabilities seem to mainly target high-stakes domains like advanced science, math, and programming, where its higher costs may be justified.
"This is not merely incremental improvement, but a genuine breakthrough, marking a qualitative shift in AI capabilities."
OPENAI
OpenAI launches phone access for ChatGPT
The Summary: OpenAI has launched 1-800-CHATGPT, a phone service that lets users in the US and Canada call and talk to ChatGPT for free, for up to 15 minutes per month. International users can also message ChatGPT via WhatsApp. This initiative aims to make AI more accessible, reducing the need for smartphones or high-speed internet.
Key details:
US and Canada users can dial 1-800-CHATGPT (1-800-242-8478) for 15 minutes of free AI interaction per month
Global users can access ChatGPT via WhatsApp, powered by GPT-4o mini
Developed during OpenAI’s Hack Week, the service even works with rotary phones
OpenAI confirmed that no user voice data will be used to train its models
Why it matters: Phone access removes barriers like apps and accounts, making AI more approachable and practical for everyday use. This expansion reaches new audiences, including those without reliable internet or advanced devices.
Google debuts “Gemini 2.0 Flash Thinking” reasoning model
The Summary: Google announced “Gemini 2.0 Flash Thinking”, an early-stage model that solves complex tasks by reasoning step by step and thinking longer before giving its final answer. The model builds on the Gemini 2.0 Flash architecture and arrives amid rising interest in AI systems capable of multi-step reasoning.
Key details:
Processes up to 32,000 tokens (50-60 pages) of input and generates up to 8,000 tokens per response
LM Arena ranks it #1 across all LLM categories, though testing didn’t include the full OpenAI o1 model
Developed with input from Transformer paper co-author Noam Shazeer, brought back by Google for this project
Prioritizes runtime reasoning, using additional compute at query time instead of relying solely on pre-training
Why it matters: “Gemini 2.0 Flash Thinking” is Google’s response to OpenAI’s o1, emphasizing runtime reasoning to solve complex tasks. This approach reflects a broader industry pivot to overcome the diminishing returns of scaling training data, focusing instead on smarter problem-solving methods.
QUICK NEWS
Quick news
OpenAI enhances ChatGPT integration with macOS applications
Meta unveils Apollo, an AI model aimed at fast video understanding
Google is working on a new AI Mode for Google Search
TOOLS
🥇 New tools
That’s all for today!
If you liked the newsletter, share it with your friends and colleagues by sending them this link: https://thesummary.ai/