🚀 OpenAI Reveals o3: AGI in Sight?

PLUS: Dial 1-800-CHATGPT for Free

Welcome back!

OpenAI announced o3, a new breakthrough in AI with jaw-dropping performance on advanced benchmarks, edging closer to AGI-like capabilities. Could this be the AGI leap we’ve been waiting for? Let’s unpack…

Today’s Summary:

  • 🚀 OpenAI reveals o3 breakthrough

  • 📞 ChatGPT now available by phone

  • 🧠 Google debuts "Gemini Flash Thinking"

  • 🎞️ Meta unveils Apollo for video understanding

  • 🖥️ OpenAI integrates ChatGPT into new macOS applications

  • 🔍 Google tests new AI Search Mode

  • 🛠️ 2 new tools

TOP STORY

OpenAI unveils o3 with mind-bending results

The Summary: OpenAI's new o3 model family marks a dramatic leap in AGI-like capabilities, with scores that dwarf previous results. The model achieves near-human performance on the ARC-AGI test and solves graduate-level math problems using a "private chain of thought" system that adjusts reasoning time based on task complexity. OpenAI plans a public release in January after completing safety testing.

Key details:

  • Scored 87.5% on the ARC-AGI benchmark, tripling o1's performance

  • Achieved a 2727 Codeforces rating, placing it among the top 175 competitive programmers worldwide

  • Solved 25.2% of problems on the FrontierMath benchmark, where no other model surpasses 2%

  • Base version may cost $17-20/task, with high-compute mode significantly more expensive

  • Features a "private chain of thought" system, reasoning through problems step by step before answering

Why it matters: OpenAI o3 challenges perceptions about AI limits, showing new scaling paths. But is it AGI? Experts, including ARC Prize founder François Chollet, say not yet – as it still struggles with some simple tasks. True AGI will exist when no tasks remain that are easy for humans but difficult for AI. The o3 model’s impressive capabilities seem to mainly target high-stakes domains like advanced science, math, and programming, where its higher costs may be justified.

"This is not merely incremental improvement, but a genuine breakthrough, marking a qualitative shift in AI capabilities."

François Chollet - ARC Prize

OPENAI

OpenAI launches phone access for ChatGPT

The Summary: OpenAI has launched 1-800-CHATGPT, a phone service that allows users in the US and Canada to call and talk to ChatGPT for free, up to 15 minutes per month. International users can also message ChatGPT via WhatsApp. This initiative aims to make AI more accessible, reducing the need for smartphones or high-speed internet.

Key details:

  • US and Canada users can dial 1-800-CHATGPT (1-800-242-8478) for 15 minutes of free AI interaction per month

  • Global users can access ChatGPT via WhatsApp, powered by GPT-4o mini

  • Developed during OpenAI’s Hack Week, the service even works with rotary phones

  • OpenAI confirmed that no user voice data will be used to train its models
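
For the curious, the digits in parentheses above follow from the standard phone keypad, where each letter maps to a number key. Here is a minimal Python sketch of that mapping; the function name `vanity_to_digits` is made up for illustration and has nothing to do with OpenAI's service.

```python
# Letter groups printed on a standard telephone keypad.
KEYPAD = {
    "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ",
}

# Invert the table so each letter points at its digit.
LETTER_TO_DIGIT = {
    letter: digit for digit, letters in KEYPAD.items() for letter in letters
}


def vanity_to_digits(number: str) -> str:
    """Replace each letter with its keypad digit; keep digits and dashes as-is."""
    return "".join(LETTER_TO_DIGIT.get(ch.upper(), ch) for ch in number)


print(vanity_to_digits("1-800-CHATGPT"))
```

Running it turns CHATGPT into 2428478, which matches the 242-8478 in the number listed above.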

Why it matters: Phone access removes barriers like apps and accounts, making AI more approachable and practical for everyday use. This expansion reaches new audiences, including those without reliable internet or advanced devices.

GOOGLE

Google debuts “Gemini 2.0 Flash Thinking” reasoning model

The Summary: Google announced “Gemini 2.0 Flash Thinking”, an early-stage model that solves complex tasks by reasoning step by step and thinking longer before giving its final answer. The model builds on the Gemini 2.0 Flash architecture and arrives amid rising interest in AI systems capable of multi-step reasoning.

Key details:

  • Processes up to 32,000 tokens (50-60 pages) of input and generates 8,000 tokens per response

  • LM Arena ranks it #1 across all LLM categories, though testing didn’t include OpenAI’s full o1 model

  • Developed with input from Transformer paper co-author Noam Shazeer, brought back by Google for this project

  • Prioritizes runtime reasoning, using additional compute at query time instead of relying solely on pre-training

Why it matters: “Gemini 2.0 Flash Thinking” is Google’s response to OpenAI’s o1, emphasizing runtime reasoning to solve complex tasks. This approach reflects a broader industry pivot to overcome the diminishing returns of scaling training data, focusing instead on smarter problem-solving methods.

QUICK NEWS

Quick news

TOOLS

🥇 New tools

  • Findr - Unlock infinite digital memory with your AI second brain

  • Nora - 24/7 mental health companion without revealing your identity

That’s all for today!

If you liked the newsletter, share it with your friends and colleagues by sending them this link: https://thesummary.ai/