🔥 OpenAI Unveils o1 "Strawberry", Beats PhDs

PLUS: New Prompting Rules for o1

Welcome back!

OpenAI has released o1 (formerly codenamed Strawberry), a powerful AI reasoning model with remarkable performance in complex tasks, particularly in science and programming. Sam Altman called it “a new era for AI”. The o1 lineup is strong in problem-solving across STEM fields and daily applications. Let’s unpack…

Today’s Summary:

  • 🌟 o1 outperforms PhDs in science questions

  • 💡 New prompting guidelines for o1

  • 💻 o1 lineup targets STEM

  • 🎥 Adobe teases text-to-video AI tools

  • ⚛️ Oracle’s nuclear-powered AI data center

  • 🎬 Hailuo Minimax high consistency video generator

  • 🛠️ 2 tools

TOP STORY

OpenAI’s o1 release (Strawberry): a new era of AI reasoning begins

The Summary: OpenAI has launched o1, a series of innovative models with AI reasoning capabilities. o1 performs impressively in complex tasks, outperforming PhDs in science questions and ranking highly in programming competitions. The release includes o1-preview and o1-mini, now available to ChatGPT Plus, Team users, and select API users, with plans for broader access soon.

Key details:

  • o1, formerly codenamed "Strawberry", introduces a new approach to AI reasoning

  • Sam Altman describes it as the start of a new paradigm in AI reasoning

  • The o1 models use reinforcement learning to develop an internal "chain of thought" before responding, breaking down prompts and exploring multiple approaches and ideas before generating the final response

  • o1 outperforms human experts on PhD-level science questions (physics, biology, and chemistry)

  • Ranks in the 89th percentile in competitive programming questions

  • Successfully answers questions that typically challenge GPT-4o

  • The model is not flawless, with occasional errors and hallucinations

  • It generates long internal reasoning chains before responding, but only the final summary is shown to the user.

Why it matters: o1 represents a leap forward in AI's ability to handle complex reasoning. While still limited, it opens new possibilities for AI applications in fields requiring advanced problem-solving and analysis, potentially accelerating progress in scientific research, coding, and STEM-related areas.

GUIDE

o1 Prompting Guide

The Summary: OpenAI's latest o1 models perform best with simple, straightforward prompts, challenging traditional prompt engineering methods. Complex techniques, such as step-by-step instructions, may lower performance. The models respond optimally with brief, clear instructions and minimal guidance.

Example prompts:

  • What are three compounds we should consider investigating to advance research into new antibiotics? Explain your reasoning.

  • Implement Snake with HTML + JS + CSS. The entire code should be written in a single HTML block with embedded JS and CSS. Don't use any remote assets. After opening the html, user will need to hit space to start / restart the game, the snake will randomly go in one direction at the start and use "wasd" to control the direction of the snake. Make it pretty and the playground large.

Key details:

  • Keep prompts simple and direct

  • Avoid chain-of-thought prompts such as “think step by step”

  • Use delimiters, such as XML tags or section titles, to keep the parts of a prompt clearly separated

  • Limit additional context: when providing extra information or documents, include only the most relevant details to prevent the model from overcomplicating responses
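The guidelines above can be sketched in a few lines of Python. This is only an illustration, not an official OpenAI helper: the function name `build_o1_prompt`, the `max_chars_per_section` cutoff, and the sample task are all hypothetical. It keeps the instruction short and direct, wraps each piece of context in XML-style tags, and trims long context so only a limited amount reaches the model.

```python
def build_o1_prompt(task: str, context_sections: dict[str, str],
                    max_chars_per_section: int = 2000) -> str:
    """Assemble a short, direct prompt with XML-style delimiters.

    Hypothetical helper for illustration. Keys of `context_sections`
    are used as tag names, so they are assumed to be simple words
    such as "report" or "notes".
    """
    # The instruction comes first, with no "think step by step" wording.
    parts = [task.strip()]
    for name, text in context_sections.items():
        # Crude way to limit context: keep only the first N characters.
        snippet = text.strip()[:max_chars_per_section]
        parts.append(f"<{name}>\n{snippet}\n</{name}>")
    return "\n\n".join(parts)


prompt = build_o1_prompt(
    "Summarize the main risk described in the report in two sentences.",
    {"report": "Outages rose in the third quarter after the migration."},
)
print(prompt)
```

Cutting context by character count is a blunt stand-in for picking the most relevant details by hand; in practice the selection matters more than the length.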

O1 RELEASE

o1 Model Lineup

The Summary: The current ChatGPT model lineup includes o1-preview and o1-mini, each designed for specific use cases. GPT-4o remains the versatile multimodal flagship, while o1-preview offers advanced reasoning, and o1-mini provides cost-efficient STEM performance. OpenAI has hinted at future developments in both the GPT and o1 series.

Source: OpenAI

Key details:

  • GPT-4o: The versatile flagship model

  • o1-preview: Early preview of the larger o1 model, with enhanced reasoning

  • o1-mini: Cost-efficient variant optimized for STEM tasks

  • Context window: 128,000 tokens for both models; maximum output is 32,768 tokens for o1-preview and 65,536 tokens for o1-mini

  • ChatGPT Plus and Team users can access o1-preview and o1-mini with weekly rate limits (30 messages/week for o1-preview, 50/week for o1-mini). Enterprise and Edu users will get access in the coming days.

  • API Tier 5 users (who have spent more than $1,000 on API credits) can access the o1 models at a rate limit of 20 requests per minute (RPM). Pricing: $15.00 / 1M input tokens and $60.00 / 1M output tokens for o1-preview; $3.00 / 1M input tokens and $12.00 / 1M output tokens for o1-mini. Hidden “reasoning tokens” are billed as output tokens.

  • o1-mini will be available to free ChatGPT users in the future

  • Future o1 versions may be capable of reasoning for hours, days, or weeks

  • A larger version of GPT-4 is also in development, described as a "giant whale" compared to previous models
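Because hidden reasoning tokens are billed as output tokens, the cost of a request can be noticeably higher than the visible answer suggests. The rough estimate below uses the per-token prices quoted above; the function name `estimate_cost` and the token counts in the example are made up for illustration, and real usage will vary.

```python
# Prices in USD per 1M tokens, as quoted in the list above.
PRICES = {
    "o1-preview": {"input": 15.00, "output": 60.00},
    "o1-mini": {"input": 3.00, "output": 12.00},
}


def estimate_cost(model: str, input_tokens: int,
                  visible_output_tokens: int, reasoning_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper)."""
    price = PRICES[model]
    # Reasoning tokens are never shown, but they are billed as output.
    billed_output = visible_output_tokens + reasoning_tokens
    total = input_tokens * price["input"] + billed_output * price["output"]
    return total / 1_000_000


# Assumed example: 2,000 input tokens, 500 visible output tokens,
# and 3,000 hidden reasoning tokens.
for model in PRICES:
    print(f"{model}: ${estimate_cost(model, 2000, 500, 3000):.3f}")
# prints:
# o1-preview: $0.240
# o1-mini: $0.048
```

In this made-up example the reasoning tokens account for most of the bill, which is why they are worth including when budgeting API usage.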

Why it matters: OpenAI's new model lineup gives users a range of options tailored to their needs, balancing factors like reasoning ability, speed, and cost. The coexistence of the GPT and o1 series reflects a strategy to offer specialized AIs for different use cases.

QUICK NEWS

Quick news

TOOLS

🥇 New tools

That’s all for today!

If you liked the newsletter, share it with your friends and colleagues by sending them this link: https://thesummary.ai/