🤖 OpenAI Unveils Research Agent

PLUS: Google’s New Gemini 2.0 Models

Welcome back!

OpenAI released Deep Research, an AI agent that can independently search, analyze, and compile research reports in minutes, with citation-backed insights. It’s like having a personal research analyst at your fingertips. How will this reshape research workflows? Let’s unpack…

Today’s Summary:

  • 🔍 OpenAI unveils Deep Research agent

  • 🚀 Google debuts 3 new Gemini 2.0 models

  • 🎥 ByteDance AI makes realistic videos from one image

  • 💬 Mistral AI assistant launches on iOS & Android

  • 🎯 Ex-OpenAI CTO Murati recruits OpenAI co-founder

  • 🤖 OpenAI’s o3-mini update shows internal reasoning

  • 🛠️ 2 new tools

TOP STORY

OpenAI unveils “Deep Research” agent in ChatGPT

The Summary: OpenAI has launched Deep Research, a ChatGPT agent designed for complex, multi-step research. Running on a web-optimized version of the o3 model, it autonomously searches, analyzes, and synthesizes online data to produce detailed reports with citations. Acting as a research analyst, it assists users in various fields like finance, policy, and engineering.

Source: OpenAI

Key details:

  • Available now for Pro users, with Plus, Team & Enterprise coming soon

  • Similar in concept to Gemini Deep Research, released in December 2024

  • Generates in-depth reports in 5 to 30 minutes, analyzing hundreds of online sources, PDFs, and images

  • Runs on a web-optimized version of OpenAI’s o3 model

Why it matters: This agent brings true online research capabilities to ChatGPT, helping professionals and general users compile information faster. It can save hours of manual work, making complex research tasks easier. While users should still verify critical details, this tool could change how people approach learning and research, offering valuable support for knowledge workers.

GOOGLE

Google releases three new Gemini 2.0 models

The Summary: Google has introduced three new Gemini 2.0 AI models, each tailored for different needs. The lineup includes Flash for production apps, Flash-Lite for budget-conscious developers, and Pro for complex coding tasks. Each model brings multimodal capabilities and massive context windows.

Key details:

  • Available via API and the Gemini App

  • Flash-Lite costs just $0.075 per 1M input tokens, half the price of GPT-4o mini

  • Pro model handles a massive 2M token context window

  • Pro model scores 91.8% on the MATH benchmark, ahead of earlier Gemini models

  • Pro ranks #1 on LM Arena, Flash at #3, Flash-Lite enters top 10

  • Flash achieves near-perfect OCR accuracy for image-to-text transcription

Why it matters: Google capitalizes on its massive compute infrastructure to deliver powerful and efficient API models with context windows of up to 2M tokens, making its lineup even more competitive for developers and enterprise applications.
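
For readers who want to sanity-check the pricing claim, here is a rough back-of-the-envelope sketch. It uses the Flash-Lite price quoted above; the GPT-4o mini price is not stated in this issue and is simply inferred from the "half the price" comparison, and the monthly token volume is a made-up workload for illustration.

```python
# Rough cost comparison, in USD per 1M tokens.
# FLASH_LITE_PER_M comes from the item above.
# GPT_4O_MINI_PER_M is an assumption inferred from the "half the price" claim.
FLASH_LITE_PER_M = 0.075
GPT_4O_MINI_PER_M = FLASH_LITE_PER_M * 2


def cost_usd(tokens: int, price_per_million: float) -> float:
    """Return the cost in USD of processing `tokens` at the given per-1M price."""
    return tokens / 1_000_000 * price_per_million


if __name__ == "__main__":
    monthly_tokens = 50_000_000  # hypothetical workload, not from the source
    flash_lite = cost_usd(monthly_tokens, FLASH_LITE_PER_M)
    gpt_4o_mini = cost_usd(monthly_tokens, GPT_4O_MINI_PER_M)
    print(f"Flash-Lite:  ${flash_lite:.2f} per month")
    print(f"GPT-4o mini: ${gpt_4o_mini:.2f} per month")
```

Actual bills depend on each provider's current price sheet and on how a workload splits between input and output tokens, so treat this as an estimate only.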

RESEARCH

ByteDance AI creates ultra-realistic videos from one image

The Summary: ByteDance researchers have developed OmniHuman, an AI system that turns a single photo into a lifelike video with natural movements and speech. The model supports any aspect ratio and generates full-body animations, expressive hand gestures, and facial expressions that sync with audio input. It also maintains realistic lighting and texture while preserving the subject's unique features.

Key details:

  • Requires just one photo and an audio clip to generate a video of any length

  • Works for portraits, half-body and full-body shots

  • Uses a novel "omni-conditions" approach integrating text, audio, and body movement

  • Can generate singing performances with accurate lip-sync

  • Produces natural full-body motion including hand gestures

  • This remains a research project; ByteDance has not released the model or any service

Why it matters: This technology pushes AI-generated video to new levels of realism for digital content generation. The quality gap between AI-generated and real human videos continues to narrow, raising questions about digital authenticity and trust.

QUICK NEWS

TOOLS

🥇 New tools

  • Tana - Put your notes to work with voice and AI

  • Concierge - Talk to your apps with natural language

That’s all for today!

If you liked the newsletter, share it with your friends and colleagues by sending them this link: https://thesummary.ai/