🔥 Claude 3.5 Gets New Powers

PLUS: ChatGPT Voice Mode Released in the EU

Welcome back! Anthropic just gave Claude 3.5 Sonnet the ability to control a computer via mouse, keyboard, and screenshots, much like a human. Claude's coding capabilities also received a significant boost, putting it ahead of OpenAI's o1-preview on the aider coding benchmark. Let's unpack...

Today’s Summary:

  • 🚀 New Claude 3.5 controls computers

  • 🎨 Stability AI launches Stable Diffusion 3.5

  • 🎥 Genmo releases Mochi-1 free video AI

  • 🎙 ChatGPT Voice Mode expands to the EU

  • 💼 Former White House official joins OpenAI as Chief Economist

  • 📋 News Corp sues Perplexity over WSJ content

  • 🛠 2 new tools

TOP STORY

New Claude version can control a computer using mouse and keyboard

The Summary: Anthropic has released a new version of its Claude 3.5 Sonnet model, with improved performance and the ability to control a computer directly via mouse, keyboard and screenshots. Though still in beta and limited to API users, this feature enables Claude to interact with everyday software tools. Additionally, the “new” Claude 3.5 Sonnet has seen major improvements in coding capabilities, now surpassing OpenAI GPT-4o and o1-preview.

Key details:

  • Developers can now direct Claude to control computers by looking at screenshots, moving the cursor, clicking, and typing text

  • First AI to score 22% on OSWorld computer control benchmark, more than doubling the previous record

  • Claude 3.5 Sonnet (new) scores 84.2% on aider coding benchmark, surpassing OpenAI o1-preview (79.7%)

  • Major increase in code refactoring accuracy, from 64% to 92.1%

  • Outperforms OpenAI GPT-4o in most benchmarks

  • The smaller Claude 3.5 Haiku matches the performance of the previous Claude 3 Opus while maintaining the speed and cost of Claude 3 Haiku

Why it matters: The addition of computer control capabilities brings AI closer to interacting with software through human-like interfaces, such as using a keyboard and mouse, opening up new possibilities for automation and integration with existing software systems.
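For readers curious what "looking at screenshots, moving the cursor, clicking, and typing" might look like in practice, here is a minimal sketch of that loop. It is an illustration only and not Anthropic's API: `FakeDesktop`, `Action`, and `scripted_model` are hypothetical stand-ins, and the "model" is a hard-coded script rather than a real call to Claude.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    """One step the model asks the harness to perform (hypothetical schema)."""
    kind: str            # "move", "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


@dataclass
class FakeDesktop:
    """Stand-in for a real screen; records what happened instead of doing it."""
    cursor: tuple = (0, 0)
    typed: str = ""
    clicks: list = field(default_factory=list)

    def screenshot(self) -> dict:
        # A real harness would return image bytes; here we return plain state.
        return {"cursor": self.cursor, "typed": self.typed, "clicks": list(self.clicks)}

    def apply(self, action: Action) -> None:
        if action.kind == "move":
            self.cursor = (action.x, action.y)
        elif action.kind == "click":
            self.clicks.append(self.cursor)
        elif action.kind == "type":
            self.typed += action.text


def scripted_model(screenshot: dict) -> Action:
    """Placeholder for the model call: picks the next action from the screenshot."""
    if screenshot["cursor"] != (120, 40):
        return Action("move", x=120, y=40)   # move to a pretend search box
    if not screenshot["clicks"]:
        return Action("click")
    if not screenshot["typed"]:
        return Action("type", text="hello")
    return Action("done")


def run_agent(desktop: FakeDesktop, max_steps: int = 10) -> list:
    """Loop: take a screenshot, ask for an action, apply it, repeat."""
    history = []
    for _ in range(max_steps):
        action = scripted_model(desktop.screenshot())
        history.append(action.kind)
        if action.kind == "done":
            break
        desktop.apply(action)
    return history


desktop = FakeDesktop()
steps = run_agent(desktop)
print(steps)
```

In a real setup, the scripted function would be replaced by a request to the model, and the fake desktop by code that captures the screen and drives the mouse and keyboard. The `max_steps` cap is a design choice in this sketch so a confused agent cannot loop forever.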

IMAGE AI

Stability AI launches Stable Diffusion 3.5 free image AI

The Summary: Stability AI has released Stable Diffusion 3.5, offering three advanced AI image models to the open-source community. Following June’s underwhelming SD3 release, this update improves quality and offers free commercial use for small businesses. Early tests show the Large model matches top competitors in quality. The models are designed to run on regular consumer hardware.

Key details:

  • A 4090 GPU creates 1024x1024 images in 15 seconds using SD 3.5 Large

  • Free commercial license for companies with less than $1M annual revenue

  • Models are highly stable for fine-tuning

  • Early users report "more creative and less over-fit" outputs compared to competitors like Flux

Why it matters: By offering advanced image models with free commercial use, Stability AI could further change the economics of AI image generation. While many providers lock their top models behind APIs, this open release could push Stability back to the forefront in areas like design and entertainment.

VIDEO AI

Genmo releases Mochi-1, free video AI for all

The Summary: Genmo has released Mochi-1, an open-source video generation model that rivals proprietary platforms like Runway and Luma. With 10 billion parameters, the model generates fluid, physics-aware videos from text prompts. Available under the Apache 2.0 license, Mochi-1 runs on four H100 GPUs and produces 480p videos up to 5.4 seconds long.

Key details:

  • Largest open-source video model at 10B parameters - 4x more parameters dedicated to video vs text processing

  • Novel Asymmetric Diffusion Transformer architecture, paired with a video encoder that compresses videos to a 128x smaller size

  • Requires significant compute power – minimum 4 Nvidia H100 GPUs to run locally or in the cloud

  • Available on HuggingFace

Why it matters: Mochi-1 offers high-end video AI capabilities in the open-source space, which were previously only accessible through expensive subscriptions. The Apache 2.0 license allows researchers and companies to improve the project and build commercial applications on top of Mochi-1's technology.

QUICK NEWS

Quick news

TOOLS

🥇 New tools

  • Pixyer - Turn snapshots into studio-quality product photos

  • Focus Buddy - Your always-on AI executive assistant

That’s all for today!

If you liked the newsletter, share it with your friends and colleagues by sending them this link: https://thesummary.ai/