🔥 Claude 3.5 Gets New Powers
PLUS: ChatGPT Voice Mode Released in the EU
Welcome back! Anthropic just gave Claude 3.5 the ability to control a computer via mouse, keyboard, and screenshots, just like a human. Claude's coding capabilities also received a significant boost, putting it ahead of OpenAI's GPT-4o and o1-preview on the coding benchmarks covered below. Let's unpack...
Today’s Summary:
🚀 New Claude 3.5 controls computers
🎨 Stability AI launches Stable Diffusion 3.5
🎥 Genmo releases Mochi-1 free video AI
🎙 ChatGPT Voice Mode expands to the EU
💼 Former White House official joins OpenAI as Chief Economist
📋 News Corp sues Perplexity over WSJ content
🛠 2 new tools
TOP STORY
New Claude 3.5 Sonnet can control a computer using mouse and keyboard
The Summary: Anthropic has released a new version of its Claude 3.5 Sonnet model, with improved performance and the ability to control a computer directly via mouse, keyboard and screenshots. Though still in beta and limited to API users, this feature enables Claude to interact with everyday software tools. Additionally, the “new” Claude 3.5 Sonnet has seen major improvements in coding capabilities, now surpassing OpenAI GPT-4o and o1-preview.
Key details:
Developers can now direct Claude to control computers by looking at screenshots, moving the cursor, clicking, and typing text
First AI to score 22% on OSWorld computer control benchmark, more than doubling the previous record
Claude 3.5 Sonnet (new) scores 84.2% on aider coding benchmark, surpassing OpenAI o1-preview (79.7%)
Major increase in code refactoring accuracy, from 64% to 92.1%
Outperforms OpenAI GPT-4o in most benchmarks
The smaller Claude Haiku 3.5 matches previous Claude 3 Opus performance while maintaining the speed and cost of Haiku 3
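A quick back-of-the-envelope check on the numbers above: the refactoring jump looks bigger when read as a drop in failures rather than a rise in successes. The short sketch below only reuses the percentages quoted in the key details; the variable names are ours, and it makes no claim about how the benchmarks themselves are scored.

```python
# Figures quoted in the key details above (percent scores).
refactor_before, refactor_after = 64.0, 92.1
aider_new_sonnet, aider_o1_preview = 84.2, 79.7

# Failure rate is whatever is left over from 100%.
fail_before = 100 - refactor_before
fail_after = 100 - refactor_after

# How many times fewer refactoring failures, and the aider lead in points.
failure_ratio = fail_before / fail_after
aider_gap = aider_new_sonnet - aider_o1_preview

print(f"Refactoring failures: {fail_before:.1f}% -> {fail_after:.1f}% ({failure_ratio:.1f}x fewer)")
print(f"Aider lead over o1-preview: {aider_gap:.1f} points")
```

In other words, failed refactors fall from roughly 36 in 100 to roughly 8 in 100, while the aider lead over o1-preview is 4.5 percentage points.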
Why it matters: The addition of computer control capabilities brings AI closer to interacting with software through human-like interfaces, such as using a keyboard and mouse, opening up new possibilities for automation and integration with existing software systems.
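For readers who want a feel for how the screenshot, cursor, click, and type cycle fits together, here is a minimal sketch of that kind of loop. It is not Anthropic's API: `Action`, `FakeDesktop`, `run_agent`, and `scripted_model` are all hypothetical names made up for illustration, the "screenshot" is just a text summary, and the "model" is a hard-coded script so the example runs anywhere.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Action:
    """One step requested by the model. Field names are hypothetical."""
    kind: str  # "move", "click", "type", or "done"
    position: Optional[Tuple[int, int]] = None
    text: str = ""


@dataclass
class FakeDesktop:
    """Stand-in for a real machine so the sketch runs without a GUI."""
    cursor: Tuple[int, int] = (0, 0)
    typed: str = ""
    log: List[str] = field(default_factory=list)

    def screenshot(self) -> str:
        # A real harness would return image data; a text summary is enough here.
        return f"cursor={self.cursor} clicks={len(self.log)} typed={self.typed!r}"

    def apply(self, action: Action) -> None:
        if action.kind == "move" and action.position is not None:
            self.cursor = action.position
        elif action.kind == "click":
            self.log.append(f"click at {self.cursor}")
        elif action.kind == "type":
            self.typed += action.text


def run_agent(desktop: FakeDesktop,
              choose_action: Callable[[str], Action],
              max_steps: int = 10) -> int:
    """Observe the screen, ask for an action, apply it, repeat until done."""
    for step in range(max_steps):
        action = choose_action(desktop.screenshot())
        if action.kind == "done":
            return step  # number of actions applied before stopping
        desktop.apply(action)
    return max_steps


def scripted_model(screenshot: str) -> Action:
    """Hypothetical stand-in for the model: picks the next step from the screen."""
    if "cursor=(0, 0)" in screenshot:
        return Action("move", position=(120, 45))
    if "clicks=0" in screenshot:
        return Action("click")
    if "typed=''" in screenshot:
        return Action("type", text="hello")
    return Action("done")


desktop = FakeDesktop()
steps = run_agent(desktop, scripted_model)
print(steps, desktop.cursor, desktop.typed, desktop.log)
```

The `max_steps` cap is a design choice worth keeping in any real version: an agent that acts on what it sees can get stuck repeating itself, so the loop should always have a hard stop.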
IMAGE AI
Stability AI launches Stable Diffusion 3.5 free image AI
The Summary: Stability AI has released Stable Diffusion 3.5, offering three advanced AI image models to the open-source community. Following June’s underwhelming SD3 release, this update improves quality and offers free commercial use for small businesses. Early tests show the Large model matches top competitors in quality. The models are designed to run on regular consumer hardware.
Key details:
A 4090 GPU creates 1024x1024 images in 15 seconds using SD 3.5 Large
Free commercial license for companies with less than $1M annual revenue
Models are highly stable for fine-tuning
Early users report "more creative and less over-fit" outputs compared to competitors like Flux
Why it matters: By offering advanced image models with free commercial use, Stability AI could further change the economics of AI image generation. While many providers lock their top models behind APIs, this open release could push Stability back to the forefront in areas like design and entertainment.
VIDEO AI
Genmo releases Mochi-1, free video AI for all
The Summary: Genmo has released Mochi-1, an open-source video generation model that matches proprietary platforms like Runway and Luma. With 10 billion parameters, the model generates fluid, physics-aware videos from text prompts. Available under the Apache 2.0 license, Mochi-1 runs on four H100 GPUs and produces 480p videos up to 5.4 seconds long.
Key details:
Largest open-source video model at 10B parameters - 4x more parameters dedicated to video vs text processing
Novel Asymmetric Diffusion Transformer architecture, paired with a video VAE that compresses videos to a 128x smaller size
Requires significant compute power – minimum 4 Nvidia H100 GPUs to run locally or in the cloud
Available on HuggingFace
Why it matters: Mochi-1 offers high-end video AI capabilities in the open-source space, which were previously only accessible through expensive subscriptions. The Apache 2.0 license allows researchers and companies to improve the project and build commercial applications on top of Mochi-1's technology.
QUICK NEWS
ChatGPT Advanced Voice Mode now available in the EU
OpenAI hires former White House official as Chief Economist
News Corp sues Perplexity for copying WSJ and New York Post content
TOOLS
🥇 New tools
Pixyer - Turn snapshots into studio-quality product photos
Focus Buddy - Your always-on AI executive assistant
That’s all for today!
If you liked the newsletter, share it with your friends and colleagues by sending them this link: https://thesummary.ai/