🚀 Fei-Fei Li's Spatial Intelligence AI Startup

PLUS: Runway video-to-video AI

Welcome back!

“AI godmother” Fei-Fei Li is redefining AI’s frontier with a new venture in 3D spatial intelligence. With the launch of World Labs, her ambitious goal is to teach AI how to interact with the physical world, which could reshape the future of AI reasoning, augmented reality and robotics. Let's unpack…

Today’s Summary:

  • 🌍 Fei-Fei Li 3D intelligence startup

  • 🎥 Runway video-to-video

  • 📊 Google DataGemma reduces AI hallucinations

  • 💬 OpenAI increases o1-mini and o1-preview limits

  • 🤖 Aloha robot ties shoelaces

  • 🖥️ Exo runs Llama 3.1 on 2 MacBooks

  • 📝 Microsoft launches Copilot Pages tool

  • 🛠️ 2 new tools

TOP STORY

AI godmother Fei-Fei Li launches 3D intelligence startup

The Summary: Fei-Fei Li, a renowned AI researcher known as the "godmother of AI," has launched World Labs, focusing on developing AI models with "spatial intelligence" to understand and interact with the 3D world. This technology could transform how AI perceives and reasons about physical environments, with applications ranging from AR/VR to robotics.

Key details:

  • Fei-Fei Li pioneered ImageNet, the dataset that revolutionized computer vision and deep learning in the 2010s

  • World Labs will train "large world models" (LWMs) using transformers

  • The 20-employee startup aims to create 3D interactive worlds as a basis for spatial reasoning

  • $230M funding led by Andreessen Horowitz, NEA, and Radical Ventures. Investors include AMD, Intel, Nvidia, and AI luminaries like Geoffrey Hinton and Jeff Dean

  • Founding team includes key researchers Ben Mildenhall, creator of Neural Radiance Fields (NeRF), and Christoph Lassner, whose work on Pulsar predated Gaussian splatting

Why it matters: World Labs' focus on spatial intelligence could bridge the gap between AI's current capabilities and human-like understanding of the physical world. This technology has the power to unlock new frontiers in scientific discovery, virtual reality and robotics, with an impact that could rival that of language models in shaping the future of AI.

“Humans have spatial intelligence. It’s an ancient ability evolved over millions of years, it’s this ability to understand and interact with the 3D world. The next frontier, which is such a hard problem to crack, is to bring AI to 3D, because the real world is 3D.”

Fei-Fei Li

TOOLS

Runway unleashes Video-to-Video AI magic

The Summary: Runway's Gen-3 Alpha AI model now offers video-to-video capabilities, allowing users to transform existing videos with AI-generated aesthetics. This feature complements its previous text-to-video and image-to-video modes. The tool is available on the web for paid subscribers.

Key details:

  • Users can upload a video and apply AI-generated aesthetics while preserving the original motion

  • Preset styles include options like turning subjects into glass or line drawings

  • Runway also launched an API so developers can integrate the Gen-3 Alpha Turbo model into their own products (see the sketch below)
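
For developers wondering what wiring that API into a pipeline might look like, here is a minimal sketch of the usual submit-then-poll pattern hosted video APIs follow. The base URL, endpoint paths, payload fields and RUNWAY_API_KEY variable are illustrative assumptions, not Runway's documented schema; treat it as the shape of an integration, not working client code.

```python
# Hypothetical sketch of calling a Gen-3 Alpha Turbo style video API.
# The base URL, endpoint paths and payload fields are assumptions for
# illustration only, not Runway's documented schema.
import os
import time

import requests

API_BASE = "https://api.example-video-provider.com/v1"  # placeholder base URL
HEADERS = {"Authorization": f"Bearer {os.environ['RUNWAY_API_KEY']}"}


def submit_video_task(prompt_text: str, source_video_url: str) -> str:
    """Submit a video-to-video generation task and return its task id."""
    resp = requests.post(
        f"{API_BASE}/video_tasks",
        headers=HEADERS,
        json={
            "model": "gen3a_turbo",            # assumed model identifier
            "prompt_text": prompt_text,        # desired aesthetic, e.g. "glass sculpture"
            "source_video": source_video_url,  # original footage whose motion is preserved
        },
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["id"]


def wait_for_result(task_id: str, poll_seconds: int = 10) -> str:
    """Poll until the task finishes and return the output video URL."""
    while True:
        resp = requests.get(f"{API_BASE}/video_tasks/{task_id}", headers=HEADERS, timeout=30)
        resp.raise_for_status()
        task = resp.json()
        if task["status"] == "SUCCEEDED":
            return task["output_url"]
        if task["status"] == "FAILED":
            raise RuntimeError(f"Task {task_id} failed")
        time.sleep(poll_seconds)


if __name__ == "__main__":
    task = submit_video_task("turn the subject into a line drawing", "https://example.com/demo.mp4")
    print(wait_for_result(task))
```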

Why it matters: This advancement in AI video editing opens new doors for content creators and filmmakers. Indie filmmakers can now achieve high-budget visual effects on a shoestring. Marketing teams can quickly prototype ad concepts, turning a simple product demo into different themed environments. Educators could turn lecture videos into engaging animated lessons.

GOOGLE

Google DataGemma grounds AI in real-world data to reduce hallucinations

The Summary: Google unveiled DataGemma, open models designed to reduce AI hallucinations by using real-world data from Data Commons. These models use retrieval techniques to access trusted statistical information, improving the accuracy and reliability of AI responses.

Key details:

  • DataGemma uses Data Commons, a knowledge graph with 240+ billion data points from trusted sources

  • Two approaches: Retrieval Interleaved Generation (RIG), which checks the statistics the model generates against Data Commons, and Retrieval Augmented Generation (RAG), which pulls relevant Data Commons data into the prompt before generating (see the sketch after this list)

  • RIG raised factual accuracy from a 5-17% baseline to about 58% in tests

  • RAG cited accurate figures in 99% of responses, though it drew incorrect inferences from them in 6-20% of cases

  • Models are available on Hugging Face and Kaggle for academic and research use
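
To make the difference between the two approaches concrete, here is a minimal sketch of each flow. The generate, extract_statistical_claims and query_data_commons helpers are hypothetical stubs standing in for the language model and the Data Commons lookup, not DataGemma's actual code; they only show where retrieval happens in RIG versus RAG.

```python
# Sketch contrasting the two grounding flows: RIG verifies statistics after the
# model drafts an answer, RAG retrieves trusted statistics before it answers.
# All helpers below are hypothetical stubs for illustration, not DataGemma code.
from dataclasses import dataclass


@dataclass
class Claim:
    value: str   # the number the model wrote, e.g. "331 million"
    query: str   # a Data Commons style query for that statistic


def generate(prompt: str) -> str:
    """Stand-in for a call to a DataGemma-style language model."""
    return "The US population is about 331 million."


def extract_statistical_claims(text: str) -> list[Claim]:
    """Stand-in for the step that spots the numbers the model produced."""
    return [Claim(value="331 million", query="Count_Person for country/USA")]


def query_data_commons(query: str) -> str:
    """Stand-in for a lookup against the Data Commons knowledge graph."""
    return "331.9 million (2021)"


def rig_answer(question: str) -> str:
    """RIG: generate first, then swap each generated statistic for the trusted value."""
    draft = generate(question)
    for claim in extract_statistical_claims(draft):
        draft = draft.replace(claim.value, f"{query_data_commons(claim.query)} [Data Commons]")
    return draft


def rag_answer(question: str) -> str:
    """RAG: retrieve trusted statistics first, then generate with them in the prompt."""
    stats = query_data_commons(question)
    return generate(f"Using only these statistics:\n{stats}\n\nAnswer: {question}")


if __name__ == "__main__":
    print(rig_answer("What is the population of the United States?"))
    print(rag_answer("What is the population of the United States?"))
```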

Why it matters: Grounding AI in verifiable data could significantly improve trust in AI systems. By addressing the critical issue of hallucinations, this research aims to build more reliable AI applications for decision-making and information retrieval.

QUICK NEWS

Quick news

  • 💬 OpenAI increases o1-mini and o1-preview limits

  • 🤖 Aloha robot ties shoelaces

  • 🖥️ Exo runs Llama 3.1 on 2 MacBooks

  • 📝 Microsoft launches Copilot Pages tool

TOOLS

🥇 New tools

  • AIPhone - AI-powered phone call app with live translation

  • Cracked - Create animations with text descriptions

That’s all for today!

If you liked the newsletter, share it with your friends and colleagues by sending them this link: https://thesummary.ai/