The Summary AI
💎 Google Gems Custom AI Assistants
PLUS: AI Simulates Doom Without a Game Engine
Welcome back!
Google’s latest updates bring powerful new features: "Gems" (customizable assistants designed to boost productivity), enhanced image generation, and AI-powered note-taking in Google Meet. Google is making AI more accessible for everyday tasks. Let’s dive in…
Today’s Summary:
💎 Google releases “Gems” customizable assistants
👁️ Alibaba’s Qwen2-VL analyzes 20-minute videos
🎮 AI simulates Doom without a game engine
🔒 California passes AI “kill switch” bill
🕺 Meta releases Sapiens for human pose vision tasks
💰 OpenAI eyes $100B valuation in new funding round
🚀 Llama models see 10x growth in usage
2 new tools
TOP STORY
Google unveils Gemini upgrades with Customizable Assistants and Google Meet AI note-taking
The Summary: Google released new AI updates across its product lineup. Gemini Advanced users can now create custom assistants called "Gems" for specialized tasks, similar to OpenAI’s Custom GPTs. The new Imagen 3 model has been added to Gemini, offering high-quality image generation capabilities that outperform DALL-E 3. Additionally, Google Meet introduces AI-powered note-taking, automating meeting summaries for Workspace customers.
Key details:
Gems allow users to create personalized AI experts for tasks like brainstorming, learning, code review, resume editing, and language tutoring
Imagen 3 enhances Gemini with superior image generation capabilities
Google Meet’s "Take notes for me" summarizes meetings
These updates are rolling out over the next few days to Google Workspace tiers and to Gemini Advanced, Business, and Enterprise users
Why it matters: Google continues to release task-specific features that enhance productivity. “Gems” streamline specialized tasks, while Imagen 3 enables creation of high-quality designs and marketing imagery. The automated note-taking in Google Meet could be a big deal for corporate meetings.
OPEN SOURCE
Alibaba Qwen2-VL analyzes 20-minute videos
The Summary: Alibaba Cloud has unveiled Qwen2-VL, a vision model capable of analyzing videos over 20 minutes long. It can also answer questions about photos and engage in real-time webcam dialogue. Qwen2-VL comes in three sizes: the smaller 7B and 2B versions are publicly released, while the largest 72B version is available via API and is due to be open-sourced soon.
Key details:
72B version beats GPT-4o-vision and Claude 3.5 Sonnet-vision on benchmarks
Qwen2-VL can process images at various resolutions
Supports multiple languages including English, Chinese, and several European and Asian languages
Introduces Multimodal Rotary Position Embedding (M-RoPE) for integrating 1D text, 2D visual, and 3D video information
Why it matters: Qwen2-VL's ability to analyze videos and handle diverse visual inputs expands the capabilities of open-source tools for visual content understanding. It could lead to the development of novel AI applications, such as UI vision tasks on smartphones and robotic operations based on visual input.
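For readers curious how one position scheme can cover 1D text, 2D images, and 3D video, here is a minimal Python sketch. It is not Alibaba's code. It assumes each token gets a (temporal, height, width) position triple: text tokens use the same index in all three slots, image patches share one temporal index, and video frames count the temporal index upward. The function names and the rule for where the next segment starts are our own illustrative assumptions.

```python
from typing import List, Tuple

PositionId = Tuple[int, int, int]  # (temporal, height, width)


def text_positions(num_tokens: int, start: int = 0) -> List[PositionId]:
    """Text: all three components share one index, like ordinary 1D positions."""
    return [(start + i, start + i, start + i) for i in range(num_tokens)]


def visual_positions(frames: int, rows: int, cols: int, start: int = 0) -> List[PositionId]:
    """Image (frames=1) or video: one temporal id per frame, height/width ids per patch."""
    return [
        (start + t, start + r, start + c)
        for t in range(frames)
        for r in range(rows)
        for c in range(cols)
    ]


def next_start(positions: List[PositionId]) -> int:
    """Assumed convention: the next segment starts one past the largest id used so far."""
    return max(max(p) for p in positions) + 1


# A 3-token prompt followed by a 2-frame video split into 2x2 patches.
prompt = text_positions(3)
video = visual_positions(frames=2, rows=2, cols=2, start=next_start(prompt))
```

In this toy layout the prompt occupies positions 0 to 2 and the first video patch sits at (3, 3, 3). The triples would then feed a rotary embedding, which is left out here.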
RESEARCH
Neural network simulates Doom without a game engine
The Summary: Google researchers have developed GameNGen, a neural network capable of generating playable Doom gameplay without a traditional game engine. It produces high-quality, interactive Doom gameplay at 20 fps, using only a diffusion model to predict each frame based on a sequence of past frames and user actions.
Key details:
First AI to fully simulate a complex video game with graphics and interactivity
Runs on a single Tensor Processing Unit (TPU) at 20 fps
Human raters struggled to distinguish AI-generated clips from real gameplay
Uses a modified Stable Diffusion 1.4 model trained on RL agent gameplay footage
Why it matters: This breakthrough shows the potential of AI models to simulate complex environments. Over time, such research could lead to entirely new types of adaptive, AI-generated gameplay experiences.
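The "predict each frame from past frames and user actions" idea can be sketched as a simple loop in Python. This is a toy illustration, not GameNGen itself: `toy_model` is a placeholder standing in for the diffusion model, frames are short lists of numbers rather than images, and the context length of 4 is an arbitrary value we chose for the example.

```python
from collections import deque
from typing import Callable, Deque, List, Sequence

Frame = List[float]  # toy stand-in for an image
Model = Callable[[Sequence[Frame], Sequence[int]], Frame]


def rollout(model: Model, first_frame: Frame, actions: Sequence[int], context: int = 4) -> List[Frame]:
    """Generate one frame per action, feeding each prediction back in as context."""
    past_frames: Deque[Frame] = deque([first_frame], maxlen=context)
    past_actions: Deque[int] = deque(maxlen=context)
    generated: List[Frame] = []
    for action in actions:
        past_actions.append(action)
        frame = model(list(past_frames), list(past_actions))
        past_frames.append(frame)  # the model's own output becomes future input
        generated.append(frame)
    return generated


def toy_model(frames: Sequence[Frame], actions: Sequence[int]) -> Frame:
    """Placeholder for the diffusion model: shifts the last frame by the last action."""
    return [pixel + actions[-1] for pixel in frames[-1]]


frames = rollout(toy_model, first_frame=[0.0, 1.0], actions=[1, 1, -1])
```

The point of the sketch is the feedback loop: there is no game state outside the window of recent frames and actions, so everything the "game" remembers has to be visible in that window.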
QUICK NEWS
California legislature passes “kill switch” AI safety bill
Meta releases Sapiens models for human-oriented vision tasks
OpenAI in talks for a new funding round at $100B valuation
Usage of Llama models grew 10x from January to July 2024
That’s all for today!
If you liked the newsletter, share it with your friends and colleagues by sending them this link: https://thesummary.ai/