Google CEO Sundar Pichai speaks at the Google I/O developer conference.
Andrej Sokolow | Picture Alliance | Getty Images
Google on Tuesday hosted its annual I/O developer conference, where it announced a range of artificial intelligence products, from new search and chat features to AI hardware for cloud customers. The announcements show how the company is racing to roll out new AI tools to fend off competitors in the space, such as OpenAI.
Many of the features or tools Google announced are only in testing or limited to developers, but they give an idea of how Google is thinking about AI and what it’s working on. Google makes money from AI by charging developers who use its models and from customers who pay for Gemini Advanced, its competitor to ChatGPT, which costs $19.99 per month and can help users summarize PDFs, Google Docs and more.
Tuesday’s announcements follow similar events held by its AI competitors. Earlier this month, Amazon-backed Anthropic announced its first-ever enterprise offering and a free iPhone app. Meanwhile, OpenAI on Monday launched a new AI model and desktop version of ChatGPT, along with a new user interface.
Here’s what Google announced.
Gemini AI updates
Google introduced Gemini 1.5 Flash, a new AI model the company said is more cost-effective and designed for smaller tasks such as quickly summarizing conversations, captioning images and videos, and pulling data from large documents.
Google CEO Sundar Pichai highlighted improvements to Gemini’s translations, adding that the model will be available to developers worldwide and, in 35 languages, to users of the company’s Gemini Advanced. Within Gmail, Gemini 1.5 Pro will analyze attached PDFs and videos, providing summaries and more, Pichai said. That means that if you missed a long email thread while on vacation, Gemini will be able to summarize it — and any attachments contained within those emails — for you.
The new Gemini updates are also helpful for searching Gmail. One example the company gave: If you’ve been comparing prices from different contractors to fix your roof and are looking for a summary to help you decide who to go with, Gemini could return three quotes along with the anticipated start dates offered in the different email threads.
Google said Gemini will eventually replace Google Assistant on Android phones, setting it up as a more powerful competitor to Apple’s Siri on the iPhone.
Google Veo, Imagen 3 and Audio Overviews
Google announced “Veo,” its latest model for generating high-definition video, and Imagen 3, its highest quality text-to-image model, which promises lifelike images and “fewer distracting visual artifacts than our prior models.”
The tools will be available for select creators on Monday and will come to Vertex AI, Google’s machine learning platform that lets developers train and deploy AI applications. Until then, there will be a waitlist.
The company also showcased “Audio Overviews,” the ability to generate audio discussions based on text input. For instance, if a user uploads a lesson plan, the chatbot can speak a summary of it. Or, if you ask it to give an example of a science problem in real life, it can do so through interactive audio.
Separately, the company showcased “Music AI Sandbox,” a range of generative AI tools for creating music and sounds from scratch based on user prompts.
Generative AI tools such as chatbots and image creators continue to have issues with accuracy, however.
Earlier this year, Google introduced the Gemini-powered image generator. Users discovered historical inaccuracies that went viral online, and the company pulled the feature, saying it would relaunch it in the coming weeks. The feature has still not been re-released.
New search features
Google is launching “AI Overviews” in Google Search on Monday in the U.S. AI Overviews show a quick summary of answers to the most complex search questions, according to Liz Reid, head of Google Search. For example, if a user searches for the best way to clean leather boots, the results page may display an “AI Overview” at the top with a multi-step cleaning process synthesized from information around the web.
The company said it plans to introduce assistant-like planning capabilities directly within Search. It said users will be able to search for something like, “Create a 3-day meal plan for a group that’s easy to prepare,” and get “a starting point with a wide range of recipes from across the web.”
As for its progress on “multimodality” — integrating more images and video into generative AI tools — Google said it will begin testing the ability for users to ask questions through video, such as filming a problem with a product they own, uploading the clip and asking the search engine what the problem is. In one example, Google demoed someone filming a broken record player while asking why it wasn’t working. Google Search identified the model of the record player and suggested it could be malfunctioning because it wasn’t properly balanced.
Another new feature in testing, called “AI Teammate,” will integrate into a user’s Google Workspace. It can build a searchable collection of work from messages, email threads, PDFs and documents. For instance, a founder preparing for a product launch could ask the AI Teammate, “Are we ready for launch?” and the assistant would provide an analysis and summary based on the information it has access to in Gmail, Google Docs and other Workspace apps.
Project Astra
Project Astra, built by Google’s DeepMind AI unit, is the company’s latest advance toward an AI assistant. It’s just a prototype for now, but you can think of it as Google’s aim to develop its own version of J.A.R.V.I.S., Tony Stark’s all-knowing AI assistant from the Marvel Universe.
In the demo video presented at Google I/O, the assistant — through video and audio, rather than a chatbot interface — was able to help the user remember where they left their glasses, review code and answer questions about what a certain part of a speaker is called, when that speaker was shown on video.
Google said that for a chatbot to be truly useful, “users can talk to it naturally and without lag or delay,” and the conversation in the demo video happened in real time, without lags. The demo followed OpenAI’s showcase on Monday of a similar audio back-and-forth conversation with ChatGPT.
Onstage, DeepMind CEO Demis Hassabis said that “getting response time down to something conversational is a difficult engineering challenge.”
Pichai said he expects Project Astra will launch in Gemini later this year.
AI hardware
Finally, Google announced Trillium, its sixth-generation TPU, or tensor processing unit — a piece of hardware integral to running complex AI operations — which will be available to Google Cloud customers in late 2024.
The TPUs aren’t meant to compete with other chips, like Nvidia’s graphics processing units. Pichai noted during I/O, for example, that Google Cloud will begin offering Nvidia’s Blackwell GPUs in early 2025.
Nvidia said in March that Google will be using the Blackwell platform for “various internal deployments and will be one of the first cloud providers to offer Blackwell-powered instances,” and that access to Nvidia’s systems will help Google offer large-scale tools for enterprise developers building large language models.
In his speech, Pichai highlighted Google’s “longstanding partnership with Nvidia.” The companies have been working together for more than a decade, and Pichai has said in the past that he expects they’ll still be working together another decade from now.