The Google I/O 2024 keynote was a packed Gemini fest, and CEO Sundar Pichai was right to describe it at the top as his version of The Eras Tour: specifically, the "Gemini Era."
The entire conference was about Gemini and AI; in fact, Google said "AI" 121 times during the keynote. From the introduction of a futuristic AI assistant called Project Astra that can run on a phone (and maybe glasses, one day) to Gemini being baked into almost every service and product the company offers, AI was unquestionably the big topic.
It was all enough to melt the minds of all but the most ardent LLM enthusiasts, so we've broken down the 7 biggest things Google revealed and discussed during its I/O 2024 keynote.
1. Google unveiled Project Astra, an "AI agent" for everyday life
So it turns out Google has an answer to OpenAI's GPT-4o and Microsoft's Copilot. Project Astra, dubbed an "AI agent" for everyday life, is essentially Google Lens on steroids, and it looks seriously impressive: it can understand, reason about, and respond to live video and audio.
In a recorded demonstration on a Pixel phone, a user walked around an office, feeding Astra a live view from the rear camera and asking questions on the fly. Gemini saw and understood the scene while answering each question as it was asked.
That speaks to the long-context, multimodal processing in the Gemini backend, which identifies what it sees and responds in moments. In the demonstration, Astra knew what a specific part of a speaker was and was even able to identify a neighborhood in London. It's generative, too: it quickly came up with a band name for a duo made up of a cute puppy and a stuffed animal.
It won't roll out right away, but developers and press (including us at TechRadar) will get to try it out at I/O 2024. And while Google didn't confirm anything, it gave a sneak peek of glasses running Astra, which could hint at a return for Google Glass.
Still, even as a demo at Google I/O, it's genuinely impressive and potentially very compelling. It could one day power the assistants on smartphones from Google and even Apple. It also reveals Google's real ambition for AI: a tool that can be immensely useful without demanding much effort from the person using it.
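Astra itself isn't something you can call yet, but the publicly documented Gemini API already handles the same kind of multimodal prompt. Here's a minimal sketch (not Astra's actual interface) using the google-generativeai Python SDK, where `frame.jpg` stands in for a single captured camera frame; you'd need your own API key.

```python
# A minimal, hypothetical sketch of an Astra-style "look and answer" query
# using the public google-generativeai SDK. It sends one still frame plus a
# question; real Astra works on live video and audio, which this does not.
import google.generativeai as genai
import PIL.Image

genai.configure(api_key="YOUR_API_KEY")  # assumes a Gemini API key
model = genai.GenerativeModel("gemini-1.5-pro")

frame = PIL.Image.open("frame.jpg")  # one frame standing in for a live feed
response = model.generate_content(
    [frame, "What is this part of the speaker called, and what does it do?"]
)
print(response.text)
```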
2. Google Photos got a useful AI boost from Gemini
Have you ever wanted to quickly find a specific photo you took at some point in the distant past? Maybe it's a note from a loved one, an early photo of your dog as a puppy, or even your license plate. Well, Google is making that wish come true with a major update that infuses Google Photos with Gemini. It gives the AI access to your library, lets you search it in natural language, and easily surfaces the result you're looking for.
In an onstage demonstration, Sundar Pichai showed that you can ask for your license plate, and Photos will return an image showing it along with the digits and characters that make it up. Likewise, you can ask for photos of when your child learned to swim, plus further details. It should make even the most disorganized photo libraries a little easier to search.
Google is calling this feature "Ask Photos" and will roll it out to all users in the "coming weeks." It will almost certainly be useful, and it may make people who don't use Google Photos a little jealous.
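Ask Photos doesn't have a public API, but the underlying idea, asking a multimodal model about each image, is easy to approximate. Here's a rough sketch under that assumption using the public google-generativeai SDK; the `photos` folder and the question are hypothetical.

```python
# A rough sketch of the idea behind "Ask Photos": scan a local folder and
# ask a multimodal model about each image. This is not the Google Photos
# feature or its API; it only illustrates the capability.
import pathlib
import google.generativeai as genai
import PIL.Image

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-flash")

matches = []
for path in sorted(pathlib.Path("photos").glob("*.jpg")):  # hypothetical folder
    reply = model.generate_content(
        [PIL.Image.open(path),
         "Does this photo show a car license plate? Answer yes or no."]
    )
    if reply.text.strip().lower().startswith("yes"):
        matches.append(path)

print(matches)  # photos that likely contain a license plate
```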
3. Your children's homework is now much easier thanks to NotebookLM
All parents know the horror of trying to help kids with their homework; even if you once knew the material, there's no way that knowledge is still lurking in your brain 20 years later. But Google may have made the task a lot easier thanks to an update to its NotebookLM note-taking app.
NotebookLM now has access to Gemini 1.5 Pro, and based on the demo at I/O 2024, it's now a better teacher than ever. In the demo, Google's Josh Woodward loaded a notebook full of notes on a learning topic, in this case science. With the push of a button, he created a detailed learning guide, with additional outputs including quizzes and FAQs, all drawn from the source material.
Impressive, but it was about to get a lot better. A new feature, still a prototype for now, output all of that content as audio, essentially creating a podcast-style discussion. What's more, the audio featured multiple speakers chatting naturally about the topic, in a way that would surely beat a frustrated parent trying to play teacher.
Woodward was even able to interrupt and ask a question, in this case "give us a basketball example," at which point the AI changed course and came up with clever basketball-based metaphors that made the topic accessible. Parents on the TechRadar team can't wait to try this one out.
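NotebookLM's pipeline isn't public, but the core pattern, grounding the model in your own notes and asking for structured study aids, can be approximated with the same Gemini 1.5 Pro model through the public SDK. A sketch, assuming a local `notes.txt`:

```python
# A sketch of NotebookLM-style study-guide generation via the public
# google-generativeai SDK; this is not NotebookLM itself, just the
# underlying pattern of grounding a request in source notes.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-pro")

notes = open("notes.txt", encoding="utf-8").read()  # hypothetical source notes
prompt = (
    "Using ONLY the notes below, write a short study guide, "
    "a five-question quiz with answers, and an FAQ.\n\n" + notes
)
print(model.generate_content(prompt).text)
```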
4. Soon you'll be able to search Google with a video
In a quirky onstage demonstration involving a record player, Google showed off a very impressive new Search trick. Soon you'll be able to record a video, use it as your query, and hopefully get an answer.
In this case, a Googler was wondering how to use a record player; they pressed record, filmed the unit in question while asking the question aloud, and sent it off. Google worked its Search magic and returned a text answer, which could be read aloud. It's a whole new way to search, like Google Lens for video, and clearly distinct from Project Astra's upcoming everyday AI, since the clip is recorded and then searched rather than processed in real time.
Still, it's part of a broader infusion of Gemini and generative AI into Google Search, one that aims to keep you on the results page and make it easier to get answers. Ahead of the video search demo, Google showed off a new generative experience for recipes and meals, which lets you search in natural language and get recipes or even restaurant recommendations right on the results page.
Simply put, Google is going full throttle on generative AI in Search, both in the results it returns and in the ways you can ask for them.
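The consumer video-search feature has no public interface yet, but the documented Gemini File API already accepts video uploads, which is presumably the same underlying capability. A sketch, with `record_player.mp4` as a hypothetical clip:

```python
# A sketch of asking a question about a recorded clip via the documented
# Gemini File API. The clip name and question are hypothetical; this is
# not the Google Search feature itself.
import time
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

video = genai.upload_file("record_player.mp4")
while video.state.name == "PROCESSING":  # video is processed asynchronously
    time.sleep(5)
    video = genai.get_file(video.name)

model = genai.GenerativeModel("gemini-1.5-pro")
response = model.generate_content([video, "How do I get this record player working?"])
print(response.text)
```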
5. Veo is Google's answer to OpenAI's Sora
We've been marveling at the creations of Sora, OpenAI's text-to-video tool, for the past few months, and now Google is joining the generative video party with its new tool called Veo. Like Sora, Veo can generate minute-long videos in 1080p quality, all from a simple text prompt.
That prompt can include cinematic requests, like a time-lapse or an aerial shot, and the early samples look impressive. You don't have to start from scratch, either: upload an input video along with a command and Veo can edit the clip to match your request. There's also the option to add masks and modify specific parts of a video.
The bad news? Like Sora, Veo isn't widely available yet. Google says it will be available to select creators through VideoFX, one of its experimental Labs features, "over the next few weeks." It might be a while until we see a wide rollout, but Google has promised to bring the feature to YouTube Shorts and other apps. That should make Adobe shift uncomfortably in its AI-generated chair.
6. Android got a big Gemini infusion
Much like Google's "Circle to Search" feature sits on top of apps, Gemini is now being built into the core of Android to integrate with your daily flow. As demonstrated, Gemini can now see, read, and understand what's on your phone's screen, letting it anticipate questions about whatever you're looking at.
So it can pull context from a video you're watching, anticipate a summary request when you're viewing a long PDF, or be ready for countless questions about the app you're in. Having context-aware AI built into the phone's operating system is no bad thing and could prove very useful.
Alongside the system-level Gemini integration, Gemini Nano with multimodality will launch later this year on Pixel devices. What will that enable? It should speed things up generally, but the standout feature for now is that Gemini can listen to calls and alert you in real time if they appear to be scams. That's pretty cool and builds on call screening, a long-standing Pixel feature, and it's set to be faster because more processing happens on the device instead of being sent to the cloud.
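Gemini Nano's on-device API wasn't public at the time of the keynote, so the real scam-call feature can't be reproduced, but the classification pattern behind it can be roughed out with the cloud SDK. A purely illustrative sketch with an invented transcript:

```python
# A conceptual illustration only: the real feature runs Gemini Nano
# on-device, while this sketch uses the cloud google-generativeai SDK.
# The transcript is invented for the example.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-flash")

transcript = (
    "Hi, this is your bank's security team. To protect your savings, "
    "please transfer them to the safe account number I'm about to read out."
)
verdict = model.generate_content(
    "Does this call transcript look like a scam? Reply SCAM or OK, "
    "then give one sentence of reasoning.\n\n" + transcript
)
print(verdict.text)
```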
7. Google Workspace will become much smarter
Workspace users are getting a trove of Gemini integrations and genuinely useful features that could make a big impact day to day. Within Gmail, thanks to a new Gemini side panel, you can ask for a summary of all recent conversations with a colleague; the result arrives as bullet points highlighting the most important details.
Gemini in Google Meet can give you the highlights of a meeting, or tell you what others on the call are asking. You'll no longer need to take notes during that call, which could be especially handy when it's a long one. Within Google Sheets, Gemini can help make sense of data and handle requests, such as pulling a sum or a specific data set.
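None of these Workspace hooks are public APIs, but the summarization pattern they rely on is simple to approximate with the public SDK. A sketch with an invented email thread:

```python
# A sketch of thread summarization in the style of the Gmail side panel,
# using the public google-generativeai SDK; the thread text is invented
# and this is not the Workspace integration itself.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-flash")

thread = """From: Sam -- The roofer quoted $8,000 and can start next week.
From: Alex -- Can we get a second quote before committing?
From: Sam -- Second quote is $7,200 with a two-week lead time."""

summary = model.generate_content(
    "Summarize this email thread as three bullet points, highlighting "
    "decisions made and questions still open:\n\n" + thread
)
print(summary.text)
```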
Possibly most futuristic of all is "Chip," a virtual teammate that can live in a Google Chat room and be called on for various tasks and queries. While these tools are coming to Workspace, likely first through Labs, the open question is when they'll reach regular Gmail and Drive users. Given Google's focus on AI for everyone and its strong push in Search, it's likely only a matter of time.