AI Overviews, the next evolution of Search Generative Experience, will launch in the US this week and in more countries soon, Google announced at the Shoreline Amphitheatre in Mountain View, CA. Google also showed off several other changes coming to Google Cloud, Gemini, Workspace, and more, including AI actions and summaries that work across apps, opening up some interesting options for small businesses.
Search will include AI Overviews
AI Overviews is the expansion of Search Generative Experience: AI-generated answers that appear at the top of Google searches. You may have already seen SGE in action, as select US users have been able to try it out since last October; it can generate images as well as text. AI Overviews brings that AI-generated information to the top of Google search results pages broadly.
With AI Overviews, “Google does the work for you,” said Liz Reid, vice president of Google Search. “Instead of gathering all the information yourself, you can ask your questions and get an answer instantly.”
By the end of the year, AI Overviews will reach more than one billion people, Reid said. Google wants Search to be able to answer “ten questions in one,” linking tasks so the AI can draw precise connections between pieces of information, a capability Google calls multi-step reasoning. For example, someone might ask not only about the best yoga studios in the area, but also about the distance between each studio and their home, and about the studios' introductory offers. All of this information will appear in convenient columns at the top of the search results.
Soon, AI Overviews will also be able to answer questions about the videos provided to it.
AI Overviews will roll out in “the coming weeks” in the US and will be available first in Search Labs.
Do AI Overviews really make Google Search more useful? Google says it will distinguish which images are generated by AI and which come from the web, but AI Overviews could dilute the usefulness of Search if the AI's answers turn out to be incorrect, irrelevant, or misleading.
Gemini 1.5 Pro gets updates, including a 2M-token context window for select users
Google's large language model, Gemini 1.5 Pro, is receiving quality improvements, and a new version, Gemini 1.5 Flash, is joining the lineup. New developer features in the Gemini API include video frame extraction, parallel function calling, and context caching. Native video frame extraction and parallel function calling are available now; context caching is expected to arrive in June.
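For developers, parallel function calling means the model can request several tool calls in a single turn. Here is a minimal sketch using the google-generativeai Python SDK; the two stub tools, the placeholder API key, and the model name are illustrative, not part of Google's announcement:

```python
# A minimal sketch of parallel function calling with the google-generativeai
# Python SDK; both tools are stubs standing in for real APIs.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # assumes a Gemini API key

def get_weather(city: str) -> str:
    """Return a weather report for a city (stub for a real API)."""
    return f"Sunny and 22 C in {city}"

def get_local_time(city: str) -> str:
    """Return the local time in a city (stub for a real API)."""
    return f"14:30 in {city}"

# Passing plain Python functions as tools lets the model request one or
# both calls in a single turn.
model = genai.GenerativeModel(
    "gemini-1.5-flash",
    tools=[get_weather, get_local_time],
)
chat = model.start_chat(enable_automatic_function_calling=True)
print(chat.send_message("What's the weather and local time in Paris?").text)
```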
Available globally today, Gemini 1.5 Flash is a smaller model focused on rapid response. Both Gemini 1.5 Pro and Gemini 1.5 Flash give users a 1M-token context window for the information they submit for the AI to analyze.
On top of that, Google is expanding Gemini 1.5 Pro's context window to 2 million tokens for select Google Cloud customers. Developers who want the larger context window can join a waitlist in Google AI Studio or Vertex AI.
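Developers working near these limits can check a request's size before sending it. A minimal sketch, assuming the google-generativeai Python SDK; the file name and prompt are illustrative:

```python
# A minimal sketch of gauging long-context usage with count_tokens before
# sending a large document; the corpus file is hypothetical.
import google.generativeai as genai

model = genai.GenerativeModel("gemini-1.5-pro")
with open("earnings_transcripts.txt") as f:  # hypothetical large document
    corpus = f.read()

print(model.count_tokens(corpus))  # confirm the request fits the 1M-token window
response = model.generate_content([corpus, "Summarize the recurring themes."])
print(response.text)
```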
The ultimate goal is “infinite context,” said Google CEO Sundar Pichai.
Gemma 2 comes in a 27B parameter size
Google's small language model, Gemma, will receive a major overhaul in June. Gemma 2 will add a 27B parameter model, answering developers' requests for a larger Gemma that is still small enough to fit inside compact projects. Google says Gemma 2 can run efficiently on a single TPU host in Vertex AI.
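Because Gemma weights are open, developers will likely be able to run Gemma 2 locally the way earlier Gemma models run today. A minimal sketch using Hugging Face transformers; the checkpoint name google/gemma-2-27b-it is an assumption, since the weights had not shipped at announcement time, and a 27B model needs substantial GPU memory:

```python
# A minimal sketch of running a Gemma 2 checkpoint with Hugging Face
# transformers; the repo id below is an assumed instruction-tuned checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2-27b-it"  # assumption: final published name may differ
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("Explain what a TPU is in one sentence.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=60)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```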
Additionally, Google released PaliGemma, a vision-language model for tasks like image captioning and answering image-based questions. PaliGemma is available now on Vertex AI.
Gemini summaries and other features are coming to Google Workspace
Google Workspace is receiving several AI improvements enabled by Gemini 1.5's long context window and multimodality. For example, users can ask Gemini to summarize long email threads or Google Meet calls. Gemini will be available in the Workspace side panel next month on desktop for businesses and consumers with the Gemini for Workspace add-ons or the Google One AI Premium plan. The Gemini side panel is available now in Workspace Labs and for Gemini for Workspace Alpha users.
Workspace and AI Advanced customers will be able to use several new Gemini features, starting with Labs users this month and becoming generally available in July:
- Summarize email threads.
- Run a Q&A session in your email inbox.
- Get longer suggested responses in Smart Reply that draw on contextual information from email threads.
Gemini 1.5 can draw connections between Workspace applications, such as Gmail and Docs. Google VP and GM of Workspace Aparna Pappu demonstrated this by showing how a small business owner could use Gemini 1.5 to pull travel receipts out of email and organize and track them in a spreadsheet. This feature, Data Q&A, will roll out to Labs users in July.
Next, Google wants to add a virtual teammate to Workspace. The virtual teammate will act as an AI coworker with an identity, a Workspace account, and a goal (but without the need for PTO). Employees will be able to ask it questions about work, and it will maintain the “collective memory” of the team it works with.
Google has not yet announced a release date for the virtual teammate, though it plans to add third-party capabilities in the future. This is just speculation, but a virtual teammate could be especially useful for businesses if it connects to CRM applications.
Voice and video capabilities coming to the Gemini app
Voice and video features are coming to the Gemini app later this year. Gemini will be able to “see” through your device's camera and respond in real time.
Users will be able to create “Gems,” custom agents to do things like act as personal writing coaches. The idea is to make Gemini “a true assistant” who can, for example, plan a trip. Gems are coming to Gemini Advanced this summer.
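Gems won't arrive until this summer, but developers can approximate the idea today with the Gemini API's system_instruction parameter. A minimal sketch, assuming the google-generativeai Python SDK; the persona text is illustrative, not Google's Gems feature itself:

```python
# A minimal sketch of a Gems-like custom persona built with a system
# instruction; this approximates, rather than implements, the Gems feature.
import google.generativeai as genai

coach = genai.GenerativeModel(
    "gemini-1.5-pro",
    system_instruction=(
        "You are a personal writing coach. Critique drafts for clarity, "
        "tone, and structure, and suggest concrete rewrites."
    ),
)
print(coach.generate_content("Review this opener: 'Our product is very good.'").text)
```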
The addition of multimodality to Gemini comes at an interesting time, just days after OpenAI's GPT-4o demo earlier this week. Both demos showed conversations that sounded very natural; OpenAI's voice assistant handled interruptions, though it misread or misinterpreted some situations.
SEE: OpenAI showed how the latest version of the GPT-4 model can respond to live video.
Imagen 3 improves text generation
Google announced Imagen 3, the next evolution of its image-generating AI. Imagen 3 aims to be better at rendering text, which has been a major weakness of AI image generators to date. Select creators can try Imagen 3 in ImageFX at Google Labs today, and Imagen 3 is coming soon for developers in Vertex AI.
Google DeepMind reveals other creative AI tools
Another creative AI product Google announced was Veo, Google DeepMind's next-generation generative video model. Veo produced an impressive demo video of a car driving through a tunnel and emerging onto a city street. Select creators can use Veo in VideoFX, an experimental tool found at labs.google.
Other creative types may want to use Music AI Sandbox, a set of generative artificial intelligence tools for making music. No public or private release dates have been announced for Music AI Sandbox.
6th-generation Trillium TPUs boost the power of Google Cloud data centers
Pichai introduced Google's sixth-generation Google Cloud TPU, called Trillium. Google claims Trillium delivers a 4.7x improvement in compute performance per chip over the previous generation. Trillium TPUs are intended to bring higher performance to Google Cloud data centers and to compete with NVIDIA's AI accelerators. Time on Trillium will be available to Google Cloud customers in late 2024. Additionally, NVIDIA Blackwell GPUs will be available on Google Cloud starting in 2025.
TechRepublic covered Google I/O remotely.