Google enters the lightweight AI market with Gemma

Google has launched Gemma, a family of AI models based on the same research as Gemini. Developers can't get their hands on Google's Gemini engine yet, but what the tech giant released on February 21 is a smaller, open-source model for researchers and developers to experiment with.

Although generative AI is all the rage, organizations can struggle to figure out how to apply it and demonstrate return on investment. Open source models allow them to experiment and find practical use cases.

Smaller AI models like this don't match the performance of larger ones like Gemini or GPT-4, but they are flexible enough to let organizations create custom bots for customers or employees. In particular, the fact that Gemma can run on a workstation reflects a continuing trend among generative AI makers toward giving organizations options for ChatGPT-like functionality without the heavy workload.

What is Gemma from Google?

Google Gemma is a family of generative AI models that can be used to build chatbots or tools that summarize content. Gemma models can run on a developer laptop, on a workstation, or through Google Cloud. Two sizes are available: 2 billion and 7 billion parameters.
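To give a rough sense of why models of this size can fit on a laptop or workstation, the following is a back-of-the-envelope sketch of the memory needed just to hold the weights. It assumes the nominal parameter counts quoted above (2 billion and 7 billion) and common numeric precisions; the function name and the precision table are illustrative, not something Google publishes, and real usage will be higher once activations and runtime overhead are included.

```python
# Rough weight-memory estimate. Ignores activations, KV cache, and runtime overhead.
# Assumption: nominal parameter counts of 2e9 and 7e9, as quoted in the article.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "int8": 1}


def weight_memory_gb(num_params: float, dtype: str = "bfloat16") -> float:
    """Return the approximate size of the model weights in gigabytes (10^9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


for name, params in [("Gemma 2B", 2e9), ("Gemma 7B", 7e9)]:
    for dtype in BYTES_PER_PARAM:
        print(f"{name} in {dtype}: ~{weight_memory_gb(params, dtype):.0f} GB")
```

Under these assumptions, the 2-billion-parameter model needs about 4 GB for weights at 16-bit precision and the 7-billion-parameter model about 14 GB, which is within reach of a well-equipped workstation.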

For developers, Google offers a variety of tools for working with Gemma, including toolchains for inference and supervised fine-tuning in JAX, PyTorch, and TensorFlow.

For now, Gemma only works in English.

How do I access Google Gemma?

Google Gemma can be accessed through Colab, Hugging Face, Kaggle, Google Kubernetes Engine, Vertex AI, and NVIDIA NeMo.
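As one illustration of the Hugging Face route, here is a minimal sketch that loads a Gemma checkpoint with the `transformers` library and generates text. It assumes `transformers` and PyTorch are installed, that you have accepted the Gemma terms on Hugging Face and are logged in, and that the checkpoints are published under IDs of the form `google/gemma-2b-it`; the helper names `gemma_model_id` and `generate` are hypothetical and exist only for this example.

```python
def gemma_model_id(size: str = "2b", instruction_tuned: bool = True) -> str:
    """Build a Hugging Face model ID for a Gemma checkpoint (assumed naming scheme)."""
    if size not in ("2b", "7b"):
        raise ValueError("size must be '2b' or '7b'")
    suffix = "-it" if instruction_tuned else ""
    return f"google/gemma-{size}{suffix}"


def generate(prompt: str, size: str = "2b", max_new_tokens: int = 64) -> str:
    """Download the chosen checkpoint (if not cached) and return generated text."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = gemma_model_id(size)
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # Tokenize the prompt into PyTorch tensors and generate a continuation.
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

Calling `generate("Summarize this paragraph in one sentence: ...")` would download several gigabytes of weights on first use, so it is best tried on a machine with enough memory or in a Colab or Kaggle notebook.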

Google Gemma can be accessed for free for research and development on Kaggle and through a free tier for Colab notebooks. New Google Cloud users can receive $300 in credits to use toward Gemma. Researchers can apply for Google Cloud credits of up to $500,000. In other cases, pricing and availability may depend on your organization's particular subscriptions and needs.

Because Google Gemma is open source, commercial use is permitted as long as it complies with the Terms of Service. Google also launched a Responsible Generative AI Toolkit, which developers can use to set guidelines for their AI projects.

“It's great to see Google reinforce its commitment to open source AI, and we're excited to fully support the launch with a comprehensive integration into Hugging Face,” said Hugging Face Technical Lead Philipp Schmid, Head of Platform and Community Omar Sanseviero, and Machine Learning Engineer Pedro Cuenca in a blog post.

How does Google Gemma work?

Like other generative AI models, Gemma is software that responds to natural language prompts rather than requiring conventional programming languages or commands. Google Gemma was trained on publicly available information, with personally identifiable information and “sensitive” material filtered out.

Google worked with NVIDIA to optimize Gemma for NVIDIA products, particularly by offering acceleration in NVIDIA's TensorRT-LLM, a library for large language model inference. Gemma can also be fine-tuned in NVIDIA AI Enterprise.

What are Google Gemma's main competitors?

Gemma competes with other small generative AI models intended to run on an organization's own hardware, such as Meta's open source large language models (notably Llama 2), Mistral AI's 7B model, Deci's DeciLM, and Microsoft's Phi-2.

Hugging Face noted that Gemma outperforms many other small AI models on its leaderboard, which evaluates pre-trained models on basic factual questions, common-sense reasoning, and trustworthiness. Only Llama 2 70B, a model included as a reference benchmark, scored higher than Gemma 7B. Gemma 2B, on the other hand, performed relatively poorly compared to other small, open AI models.

Google's on-device AI model, Gemini Nano, comes in 1.8B- and 3.25B-parameter versions and is designed to run on Android phones.