Although almost every organization wants to take part, deploying generative AI at scale has proven to be a significant challenge for large companies and government agencies.
These organizations recognize the technology's potential to streamline processes, reduce costs and improve supply chains, but concerns about cost, complexity, security, data privacy, model ownership and regulatory compliance have acted as barriers to adoption.
In a possible breakthrough, SoftBank-backed SambaNova Systems has announced the launch of Samba-1, which it describes as the first trillion-parameter generative AI model for the enterprise. Powered by the SambaNova Suite, Samba-1 is designed to meet enterprise requirements for performance, accuracy, scalability, and total cost of ownership (TCO). The company also promises a 90% reduction in inference costs, although this claim should be approached with caution.
Building the 'AI iPhone'
Unlike other models of this scale, which are built as single, monolithic entities, Samba-1 uses a Composition of Experts (CoE) architecture. This system aggregates multiple smaller “expert” models into one solution that behaves like a single large model. According to SambaNova, this approach offers broader knowledge across topics, high precision and multimodality.
The CoE model can also reportedly provide greater insight and accuracy in specialized domains than other large models. Smaller individual models can be trained for specific domains, such as finance, law, physics, or biology, and then added to the CoE, providing high accuracy in that domain without the need to retrain the entire trillion-parameter model.
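The idea of composing domain experts behind a single interface can be illustrated with a minimal Python sketch. This is not SambaNova's implementation: the `Expert` and `CompositionOfExperts` classes, the keyword-based router, and the 7-billion parameter sizes are all hypothetical, chosen only to show how a new expert can be added without touching the existing ones. A production router would presumably be a learned model rather than a keyword match.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet


@dataclass(frozen=True)
class Expert:
    """A stand-in for one small domain model (hypothetical structure)."""
    name: str
    keywords: FrozenSet[str]        # toy routing signal, not a real router
    params_billions: float          # hypothetical size, for illustration only
    answer: Callable[[str], str]    # placeholder for the model's inference call


class CompositionOfExperts:
    """Presents many experts to the caller as if they were one model."""

    def __init__(self, fallback: Expert) -> None:
        self._experts: Dict[str, Expert] = {}
        self._fallback = fallback

    def add_expert(self, expert: Expert) -> None:
        # Adding a domain does not retrain or modify the existing experts.
        self._experts[expert.name] = expert

    def route(self, prompt: str) -> Expert:
        # Pick the expert whose keywords overlap most with the prompt;
        # fall back to the general expert when nothing matches.
        words = set(re.findall(r"[a-z]+", prompt.lower()))
        best, best_score = self._fallback, 0
        for expert in self._experts.values():
            score = len(expert.keywords & words)
            if score > best_score:
                best, best_score = expert, score
        return best

    def generate(self, prompt: str) -> str:
        # Only the selected expert runs for this prompt.
        expert = self.route(prompt)
        return f"[{expert.name}] {expert.answer(prompt)}"

    def total_params_billions(self) -> float:
        # Combined size of everything in the composition.
        return self._fallback.params_billions + sum(
            e.params_billions for e in self._experts.values()
        )


def build_demo() -> CompositionOfExperts:
    """Assemble a small composition with made-up experts."""
    coe = CompositionOfExperts(
        Expert("general", frozenset(), 7.0, lambda p: "general answer")
    )
    coe.add_expert(Expert(
        "finance", frozenset({"loan", "interest", "revenue"}), 7.0,
        lambda p: "finance answer",
    ))
    coe.add_expert(Expert(
        "law", frozenset({"contract", "liability", "statute"}), 7.0,
        lambda p: "law answer",
    ))
    return coe
```

Calling `build_demo().generate("Is this contract enforceable?")` routes the prompt to the hypothetical `law` expert, and a later `add_expert` call for, say, biology increases the total parameter count while leaving the finance and law experts unchanged. In this sketch, each request runs a single expert rather than the whole composition.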
The launch of Samba-1 follows SambaNova's announcement of the SN40L, an AI chip designed to rival those from AI giant Nvidia. The integration of this chip with the Samba-1 model represents an important step forward, with SambaNova saying it is the first to offer an integrated hardware and software system for the enterprise.
“The entire AI industry is talking about building the AI iPhone – an integrated hardware and software system – and SambaNova is the first to offer a version of that to the enterprise,” said Rodrigo Liang, co-founder and CEO of SambaNova Systems. “Last fall, we announced the SN40L, the smartest AI chip, and now we have integrated that chip with the first 1T parameter model for the enterprise. Samba-1 rivals GPT-4; however, it is more suitable for the enterprise as it can be delivered on-premises or in private clouds so that customers can fine-tune the model with their private data without revealing it to the public domain.”
Despite Samba-1's impressive capabilities on paper, the claimed 90% reduction in inference costs should be taken with a pinch of salt. While the CoE architecture should in principle lower inference costs, the true size of these savings will only be evident once the model is deployed in real-world scenarios.
Liang told us: “AI is not a fad, we are at the beginning of this journey. Our complete solution focuses on large-scale government and enterprise organizations, which no one else can offer locally and privately. There’s no escaping how dominant Nvidia is right now, but we can deploy these models at scale for a fraction of the cost.”