Ampere Computing is a startup making waves in the tech industry by challenging the dominance of giants like AMD, Nvidia, and Intel. With the rise of AI, demand for computing power has skyrocketed, along with energy costs and strain on power grids. Ampere aims to solve this problem with a low-power, high-performance solution.
Despite being the underdog, Ampere has seen its processors adopted by almost all of the world's major hyperscalers. The company has broken through the scaling wall several times with its CPUs and plans to continue scaling in ways that legacy architectures cannot. We spoke with Ampere CPO Jeff Wittich about his company's success and its future plans.
Sometimes I feel like challenger startups such as Ampere Computing are caught between a rock and a hard place. On one side are multi-billion-dollar companies like AMD, Nvidia, and Intel; on the other, hyperscalers like Microsoft, Google, and Amazon with their own in-house chips. What does it feel like to be the little mammal in the land of the dinosaurs?
It is truly an exciting time for Ampere. We may only be six years old, but as we predicted when we started the company, the need for a new cloud computing solution has never been stronger. The industry doesn't need more dinosaurs: it needs something new.
Cloud needs have changed. The amount of computing power needed for today's connected world is far greater than anyone could have imagined and will continue to grow with the rise of AI. At the same time, energy costs have skyrocketed, demand on the world's power grids is outstripping supply, and construction of new data centers is halting for a variety of reasons. The convergence of these factors has created the perfect opportunity for Ampere to provide a much-needed low-power, high-performance solution that has not been offered by traditional large players.
Thanks to our ability to offer this, we have grown rapidly and been adopted by almost every major hyperscaler in the world. We are also seeing increased adoption in the enterprise, as companies look to make the most of their existing data center space. The increased demand we continue to see for Ampere products makes us confident that the industry recognizes our value.
Ampere has been the high-core-count leader in the server CPU market for a few years. However, others (AMD and Intel) have been catching up. Given the immutable laws of physics, when do you expect to hit a wall as far as physical cores are concerned, and how do you plan to get through it?
As you mentioned, Ampere has been a leader in dense, efficient, high-core-count computing for the past few years. From the beginning, we identified where the key challenges to cloud growth would arise, and today we are addressing those exact challenges with our CPUs, which are well suited to cloud use cases of all types and across a wide range of workloads.
We have broken through the scaling wall several times, being first to 128 cores and now to 192 cores. Innovation like this requires a new approach that sheds inherited limitations. Ampere's fresh approach to CPU design, from microarchitecture to feature set, will allow us to continue to scale in ways that legacy architectures cannot.
Another credible threat on the horizon is the rise of RISC-V, with China throwing its weight behind the open instruction set architecture. What are your personal views on that front? Could Ampere join the RISC-V camp one day?
Ampere's core strategy is to develop sustainable processors to power computing both today and in the future. We will build our CPUs using the best available technologies to deliver leading performance, efficiency and scalability, so long as our customers can easily use those technologies to run their desired operating systems, infrastructure software and user applications.
What can you tell us about the sequel to AmpereOne? Will it follow the same trajectory as Altra > AmpereOne? More cores? Same frequency, more L2 cache per core? Will it be called Ampere 2 and still be single-threaded?
Over the next few years, we will continue to focus on releasing CPUs that are more efficient and offer higher core counts, more memory bandwidth, and more I/O capability. This will deliver ever-greater performance for increasingly important workloads, such as AI inference, while uniquely meeting the sustainability goals of cloud providers and users.
Our products will also continue to focus on delivering predictable performance to cloud users, eliminating noisy-neighbor issues and enabling providers to run Ampere CPUs at high utilization. We will introduce additional features that give cloud providers greater flexibility to serve customers' diverse applications. These capabilities are critical to cloud-native workload performance now and in the future.
Given Ampere Computing's focused approach, can you give us a brief overview of who your average customer is and what types of workloads your processors typically handle?
Because our CPUs are general purpose, they serve a wide spectrum of applications. We built our CPUs from the ground up as cloud-native processors, so they work great in almost all cloud workloads: AI inference, web services, databases, and video processing are just a few examples. In many cases, we can deliver twice the performance for these workloads with half the power of legacy x86 processors.
In terms of clients, we are working with almost all the big hyperscalers in the US, Europe and China. In the US, for example, you can find Ampere instances on Oracle Cloud, Google Cloud, Microsoft Azure, and more. Ampere CPUs are also available throughout Europe through various cloud providers.
Beyond the big cloud providers, we're seeing a lot of traction in the enterprise through our offerings with OEMs like HPE and Supermicro. This is largely due to the increased efficiency and rack density these companies can achieve by deploying Ampere servers. Companies want to save energy and avoid building additional data centers, which are not core to their business.
With the rise of AI, once-“simple” devices are becoming increasingly smart, driving greater demand for cloud computing close to where data is generated. These edge deployments have strict space and power requirements, and because Ampere can provide such a high number of cores in a low-power envelope, we see a lot of demand for these workloads as well.
AI has become the main topic of conversation this year in the semiconductor industry and beyond. Will this change in 2024, in your opinion? How do you see this market?
We firmly believe that AI will continue to be the main topic of conversation, but the conversation itself will change… and it is already beginning to do so.
In 2024, many companies working on AI will move from initial training of neural networks to deploying them, a phase known as AI inference. Given that AI inference can require 10 times more aggregate computing power than training, the ability to deploy AI at scale will become increasingly important. Achieving that scale will be constrained by performance, cost, and availability, so organizations will look for alternatives to GPUs as they enter this next phase. CPUs, and particularly low-power, high-performance CPUs like those offered by Ampere, will become an increasingly attractive option given their ability to run AI inference models more efficiently and cost-effectively. GPUs will still be important for certain aspects of AI, but we expect the hype around them to start to die down.
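To make the CPU-inference point concrete, here is a minimal sketch of what serving a model on a many-core CPU can look like in PyTorch. The model, thread count, and batch size are illustrative placeholders, not Ampere's actual software stack or a measured configuration.

```python
# Minimal sketch: AI inference on a many-core server CPU with PyTorch.
# Model, thread count, and batch size are illustrative placeholders.
import torch

# Use one worker thread per physical core, e.g. 128 on an Altra-class part.
torch.set_num_threads(128)

# Stand-in model; a real deployment would load trained weights instead.
model = torch.nn.Sequential(
    torch.nn.Linear(512, 1024),
    torch.nn.ReLU(),
    torch.nn.Linear(1024, 10),
).eval()

batch = torch.randn(32, 512)  # one batch of incoming requests

with torch.inference_mode():  # disable autograd bookkeeping for speed
    logits = model(batch)

print(logits.shape)  # torch.Size([32, 10])
```

The appeal in the scenario Wittich describes is that the same cores can be shared with databases, web servers, and other workloads, rather than dedicating a GPU to inference alone.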
Secondly, sustainability and energy efficiency will be even more important next year in the context of AI. Today, data centers often struggle to meet their energy needs. Increased use of AI will drive even greater demand for computing power in 2024, and for some AI workloads, that may require up to 20 times more power. Because of this, sustainability and efficiency will become challenges for expansion. Data center operators will place a high priority on efficiency in the new year to avoid jeopardizing growth.
How is Ampere addressing this new AI market opportunity with its products?
For many AI applications, GPUs are overkill, consuming far more power and money than necessary. This is especially true for most inference workloads, particularly when AI runs alongside other workloads such as databases or web services. In these cases, replacing the GPU with a CPU saves power, space, and cost.
We're already seeing this become a reality for real-world workloads, and the benefit of using Ampere processors is significant. For example, running the popular Whisper speech-recognition model on our 128-core Altra CPU, we consume 3.6 times less power per inference than Nvidia's A10 GPU card, and 5.6 times less than Nvidia's Tesla T4 cards.
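As a sanity check on what "power per inference" means, the back-of-the-envelope arithmetic below divides average power draw by throughput to get energy per inference. The numbers are hypothetical placeholders chosen only to reproduce the cited 3.6x ratio; they are not measured Altra or A10 figures.

```python
# Energy per inference = average power draw / throughput.
# All numbers are hypothetical placeholders, not measured figures.

def joules_per_inference(avg_watts: float, inferences_per_sec: float) -> float:
    return avg_watts / inferences_per_sec

cpu_j = joules_per_inference(avg_watts=180.0, inferences_per_sec=20.0)  # 9.0 J
gpu_j = joules_per_inference(avg_watts=150.0, inferences_per_sec=4.6)   # ~32.6 J

print(f"GPU energy per inference: {gpu_j / cpu_j:.1f}x the CPU's")  # ~3.6x
```

The framing matters: a device with a lower power draw can still cost more energy per inference if its throughput is low enough, which is why the comparison is made per inference rather than per watt.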
Because of this, we have seen a substantial increase in demand for Ampere processors for AI inference, and we expect it to become a huge market for our products. Just a few weeks ago, Scaleway, one of Europe's leading cloud providers, announced the upcoming general availability of new AI inference instances powered by Ampere. Additionally, in the last six months, we have seen a seven-fold increase in usage of our AI software library. All of this speaks to the growing adoption of our products as a high-performance, low-power alternative for AI inference.