In 2022, before ChatGPT completely revolutionized the world of artificial intelligence, Etched decided to invest heavily in transformers.
Building on that bet, the startup developed Sohu, a specialized ASIC designed exclusively for transformer models, the architecture that powers ChatGPT, Sora and Gemini.
Sohu is a one-trick pony: it can't run other machine learning models such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), or long short-term memory networks (LSTMs). But for transformers, Etched says it's unparalleled, outperforming Nvidia's flagship B200 GPU by nearly tenfold in speed.
It's all about scalability
Because Sohu is designed exclusively for transformer models, it can dispense with the complex control-flow logic that general-purpose GPUs need in order to support a wide variety of applications but that transformers never use.
By focusing solely on the computational needs of transformers, Sohu can devote more of its silicon to the matrix math that dominates transformer processing.
This specialization lets Sohu sustain over 90% utilization of its peak FLOPS, far above the roughly 30% utilization typical of mainstream GPUs. In other words, on transformer workloads Sohu extracts much more useful compute per second from the same amount of silicon.
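To see why utilization matters so much, here is a back-of-the-envelope sketch in Python. The peak-FLOPS figure is a placeholder, not a published spec for Sohu or any Nvidia GPU; the point is that, at equal raw compute, the ~90% vs ~30% utilization gap alone accounts for roughly a 3x difference in delivered throughput, with the rest of any claimed speedup having to come from the chip's raw compute and memory advantages.

```python
# Back-of-the-envelope illustration of why FLOPS utilization matters.
# The peak-TFLOPS value below is a placeholder, not a published spec
# for Sohu or any Nvidia GPU.

def effective_tflops(peak_tflops: float, utilization: float) -> float:
    """Useful compute delivered = peak compute * fraction of time the math units are busy."""
    return peak_tflops * utilization

# Two hypothetical accelerators with identical raw peak compute.
general_purpose_gpu = effective_tflops(peak_tflops=1000, utilization=0.30)
transformer_asic = effective_tflops(peak_tflops=1000, utilization=0.90)

print(f"GPU  (~30% utilization): {general_purpose_gpu:.0f} effective TFLOPS")
print(f"ASIC (~90% utilization): {transformer_asic:.0f} effective TFLOPS")
print(f"Speedup from utilization alone: {transformer_asic / general_purpose_gpu:.1f}x")
```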
The use of transformer models has grown dramatically worldwide, and every major AI lab, from Google to Microsoft, is committed to scaling the technology further. Etched claims throughput exceeding 500,000 tokens per second on Llama 70B, which it says makes Sohu an order of magnitude faster and more cost-effective than next-generation GPUs.
Etched believes the shift to specialized chips is inevitable and intends to stay ahead of the curve. “Current and next-generation models are transformers,” the company says. “The hardware stack of the future will be optimized for transformers. Nvidia’s GB200s have special support for transformers (Transformer Engine). The entry of ASICs like Sohu into the market marks the point of no return.”
Etched reports that Sohu production is ramping up and significant orders have already been received. “We believe in the hardware lottery: the models that win are the ones that can run the fastest and at the lowest hardware cost. Transformers are powerful, useful, and cost-effective enough to dominate all major AI computing markets before alternatives are ready.”
The company adds: “A transformer killer would need to run on GPUs faster than transformers run on Sohu. If that happens, we'll build an ASIC for that too!”