- Microsoft-backed startup presents GPU-free alternatives for generative AI
- DIMC architecture delivers ultra-high memory bandwidth of 150 TB/s
- Corsair supports transformers, agentic AI, and interactive video generation
d-Matrix Inc., a hardware startup based in Santa Clara, California, has introduced its first AI processor, Corsair, which aims to improve AI inference.
Backed by Microsoft, d-Matrix built Corsair around digital in-memory computing (DIMC) rather than traditional GPUs and expensive high-bandwidth memory (HBM), an approach the company says delivers significant performance and cost benefits.
Corsair is currently available to early access customers, with wider availability planned for Q2 2025.
Corsair performance redefines AI inference
The Corsair processor is specifically designed to handle demanding AI inference tasks, particularly for generative AI models. For example, it reaches 60,000 tokens per second at 1 ms per token when running Llama3 8B on a single server.
In more resource-intensive scenarios, such as the Llama3 70B model, Corsair delivers 30,000 tokens per second at 2 ms per token in a single rack, resulting in substantial savings in power and operating costs compared to traditional GPU-based solutions.
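The throughput and latency figures quoted for the two Llama3 models can be related with simple arithmetic. The sketch below assumes that the quoted latency is the time between tokens for a single user and that all user streams generate in parallel; d-Matrix does not confirm either assumption in this article, and the function name is illustrative.

```python
def concurrent_streams(aggregate_tokens_per_s: float, ms_per_token: float) -> float:
    """Estimate how many parallel user streams the quoted figures imply.

    Assumption: each stream emits one token every `ms_per_token` milliseconds,
    so a single stream produces 1000 / ms_per_token tokens per second.
    """
    per_stream_tokens_per_s = 1000.0 / ms_per_token
    return aggregate_tokens_per_s / per_stream_tokens_per_s


# Figures quoted in the article.
llama3_8b = concurrent_streams(60_000, 1.0)   # single server
llama3_70b = concurrent_streams(30_000, 2.0)  # single rack

print(llama3_8b)   # 60.0
print(llama3_70b)  # 60.0
```

Under these assumptions, both configurations work out to roughly 60 simultaneous streams, with the 70B model needing a full rack rather than a single server to reach that level.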
The processor is built on Nighthawk and Jayhawk II tiles, using a 6nm manufacturing process. Each Nighthawk tile integrates four neural cores and a RISC-V CPU, designed to support large model inference with digital in-memory computing (DIMC) and versatile data type processing, including block floating point (BFP).
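For readers unfamiliar with block floating point, the idea is that a group of values shares one exponent while each value keeps only a small integer mantissa. The following is a generic sketch of that idea, not d-Matrix's actual format: the block size, the 8-bit mantissa width, and the function names are all assumptions made for illustration.

```python
import math


def bfp_quantize(block, mantissa_bits=8):
    """Quantize a block of floats to signed integer mantissas with one shared exponent."""
    max_abs = max(abs(x) for x in block)
    if max_abs == 0.0:
        return [0] * len(block), 0
    # Largest magnitude a signed mantissa of this width can hold.
    max_mant = 2 ** (mantissa_bits - 1) - 1
    # Smallest power-of-two scale for which the largest value still fits.
    shared_exp = math.ceil(math.log2(max_abs / max_mant))
    scale = 2.0 ** shared_exp
    mantissas = [max(-max_mant, min(max_mant, round(x / scale))) for x in block]
    return mantissas, shared_exp


def bfp_dequantize(mantissas, shared_exp):
    """Reconstruct approximate floats from mantissas and the shared exponent."""
    scale = 2.0 ** shared_exp
    return [m * scale for m in mantissas]


block = [0.5, -1.0, 0.25, 0.75]
mantissas, shared_exp = bfp_quantize(block)
print(mantissas, shared_exp)                  # [32, -64, 16, 48] -6
print(bfp_dequantize(mantissas, shared_exp))  # [0.5, -1.0, 0.25, 0.75]
```

Because the exponent is stored once per block, most of the arithmetic can be done on the integer mantissas, which is the general reason such formats appeal to inference hardware.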
Corsair adopts chiplet packaging, integrating memory and computing to maximize efficiency. It complies with the industry-standard PCIe Gen5 full-length card form factor and can be combined with DMX Bridge cards for scalable performance. Each card delivers up to 2,400 TFLOPs of peak 8-bit compute, along with 2 GB of onboard performance memory and up to 256 GB of off-chip memory capacity.
Importantly, Micron Technology, a key Nvidia partner, is also collaborating with d-Matrix.
Corsair was initially scheduled for release in late 2023, but d-Matrix reworked its architecture in response to growing demand for generative AI. This pivot allowed Corsair to incorporate improvements designed for transformer models and emerging applications such as agentic AI and interactive video generation.
“We saw transformers and generative AI coming, and founded d-Matrix to address the inference challenges around the largest computing opportunity of our time,” said Sid Sheth, co-founder and CEO of d-Matrix.
“The first-of-its-kind Corsair computing platform offers incredibly fast token generation for highly interactive applications with multiple users, making Gen AI commercially viable,” Sheth added.
Via eeNews