TensorWave bags $43M to pack its datacenter with AMD accelerators

Startup also set to launch an inference service in Q4


TensorWave on Tuesday secured $43 million in fresh funding to cram its datacenter full of AMD's Instinct accelerators and bring a new inference platform to market.

Founded in late 2023, the Las Vegas-based startup is one of several cloud providers that have popped up amid the generative AI boom, looking to replicate the successes of CoreWeave and Lambda. But rather than stick with Nvidia accelerators, TensorWave's founders are betting it all on AMD's HBM-packed Instinct accelerators.

TensorWave began racking up MI300X-based systems this spring. The startup now aims to add "thousands" more of the accelerators and scale up its team to support the launch of a new inference platform called Manifest in the fourth quarter.

AMD's MI300X has seen widespread adoption across multiple cloud providers since its launch last December. In addition to TensorWave, Microsoft is now running OpenAI's GPT-4 Turbo and many of its Copilot services on the chips, and Oracle has also deployed a cluster of 16,384 MI300X accelerators. As a result, AMD now expects Instinct accelerators to drive $4.5 billion in revenues in 2024.

Even cloud upstart Vultr now plans to offer MI300X-based instances.

On paper, there's a lot to like about the chips, which not only offer substantially higher floating point performance but also pack more than twice the memory of Nvidia's coveted H100, at 192 GB versus 80 GB.

Memory capacity is particularly valuable for those running larger models at full 16-bit precision. With 1,536 GB across an eight-GPU node, an MI300X-based system can easily fit Meta's Llama 3.1 405B at full resolution, whereas the model would need to be split across multiple H100 systems or compressed using 8-bit quantization to fit. While it is possible to squeeze the uncompressed model into a single H200 node, doing so doesn't leave much room for the larger context window the model supports.
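For a sense of the arithmetic, here's a rough, weights-only estimate in Python. The per-node capacities come from the figures above; the helper function, and the decision to ignore the KV cache, activations, and runtime overhead, are simplifying assumptions on our part.

```python
# Back-of-the-envelope sketch of why memory capacity matters here.
# Node capacities are from the article; everything else (the helper name,
# ignoring KV cache, activations, and framework overhead) is illustrative.

def weight_footprint_gb(params_billion: float, bytes_per_param: float) -> float:
    """Approximate memory needed just to hold the model weights."""
    return params_billion * 1e9 * bytes_per_param / 1e9

LLAMA_3_1_405B = 405  # billion parameters

# Eight-GPU node capacities in GB: MI300X (8 x 192), H200 (8 x 141), H100 (8 x 80)
nodes = {"MI300X x8": 1536, "H200 x8": 1128, "H100 x8": 640}

for precision, bytes_per_param in [("FP16/BF16", 2), ("FP8/INT8", 1)]:
    need = weight_footprint_gb(LLAMA_3_1_405B, bytes_per_param)
    print(f"{precision}: ~{need:.0f} GB of weights")
    for name, cap in nodes.items():
        fits = "fits" if need < cap else "does not fit"
        print(f"  {name} ({cap} GB): {fits}, {cap - need:+.0f} GB left for KV cache etc.")
```

At 16-bit precision the weights alone come to roughly 810 GB, which is why a single MI300X box swallows the model comfortably, an H200 box only just, and an H100 box not at all without quantization or sharding.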

With the launch of its next-gen MI325X accelerators later this year, AMD will extend this lead further, pushing capacity to 288 GB per accelerator, more than three times that of the H100 and 50 percent more than Nvidia's upcoming Blackwell parts.

TensorWave intends to start deploying the chips in its datacenter as soon as they hit the market, potentially before the end of the year.

Alongside new hardware, the startup is preparing to launch an inference service in the fourth quarter, which will give customers an alternative to renting entire systems and managing their own software stack.

TensorWave hasn't said much about the service just yet, but an emphasis on large context windows and lower latency suggests it may be leaning on the MI300X's memory capacity and bandwidth to support retrieval-augmented generation (RAG) use cases. We've previously explored RAG in detail, but in a nutshell it lets a large language model pull relevant information from an external database and fold it into its responses.
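For the uninitiated, here's a deliberately minimal sketch of that retrieval loop in Python. The toy document store, the keyword-overlap retriever, and the call_llm() placeholder are our own illustrative stand-ins, not anything TensorWave has described.

```python
# Minimal, self-contained sketch of the RAG pattern. A real deployment would
# use vector embeddings and an actual inference endpoint instead of these toys.

DOCUMENTS = [
    "TensorWave runs AMD Instinct MI300X accelerators with 192 GB of HBM3 each.",
    "The MI325X is expected to push HBM capacity to 288 GB per accelerator.",
    "RAG pairs a language model with an external store of reference documents.",
]

def retrieve(query: str, docs: list[str], top_k: int = 1) -> list[str]:
    """Score each document by crude keyword overlap and return the best matches."""
    q_words = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(q_words & set(d.lower().split())), reverse=True)
    return scored[:top_k]

def call_llm(prompt: str) -> str:
    """Placeholder for a request to an inference endpoint."""
    return f"[model response to a {len(prompt)} character prompt]"

query = "How much memory does the MI325X have?"
context = "\n".join(retrieve(query, DOCUMENTS))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
print(call_llm(prompt))
```

The retrieved passages get stuffed into the prompt alongside the question, which is why large context windows and plenty of memory bandwidth matter for this sort of workload.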

TensorWave is far from the first company to launch a managed inference service. SambaNova, Cerebras, and Groq, not to mention many of the model builders, already offer similar services, which bill by the token rather than by the GPU hour.
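To illustrate the difference between the two billing models, here's a toy comparison; every rate and workload figure below is a made-up placeholder, not anyone's published pricing.

```python
# Toy comparison of per-token versus rented-hardware billing.
# All rates and the workload size are hypothetical placeholders.

def tokens_cost(total_tokens: float, price_per_million: float) -> float:
    """Per-token billing: pay only for what the model actually processes."""
    return total_tokens / 1e6 * price_per_million

def gpu_hours_cost(hours: float, gpus: int, price_per_gpu_hour: float) -> float:
    """Instance billing: pay for the reserved hardware, busy or idle."""
    return hours * gpus * price_per_gpu_hour

# Hypothetical workload: 200 million tokens per day
print(f"Per-token:  ${tokens_cost(200e6, price_per_million=1.50):,.2f}/day")
print(f"GPU rental: ${gpu_hours_cost(24, gpus=8, price_per_gpu_hour=2.00):,.2f}/day")
```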

But while $43 million isn't chump change, it's still tiny compared to the hundreds of millions, and even billions, in funding that Lambda, CoreWeave, and others have managed to talk their VC backers into.

When we last spoke to TensorWave's founders in April, the startup aimed to have some 20,000 Instinct accelerators operational by the end of 2024. But as we understand it, those plans were dependent in part on debt financing.

In a statement to El Reg, TensorWave couldn't provide specifics on the progress of its datacenter build-out.

“While we can’t share specific numbers at this stage, I can confirm that we’re making significant progress toward our goal of deploying GPUs across our data centers,” CEO Darrick Horton told us.

"Our partnership with AMD and access to their Instinct MI300X and upcoming MI325X accelerators have positioned us to meet the growing demand for AI compute resources. We’re well on track and excited about the milestones we expect to hit as we close out the year." ®