Supermicro crams 18 GPUs into a 3U AI server that's a little slow by design

Can handle edge inferencing or run a 64-display command center

The Register

GPU-enhanced servers can typically pack up to eight of the accelerators, but Supermicro has built a box that manages to fit 18 of them inside an air-cooled chassis that'll eat up just 3U of rack space.

The delightfully named SYS-322GB-NR sports 20 PCIe slots, with the expectation that's where you'll connect GPUs. That's an unusual arrangement these days: most AI servers offer Nvidia's SXM socket or use the Open Accelerator Module spec, as both offer more inter-chip bandwidth than PCIe.

But this box isn't designed to do the heavy lifting required of other AI servers. Supermicro suggests this machine for jobs like running machine learning and AI inference workloads at the edge, as part of automated production systems that require data to be processed from camera feeds or sensors at very low latencies. Another suggested role is using GPUs dedicated to graphics rather than AI and connecting up to 64 monitors – the sort of thing that gets visualization wonks excited at the prospect of building 46,080 x 12,960 pixel displays.

At the back of the system is room for 18 single-slot GPUs or ten dual-slot cards. Or at least that's what the press release claims – the marketing imagery seems to show eight dual-slot cards, though that may be less of a physical limit and more a power and cooling one.

Supermicro doesn't say which cards it'll support – perhaps as we're between major releases from Nvidia and AMD – but does note that accelerators from both vendors are on the menu.

For edge AI, we suspect Nvidia's diminutive L4 accelerators will be a popular configuration. Meanwhile, for those that need a little extra grunt, a bank of ten Nvidia L40S GPUs churning out 3.6 petaFLOPS of dense FP16 performance might be the ticket – assuming the PSU can supply the roughly 5.5kW of power we estimate such a configuration would need under load.
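For the curious, here is a back-of-the-envelope sketch of how figures like those can be reached. The per-card numbers (362 teraFLOPS of dense FP16 and 350W of board power for the L40S) come from Nvidia's public spec sheet, the 500W CPU figure assumes top-bin 6900-series Xeons, and the 1kW platform allowance is purely a hypothetical placeholder for memory, storage, networking, and fans. None of this is a breakdown supplied by Supermicro.

```python
# Rough estimate for a ten-card L40S configuration.
# All constants below are assumptions, not figures confirmed by Supermicro.
L40S_DENSE_FP16_TFLOPS = 362   # per card, from Nvidia's public L40S specs
L40S_BOARD_POWER_W = 350       # per card, maximum board power
XEON_TDP_W = 500               # assumed TDP of a top-bin 6900-series Xeon
PLATFORM_OVERHEAD_W = 1000     # hypothetical allowance: memory, drives, NICs, fans


def estimate(num_gpus=10, num_cpus=2):
    """Return (dense FP16 petaFLOPS, estimated load in watts)."""
    pflops = num_gpus * L40S_DENSE_FP16_TFLOPS / 1000
    watts = (num_gpus * L40S_BOARD_POWER_W
             + num_cpus * XEON_TDP_W
             + PLATFORM_OVERHEAD_W)
    return pflops, watts


pflops, watts = estimate()
print(f"{pflops:.1f} petaFLOPS dense FP16, about {watts / 1000:.1f} kW under load")
```

Swap in different card specs or a different overhead allowance and the totals move accordingly, which is rather the point: the GPUs alone account for 3.5kW of the estimate.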

Supporting all those GPUs isn't trivial either. At the heart of the system is a pair of Intel 6900-series Xeons, each with support for up to 128 cores, 256 threads, and 96 lanes of PCIe 5.0, which feed the 20 PCIe slots on the motherboard. The observant among you will note that even with 192 PCIe lanes in total, that's still not nearly enough for 18 – let alone 20 – PCIe x16 slots, which would need 288 and 320 lanes respectively.
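The lane arithmetic is simple enough to sketch in a few lines. This assumes, optimistically, that every CPU lane is routed to a GPU slot, which ignores whatever the NVMe drives and network cards will want; the function names are ours, for illustration only.

```python
# PCIe lane budget for a dual-socket system with 96 lanes per CPU.
# Assumption: all CPU lanes go to the slots (storage and NICs are ignored).
LANES_PER_CPU = 96
NUM_CPUS = 2
TOTAL_LANES = LANES_PER_CPU * NUM_CPUS  # 192


def lanes_needed(slots, width):
    """Lanes required to run `slots` slots at a link width of `width`."""
    return slots * width


def fits(slots, width, available=TOTAL_LANES):
    """True if the slots can be wired directly to the CPUs with no switch."""
    return lanes_needed(slots, width) <= available


for slots, width in [(18, 16), (20, 16), (20, 8)]:
    need = lanes_needed(slots, width)
    verdict = "fits" if fits(slots, width) else "does not fit"
    print(f"{slots} slots at x{width}: {need} lanes needed, {verdict} in {TOTAL_LANES}")
```

Only the x8 case squeezes under the 192-lane ceiling, with 32 lanes to spare.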

It's not clear if Supermicro only supports eight lanes per slot when fully populated or if it's using a PCIe switch to overcome the limitation. If we had to guess, it's probably the former. Unless the GPUs need to shuffle data between one another, eight lanes per slot is probably fine. And if they do, Supermicro sells systems better suited to that use case. In any case, we've reached out for comment regarding power and PCIe bandwidth and will let you know what we find out.
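To put a number on "probably fine", here is a rough sketch of theoretical per-slot bandwidth at each link width. It uses the PCIe 5.0 signalling rate of 32 GT/s per lane and 128b/130b encoding, and ignores packet and protocol overhead, so real-world throughput will be somewhat lower.

```python
# Theoretical one-direction bandwidth of a PCIe 5.0 link.
# Ignores packet/protocol overhead; real throughput will be lower.
GT_PER_SEC_PER_LANE = 32     # PCIe 5.0 raw signalling rate per lane
ENCODING = 128 / 130         # 128b/130b line encoding efficiency
BITS_PER_BYTE = 8


def slot_bandwidth_gb_s(lanes):
    """Approximate usable GB/s in each direction for a link of `lanes` lanes."""
    return GT_PER_SEC_PER_LANE * ENCODING / BITS_PER_BYTE * lanes


for lanes in (8, 16):
    print(f"PCIe 5.0 x{lanes}: about {slot_bandwidth_gb_s(lanes):.1f} GB/s per direction")
```

An x8 link works out to roughly 31.5 GB/s each way, about the same as a full PCIe 4.0 x16 slot, which is plenty for feeding inference cards that aren't chattering to one another.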

Beyond the sheer number of PCIe slots at your disposal, the system is otherwise a vanilla server that supports up to 6TB of DDR5 or, if you prefer something speedier, 8,800 MT/sec MRDIMMs. Storage is also pretty standard, with support for your choice of either 14 E1.S or 6 U.2 NVMe drives.

Oh, and if GPUs aren't your thing but dense memory-packed servers are, Gigabyte recently announced a dual-socket Epyc system with a whopping 48 DIMM slots. ®