
To Exascale and Beyond: 7 Key Takeaways From ISC 2024

May 21, 2024 by Duane Benson

In this roundup, we highlight some announcements from the International Supercomputing Conference that shed light on the state of exascale computing.

The International Supercomputing Conference (ISC) concluded last week in Hamburg, Germany. Several of the biggest names in supercomputing, including Intel, Hewlett Packard Enterprise, Nvidia, and AMD, announced new updates at the show alongside up-and-comers such as IQM Quantum, Supermicro, and Spinncloud.

 

Technicians in the Intel/HPE/Argonne Aurora, the second supercomputer to break the exascale barrier. Image used courtesy of Intel

The 2024 show took “Reinventing HPC” as its theme, exploring how the supercomputer industry is one of both breakthroughs and blurred lines. Supercomputers are massively parallel systems that use tens of thousands of CPUs, GPUs, TPUs, or other specialized processing engines, and they may differ only in software from the hyperscale cloud data centers they resemble. With AI taking such a big role in both the data center and high-performance computing (HPC) worlds, the distinction is only becoming less clear. The very definitions of supercomputer and HPC may need to change.

In this roundup, we'll examine ISC's key announcements and discuss how they point to trends in high-performance computing and supercomputing.

 

Intel

Intel has teamed with Argonne National Laboratory and Hewlett Packard Enterprise (HPE) to produce the Aurora supercomputer. Aurora delivers conventional supercomputer performance at 1.012 exaflops, placing it in the number two spot in the most recent Top500 supercomputer list. It’s only the second exascale computer ever to power up. Aurora comes in on top of the AI supercomputing list at 10.6 AI exaflops. 

Aurora is a massive system consisting of 166 racks, 10,624 compute blades, 21,248 Intel Xeon CPU Max processors, and 63,744 Intel Data Center GPU Max units.
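
The published totals also imply a simple per-blade recipe. The quick Python check below is our own back-of-the-envelope arithmetic, not an Intel or Argonne breakdown: dividing the totals by the blade count suggests roughly 64 blades per rack, with two Xeon Max CPUs and six GPU Max devices per blade.

```python
# Back-of-the-envelope check of Aurora's published component counts.
# The totals come from the announcement above; the per-blade math is ours.
racks = 166
blades = 10_624
cpus = 21_248    # Intel Xeon CPU Max processors
gpus = 63_744    # Intel Data Center GPU Max units

print(f"Blades per rack: {blades / racks:.0f}")  # 64
print(f"CPUs per blade:  {cpus / blades:.0f}")   # 2
print(f"GPUs per blade:  {gpus / blades:.0f}")   # 6
```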
 

Hewlett Packard Enterprise

In 2019, Hewlett Packard Enterprise (HPE) joined the supercomputer fray by acquiring Cray Inc., the supercomputer maker that carries the late Seymour Cray's name. Today, the HPE/Intel/Argonne Aurora supercomputer announced at ISC 2024 continues that legacy as only the second supercomputer to reach exascale capability.

Aurora uses the HPE Cray EX supercomputer platform, which was purpose-built for exascale computing. A crucial component is HPE Slingshot, the largest deployment of an open, Ethernet-based supercomputing interconnect. The system was built with AI-driven research in mind: it will be used to map the human brain, study high-energy particle physics, and accelerate AI-driven drug research.

 

Nvidia

Nvidia has emerged as one of the giants in AI and supercomputing processors. At ISC, the company announced the installation of its Grace Hopper superchips in nine supercomputer systems worldwide. The superchip combines an Arm-based Nvidia Grace CPU with an Nvidia Hopper GPU using Nvidia NVLink-C2C interconnect technology.

 

The Nvidia Grace Hopper superchip. Image used courtesy of Nvidia

The integrated combination delivers a balance of HPC performance and power efficiency, and it is designed for high computational speed and easy scaling. The new Nvidia-based supercomputers are online or coming online in France, Poland, Switzerland, Germany, the U.S., Japan, and the U.K.

 

AMD

AMD showcased its HPC solutions at ISC via the Frontier supercomputer housed at Oak Ridge National Laboratory. Frontier debuted two years ago as the first exascale system and the highest-performing supercomputer, with 1.2 exaflops, according to Top500. Frontier, powered by AMD EPYC CPUs and AMD Instinct GPUs, still holds the title of fastest supercomputer in the world, albeit by a much smaller margin this year than last.

 

Frontier supercomputer, the first to reach exascale and the fastest for the third year in a row. Image used courtesy of AMD

AMD also noted that 157 of the top 500 fastest supercomputers are powered by AMD. That’s a 29% increase since 2023.

 

IQM Quantum

IQM Quantum partnered with HPE at ISC 2024 to demonstrate a hybrid system that integrates quantum computing and more conventional high-performance computing.

 

The IQM/HPE quantum-HPC integrated computing system. Image used courtesy of IQM

Quantum computers, while still largely in the early research stage, have the potential to radically disrupt HPC. IQM Quantum has taken a novel approach by combining quantum hardware with classical HPC hardware. One of the first deployments of this hardware will be in Germany at the Leibniz Supercomputing Centre (LRZ).

IQM Quantum will use the platform, in partnership with Hewlett Packard Labs, to continue advancing hybrid computing and to let users develop quantum and hybrid computing algorithms and practices.

 

Supermicro

Supermicro showed off its liquid-cooled AI and HPC systems, which reduce cooling power requirements compared with conventional air-cooled systems. Liquid cooling enables denser AI and HPC computing. By improving heat extraction, Supermicro's rack solutions may increase the speed and lower the cost of data centers and supercomputers.

 

Supermicro liquid-cooled rack system. Image used courtesy of Supermicro

Thermal management is one of the key enabling technologies for high-performance computing, including supercomputers. The CPUs, GPUs, and TPUs, along with DRAM and solid-state storage, create massive amounts of heat. Supermicro liquid-cooled supercomputing rack systems can be adapted for most HPC hardware. Supermicro offers servers based around Nvidia, AMD, and Intel processors.

The liquid cooling system is part of Supermicro’s environmental, social, and governance (ESG) initiative and promises to save 40% of the power used by an equivalent air-cooled system.

 

Spinncloud

Spinncloud announced the SpiNNaker2 event-based hybrid AI platform. The system’s predecessor, SpiNNaker1, was the brainchild of Steve Furber, one of the original developers of the Arm architecture. SpiNNaker was designed to emulate the human brain. SpiNNaker2 expands upon the prior version, extending traditional AI computing models with new algorithms that adapt dynamically to contextual nuance.

 

Dr. Steve Furber, one of the original developers of the Arm architecture. Image used courtesy of Spinncloud

Spinncloud believes that current computing architectures are woefully inadequate for AI. Even with massively parallel, matrix-math-optimized computing, existing AI systems rely on simple pattern recognition, tokenization of patterns, and matching against existing tokenized data. They fall short when tasked with original thinking or contextual nuance.

Spinncloud has developed a low-power architecture that it believes closely models the human brain. One of the key elements is biologically inspired, event-based asynchronous parallel operation: side-by-side operations don't need to remain synchronous. SpiNNaker2, based on this new architecture, promises significantly greater computing power per unit of energy consumed.
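
To make the event-driven idea concrete, here is a minimal, purely illustrative Python sketch; it is not SpiNNaker2 code or any Spinncloud API, and the thresholds, weights, and connectivity are toy numbers. Neurons sit idle until a spike event arrives, accumulate input, and fire onward only when a threshold is crossed, so there is no global clock keeping every unit in lockstep.

```python
import heapq

# Toy illustration of event-driven ("spiking") computation: neurons do work
# only when a spike event arrives, not on a global synchronous clock tick.
THRESHOLD = 1.0
WEIGHT = 0.4          # contribution of each incoming spike (toy value)

synapses = {0: [1, 2], 1: [2], 2: []}     # toy connectivity
potential = {n: 0.0 for n in synapses}

# Event queue of (time, target_neuron) spikes; seed it with a few input spikes.
events = [(0.0, 0), (0.5, 0), (1.0, 0)]
heapq.heapify(events)

while events:
    t, n = heapq.heappop(events)
    potential[n] += WEIGHT
    if potential[n] >= THRESHOLD:         # neuron fires only when pushed over threshold
        potential[n] = 0.0
        print(f"t={t:.1f}: neuron {n} fired")
        for target in synapses[n]:
            heapq.heappush(events, (t + 1.0, target))   # propagate with unit delay
```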

 

Zetta, Here We Come

When Seymour Cray first used parallelism and other architectural innovations to create the supercomputer sixty years ago, the size and scope of today's systems were mere science fiction. The June 2024 Top500 list now has two exascale supercomputers, Frontier and Aurora, at 1.206 and 1.012 exaflops, respectively. The next closest, Microsoft's Eagle, comes in at 561 petaflops. Frontier was the first to reach exascale, arriving at that point in June 2022, just 14 years after IBM hit the petaflop line (1,000x slower than exascale) with Roadrunner.

 

Year   Supercomputer   Scale                   Gap (years)
2022   Frontier        Exascale                14
2008   Roadrunner      Petascale               12
1996   ASCI Red        Terascale               11
1985   Cray-2          Gigascale               21
1964   CDC 6600        Megascale (3 Mflops)

Years for each 1,000x jump in supercomputing scale
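
As a quick sanity check on the gap column, the short Python snippet below (our own arithmetic, derived only from the years in the table above) recomputes the interval between each 1,000x milestone.

```python
# Recompute the gap column from the milestone years in the table above.
milestones = [
    (1964, "CDC 6600", "megascale"),
    (1985, "Cray-2", "gigascale"),
    (1996, "ASCI Red", "terascale"),
    (2008, "Roadrunner", "petascale"),
    (2022, "Frontier", "exascale"),
]

for (prev_year, _, _), (year, name, scale) in zip(milestones, milestones[1:]):
    print(f"{name} hit {scale} in {year}: {year - prev_year} years after the previous step")
```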
 

The newest HPC chips are built from multi-core processor chiplets containing different combinations of CPUs, GPUs, TPUs, and stacked memory, all on the same substrate and connected with high-speed interconnects and intra-chip networking. They are nearly complete computers on their own, yet they are combined in the tens of thousands to create data centers and supercomputers. The scalability of these systems means that the primary limitations are a quadfecta of power consumption, cooling, data transfer, and raw computing power. Innovations in any of these four areas can lead to jumps in supercomputer performance.

With new architectures and quantum-HPC hybrids emerging, like those from Spinncloud and IQM Quantum, the gap between exascale and zettascale may close in less than a decade and a half. Alternatively, breakthroughs in data transfer among dispersed high-speed cloud systems may make the concept of a singular supercomputer meaningless.

 


 

A Footnote on Supercomputing Speed

Supercomputer speed is measured in floating point operations per second (flops). Today, that means petaflops or exaflops. A petaflop is 10^15 flops. An exaflop is 10^18 flops: one thousand petaflops, or a quintillion (a billion billion) floating-point operations per second. By comparison, the Intel 8087, the first math co-processor I ever coded assembly language for, came in at a not-at-all blazing 50 kiloflops. A supercomputer capable of operating at one exaflop or greater is said to be an exascale supercomputer.
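
For a sense of the magnitudes involved, the short Python calculation below (our own illustration, using only the figures from this footnote) works out how long that 50-kiloflop 8087 would need to match a single second of work on a one-exaflop machine.

```python
# Rough sense of scale: one second of exascale work vs. the ~50 kiloflop 8087.
EXAFLOP = 1e18        # floating-point operations per second
PETAFLOP = 1e15
FLOPS_8087 = 50e3     # roughly 50 kiloflops

print(f"Petaflops per exaflop: {EXAFLOP / PETAFLOP:,.0f}")        # 1,000

seconds = EXAFLOP / FLOPS_8087                # time for the 8087 to do 10^18 operations
years = seconds / (60 * 60 * 24 * 365.25)
print(f"8087 time to match one second of exascale work: {years:,.0f} years")  # ~634,000
```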

The “f” in supercomputing flops refers to an IEEE double-precision, 64-bit floating-point number (FP64). Common AI benchmarks use a combination of 32-bit and 64-bit math. While calculating an orbital trajectory for a space probe billions of miles away requires a lot of precision, tokenized patterns in an AI model can be found easily with 32-, 16-, and even 8-bit floating-point (FP32, FP16, and FP8) numbers. Additional precision would only slow the operations down. AI calculations often start with FP8 and then bump up to FP32 for final refinement, which generally allows the same machine to deliver a higher AI benchmark score than its conventional benchmark.
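
A hedged NumPy sketch of that low-precision-first, refine-later pattern follows. It is not benchmark code, just an illustration of how a system solved cheaply in FP32 can be polished with FP64 residuals to recover near-double-precision accuracy (standard NumPy has no FP8 type, so the sketch starts at FP32).

```python
import numpy as np

# Sketch of mixed-precision iterative refinement: do the expensive solve in
# FP32, then correct the answer using FP64 residuals.
rng = np.random.default_rng(0)
n = 200
A64 = rng.standard_normal((n, n)) + n * np.eye(n)   # well-conditioned FP64 system
b64 = rng.standard_normal(n)

A32, b32 = A64.astype(np.float32), b64.astype(np.float32)
x = np.linalg.solve(A32, b32).astype(np.float64)    # cheap low-precision solve

for _ in range(3):                                  # a few full-precision refinement steps
    r = b64 - A64 @ x                               # residual computed in FP64
    x += np.linalg.solve(A32, r.astype(np.float32)).astype(np.float64)

print("residual norm after refinement:", np.linalg.norm(b64 - A64 @ x))
```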
