Ultra Ethernet Consortium Set to Enable Scaling of Networking Interconnects for AI and HPC

Jon Ames

Aug 01, 2024 / 3 min read

Ethernet standards in computer networking have been enabling innovation for the last 50 years or so.

Today’s compute applications have brought about unprecedented challenges. As machine learning processing needs threaten to further strain networks, the time to update Ethernet standards is now.

In the summer of 2023, the Ultra Ethernet Consortium (UEC) was announced to do just that. The group, which now has more than 70 members, including leading hyperscalers, OEMs, and Synopsys, aims to revolutionize networking by optimizing Ethernet for rapidly evolving AI and HPC workloads. The effort addresses critical issues that machine learning algorithms encounter in large compute clusters, making it a promising way to future-proof performance for scale-out data center networking.

While the group is still working on drafting these new standards, there is much to be excited about at this stage. Read on for a look at where the Ultra Ethernet Consortium is heading, what impact AI demands will have on the shaping of the new standard, and how Synopsys 1.6T/224G Ethernet IP expertise will provide SoC designers an accelerated path to silicon success with the latest standards developments.


Why New Ethernet Standards?

Ethernet was first developed in 1973 as a means of connecting early workstations. Computing and connectivity have come a very long way in the decades since. Today's large language models (LLMs) can have trillions of parameters and require massive increases in computational throughput to prevent performance bottlenecks.


Figure 1: AI clusters are needed to enable the computing, storage, and bandwidth needed to process trillions of LLM parameters. Source: Trillion Parameter AI Serving Infrastructure for Scientific Discovery: A Survey and Vision (arxiv.org)

As application demands drive the need for increased network performance and efficiency, now is the time to improve the Ethernet networking technology and standards.

One issue addressed by Ultra Ethernet is network tail latency: the significantly longer delay that occasional datagrams experience as they traverse the network. This delay, typically the result of packet loss or congestion, can cause a system to stall momentarily. Ultra Ethernet will address this by allowing out-of-order packet delivery together with a smart mechanism for low-layer retransmission of packets. Reducing stalls of AI processors increases performance and maximizes the utilization and efficiency of these high-value resources.
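To see why a small fraction of slow packets matters so much at scale, consider the rough sketch below. It is an illustration only and is not drawn from the UEC specification: the function names, the 1% slow-flow rate, and the microsecond figures are all assumed for the example, and flows are treated as independent. It models a synchronized step that cannot finish until its slowest flow arrives, so the step time follows the tail of the latency distribution rather than the median.

```python
import math
import random
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list (pct in (0, 100])."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def step_time(flow_latencies):
    """A synchronized step completes only when its slowest flow arrives."""
    return max(flow_latencies)


def prob_step_hits_tail(p_slow, n_flows):
    """Chance that at least one of n independent flows is delayed."""
    return 1.0 - (1.0 - p_slow) ** n_flows


def simulate_flows(n_flows, p_slow=0.01, base_us=10.0, slow_us=500.0, seed=0):
    """Hypothetical latencies: a small base delay plus a rare large penalty."""
    rng = random.Random(seed)
    return [
        base_us + rng.uniform(0.0, 2.0) + (slow_us if rng.random() < p_slow else 0.0)
        for _ in range(n_flows)
    ]


if __name__ == "__main__":
    flows = simulate_flows(1000)
    print("median (us):", statistics.median(flows))
    print("p99 (us):", percentile(flows, 99))
    print("step time (us):", step_time(flows))
    print("P(step hits tail):", prob_step_hits_tail(0.01, 1000))
```

Under these assumptions, a single flow is delayed only 1% of the time, yet a step that waits on 1,000 flows is almost certain to include at least one delayed flow. That is the intuition behind treating tail latency, not average latency, as the figure to drive down in large clusters.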

Modern Problems, Modern Ultra Ethernet Solutions

Earlier this year, Synopsys announced the world’s first 1.6T Ethernet IP solution, which allows users to prepare for Ultra Ethernet as the industry addresses the challenges of large-scale data processing. The reality of the current computing landscape is that we are bumping against the ceiling of capabilities in existing infrastructure.

One of the most challenging issues in modern computing is scale. Scaling out (adding more accelerators to an architecture to spread the workload across more machines) is needed to process LLMs. With this in mind, Ultra Ethernet will enable AI clusters to scale out to a nearly unlimited number of accelerators, breaking free from the constraints of the past.

The complete Synopsys 1.6T Ethernet IP solution, supporting the upcoming Ultra Ethernet specifications, offers a host of benefits aimed at addressing bandwidth needs driven by AI that are straining data centers across the globe, including:

  1. Interconnect power consumption reduced by up to 50% compared to existing implementations.
  2. Multi-channel/multi-rate Ethernet controllers that offer 1.6T support with up to 40% latency reduction and up to 50% area reduction compared to 800G solutions.
  3. A pre-verified MAC+PCS+224G PHY IP subsystem that accelerates time to market (TTM) and minimizes integration risk.
  4. Synopsys verification IP for 1.6T Ethernet that speeds up verification closure through advanced protocol, methodology, and productivity features.

More detailed product information can be found here.

A Fast Future for the Terabit Era

Organizations in all areas of the networking ecosystem will benefit significantly from engaging with the UEC. The Consortium provides collaboration opportunities by bringing together industry experts like Synopsys, researchers, and innovators. This benefits everyone in the ecosystem: startups gain access to valuable knowledge, resources, and expertise, accelerating their development cycles, while enterprise-level businesses ensure they continue to lead the industry in innovation and craft best practices. The UEC’s focus on ultra-high-speed Ethernet aligns with the demands of next-generation AI and HPC applications, enabling the development of robust, scalable networking solutions.

In the last few years, the explosion of AI and machine learning has driven the growth in computing needs to levels never seen in history. To keep the wheels turning, Synopsys is excited to be a part of the Ultra Ethernet Consortium and to continue shaping the future of networking standards.
