Low power design has been discussed for decades, but power challenges have intensified as process geometries have scaled, and demand for lower power devices has grown sharply as use cases evolve.
It’s true that companies continue to add features and functionality to portable, handheld devices, which in turn require longer battery life through lower power consumption (an important differentiator for consumers). But mobile design challenges are relatively well understood, as smartphones have been widely available for over a decade.
For “plug-in” products, however, power efficiency is increasingly important because it affects the overall cost of building the systems (which may require heat sinks and elaborate cooling systems) as well as the cost of operating them. In server farms where massively parallel systems are used, for example, a reduction in power for a single chip can result in significant power savings enterprise-wide, not to mention contribute to more environmentally sound operations.
The real leviathan in the low power space that design teams now have to tackle is the AI chip, particularly the variety used in high-performance computing (HPC) applications. In data centers and other HPC applications, AI chips don’t have the same constraints as traditional mobile devices (e.g., battery life, portability). However, the physics of the smaller, denser and more novel architectures and manufacturing processes needed to implement AI introduce new types of power challenges. The traditional holy grail of performance, power, area (PPA) is still led by the need for the highest possible performance. But these days, performance is actually limited by power. It’s becoming extremely hard to deliver power reliably to every part of the chip without worrying about the dissipated heat degrading the reliability of the chip and leading to thermal runaway.
Power issues in advanced AI chips can significantly affect overall functionality, manufacturability, cost and reliability. As a result, design teams must move to even more power-smart methodologies and use sophisticated power analysis techniques and tools.
Low power design is all about reducing the overall dynamic and static power consumption of an integrated circuit (IC). Dynamic power comprises switching and short-circuit power, while static power is leakage, or current that flows through the transistor when the device is inactive.
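The breakdown above can be illustrated with the standard first-order power model, in which switching power scales as activity × capacitance × voltage² × frequency. The following is a minimal sketch, not a sign-off calculation: the function names and the example values (activity factor, switched capacitance, currents) are assumptions chosen for illustration, and short-circuit and leakage power are approximated as an average current multiplied by the supply voltage.

```python
def switching_power(alpha, c_switched, vdd, freq):
    """Switching power in watts: alpha * C * Vdd^2 * f (first-order model)."""
    return alpha * c_switched * vdd ** 2 * freq


def short_circuit_power(i_sc_avg, vdd):
    """Short-circuit power in watts, from an assumed average crowbar current."""
    return i_sc_avg * vdd


def leakage_power(i_leak, vdd):
    """Static (leakage) power in watts, from an assumed total leakage current."""
    return i_leak * vdd


def power_breakdown(alpha, c_switched, vdd, freq, i_sc_avg, i_leak):
    """Return dynamic, static and total power for one set of operating conditions."""
    dynamic = switching_power(alpha, c_switched, vdd, freq) + short_circuit_power(i_sc_avg, vdd)
    static = leakage_power(i_leak, vdd)
    return {"dynamic": dynamic, "static": static, "total": dynamic + static}


# Hypothetical operating point: 10% activity, 1 nF switched capacitance,
# 0.8 V supply, 1 GHz clock, 5 mA average short-circuit current, 20 mA leakage.
example = power_breakdown(
    alpha=0.1, c_switched=1e-9, vdd=0.8, freq=1e9, i_sc_avg=0.005, i_leak=0.02
)
for component, watts in example.items():
    print(f"{component}: {watts * 1e3:.1f} mW")
```

Because switching power depends on the square of the supply voltage, lowering the voltage by 10% in this model cuts switching power by about 19%, which is why voltage scaling is such a common low power lever.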
Leakage power was the primary concern for design teams in the 90-to-16 nanometer range of process geometries because dynamic power was a small share of the total (10-15%) compared to leakage power (85-95%). Once the industry shifted to the 16- and 14-nanometer nodes, dynamic power became more dominant than leakage power. This change also corresponded with a major shift in transistor architecture, moving from planar devices to FinFETs. A FinFET is a multigate device built on a substrate where the gate is placed on two, three or four sides of the channel or wrapped around the channel, forming a double-gate or multi-gate 3D structure.
However, as we now move to process nodes like 7, 5 and 3 nanometers and to architectures such as gate-all-around (GAA) transistors, leakage is again becoming an issue. Today, design teams are exploring options that were set aside in past designs in order to extract as much power efficiency and performance from a design as possible. The need to reduce margin at advanced nodes has been discussed for a while, but the ability to actually do something about it was dispersed across different parts of the design process. The techniques and technologies to address today’s issues are familiar, but we are only now starting to really understand the precision with which they can be used.
Traditionally, low power design was overseen by an architect who had a complete system-level view of the chip. This architect would guide the rest of the team on which functional vectors to use for power. However, this was a very limited way to approach design.
Now, you’ll see team members from hardware, software and architecture working together from the very start (typically in parallel). This team structure and convergence of multiple disciplines in the same design has been talked about for many years, and it is now crucial to achieving the results required in the new generation of AI chips.
The team needs to intimately understand power’s ramifications across software development, hardware design and manufacturing. Modern design methodologies focus on concurrent design to optimize for PPA early on and avoid costly re-spins downstream.
Design for low power does not occur in a single step. It involves a collection of techniques and methodologies aimed at reducing the overall dynamic and static power consumption. Design for optimal power is woven throughout the entire chip design process, and a typical design and verification methodology has five main phases.
Synopsys offers a proven low power flow and methodology solution that covers all aspects of the low power design and verification flow. It includes the necessary tools and integration to support a seamless methodology to address the power issue at each stage of the design process.
The most critical component for dynamic power analysis and optimization is the quality of the vectors. Vector quality is determined by how closely the activity in the vectors matches the activity seen when the SoC is working in a real system. As mentioned above, the traditional power analysis process involved checking with the SoC architect to identify which vectors to use for power analysis and optimization. This was a hit-or-miss activity that didn’t always cover all aspects and scenarios.
To accurately predict the amount of power that SoCs are going to consume, designers need to put them under a test bench that is a true-to-life representation of how they are going to be used. The approach best suited to running live applications on the design is emulation.
The sheer amount of data involved in running a power analysis for an AI chip requires high-powered tools. Even running an application for a few seconds on an emulator produces a massive amount of data (hundreds of gigabytes covering billions or trillions of clock cycles). To help solve this problem, power profiling within an emulation system identifies the windows of interest for power analysis and prunes the cycles to analyze from billions to millions to thousands, which makes power analysis from an emulation system much more practical.
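The idea of pruning a long emulation run down to a few windows of interest can be sketched as a sliding-window search over an activity trace. This is an illustrative sketch only, not how any particular emulator implements power profiling: the `activity` list (for example, toggle counts per sampling interval), the function names and the choice of "highest total activity" as the selection criterion are all assumptions.

```python
from itertools import accumulate


def window_sums(activity, width):
    """Total activity of every window of `width` consecutive samples."""
    prefix = [0] + list(accumulate(activity))
    return [prefix[i + width] - prefix[i] for i in range(len(activity) - width + 1)]


def top_windows(activity, width, count):
    """Pick up to `count` non-overlapping windows with the highest activity.

    Returns (start, end, total) tuples sorted by start, with `end` exclusive.
    """
    sums = window_sums(activity, width)
    # Visit candidate windows from most to least active; ties go to the earlier start.
    order = sorted(range(len(sums)), key=lambda i: (-sums[i], i))
    chosen = []
    for start in order:
        if len(chosen) >= count:
            break
        # Keep the window only if it does not overlap one already chosen.
        if all(abs(start - kept) >= width for kept in chosen):
            chosen.append(start)
    return sorted((s, s + width, sums[s]) for s in chosen)


# Hypothetical toggle counts per sampling interval from an emulation run.
trace = [1, 1, 9, 8, 1, 1, 7, 7, 1, 1]
print(top_windows(trace, width=2, count=2))  # [(2, 4, 17), (6, 8, 14)]
```

Only the selected windows would then be handed to a detailed power analysis. A real flow might rank windows by peak rather than total activity, or weight the toggles by capacitance.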
Synopsys’ ZeBu Server is the industry’s fastest emulation system, delivering two times the performance of legacy emulation solutions by taking advantage of its unique Fast Emulation architecture, the most advanced commercial FPGAs, and innovations in FPGA-based emulation software. These software innovations give users faster compile, advanced debug (including native integration with Verdi), simulation acceleration, hybrid emulation, and, of course, power analysis.
Additionally, temperature is a new third dimension that comes into the picture when designing AI chips and that isn’t as much of a factor in mobile chip design. Generating a heat map at an early stage via emulation therefore becomes much more important for the entire design process.
When it comes to low power design for AI chips, adopting new methodologies and even new tools, like emulation, is critical to creating a tightly interwoven team of design professionals from many different disciplines.