Market projections for 3D multi-die designs point to an unprecedented change in how silicon will be designed and delivered. IDTechEx projects the chiplet market to reach $411B by 2035. A Market.us report projects advanced packaging to grow from $35B in 2023 to $158B in 2033, and the same report predicts that over $60B of that $158B will be 3D SoC and 3D stacked memory. These projections support the trend toward rapid adoption of multi-die designs and 3D packaging. This article highlights some of the drivers for 3D multi-die designs and the key requirements for die-to-die and interface IP for 3D packaging.
To overcome the limitations of Moore’s Law and take full advantage of multi-die design, designers can integrate heterogeneous and homogeneous dies in a single package in many ways, as illustrated in Figure 1.
Figure 1: Example of multiple die integration in a single package
The first example shows the integration of two or more dies in a package, connected through an organic substrate using single-ended or differential IO or short-reach serial transceivers. This 2D integration approach is relatively low-cost but limits the bandwidth between the dies. In 2.5D integration, as shown in Figure 2, a higher-performance interposer enables high-density signal routing between the dies. The signals are then routed to the package substrate and out to the package pins. The dies are connected through a die-to-die interface such as UCIe, running at per-lane data rates of 40G or higher, which aims to deliver higher bandwidth while managing latency and thermal trade-offs.
Figure 2: 2.5D integration with Interposer + Substrate Package
3D integration can reduce form factor but, more importantly, it increases interconnect density and lowers both latency and interconnect power for better scalability. A 3.5D integration adds die-to-die connectivity from a 3D die stack to another 2D chip or 3D die stack.
Advanced process nodes deliver more transistors, but with the slowing of Moore’s Law and the demand for more compute performance for complex AI workloads, designs need more processing capacity than fits in a single 800 mm² reticle. As a first step, designers could place two or more dies in a package, connected via parallel or serial IOs, to scale up processing capacity. A better approach is to break up the functionality into multiple smaller dies, also called chiplets. The smaller dies increase yield and can offer a lower-cost solution even with the added silicon area of die-to-die interfaces and the cost of advanced packaging. This multi-die design approach also allows the process node to be optimized for each die, which can result in further cost savings.
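The yield argument can be made concrete with a rough calculation. The sketch below uses the simple Poisson yield model, Y = exp(-A x D0), which is a common first-order approximation and not a model taken from this article. The defect density, the four-way split, and the 10% die-to-die area overhead are all illustrative assumptions, and packaging and test costs are not modeled.

```python
import math

def poisson_yield(area_mm2: float, defects_per_cm2: float) -> float:
    """Fraction of good dies under the Poisson model Y = exp(-A * D0)."""
    area_cm2 = area_mm2 / 100.0
    return math.exp(-area_cm2 * defects_per_cm2)

def silicon_per_good_unit(die_area_mm2: float, dies_per_unit: int,
                          defects_per_cm2: float) -> float:
    """Average silicon area (mm^2) fabricated per good product unit.

    Assumes every die is tested before assembly (known good die), so a
    defective chiplet wastes only its own area, not the whole product.
    """
    return dies_per_unit * die_area_mm2 / poisson_yield(die_area_mm2, defects_per_cm2)

# Illustrative assumptions, not figures from the article.
D0 = 0.1              # defects per cm^2 (assumed)
MONO_AREA = 800.0     # mm^2, a reticle-sized monolithic die
CHIPLETS = 4          # assumed split into four equal chiplets
D2D_OVERHEAD = 0.10   # assumed extra area per chiplet for die-to-die interfaces

chiplet_area = MONO_AREA / CHIPLETS * (1 + D2D_OVERHEAD)

mono_yield = poisson_yield(MONO_AREA, D0)
chiplet_yield = poisson_yield(chiplet_area, D0)
mono_silicon = silicon_per_good_unit(MONO_AREA, 1, D0)
chiplet_silicon = silicon_per_good_unit(chiplet_area, CHIPLETS, D0)

if __name__ == "__main__":
    print(f"monolithic yield: {mono_yield:.1%}, silicon per good unit: {mono_silicon:.0f} mm^2")
    print(f"chiplet yield:    {chiplet_yield:.1%}, silicon per good unit: {chiplet_silicon:.0f} mm^2")
```

Under these assumptions the four chiplets use more total area than the monolithic die, yet less silicon is fabricated per good unit because each smaller die yields far better. Different defect densities or overheads will shift the break-even point.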
With careful planning, product managers and architects can draw from a collection of reusable chiplet dies that can be integrated in an advanced package. For example, a low-end system may have a single AI accelerator chiplet, while a high-performance product may include multiple AI accelerator chiplets to scale performance. Each product could be created from the same set of base chiplets in different combinations or topologies to optimize processing, thermal management, and cost. In addition, by re-using dies and creating only a new package to produce a new product, new systems can be implemented much faster and at lower total cost of ownership than taping out a monolithic chip using traditional methods.
2.5D integration has been in mass production for over a decade in applications such as FPGAs. One challenge with 2.5D integration is that the silicon interposers used to interconnect multiple dies are limited in size (3-5 reticles near term), which limits how many dies can be included in a single package. Larger silicon interposers also bring reliability issues such as brittleness and warpage, which could affect bump connection reliability. To address these size and reliability concerns and extend the utility of 2.5D integration, the industry is developing new redistribution layer (RDL) interposers with or without silicon bridges. Silicon bridges can provide higher-density signal routing than RDL interposers alone.
2.5D integration has enabled a wave of multi-die products, but the 2.5D interconnect becomes a limiting factor as bandwidth, processing, and low-latency needs scale faster than serial 2.5D die-to-die link capabilities. One improvement is to use 3D die stacking. 3D packaging has the potential to dramatically increase interconnect density while lowering latency and interconnect power consumption, with almost wire-to-wire links in some topologies. The UCIe specification sets the target bandwidth density for the UCIe Advanced (UCIe-A) interconnect at 188-1350 GB/s per square millimeter (mm²), while the target for UCIe-3D is 4 TB/s per mm² assuming a 9um bump pitch, as shown in Table 1. At the same time, the power efficiency target improves from 0.25 pJ/b to <0.05 pJ/b. The low-latency advantage of 3D packaging is critical, for example, in systems with the compute die as the top die and a cache memory die as the bottom die.
Table 1: UCIe Consortium KPI Targets for 2.5D and 3D Packaging
Source: A Deep Dive into UCIe-3D UCIe Consortium Webinar Dec 4, 2024
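To see what these targets mean in practice, the bandwidth density and energy-per-bit figures can be combined into an interconnect power density. The sketch below is a back-of-the-envelope calculation, not part of the UCIe specification. It assumes decimal units (1 GB = 10^9 bytes), full utilization of the link, and treats the <0.05 pJ/b target as an upper bound of 0.05 pJ/b. The function name is hypothetical.

```python
def link_power_watts(bandwidth_gb_per_s: float, energy_pj_per_bit: float) -> float:
    """Power drawn by a link moving `bandwidth_gb_per_s` at `energy_pj_per_bit`.

    Assumes decimal gigabytes (1e9 bytes) and a fully utilized link.
    """
    bits_per_second = bandwidth_gb_per_s * 1e9 * 8
    joules_per_bit = energy_pj_per_bit * 1e-12
    return bits_per_second * joules_per_bit

# Targets quoted in the text, per mm^2 of die-to-die interface area.
ucie_a_w_per_mm2 = link_power_watts(1350, 0.25)   # top of the UCIe-A range
ucie_3d_w_per_mm2 = link_power_watts(4000, 0.05)  # UCIe-3D, 0.05 pJ/b as upper bound

if __name__ == "__main__":
    print(f"UCIe-A : {ucie_a_w_per_mm2:.2f} W per mm^2 at 1350 GB/s per mm^2")
    print(f"UCIe-3D: {ucie_3d_w_per_mm2:.2f} W per mm^2 at 4000 GB/s per mm^2")
```

With these inputs the 3D target moves roughly three times the data per mm² while dissipating less interconnect power in that area, which is the scaling benefit the KPI targets describe.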
Implementing a 3D package provides many benefits in scalability and performance but brings new challenges. To address these challenges a new approach and new tools are required for architecture definition and planning, feasibility assessment, prototyping, and advanced package design. Designers need to consider new multi-physics aspects such as crosstalk between elements on different dies and thermal management of multiple dies where one die may heat up a nearby die.
With 3D packaging, IO no longer needs to be placed at the edge of the chip. In addition, with hybrid bonding technology the vertical die-to-die connection between the dies is even tighter. Hybrid bonding connects dies in a package using tiny copper-to-copper connections (<10um). Synopsys offers specially tuned 3DIO IP for multi-die designs and 3D packaging, enabling an optimal balance of power, performance, and area to address 3D packaging demands. This IP solution includes a synthesis-friendly Tx/Rx cell compatible with Synopsys standard cell libraries, source-synchronous 3DIO to help lower bit error rates and ease timing closure, and a 64-bit hardened PHY solution with clock forwarding and built-in redundancy. For more information, read the Synopsys 3DIO Solution for Multi-Die Integration (2.5D/3D) article.
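The effect of connection pitch on interconnect density can be estimated with simple geometry. The sketch below assumes connections laid out on a regular square grid, which is a simplification since real bump maps may use other patterns and reserve sites for power and ground. The 9um value matches the bump pitch quoted above for UCIe-3D; the 45um microbump pitch is an assumed comparison point for a 2.5D interface, not a figure from this article.

```python
def connections_per_mm2(pitch_um: float) -> float:
    """Connections per mm^2 for a square grid with the given pitch in micrometers."""
    per_mm = 1000.0 / pitch_um   # connections along one millimeter
    return per_mm ** 2           # square grid, so density scales with 1 / pitch^2

HYBRID_BOND_PITCH_UM = 9.0   # pitch quoted in the text for UCIe-3D
MICROBUMP_PITCH_UM = 45.0    # assumed 2.5D microbump pitch, for comparison only

hybrid_density = connections_per_mm2(HYBRID_BOND_PITCH_UM)
microbump_density = connections_per_mm2(MICROBUMP_PITCH_UM)
density_ratio = hybrid_density / microbump_density

if __name__ == "__main__":
    print(f"hybrid bond: {hybrid_density:,.0f} connections per mm^2")
    print(f"microbump:   {microbump_density:,.0f} connections per mm^2")
    print(f"ratio:       {density_ratio:.0f}x")
```

Because density scales with the inverse square of the pitch, a 5x reduction in pitch yields a 25x increase in available connections per unit area under this model.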
Integrating interface IP with off-chip IO PHYs is not as simple as taking existing 2D implementations and running them through 3D-enabled tools. IP providers must take care to deliver IP that works in the context of a particular 3D IC topology. This will require a closer partnership between IP providers and designers than may have been required in the past.
A common 3D topology is Chip-on-Wafer (CoW), in which tested dies are stacked on top of a tested wafer that is then diced into individual known good die stacks, which are assembled and tested to create a final product. The dies can be bonded to the wafer using metal-to-metal hybrid bonding or solder bump connections. In the face-to-face topology, the bottom die is flipped compared to a standard face-down flip-chip assembly, so its metal layers face up and directly face the metal layers of the top die, which is in the standard orientation, as shown in Figure 3. This offers the highest-density and lowest-resistance connection between the two dies but is limited to stacks of two dies. The face-to-back topology keeps all dies in the standard face-down flip-chip orientation and allows stacking more than two dies in applications such as HBM memories.
Figure 3: Face-to-Back vs. Face-to-Face
In the face-to-face topology, the typical stack has the leading-edge compute node as the top die and the lower die will be on an older less expensive process node and include analog and IO functions which do not benefit as much from scaling to the latest nodes.
Examples of bottom-die interface IP include 2.5D UCIe interfaces that connect to other 3D stacks or 2D dies in the same package, and PCIe 6.0/7.0 or 224G Ethernet interfaces that connect to the outside world through the package. In these cases, the PHY IP must be reoriented so that signals from the bumps pass through the silicon bulk using through-silicon vias (TSVs) to connect to the metal layers and route to the devices in the diffusion layer. IOs may also need to accommodate additional TSVs and routing to carry signals and power to the top die. As a result, the bottom die (PHY IP) may increase in size to account for these additional signals, and designers must perform additional analysis to address multi-physics effects on TSV signals and embedded inductors.
The adoption of multi-die designs is increasing as seen by products from AMD, Intel, Nvidia, and others, and as reported by various industry analysts. Synopsys predicts the use of 3D packaging to go from R&D to production in the next 12 months, which will put the focus on 3D-enabled IP availability beyond just die-to-die. Depending on the die topology, the interface IP must be optimized to support the required 3D packaging features for scalability and performance. Synopsys offers a complete and scalable multi-die solution, which includes a broad portfolio of die-to-die and chip-to-chip IPs that address the latency, power, performance, and multi-physics requirements of 2.5D and 3D packaging. We are working with our ecosystem partners to deliver the best and most differentiated solutions that our customers can leverage to move their innovations forward.