From Silicon Design to End of Life—Mitigate Memory Failures to Boost Reliability

Anand Thiruvengadam, Guy Cortez

May 31, 2023 / 6 min read

If you want to lower your risk and achieve SoC design success sooner, memory design and verification should be commanding your attention. This is even more true if your projects are safety-critical, including applications such as autonomous vehicles, smart medicine, life support in the cosmos, disaster prevention here on Earth, you name it. In these critical applications, meeting or exceeding high reliability and functional safety requirements can be the difference between life and death, chaos and order, success and failure. Simply put, you don’t want your memory to fail.

Memory design must meet the moment as we trend toward a hyperconvergent, multi-die future. One-size-fits-all, general-purpose memories no longer work for advanced applications. Today, a web of diverse analog and digital interconnects, a complex power distribution network (PDN), and new requirements for faster memory access all impact memory design, even as that design adapts to new protocols, technologies, and architectures. With all these balls in the air, how can you ensure that your memory design also remains reliable throughout its lifecycle?

Leveraging an article that recently appeared in Semiconductor Engineering, we’re exploring the importance of memory reliability while addressing today’s silicon design complexities. Read on to learn more about increasing memory reliability throughout the silicon lifecycle.


What’s the Difference Between Safety and Reliability Anyway?

In electronic design, the terms “safety” and “reliability” are often conflated, but there are nuances that are important to understand. Functional safety engineers build safety into a design by anticipating a wide range of potential faults, so that when a problem occurs it is flagged and mitigated quickly and seamlessly. Designing reliability into an application, on the other hand, minimizes the odds of those events happening in the first place and makes the design more resilient if they do happen at any point during the product lifecycle.

The good news is that electronic design automation (EDA) methods and tools already used in functional safety can be applied to memory, supporting compliance with safety standards for many applications, for instance, ISO 26262 in automotive. Reducing the number of faults, detecting them, and responding appropriately are critical to all the dies in your successful design. This is true throughout the lifecycle of your system and applies to your memory components as well. For memory, the biggest reliability hurdles happen in the early and late parts of the lifecycle, and these challenges must be mitigated prior to tape out by performing static and dynamic analysis.

Here is a roadmap to better ensure memory reliability through every stage of life:

Memory life-stage characteristics:
  • Early Life: Increased failures. Marginal devices are culled through failures, otherwise known as “infant mortality.”
  • Mid Life: Long period of low failure risk.
  • End of Life: Increased failures. Silicon aging decreases reliability and increases device failures.

Failure mitigation through static/dynamic analyses (prior to tape out):
  • Static analog and digital circuit checks
  • Analog fault simulations
  • High-sigma Monte Carlo analysis
  • Static power/signal net resistance checks
  • Dynamic electromigration/IR drop (EMIR) analysis
  • Device aging analysis
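
The three life stages above follow the shape commonly called the bathtub curve. As a rough illustration only, the sketch below models the failure (hazard) rate as the sum of a decreasing Weibull term (early-life failures), a constant term (mid life), and an increasing Weibull term (aging). Every shape, scale, and rate value here is a made-up assumption chosen to produce the curve's shape, not data from any real memory or foundry.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard rate h(t) = (shape/scale) * (t/scale)**(shape - 1), for t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def bathtub_hazard(t_hours):
    """Illustrative bathtub-curve hazard rate (failures per hour) at time t_hours."""
    # All parameters below are hypothetical, picked only for illustration.
    infant = weibull_hazard(t_hours, shape=0.5, scale=2.0e6)   # shape < 1: rate falls over time
    random_rate = 1.0e-7                                       # constant mid-life failure rate
    wearout = weibull_hazard(t_hours, shape=4.0, scale=1.5e5)  # shape > 1: rate rises with aging
    return infant + random_rate + wearout


if __name__ == "__main__":
    stages = [("early life", 10.0), ("mid life", 2.0e4), ("end of life", 1.5e5)]
    for label, t in stages:
        print(f"{label:12s} t = {t:>9.0f} h   hazard = {bathtub_hazard(t):.3e} per hour")
```

With these assumed parameters, the hazard rate is highest at the start, drops through mid life, and climbs again as the wear-out term takes over, which is why the mitigation effort concentrates on the early and late stages.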

How to Find the Right Solution for Memory Reliability in Electronic Systems

Because you are anticipating the problems your memory may encounter throughout its entire life (development, manufacture, in the field, and beyond), it’s vital to ensure a cohesive methodology for robust static and dynamic analysis that maps appropriately to the health of your chip. It’s also important to do this analysis prior to tape out. You will want a unified workflow with easy-to-use tools that enables greater productivity while helping to solve any issues your chip may encounter before they become a problem.

Through Synopsys PrimeSim™ Reliability Analysis and PrimeWave Design Environment, Synopsys delivers a complete solution for memory reliability that will lower your risk and ease your design process.

Ensuring Memory Reliability in a New Era of Multi-Die Design

Synopsys PrimeSim Reliability Analysis provides a unified workflow of production-proven and foundry-certified reliability analysis technologies to enable full lifecycle reliability verification.

Synopsys PrimeWave Design Environment delivers a rich and consistent reliability verification experience across PrimeSim Reliability Analysis technologies and PrimeSim simulator engines with unified setup and results post-processing for better productivity and ease of use.

Synopsys PrimeSim Reliability Analysis includes:

  • Circuit Check (CCK): Extends electrical rules checking (ERC) to analog/memory circuits to help weed out gross design violations and improve coverage.

  • Design Robustness Analysis (DR): A suite of machine learning (ML)-driven analysis technologies that enable efficient and accurate high-sigma (usually 4 to 7 sigma) margin analysis across memory leaf cells and full instances.
  • Electromigration/IR Drop (EMIR): GPU-accelerated, high-performance, foundry-certified analysis on a full-instance or full-chip memory design, including the PDN. Includes “what-if” analysis and debug hints that make problems easier to locate and resolve.

 


  • Custom Fault: Fault simulation that makes functional safety and test coverage analysis practical. It supports ISO 26262 compliance, including complex and comprehensive failure modes, effects, and diagnostic analysis (FMEDA).

 


  • Static Powernet Resistance (SPRES): Power/ground integrity analysis that’s quick enough to run even as memory development is just getting started.
  • MOS Reliability Analysis (MOSRA): High-performance MOS aging analysis with foundry-certified accuracy.
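
To give a feel for why high-sigma margin analysis needs something smarter than plain Monte Carlo, the sketch below estimates how many random samples a brute-force run would need before it observes a handful of failures at a given sigma level. It assumes a single Gaussian-distributed metric with a one-sided failure limit, and the function names and the target of 10 observed failures are illustrative assumptions. It says nothing about how any particular tool actually reduces the sample count.

```python
import math


def tail_probability(sigma):
    """One-sided tail probability P(X > sigma) for a standard normal variable."""
    return 0.5 * math.erfc(sigma / math.sqrt(2.0))


def brute_force_samples(sigma, expected_failures=10):
    """Samples needed for plain Monte Carlo to see about `expected_failures` tail events."""
    # expected_failures is an arbitrary illustrative target, not an industry rule.
    return math.ceil(expected_failures / tail_probability(sigma))


if __name__ == "__main__":
    for sigma in (3, 4, 5, 6, 7):
        p = tail_probability(sigma)
        n = brute_force_samples(sigma)
        print(f"{sigma} sigma: tail probability {p:.2e}, about {n:.2e} samples")
```

Under these assumptions the required sample count grows by orders of magnitude with each added sigma, so running one circuit simulation per sample quickly becomes impractical at the upper end of the range.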
 

PrimeSim Family of Simulators

  • PrimeSim SPICE: Fast GPU-accelerated SPICE simulator for analog, RF, and mixed-signal design.
  • PrimeSim Pro: Industry-leading performance and capacity for DRAM, Flash memory, and mixed-signal verification.
  • PrimeSim HSPICE®: This “gold standard,” accurate circuit simulator uses ML to exponentially decrease the number of runs relative to traditional Monte Carlo simulations.
  • PrimeSim XA: Delivers performance and capacity for SRAM, custom digital, and mixed-signal verification.

 

PrimeWave Design Environment

The Synopsys PrimeWave Design Environment delivers a complete solution to ensure memory reliability from early life to end of life.

Silicon Lifecycle Management (SLM) Family of Products Complements the Synopsys PrimeWave Design Environment

Synopsys Silicon Lifecycle Management promotes silicon reliability and performance throughout your device’s lifecycle as a complement to Synopsys PrimeSim Reliability Analysis and PrimeWave Design Environment:

Synopsys SLM solution by lifecycle stage:
  • In-Design: Silicon data from embedded monitors help you understand the dynamic environmental conditions in tandem with PrimeSim Reliability Analysis.
  • In-Ramp: Pinpoint systemic failures in silicon with the objective of attaining yield targets prior to production.
  • In-Production: Ongoing monitoring and analysis of test manufacturing data to optimize yield, quality, and reliability.
  • In-Field: Ongoing real-time monitoring and analysis of health/aging performance and power optimization.

 

 


Learn More About Reliable Memory Design

From hyperconvergent data centers to the devices in your hand, memory reliability is critical to the functioning of our modern lives. If you would like to learn more about how to get full-spectrum insights on the health of your systems (certified by foundries, compliant with standards, and built for ease of use with a unified flow), visit Synopsys.com/memory or contact us.
