The mobile industry is growing at a very fast pace, driven by a never-ending hunger for data and bandwidth. In a very short span of time we have moved from dial pads to touch screens, from black-and-white displays to QHD and 4K displays with millions of colors, and from kilobytes of memory to gigabytes. The biggest challenge is increasing bandwidth without compromising performance or adding significantly to power consumption. The answer to this challenge is the LPDDR (Mobile DDR) standard released by JEDEC. There have been several revisions to this standard, the latest being LPDDR4. LPDDR4 provides a data rate of up to 4266 Mbps per pin, which is almost double that of LPDDR3. It also provides a significant reduction in power consumption compared to LPDDR3. For further insights on LPDDR4 and its predecessors, please refer to our previous blog “LPDDR4: What Makes it Faster and Reduces Power Consumption.”
In this blog, we will discuss the features which make LPDDR4 efficient in terms of power consumption, bandwidth utilization, data integrity and performance.
LPDDR4 introduces a new I/O signaling scheme known as low voltage swing terminated logic (LVSTL). LVSTL uses significantly lower voltage levels than the signaling used in previous versions of LPDDR. Another advantage of this scheme is that it consumes no termination power while a low level (0) is being driven through the I/O drivers. This implies that less power is consumed when the data stream contains more zeros. The data bus inversion (DBI) feature has been introduced to keep more zeros than ones in the data stream. DBI works at byte-level granularity. Whenever a byte contains more than four bits set to 1, the driver inverts the whole byte and sets the corresponding data mask inversion (DMI) bit to notify the receiver that the byte has been inverted.
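To make the byte-level rule concrete, here is a minimal Python sketch of DBI encoding and decoding as described above. The function names `dbi_encode` and `dbi_decode` are our own illustrative choices and not part of any JEDEC or Synopsys API; the sketch models only the inversion decision, not the electrical behavior.

```python
def dbi_encode(byte):
    """Return (bus_byte, dmi) for one data byte.

    Illustrative model only: if more than four of the eight bits are 1,
    the byte is inverted and the DMI bit is set to 1.
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must be in the range 0..255")
    ones = bin(byte).count("1")
    if ones > 4:
        return (~byte) & 0xFF, 1   # invert all eight bits, flag via DMI
    return byte, 0                 # send unchanged, DMI stays low


def dbi_decode(bus_byte, dmi):
    """Recover the original byte from the bus value and its DMI bit."""
    return (~bus_byte) & 0xFF if dmi else bus_byte


if __name__ == "__main__":
    for value in (0x0F, 0x1F, 0xFF):
        bus, dmi = dbi_encode(value)
        print(f"data=0x{value:02X} -> bus=0x{bus:02X}, DMI={dmi}")
```

With this rule, no byte driven on the bus ever carries more than four ones, which is what keeps the number of high levels on the DQ lines down.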
LPDDR4 adds two physical sets of registers, the frequency set points FSP0 and FSP1, so that the DRAM can switch between two operating frequencies without retraining. Each register set stores all the operational parameters the DRAM requires at one of the two frequencies; at any time one set is in effective mode and the other is in shadow mode. The DRAM is trained at both frequencies, and the resulting parameters are stored in the register sets during command bus training mode. Switching from FSP0 to FSP1, or vice versa, can then be completed rapidly simply by writing to a mode register.
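The effective/shadow arrangement can be pictured with a small behavioral model. The Python sketch below is a simplification under our own assumptions: the class `FspModel`, the selector names `fsp_wr` and `fsp_op`, and the parameter names and values are hypothetical placeholders, not actual mode register fields or timing values.

```python
class FspModel:
    """Toy model of two frequency set points (FSP0 and FSP1)."""

    def __init__(self):
        self.sets = [{}, {}]   # parameter storage for FSP0 and FSP1
        self.fsp_wr = 0        # set that parameter writes are directed to
        self.fsp_op = 0        # set the DRAM currently operates from

    def select_write_set(self, index):
        self.fsp_wr = index

    def write_param(self, name, value):
        # Writes land in the selected set, which may be the shadow set.
        self.sets[self.fsp_wr][name] = value

    def switch_operating_set(self, index):
        # Models the single mode register write that swaps the roles.
        self.fsp_op = index

    def effective_params(self):
        return dict(self.sets[self.fsp_op])


if __name__ == "__main__":
    dram = FspModel()
    dram.write_param("read_latency", 6)     # hypothetical low-speed value
    dram.select_write_set(1)
    dram.write_param("read_latency", 36)    # hypothetical high-speed value
    print(dram.effective_params())          # still operating from FSP0
    dram.switch_operating_set(1)
    print(dram.effective_params())          # now operating from FSP1
```

The point of the model is that programming the shadow set never disturbs the set in use, so the frequency change itself reduces to flipping one selector.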
Increasing memory density within the same chip size leads to smaller DRAM cells. A smaller cell stores less charge than a bigger one, which reduces the noise margin and makes the system more prone to data errors. Densely packed cells are also less immune to crosstalk, which can eventually result in data errors. To perform any data operation on a row, the row must first be activated. Here “activate” means raising the cells of that row to a higher voltage level while the other rows of the bank remain at a lower voltage level. When a row is activated repeatedly in quick succession, its voltage level changes just as often, and because the cells sit so close together this accelerates the discharge rate of the cells in the adjacent rows.
DRAM cells store data as charge in capacitors, and this charge leaks away over time, so each cell needs a refresh cycle within the refresh period to retain its contents. With an accelerated discharge rate, a cell in an adjacent row may lose its data because its capacitor discharges before the next refresh cycle arrives. To overcome this scenario, LPDDR4 introduced the Target Row Refresh (TRR) mechanism. TRR limits the maximum number of activates (the MAC count) on a single row within a refresh period. Whenever the activate count of a row (the target row) reaches the MAC count, the adjacent rows (the victim rows) are refreshed by the TRR procedure to avoid data loss.
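The bookkeeping behind TRR can be sketched as a per-row activate counter. The Python model below is a simplified illustration under stated assumptions: the class name `TrrBankModel` and the MAC value are hypothetical, and the sketch treats rows `n-1` and `n+1` as the physical neighbors of row `n`, which need not hold for a real device.

```python
from collections import defaultdict


class TrrBankModel:
    """Toy model of activate counting for one bank."""

    def __init__(self, mac, num_rows):
        self.mac = mac                      # maximum activates per refresh period
        self.num_rows = num_rows
        self.activates = defaultdict(int)   # activate count per row

    def activate(self, row):
        """Count one activate; return the victim rows that need a refresh."""
        self.activates[row] += 1
        if self.activates[row] < self.mac:
            return []
        # The target row reached the MAC count: refresh its neighbors
        # (assumed to be row-1 and row+1) and restart the count.
        self.activates[row] = 0
        return [r for r in (row - 1, row + 1) if 0 <= r < self.num_rows]

    def refresh_period_elapsed(self):
        """A regular refresh period has passed, so all counts start over."""
        self.activates.clear()


if __name__ == "__main__":
    bank = TrrBankModel(mac=3, num_rows=8)   # tiny MAC value for demonstration
    for i in range(3):
        print(f"activate #{i + 1} of row 4 -> refresh rows {bank.activate(4)}")
```

A verification environment can use the same kind of counter as a checker, flagging any row whose activate count exceeds the MAC count without an intervening TRR.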
LPDDR4 provides several training modes to align or readjust the delays introduced on the I/O signals with respect to CLK or other signals. According to the standard physical interface definition of LPDDR4, the CLK, CS, CA, DQ and DQS signals need proper alignment for successful data transfers. Because the CA lines are sampled on the CLK signal, there must be a proper phase relationship between CA and CLK. Similarly, DQ is sampled on the DQS signal, so a proper phase relationship is needed between those two as well. To maintain these phase relationships, LPDDR4 defines training mechanisms. Let’s look at those:
These features make LPDDR4 a complete package, ideal for use as RAM in any mobile SoC, and they must be addressed in any verification plan for LPDDR4-based SoC designs. Synopsys provides a complete verification solution for LPDDR4 with run-time selection of JEDEC and vendor parts, a set of built-in protocol, timing and data integrity checks, configurable timing parameters, built-in functional coverage and verification plan, and backdoor access to memory.