Error Pattern Transformation: Rescuing Uncorrectable Fault Patterns in On-chip Memories
Voltage scaling effectively reduces processor power, but it also reduces the reliability of the SRAM cells in on-chip memories; as such, it is often accompanied by an error correcting code (ECC). To enable reliable and efficient memory operation at low voltages, ECCs for on-chip memories must provide both high error coverage and low correction latency. Unfortunately, existing ECCs provide either low error coverage or high correction latency. We observe that the number of errors that many low-latency ECCs can correct varies widely with the error pattern in the logical word they protect. We propose adaptively rearranging the logical-to-physical bit mapping of each word according to the BIST-detectable fault pattern in the physical word. This adaptive mapping transforms many uncorrectable error patterns in the logical words into correctable ones and therefore improves ECC error coverage and reduces the minimum required operating voltage. Our evaluations for an L1 cache show that a low-latency ECC combined with our proposal can tolerate a 26.7x higher bit failure rate than the best low-latency ECC baseline alone while incurring the same number of overhead cycles for correction; this in turn provides a 28.2% processor-wide power reduction when applied to 65 nm 6T SRAM cells. Benefits remain even in the presence of faults undetectable by BIST (e.g., soft errors).
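The core idea, choosing a per-word bit mapping so that a known fault pattern lands where the ECC can correct it, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the abstract: a 16-bit word, a hypothetical segmented code that corrects at most one error per 4-bit logical segment, and only two candidate mappings (`identity` and `interleave`). The actual ECC and remapping hardware may differ substantially.

```python
from typing import Callable, Dict, Optional, Set

# Hypothetical parameters, chosen only for illustration.
WORD_BITS = 16
SEGMENT_BITS = 4  # assumed: each logical segment tolerates at most 1 error
NUM_SEGMENTS = WORD_BITS // SEGMENT_BITS


def identity(physical_bit: int) -> int:
    """Physical bit p holds logical bit p."""
    return physical_bit


def interleave(physical_bit: int) -> int:
    """Spread physically adjacent bits across different logical segments."""
    return (physical_bit % NUM_SEGMENTS) * SEGMENT_BITS + physical_bit // NUM_SEGMENTS


# Candidate mappings, tried in insertion order (hypothetical set).
MAPPINGS: Dict[str, Callable[[int], int]] = {
    "identity": identity,
    "interleave": interleave,
}


def is_correctable(faulty_physical: Set[int], mapping: Callable[[int], int]) -> bool:
    """True if no logical segment receives more than one faulty bit."""
    hit_segments = set()
    for p in faulty_physical:
        segment = mapping(p) // SEGMENT_BITS
        if segment in hit_segments:
            return False
        hit_segments.add(segment)
    return True


def choose_mapping(faulty_physical: Set[int]) -> Optional[str]:
    """Pick the first mapping that makes the BIST-reported pattern correctable."""
    for name, mapping in MAPPINGS.items():
        if is_correctable(faulty_physical, mapping):
            return name
    return None  # pattern stays uncorrectable under every candidate mapping


if __name__ == "__main__":
    # Two adjacent faulty cells: uncorrectable as-is, correctable once remapped.
    print(choose_mapping({0, 1}))
```

In this toy model, faults in physical bits 0 and 1 fall into the same logical segment under the identity mapping, while the interleaved mapping places them in different segments, so the same physical fault pattern becomes correctable without changing the code itself.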
Sunday, Sept. 20, 2015, 8 a.m. — Tuesday, Sept. 22, 2015, 10 p.m. CT
Austin, TX, United States
Technical conference and networking event for SRC members and students.