Design methodologies for scalable and reliable memory systems
Abstract
The objective of this research is to develop design methodologies for scalable and reliable memory systems, addressing the scalability and reliability issues that continued scaling creates or exacerbates. The research first investigates the origins and device-level mechanisms of memory failures and then proposes circuit- and system-level modeling and simulation methodologies to examine how such failures affect the operation of a memory system. Based on observations from the simulation results, it introduces design methodologies that (1) mitigate the row-hammering phenomenon by applying counter-based or probabilistic activations to victim rows and (2) repair the growing number of wearout failures during field operation, using error-correcting codes for error detection and a sequence of commands for error identification. To enhance the reliability of a memory system, the research also proposes methodologies that accurately estimate memory reliability using system-level accelerated life tests with built-in self-test and error-correcting codes. Finally, it introduces a method for optimizing the design of experiments so as to isolate failures caused by a target wearout mechanism from failures caused by other mechanisms and to minimize errors in the estimated wearout parameters at the normal operating condition.
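
As an illustration of the two row-hammer mitigation policies named above, the following is a minimal sketch and not the design proposed in this research. It assumes a single bank in which the victims of an aggressor row are its two physically adjacent rows. The class name `RowHammerGuard`, its parameters `threshold` and `probability`, and every numeric value are hypothetical and chosen only for the example.

```python
import random


class RowHammerGuard:
    """Toy model of two victim-row activation policies for one DRAM bank.

    Assumption: victims are the rows directly adjacent to the activated row.
    """

    def __init__(self, num_rows, threshold=None, probability=None, rng=None):
        self.num_rows = num_rows
        self.threshold = threshold        # counter-based policy if not None
        self.probability = probability    # probabilistic policy if not None
        self.counts = [0] * num_rows      # activations since last victim refresh
        self.rng = rng or random.Random(0)

    def neighbors(self, row):
        """Return the in-range rows adjacent to `row`."""
        return [r for r in (row - 1, row + 1) if 0 <= r < self.num_rows]

    def activate(self, row):
        """Record one activation of `row`; return the victim rows to activate."""
        if self.threshold is not None:
            # Counter-based: act once the aggressor reaches the threshold.
            self.counts[row] += 1
            if self.counts[row] >= self.threshold:
                self.counts[row] = 0
                return self.neighbors(row)
        if self.probability is not None:
            # Probabilistic: act on each activation with a fixed probability.
            if self.rng.random() < self.probability:
                return self.neighbors(row)
        return []


if __name__ == "__main__":
    guard = RowHammerGuard(num_rows=8, threshold=3)  # hypothetical threshold
    for i in range(3):
        print(i, guard.activate(4))
```

The counter-based policy gives a deterministic bound on activations between victim refreshes at the cost of per-row state, whereas the probabilistic policy needs no counters but offers only a statistical guarantee.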