Data Remapping for Design Space Optimization of Embedded Cache Systems
Palem, Krishna V.
Rabbah, Rodric Michel
In this paper, we present a novel linear-time algorithm for data remapping that is (i) lightweight, (ii) fully automated, and (iii) applicable to pointer-centric programming languages with dynamic memory allocation support. All previous work in this area lacks one or more of these features. We go on to show that this algorithm impacts the design and usage of embedded systems in two significant ways. First, we show that, by virtue of the locality enhancements that data remapping provides, our approach halves the amount of cache memory needed to meet a specified performance goal. While cache systems are very desirable from a performance standpoint, cost considerations have always limited their use in embedded designs. Because remapping can significantly lower cache needs, it can be a key step in optimizing memory requirements during design-space exploration of embedded systems. To help achieve this goal, we identify a range of metrics for quantifying the costs associated with popular notions of locality, prefetching, regularity of memory access, and others. These metrics can serve as the quantitative foundation of a design-space exploration system in which remapping plays a crucial role in optimizing the costs associated with the cache subsystem. Second, for several COTS microprocessors with a fixed cache architecture, such as the Pentium and UltraSparc, we show that remapping achieves a performance improvement of 20% on average. In addition, for a parametric research HPL-PD microprocessor, which characterizes the new Itanium machines, we achieve a performance improvement of 28% on average. All of our results are obtained using applications from the DIS, OLDEN, and SPEC2000 suites of integer and floating-point benchmarks.
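The abstract does not describe the remapping algorithm itself, so the following is only a generic illustration of why changing data layout can reduce cache demand, not the authors' method. It assumes a hypothetical record of four 8-byte fields, a 32-byte cache line, and a traversal that reads a single field of every record; the function names and all parameters are invented for this sketch.

```python
def lines_touched(addresses, line_size=32):
    """Count the distinct cache lines covered by a list of byte addresses."""
    return len({addr // line_size for addr in addresses})


def interleaved_addresses(n_records, record_size, field_offset):
    """Addresses of one field when whole records are stored back to back."""
    return [i * record_size + field_offset for i in range(n_records)]


def remapped_addresses(n_records, field_size, field_index):
    """Addresses of the same field when each field has its own contiguous region."""
    base = field_index * n_records * field_size
    return [base + i * field_size for i in range(n_records)]


# Hypothetical parameters: 1024 records, four 8-byte fields per record.
N_RECORDS, FIELD_SIZE, N_FIELDS = 1024, 8, 4
RECORD_SIZE = FIELD_SIZE * N_FIELDS

before = lines_touched(interleaved_addresses(N_RECORDS, RECORD_SIZE, 0))
after = lines_touched(remapped_addresses(N_RECORDS, FIELD_SIZE, 0))
print(before, after)
```

Under these assumptions the interleaved layout touches one cache line per record (1024 lines), while the field-grouped layout packs four useful values into each line (256 lines). The numbers depend entirely on the chosen sizes and access pattern and say nothing about the results reported in the paper.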