A HyperTransport-Enabled Global Memory Model For Improved Memory Efficiency

Please use this identifier to cite or link to this item: http://hdl.handle.net/1853/27231

Title: A HyperTransport-Enabled Global Memory Model For Improved Memory Efficiency
Author: Young, Jeffrey ; Yalamanchili, Sudhakar ; Silla, Federico ; Duato, José
Abstract: Modern and emerging data centers are presenting unprecedented demands in terms of cost and energy consumption, far outpacing architectural advances related to economies of scale. Consequently, blade designs exhibit significant cost and power inefficiencies, particularly in the memory system. For example, we observe that modern blades are often overprovisioned to accommodate peak memory demand, which rarely occurs concurrently across blades. With memory often accounting for 20% to 40% of the total system power [1], this approach is not sustainable. Concurrently, HyperTransport, in concert with new high-bandwidth commodity interconnects, can provide low-latency sharing of memory across blades. This paper provides a HyperTransport-enabled solution for seamless, efficient sharing of memory across blades in a data center, leading to significant power and cost savings. Specifically, we propose a new global address space model called the Dynamic Partitioned Global Address Space (DPGAS) model that extends previous concepts for Non-Uniform Memory Access (NUMA) and partitioned global address spaces (PGAS). The DPGAS model relies on HyperTransport’s low-latency characteristics to enable new techniques for efficient sharing of memory across data center blades. This paper presents the DPGAS model, describes HyperTransport-based hardware support for the model, and assesses this model’s power and cost impact on memory-intensive applications. Overall, we find that cost savings can range from 4% to 26%, with power reductions ranging from 2% to 25%, across a variety of fixed application configurations using server consolidation and memory throttling. The HyperTransport implementation enables these savings with an additional node latency cost of 1,690 ns per remote 64-byte cache line access across the blade-to-blade interconnect.
Type: Technical Report
URI: http://hdl.handle.net/1853/27231
Date: 2008
Contributor: Georgia Institute of Technology. School of Electrical and Computer Engineering ; Universidad Politécnica de Valencia
Relation: CERCS ; GIT-CERCS-08-10
Publisher: Georgia Institute of Technology
Subject: Bandwidth ; Blades ; Dynamic Partitioned Global Address Space (DPGAS) ; Interconnect ; Latency ; Low-latency ; Memory ; Partitions

All materials in SMARTech are protected under U.S. Copyright Law and all rights are reserved, unless otherwise specifically indicated on or in the materials.

Files in this item

File: git-cercs-08-10.pdf
Size: 208.1Kb
Format: PDF
