• Login
    View Item 
    •   SMARTech Home
    • College of Computing (CoC)
    • School of Computer Science (SCS)
    • School of Computer Science Technical Reports
    • View Item
    •   SMARTech Home
    • College of Computing (CoC)
    • School of Computer Science (SCS)
    • School of Computer Science Technical Reports
    • View Item
    JavaScript is disabled for your browser. Some features of this site may not work without it.

    SD³: A Scalable Approach to Dynamic Data-Dependence Profiling

    Thumbnail
    View/Open
    GT-CS-11-09.pdf (2.709Mb)
    Date
    2011
    Author
    Kim, Minjang
    Lakshminarayana, Nagesh B.
    Kim, Hyesoon
    Chi-Keung Luk,
    Metadata
    Show full item record
    Abstract
    As multicore processors are deployed in mainstream computing, the need for software tools to help parallelize programs is increasing dramatically. Data-dependence profiling is an important technique to exploit parallelism in programs. More specifically, manual or automatic parallelization can use the outcomes of data-dependence profiling to guide where to parallelize in a program. However, state-of-the-art data-dependence profiling techniques are not scalable as they suffer from two major issues when profiling large and long-running applications: (1) runtime overhead and (2) memory overhead. Existing data-dependence profilers are either unable to profile large-scale applications or only report very limited information. In this paper, we propose a scalable approach to data-dependence profiling that addresses both runtime and memory overhead in a single framework. Our technique, called SD³, reduces the runtime overhead by parallelizing the dependence profiling step itself. To reduce the memory overhead, we compress memory accesses that exhibit stride patterns and compute data dependences directly in a compressed format. We demonstrate that SD³ reduces the runtime overhead when profiling SPEC 2006 by a factor of 4.1⨯ and 9.7⨯ on eight cores and 32 cores, respectively. For the memory overhead, we successfully profile SPEC 2006 with the reference input, while the previous approaches fail even with the train input. In some cases, we observe more than a 20⨯ improvement in memory consumption and a 16⨯ speedup in profiling time when 32 cores are used.
    URI
    http://hdl.handle.net/1853/38798
    Collections
    • College of Computing Technical Reports [505]
    • School of Computer Science Technical Reports [105]

    Browse

    All of SMARTechCommunities & CollectionsDatesAuthorsTitlesSubjectsTypesThis CollectionDatesAuthorsTitlesSubjectsTypes

    My SMARTech

    Login

    Statistics

    View Usage StatisticsView Google Analytics Statistics
    facebook instagram twitter youtube
    • My Account
    • Contact us
    • Directory
    • Campus Map
    • Support/Give
    • Library Accessibility
      • About SMARTech
      • SMARTech Terms of Use
    Georgia Tech Library266 4th Street NW, Atlanta, GA 30332
    404.894.4500
    • Emergency Information
    • Legal and Privacy Information
    • Human Trafficking Notice
    • Accessibility
    • Accountability
    • Accreditation
    • Employment
    © 2020 Georgia Institute of Technology