dproc - Extensible Run-Time Resource Monitoring for Cluster Applications
In this paper, we describe the dproc (distributed /proc) kernel-level mechanisms and abstractions, which provide the building blocks for implementing efficient, cluster-wide, and application-specific performance monitoring. Such monitoring functionality may be constructed at any time, both before and during application invocation, and can include dynamic run-time extensions. This paper (i) presents dproc's implementation in a Linux-based cluster of SMP machines, and (ii) evaluates its utility by constructing sample monitoring functionality.