Balancing generality and specialization for machine learning in the post-ISA era
A growing number of commercial and enterprise systems rely on compute-intensive machine learning algorithms. While the demand for these applications is growing, the performance benefits from general-purpose platforms are diminishing. This challenge has coincided with an explosion of data, in which the rate of data generation has reached a level beyond the capabilities of current computing systems. Therefore, applications such as machine learning and robotics can benefit from hardware acceleration. Traditionally, to accelerate a set of workloads, we profile the code optimized for CPUs and offload the hot functions onto hardware compute units designed specifically for those functions, thereby providing higher performance and energy efficiency. In this work, we instead take a revolutionary approach in which we delve into the algorithmic properties of applications to define domain-generic hardware acceleration solutions. We leverage the property that a wide range of machine learning algorithms can be modeled as stochastic optimization problems. Using this insight, we devise compute stacks for hardware acceleration that are built independently of the CPU. These stacks expose a high-level mathematical programming interface and automatically generate accelerators for users who have limited knowledge of hardware design but can benefit from large performance and efficiency gains for their programs.
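The observation that many learning algorithms reduce to stochastic optimization can be illustrated with a minimal sketch. It is written in plain Python for illustration only and is not the programming interface of the compute stacks described here; the names `sgd`, `linear_regression_grad`, and `logistic_regression_grad`, the toy data, and the hyperparameters are all assumptions. The point is that one generic stochastic gradient descent loop serves two different algorithms, which differ only in the gradient of their objective function.

```python
import math
import random

def dot(w, x):
    """Inner product of two equal-length lists."""
    return sum(wi * xi for wi, xi in zip(w, x))

def sgd(gradient, data, dim, lr=0.05, epochs=200, seed=0):
    """Generic stochastic gradient descent.

    Only `gradient` is specific to the learning algorithm; the update
    loop is shared by every model expressed this way.
    """
    rng = random.Random(seed)
    w = [0.0] * dim
    samples = list(data)
    for _ in range(epochs):
        rng.shuffle(samples)
        for x, y in samples:
            g = gradient(w, x, y)
            w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w

def linear_regression_grad(w, x, y):
    # Gradient of the squared loss 0.5 * (w.x - y)^2 with respect to w.
    err = dot(w, x) - y
    return [err * xi for xi in x]

def logistic_regression_grad(w, x, y):
    # Gradient of the log loss with respect to w, for labels y in {0, 1}.
    p = 1.0 / (1.0 + math.exp(-dot(w, x)))
    return [(p - y) * xi for xi in x]

# Hypothetical toy data; the leading 1.0 in each input is a bias feature.
regression_data = [([1.0, x], 2.0 * x + 1.0) for x in (-1.0, -0.5, 0.0, 0.5, 1.0)]
classification_data = [([1.0, x], 1 if x > 0 else 0) for x in (-2.0, -1.0, 1.0, 2.0)]

w_lin = sgd(linear_regression_grad, regression_data, dim=2)
w_log = sgd(logistic_regression_grad, classification_data, dim=2)
```

In this sketch, `w_lin` approaches the intercept and slope used to generate the regression data, and `w_log` separates the two classes. Swapping the gradient function is the only change needed to move from one algorithm to the other.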
Keeping these ambitious goals in mind, our work (1) strikes a balance between generality and specialization by breaking the long-held traditional abstraction of the Instruction Set Architecture (ISA) in favor of a more algorithm-centric approach; (2) develops hardware acceleration frameworks by co-designing a language, compiler, runtime system, and hardware to provide high performance and efficiency, in addition to flexibility and programmability; (3) separates algorithmic specification from implementation to shield programmers from continual hardware/software modifications while allowing them to benefit from the emerging heterogeneity of modern compute platforms; and (4) develops real cross-stack prototypes to evaluate these solutions in a real-world setting and releases them as open source to maximize community engagement and industry impact. Our work, TABLA (http://act-lab.org/artifacts/tabla/), is public and defines the very first open-source hardware platform for machine learning and artificial intelligence.