Graph databases and applications have attracted much attention in recent years due to the efficiency with which they can represent big data, connecting different layers of data structures and enabling analysis while preserving contextual relationships.
This has resulted in a fast-growing community developing various database and algorithmic innovations in this area, many of whose members will be gathering at this conference. We joined this field as computer architecture researchers and are currently building a complete hardware-software design, called DECADES, that aims to accelerate the execution of these algorithms.
From a computer architecture point of view, applications involving dense matrix operations, such as neural networks, have garnered much attention for their acceleration through specialized hardware such as GPUs and TPUs, while graph applications remain difficult to improve even with modern specialized accelerator designs. The reason is the characteristic pointer-based data structures of graph applications and the resulting irregular memory accesses performed by many of these workloads. Such irregular memory accesses lead to memory latency bottlenecks that dominate the total execution time. In this talk, as part of the DECADES infrastructure, we present an elegant hardware-software co-design solution, named FAST-LLAMAs, to overcome these memory bottlenecks and thus accelerate graph and sparse applications in an energy-efficient way.
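To make the source of these bottlenecks concrete, the minimal sketch below (in C, with hypothetical names that are not drawn from the DECADES or FAST-LLAMAs code) shows a typical compressed-sparse-row (CSR) neighbor traversal: each iteration performs a data-dependent indirect load, vertex_data[neighbors[e]], whose address cannot be predicted or effectively prefetched by conventional memory hierarchies.

    #include <stddef.h>
    #include <stdint.h>

    /* Illustrative CSR graph layout; names are hypothetical, not taken
     * from the DECADES/FAST-LLAMAs codebase. */
    typedef struct {
        size_t    num_vertices;
        size_t   *row_offsets;  /* length num_vertices + 1 */
        uint32_t *neighbors;    /* column indices, length = number of edges */
    } csr_graph_t;

    /* Sum a per-vertex property over the neighbors of vertex v.
     * The load vertex_data[g->neighbors[e]] is data-dependent and, for
     * real-world graphs, effectively random with respect to the cache:
     * this is the irregular access pattern discussed above. */
    double sum_neighbor_values(const csr_graph_t *g,
                               const double *vertex_data, size_t v) {
        double acc = 0.0;
        for (size_t e = g->row_offsets[v]; e < g->row_offsets[v + 1]; e++) {
            acc += vertex_data[g->neighbors[e]];  /* irregular, indirect load */
        }
        return acc;
    }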
We propose a 40-minute talk that includes a rigorous characterization of the problem and an in-depth analysis of our hardware-software co-design solution, FAST-LLAMAs. We will present results based on a simulated model of our system that show significant performance improvements (up to 8x) as well as energy improvements (up to 20x) on a set of fundamental graph algorithms and important real-world datasets. Our system is completely open-source and includes a compiler and a cycle-accurate simulator. Our proposed system is compatible with, and easily extensible to, many of the open-source graph analytics and database frameworks, and we are excited to engage with the open-source community of this increasingly important domain.
The work is part of a large collaboration among three academic groups: Margaret Martonosi (PI, Princeton), David Wentzlaff (PI, Princeton), and Luca Carloni (PI, Columbia), with students and researchers Juan L. Aragón (U. of Murcia, Spain), Jonathan Balkind, Ting-Jung Chang, Fei Gao, Davide Giri, Paul J. Jackson, Aninda Manocha, Opeoluwa Matthews, Tyler Sorensen, Esin Türeci, Georgios Tziantzioulis, and Marcelo Orenes Vera. In addition to the submission author, portions of the talk may be presented by others in the collaboration.