The proposed research aims at designing highly efficient and scalable algorithms for large graph-based computations on modern data sets and emerging parallel platforms. Graph abstractions and graph-theoretic computations play a critical role in the analysis of data from experimental devices, medical data sets, socially-generated data from the web and mobile devices, and scientific simulation data. The proposal aims to provide a fundamental understanding of parallel graph analytics, with the creation of novel algorithmic frameworks. Motivated by current terascale applications in genomics, proteomics, and social network analytics, the project will undertake the clean-slate design of four algorithmic frameworks that capture broad classes of graph-based computations: traversal-based static graph computations, dynamic graph analytics, subgraph enumeration and pattern search computations, and multiscale and multilevel graph computations. This research will lead to the design of new memory-efficient graph representations and data structures, the creation of novel linear time parallel algorithmic strategies based on data partitioning, and a deeper understanding of architectural features that impact graph processing efficiency and scalability. The expected outcome is to enable non-parallel computing experts design graph analytics at orders-of-magnitude higher levels of abstraction and performance than the current state-of-the-art.

The proposed educational activities, closely related to the research goals, attempt to foster an environment of interdisciplinary computational research within Penn State. New graduate classes on parallel graph analysis, parallel algorithms for computational biology, and high-performance social data mining, will facilitate student involvement in current research activities of this project and its collaborators. The identification of new high-level data-centric parallel algorithm and software design principles will be a key broader impacts outcome. The project will actively collaborate with academic, industrial, and government laboratory partners that rely on graph-based analytics, release the algorithmic frameworks under open-source licenses, and train practitioners and interdisciplinary teams through virtual workshops and tutorials.

NSF ACI Award #1253881

Research Team


Talks and Presentations


Last updated: