# ENERGY-AWARE PERFORMANCE SCALING

The commercial rise of Chip Multiprocessors (CMPs) highlights the general trend of increased parallelism over the past decade. Using these complex resources efficiently is difficult from both a power and a performance perspective. The two metrics are intertwined, and optimizing for either requires rigorous study of how both scale.

We investigate power and performance as we scale scientific applications to use these emerging hardware resources. We consider a range of hardware, from parallel computation inside CMPs to the shared-resource nodes of a high-performance cluster. Accurate evaluation of such hardware is difficult and involves simulating intricate interactions within the shared resources. In response to these challenges, we seek novel model-driven techniques that increase understanding and serve as tools to guide performance and power optimizations. These approaches allow quick and accurate prediction of near-optimal configurations, reducing the search space.

*Speedup-aware co-schedules for energy efficient workload management*

Manu Shantharam, Padma Raghavan

Today's high performance computing (HPC) systems have immense parallel processing capacity. However, many applications within commonly executed workloads on such systems suffer from Amdahl's Law effects, i.e., sub-linear fixed-problem speedup on increasing processor counts. We propose speedup-aware co-schedule schemes that seek to enhance overall HPC system energy and throughput efficiencies. We demonstrate their effectiveness at delivering overall system energy improvement by managing the tradeoffs between faster workload completion and slower execution of one or more applications.
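The tradeoff described above can be illustrated with a small sketch. It is not the authors' scheduling scheme: it only applies Amdahl's Law to compare running applications one after another on all processors against running them side by side on equal shares. The application parameters, the processor count, and the constant per-processor power draw are hypothetical values chosen for illustration.

```python
def amdahl_speedup(serial_fraction, procs):
    """Fixed-problem speedup predicted by Amdahl's Law."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / procs)


def run_time(t1, serial_fraction, procs):
    """Run time on `procs` processors, given single-processor time `t1`."""
    return t1 / amdahl_speedup(serial_fraction, procs)


def compare_schedules(apps, total_procs, power_per_proc):
    """Compare a sequential schedule with an equal-share co-schedule.

    Assumption (illustrative only): every processor in the system draws
    `power_per_proc` watts for the whole makespan, busy or idle.
    """
    # Sequential: each application gets all processors, one after another.
    seq_times = [run_time(t1, f, total_procs) for t1, f in apps]
    seq_makespan = sum(seq_times)

    # Co-schedule: applications run concurrently on equal processor shares.
    share = total_procs // len(apps)
    co_times = [run_time(t1, f, share) for t1, f in apps]
    co_makespan = max(co_times)

    system_power = total_procs * power_per_proc
    return {
        "seq_times": seq_times,
        "co_times": co_times,
        "seq_makespan": seq_makespan,
        "co_makespan": co_makespan,
        "seq_energy": seq_makespan * system_power,
        "co_energy": co_makespan * system_power,
    }


# Hypothetical workload: (single-processor time in seconds, serial fraction).
apps = [(1000.0, 0.10), (1000.0, 0.10)]
result = compare_schedules(apps, total_procs=64, power_per_proc=100.0)

for key, value in result.items():
    print(key, value)
```

Under these assumptions each application runs more slowly in the co-schedule, because it has half the processors, yet the workload as a whole finishes sooner and the system consumes less energy, since speedup at 64 processors is far from linear.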

*Application Modeling: Sparse Scientific Codes*

Michael Frasca, Anirban Chatterjee, Padma Raghavan

**Abstract**

A large number of computational modeling and simulation applications rely on sparse data structures and algorithms. Such sparse computations are inherently scalable yet difficult to tune and adapt for energy-aware high performance on modern multicore architectures, since their irregular and complex access patterns create many bottlenecks. Consequently, we are developing memory-based interaction models between such applications and hardware to enable performance optimizations at the software and hardware layers of advanced computing systems. We are also developing these techniques to guide resource scheduling decisions for multi-programmed workloads.
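To make the irregular access pattern concrete, here is a minimal sketch of sparse matrix-vector multiplication over the widely used compressed sparse row (CSR) layout. CSR is assumed here for illustration; the abstract does not say which formats the project uses, and the function and variable names are hypothetical.

```python
def csr_spmv(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    values  : nonzero entries, row by row
    col_idx : column index of each nonzero
    row_ptr : row_ptr[i]..row_ptr[i+1] delimits the nonzeros of row i
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            # x is read through col_idx, so the addresses touched depend on
            # the sparsity pattern rather than on the loop counter. This
            # indirect read is the source of the irregular memory accesses.
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


# 3x3 example matrix:
# [[4, 0, 1],
#  [0, 3, 0],
#  [2, 0, 5]]
values = [4.0, 1.0, 3.0, 2.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
x = [1.0, 2.0, 3.0]

y = csr_spmv(values, col_idx, row_ptr, x)
print(y)
```

The reads of `values` and `col_idx` are sequential, while the reads of `x` jump according to the matrix structure, which is why cache behavior varies so strongly from one sparse matrix to another.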

### Publications List in the area of Energy Aware Performance Scaling:

2012

- NUMA-Aware Graph Mining Techniques for Performance and Energy Efficiency. M. Frasca, K. Madduri, P. Raghavan, Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2012
- Phase Partitioning Methods for I/O Cache Optimization. M. Frasca, P. Raghavan, Proceedings of the International Conference on Parallel Processing, ICPP 2012

**2011**

- Virtual I/O Caching: Effective Storage Cache Management for Concurrent Workloads. M. Frasca, R. Prabhakar, P. Raghavan, M. Kandemir, Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2011
- Exploiting dense substructures for fast sparse matrix vector multiplication. M. Shantharam, A. Chatterjee, P. Raghavan, International Journal of High Performance Computing Applications, Vol. 25, pp. 328-341 (2011)
- Can models of scientific software-hardware interactions be predictive? M. Frasca, A. Chatterjee, P. Raghavan, Proceedings of the International Conference on Computational Science, ICCS 2011

2010

- Analyzing the Soft-Error Resilience of Linear Solvers on Multicore Multiprocessors, K. Malkowski, P. Raghavan and M. Kandemir, Proceedings of the 24th IEEE/ACM International Parallel and Distributed Processing Symposium, IPDPS-2010, April 2010.
- T-NUCA - A Novel Approach to Non-Uniform Access Latency Cache Architectures for 3D CMPs, K. Malkowski, P. Raghavan, M. Kandemir and M. J. Irwin, Proceedings of the 6th Workshop on High-Performance, Power-Aware Computing (HPPAC), in conjunction with 24th IEEE/ACM International Parallel and Distributed Processing Symposium, IPDPS-2010, to appear, April 2010.
- Intra-Application Shared Cache Partitioning For Multithreaded Applications, S. P. Muralidhara, M. Kandemir and P. Raghavan, Proceedings of 15th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming (PPoPP 2010), January 2010.
- Dynamic Core Partitioning for Energy Efficiency. Y. Ding, M. Kandemir, M. J. Irwin and P. Raghavan, Proceedings of the 6th Workshop on High-Performance, Power-Aware Computing (HPPAC), in conjunction with 24th IEEE/ACM International Parallel and Distributed Processing Symposium, IPDPS-2010, to appear, April 2010.

2009

- Performance and Power Impacts of Memory Latency Hiding for Sparse Matrix Vector Multiplication on MultiCore Architectures, M. Shantharam, K. Malkowski and P. Raghavan, Post conference Proceedings of the 9th International Workshop on State-of-the-Art in Scientific and Parallel Computing (PARA 2008), May 2008.
- Adapting Application Execution in CMPs Using Helper Threads, Y. Ding, M. Kandemir, P. Raghavan and M. J. Irwin. Journal of Parallel and Distributed Computing, Invited Paper, Vol. 69, No. 9, pp. 790–806, 2009.
- Adapting Application Mapping to Systematic Within-die Process Variations on Chip Multiprocessors. Y. Ding, M. Kandemir, M.J. Irwin and P. Raghavan, Proceedings of International Conference on High Performance Embedded Architectures & Compilers, pp. 231-247, 2009.

2008

- A Helper Thread Based EDP Reduction Scheme for Adapting Application Execution in CMPs, Y. Ding, M. Kandemir, P. Raghavan and M. J. Irwin. 22nd IEEE/ACM International Parallel and Distributed Processing Symposium, IPDPS-2008, pp. 1–14, DOI: 10.1109/IPDPS.2008.4536297. Best Paper, Software Track; 4 Best Papers, one per track out of 410 submitted papers.
- Ring Data Location Prediction Scheme for Non-Uniform Cache Architectures, S. Akioka, F. Li, K. Malkowski, P. Raghavan, M. Kandemir and M. J. Irwin, Proceedings of XXVI IEEE International Conference on Computer Design, ICCD’08, pp. 693–698, 2008.
- Evaluating the Role of Scratchpad Memories in Chip Multiprocessors for Sparse Matrix Computations, A. Yanamandra, B. Cover, P. Raghavan, M. J. Irwin and M. Kandemir, 22nd IEEE/ACM International Parallel and Distributed Processing Symposium, IPDPS-2008, pp. 1–10, DOI: 10.1109/IPDPS.2008.453631, April 2008.
- Managing Power, Performance and Reliability Trade-offs, P. Raghavan, M. Kandemir, M. J. Irwin and K. Malkowski, Next Generation Software Workshop, Proceedings of 22nd IEEE/ACM International Parallel and Distributed Processing Symposium, IPDPS-2008, pp. 1–6, April 2008.
- Towards Energy Efficient Scaling of Scientific Codes, Y. Ding, K. Malkowski, P. Raghavan and M. Kandemir, High-Performance, Power-Aware Computing Workshop. Proceedings of 22nd IEEE/ACM International Parallel and Distributed Processing Symposium, IPDPS-2008, pp. 1–8, April 2008.

2007

- Adapting Application Execution to Reduced CPU Availability, Y. Ding, M. Kandemir, P. Raghavan and M. J. Irwin, INTERACT at IEEE 13th International Symposium on High Performance Computer Architecture, HPCA-INTERACT’07, pp. 24–31, February 2007.
- Analysis of the IPv4 Address Space Delegation Structure, A. Sriraman, K. Butler, P. McDaniel and P. Raghavan, pp. 501–508, IEEE Symposium on Computers and Communications (ISCC’07), DOI: 10.1109/ISCC.2007.4381538, July 2007.
- Scientific Algorithms: Performance, Power, Thermal Properties on Modern Computing Architectures, I. Lee, P. Raghavan, Workshop on Unique Chips and Systems (UCAS-3), pp. 1–8, April 2007.
- Load Miss Prediction for Energy-Aware High Performance Computing, K. Malkowski, G. Link, P. Raghavan and M. J. Irwin, 21st IEEE/ACM International Parallel and Distributed Processing Symposium, IPDPS-2007, High-Performance, Power-Aware Computing Workshop, pp. 1–8, DOI: 10.1109/IPDPS.2007.370536, March 2007.
- Memory Optimizations For Fast Power-Aware Sparse Computations, K. Malkowski, P. Raghavan and M.J. Irwin, Proceedings of the 21st IEEE/ACM International Parallel and Distributed Processing Symposium, IPDPS-2007, Next Generation Software Workshop, pp. 1–8, DOI: 10.1109/IPDPS.2007.370501, March 2007.
- Reducing energy consumption of parallel sparse matrix applications through integrated link/CPU voltage scaling, S.W. Son, K. Malkowski, G. Chen, M. T. Kandemir, P. Raghavan, The Journal of Supercomputing, DOI: 10.1007/s11227-007-0113-9, Vol. 41, No. 3, pp. 179–213, September 2007.
- Phase-Aware Adaptive Hardware Selection for Power-Efficient Scientific Computing, K. Malkowski, P. Raghavan, M. Kandemir and M. J. Irwin, ISLPED, pp. 403–406, DOI: 10.1145/1283780.1283869, August 2007.
- Modeling of Link Shutdown Opportunities During Collective Communication Primitives in 3-D Torus Nets, S. Conner, S. Akioka, G. M. Link, M. J. Irwin and P. Raghavan, 21st IEEE International Parallel and Distributed Processing Symposium, IPDPS-2007, High-Performance, Power-Aware Computing Workshop, pp. 1–8, March 2007.

2006

- On Improving Performance and Energy Profiles of Sparse Scientific Applications, K. Malkowski, I. Lee, P. Raghavan and M. Irwin, Proceedings of the 20th IEEE/ACM International Parallel and Distributed Processing Symposium, IPDPS’06, Next Generation Software Workshop, pp. 1–8, DOI: 10.1109/IPDPS.2006.1639589, April 2006.
- Conjugate Gradient Sparse Iterative Solvers: Performance-Power Characteristics, K. Malkowski, I. Lee, P. Raghavan and M. Irwin, Proceedings of the 20th IEEE/ACM International Parallel and Distributed Processing Symposium, Second High-Performance, Power-Aware Computing Workshop, pp. 1–8, DOI: 10.1109/IPDPS.2006.1639595, April 2006.
- Integrated Link/CPU Voltage Scaling for Reducing Energy Consumption of Parallel Sparse Matrix Applications, S. W. Son, K. Malkowski, G. Chen, M. T. Kandemir and P. Raghavan, Proceedings of the 20th IEEE/ACM International Parallel and Distributed Processing Symposium, IPDPS’06, Second High-Performance, Power-Aware Computing Workshop, pp. 1–8, DOI: 10.1109/IPDPS.2006.1639596, April 2006.
- Characterizing the Performance and Energy Attributes of Scientific Simulations, S. Akioka, K. Malkowski, P. Raghavan, M. J. Irwin, L. C. McInnes and B. Norris, Lecture Notes in Computer Science, Volume 399/2006, pp. 242–249, January 2006.

2005

- Adaptive Software for Scientific Computing: Co-Managing Quality-Performance-Power Tradeoffs, P. Raghavan, M. J. Irwin, L. C. McInnes and B. Norris, Proceedings of the Next Generation Software Workshop at the 19th IEEE/ACM International Parallel and Distributed Processing Symposium, IPDPS-05 Vol. 11, No. 11, p. 220b, 19th, 2005.
- Reducing Power with Performance Constraints for Parallel Sparse Applications, G. Chen, K. Malkowski, M. Kandemir and P. Raghavan, Proceedings of the High-Performance, Power-Aware Computing Workshop at the 19th IEEE/ACM International Parallel and Distributed Processing Symposium, IPDPS-05 Vol. 12, No. 12, p. 231a, 19th, 2005.