A picture of me, possibly qualifying as recent.

John (Jack) Sampson

Assistant Professor

Contact information:
Email - [lastname]@cse.psu.edu
Office - W324 Westgate Building
    University Park, PA 16802
Phone - Ext. 57496
CSE Department, School of EECS, College of Engineering, PSU


My primary research interests lie broadly in the area of computer architecture, with energy-efficiency of one form or another being the thread that has tied together much of my work since the mid-2000s. Pulling on that thread has led me into research on energy-constrained systems at both the smallest (energy-harvesting motes) and largest (data center energy-storage) ends of the computing power scale. Likewise, the end of Dennard scaling and the slowdown of dimensional (Moore's Law) scaling have motivated my investigations into alternatives to CMOS logic and traditional (SRAM/DRAM) memory elements from the perspective of how emerging technologies will reshape our assumptions about good architectural design approaches and enable new computing paradigms (e.g. nonvolatile processors) and functionalities (e.g. oscillator-based associative computing, processing-in-memory, multi-dimensional memory access models).

As part of the multi-PI Microsystems Design Lab (MDL) in the School of EECS, I have ongoing research collaborations with members of the EE, ESM, and CSE departments, and I co-advise several students with fellow MDL faculty to develop cross-layer expertise. If you are interested in joining my lab or working on any of the active projects as an undergraduate here at Penn State, please send me email, or drop by my office, respectively, and indicate which of the ongoing tasks you are most interested in.

Ongoing Projects


Nonvolatile Processors

Nonvolatile Processor (NVP) Architectures (2014-)
    The ability to seamlessly integrate nonvolatile memory elements on-chip, and in a distributed fashion alongside and within datapaths, is transformative. We can now construct processors that can back up all of their architectural state in a matter of a few cycles. This enables us to push computation into previously adverse environments (e.g. wearables, sensor networks, the IoT) wherein the combinations of power reliability and battery longevity/form-factor formerly constrained deployment, and to pursue efficiency-oriented power-gating optimizations on highly stateful, as well as stateless, components. We are actively exploring the right combinations of design, policy, and architecture for this new processing paradigm, and validating our research findings with prototype NVP platforms produced by our collaborators.
NVP Systems and Applications (2016-)
    The applications served by NVPs and systems composed of NVPs each have their own, unique characteristics relative to their volatile counterparts. There are natural similarities between the unreliability of NVP power in a batteryless scenario and unreliable quality of output in approximate computing, and we are investigating how best to exploit synergies among them. At the system scale, NVPs challenge previously held rules of thumb about the cost models and reliability of execution on energy-harvesting nodes, and can warrant re-partitioning of computations. Similarly, techniques developed for NVPs can be applied beyond the processor to I/O, accelerators, and other system components. At the application level, we are investigating the semantics of "forward progress" from the perspective of maximizing hardware-supported transparency of execution interruptions while still enabling a common HW-SW notion of whether the resumed work is or isn't actually useful to continue.

Emerging and Efficient Memories

Emerging Memory Technologies and Applications Exploiting Their Novel Features (2015-)
    The current era is a boom time for emerging memory technologies. In addition to enhancements to traditional density, power, and latency metrics for on- and off-chip memory technologies, many of these new technologies offer either new opportunities for functionality or radically altered costs for implementing previously considered memory-logic functionality integration. These potential functionality improvements range from physical access symmetry (column as well as row access naturally supported within a mat) to the use of new process technologies to add monolithically stacked additional transistors without impacting memory footprints, or the ability to efficiently embed nonvolatility at flip-flop granularity within a pipeline.
Architecture Support for darK Silicon in the "Uncore" (2014-)
    Successfully navigating the post-Dennardian landscape of computer engineering requires drawing on diverse sets of expertise ranging from low-power circuit design to the ability to integrate and exploit emerging technologies and techniques into both traditional and less traditional multiprocessor designs. Our work aims to tackle the following challenges:
    1) Explore the relationship between the design of uncore components and continuing dark/dim silicon trends
    2) Co-manage the bandwidth opportunities and thermal challenges introduced by 3D stacking
    3) Exploit power and energy efficiency opportunities provided by non-volatile storage while maintaining traditional semantics
    4) Perform cross-layer optimizations in the face of dark silicon challenges that span from within a given processor's datapaths to memory and storage sub-systems shared among multiple processors.

Design Implications of Alternatives to CMOS Technology

Efficiency Improvements and Design Space Enhancements (2013-)
    While no technology has yet arisen as the clear successor to CMOS for general adoption, many emerging technologies do offer clear benefits within well-specified niches. Steep slope devices offer the promise of resumed voltage scaling, post-bulk (2D) materials offer the promise of continued dimensional scaling, and novel device structures, such as ferroelectrics in the gate stack, offer the promise of single-device scale integration of memory and logic.
Exploiting Emerging Device Features for Novel Processing Opportunities (2014-)
    Emerging devices, such as nano-oscillators, may offer their benefits through changing the way that computations are performed, rather than performing traditional computations in a superior fashion. In some cases, the physics of the device can perform a computation directly.

Cognitive Embedded Visual Architectures

Energy-efficient Computer Vision Processing (2014-)
    Even with the enormous advances that have recently been demonstrated in computer vision processing, real-time high definition processing remains a highly computationally, and thus power, intensive task. The need for efficiency is especially key in embedded systems, such as wearable visual assist or mobile vision applications.
Cognitive Architectures for Embedded and Resource-Constrained Systems (2014-)
    In addition to the focus on inference accuracy, there are many other facets of cognitive architectures for which there remain open questions. These include training efficiency and human-effort scalability, appropriate resource-accuracy tradeoffs in low-power and distributed cognitive systems, human interface concerns, multi-modal processing, interconnect modeling, and improved understanding of "neuromorphic"/bio-inspired vs. linear algebraic approaches to both learning and inference.


Current Graduate Students

Primary work areas

Saambhavi Baskaran - Post-CMOS, Compute-enhanced memories
Zhixuan Huan - Efficient on-chip memories
Minli (Julie) Liao - Pattern/Locality-enhanced memories, NVPs
Tulika Parija [Co-advised: C. Das] - ANN inference efficiency, ANN interconnect
Philip Shin - Non-Boolean computing
Balachandran Swaminathan [Co-advised: C. Das] - Inference Architectures
Peter Zientara [Co-advised: V. Narayanan] - Distributed Cognitive Systems


Went on to...

Kaisheng Ma, PhD, 2018. [Co-advised: V. Narayanan] Assistant Professor, Tsinghua University
Wei-Yu (William) Tsai, PhD, 2017. [Co-advised: V. Narayanan] Intel
Nandhini Chandramoorthy, PhD, 2016. [Co-advised: V. Narayanan] IBM TJ Watson
Siddharth Advani, PhD, 2016. [Co-advised: V. Narayanan] Samsung Research America, Dallas

Matthew Poremba, PhD, 2015. [As co-chair at PSU, primary advisor: Yuan Xie at UCSB] AMD

Ivan Stalev, MS, 2015. Lyft

Research Sponsors

  • NSF/SRC E2CDA: 2D Electrostrictive FETs for Ultra-Low Power Circuits and Architectures. 2016-2019
  • Intel: A Configurable Vision Platform for Cognitive Image Analytics. 2014-2017
  • NSF: CCF: SHF: Medium: ASKS - Architecture Support for darK Silicon. 2014-2018

Completed, Inactive, or External Projects


These projects are not currently active at Penn State. While the below listings are hopefully of some use for understanding my research history and associated tastes, no lab resources are currently being allocated to these projects.

GreenDroid / Conservation Cores

Conservation Cores (2007-2014)
    The C-Cores project developed an automated C-to-silicon infrastructure for generating specialized hardware (Conservation Cores) to improve the energy efficiency of mature codebases and for integration of these Conservation Cores into a multiprocessor platform. C-Cores were patchable at basic block granularity, and offload was highly transparent to software. A core tenet of the C-Cores approach to dark silicon was that proposed solutions should scale alongside the growth of dark silicon. In the case of C-Cores, this scaling would be in the form of functions converted to hardware.
GreenDroid (2008-2014)
    Work continued on a prototype mobile application processor called GreenDroid that leveraged dark silicon to dramatically reduce energy consumption compared to contemporary smart phones. GreenDroid incorporated many specialized processors (Conservation Cores) targeting key portions of Google's Android smart phone platform to reduce their energy consumption.

Transactions, Interference, and Parallel Architecture Management

Introspectively Tracking Program Execution Progress and Interference (2011-2013)
    Both hardware-assisted and pure software (performance-counter driven) approaches and mechanisms were investigated for accurately estimating what the interference-free performance of a dynamically executing program would have been. Key applications for the techniques include augmenting job scheduling and metering in IaaS environments.
Hardware Support for Transactional Memory (2005-2007)
    This line of inquiry focused on modifying existing virtual memory support mechanisms to provide the necessary features for transactional semantics.
Multicore Synchronization Primitives (Filter Barriers) (2005-2006)
    This work focused on utilizing existing stalling mechanisms within the physically shared on-chip memory system of a CMP to implement collective synchronization primitives. Both denial of instruction fetch and denial of data fetch were examined as the basis for barrier implementation, and fine-grained barriers were shown to be plausible for implementing core=lane vectorization on a CMP.

Program and Workload Analysis

Program Phase Analysis (2004-2006)
    Work in this area included extensions of SimPoint to x86, effective methods for mapping program counter statistics to phase information, and suite-level phase sampling.


Data Center Power Management (2011-2012)
    This work explored power management policies governing the use of stored energy to constrain peak power consumption in data centers and plausible strategies for deploying the necessary distributed energy storage.
Vector Acceleration for Energy-Efficiency (2005-2006)
    This work examined medium-length vector architectures as a means of efficiently accelerating power-constrained designs.