Controlled Kernel Launch for Dynamic Parallelism in GPUs.
Phoenix: A Constraint-aware Scheduler for Heterogeneous Datacenters.
Co-training of Feature Extraction and Classification using Partitioned Convolutional Neural Networks.
OSCAR: Orchestrating STT-RAM Cache Traffic for Heterogeneous CPU-GPU Architectures.
Scheduling Techniques for GPU Architectures with Processing-In-Memory Capabilities.
µC States: Fine-grained GPU Datapath Power Management.
Boosting Access Parallelism to PCM-Based Main Memory.
Exploiting Core Criticality for Enhanced GPU Performance.
Re-NUCA: A Practical NUCA Architecture for ReRAM Based Last-Level Caches.
Exploiting Staleness for Approximating Loads on CMPs.
Exploiting Inter-Warp Heterogeneity to Improve GPGPU Performance.