-
SC25 Watchlist
Personal watchlist of SC25 sessions and workshops, grouped by their official tracks with quick notes on why each paper matters.
-
Accel-Sim and the Evolution of GPU Performance Modeling
The Accel-Sim framework and the research built on it: from the original extensible simulation framework to modern core modeling, power analysis, and applications across GPU architecture research.
-
Performance Engineering Toolkit: A Complete Guide to Analysis Methods and Tools
A guide to performance analysis methods and tools across the entire lifecycle: prediction, monitoring, and profiling for HPC and LLM systems.
-
LLM-Based Kernel Generation: From Manual Optimization to Automated Code Synthesis
Automated GPU kernel generation using large language models: from benchmarks and evaluation frameworks to agentic systems and compiler infrastructure.
-
Extra-P and Score-P: Automated Performance Modeling for HPC
Extra-P's empirical performance modeling and Score-P's measurement infrastructure: from automated scalability bug detection to noise-resilient modeling for exascale systems.
-
New Architectures, New Opportunities
Exploring emerging AI accelerator architectures: AWS Trainium's distributed training capabilities and Cerebras WSE's wafer-scale computing approach for large language models and HPC workloads.
-
PC Sampling in CPU Systems: A Comprehensive Survey
Tracing the evolution of PC sampling from early profiling techniques in the 1980s to modern continuous profiling systems: hardware innovations, compiler optimizations, and large-scale deployments.
-
Calling Context Trees: Concepts, Challenges, and Tools
A structured guide to CCT research: foundational concepts, scalability solutions (encoding and approximation), and modern visualization and analysis tools.