The Performance Spectrum 🌈

Exploring the full spectrum of performance engineering: from predictive modeling to empirical measurement and analysis

SC25 Watchlist

A personal watchlist of SC25 sessions and workshops, grouped by official track, with quick notes on why each paper matters.

8 min read · November 16, 2025

2025 · sc25 conferences hpc

Accel-Sim and the Evolution of GPU Performance Modeling

The Accel-Sim framework and the research built upon it: from the original extensible simulation framework to modern core modeling, power analysis, and applications across GPU architecture research.

6 min read · October 28, 2025

2025 · gpu-simulation accel-sim performance-modeling gpgpu microarchitecture

Performance Engineering Toolkit: A Complete Guide to Analysis Methods and Tools

A comprehensive toolkit covering performance analysis methods and tools across the entire lifecycle: prediction, monitoring, and profiling for HPC and LLM systems.

33 min read · October 22, 2025

2025 · simulator profiling tracing gpu hpc

LLM-Based Kernel Generation: From Manual Optimization to Automated Code Synthesis

Automated GPU kernel generation using large language models: from benchmarks and evaluation frameworks to agentic systems and compiler infrastructure.

11 min read · October 20, 2025

2025 · llm gpu kernel-generation cuda triton compiler-optimization

Extra-P and Score-P: Automated Performance Modeling for HPC

Extra-P's empirical performance modeling and Score-P's measurement infrastructure: from automated scalability bug detection to noise-resilient modeling for exascale systems.

11 min read · October 12, 2025

2025 · performance-modeling hpc profiling scalability extra-p score-p

New Architectures, New Opportunities

Exploring emerging AI accelerator architectures: AWS Trainium's distributed training capabilities and Cerebras WSE's wafer-scale computing approach for large language models and HPC workloads.

3 min read · October 05, 2025

2025 · trainium cerebras wafer-scale-engine ml-accelerators distributed-training

PC Sampling in CPU Systems: A Comprehensive Survey

Tracing the evolution of PC sampling from early profiling techniques in the 1980s to modern continuous profiling systems: hardware innovations, compiler optimizations, and large-scale deployments.

12 min read · September 22, 2025

2025 · profiling pc-sampling performance-analysis cpu hardware-counters

Calling Context Trees: Concepts, Challenges, and Tools

A structured guide to CCT research: foundational concepts, scalability solutions (encoding and approximation), and modern visualization and analysis tools.

8 min read · September 15, 2025

2025 · profiling cct performance-analytics hpc tracing