Abstract: Performance profiling tools are crucial for HPC specialists to identify performance bottlenecks in parallel codes at various levels of granularity (i.e., across nodes, ranks, and threads).