This change adds the s_incperflevel and s_decperflevel intrinsics to the AMDGPU backend. Together with the SQ_PERF_SEL_USER_LEVEL and SQ_PERF_SEL_ACCUM_PREV counters, these instructions make it possible to measure the average time spent in a section of kernel code.
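A minimal sketch of how a kernel section might be bracketed with the two intrinsics. It assumes they are exposed as `llvm.amdgcn.s.incperflevel` and `llvm.amdgcn.s.decperflevel`, each taking one immediate `i32` operand. The kernel name `@timed_section`, the stored value, and the operand value 0 are hypothetical and chosen only for illustration. Configuring and reading the SQ counters happens outside the kernel and is not shown.

```llvm
; Assumed intrinsic names and signatures (one immediate i32 operand each).
declare void @llvm.amdgcn.s.incperflevel(i32)
declare void @llvm.amdgcn.s.decperflevel(i32)

; Hypothetical kernel: the store stands in for the code section being measured.
define amdgpu_kernel void @timed_section(ptr addrspace(1) %out) {
entry:
  ; Raise the user perf level on entry to the measured section.
  call void @llvm.amdgcn.s.incperflevel(i32 0)
  store volatile i32 1, ptr addrspace(1) %out
  ; Lower the user perf level again on exit from the section.
  call void @llvm.amdgcn.s.decperflevel(i32 0)
  ret void
}
```

Each increment is paired with a decrement on the same path, so the level returns to its initial value once the section ends.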
Details
Diff Detail
- Repository
- rL LLVM