Differential D20340
AMDGPU: Other sizes of popcnt are fast
Closed, Public
Authored by arsenm on May 17 2016, 3:10 PM.
Details
Summary
We can chain bcnt instructions together, so popcnt of any width is pretty fast.
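As a rough illustration of the chaining idea, here is a minimal C++ sketch. It assumes a bcnt-style operation that adds the bit count of a 32-bit source to an accumulator operand; `bcnt_u32` and `popcnt64` are hypothetical names for a software model, not code from this revision or from the backend.

```cpp
#include <bitset>
#include <cstdint>

// Hypothetical software model of a bcnt-style instruction (an assumption for
// illustration): count the set bits in `src` and add them to `acc`.
static uint32_t bcnt_u32(uint32_t src, uint32_t acc) {
    return static_cast<uint32_t>(std::bitset<32>(src).count()) + acc;
}

// 64-bit popcount built by chaining two 32-bit counts: the result of the
// first count is fed in as the accumulator of the second.
static uint32_t popcnt64(uint64_t x) {
    const uint32_t lo = static_cast<uint32_t>(x);
    const uint32_t hi = static_cast<uint32_t>(x >> 32);
    return bcnt_u32(hi, bcnt_u32(lo, 0));
}
```

Under this model, a wider popcnt costs one such operation per 32-bit chunk with no separate add, which is the sense in which other widths stay cheap.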
Event Timeline
arsenm updated this object.
This revision is now accepted and ready to land. May 18 2016, 6:25 AM
Revision Contents
Diff 57529
lib/Target/AMDGPU/AMDGPUTargetTransformInfo.h
test/CodeGen/AMDGPU/ctpop64.ll
test/Transforms/LoopIdiom/AMDGPU/popcnt.ll