Details: https://reviews.llvm.org/D96805 changed GCNTTIImpl::getCFInstrCost to return 1 for PHI nodes under TTI::TCK_CodeSize and TTI::TCK_SizeAndLatency. This is incorrect because the value moves that result from PHI lowering are inserted into the predecessors of the basic block, not into the block itself. As a result, the loop header and loop body size/cost estimates were wrong, which broke LoopRotate and LoopUnroll.
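To illustrate the effect on size estimation, here is a minimal, self-contained sketch. It does not use the LLVM API: `Opcode`, `CostKind`, `toyInstrCost`, `toyBlockSize`, and `toyLoopHeader` are hypothetical stand-ins invented for this example, and the flat cost of 1 for every non-PHI instruction is an assumption, not the AMDGPU cost model.

```cpp
#include <vector>

// Hypothetical stand-ins for LLVM opcodes and TTI cost kinds (not the real types).
enum class Opcode { PHI, ICmp, Br };
enum class CostKind { CodeSize, SizeAndLatency };

// Toy cost hook. With ChargePhi == true it mimics the behaviour described
// above (a PHI costs 1 for the size-based cost kinds); with ChargePhi == false
// a PHI is treated as free in its own block, since the copies produced by PHI
// lowering land in the predecessor blocks.
int toyInstrCost(Opcode Op, CostKind Kind, bool ChargePhi) {
  (void)Kind; // both modelled cost kinds are size-based in this sketch
  if (Op == Opcode::PHI)
    return ChargePhi ? 1 : 0;
  return 1; // assumption: every other instruction costs exactly 1
}

// Sums the per-instruction costs, the way a size-based heuristic might
// estimate a loop header before comparing it against a threshold.
int toyBlockSize(const std::vector<Opcode> &Block, CostKind Kind,
                 bool ChargePhi) {
  int Size = 0;
  for (Opcode Op : Block)
    Size += toyInstrCost(Op, Kind, ChargePhi);
  return Size;
}

// A made-up loop header: two PHIs, a compare, and a branch.
std::vector<Opcode> toyLoopHeader() {
  return {Opcode::PHI, Opcode::PHI, Opcode::ICmp, Opcode::Br};
}
```

In this toy model, charging the PHIs doubles the estimated header size (4 instead of 2), which is the kind of inflation that can push a header over a size threshold in a pass that relies on these estimates.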
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/test/Analysis/CostModel/AMDGPU/control-flow.ll | ||
---|---|---|
8 | Please update the sizes reported after your change instead of just removing the test lines here and below. |
llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp | ||
---|---|---|
841 | Nit: leave this or a similar TODO comment somewhere; it wasn't done. |
llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp | ||
---|---|---|
841 | Such a prediction is unlikely to be possible. The number of copies that survive register coalescing depends too heavily on the passes that run in between. |