- User Since
- Apr 4 2014, 4:14 AM (288 w, 6 d)
Still missing foreach around dpp.
In addition, it may be unrelated to the threshold altogether. It may be a flaw in the cost model for specific instructions. Please also see D68881, which started to address cost model issues.
I disagree with the idea of having different thresholds based on the runtime. The runtime has nothing to do with it. For example, compute can run on top of either ROCm or PAL. Can you justify different results for the same program?
Wed, Oct 16
Tue, Oct 15
Added VOPC. This has erased 175 instructions in total.
Mon, Oct 14
Full accounting for undefs.
Removed dpp subreg handling: a subreg cannot be defined in SSA.
Fri, Oct 11
Removed special handling of gfx10, since it uses the same pseudo now.
How extensive was the performance testing?
Thu, Oct 10
GCNDPPCombiner can split the new pseudo and then handle the split.
Post-RA split is needed anyway since combining is an optimization.
Tests are updated to handle the case without optimization.
Given the numbers, I tend to agree with the change.
Wed, Oct 9
Tue, Oct 8
Mon, Oct 7
Thu, Oct 3