To further reduce register pressure (RP) and increase occupancy in the
PreRARematerialize stage, widen the criteria for sinking trivially
rematerializable defs to include defs with multiple uses. A copy of the def is
sunk directly to each use when doing so would improve occupancy.
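The idea can be sketched with a toy model. This is not the patch's code and it uses none of LLVM's classes: `ToyValue`, `peakPressure`, `toyOccupancy`, `sinkRematDefs`, and the register budget and wave limit are all hypothetical names and numbers chosen for illustration. The sketch also assumes an all-or-nothing decision, where every candidate is sunk tentatively and the change is kept only if the modeled occupancy goes up.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy stand-in for a virtual register; not an LLVM data structure.
struct ToyValue {
  int Def;                // index of the defining instruction
  std::vector<int> Uses;  // distinct indices of reading instructions, all > Def
  bool TriviallyRemat;    // cheap to recompute at the point of use
  bool Sunk;              // true once a copy sits directly before each use
};

// Peak number of simultaneously live values over instructions [0, NumInstrs).
int peakPressure(const std::vector<ToyValue> &Values, int NumInstrs) {
  std::vector<int> Live(NumInstrs, 0);
  for (const ToyValue &V : Values) {
    if (V.Uses.empty())
      continue;
    for (int U : V.Uses)
      assert(U > V.Def && U < NumInstrs && "use must follow def in region");
    if (V.Sunk) {
      // Each sunk copy is live only at the instruction that reads it.
      for (int U : V.Uses)
        ++Live[U];
    } else {
      // An unsunk value stays live from just after its def to its last use.
      int Last = *std::max_element(V.Uses.begin(), V.Uses.end());
      for (int P = V.Def + 1; P <= Last; ++P)
        ++Live[P];
    }
  }
  return Live.empty() ? 0 : *std::max_element(Live.begin(), Live.end());
}

// Hypothetical occupancy model: a made-up register budget divided by the
// peak pressure, capped at a made-up wave limit. Real targets differ.
int toyOccupancy(int Peak) {
  const int RegBudget = 6;
  const int MaxWaves = 4;
  return std::min(MaxWaves, RegBudget / std::max(Peak, 1));
}

// Tentatively sink every trivially rematerializable def, including defs with
// several uses, and keep the result only if the modeled occupancy improves.
bool sinkRematDefs(std::vector<ToyValue> &Values, int NumInstrs) {
  int Before = toyOccupancy(peakPressure(Values, NumInstrs));
  std::vector<ToyValue> Trial = Values;
  for (ToyValue &V : Trial)
    if (V.TriviallyRemat && !V.Uses.empty())
      V.Sunk = true;
  if (toyOccupancy(peakPressure(Trial, NumInstrs)) <= Before)
    return false; // no occupancy gain, leave the defs where they are
  Values = Trial;
  return true;
}
```

In this model a multi-use def that stays hoisted occupies a register across the whole span between its def and its last use, while the sunk copies occupy one only at each reading instruction. A single-use-only criterion would skip such defs entirely, which is the limitation the summary describes.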
Repository: rG LLVM Github Monorepo
Event Timeline
This fixes the regression in SWDEV-316487. I agree that this is making the scheduler too complex. We really need a way to calculate register pressure before hoisting trivially rematerializable defs in MachineLICM, or to make this its own pass.
It is still an issue. We are not able to collect enough trivially rematerializable defs when we only consider single def/single use instructions. Multiple defs are hoisted and then eliminated as redundant, which increases the use count of the defs that remain. In another case, MachineLICM hoisted parts of a reg sequence, and we are unable to sink them back down due to their being part of a subreg. This increases overall register pressure throughout the loop and decreases occupancy.
Looking to see if we can fix this issue in register allocation, as suggested by @kerbowa.