Page MenuHomePhabricator

[LSR] Unify scheduling of existing and inserted addrecs
ClosedPublic

Authored by reames on Feb 22 2021, 12:45 PM.

Details

Summary

LSR goes to some lengths to schedule IV increments such that %iv and %iv.next never need to overlap. This is fairly fundamental to LSRs cost model. LSR assumes that an addrec can be represented with a single register. If %iv and %iv.next have to overlap, then that assumption does not hold.

The bug - which this patch is fixing - is that LSR only does this scheduling for IVs which it inserts, but it's cost model assumes the same for existing IVs that it reuses. It will rewrite existing IV users such that the no-overlap property holds, but will not actually reschedule said IV increment.

As you can see from the relatively lack of test updates, this doesn't actually impact codegen much. The main reason for doing it is to make a follow up patch series which improves post-increment use and scheduling easier to follow.

Diff Detail

Event Timeline

reames created this revision.Feb 22 2021, 12:45 PM
reames requested review of this revision.Feb 22 2021, 12:45 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 22 2021, 12:45 PM
mkazantsev added inline comments.Feb 23 2021, 9:35 PM
llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp
5578

nit: let's avoid || by splitting this into two checks.

5583

I'm pretty sure that at this point having constant as 2nd operand is the canonical form. Do we really care about add 2, phi case? If not, I'd propose match (m_Add(m_Specific(PN), m_ConstantInt(Step))) (and same for sub - if we care, because subs are also canonicalized to adds).

5597

I think DT.dominates(IncV, IVIncInsertPos) is not what we should be checking to enforce legality. It's totally legal to move IncV before IVIncInsertPos if DT.dominates(IVIncInsertPos, IncV). It's maybe not what we want performance-wise, but it's legal.

I think the real legality check needed here is DT.dominates(IVIncInsertPos, Latch).

llvm/test/Transforms/LoopStrengthReduce/post-increment-insertion.ll
141

Please commit these tests separately so the diff from this patch is obvious.

reames updated this revision to Diff 326185.Feb 24 2021, 12:56 PM

address review comments

reames added inline comments.Feb 24 2021, 12:59 PM
llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp
5597

(previous response comment got lost?)

It turns out the whole check is unneeded. What I actually needed to check was that all of the operands dominate the insert location, but that's already implied from the other checks. We know the phi must dominate everything in the loop, and constants trivially dominate.

mkazantsev accepted this revision.Feb 28 2021, 9:44 PM

LGTM for now. I'm currently factoring out functions like isIVStep, getIVStep for other purposes. We can reuse it here once all pieces are pulled together.

llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp
5581

I would still advice using matchers here.

This revision is now accepted and ready to land.Feb 28 2021, 9:44 PM
mkazantsev added inline comments.Feb 28 2021, 9:50 PM
llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp
5574

Because of simplifications, isa<SCEVAddRecExpr>(SE.getSCEV(&PN)) is potentially more expensive than everything you are going to make here. If it's not a legality check, I think it should be removed.

This revision was landed with ongoing or failed builds.Mar 3 2021, 12:08 PM
This revision was automatically updated to reflect the committed changes.