This is an archive of the discontinued LLVM Phabricator instance.

[IndVars] Don't forget value when inferring nowrap flags
ClosedPublic

Authored by nikic on May 31 2021, 12:27 PM.

Details

Summary

When SimplifyIndVars infers IR nowrap flags from SCEV, it forgets the SCEV value. This doesn't make a lot of sense, as the nowrap fact is derived from SCEV in the first place, so there shouldn't be any need to invalidate to make use of the nowrap information (as it is information that SCEV already has...)

I'm sure this is not strictly NFC in some edge-cases because SCEV sometimes infers nowrap flags lazily, but at least on test-suite this change showed no differences.

This drops compile-time for n=128 from https://bugs.llvm.org/show_bug.cgi?id=50384 from 1.0s to 0.4s for me.

Diff Detail

Event Timeline

nikic created this revision.May 31 2021, 12:27 PM
nikic requested review of this revision.May 31 2021, 12:27 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 31 2021, 12:27 PM

Thank you for looking into it. This makes sense to me.
strengthenRightShift() does something similar, without forgetting from SCEV.

I do suspect this is not NFC, because ScalarEvolution::getNoWrapFlagsFromUB(),
called by ScalarEvolution::createSCEV(), would pick up the newly-proved flags,
but since we didn't forget the SCEV, it naturally won't.

The fix is to promote SimplifyIndvar::strengthenOverflowingOperation() into SCEV,
and call it in ScalarEvolution::getNoWrapFlagsFromUB() first.
(likely, changing it to return the deduced flags, not actually changing them).
I would expect that SCEV dump test would be able to demonstrate the change.

Would it make sense to deal with that first?

llvm/lib/Transforms/Utils/SimplifyIndVar.cpp
775

Looks like this was added in rL231306 / D7981.

Did you try a lightweight (and absolutely NFCish) approach of dropping SCEV 1 time instead of 2? Like

if (!BO->hasNoUnsignedWrap() &&
    willNotOverflow(SE, BO->getOpcode(), /* Signed */ false, LHS, RHS)) {
  BO->setHasNoUnsignedWrap();
  Changed = true;
}

if (!BO->hasNoSignedWrap() &&
    willNotOverflow(SE, BO->getOpcode(), /* Signed */ true, LHS, RHS)) {
  BO->setHasNoSignedWrap();
  Changed = true;
}
if (Changed)
  SE->forgetValue(BO);

? Curious if it improves the situation.

As for this patch, I don't have a strong opinion. To me it looks that dropping SCEV here can be useful to infer nsw/nuw for some AddRec involving BO, and, consequently, prove iteration count of some loop that depends on it. Not sure how often does it happen and how useful it is, though.

lebedev.ri accepted this revision.Jun 1 2021, 2:03 AM
This comment was removed by lebedev.ri.
This revision is now accepted and ready to land.Jun 1 2021, 2:03 AM

I think i can take a look into doing this inferrence early, but thinking about it,
looks like it would be best to do that after this patch.

I do think this is *not* NFC, but i do think we can solve this more generically,
and even if we can't, the current situation does not seem reasonable.

This reduces runtime of n=512 case from https://bugs.llvm.org/show_bug.cgi?id=50384
from ~7.1s to 0.8s, i.e. an order of magnitude improvement.
I'm not sure if this actually resolves quadratic runtime problem,
it actually might, but it at least *severely* reduces it's impact.

So LGTM.

Are you planning on landing this?

reames accepted this revision.Jun 3 2021, 9:52 AM

The fact this leaves SCEV in a imprecise state is not ideal. From a practical engineering perspective, we may need to accept it. The compile time impact here is rather compelling. At a minimum, please replace the dropped lines with a comment explaining that we are intentionally leaving SCEV imprecise. Alternatively, you could use the existing APIs to mutate the existing SCEV, but that path tends to be fragile, so I'm not really sure it's worth it.

This fits into a long standing thought I have that we should restructure SCEV as an optimizeable IR with support for replaceAllUsesWith type operations. If we did, we'd get around the need to invalidate to refine analysis here. That's a huge project, and probably worthy of a lot of discussion outside of a random review thread though. :)

LGTM w/comment added.

@nikic are you at liberty to state your plans regarding this patch?
The issue referenced is somewhat of a blocker/deal-breaker for me.

nikic added a comment.Jun 4 2021, 9:11 AM

@reames Just to clarify one point here, because I think there might be confusion about it. There are two possible situations:

  1. SCEV already has nowrap flags, and we're just using them to prove that the IR-level operation doesn't wrap. There's definitely no need to invalidate anything in this case.
  2. SCEV doesn't have nowrap flags yet, but they are inferred during the zero/sign extend operation, which tries harder to infer nowrap flags on addrecs.

In situation 2. the code already adds the nowrap flags to the addrec itself and recomputes its range. However, there may be further SCEV expressions based on that addrec, which will not get recomputed with the new nowrap information. So, the implementation already mutates the addrec to add the flags, the invalidation is about other expressions that might have been based on it. This doesn't make a difference on test-suite, but at least theoretically it could make a difference.

With that in mind, I'm not sure I follow your suggestion regarding an optimizable IR and RAUW operations. The concept doesn't seem to fit SCEV, in that {1,+,1} might have been "based on" {0,+,1} and might benefit from re-computation if additional information about {0,+,1} became known, but I don't think we would consider {1,+,1} a "user" of {0,+,1}.

I believe the actual problem here is rather that we only infer some of the nowrap information lazily -- if we didn't, then we could say with certainty that no invalidation is necessary here. However, I doubt computing it eagerly will work, there's probably considerations relating to either compile-time or SCEV construction order involved there.

This revision was landed with ongoing or failed builds.Jun 4 2021, 11:57 AM
This revision was automatically updated to reflect the committed changes.
reames added a comment.Jun 4 2021, 2:21 PM

I went ahead and wrote up my thoughts on the optimizeable IR notion I mentioned upthread. You can read them here: https://github.com/preames/public-notes/blob/master/llvm-loop-opt-ideas.rst#should-scev-be-an-optimizeable-ir

After writing that up, this seems more workable near term than I'd been thinking it was. Might be worth bouncing around a bit actively.