This is an archive of the discontinued LLVM Phabricator instance.

Differential D114650

[SCEV] Construct SCEV iteratively.
ClosedPublic

Authored by fhahn on Nov 26 2021, 10:09 AM.

Details

Reviewers

nikic
reames
lebedev.ri
efriedma
mkazantsev

Commits

rG675080a4533b: [SCEV] Construct SCEV iteratively.

Summary

This patch updates SCEV construction to work iteratively instead of recursively
in most cases. It resolves stack overflow issues when trying to construct SCEVs
for certain inputs, e.g. PR45201.

The basic approach is to use a worklist to queue operands of V which
need to be created before V. To do so, the current patch adds a
getOperandsToCreate function which collects the operands SCEV
construction depends on for a given value. This is a slight duplication
with createSCEV.

At the moment, SCEVs for phis are still created recursively.
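
The worklist idea from the summary can be sketched on a toy expression graph. This is a minimal illustration and not the code from the patch: `Node`, `collectMissingOperands`, `createFromCachedOperands`, `evaluateIteratively` and `evaluateChain` are hypothetical names, and a node's "expression" is just the sum of its operands. It only shows how an explicit stack lets operands be created before their user without deep recursion.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Toy stand-in for an IR value: a leaf constant or the sum of its operands.
struct Node {
  long Constant = 0; // used only when Operands is empty
  std::vector<const Node *> Operands;
};

// Cache of finished results, playing the role of a value-to-expression map.
using Cache = std::unordered_map<const Node *, long>;

// Hypothetical helper: list the operands of N that have no result yet.
static void collectMissingOperands(const Node *N, const Cache &Done,
                                   std::vector<const Node *> &Ops) {
  for (const Node *Op : N->Operands)
    if (!Done.count(Op))
      Ops.push_back(Op);
}

// Compute N's result; every operand must already be in the cache.
static long createFromCachedOperands(const Node *N, const Cache &Done) {
  if (N->Operands.empty())
    return N->Constant;
  long Sum = 0;
  for (const Node *Op : N->Operands)
    Sum += Done.at(Op);
  return Sum;
}

// Iterative driver: an explicit stack replaces the call stack.
long evaluateIteratively(const Node *Root, Cache &Done) {
  // The bool records whether the node's operands were already queued.
  std::vector<std::pair<const Node *, bool>> Stack;
  Stack.emplace_back(Root, false);
  while (!Stack.empty()) {
    const Node *N = Stack.back().first;
    const bool OperandsQueued = Stack.back().second;
    Stack.pop_back();
    if (Done.count(N))
      continue; // finished via another path
    if (OperandsQueued) {
      Done[N] = createFromCachedOperands(N, Done);
      continue;
    }
    // Revisit N once its missing operands, pushed above it, are finished.
    Stack.emplace_back(N, true);
    std::vector<const Node *> Ops;
    collectMissingOperands(N, Done, Ops);
    for (const Node *Op : Ops)
      Stack.emplace_back(Op, false);
  }
  return Done.at(Root);
}

// Build a chain in which every link adds 1 to the previous link.
long evaluateChain(std::size_t Length) {
  std::vector<Node> Nodes(Length + 2); // sized up front so pointers stay valid
  Nodes[0].Constant = 0;               // base of the chain
  Nodes[1].Constant = 1;               // shared step operand
  for (std::size_t I = 0; I < Length; ++I)
    Nodes[I + 2].Operands = {I == 0 ? &Nodes[0] : &Nodes[I + 1], &Nodes[1]};
  Cache Done;
  return evaluateIteratively(&Nodes[Length == 0 ? 0 : Length + 1], Done);
}
```

In this sketch a chain of several hundred thousand links is processed with heap-allocated stack entries instead of call frames, which is the property the patch is after for deep instruction chains.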

Fixes #32078, #42594, #44546, #49293, #49599, #55333, #55511

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

fhahn created this revision. Nov 26 2021, 10:09 AM

Herald added subscribers: javed.absar, hiraditya. Nov 26 2021, 10:09 AM

fhahn requested review of this revision. Nov 26 2021, 10:09 AM

Herald added a project: Restricted Project. Nov 26 2021, 10:09 AM

Thanks for looking into this!
I previously looked into this: https://github.com/LebedevRI/llvm-project/commit/6b006aa21caf018aa0f280828899d510274c8444
... but as it is apparent, never posted the diff. Nonetheless, it may be interesting to look at.

The most interesting question is, do we really want to avoid (indirect self-) recursion (through getSCEV)?
Then the api of ScalarEvolution needs more changes than what the current diff contains,
basically everything that can end up creating new SCEV expressions from IR
needs to be able to return "in-progress, requery me later" status.

Harbormaster completed remote builds in B136262: Diff 390092. Nov 26 2021, 10:37 AM

To ask the obvious question: Can we get away with simply adding a getSCEV recursion limit instead? Is there reason to believe that a sensible recursion cutoff (similar to the many SCEV already has) would adversely affect analysis quality in practice? I'd rather avoid the additional complexity if we can.

In D114650#3157253, @nikic wrote:

To ask the obvious question: Can we get away with simply adding a getSCEV recursion limit instead? Is there reason to believe that a sensible recursion cutoff (similar to the many SCEV already has) would adversely affect analysis quality in practice? I'd rather avoid the additional complexity if we can.

Recursion itself is rarely the right solution in the first place, let alone working around it by introducing random cut-offs.

In D114650#3156273, @lebedev.ri wrote:

Thanks for looking into this!
I previously looked into this: https://github.com/LebedevRI/llvm-project/commit/6b006aa21caf018aa0f280828899d510274c8444
... but as it is apparent, never posted the diff. Nonetheless, it may be interesting to look at.

The most interesting question is, do we really want to avoid (indirect self-) recursion (through getSCEV)?
Then the api of ScalarEvolution needs more changes than what the current diff contains,
basically everything that can end up creating new SCEV expressions from IR
needs to be able to return "in-progress, requery me later" status.

Thanks for sharing the patch. I think the current patch doesn't really need to change any existing interfaces, it just introduces a new expectation: before creating a SCEV for I, SCEVs for all of I's operands must already exist. It's up to the callers of createSCEV (and friends) to ensure that. The only additional complexity is 1) extracting operands for which we need to create SCEVs and 2) processing them in the right order.

At the moment, 1) is a bit unfortunate, because it requires some duplication of createSCEV logic, but overall it's not too bad (if no other interfaces need modifying IMO). I think we may be able to improve/unify the logic there as well, but I think first it would be good to get agreement on whether we want to handle the issue by constructing SCEVs iteratively or not. (Getting this patch to work for all cases, cleaning it up and making sure it's as fast as possible will be quite a bit of additional work, which I'd only like to do once it's clear that this is the overall direction we want to pursue :))

In D114650#3157263, @lebedev.ri wrote:

In D114650#3157253, @nikic wrote:

To ask the obvious question: Can we get away with simply adding a getSCEV recursion limit instead? Is there reason to believe that a sensible recursion cutoff (similar to the many SCEV already has) would adversely affect analysis quality in practice? I'd rather avoid the additional complexity if we can.

Recursion itself is rarely the right solution in the first place, let alone working around it by introducing random cut-offs.

I think one issue with a cut-off for the recursion is that such a cut-off would make SCEV instantiations depend on the instantiation order. I am not sure how much this would be an issue in practice, but among other things it may make verification of SCEV harder (e.g. if we compute SCEVs from scratch to the cached version we would have to make sure to use the same evaluation order as when computing the original SCEV).

https://bugs.llvm.org/show_bug.cgi?id=32731 has a bit more discussion/context on the potential issues of a cut-off.
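
The order dependence mentioned above can be reproduced on a toy model. The sketch below is an assumption-laden illustration, not SCEV code: `ToyAnalysis`, `Expr`, `queryTopDown` and `queryBottomUp` are hypothetical, value `K` is defined as value `K-1` plus 1, and a result `{Base, Offset}` means "opaque value `Base` plus `Offset`". With a depth cut-off and a cache, the expression computed for the same value differs depending on which values were queried earlier.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Result "Base + Offset", where value number Base is treated as opaque.
using Expr = std::pair<int, int>;

struct ToyAnalysis {
  int Limit;                  // hypothetical recursion cut-off
  std::map<int, Expr> Cache;  // results computed so far

  explicit ToyAnalysis(int L) : Limit(L) {}

  // Value 0 is a leaf; value K (K > 0) is defined as value K-1 plus 1.
  Expr get(int V, int Depth = 0) {
    auto It = Cache.find(V);
    if (It != Cache.end())
      return It->second;
    // Leaf, or cut-off reached: give up and treat V itself as opaque.
    if (V == 0 || Depth > Limit)
      return {V, 0};
    Expr Op = get(V - 1, Depth + 1);
    Expr Result{Op.first, Op.second + 1};
    Cache[V] = Result;
    return Result;
  }
};

// Query value 4 on a fresh analysis: the cut-off is hit at value 1.
Expr queryTopDown() {
  ToyAnalysis A(2);
  return A.get(4);
}

// Query value 1 first, then value 4: the cached result avoids the cut-off.
Expr queryBottomUp() {
  ToyAnalysis A(2);
  A.get(1);
  return A.get(4);
}
```

Here `queryTopDown()` describes value 4 as "value 1 plus 3", while `queryBottomUp()` describes it as "value 0 plus 4". A verifier that recomputes expressions from scratch would have to replay the original query order to get matching results.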

I want to make sure I understand the problem to be solved. The description and discussion confused me a bit, so I may in fact be wildly off here.

Is the basic issue being solved that calling createSCEV on some IR instruction can require recursing through a very deep chain of instructions which have not yet had SCEVs formed for them? More specifically, is that the *only* deep recursion we're concerned by? The discussion about new invariants for creating a SCEV seemed to indicate it might be much wider, but that discussion didn't parse for me.

If my understanding is correct, then I would have expected this to be effectively a change strictly inside createSCEV. The fact we seem to be changing things much more broadly confuses me. Maybe there's some complexity coming from the phi handling, but if so, maybe a first patch which *only* iteratively constructs arithmetic chains and relies on recursion for the tricky phi parts?

Putting this here only because the bug database is still readonly.

Random thought: We're discussing this as a SCEV cornercase, and it's probably worth fixing the case this exposes, but it's also worth noting here that the unroller is producing a horrendously bad form here. It might be worth considering whether the unroller should be applying a peephole as it unrolls to avoid generating this massive and utterly useless IR.

Specifically for this case: if generating an unrolled IV increment with no other uses of the IV, eagerly fold the step increments if constant. That is, instead of producing:

%a = gep i8, i8 %p, i64 1
%b = gep i8, i8 %a, i64 1
%c = gep i8, i8 %b, i64 1
%d = gep i8, i8 %c, i64 1

produce:

%a = gep i8, i8 %p, i64 1
%b = gep i8, i8 %p, i64 2
%c = gep i8, i8 %p, i64 3
%d = gep i8, i8 %p, i64 4
(All of which but the last are dead.)

Doing this might be a worthwhile compile time optimization on its own.
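
The folding suggested above can be sketched on a toy representation. This is only an illustration of the idea: `ToyGEP`, `foldConstantSteps` and `foldExampleChain` are hypothetical names, not unroller code, and the sketch ignores the "no other uses of the IV" precondition as well as offset overflow.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Toy model of `Result = gep i8, Base, Offset` with a constant offset.
struct ToyGEP {
  std::string Result;
  std::string Base;
  std::int64_t Offset;
};

// Rewrite each GEP, in program order, to address the root pointer directly.
std::vector<ToyGEP> foldConstantSteps(const std::vector<ToyGEP> &Chain) {
  // Maps a GEP result to its already-folded (root base, total offset).
  std::unordered_map<std::string, std::pair<std::string, std::int64_t>> Folded;
  std::vector<ToyGEP> Out;
  Out.reserve(Chain.size());
  for (const ToyGEP &G : Chain) {
    std::string Root = G.Base;
    std::int64_t Total = G.Offset;
    auto It = Folded.find(G.Base);
    if (It != Folded.end()) {
      // The base is itself a folded GEP: reuse its root and add its offset.
      Root = It->second.first;
      Total += It->second.second;
    }
    Folded[G.Result] = {Root, Total};
    Out.push_back({G.Result, Root, Total});
  }
  return Out;
}

// The four-instruction chain from the comment above.
std::vector<ToyGEP> foldExampleChain() {
  std::vector<ToyGEP> Chain = {
      {"%a", "%p", 1}, {"%b", "%a", 1}, {"%c", "%b", 1}, {"%d", "%c", 1}};
  return foldConstantSteps(Chain);
}
```

After folding, every instruction in the toy chain uses `%p` as its base, so no instruction depends on a long chain of predecessors.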

In D114650#3159649, @reames wrote:

I want to make sure I understand the problem to be solved. The description and discussion confused me a bit, so I may in fact be wildly off here.

Is the basic issue being solved that calling createSCEV on some IR instruction can require recursing through a very deep chain of instructions which have not yet had SCEVs formed for them? More specifically, is that the *only* deep recursion we're concerned by? The discussion about new invariants for creating a SCEV seemed to indicate it might be much wider, but that discussion didn't parse for me.

Yes, this is the main issue I'd like to address with this change, thanks for summarising this concisely!

If my understanding is correct, then I would have expected this to be effectively a change strictly inside createSCEV. The fact we seem to be changing things much more broadly confuses me. Maybe there's some complexity coming from the phi handling, but if so, maybe a first patch which *only* iteratively constructs arithmetic chains and relies on recursion for the tricky phi parts?

I think most of the 'cluttering' changes are getSCEV -> getExistingSCEV. Those are not strictly necessary and I should be able to strip those from the initial version, same for excluding the phi part.

The getSCEV -> getExistingSCEV changes are related to the invariant mentioned: if we iteratively construct the operands first, then we should be able to rely on the fact that SCEVs for all operands exist when constructing a new SCEV, hence using getExistingSCEV instead of getSCEV.

fhahn planned changes to this revision. Jan 17 2022, 9:39 AM

Just a rebase so this can be applied on top of main.

Harbormaster completed remote builds in B143830: Diff 400588. Jan 17 2022, 10:18 AM

I tried this patch to see if it solves the issue reported in https://github.com/llvm/llvm-project/issues/49293#issuecomment-981040924

It fails with the following error - and a request to submit a bug report:

PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
 #0 0x0000000000801b33 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/usr/lib/llvm-14/bin/ld.lld+0x801b33)
 #1 0x00000000007ffb0e llvm::sys::RunSignalHandlers() (/usr/lib/llvm-14/bin/ld.lld+0x7ffb0e)
 #2 0x000000000080212f SignalHandler(int) Signals.cpp:0:0
 #3 0x00007f3012f663c0 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x153c0)
 #4 0x0000000002d3cd12 CompareSCEVComplexity(llvm::EquivalenceClasses<llvm::SCEV const*, std::less<llvm::SCEV const*> >&, llvm::EquivalenceClasses<llvm::Value const*, std::less<llvm::Value const*> >&, llvm::LoopInfo const*, llvm::SCEV const*, llvm::SCEV const*, llvm::DominatorTree&, unsigned int) ScalarEvolution.cpp:0:0
 #5 0x0000000002d0044f GroupByComplexity(llvm::SmallVectorImpl<llvm::SCEV const*>&, llvm::LoopInfo*, llvm::DominatorTree&) ScalarEvolution.cpp:0:0
 #6 0x0000000002cf4608 llvm::ScalarEvolution::getAddExpr(llvm::SmallVectorImpl<llvm::SCEV const*>&, llvm::SCEV::NoWrapFlags, unsigned int) (/usr/lib/llvm-14/bin/ld.lld+0x2cf4608)
 #7 0x0000000002d14f67 llvm::ScalarEvolution::createSCEV(llvm::Value*) (/usr/lib/llvm-14/bin/ld.lld+0x2d14f67)
 #8 0x0000000002d0714f llvm::ScalarEvolution::createSCEVIter(llvm::Value*) (/usr/lib/llvm-14/bin/ld.lld+0x2d0714f)
 #9 0x00000000025c508e llvm::SLPVectorizerPass::vectorizeGEPIndices(llvm::BasicBlock*, llvm::slpvectorizer::BoUpSLP&) (/usr/lib/llvm-14/bin/ld.lld+0x25c508e)
#10 0x00000000025c280c llvm::SLPVectorizerPass::runImpl(llvm::Function&, llvm::ScalarEvolution*, llvm::TargetTransformInfo*, llvm::TargetLibraryInfo*, llvm::AAResults*, llvm::LoopInfo*, llvm::DominatorTree*, llvm::AssumptionCache*, llvm::DemandedBits*, llvm::OptimizationRemarkEmitter*) (/usr/lib/llvm-14/bin/ld.lld+0x25c280c)
#11 0x00000000025c20f1 llvm::SLPVectorizerPass::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (/usr/lib/llvm-14/bin/ld.lld+0x25c20f1)
#12 0x00000000022c7b5d llvm::detail::PassModel<llvm::Function, llvm::SLPVectorizerPass, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Function> >::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (/usr/lib/llvm-14/bin/ld.lld+0x22c7b5d)
#13 0x0000000003181b59 llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function> >::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (/usr/lib/llvm-14/bin/ld.lld+0x3181b59)
#14 0x0000000000dfe32d llvm::detail::PassModel<llvm::Function, llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function> >, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Function> >::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (/usr/lib/llvm-14/bin/ld.lld+0xdfe32d)
#15 0x00000000031853f5 llvm::ModuleToFunctionPassAdaptor::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/usr/lib/llvm-14/bin/ld.lld+0x31853f5)
#16 0x0000000000dfe17d llvm::detail::PassModel<llvm::Module, llvm::ModuleToFunctionPassAdaptor, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Module> >::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/usr/lib/llvm-14/bin/ld.lld+0xdfe17d)
#17 0x0000000003180d16 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module> >::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/usr/lib/llvm-14/bin/ld.lld+0x3180d16)
#18 0x0000000001e7311c llvm::lto::opt(llvm::lto::Config const&, llvm::TargetMachine*, unsigned int, llvm::Module&, bool, llvm::ModuleSummaryIndex*, llvm::ModuleSummaryIndex const*, std::vector<unsigned char, std::allocator<unsigned char> > const&) (/usr/lib/llvm-14/bin/ld.lld+0x1e7311c)
#19 0x0000000001e75a42 llvm::lto::thinBackend(llvm::lto::Config const&, unsigned int, std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream> > > (unsigned int)>, llvm::Module&, llvm::ModuleSummaryIndex const&, llvm::StringMap<std::unordered_set<unsigned long, std::hash<unsigned long>, std::equal_to<unsigned long>, std::allocator<unsigned long> >, llvm::MallocAllocator> const&, llvm::DenseMap<unsigned long, llvm::GlobalValueSummary*, llvm::DenseMapInfo<unsigned long, void>, llvm::detail::DenseMapPair<unsigned long, llvm::GlobalValueSummary*> > const&, llvm::MapVector<llvm::StringRef, llvm::BitcodeModule, llvm::DenseMap<llvm::StringRef, unsigned int, llvm::DenseMapInfo<llvm::StringRef, void>, llvm::detail::DenseMapPair<llvm::StringRef, unsigned int> >, std::vector<std::pair<llvm::StringRef, llvm::BitcodeModule>, std::allocator<std::pair<llvm::StringRef, llvm::BitcodeModule> > > >*, std::vector<unsigned char, std::allocator<unsigned char> > const&)::$_3::operator()(llvm::Module&, llvm::TargetMachine*, std::unique_ptr<llvm::ToolOutputFile, std::default_delete<llvm::ToolOutputFile> >) const LTOBackend.cpp:0:0
#20 0x0000000001e758bf llvm::lto::thinBackend(llvm::lto::Config const&, unsigned int, std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream> > > (unsigned int)>, llvm::Module&, llvm::ModuleSummaryIndex const&, llvm::StringMap<std::unordered_set<unsigned long, std::hash<unsigned long>, std::equal_to<unsigned long>, std::allocator<unsigned long> >, llvm::MallocAllocator> const&, llvm::DenseMap<unsigned long, llvm::GlobalValueSummary*, llvm::DenseMapInfo<unsigned long, void>, llvm::detail::DenseMapPair<unsigned long, llvm::GlobalValueSummary*> > const&, llvm::MapVector<llvm::StringRef, llvm::BitcodeModule, llvm::DenseMap<llvm::StringRef, unsigned int, llvm::DenseMapInfo<llvm::StringRef, void>, llvm::detail::DenseMapPair<llvm::StringRef, unsigned int> >, std::vector<std::pair<llvm::StringRef, llvm::BitcodeModule>, std::allocator<std::pair<llvm::StringRef, llvm::BitcodeModule> > > >*, std::vector<unsigned char, std::allocator<unsigned char> > const&) (/usr/lib/llvm-14/bin/ld.lld+0x1e758bf)
#21 0x0000000001e6ed58 (anonymous namespace)::InProcessThinBackend::runThinLTOBackendThread(std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream> > > (unsigned int)>, std::function<llvm::Expected<std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream> > > (unsigned int)> > (unsigned int, llvm::StringRef)>, unsigned int, llvm::BitcodeModule, llvm::ModuleSummaryIndex&, llvm::StringMap<std::unordered_set<unsigned long, std::hash<unsigned long>, std::equal_to<unsigned long>, std::allocator<unsigned long> >, llvm::MallocAllocator> const&, llvm::DenseSet<llvm::ValueInfo, llvm::DenseMapInfo<llvm::ValueInfo, void> > const&, std::map<unsigned long, llvm::GlobalValue::LinkageTypes, std::less<unsigned long>, std::allocator<std::pair<unsigned long const, llvm::GlobalValue::LinkageTypes> > > const&, llvm::DenseMap<unsigned long, llvm::GlobalValueSummary*, llvm::DenseMapInfo<unsigned long, void>, llvm::detail::DenseMapPair<unsigned long, llvm::GlobalValueSummary*> > const&, llvm::MapVector<llvm::StringRef, llvm::BitcodeModule, llvm::DenseMap<llvm::StringRef, unsigned int, llvm::DenseMapInfo<llvm::StringRef, void>, llvm::detail::DenseMapPair<llvm::StringRef, unsigned int> >, std::vector<std::pair<llvm::StringRef, llvm::BitcodeModule>, std::allocator<std::pair<llvm::StringRef, llvm::BitcodeModule> > > >&)::'lambda'(std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream> > > (unsigned int)>)::operator()(std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream> > > (unsigned int)>) const LTO.cpp:0:0
#22 0x0000000001e6e857 std::_Function_handler<void (), std::_Bind<(anonymous namespace)::InProcessThinBackend::start(unsigned int, llvm::BitcodeModule, llvm::StringMap<std::unordered_set<unsigned long, std::hash<unsigned long>, std::equal_to<unsigned long>, std::allocator<unsigned long> >, llvm::MallocAllocator> const&, llvm::DenseSet<llvm::ValueInfo, llvm::DenseMapInfo<llvm::ValueInfo, void> > const&, std::map<unsigned long, llvm::GlobalValue::LinkageTypes, std::less<unsigned long>, std::allocator<std::pair<unsigned long const, llvm::GlobalValue::LinkageTypes> > > const&, llvm::MapVector<llvm::StringRef, llvm::BitcodeModule, llvm::DenseMap<llvm::StringRef, unsigned int, llvm::DenseMapInfo<llvm::StringRef, void>, llvm::detail::DenseMapPair<llvm::StringRef, unsigned int> >, std::vector<std::pair<llvm::StringRef, llvm::BitcodeModule>, std::allocator<std::pair<llvm::StringRef, llvm::BitcodeModule> > > >&)::'lambda'(llvm::BitcodeModule, llvm::ModuleSummaryIndex&, llvm::StringMap<std::unordered_set<unsigned long, std::hash<unsigned long>, std::equal_to<unsigned long>, std::allocator<unsigned long> >, llvm::MallocAllocator> const&, llvm::DenseSet<llvm::ValueInfo, llvm::DenseMapInfo<llvm::ValueInfo, void> > const&, std::map<unsigned long, llvm::GlobalValue::LinkageTypes, std::less<unsigned long>, std::allocator<std::pair<unsigned long const, llvm::GlobalValue::LinkageTypes> > > const&, llvm::DenseMap<unsigned long, llvm::GlobalValueSummary*, llvm::DenseMapInfo<unsigned long, void>, llvm::detail::DenseMapPair<unsigned long, llvm::GlobalValueSummary*> > const&, llvm::MapVector<llvm::StringRef, llvm::BitcodeModule, llvm::DenseMap<llvm::StringRef, unsigned int, llvm::DenseMapInfo<llvm::StringRef, void>, llvm::detail::DenseMapPair<llvm::StringRef, unsigned int> >, std::vector<std::pair<llvm::StringRef, llvm::BitcodeModule>, std::allocator<std::pair<llvm::StringRef, llvm::BitcodeModule> > > >&) (llvm::BitcodeModule, std::reference_wrapper<llvm::ModuleSummaryIndex>, 
std::reference_wrapper<llvm::StringMap<std::unordered_set<unsigned long, std::hash<unsigned long>, std::equal_to<unsigned long>, std::allocator<unsigned long> >, llvm::MallocAllocator> const>, std::reference_wrapper<llvm::DenseSet<llvm::ValueInfo, llvm::DenseMapInfo<llvm::ValueInfo, void> > const>, std::reference_wrapper<std::map<unsigned long, llvm::GlobalValue::LinkageTypes, std::less<unsigned long>, std::allocator<std::pair<unsigned long const, llvm::GlobalValue::LinkageTypes> > > const>, std::reference_wrapper<llvm::DenseMap<unsigned long, llvm::GlobalValueSummary*, llvm::DenseMapInfo<unsigned long, void>, llvm::detail::DenseMapPair<unsigned long, llvm::GlobalValueSummary*> > const>, std::reference_wrapper<llvm::MapVector<llvm::StringRef, llvm::BitcodeModule, llvm::DenseMap<llvm::StringRef, unsigned int, llvm::DenseMapInfo<llvm::StringRef, void>, llvm::detail::DenseMapPair<llvm::StringRef, unsigned int> >, std::vector<std::pair<llvm::StringRef, llvm::BitcodeModule>, std::allocator<std::pair<llvm::StringRef, llvm::BitcodeModule> > > > >)> >::_M_invoke(std::_Any_data const&) LTO.cpp:0:0
#23 0x0000000000b35f0c std::_Function_handler<void (), llvm::ThreadPool::createTaskAndFuture(std::function<void ()>)::'lambda'()>::_M_invoke(std::_Any_data const&) (/usr/lib/llvm-14/bin/ld.lld+0xb35f0c)
#24 0x0000000003253d11 void* llvm::thread::ThreadProxy<std::tuple<llvm::ThreadPool::grow(int)::$_0> >(void*) ThreadPool.cpp:0:0
#25 0x00007f3012f5a609 start_thread /build/glibc-eX1tMB/glibc-2.31/nptl/pthread_create.c:478:7
#26 0x00007f301291c293 __clone /build/glibc-eX1tMB/glibc-2.31/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:97:0
fish: “/usr/lib/llvm-14/bin/ld.lld --h…” terminated by signal SIGSEGV (Address boundary error)

mkazantsev resigned from this revision. Mar 4 2022, 12:14 AM

Herald added a project: Restricted Project. Mar 4 2022, 12:14 AM

fhahn mentioned this in rG85eb8b7244b6: [IndVars] Add test for crash exposed by D114650. Apr 22 2022, 2:44 AM

Another rebase and attempt to reduce the diff by keeping getSCEV instead of getExistingSCEV. I am planning to further strip down the changes related to phi handling and have the initial version only construct SCEVs iteratively for IR without PHIs/cycles.

Harbormaster completed remote builds in B161140: Diff 424873. Apr 25 2022, 5:45 AM

fhahn edited the summary of this revision. Apr 28 2022, 5:55 AM

fhahn edited the summary of this revision. May 10 2022, 3:10 AM

chapuni added a subscriber: chapuni. May 12 2022, 2:38 PM

chapuni removed a subscriber: chapuni.

chapuni added a subscriber: chapuni.

uabelho added a subscriber: uabelho. Jun 9 2022, 11:36 PM

Rebased and updated to keep recursive handling of phi nodes for now. I think this is as stripped down as possible for an initial version. Compile-time impact is more manageable than for the initial version:

NewPM-O3: +0.11%
NewPM-ReleaseThinLTO: +0.12%
NewPM-ReleaseLTO-g: +0.08%

https://llvm-compile-time-tracker.com/compare.php?from=9d2349c78f93cba78c1f3b770355f2ac0cb98163&to=ff418d805ec44e288f4c8eb3150e1e874b26040c&stat=instructions

The patch can lead to slightly different (mostly better) results, because in some cases we do not run into various existing recursion limits.

I'd be curious to hear what people think about the latest version and the compile-time trade-off. It should fix a set of related crashes, linked to https://github.com/llvm/llvm-project/issues/44546. It fixes all crashes I was able to reproduce from the issue list on my local system.

Herald added a subscriber: nlopes. Jun 24 2022, 8:08 AM

fhahn retitled this revision from [SCEV] Construct SCEV iteratively (WIP). to [SCEV] Construct SCEV iteratively. Jun 24 2022, 8:20 AM

fhahn edited the summary of this revision.

Harbormaster completed remote builds in B171875: Diff 439764. Jun 24 2022, 9:04 AM

nlopes added inline comments. Jun 24 2022, 10:05 AM

llvm/lib/Analysis/ScalarEvolution.cpp
7067	Poison, poison! :)

fhahn mentioned this in D128586: [SCEV] Use SCEVUnknown(poison) instead of SCEVUnknown(undef). Jun 25 2022, 8:42 AM

fhahn marked an inline comment as done. Jun 25 2022, 8:44 AM

fhahn added inline comments.

llvm/lib/Analysis/ScalarEvolution.cpp
7067	This is unrelated to this patch, I created D128586. I'll update things here once/if D128586 goes in

fhahn mentioned this in rGe4e22b6d8038: [SCEV] Use SCEVUnknown(poison) instead of SCEVUnknown(undef). Jun 27 2022, 1:33 AM

Updated to use PoisonValue after e4e22b6d8038.

Harbormaster completed remote builds in B172137: Diff 440131. Jun 27 2022, 2:55 AM

nikic added inline comments. Jun 27 2022, 7:01 AM

llvm/lib/Analysis/ScalarEvolution.cpp
7099	Add and mul currently have special code that tries to combine multiple sequential adds/mul into one getAddExpr/getMulExpr call. This is effectively lost here (or maybe worse, we end up doing both -- each add individually here, and then a multiple-add variant in getSCEV). I think if we're going to change this (which might well make sense), we should probably change that separately, and also in the createSCEV() code as well. I suspect that this might account for both some of the codegen changes and some of the compile-time impact.
7125	https://github.com/llvm/llvm-project/commit/327307d9d4da0045f762f75343fe66b0f10ecc63 ;)

Replace isSized check with assertion, traverse add/mul chains when collecting operands.

This slightly increases compile-time, and the geomean impact is now:

NewPM-O3: +0.08%
NewPM-ReleaseThinLTO: +0.08%
NewPM-ReleaseLTO-g: +0.09%

https://llvm-compile-time-tracker.com/compare.php?from=403466860b628d6ada20a4b85b5b2ecdfe97b389&to=d662c2da9dc338dd260dacab19c3fd3f5b252fc0&stat=instructions

fhahn marked 2 inline comments as done. Jun 28 2022, 5:57 AM

fhahn added inline comments.

llvm/lib/Analysis/ScalarEvolution.cpp
7099	Thanks, I updated `getOperandsToCreate` to also traverse mul/add chains. This indeed removed the `IndVarSimplify` changes and improved compile-time a bit. We could go even further and enqueue (Opcode, Operands) tuples to create directly, instead of going through `createSCEV`, but this would require a way to encode at least negations for the Add case.
7125	Replaced with an assert.
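
The add/mul chain traversal discussed in the inline comments on line 7099 can be illustrated with a toy binary-add tree. This is a sketch under stated assumptions rather than what `getOperandsToCreate` does: `ToyValue`, `collectAddOperands` and `countOperandsOfLeftChain` are hypothetical, and real code would also have to respect wrap flags and other conditions. It shows how nested adds can be walked iteratively so that only the non-add operands are collected, ready for a single n-ary add.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy value: either a leaf or a binary add of two other values.
struct ToyValue {
  bool IsAdd = false;
  const ToyValue *LHS = nullptr;
  const ToyValue *RHS = nullptr;
};

// Collect the non-add operands reachable through nested adds, iteratively.
std::vector<const ToyValue *> collectAddOperands(const ToyValue *Root) {
  std::vector<const ToyValue *> Leaves;
  std::vector<const ToyValue *> Worklist{Root};
  while (!Worklist.empty()) {
    const ToyValue *V = Worklist.back();
    Worklist.pop_back();
    if (!V->IsAdd) {
      Leaves.push_back(V); // a real operand of the flattened add
      continue;
    }
    // Look through the add instead of treating it as an operand.
    Worklist.push_back(V->RHS);
    Worklist.push_back(V->LHS);
  }
  return Leaves;
}

// Build ((l0 + l1) + l2) + ... with NumAdds adds and count the flat operands.
std::size_t countOperandsOfLeftChain(std::size_t NumAdds) {
  std::vector<ToyValue> Leaves(NumAdds + 1);
  std::vector<ToyValue> Adds(NumAdds);
  const ToyValue *Prev = &Leaves[0];
  for (std::size_t I = 0; I < NumAdds; ++I) {
    Adds[I].IsAdd = true;
    Adds[I].LHS = Prev;
    Adds[I].RHS = &Leaves[I + 1];
    Prev = &Adds[I];
  }
  return collectAddOperands(Prev).size();
}
```

A chain of N adds yields N + 1 flat operands in this model, regardless of how deeply the adds are nested.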

Adjust comment about traversing add/mul chains in getOperandsToCreate.

LG. Seems to be a recurring problem, and the solution looks sufficiently non-intrusive. I like that there is no requirement to precisely synchronize the logic between createSCEV and createSCEVIter here, which would be a PITA. If we don't queue enough operands, those will just get computed recursively.

llvm/lib/Analysis/ScalarEvolution.cpp
7185	I don't think this is strictly true due to the createNodeForSelectViaUMinSeq() code. And either way, I don't think we need to handle it this precisely here.

This revision is now accepted and ready to land. Jun 28 2022, 6:28 AM

Harbormaster completed remote builds in B172456: Diff 440584. Jun 28 2022, 6:54 AM

This revision was landed with ongoing or failed builds. Jun 29 2022, 3:29 AM

Closed by commit rG675080a4533b: [SCEV] Construct SCEV iteratively. (authored by fhahn).

This revision was automatically updated to reflect the committed changes.

fhahn marked an inline comment as done.

fhahn added a commit: rG675080a4533b: [SCEV] Construct SCEV iteratively..

fhahn added inline comments. Jun 29 2022, 3:30 AM

llvm/lib/Analysis/ScalarEvolution.cpp
7185	Thanks, I removed the check for Instruction to simplify things here.

Allen mentioned this in D104679: [WIP][LoopUnrolling] Add flag to restrict the unroll with large loop size. Jul 1 2022, 11:03 PM

Heads up: this commit seems to cause miscompiles on some of our code. We're still working on an isolated test case.

In D114650#3627627, @alexfh wrote:

Heads up: this commit seems to cause miscompiles on some of our code. We're still working on an isolated test case.

False alarm. The issue turned out to be caused by a UB in the code.

In D114650#3630397, @alexfh wrote:

In D114650#3627627, @alexfh wrote:

Heads up: this commit seems to cause miscompiles on some of our code. We're still working on an isolated test case.

False alarm. The issue turned out to be caused by a UB in the code.

Great, thanks for checking!

Hi! We are seeing some large compile time increases with this patch. For example on F23720290, the time for opt -passes=loop-unroll goes from about 2s to 14s while producing identical output.

In D114650#3639074, @foad wrote:

Hi! We are seeing some large compile time increases with this patch. For example on F23720290, the time for opt -passes=loop-unroll goes from about 2s to 14s while producing identical output.

Interesting, I'll take a look

In D114650#3641601, @fhahn wrote:

In D114650#3639074, @foad wrote:

Hi! We are seeing some large compile time increases with this patch. For example on F23720290, the time for opt -passes=loop-unroll goes from about 2s to 14s while producing identical output.

Interesting, I'll take a look

I put up a fix: D129731

Revision Contents

Path                                                         Size
llvm/include/llvm/Analysis/ScalarEvolution.h                 2 lines
llvm/lib/Analysis/ScalarEvolution.cpp                        402 lines
llvm/lib/Transforms/Scalar/StraightLineStrengthReduce.cpp    1 line

Diff 400588

llvm/include/llvm/Analysis/ScalarEvolution.h

[...] private:
   /// a constant range which represents the entire recurrence. Note that
   /// add recurrences with loop invariant steps aren't represented by
   /// SCEVUnknowns and thus don't use this mechanism.
   ConstantRange getRangeForUnknownRecurrence(const SCEVUnknown *U);

   /// We know that there is no SCEV for the specified value. Analyze the
   /// expression.
   const SCEV *createSCEV(Value *V);
+  const SCEV *createSCEVIter(Value *V);
+  void getOperandsToCreate(Value *V, SmallVectorImpl<Value *> &Ops);

   /// Provide the special handling we need to analyze PHI SCEVs.
   const SCEV *createNodeForPHI(PHINode *PN);

   /// Helper function called from createNodeForPHI.
   const SCEV *createAddRecFromPHI(PHINode *PN);

   /// A helper function for createAddRecFromPHI to handle simple cases.
[...]

llvm/lib/Analysis/ScalarEvolution.cpp

[...] #endif
   // Okay, it looks like we really DO need an addrec expr. Check to see if we
   // already have one, otherwise create a new one.
   return getOrCreateAddRecExpr(Operands, L, Flags);
 }

 const SCEV *
 ScalarEvolution::getGEPExpr(GEPOperator *GEP,
                             const SmallVectorImpl<const SCEV *> &IndexExprs) {
-  const SCEV *BaseExpr = getSCEV(GEP->getPointerOperand());
+  const SCEV *BaseExpr = getExistingSCEV(GEP->getPointerOperand());
   // getSCEV(Base)->getType() has the same address space as Base->getType()
   // because SCEV::getType() preserves the address space.
   Type *IntIdxTy = getEffectiveSCEVType(BaseExpr->getType());
   const bool AssumeInBoundsFlags = [&]() {
     if (!GEP->isInBounds())
       return false;

     // We'd like to propagate flags from the IR to the corresponding SCEV nodes,
[... 667 lines not shown ...]	void ScalarEvolution::insertValueToMap(Value *V, const SCEV *S) {
}		}
}		}

/// Return an existing SCEV if it exists, otherwise analyze the expression and		/// Return an existing SCEV if it exists, otherwise analyze the expression and
/// create a new one.		/// create a new one.
const SCEV *ScalarEvolution::getSCEV(Value *V) {		const SCEV *ScalarEvolution::getSCEV(Value *V) {
assert(isSCEVable(V->getType()) && "Value is not SCEVable!");		assert(isSCEVable(V->getType()) && "Value is not SCEVable!");

const SCEV *S = getExistingSCEV(V);		if (const SCEV *S = getExistingSCEV(V))
if (S == nullptr) {
S = createSCEV(V);
// During PHI resolution, it is possible to create two SCEVs for the same
// V, so it is needed to double check whether V->S is inserted into
// ValueExprMap before insert S->{V, 0} into ExprValueMap.
std::pair<ValueExprMapType::iterator, bool> Pair =
ValueExprMap.insert({SCEVCallbackVH(V, this), S});
if (Pair.second) {
ExprValueMap[S].insert({V, nullptr});

// If S == Stripped + Offset, add Stripped -> {V, Offset} into
// ExprValueMap.
const SCEV *Stripped = S;
ConstantInt *Offset = nullptr;
std::tie(Stripped, Offset) = splitAddExpr(S);
// If stripped is SCEVUnknown, don't bother to save
// Stripped -> {V, offset}. It doesn't simplify and sometimes even
// increase the complexity of the expansion code.
// If V is GetElementPtrInst, don't save Stripped -> {V, offset}
// because it may generate add/sub instead of GEP in SCEV expansion.
if (Offset != nullptr && !isa<SCEVUnknown>(Stripped) &&
!isa<GetElementPtrInst>(V))
ExprValueMap[Stripped].insert({V, Offset});
}
}
return S;		return S;
		return createSCEVIter(V);
}		}

const SCEV *ScalarEvolution::getExistingSCEV(Value *V) {		const SCEV *ScalarEvolution::getExistingSCEV(Value *V) {
assert(isSCEVable(V->getType()) && "Value is not SCEVable!");		assert(isSCEVable(V->getType()) && "Value is not SCEVable!");

ValueExprMapType::iterator I = ValueExprMap.find_as(V);		ValueExprMapType::iterator I = ValueExprMap.find_as(V);
if (I != ValueExprMap.end()) {		if (I != ValueExprMap.end()) {
const SCEV *S = I->second;		const SCEV *S = I->second;
[... 1,140 lines not shown ...]	const SCEV *ScalarEvolution::createSimpleAffineAddRec(PHINode *PN,
auto BO = MatchBinaryOp(BEValueV, DT);		auto BO = MatchBinaryOp(BEValueV, DT);
if (!BO)		if (!BO)
return nullptr;		return nullptr;

if (BO->Opcode != Instruction::Add)		if (BO->Opcode != Instruction::Add)
return nullptr;		return nullptr;

const SCEV *Accum = nullptr;		const SCEV *Accum = nullptr;
if (BO->LHS == PN && L->isLoopInvariant(BO->RHS))		if (BO->LHS == PN && L->isLoopInvariant(BO->RHS)) {
Accum = getSCEV(BO->RHS);		Accum = getExistingSCEV(BO->RHS);
else if (BO->RHS == PN && L->isLoopInvariant(BO->LHS))		assert(Accum);
Accum = getSCEV(BO->LHS);		} else if (BO->RHS == PN && L->isLoopInvariant(BO->LHS)) {
		Accum = getExistingSCEV(BO->LHS);
		assert(Accum);
		}

if (!Accum)		if (!Accum)
return nullptr;		return nullptr;

SCEV::NoWrapFlags Flags = SCEV::FlagAnyWrap;		SCEV::NoWrapFlags Flags = SCEV::FlagAnyWrap;
if (BO->IsNUW)		if (BO->IsNUW)
Flags = setFlags(Flags, SCEV::FlagNUW);		Flags = setFlags(Flags, SCEV::FlagNUW);
if (BO->IsNSW)		if (BO->IsNSW)
Flags = setFlags(Flags, SCEV::FlagNSW);		Flags = setFlags(Flags, SCEV::FlagNSW);

const SCEV *StartVal = getSCEV(StartValueV);		const SCEV *StartVal = getExistingSCEV(StartValueV);
const SCEV *PHISCEV = getAddRecExpr(StartVal, Accum, L, Flags);		const SCEV *PHISCEV = getAddRecExpr(StartVal, Accum, L, Flags);
insertValueToMap(PN, PHISCEV);		insertValueToMap(PN, PHISCEV);

// We can add Flags to the post-inc expression only if we		// We can add Flags to the post-inc expression only if we
// know that it is undefined behavior for BEValueV to		// know that it is undefined behavior for BEValueV to
// overflow.		// overflow.
if (auto *BEInst = dyn_cast<Instruction>(BEValueV)) {		if (auto *BEInst = dyn_cast<Instruction>(BEValueV)) {
assert(isLoopInvariant(Accum, L) &&		assert(isLoopInvariant(Accum, L) &&
"Accum is defined outside L, but is not invariant?");		"Accum is defined outside L, but is not invariant?");
if (isAddRecNeverPoison(BEInst, L))		if (isAddRecNeverPoison(BEInst, L))
(void)getAddRecExpr(getAddExpr(StartVal, Accum), Accum, L, Flags);		(void)getAddRecExpr(getAddExpr(StartVal, Accum), Accum, L, Flags);
}		}

return PHISCEV;		return PHISCEV;
}		}

const SCEV *ScalarEvolution::createAddRecFromPHI(PHINode *PN) {		static std::pair<Value *, Value *> classifyPhiValues(PHINode *PN,
const Loop *L = LI.getLoopFor(PN->getParent());		const Loop *L) {
if (!L || L->getHeader() != PN->getParent())		if (!L || L->getHeader() != PN->getParent())
return nullptr;		return {nullptr, nullptr};

// The loop may have multiple entrances or multiple exits; we can analyze		// The loop may have multiple entrances or multiple exits; we can analyze
// this phi as an addrec if it has a unique entry value and a unique		// this phi as an addrec if it has a unique entry value and a unique
// backedge value.		// backedge value.
Value *BEValueV = nullptr, *StartValueV = nullptr;		Value *BEValueV = nullptr, *StartValueV = nullptr;
for (unsigned i = 0, e = PN->getNumIncomingValues(); i != e; ++i) {		for (unsigned i = 0, e = PN->getNumIncomingValues(); i != e; ++i) {
Value *V = PN->getIncomingValue(i);		Value *V = PN->getIncomingValue(i);
if (L->contains(PN->getIncomingBlock(i))) {		if (L->contains(PN->getIncomingBlock(i))) {
if (!BEValueV) {		if (!BEValueV) {
BEValueV = V;		BEValueV = V;
} else if (BEValueV != V) {		} else if (BEValueV != V) {
BEValueV = nullptr;		BEValueV = nullptr;
break;		break;
}		}
} else if (!StartValueV) {		} else if (!StartValueV) {
StartValueV = V;		StartValueV = V;
} else if (StartValueV != V) {		} else if (StartValueV != V) {
StartValueV = nullptr;		StartValueV = nullptr;
break;		break;
}		}
}		}
		return {StartValueV, BEValueV};
		}

		const SCEV *ScalarEvolution::createAddRecFromPHI(PHINode *PN) {
		const Loop *L = LI.getLoopFor(PN->getParent());
		Value *BEValueV, *StartValueV;
		std::tie(StartValueV, BEValueV) = classifyPhiValues(PN, L);
if (!BEValueV || !StartValueV)		if (!BEValueV || !StartValueV)
return nullptr;		return nullptr;

assert(ValueExprMap.find_as(PN) == ValueExprMap.end() &&		assert(ValueExprMap.find_as(PN) == ValueExprMap.end() &&
"PHI node already processed?");		"PHI node already processed?");

// First, try to find AddRec expression without creating a fictituos symbolic		// First, try to find AddRec expression without creating a fictituos symbolic
// value for PN.		// value for PN.
[... 232 lines not shown ...]	if (DT.dominates(LeftEdge, RightUse) && DT.dominates(RightEdge, LeftUse)) {
LHS = RightUse;		LHS = RightUse;
RHS = LeftUse;		RHS = LeftUse;
return true;		return true;
}		}

return false;		return false;
}		}

const SCEV *ScalarEvolution::createNodeFromSelectLikePHI(PHINode *PN) {		static BranchInst *getSelectPhiBr(PHINode *PN, const Loop *L, DominatorTree &DT,
		LoopInfo &LI) {
auto IsReachable =		auto IsReachable =
[&](BasicBlock *BB) { return DT.isReachableFromEntry(BB); };		[&](BasicBlock *BB) { return DT.isReachableFromEntry(BB); };

if (PN->getNumIncomingValues() == 2 && all_of(PN->blocks(), IsReachable)) {		if (PN->getNumIncomingValues() == 2 && all_of(PN->blocks(), IsReachable)) {
const Loop *L = LI.getLoopFor(PN->getParent());

// We don't want to break LCSSA, even in a SCEV expression tree.		// We don't want to break LCSSA, even in a SCEV expression tree.
for (unsigned i = 0, e = PN->getNumIncomingValues(); i != e; ++i)		for (unsigned i = 0, e = PN->getNumIncomingValues(); i != e; ++i)
if (LI.getLoopFor(PN->getIncomingBlock(i)) != L)		if (LI.getLoopFor(PN->getIncomingBlock(i)) != L)
return nullptr;		return nullptr;

// Try to match		// Try to match
//		//
// br %cond, label %left, label %right		// br %cond, label %left, label %right
// left:		// left:
// br label %merge		// br label %merge
// right:		// right:
// br label %merge		// br label %merge
// merge:		// merge:
// V = phi [ %x, %left ], [ %y, %right ]		// V = phi [ %x, %left ], [ %y, %right ]
//		//
// as "select %cond, %x, %y"		// as "select %cond, %x, %y"

BasicBlock *IDom = DT[PN->getParent()]->getIDom()->getBlock();		BasicBlock *IDom = DT[PN->getParent()]->getIDom()->getBlock();
assert(IDom && "At least the entry block should dominate PN");		assert(IDom && "At least the entry block should dominate PN");

auto *BI = dyn_cast<BranchInst>(IDom->getTerminator());		return dyn_cast<BranchInst>(IDom->getTerminator());
Value *Cond = nullptr, *LHS = nullptr, *RHS = nullptr;		}
		return nullptr;
		}

if (BI && BI->isConditional() &&		const SCEV *ScalarEvolution::createNodeFromSelectLikePHI(PHINode *PN) {
BrPHIToSelect(DT, BI, PN, Cond, LHS, RHS) &&		const Loop *L = LI.getLoopFor(PN->getParent());
IsAvailableOnEntry(L, DT, getSCEV(LHS), PN->getParent()) &&		BranchInst *BI = getSelectPhiBr(PN, L, DT, LI);
IsAvailableOnEntry(L, DT, getSCEV(RHS), PN->getParent()))		Value *Cond = nullptr, *LHS = nullptr, *RHS = nullptr;
		if (BI && BI->isConditional() && BrPHIToSelect(DT, BI, PN, Cond, LHS, RHS) &&
		IsAvailableOnEntry(L, DT, getExistingSCEV(LHS), PN->getParent()) &&
		IsAvailableOnEntry(L, DT, getExistingSCEV(RHS), PN->getParent()))
return createNodeForSelectOrPHI(PN, Cond, LHS, RHS);		return createNodeForSelectOrPHI(PN, Cond, LHS, RHS);
}

return nullptr;		return nullptr;
}		}

const SCEV *ScalarEvolution::createNodeForPHI(PHINode *PN) {		const SCEV *ScalarEvolution::createNodeForPHI(PHINode *PN) {
if (const SCEV *S = createAddRecFromPHI(PN))		if (const SCEV *S = createAddRecFromPHI(PN))
return S;		return S;

[... 14 lines not shown ...]

const SCEV *ScalarEvolution::createNodeForSelectOrPHI(Instruction *I,		const SCEV *ScalarEvolution::createNodeForSelectOrPHI(Instruction *I,
Value *Cond,		Value *Cond,
Value *TrueVal,		Value *TrueVal,
Value *FalseVal) {		Value *FalseVal) {
// Handle "constant" branch or select. This can occur for instance when a		// Handle "constant" branch or select. This can occur for instance when a
// loop pass transforms an inner loop and moves on to process the outer loop.		// loop pass transforms an inner loop and moves on to process the outer loop.
if (auto *CI = dyn_cast<ConstantInt>(Cond))		if (auto *CI = dyn_cast<ConstantInt>(Cond))
return getSCEV(CI->isOne() ? TrueVal : FalseVal);		return getExistingSCEV(CI->isOne() ? TrueVal : FalseVal);

// Try to match some simple smax or umax patterns.		// Try to match some simple smax or umax patterns.
auto *ICI = dyn_cast<ICmpInst>(Cond);		auto *ICI = dyn_cast<ICmpInst>(Cond);
if (!ICI)		if (!ICI)
return getUnknown(I);		return getUnknown(I);

Value *LHS = ICI->getOperand(0);		Value *LHS = ICI->getOperand(0);
Value *RHS = ICI->getOperand(1);		Value *RHS = ICI->getOperand(1);

switch (ICI->getPredicate()) {		switch (ICI->getPredicate()) {
case ICmpInst::ICMP_SLT:		case ICmpInst::ICMP_SLT:
case ICmpInst::ICMP_SLE:		case ICmpInst::ICMP_SLE:
case ICmpInst::ICMP_ULT:		case ICmpInst::ICMP_ULT:
case ICmpInst::ICMP_ULE:		case ICmpInst::ICMP_ULE:
std::swap(LHS, RHS);		std::swap(LHS, RHS);
LLVM_FALLTHROUGH;		LLVM_FALLTHROUGH;
case ICmpInst::ICMP_SGT:		case ICmpInst::ICMP_SGT:
case ICmpInst::ICMP_SGE:		case ICmpInst::ICMP_SGE:
case ICmpInst::ICMP_UGT:		case ICmpInst::ICMP_UGT:
case ICmpInst::ICMP_UGE:		case ICmpInst::ICMP_UGE:
// a > b ? a+x : b+x -> max(a, b)+x		// a > b ? a+x : b+x -> max(a, b)+x
// a > b ? b+x : a+x -> min(a, b)+x		// a > b ? b+x : a+x -> min(a, b)+x
if (getTypeSizeInBits(LHS->getType()) <= getTypeSizeInBits(I->getType())) {		if (getTypeSizeInBits(LHS->getType()) <= getTypeSizeInBits(I->getType())) {
bool Signed = ICI->isSigned();		bool Signed = ICI->isSigned();
		// TODO: Use getExistingSCEV!
const SCEV *LA = getSCEV(TrueVal);		const SCEV *LA = getSCEV(TrueVal);
const SCEV *RA = getSCEV(FalseVal);		const SCEV *RA = getSCEV(FalseVal);
const SCEV *LS = getSCEV(LHS);		const SCEV *LS = getSCEV(LHS);
const SCEV *RS = getSCEV(RHS);		const SCEV *RS = getSCEV(RHS);
if (LA->getType()->isPointerTy()) {		if (LA->getType()->isPointerTy()) {
// FIXME: Handle cases where LS/RS are pointers not equal to LA/RA.		// FIXME: Handle cases where LS/RS are pointers not equal to LA/RA.
// Need to make sure we can't produce weird expressions involving		// Need to make sure we can't produce weird expressions involving
// negated pointers.		// negated pointers.
[... 69 lines not shown ...]
/// to be analyzed by regular SCEV code.		/// to be analyzed by regular SCEV code.
const SCEV *ScalarEvolution::createNodeForGEP(GEPOperator *GEP) {		const SCEV *ScalarEvolution::createNodeForGEP(GEPOperator *GEP) {
// Don't attempt to analyze GEPs over unsized objects.		// Don't attempt to analyze GEPs over unsized objects.
if (!GEP->getSourceElementType()->isSized())		if (!GEP->getSourceElementType()->isSized())
return getUnknown(GEP);		return getUnknown(GEP);

SmallVector<const SCEV *, 4> IndexExprs;		SmallVector<const SCEV *, 4> IndexExprs;
for (Value *Index : GEP->indices())		for (Value *Index : GEP->indices())
IndexExprs.push_back(getSCEV(Index));		IndexExprs.push_back(getExistingSCEV(Index));
return getGEPExpr(GEP, IndexExprs);		return getGEPExpr(GEP, IndexExprs);
}		}

uint32_t ScalarEvolution::GetMinTrailingZerosImpl(const SCEV *S) {		uint32_t ScalarEvolution::GetMinTrailingZerosImpl(const SCEV *S) {
if (const SCEVConstant *C = dyn_cast<SCEVConstant>(S))		if (const SCEVConstant *C = dyn_cast<SCEVConstant>(S))
return C->getAPInt().countTrailingZeros();		return C->getAPInt().countTrailingZeros();

if (const SCEVPtrToIntExpr *I = dyn_cast<SCEVPtrToIntExpr>(S))		if (const SCEVPtrToIntExpr *I = dyn_cast<SCEVPtrToIntExpr>(S))
[... 1,002 lines not shown ...]

bool ScalarEvolution::loopIsFiniteByAssumption(const Loop *L) {		bool ScalarEvolution::loopIsFiniteByAssumption(const Loop *L) {
// A mustprogress loop without side effects must be finite.		// A mustprogress loop without side effects must be finite.
// TODO: The check used here is very conservative. It's only specific		// TODO: The check used here is very conservative. It's only specific
// side effects which are well defined in infinite loops.		// side effects which are well defined in infinite loops.
return isMustProgress(L) && loopHasNoSideEffects(L);		return isMustProgress(L) && loopHasNoSideEffects(L);
}		}

		const SCEV *ScalarEvolution::createSCEVIter(Value *V) {
		SmallVector<std::pair<Value *, bool>> Stack;
		SmallVector<Value *> Worklist;
		SmallPtrSet<Value *, 8> Visited;

		Stack.emplace_back(V, true);
		Stack.emplace_back(V, false);
		while (!Stack.empty()) {
		std::pair<Value *, bool> E = Stack.pop_back_val();
		if (E.second) {
		auto *S = getExistingSCEV(E.first);
		if (!S) {
		S = createSCEV(E.first);
		// During PHI resolution, it is possible to create two SCEVs for the
		// same V, so it is needed to double check whether V->S is inserted into
		// ValueExprMap before insert S->{V, 0} into ExprValueMap.
		std::pair<ValueExprMapType::iterator, bool> Pair =
		ValueExprMap.insert({SCEVCallbackVH(E.first, this), S});
		if (Pair.second) {
		ExprValueMap[S].insert({E.first, nullptr});

		// If S == Stripped + Offset, add Stripped -> {V, Offset} into
		// ExprValueMap.
		const SCEV *Stripped = S;
		ConstantInt *Offset = nullptr;
		std::tie(Stripped, Offset) = splitAddExpr(S);
		// If stripped is SCEVUnknown, don't bother to save
		// Stripped -> {V, offset}. It doesn't simplify and sometimes even
		// increase the complexity of the expansion code.
		// If V is GetElementPtrInst, don't save Stripped -> {V, offset}
		// because it may generate add/sub instead of GEP in SCEV expansion.
		if (Offset != nullptr && !isa<SCEVUnknown>(Stripped) &&
		!isa<GetElementPtrInst>(E.first))
		ExprValueMap[Stripped].insert({E.first, Offset});
		}
		}
		} else {
		if (!Visited.insert(E.first).second || getExistingSCEV(E.first))
		continue;
		SmallVector<Value *> Ops;
		getOperandsToCreate(E.first, Ops);
		Stack.emplace_back(E.first, true);
		for (Value *Op : Ops) {
		Stack.emplace_back(Op, false);
		}
		}
		}

		return getExistingSCEV(V);
		}
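
The two-phase stack used by `createSCEVIter` can be illustrated outside of LLVM. The following is only a minimal sketch, not the patch's code: `Node`, `Cache`, and `buildIteratively` are hypothetical stand-ins for `Value`, `ValueExprMap`, and `createSCEVIter`, and the sketch assumes an acyclic operand graph (the patch still handles phis recursively). Each node is pushed once with `false` ("enqueue my operands") and once with `true` ("all operands are ready, create me"), so operands are always built before their users without recursion.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical stand-in for an IR value: a name plus operand pointers.
struct Node {
  std::string Name;
  std::vector<const Node *> Operands;
};

// Hypothetical cache playing the role of ValueExprMap.
using Cache = std::unordered_map<const Node *, std::string>;

// Builds a textual "expression" for Root without recursion.
// Assumes the operand graph is acyclic.
const std::string &buildIteratively(const Node *Root, Cache &Done) {
  std::vector<std::pair<const Node *, bool>> Stack;
  std::unordered_set<const Node *> Visited;

  Stack.emplace_back(Root, true);  // create Root once its operands exist
  Stack.emplace_back(Root, false); // first, enqueue Root's operands
  while (!Stack.empty()) {
    std::pair<const Node *, bool> E = Stack.back();
    Stack.pop_back();
    const Node *N = E.first;
    if (E.second) {
      // "Create" phase: every operand is already cached at this point.
      if (Done.count(N))
        continue;
      std::string S = N->Name;
      if (!N->Operands.empty()) {
        S += "(";
        for (std::size_t I = 0; I < N->Operands.size(); ++I) {
          if (I)
            S += ",";
          S += Done.at(N->Operands[I]);
        }
        S += ")";
      }
      Done.emplace(N, std::move(S));
    } else {
      // "Expand" phase: skip nodes already expanded or already cached.
      if (!Visited.insert(N).second || Done.count(N))
        continue;
      Stack.emplace_back(N, true);
      for (const Node *Op : N->Operands)
        Stack.emplace_back(Op, false);
    }
  }
  return Done.at(Root);
}
```

Because the pending work lives in a heap-allocated vector rather than on the call stack, the depth of the operand chain no longer bounds how large an input can be handled, which is the property the patch is after.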
		void ScalarEvolution::getOperandsToCreate(Value *V,
		SmallVectorImpl<Value *> &Ops) {
		if (!isSCEVable(V->getType()))
		nlopes: Poison, poison! :)
		fhahn (author): This is unrelated to this patch, I created D128586. I'll update things here once/if D128586 goes in.
		return;

		if (Instruction *I = dyn_cast<Instruction>(V)) {
		// Don't attempt to analyze instructions in blocks that aren't
		// reachable. Such instructions don't matter, and they aren't required
		// to obey basic rules for definitions dominating uses which this
		// analysis depends on.
		if (!DT.isReachableFromEntry(I->getParent()))
		return;
		} else if (ConstantInt *CI = dyn_cast<ConstantInt>(V))
		return;
		else if (GlobalAlias *GA = dyn_cast<GlobalAlias>(V)) {
		if (!GA->isInterposable())
		Ops.push_back(GA->getAliasee());
		return;
		} else if (!isa<ConstantExpr>(V))
		return;

		Operator *U = cast<Operator>(V);
		if (auto BO = MatchBinaryOp(U, DT)) {
		Ops.push_back(BO->LHS);
		Ops.push_back(BO->RHS);
		return;
		}

		switch (U->getOpcode()) {
		case Instruction::Trunc:
		case Instruction::ZExt:
		case Instruction::SExt:
		case Instruction::PtrToInt:
		Ops.push_back(U->getOperand(0));
		return;
		nikic: Add and mul currently have special code that tries to combine multiple sequential adds/mul into one getAddExpr/getMulExpr call. This is effectively lost here (or maybe worse, we end up doing both -- each add individually here, and then a multiple-add variant in getSCEV). I think if we're going to change this (which might well make sense), we should probably change that separately, and also in the createSCEV() code as well. I suspect that this might account for both some of the codegen changes and some of the compile-time impact.
		fhahn (author): Thanks, I updated `getOperandsToCreate` to also traverse mul/add chains. This indeed removed the `IndVarSimplify` changes and improved compile-time a bit. We could go even further and enqueue (Opcode, Operands) tuples to create directly, instead of going through `createSCEV`, but this would require a way to encode at least negations for the Add case.
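
To make the chain traversal discussed in this thread concrete, here is a rough sketch of collecting the operands of a left-leaning add or mul chain in one pass, in the same order the `do { ... } while (true)` loops in `createSCEV` push them. `Expr` and `collectChainOperands` are hypothetical names for illustration only; the real code works on `MatchBinaryOp` results and additionally handles `Sub` and no-wrap flags, which this sketch leaves out.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical miniature expression node: Op is '+' or '*' for binary
// operators and 0 for leaves.
struct Expr {
  char Op;
  const Expr *LHS;
  const Expr *RHS;
  std::string Name;
};

// Walks down the LHS spine of E while the operator stays the same and
// returns the operands that must exist before E itself can be created.
// Returns an empty vector for a leaf.
std::vector<const Expr *> collectChainOperands(const Expr *E) {
  std::vector<const Expr *> Ops;
  if (E->Op == 0)
    return Ops;
  const char ChainOp = E->Op;
  for (;;) {
    Ops.push_back(E->RHS);
    if (E->LHS->Op != ChainOp) {
      // The chain ends here: the LHS is a leaf or a different operator.
      Ops.push_back(E->LHS);
      break;
    }
    E = E->LHS;
  }
  return Ops;
}
```

For `(a + b) + c` this yields `c, b, a`, so a single multi-operand create call can be issued for the whole chain instead of one call per intermediate add.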

		case Instruction::BitCast:
		if (isSCEVable(U->getType()) && isSCEVable(U->getOperand(0)->getType())) {
		Ops.push_back(U->getOperand(0));
		return;
		}
		break;

		case Instruction::SDiv:
		case Instruction::SRem:
		Ops.push_back(U->getOperand(0));
		Ops.push_back(U->getOperand(1));
		return;

		case Instruction::GetElementPtr:
		if (cast<GEPOperator>(U)->getSourceElementType()->isSized()) {
		for (Value *Index : U->operands())
		Ops.push_back(Index);
		}
		return;

		case Instruction::IntToPtr:
		return;

		case Instruction::PHI: {
		auto *PN = cast<PHINode>(U);
		nikic: https://github.com/llvm/llvm-project/commit/327307d9d4da0045f762f75343fe66b0f10ecc63 ;)
		fhahn (author): Replace with an assert
		const Loop *L = LI.getLoopFor(PN->getParent());
		Value *BEValueV, *StartValueV;
		std::tie(StartValueV, BEValueV) = classifyPhiValues(PN, L);

		auto GetFoo = [&]() -> Value * {
		if (!BEValueV || !StartValueV)
		return nullptr;
		auto BO = MatchBinaryOp(BEValueV, DT);
		if (!BO)
		return nullptr;

		if (BO->Opcode != Instruction::Add)
		return nullptr;

		if (BO->LHS == PN && L->isLoopInvariant(BO->RHS))
		return BO->RHS;
		else if (BO->RHS == PN && L->isLoopInvariant(BO->LHS))
		return BO->LHS;
		return nullptr;
		};

		if (auto *Acc = GetFoo()) {
		Ops.push_back(StartValueV);
		Ops.push_back(Acc);
		return;
		}
		BranchInst *BI = getSelectPhiBr(PN, L, DT, LI);
		Value *Cond = nullptr, *LHS = nullptr, *RHS = nullptr;
		if (BI && BI->isConditional() &&
		BrPHIToSelect(DT, BI, PN, Cond, LHS, RHS)) {
		Ops.push_back(LHS);
		Ops.push_back(RHS);
		return;
		}
		return;
		}

		case Instruction::Select:
		// U can also be a select constant expr, which let fall through. Since
		// createNodeForSelect only works for a condition that is an `ICmpInst`, and
		// constant expressions cannot have instructions as operands, we'd have
		// returned getUnknown for a select constant expressions anyway.
		if (isa<Instruction>(U)) {
		for (Value *Inc : cast<Instruction>(U)->operands())
		Ops.push_back(Inc);
		return;
		}
		break;

		case Instruction::Call:
		case Instruction::Invoke:
		if (Value *RV = cast<CallBase>(U)->getReturnedArgOperand()) {
		Ops.push_back(RV);
		return;
		}

		if (auto *II = dyn_cast<IntrinsicInst>(U)) {
		switch (II->getIntrinsicID()) {
		case Intrinsic::abs:
		Ops.push_back(II->getArgOperand(0));
		nikic: I don't think this is strictly true due to the createNodeForSelectViaUMinSeq() code. And either way, I don't think we need to handle it this precisely here.
		fhahn (author): Thanks, I removed the check for Instruction to simplify things here.
		return;
		case Intrinsic::umax:
		case Intrinsic::umin:
		case Intrinsic::smax:
		case Intrinsic::smin:
		case Intrinsic::usub_sat:
		case Intrinsic::uadd_sat:
		Ops.push_back(II->getArgOperand(0));
		Ops.push_back(II->getArgOperand(1));
		return;
		case Intrinsic::start_loop_iterations:
		Ops.push_back(II->getArgOperand(0));
		return;
		default:
		break;
		}
		}
		break;
		}

		return;
		}

const SCEV *ScalarEvolution::createSCEV(Value *V) {		const SCEV *ScalarEvolution::createSCEV(Value *V) {
if (!isSCEVable(V->getType()))		if (!isSCEVable(V->getType()))
return getUnknown(V);		return getUnknown(V);

if (Instruction *I = dyn_cast<Instruction>(V)) {		if (Instruction *I = dyn_cast<Instruction>(V)) {
// Don't attempt to analyze instructions in blocks that aren't		// Don't attempt to analyze instructions in blocks that aren't
// reachable. Such instructions don't matter, and they aren't required		// reachable. Such instructions don't matter, and they aren't required
// to obey basic rules for definitions dominating uses which this		// to obey basic rules for definitions dominating uses which this
// analysis depends on.		// analysis depends on.
if (!DT.isReachableFromEntry(I->getParent()))		if (!DT.isReachableFromEntry(I->getParent()))
return getUnknown(UndefValue::get(V->getType()));		return getUnknown(UndefValue::get(V->getType()));
} else if (ConstantInt *CI = dyn_cast<ConstantInt>(V))		} else if (ConstantInt *CI = dyn_cast<ConstantInt>(V))
return getConstant(CI);		return getConstant(CI);
else if (GlobalAlias *GA = dyn_cast<GlobalAlias>(V))		else if (GlobalAlias *GA = dyn_cast<GlobalAlias>(V))
return GA->isInterposable() ? getUnknown(V) : getSCEV(GA->getAliasee());		return GA->isInterposable() ? getUnknown(V)
		: getExistingSCEV(GA->getAliasee());
else if (!isa<ConstantExpr>(V))		else if (!isa<ConstantExpr>(V))
return getUnknown(V);		return getUnknown(V);

Operator *U = cast<Operator>(V);		Operator *U = cast<Operator>(V);
if (auto BO = MatchBinaryOp(U, DT)) {		if (auto BO = MatchBinaryOp(U, DT)) {
switch (BO->Opcode) {		switch (BO->Opcode) {
case Instruction::Add: {		case Instruction::Add: {
// The simple thing to do would be to just call getSCEV on both operands		// The simple thing to do would be to just call getSCEV on both operands
[... 12 lines not shown ...]	case Instruction::Add: {

// If a NUW or NSW flag can be applied to the SCEV for this		// If a NUW or NSW flag can be applied to the SCEV for this
// addition, then compute the SCEV for this addition by itself		// addition, then compute the SCEV for this addition by itself
// with a separate call to getAddExpr. We need to do that		// with a separate call to getAddExpr. We need to do that
// instead of pushing the operands of the addition onto AddOps,		// instead of pushing the operands of the addition onto AddOps,
// since the flags are only known to apply to this particular		// since the flags are only known to apply to this particular
// addition - they may not apply to other additions that can be		// addition - they may not apply to other additions that can be
// formed with operands from AddOps.		// formed with operands from AddOps.
const SCEV *RHS = getSCEV(BO->RHS);		const SCEV *RHS = getExistingSCEV(BO->RHS);
SCEV::NoWrapFlags Flags = getNoWrapFlagsFromUB(BO->Op);		SCEV::NoWrapFlags Flags = getNoWrapFlagsFromUB(BO->Op);
if (Flags != SCEV::FlagAnyWrap) {		if (Flags != SCEV::FlagAnyWrap) {
const SCEV *LHS = getSCEV(BO->LHS);		const SCEV *LHS = getExistingSCEV(BO->LHS);
if (BO->Opcode == Instruction::Sub)		if (BO->Opcode == Instruction::Sub)
AddOps.push_back(getMinusSCEV(LHS, RHS, Flags));		AddOps.push_back(getMinusSCEV(LHS, RHS, Flags));
else		else
AddOps.push_back(getAddExpr(LHS, RHS, Flags));		AddOps.push_back(getAddExpr(LHS, RHS, Flags));
break;		break;
}		}
}		}

if (BO->Opcode == Instruction::Sub)		if (BO->Opcode == Instruction::Sub)
AddOps.push_back(getNegativeSCEV(getSCEV(BO->RHS)));		AddOps.push_back(getNegativeSCEV(getExistingSCEV(BO->RHS)));
else		else
AddOps.push_back(getSCEV(BO->RHS));		AddOps.push_back(getExistingSCEV(BO->RHS));

auto NewBO = MatchBinaryOp(BO->LHS, DT);		auto NewBO = MatchBinaryOp(BO->LHS, DT);
if (!NewBO || (NewBO->Opcode != Instruction::Add &&		if (!NewBO || (NewBO->Opcode != Instruction::Add &&
NewBO->Opcode != Instruction::Sub)) {		NewBO->Opcode != Instruction::Sub)) {
AddOps.push_back(getSCEV(BO->LHS));		AddOps.push_back(getExistingSCEV(BO->LHS));
break;		break;
}		}
BO = NewBO;		BO = NewBO;
} while (true);		} while (true);

return getAddExpr(AddOps);		return getAddExpr(AddOps);
}		}

case Instruction::Mul: {		case Instruction::Mul: {
SmallVector<const SCEV *, 4> MulOps;		SmallVector<const SCEV *, 4> MulOps;
do {		do {
if (BO->Op) {		if (BO->Op) {
if (auto *OpSCEV = getExistingSCEV(BO->Op)) {		if (auto *OpSCEV = getExistingSCEV(BO->Op)) {
MulOps.push_back(OpSCEV);		MulOps.push_back(OpSCEV);
break;		break;
}		}

SCEV::NoWrapFlags Flags = getNoWrapFlagsFromUB(BO->Op);		SCEV::NoWrapFlags Flags = getNoWrapFlagsFromUB(BO->Op);
if (Flags != SCEV::FlagAnyWrap) {		if (Flags != SCEV::FlagAnyWrap) {
MulOps.push_back(		MulOps.push_back(getMulExpr(getExistingSCEV(BO->LHS),
getMulExpr(getSCEV(BO->LHS), getSCEV(BO->RHS), Flags));		getExistingSCEV(BO->RHS), Flags));
break;		break;
}		}
}		}

MulOps.push_back(getSCEV(BO->RHS));		MulOps.push_back(getExistingSCEV(BO->RHS));
auto NewBO = MatchBinaryOp(BO->LHS, DT);		auto NewBO = MatchBinaryOp(BO->LHS, DT);
if (!NewBO || NewBO->Opcode != Instruction::Mul) {		if (!NewBO || NewBO->Opcode != Instruction::Mul) {
MulOps.push_back(getSCEV(BO->LHS));		MulOps.push_back(getExistingSCEV(BO->LHS));
break;		break;
}		}
BO = NewBO;		BO = NewBO;
} while (true);		} while (true);

return getMulExpr(MulOps);		return getMulExpr(MulOps);
}		}
case Instruction::UDiv:		case Instruction::UDiv:
return getUDivExpr(getSCEV(BO->LHS), getSCEV(BO->RHS));		return getUDivExpr(getExistingSCEV(BO->LHS), getExistingSCEV(BO->RHS));
case Instruction::URem:		case Instruction::URem:
return getURemExpr(getSCEV(BO->LHS), getSCEV(BO->RHS));		return getURemExpr(getExistingSCEV(BO->LHS), getExistingSCEV(BO->RHS));
case Instruction::Sub: {		case Instruction::Sub: {
SCEV::NoWrapFlags Flags = SCEV::FlagAnyWrap;		SCEV::NoWrapFlags Flags = SCEV::FlagAnyWrap;
if (BO->Op)		if (BO->Op)
Flags = getNoWrapFlagsFromUB(BO->Op);		Flags = getNoWrapFlagsFromUB(BO->Op);
return getMinusSCEV(getSCEV(BO->LHS), getSCEV(BO->RHS), Flags);		return getMinusSCEV(getExistingSCEV(BO->LHS), getExistingSCEV(BO->RHS),
		Flags);
}		}
case Instruction::And:		case Instruction::And:
// For an expression like x&255 that merely masks off the high bits,		// For an expression like x&255 that merely masks off the high bits,
// use zext(trunc(x)) as the SCEV expression.		// use zext(trunc(x)) as the SCEV expression.
if (ConstantInt *CI = dyn_cast<ConstantInt>(BO->RHS)) {		if (ConstantInt *CI = dyn_cast<ConstantInt>(BO->RHS)) {
if (CI->isZero())		if (CI->isZero())
return getSCEV(BO->RHS);		return getExistingSCEV(BO->RHS);
if (CI->isMinusOne())		if (CI->isMinusOne())
return getSCEV(BO->LHS);		return getExistingSCEV(BO->LHS);
const APInt &A = CI->getValue();		const APInt &A = CI->getValue();

// Instcombine's ShrinkDemandedConstant may strip bits out of		// Instcombine's ShrinkDemandedConstant may strip bits out of
// constants, obscuring what would otherwise be a low-bits mask.		// constants, obscuring what would otherwise be a low-bits mask.
// Use computeKnownBits to compute what ShrinkDemandedConstant		// Use computeKnownBits to compute what ShrinkDemandedConstant
// knew about to reconstruct a low-bits mask value.		// knew about to reconstruct a low-bits mask value.
unsigned LZ = A.countLeadingZeros();		unsigned LZ = A.countLeadingZeros();
unsigned TZ = A.countTrailingZeros();		unsigned TZ = A.countTrailingZeros();
unsigned BitWidth = A.getBitWidth();		unsigned BitWidth = A.getBitWidth();
KnownBits Known(BitWidth);		KnownBits Known(BitWidth);
computeKnownBits(BO->LHS, Known, getDataLayout(),		computeKnownBits(BO->LHS, Known, getDataLayout(),
0, &AC, nullptr, &DT);		0, &AC, nullptr, &DT);

APInt EffectiveMask =		APInt EffectiveMask =
APInt::getLowBitsSet(BitWidth, BitWidth - LZ - TZ).shl(TZ);		APInt::getLowBitsSet(BitWidth, BitWidth - LZ - TZ).shl(TZ);
if ((LZ != 0 || TZ != 0) && !((~A & ~Known.Zero) & EffectiveMask)) {		if ((LZ != 0 || TZ != 0) && !((~A & ~Known.Zero) & EffectiveMask)) {
const SCEV *MulCount = getConstant(APInt::getOneBitSet(BitWidth, TZ));		const SCEV *MulCount = getConstant(APInt::getOneBitSet(BitWidth, TZ));
const SCEV *LHS = getSCEV(BO->LHS);		const SCEV *LHS = getExistingSCEV(BO->LHS);
const SCEV *ShiftedLHS = nullptr;		const SCEV *ShiftedLHS = nullptr;
if (auto *LHSMul = dyn_cast<SCEVMulExpr>(LHS)) {		if (auto *LHSMul = dyn_cast<SCEVMulExpr>(LHS)) {
if (auto *OpC = dyn_cast<SCEVConstant>(LHSMul->getOperand(0))) {		if (auto *OpC = dyn_cast<SCEVConstant>(LHSMul->getOperand(0))) {
// For an expression like (x * 8) & 8, simplify the multiply.		// For an expression like (x * 8) & 8, simplify the multiply.
unsigned MulZeros = OpC->getAPInt().countTrailingZeros();		unsigned MulZeros = OpC->getAPInt().countTrailingZeros();
unsigned GCD = std::min(MulZeros, TZ);		unsigned GCD = std::min(MulZeros, TZ);
APInt DivAmt = APInt::getOneBitSet(BitWidth, TZ - GCD);		APInt DivAmt = APInt::getOneBitSet(BitWidth, TZ - GCD);
SmallVector<const SCEV*, 4> MulOps;		SmallVector<const SCEV*, 4> MulOps;
[... 18 lines not shown ...]	if (auto BO = MatchBinaryOp(U, DT)) {
case Instruction::Or:		case Instruction::Or:
// If the RHS of the Or is a constant, we may have something like:		// If the RHS of the Or is a constant, we may have something like:
// X*4+1 which got turned into X*4|1. Handle this as an Add so loop		// X*4+1 which got turned into X*4|1. Handle this as an Add so loop
// optimizations will transparently handle this case.		// optimizations will transparently handle this case.
//		//
// In order for this transformation to be safe, the LHS must be of the		// In order for this transformation to be safe, the LHS must be of the
// form X*(2^n) and the Or constant must be less than 2^n.		// form X*(2^n) and the Or constant must be less than 2^n.
if (ConstantInt *CI = dyn_cast<ConstantInt>(BO->RHS)) {		if (ConstantInt *CI = dyn_cast<ConstantInt>(BO->RHS)) {
const SCEV *LHS = getSCEV(BO->LHS);		const SCEV *LHS = getExistingSCEV(BO->LHS);
const APInt &CIVal = CI->getValue();		const APInt &CIVal = CI->getValue();
if (GetMinTrailingZeros(LHS) >=		if (GetMinTrailingZeros(LHS) >=
(CIVal.getBitWidth() - CIVal.countLeadingZeros())) {		(CIVal.getBitWidth() - CIVal.countLeadingZeros())) {
// Build a plain add SCEV.		// Build a plain add SCEV.
return getAddExpr(LHS, getSCEV(CI),		return getAddExpr(LHS, getExistingSCEV(CI),
(SCEV::NoWrapFlags)(SCEV::FlagNUW | SCEV::FlagNSW));		(SCEV::NoWrapFlags)(SCEV::FlagNUW | SCEV::FlagNSW));
}		}
}		}
break;		break;

case Instruction::Xor:		case Instruction::Xor:
if (ConstantInt *CI = dyn_cast<ConstantInt>(BO->RHS)) {		if (ConstantInt *CI = dyn_cast<ConstantInt>(BO->RHS)) {
// If the RHS of xor is -1, then this is a not operation.		// If the RHS of xor is -1, then this is a not operation.
if (CI->isMinusOne())		if (CI->isMinusOne())
return getNotSCEV(getSCEV(BO->LHS));		return getNotSCEV(getExistingSCEV(BO->LHS));

// Model xor(and(x, C), C) as and(~x, C), if C is a low-bits mask.		// Model xor(and(x, C), C) as and(~x, C), if C is a low-bits mask.
// This is a variant of the check for xor with -1, and it handles		// This is a variant of the check for xor with -1, and it handles
// the case where instcombine has trimmed non-demanded bits out		// the case where instcombine has trimmed non-demanded bits out
// of an xor with -1.		// of an xor with -1.
if (auto *LBO = dyn_cast<BinaryOperator>(BO->LHS))		if (auto *LBO = dyn_cast<BinaryOperator>(BO->LHS))
if (ConstantInt *LCI = dyn_cast<ConstantInt>(LBO->getOperand(1)))		if (ConstantInt *LCI = dyn_cast<ConstantInt>(LBO->getOperand(1)))
if (LBO->getOpcode() == Instruction::And &&		if (LBO->getOpcode() == Instruction::And &&
LCI->getValue() == CI->getValue())		LCI->getValue() == CI->getValue())
if (const SCEVZeroExtendExpr *Z =		if (const SCEVZeroExtendExpr *Z =
dyn_cast<SCEVZeroExtendExpr>(getSCEV(BO->LHS))) {		dyn_cast<SCEVZeroExtendExpr>(getExistingSCEV(BO->LHS))) {
Type *UTy = BO->LHS->getType();		Type *UTy = BO->LHS->getType();
const SCEV *Z0 = Z->getOperand();		const SCEV *Z0 = Z->getOperand();
Type *Z0Ty = Z0->getType();		Type *Z0Ty = Z0->getType();
unsigned Z0TySize = getTypeSizeInBits(Z0Ty);		unsigned Z0TySize = getTypeSizeInBits(Z0Ty);

// If C is a low-bits mask, the zero extend is serving to		// If C is a low-bits mask, the zero extend is serving to
// mask off the high bits. Complement the operand and		// mask off the high bits. Complement the operand and
// re-apply the zext.		// re-apply the zext.
[33 lines hidden]	case Instruction::Shl:
auto MulFlags = getNoWrapFlagsFromUB(BO->Op);		auto MulFlags = getNoWrapFlagsFromUB(BO->Op);
if ((MulFlags & SCEV::FlagNSW) &&		if ((MulFlags & SCEV::FlagNSW) &&
((MulFlags & SCEV::FlagNUW) || SA->getValue().ult(BitWidth - 1)))		((MulFlags & SCEV::FlagNUW) || SA->getValue().ult(BitWidth - 1)))
Flags = (SCEV::NoWrapFlags)(Flags | SCEV::FlagNSW);		Flags = (SCEV::NoWrapFlags)(Flags | SCEV::FlagNSW);
if (MulFlags & SCEV::FlagNUW)		if (MulFlags & SCEV::FlagNUW)
Flags = (SCEV::NoWrapFlags)(Flags | SCEV::FlagNUW);		Flags = (SCEV::NoWrapFlags)(Flags | SCEV::FlagNUW);
}		}

Constant *X = ConstantInt::get(		ConstantInt *X = ConstantInt::get(
getContext(), APInt::getOneBitSet(BitWidth, SA->getZExtValue()));		getContext(), APInt::getOneBitSet(BitWidth, SA->getZExtValue()));
return getMulExpr(getSCEV(BO->LHS), getSCEV(X), Flags);		return getMulExpr(getExistingSCEV(BO->LHS), getConstant(X), Flags);
}		}
break;		break;

case Instruction::AShr: {		case Instruction::AShr: {
// AShr X, C, where C is a constant.		// AShr X, C, where C is a constant.
ConstantInt *CI = dyn_cast<ConstantInt>(BO->RHS);		ConstantInt *CI = dyn_cast<ConstantInt>(BO->RHS);
if (!CI)		if (!CI)
break;		break;

Type *OuterTy = BO->LHS->getType();		Type *OuterTy = BO->LHS->getType();
uint64_t BitWidth = getTypeSizeInBits(OuterTy);		uint64_t BitWidth = getTypeSizeInBits(OuterTy);
// If the shift count is not less than the bitwidth, the result of		// If the shift count is not less than the bitwidth, the result of
// the shift is undefined. Don't try to analyze it, because the		// the shift is undefined. Don't try to analyze it, because the
// resolution chosen here may differ from the resolution chosen in		// resolution chosen here may differ from the resolution chosen in
// other parts of the compiler.		// other parts of the compiler.
if (CI->getValue().uge(BitWidth))		if (CI->getValue().uge(BitWidth))
break;		break;

if (CI->isZero())		if (CI->isZero())
return getSCEV(BO->LHS); // shift by zero --> noop		return getExistingSCEV(BO->LHS); // shift by zero --> noop

uint64_t AShrAmt = CI->getZExtValue();		uint64_t AShrAmt = CI->getZExtValue();
Type *TruncTy = IntegerType::get(getContext(), BitWidth - AShrAmt);		Type *TruncTy = IntegerType::get(getContext(), BitWidth - AShrAmt);

Operator *L = dyn_cast<Operator>(BO->LHS);		Operator *L = dyn_cast<Operator>(BO->LHS);
if (L && L->getOpcode() == Instruction::Shl) {		if (L && L->getOpcode() == Instruction::Shl) {
// X = Shl A, n		// X = Shl A, n
// Y = AShr X, m		// Y = AShr X, m
// Both n and m are constant.		// Both n and m are constant.

const SCEV *ShlOp0SCEV = getSCEV(L->getOperand(0));		const SCEV *ShlOp0SCEV = getExistingSCEV(L->getOperand(0));
if (L->getOperand(1) == BO->RHS)		if (L->getOperand(1) == BO->RHS)
// For a two-shift sext-inreg, i.e. n = m,		// For a two-shift sext-inreg, i.e. n = m,
// use sext(trunc(x)) as the SCEV expression.		// use sext(trunc(x)) as the SCEV expression.
return getSignExtendExpr(		return getSignExtendExpr(
getTruncateExpr(ShlOp0SCEV, TruncTy), OuterTy);		getTruncateExpr(ShlOp0SCEV, TruncTy), OuterTy);

ConstantInt *ShlAmtCI = dyn_cast<ConstantInt>(L->getOperand(1));		ConstantInt *ShlAmtCI = dyn_cast<ConstantInt>(L->getOperand(1));
if (ShlAmtCI && ShlAmtCI->getValue().ult(BitWidth)) {		if (ShlAmtCI && ShlAmtCI->getValue().ult(BitWidth)) {
[13 lines hidden]	case Instruction::AShr: {
}		}
break;		break;
}		}
}		}
}		}
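The Or and Shl cases above rest on two bit-level identities: `X*(2^n) | C` equals `X*(2^n) + C` when `C < 2^n`, and a left shift by a constant equals a multiply by the matching power of two. The following is a minimal standalone sketch on plain `uint32_t` values, not the SCEV API; the helper names `orIsAdd` and `shlIsMul` are hypothetical and exist only for illustration.

```cpp
#include <cstdint>

// Hypothetical helper: reports whether (x << n) | c equals (x << n) + c.
// When c < 2^n the low n bits of (x << n) are zero, so the Or cannot
// collide with any set bit and no carry can occur in the Add.
// Assumes n < 32.
bool orIsAdd(uint32_t x, unsigned n, uint32_t c) {
  uint32_t lhs = x << n;
  return (lhs | c) == lhs + c;
}

// Hypothetical helper: reports whether x << sa equals x * 2^sa in
// modular (wrapping) 32-bit arithmetic. Assumes sa < 32.
bool shlIsMul(uint32_t x, unsigned sa) {
  return (x << sa) == x * (uint32_t{1} << sa);
}
```

For instance, `orIsAdd(5, 3, 7)` holds because 7 < 2^3, while `orIsAdd(5, 3, 8)` does not, since bit 3 is already set in `5 << 3`. This is why the code checks the minimum number of trailing zeros of the LHS before building the add.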

switch (U->getOpcode()) {		switch (U->getOpcode()) {
case Instruction::Trunc:		case Instruction::Trunc:
return getTruncateExpr(getSCEV(U->getOperand(0)), U->getType());		return getTruncateExpr(getExistingSCEV(U->getOperand(0)), U->getType());

case Instruction::ZExt:		case Instruction::ZExt:
return getZeroExtendExpr(getSCEV(U->getOperand(0)), U->getType());		return getZeroExtendExpr(getExistingSCEV(U->getOperand(0)), U->getType());

case Instruction::SExt:		case Instruction::SExt:
if (auto BO = MatchBinaryOp(U->getOperand(0), DT)) {		if (auto BO = MatchBinaryOp(U->getOperand(0), DT)) {
// The NSW flag of a subtract does not always survive the conversion to		// The NSW flag of a subtract does not always survive the conversion to
// A + (-1)*B. By pushing sign extension onto its operands we are much		// A + (-1)*B. By pushing sign extension onto its operands we are much
// more likely to preserve NSW and allow later AddRec optimisations.		// more likely to preserve NSW and allow later AddRec optimisations.
//		//
// NOTE: This is effectively duplicating this logic from getSignExtend:		// NOTE: This is effectively duplicating this logic from getSignExtend:
// sext((A + B + ...)<nsw>) --> (sext(A) + sext(B) + ...)<nsw>		// sext((A + B + ...)<nsw>) --> (sext(A) + sext(B) + ...)<nsw>
// but by that point the NSW information has potentially been lost.		// but by that point the NSW information has potentially been lost.
if (BO->Opcode == Instruction::Sub && BO->IsNSW) {		if (BO->Opcode == Instruction::Sub && BO->IsNSW) {
Type *Ty = U->getType();		Type *Ty = U->getType();
auto *V1 = getSignExtendExpr(getSCEV(BO->LHS), Ty);		auto *V1 = getSignExtendExpr(getExistingSCEV(BO->LHS), Ty);
auto *V2 = getSignExtendExpr(getSCEV(BO->RHS), Ty);		auto *V2 = getSignExtendExpr(getExistingSCEV(BO->RHS), Ty);
return getMinusSCEV(V1, V2, SCEV::FlagNSW);		return getMinusSCEV(V1, V2, SCEV::FlagNSW);
}		}
}		}
return getSignExtendExpr(getSCEV(U->getOperand(0)), U->getType());		return getSignExtendExpr(getExistingSCEV(U->getOperand(0)), U->getType());

case Instruction::BitCast:		case Instruction::BitCast:
// BitCasts are no-op casts so we just eliminate the cast.		// BitCasts are no-op casts so we just eliminate the cast.
if (isSCEVable(U->getType()) && isSCEVable(U->getOperand(0)->getType()))		if (isSCEVable(U->getType()) && isSCEVable(U->getOperand(0)->getType()))
return getSCEV(U->getOperand(0));		return getExistingSCEV(U->getOperand(0));
break;		break;

case Instruction::PtrToInt: {		case Instruction::PtrToInt: {
// Pointer to integer cast is straight-forward, so do model it.		// Pointer to integer cast is straight-forward, so do model it.
const SCEV *Op = getSCEV(U->getOperand(0));		const SCEV *Op = getExistingSCEV(U->getOperand(0));
Type *DstIntTy = U->getType();		Type *DstIntTy = U->getType();
// But only if effective SCEV (integer) type is wide enough to represent		// But only if effective SCEV (integer) type is wide enough to represent
// all possible pointer values.		// all possible pointer values.
const SCEV *IntOp = getPtrToIntExpr(Op, DstIntTy);		const SCEV *IntOp = getPtrToIntExpr(Op, DstIntTy);
if (isa<SCEVCouldNotCompute>(IntOp))		if (isa<SCEVCouldNotCompute>(IntOp))
return getUnknown(V);		return getUnknown(V);
return IntOp;		return IntOp;
}		}
case Instruction::IntToPtr:		case Instruction::IntToPtr:
// Just don't deal with inttoptr casts.		// Just don't deal with inttoptr casts.
return getUnknown(V);		return getUnknown(V);

case Instruction::SDiv:		case Instruction::SDiv:
// If both operands are non-negative, this is just an udiv.		// If both operands are non-negative, this is just an udiv.
if (isKnownNonNegative(getSCEV(U->getOperand(0))) &&		if (isKnownNonNegative(getExistingSCEV(U->getOperand(0))) &&
isKnownNonNegative(getSCEV(U->getOperand(1))))		isKnownNonNegative(getExistingSCEV(U->getOperand(1))))
return getUDivExpr(getSCEV(U->getOperand(0)), getSCEV(U->getOperand(1)));		return getUDivExpr(getExistingSCEV(U->getOperand(0)),
		getExistingSCEV(U->getOperand(1)));
break;		break;

case Instruction::SRem:		case Instruction::SRem:
// If both operands are non-negative, this is just an urem.		// If both operands are non-negative, this is just an urem.
if (isKnownNonNegative(getSCEV(U->getOperand(0))) &&		if (isKnownNonNegative(getExistingSCEV(U->getOperand(0))) &&
isKnownNonNegative(getSCEV(U->getOperand(1))))		isKnownNonNegative(getExistingSCEV(U->getOperand(1))))
return getURemExpr(getSCEV(U->getOperand(0)), getSCEV(U->getOperand(1)));		return getURemExpr(getExistingSCEV(U->getOperand(0)),
		getExistingSCEV(U->getOperand(1)));
break;		break;

case Instruction::GetElementPtr:		case Instruction::GetElementPtr:
return createNodeForGEP(cast<GEPOperator>(U));		return createNodeForGEP(cast<GEPOperator>(U));

case Instruction::PHI:		case Instruction::PHI:
return createNodeForPHI(cast<PHINode>(U));		return createNodeForPHI(cast<PHINode>(U));

case Instruction::Select:		case Instruction::Select:
// U can also be a select constant expr, which let fall through. Since		// U can also be a select constant expr, which let fall through. Since
// createNodeForSelect only works for a condition that is an `ICmpInst`, and		// createNodeForSelect only works for a condition that is an `ICmpInst`, and
// constant expressions cannot have instructions as operands, we'd have		// constant expressions cannot have instructions as operands, we'd have
// returned getUnknown for a select constant expressions anyway.		// returned getUnknown for a select constant expressions anyway.
if (isa<Instruction>(U))		if (isa<Instruction>(U))
return createNodeForSelectOrPHI(cast<Instruction>(U), U->getOperand(0),		return createNodeForSelectOrPHI(cast<Instruction>(U), U->getOperand(0),
U->getOperand(1), U->getOperand(2));		U->getOperand(1), U->getOperand(2));
break;		break;

case Instruction::Call:		case Instruction::Call:
case Instruction::Invoke:		case Instruction::Invoke:
if (Value *RV = cast<CallBase>(U)->getReturnedArgOperand())		if (Value *RV = cast<CallBase>(U)->getReturnedArgOperand())
return getSCEV(RV);		return getExistingSCEV(RV);

if (auto *II = dyn_cast<IntrinsicInst>(U)) {		if (auto *II = dyn_cast<IntrinsicInst>(U)) {
switch (II->getIntrinsicID()) {		switch (II->getIntrinsicID()) {
case Intrinsic::abs:		case Intrinsic::abs:
return getAbsExpr(		return getAbsExpr(
getSCEV(II->getArgOperand(0)),		getExistingSCEV(II->getArgOperand(0)),
/*IsNSW=*/cast<ConstantInt>(II->getArgOperand(1))->isOne());		/*IsNSW=*/cast<ConstantInt>(II->getArgOperand(1))->isOne());
case Intrinsic::umax:		case Intrinsic::umax:
return getUMaxExpr(getSCEV(II->getArgOperand(0)),		return getUMaxExpr(getExistingSCEV(II->getArgOperand(0)),
getSCEV(II->getArgOperand(1)));		getExistingSCEV(II->getArgOperand(1)));
case Intrinsic::umin:		case Intrinsic::umin:
return getUMinExpr(getSCEV(II->getArgOperand(0)),		return getUMinExpr(getExistingSCEV(II->getArgOperand(0)),
getSCEV(II->getArgOperand(1)));		getExistingSCEV(II->getArgOperand(1)));
case Intrinsic::smax:		case Intrinsic::smax:
return getSMaxExpr(getSCEV(II->getArgOperand(0)),		return getSMaxExpr(getExistingSCEV(II->getArgOperand(0)),
getSCEV(II->getArgOperand(1)));		getExistingSCEV(II->getArgOperand(1)));
case Intrinsic::smin:		case Intrinsic::smin:
return getSMinExpr(getSCEV(II->getArgOperand(0)),		return getSMinExpr(getExistingSCEV(II->getArgOperand(0)),
getSCEV(II->getArgOperand(1)));		getExistingSCEV(II->getArgOperand(1)));
case Intrinsic::usub_sat: {		case Intrinsic::usub_sat: {
const SCEV *X = getSCEV(II->getArgOperand(0));		const SCEV *X = getExistingSCEV(II->getArgOperand(0));
const SCEV *Y = getSCEV(II->getArgOperand(1));		const SCEV *Y = getExistingSCEV(II->getArgOperand(1));
const SCEV *ClampedY = getUMinExpr(X, Y);		const SCEV *ClampedY = getUMinExpr(X, Y);
return getMinusSCEV(X, ClampedY, SCEV::FlagNUW);		return getMinusSCEV(X, ClampedY, SCEV::FlagNUW);
}		}
case Intrinsic::uadd_sat: {		case Intrinsic::uadd_sat: {
const SCEV *X = getSCEV(II->getArgOperand(0));		const SCEV *X = getExistingSCEV(II->getArgOperand(0));
const SCEV *Y = getSCEV(II->getArgOperand(1));		const SCEV *Y = getExistingSCEV(II->getArgOperand(1));
const SCEV *ClampedX = getUMinExpr(X, getNotSCEV(Y));		const SCEV *ClampedX = getUMinExpr(X, getNotSCEV(Y));
return getAddExpr(ClampedX, Y, SCEV::FlagNUW);		return getAddExpr(ClampedX, Y, SCEV::FlagNUW);
}		}
case Intrinsic::start_loop_iterations:		case Intrinsic::start_loop_iterations:
// A start_loop_iterations is just equivalent to the first operand for		// A start_loop_iterations is just equivalent to the first operand for
// SCEV purposes.		// SCEV purposes.
return getSCEV(II->getArgOperand(0));		return getExistingSCEV(II->getArgOperand(0));
default:		default:
break;		break;
}		}
}		}
break;		break;
}		}

return getUnknown(V);		return getUnknown(V);
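The `usub_sat` and `uadd_sat` cases above model saturating arithmetic as `X - umin(X, Y)` and `umin(X, ~Y) + Y`, which is what justifies the NUW flag on the resulting expressions. A small standalone sketch on 8-bit values can check both formulas exhaustively against a straightforward reference; all function names here are hypothetical and none of this is the SCEV API.

```cpp
#include <algorithm>
#include <cstdint>

// Model used in the diff for usub.sat: x - umin(x, y).
// The subtrahend never exceeds x, so the subtraction cannot wrap.
uint8_t usubSatModel(uint8_t x, uint8_t y) {
  uint8_t clampedY = std::min(x, y);
  return static_cast<uint8_t>(x - clampedY);
}

// Model used in the diff for uadd.sat: umin(x, ~y) + y.
// ~y is the largest value that can be added to y without wrapping.
uint8_t uaddSatModel(uint8_t x, uint8_t y) {
  uint8_t notY = static_cast<uint8_t>(~y);
  uint8_t clampedX = std::min(x, notY);
  return static_cast<uint8_t>(clampedX + y);
}

// Reference definitions of unsigned saturating subtract and add.
uint8_t usubSatRef(uint8_t x, uint8_t y) {
  return x > y ? static_cast<uint8_t>(x - y) : uint8_t{0};
}
uint8_t uaddSatRef(uint8_t x, uint8_t y) {
  unsigned sum = static_cast<unsigned>(x) + static_cast<unsigned>(y);
  return sum > 255u ? uint8_t{255} : static_cast<uint8_t>(sum);
}

// Compares model and reference over all 65536 input pairs.
bool modelsMatchReference() {
  for (unsigned x = 0; x < 256; ++x)
    for (unsigned y = 0; y < 256; ++y) {
      uint8_t a = static_cast<uint8_t>(x), b = static_cast<uint8_t>(y);
      if (usubSatModel(a, b) != usubSatRef(a, b) ||
          uaddSatModel(a, b) != uaddSatRef(a, b))
        return false;
    }
  return true;
}
```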
[339 lines hidden]	if (BEExact != getCouldNotCompute()) {
++NumTripCountsComputed;		++NumTripCountsComputed;
} else if (Result.getConstantMax(this) == getCouldNotCompute() &&		} else if (Result.getConstantMax(this) == getCouldNotCompute() &&
isa<PHINode>(L->getHeader()->begin())) {		isa<PHINode>(L->getHeader()->begin())) {
// Only count loops that have phi nodes as not being computable.		// Only count loops that have phi nodes as not being computable.
++NumTripCountsNotComputed;		++NumTripCountsNotComputed;
}		}
#endif // LLVM_ENABLE_STATS || !defined(NDEBUG)		#endif // LLVM_ENABLE_STATS || !defined(NDEBUG)

		SmallVector<Value *> Recompute;
// Now that we know more about the trip count for this loop, forget any		// Now that we know more about the trip count for this loop, forget any
// existing SCEV values for PHI nodes in this loop since they are only		// existing SCEV values for PHI nodes in this loop since they are only
// conservative estimates made without the benefit of trip count		// conservative estimates made without the benefit of trip count
// information. This invalidation is not necessary for correctness, and is		// information. This invalidation is not necessary for correctness, and is
// only done to produce more precise results.		// only done to produce more precise results.
if (Result.hasAnyInfo()) {		if (Result.hasAnyInfo()) {
// Invalidate any expression using an addrec in this loop.		// Invalidate any expression using an addrec in this loop.
SmallVector<const SCEV *, 8> ToForget;		SmallVector<const SCEV *, 8> ToForget;
auto LoopUsersIt = LoopUsers.find(L);		auto LoopUsersIt = LoopUsers.find(L);
if (LoopUsersIt != LoopUsers.end())		if (LoopUsersIt != LoopUsers.end()) {
append_range(ToForget, LoopUsersIt->second);		append_range(ToForget, LoopUsersIt->second);
		}
forgetMemoizedResults(ToForget);		forgetMemoizedResults(ToForget);

// Invalidate constant-evolved loop header phis.		// Invalidate constant-evolved loop header phis.
for (PHINode &PN : L->getHeader()->phis())		for (PHINode &PN : L->getHeader()->phis())
ConstantEvolutionLoopExitValue.erase(&PN);		ConstantEvolutionLoopExitValue.erase(&PN);
}		}

// Re-lookup the insert position, since the call to		// Re-lookup the insert position, since the call to
// computeBackedgeTakenCount above could result in a		// computeBackedgeTakenCount above could result in a
// recursive call to getBackedgeTakenInfo (on a different		// recursive call to getBackedgeTakenInfo (on a different
// loop), which would invalidate the iterator computed		// loop), which would invalidate the iterator computed
// earlier.		// earlier.
return BackedgeTakenCounts.find(L)->second = std::move(Result);		auto R = BackedgeTakenCounts.find(L);
		R->second = std::move(Result);
		for (Value *V : Recompute) {
		getSCEV(V);
		}

		return BackedgeTakenCounts.find(L)->second;
}		}

void ScalarEvolution::forgetAllLoops() {		void ScalarEvolution::forgetAllLoops() {
// This method is intended to forget all info about loops. It should		// This method is intended to forget all info about loops. It should
// invalidate caches as if the following happened:		// invalidate caches as if the following happened:
// - The trip counts of all loops have changed arbitrarily		// - The trip counts of all loops have changed arbitrarily
// - Every llvm::Value has been updated in place to produce a different		// - Every llvm::Value has been updated in place to produce a different
// result.		// result.
[6,446 lines hidden]
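The replacements of `getSCEV` with `getExistingSCEV` in this file only work if every operand has already been constructed by the time its user is visited, which is what the worklist described in the summary is for. The following is a rough standalone sketch of that general technique on a toy operand graph. It is not the patch's code: `Node`, `makeChain` and `buildIteratively` are hypothetical names, the computed "result" is just an operand depth standing in for a SCEV, and the graph is assumed to be acyclic (the summary notes that phis are still handled recursively).

```cpp
#include <algorithm>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Toy stand-in for an IR value: identified by its index, with operand indices.
struct Node {
  std::vector<int> Operands;
};

// Hypothetical helper: builds a chain where node i uses node i - 1.
std::vector<Node> makeChain(int N) {
  std::vector<Node> Graph(static_cast<std::size_t>(N));
  for (int I = 1; I < N; ++I)
    Graph[I].Operands.push_back(I - 1);
  return Graph;
}

// Computes a result for Root without recursion. Cache plays the role of
// the value-to-expression map; a node is only finished once all of its
// operands are already present in Cache.
int buildIteratively(const std::vector<Node> &Graph, int Root,
                     std::unordered_map<int, int> &Cache) {
  // Each entry: (node, whether its operands were already queued).
  std::vector<std::pair<int, bool>> Worklist;
  Worklist.push_back({Root, false});
  while (!Worklist.empty()) {
    std::pair<int, bool> Entry = Worklist.back();
    Worklist.pop_back();
    int V = Entry.first;
    if (Cache.count(V))
      continue; // Already built via another user.
    if (!Entry.second) {
      // First visit: revisit V after its missing operands are built.
      Worklist.push_back({V, true});
      for (int Op : Graph[V].Operands)
        if (!Cache.count(Op))
          Worklist.push_back({Op, false});
      continue;
    }
    // Second visit: every operand is cached, so only lookups are needed.
    int Depth = 0;
    for (int Op : Graph[V].Operands)
      Depth = std::max(Depth, Cache.at(Op));
    Cache[V] = Depth + 1;
  }
  return Cache.at(Root);
}
```

With a chain from `makeChain(100000)`, a naive recursive walk would nest 100000 calls deep, whereas the loop above keeps the pending work in a heap-allocated vector. That is the kind of stack overflow the summary refers to.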

llvm/lib/Transforms/Scalar/StraightLineStrengthReduce.cpp

[540 lines hidden]	for (unsigned I = 1, E = GEP->getNumOperands(); I != E; ++I, ++GTI) {
if (GTI.isStruct())		if (GTI.isStruct())
continue;		continue;

const SCEV *OrigIndexExpr = IndexExprs[I - 1];		const SCEV *OrigIndexExpr = IndexExprs[I - 1];
IndexExprs[I - 1] = SE->getZero(OrigIndexExpr->getType());		IndexExprs[I - 1] = SE->getZero(OrigIndexExpr->getType());

// The base of this candidate is GEP's base plus the offsets of all		// The base of this candidate is GEP's base plus the offsets of all
// indices except this current one.		// indices except this current one.
		SE->getSCEV(GEP);
const SCEV *BaseExpr = SE->getGEPExpr(cast<GEPOperator>(GEP), IndexExprs);		const SCEV *BaseExpr = SE->getGEPExpr(cast<GEPOperator>(GEP), IndexExprs);
Value *ArrayIdx = GEP->getOperand(I);		Value *ArrayIdx = GEP->getOperand(I);
uint64_t ElementSize = DL->getTypeAllocSize(GTI.getIndexedType());		uint64_t ElementSize = DL->getTypeAllocSize(GTI.getIndexedType());
if (ArrayIdx->getType()->getIntegerBitWidth() <=		if (ArrayIdx->getType()->getIntegerBitWidth() <=
DL->getPointerSizeInBits(GEP->getAddressSpace())) {		DL->getPointerSizeInBits(GEP->getAddressSpace())) {
// Skip factoring if ArrayIdx is wider than the pointer size, because		// Skip factoring if ArrayIdx is wider than the pointer size, because
// ArrayIdx is implicitly truncated to the pointer size.		// ArrayIdx is implicitly truncated to the pointer size.
factorArrayIndex(ArrayIdx, BaseExpr, ElementSize, GEP);		factorArrayIndex(ArrayIdx, BaseExpr, ElementSize, GEP);
[223 lines hidden]