Page MenuHomePhabricator

[IndVarSimplify] Do not use SCEV expander for IVCount in LFTR when possible.
Needs RevisionPublic

Authored by aus_intel on Wed, Aug 28, 9:25 AM.

Details

Reviewers
sanjoy
reames
Summary

SCEV analysis cannot properly cache instruction with poison flags
(for example, add nsw outside of loop will not be reused by expander).
This can lead to generating of additional instructions by SCEV expander.

Example IR:

...
%maxval = add nuw nsw i32 %a1, %a2
...

for.body:

...
%cmp22 = icmp ult i32 %ivadd, %maxval
br i1 %cmp22, label %for.body, label %for.end
...

SCEV expander will generate copy of %maxval in preheader but without
nuw/nsw flags. This can be avoided by explicit check that iv count
value gives the same SCEV expression as calculated by LFTR.

Diff Detail

Event Timeline

aus_intel created this revision.Wed, Aug 28, 9:25 AM

Can you run ./update_test_checks.py on llvm/test/Transforms/IndVarSimplify/add_nsw.ll and llvm/test/Transforms/IndVarSimplify/udiv.ll?

cc @reames since he is more familiar with SCEV and can comment this patch.

llvm/lib/Transforms/Scalar/IndVarSimplify.cpp
2399

nits:
auto *BI

auto *Cmp

lebedev.ri added subscribers: llvm-commits, lebedev.ri.

I don't have a stance here, but i think cache/flags issue is a bug regardless (is there a bug # ?),
also, does/should some other pass cleanup the IR without this patch?

  1. There is https://reviews.llvm.org/D41578 which disabled caching of SCEV if it lost flags which, in my understanding, causes this problem. There is also analysis that SCEV should generate poison but it doesn't handle loop invariants (and I assume that changing of it can trigger longer compilation time). With all of these, I decided that it is better to just avoid use of expander if it is possible.
  1. I noticed that early CSE can handle simple cases like this (simple add: c++ - https://godbolt.org/z/yBcf8L, opt with indvars and cse - https://godbolt.org/z/_s-Z9P). However, I have an example when CSE (or anyone else) does not remove extra instructions (c++ - https://godbolt.org/z/vc2Wf5, opt - https://godbolt.org/z/so9zTE). So it seems that there is no cleanup for this code when -O2 is passed. Moreover, if cleanup is done it loses flags from instructions too (is this ok?).
  1. Moreover, now I discovered that there is a problem in different pass: loop strength reduction. It uses SCEV expander to rewrite values and it again can generate extra instructions even when nothing is changed inside loop (https://godbolt.org/z/gYx2Ke). Initially I didn't noticed that because some backends do not use codegen passes that added with TargetPassConfig so this pass was not added by default (opt do not run LSR too). Seems, that I should file a bug on this.
aus_intel updated this revision to Diff 217835.Thu, Aug 29, 4:49 AM

Updated patch. Added autos instead of pointer types and updates tests with script.

reames requested changes to this revision.Fri, Aug 30, 9:39 AM

See inline review comment

If you decide to revise, please do the following:

  1. Land your tests in a precommit, then rebase
  2. Land the auto-generation of the one effected test, then rebase
  3. Provide a clear comment (or pointer to particular test is fine) where SCEVExpander goes wrong. I might be able to make better suggestions on approach.
llvm/lib/Transforms/Scalar/IndVarSimplify.cpp
2403

I don't believe this is sound.

The problem is that SCEV can map multiple expressions to the same SCEV, but that does not imply that all of them are equivalent.

Given:
%v1 = add nsw %a, %b -> SCEV A
%v2 = add nuw %a, %b -> SCEV B
(i.e pretend SCEV dropped *all* flags in it's construction)
This would replace all uses of %v2 with %v1 which is incorrect.

This revision now requires changes to proceed.Fri, Aug 30, 9:39 AM
aus_intel marked an inline comment as done.Wed, Sep 11, 3:21 AM

Sorry for delayed answer.

  1. How can I land tests in precommit? I tested my patch locally (lit tests and LNT on x86) and everything seems to be ok. I would be glad to know if there is possibility for more testing.
  2. About auto-generation of affected test. Is it about these ll tests that I attached to patch? If so, I have already used script to regenerate them.
  3. Probably, I left a bad comment pointing to SCEV expander. Actually, it is SCEV analysis issue. It loses flags from add instruction and then SCEV expander cannot find any cached value to use it which results in extra instruction without flags. This new instruction is for logically the same operation (compute loop limit) so the old value could be reused. There is a function getNoWrapFlagsFromUB that tries to set flags for computed SCEV. It calls isSCEVExprNeverPoison which does not handle loop invariants and this leads to losing of flags. However, if I fix this function (I tried this locally), the problem still presents. The issue is that there are some transformations of SCEV obtained from loop limit and these can lose flags again.

How can I land tests in precommit?

You need to request a write access to LLVM.