This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Delay foldICmpUsingKnownBits until simple transforms are done
ClosedPublic

Authored by mkazantsev on Jun 25 2018, 11:00 PM.

Details

Summary

This patch changes order of transform in InstCombineCompares to avoid
performing transforms based on ranges which produce complex bit arithmetics
before more simple things (like folding with constants) are done. See PR37636
for the motivating example.

Diff Detail

Repository
rL LLVM

Event Timeline

mkazantsev created this revision.Jun 25 2018, 11:00 PM
lebedev.ri added inline comments.Jun 25 2018, 11:20 PM
test/Transforms/InstCombine/icmp_sdiv_with_and_without_range.ll
4–5 ↗(On Diff #152835)

Could you add comment what this tests, and commit as-is [and rebase this]?
Also, i'm not sure these two lines are needed.

mkazantsev marked an inline comment as done.
lebedev.ri added inline comments.Jun 25 2018, 11:57 PM
test/Transforms/InstCombine/icmp-shl-nsw.ll
72 ↗(On Diff #152844)

Stale comments. (and elsewhere in this file)

This only moves the foldICmpUsingKnownBits() call later on,
but in PR37636 @spatel suggests it *might* expose more problems.
Does this expose any problems?

This only moves the foldICmpUsingKnownBits() call later on,
but in PR37636 @spatel suggests it *might* expose more problems.
Does this expose any problems?

From my point of view, no. We use different predicates in some cases, I am not convinced that either of them is better (at least less/greater predicates are more useful for SCEV analysis than ne/eq).
I'll fix the comments.

mkazantsev marked an inline comment as done.

This only moves the foldICmpUsingKnownBits() call later on,
but in PR37636 @spatel suggests it *might* expose more problems.
Does this expose any problems?

From my point of view, no. We use different predicates in some cases, I am not convinced that either of them is better (at least less/greater predicates are more useful for SCEV analysis than ne/eq).
I'll fix the comments.

The test diffs here show what I was concerned about. It's not clear to me if/when the predicate/constant changes in these compares are wins or not. It's probably in the noise, but we should check for regressions on test-suite or some other benchmark?

This only moves the foldICmpUsingKnownBits() call later on,
but in PR37636 @spatel suggests it *might* expose more problems.
Does this expose any problems?

From my point of view, no. We use different predicates in some cases, I am not convinced that either of them is better (at least less/greater predicates are more useful for SCEV analysis than ne/eq).
I'll fix the comments.

The test diffs here show what I was concerned about. It's not clear to me if/when the predicate/constant changes in these compares are wins or not. It's probably in the noise, but we should check for regressions on test-suite or some other benchmark?

Also, are there more diffs if you sink foldICmpUsingKnownBits even further down? If there aren't, we should sink it all the way to the end because it's expensive in compile-time. Either way, please add a code comment, so we know why we positioned it wherever it is.

This only moves the foldICmpUsingKnownBits() call later on,
but in PR37636 @spatel suggests it *might* expose more problems.
Does this expose any problems?

This only moves the foldICmpUsingKnownBits() call later on,
but in PR37636 @spatel suggests it *might* expose more problems.
Does this expose any problems?

From my point of view, no. We use different predicates in some cases, I am not convinced that either of them is better (at least less/greater predicates are more useful for SCEV analysis than ne/eq).
I'll fix the comments.

The test diffs here show what I was concerned about. It's not clear to me if/when the predicate/constant changes in these compares are wins or not. It's probably in the noise, but we should check for regressions on test-suite or some other benchmark?

I think that these changes are practically either unimportant or positive. SCEV has reasons to prefer lt/gt comparisons over ne/eq because in certain cases it can derive more facts from it, however I'm not sure whether it has real profit anywhere or not. I can test it on our Java benchmarks, but don't have environment to run C++ clang tests properly, so I'd appreciate if you could help with it.

This only moves the foldICmpUsingKnownBits() call later on,
but in PR37636 @spatel suggests it *might* expose more problems.
Does this expose any problems?

From my point of view, no. We use different predicates in some cases, I am not convinced that either of them is better (at least less/greater predicates are more useful for SCEV analysis than ne/eq).
I'll fix the comments.

The test diffs here show what I was concerned about. It's not clear to me if/when the predicate/constant changes in these compares are wins or not. It's probably in the noise, but we should check for regressions on test-suite or some other benchmark?

Also, are there more diffs if you sink foldICmpUsingKnownBits even further down? If there aren't, we should sink it all the way to the end because it's expensive in compile-time. Either way, please add a code comment, so we know why we positioned it wherever it is.

Interesting idea, I need to check it.

Sinking it to the end of method did not introduce any new changes. Also added a comment.

lebedev.ri added inline comments.Jun 27 2018, 2:00 AM
lib/Transforms/InstCombine/InstCombineCompares.cpp
4710–4711 ↗(On Diff #153000)

Please check that there is some test that relies on this, and does not get folded by previous cases.

mkazantsev added inline comments.Jun 27 2018, 7:55 PM
lib/Transforms/InstCombine/InstCombineCompares.cpp
4710–4711 ↗(On Diff #153000)

If we completely comment away this code, more tests fail. So yes, it actually does some useful work.

I think that these changes are practically either unimportant or positive. SCEV has reasons to prefer lt/gt comparisons over ne/eq because in certain cases it can derive more facts from it, however I'm not sure whether it has real profit anywhere or not. I can test it on our Java benchmarks, but don't have environment to run C++ clang tests properly, so I'd appreciate if you could help with it.

Ok - I need to get a test-suite machine set up again. I'll try to have some data by tomorrow.

I think that these changes are practically either unimportant or positive. SCEV has reasons to prefer lt/gt comparisons over ne/eq because in certain cases it can derive more facts from it, however I'm not sure whether it has real profit anywhere or not. I can test it on our Java benchmarks, but don't have environment to run C++ clang tests properly, so I'd appreciate if you could help with it.

Ok - I need to get a test-suite machine set up again. I'll try to have some data by tomorrow.

I don't see anything above the noise with this patch applied, so we should be ok here.

lib/Transforms/InstCombine/InstCombineCompares.cpp
4707–4709 ↗(On Diff #153000)

I'd state this more like:
"This may be expensive in compile-time, and transforms based on known bits can make further analysis more difficult, so we use it as the last resort..."

test/Transforms/InstCombine/icmp_sdiv_with_and_without_range.ll
4–5 ↗(On Diff #152835)

Please commit the test file to trunk with the baseline CHECK lines, and then rebase this patch, so we have a record of the bug (also add a comment referencing/linking PR37636).

mkazantsev marked 2 inline comments as done.

Rebased on top of pre-merged test, fixed comments.

lebedev.ri accepted this revision.Jul 2 2018, 4:44 AM

I think that these changes are practically either unimportant or positive. SCEV has reasons to prefer lt/gt comparisons over ne/eq because in certain cases it can derive more facts from it, however I'm not sure whether it has real profit anywhere or not. I can test it on our Java benchmarks, but don't have environment to run C++ clang tests properly, so I'd appreciate if you could help with it.

Ok - I need to get a test-suite machine set up again. I'll try to have some data by tomorrow.

I don't see anything above the noise with this patch applied, so we should be ok here.

LGTM then.

This revision is now accepted and ready to land.Jul 2 2018, 4:44 AM
spatel added inline comments.Jul 2 2018, 5:38 AM
test/Transforms/InstCombine/icmp_sdiv_with_and_without_range.ll
5 ↗(On Diff #153675)

This comment isn't correct - as the test shows. Instcombine should do better with the range metadata, and I think that's true now (at least in this case).

spatel accepted this revision.Jul 2 2018, 5:39 AM

LGTM too, other than the nit about the test comment.

This revision was automatically updated to reflect the committed changes.