Page MenuHomePhabricator

[InstCombine] Optimize redundant 'signed truncation check pattern'.
ClosedPublic

Authored by lebedev.ri on Aug 8 2018, 10:46 AM.

Details

Summary

This comes with Implicit Conversion Sanitizer - integer sign change (D50250):

signed char test(unsigned int x) { return x; }

clang++ -fsanitize=implicit-conversion -S -emit-llvm -o - /tmp/test.cpp -O3

General pattern:

X & Y

Where Y is checking that all the high bits (covered by a mask 4294967168)
are uniform, i.e. %arg & 4294967168 can be either 4294967168 or 0
Pattern can be one of:

%t = add        i32 %arg,    128
%r = icmp   ult i32 %t,      256

Or

%t0 = shl       i32 %arg,    24
%t1 = ashr      i32 %t0,     24
%r  = icmp  eq  i32 %t1,     %arg

Or

%t0 = trunc     i32 %arg  to i8
%t1 = sext      i8  %t0   to i32
%r  = icmp  eq  i32 %t1,     %arg

This pattern is a signed truncation check.

And X is checking that some bit in that same mask is zero.
I.e. can be one of:

%r = icmp sgt i32   %arg,    -1

Or

%t = and      i32   %arg,    2147483648
%r = icmp eq  i32   %t,      0

Since we are checking that all the bits in that mask are the same,
and a particular bit is zero, what we are really checking is that all the
masked bits are zero.
So this should be transformed to:

%r = icmp ult i32 %arg, 128

The transform itself ended up being rather horrible, even though i omitted some cases.
Surely there is some infrastructure that can help clean this up that i missed?

https://rise4fun.com/Alive/3Ou

Diff Detail

Repository
rL LLVM

Event Timeline

lebedev.ri added inline comments.
lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
1404 ↗(On Diff #159752)

@RKSimon: if i don't set a name here, the utils/update_test_checks.py goes mad,
and produces strange (wrong) check-lines:

$ ninja check-llvm-transforms-instcombine -v
[0/1] cd /build/llvm-build-GCC-release/test && /usr/bin/python2.7 /build/llvm-build-GCC-release/./bin/llvm-lit -sv /build/llvm/test/Transforms/InstCombine
FAIL: LLVM :: Transforms/InstCombine/signed-truncation-check.ll (696 of 824)
******************** TEST 'LLVM :: Transforms/InstCombine/signed-truncation-check.ll' FAILED ********************
Script:
--
: 'RUN: at line 2';   /build/llvm-build-GCC-release/bin/opt < /build/llvm/test/Transforms/InstCombine/signed-truncation-check.ll -instcombine -S | /build/llvm-build-GCC-release/bin/FileCheck /build/llvm/test/Transforms/InstCombine/signed-truncation-check.ll
--
Exit Code: 1

Command Output (stderr):
--
/build/llvm/test/Transforms/InstCombine/signed-truncation-check.ll:281:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: [[TMP1:%.*]] = icmp ult i16 [[TMP1]], 128
              ^
<stdin>:114:2: note: scanning from here
 %1 = icmp ult i16 %tmp1, 128
 ^

--

********************
Testing Time: 5.48s
********************
Failing Tests (1):
    LLVM :: Transforms/InstCombine/signed-truncation-check.ll

  Expected Passes    : 808
  Unsupported Tests  : 15
  Unexpected Failures: 1
FAILED: test/CMakeFiles/check-llvm-transforms-instcombine 
cd /build/llvm-build-GCC-release/test && /usr/bin/python2.7 /build/llvm-build-GCC-release/./bin/llvm-lit -sv /build/llvm/test/Transforms/InstCombine
ninja: build stopped: subcommand failed.
spatel added inline comments.Aug 9 2018, 7:12 AM
lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
1404 ↗(On Diff #159752)
Independent of this patch: the script uses "TMP###" as the substitute for unnamed (%###) values. So if a test already has values named "%tmp###", things will likely go wrong. The quick fix is to replace all of the "%tmp###" names in these tests with "%t###" (or anything other than "%tmp###").
lebedev.ri edited the summary of this revision. (Show Details)Aug 13 2018, 1:41 AM

Ping.

spatel added inline comments.Aug 13 2018, 8:08 AM
lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
1307–1308 ↗(On Diff #159752)

It's better to include the comment from the test file directly (or at least a shorter description) instead of asking the reader to jump to another file for documentation.

1311–1316 ↗(On Diff #159752)

Can this be moved within InstCombiner::foldAndOfICmps() to eliminate these preliminary checks?

Any chance for shared functionality with foldAndOrOfICmpsOfAndWithPow2() or simplifyRangeCheck()?

Rebased ontop of committed tests, NFC.

lebedev.ri marked an inline comment as done.

Sink the call down into InstCombiner::foldAndOfICmps().

lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
1311–1316 ↗(On Diff #159752)

I don't think i should squeeze it into InstCombiner::foldAndOrOfICmpsOfAndWithPow2(),
as that looks for a *very* specific, different pattern.

InstCombiner::simplifyRangeCheck() also does not look too promising,
we will then only handle the extreme cases (which is, admittedly, probably
sufficient for my needs right now), but not the general case :/

None of those handle this case because they mostly require for the both icmp's
to operate on the same value, and don't understand what

%t = add        i32 %arg,    128
%r = icmp   ult i32 %t,      256

(and similar patterns) mean.

So i have sunk it down into InstCombiner::foldAndOfICmps() for now..

LGTM.

About the autogen script problem - we could hack that to use something less likely than 'TMP' as its regex string (eg, 'AUTOGEN'), but that will result in significant test file churn if someone updates an existing file.

lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
1311–1316 ↗(On Diff #159752)

Ok, thanks for checking. I think we've noted this before - the whole thing is a complicated mess, and there should be some better way to generalize it, but it's not clear how to do it.

Thank you for taking a look!

LGTM.

(i'm not sure if you've meant to accept this, because you have only posted the comment, not clicked 'accept')

To be noted, i still don't quite like this code, but i'm not sure how to improve this.

About the autogen script problem - we could hack that to use something less likely than 'TMP' as its regex string (eg, 'AUTOGEN'), but that will result in significant test file churn if someone updates an existing file.

Yeah... I think some note in the docs would be good-enough, too.

lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
1311–1316 ↗(On Diff #159752)

Auto-generated (souper) tablegen-driven transforms would be The Solution :)
Else i suspect we'll be hitting missing corner cases and patterns for eternities to come.

spatel accepted this revision.Aug 13 2018, 1:16 PM

Forgot to click 'Accept' here in Phab.

This revision is now accepted and ready to land.Aug 13 2018, 1:16 PM

Forgot to click 'Accept' here in Phab.

Hmm, okay, thank you for the review!

This revision was automatically updated to reflect the committed changes.
lebedev.ri reopened this revision.Aug 13 2018, 2:02 PM

Temporarily reverted.
That assert at the beginning is to greedy.

This revision is now accepted and ready to land.Aug 13 2018, 2:02 PM

NFC, revert, just so the diffs are sane.

Test + less eager assert.
We can indeed do that fold: https://rise4fun.com/Alive/0lz

This revision was automatically updated to reflect the committed changes.