This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Lower lower saturate to 0 and lower saturate to -1 using bit-operations
ClosedPublic

Authored by thebolt on Jan 26 2018, 4:36 AM.

Details

Summary

Expressions of the form x < 0 ? 0 : x; and x < -1 ? -1 : x can be lowered using bit-operations instead of branching or conditional moves

In thumb-mode this results in a two-instruction sequence, a shift followed by a bic or or while in ARM/thumb2 mode that has flexible second operand the shift can be folded into a single bic/or instructions. In most cases this results in smaller code and possibly less branches, and in no case larger than before.

Diff Detail

Repository
rL LLVM

Event Timeline

thebolt created this revision.Jan 26 2018, 4:36 AM
pbarrio requested changes to this revision.Jan 26 2018, 12:39 PM

The logic & testing LGTM. Just a couple of coding standard nits.

lib/Target/ARM/ARMISelLowering.cpp
4386 ↗(On Diff #131572)

Variable names should start with uppercase (see https://llvm.org/docs/CodingStandards.html, "Name Types, Functions, Variables, and Enumerators Properly")

4388 ↗(On Diff #131572)

Same here: upper case + camel case, i.e. NotShiftV

This revision now requires changes to proceed.Jan 26 2018, 12:39 PM
thebolt updated this revision to Diff 131746.Jan 28 2018, 10:41 PM

Updated variable names to conform with coding style as per comments

rogfer01 added inline comments.Jan 29 2018, 1:55 AM
lib/Target/ARM/ARMISelLowering.cpp
4387 ↗(On Diff #131746)

Given that you don't use the value of KVal itself except to check if it is 0 or -1, you can simplify this to isNullConstant(*K) and isAllOnesConstant(*K).

test/CodeGen/ARM/atomic-op.ll
132 ↗(On Diff #131746)

If you don't want a cmp now you may want to consider adding a CHECK-NOT: cmp

thebolt updated this revision to Diff 131756.Jan 29 2018, 2:26 AM
thebolt marked 2 inline comments as done.

Incorporated changes based on comment from Roger Ferrer regarding unnecessary KVal variable.

Added some more negative tests to make sure no comparisons are generated when not expected.

pbarrio accepted this revision.Jan 29 2018, 5:05 AM

LGTM now. In other circumstances, I would wait for someone more experienced than me, but this is a small peephole optimization. I've also tested the patch and it works correctly, so I think it has very little risk.

This revision is now accepted and ready to land.Jan 29 2018, 5:05 AM

I don't have any commit rights so would be happy if someone else picks it up

I'll do it.

This revision was automatically updated to reflect the committed changes.

Committed now. Thanks for the patch @thebolt!

I've reverted your commit in r323929.
It miscompiles the following program on armv7-linux-android:
#include <stdio.h>

typedef unsigned long long u64;
typedef signed long long s64;
s64 f(u64 x) {

return (x > (u64)0x7fffffffffffffffULL) ? (s64)1 : -(s64)x;

}

int main() {

volatile u64 x = 42;
s64 y = f(x);
printf("%lld\n", y);

}

Without this patch, the output is -42.
With your patch, the output is 4294967254.

It also breaks the android bot:

http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-android/builds/7279
efriedma added inline comments.Jan 31 2018, 3:27 PM
llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp
4421

The use of the constant "31" here is a bit suspicious, given you don't check the type of "VT".

4423

getAllOnesConstant?

Sorry, I missed the sanitizer failure in yesterday's buildbot message and thought it wasn't related to this patch. I will leave some time for @thebolt to fix it, otherwise maybe I can take a look.

Thanks for the revert.

I'm looking into it right now.

I found one direct error (which caused the armv7-linux-android misscompile) and also "pessimised" it a bit to only do the optimization on i32 variables. This still covers the majority of cases as clang for 16/8 bit variables creates signextend->select->truncate sequences, but avoids a few corner cases.

Will try to post a new patch during the day, but I don't know how to runtime test it or do any more testing than "ninja check" (I runtime-test it on our Cortex-M4 ARM based boards).

yroux added a subscriber: yroux.Feb 1 2018, 4:25 AM

I can test it in the same condition as our buildbots if you want... will take a bit of time though

thebolt updated this revision to Diff 132542.Feb 2 2018, 12:30 AM

This update hopefully solves the issues found by @eugenis

It adds a missing check for correctly formed expression, and limits the optimization to 32-bit values (which in practice is a majority of the cases you want to cover). Factored out the code that checks for the expression, in similar way to already existing isSaturatingExpression.

@yroux Can you please test and see that this indeed solves all the problems?

thebolt reopened this revision.Feb 2 2018, 12:31 AM
This revision is now accepted and ready to land.Feb 2 2018, 12:31 AM
thebolt requested review of this revision.Feb 2 2018, 12:31 AM
yroux added a comment.Feb 2 2018, 1:07 AM

@thebolt validation is on-going

yroux added a comment.Feb 2 2018, 1:57 AM

stage2 build is not completed yet, but there is no more issue in tblgen with this patch.

LGTM (just two small typos)

yroux added inline comments.Feb 2 2018, 1:59 AM
lib/Target/ARM/ARMISelLowering.cpp
4393 ↗(On Diff #132542)

typo: that's up to the caller

4458 ↗(On Diff #132542)

I'd say: "which have flexible..."

thebolt marked 2 inline comments as done.Feb 2 2018, 2:20 AM

I have corrected the typos in my code, I will however await further functional testing before submitting a new (hopefully final) version

@yroux did the entire build and verification pass? If so I will tomorrow (when I have my work computer) upload a new version of the diff with the comment fixes, and then hope it can be approved/commited without breaking anything this time.

yroux added a comment.Feb 6 2018, 12:22 PM

@thebolt yes 2 stage build and regression testsuite passed on armv7 builder, thanks for the fix.

thebolt updated this revision to Diff 133185.Feb 7 2018, 2:39 AM

Updated with fixes for the misspelled comments.
Also added two test-cases to catch invalid transformations (such as the ones causing problems earlier).

chrib added a subscriber: chrib.Feb 8 2018, 12:06 AM

Ping? As this revision now has been tested to not break in the same way the first (accepted) version did, could someone please re-review? If the current subscribers are not right, who should I add as reviewer?

rogfer01 added inline comments.Feb 19 2018, 1:26 AM
lib/Target/ARM/ARMISelLowering.cpp
4402–4404 ↗(On Diff #133185)

For consistency with the rest of the file I think you want to remove the braces here.

4417 ↗(On Diff #133185)

Ditto.

4427 ↗(On Diff #133185)

Ditto

4463 ↗(On Diff #133185)

I'm not convinced of the implicit dependency between the function returning false when Op.getValueType() != MVT::i32 and the 32 bit constant created here.

I wonder if it would be better (just suggesting! I'm happy to be convinced otherwise :)) if you want to do the check here (instead of inside the function).

if (VT == MVT::i32 && isLowerSaturatingConditional(Op, SatValue, LowerSatConstant)) {

This way the dependency of types is more obvious. Perhaps you foresee that we will call isLowerSaturatingConditional in other places and then the we would have to repeat the check?

4468 ↗(On Diff #133185)

Braces.

thebolt marked 5 inline comments as done.Feb 19 2018, 1:37 AM
thebolt added inline comments.
lib/Target/ARM/ARMISelLowering.cpp
4463 ↗(On Diff #133185)

I buy your argument. Technically there is not any 32-bit constant created (the constant will as far as I can tell have same width as VT) however the transformation from select_cc->bitoperation is only done on 32-bit operations.
I have moved out the condition check on the type as you suggested (and removed it from the function that checks if it is a lower saturate, making it more general)

thebolt updated this revision to Diff 134875.Feb 19 2018, 2:59 AM
thebolt marked an inline comment as done.

Updated coding style as per feedback from @rogfer01

rogfer01 accepted this revision.Feb 19 2018, 3:46 AM

Thanks a lot Marten.

This looks good to me now.

This revision is now accepted and ready to land.Feb 19 2018, 3:46 AM

As before, I don't have any commit access so would appreciate if someone else picks it up.

If nobody does it before, I'll give it a try early next week.

pbarrio added inline comments.Feb 28 2018, 6:16 AM
lib/Target/ARM/ARMISelLowering.cpp
4398 ↗(On Diff #134875)

VT is not used anywhere. I will remove it before committing.

This revision was automatically updated to reflect the committed changes.

Buildbot failing here: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap-msan/builds/2998

The errors don't look related:

FAIL: lld :: ELF/linkerscript/eh-frame-reloc-out-of-range.test (673 of 1549)
FAIL: lld :: ELF/strip-all.s (1509 of 1533)

But I haven't investigated in depth. @thebolt you may want to take a look at it.