Page MenuHomePhabricator

[X86] Emit more efficient >= comparisons against 0

Authored by majnemer on Aug 18 2015, 11:29 PM.



We don't do a great job with >= 0 comparisons against zero when the
result is used as an i8.

Given something like:
void f(long long LL, bool *B) {

*B = LL >= 0;


We used to generate:

shrq    $63, %rdi
xorb    $1, %dil
movb    %dil, (%rsi)

Now we generate:

testq   %rdi, %rdi
setns   (%rsi)

Diff Detail


Event Timeline

majnemer updated this revision to Diff 32504.Aug 18 2015, 11:29 PM
majnemer retitled this revision from to [X86] Emit more efficient >= comparisons against 0.
majnemer updated this object.
majnemer added reviewers: chandlerc, resistor.
majnemer added a subscriber: llvm-commits.
spatel added a subscriber: spatel.Aug 19 2015, 7:44 AM
mbodart added inline comments.Aug 19 2015, 9:53 AM
24077 ↗(On Diff #32504)

It would be nice to have a brief comment here describing the general pattern.
Something like:

Attempt to transform signedness tests similar to (i8)(X >= 0), appearing in DAG form
as for example (xor i8 trunc(A i32 >> 31), 1), into setcc A SETGT -1.

24078 ↗(On Diff #32504)

I would think we also want to suppress this if the xor has more than one user.

24113 ↗(On Diff #32504)

This is fine. But just curious, is there a reason you chose SETGT -1 over SETGE 0?

majnemer added inline comments.Aug 19 2015, 10:10 AM
24077 ↗(On Diff #32504)

Sure, will do!

24078 ↗(On Diff #32504)

It might be a little early for me (no coffee, etc.), why is it undesirable to use a test/setns sequence if there is more than one user?

24113 ↗(On Diff #32504)

We seem to normalize it to that:

Using SETGE 0 works equally well but results in a test/setge sequence. The sequence is equivalent but also requires remembering that test sets OF to 0.

mbodart added inline comments.Aug 19 2015, 10:28 AM
24078 ↗(On Diff #32504)

More than one user likley means the xor is still live.

So the transformation would just add more instructions, without removing any.

I'm a bit new to LLVM, so maybe there's some larger context I'm not seeing.
If so, please shed some light.

24113 ↗(On Diff #32504)

OK, thanks for the explanation.

majnemer added inline comments.Aug 19 2015, 12:15 PM
24078 ↗(On Diff #32504)

My understanding is that this would replace all users of the xor to instead use the setcc which would mean no additional instructions would get generated. However, I think it makes sense to check to see if the truncate and shift both have one use.

majnemer updated this revision to Diff 32595.Aug 19 2015, 12:24 PM
  • Address review comments
spatel added inline comments.Aug 19 2015, 12:51 PM
24089 ↗(On Diff #32595)

typo: "aginst"

24107 ↗(On Diff #32595)

typo: "ammount"

24121 ↗(On Diff #32595)

This function description is too specific now.

215 ↗(On Diff #32595)

B is not used in this test.

215–224 ↗(On Diff #32595)

Add tests for an i1 output, i16 input, and i64 input?

majnemer updated this revision to Diff 32602.Aug 19 2015, 1:08 PM
  • Address review comments
  • Address Sanjay's review comments.
mbodart added inline comments.Aug 19 2015, 1:26 PM
24078 ↗(On Diff #32602)

Guess my coffee wasn't strong enough either ;-)
Yes, the trunc/shift usage counts were what I meant.

spatel added inline comments.Aug 19 2015, 1:37 PM
24112 ↗(On Diff #32602)

It would be great to add the earlier review comment about the difference between SETGT -1 and SETGE 0 here in the code comment. Save someone a little wondering down the line. :)

I have no further comments.

Thanks, David!

This revision was automatically updated to reflect the committed changes.