This is an archive of the discontinued LLVM Phabricator instance.

[DAGCombine] (addcarry 0, 0, X) -> (ext/trunc X)
ClosedPublic

Authored by deadalnix on May 5 2017, 10:15 AM.

Details

Summary

While this makes some case better and some case worse - so it's unclear if it is a worthy combine just by itself - this is a useful canonicalisation.

As per discussion in D32756 .

Diff Detail

Repository
rL LLVM

Event Timeline

deadalnix created this revision.May 5 2017, 10:15 AM
RKSimon added inline comments.May 5 2017, 10:31 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
2146 ↗(On Diff #97982)

I'm not sure if you can do this - aren't carry bits boolean types? So won't they be TargetLowering::BooleanContent?

getBoolExtOrTrunc doesn't quite work - you're wanting a 0 or 1 result......

deadalnix added inline comments.May 5 2017, 2:16 PM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
2146 ↗(On Diff #97982)

No they are whatever the target decides to use as boolean. All i1 are one after legalization.

I'm not sure, however, what's the correct way to extends this. What's wrong with getBoolExtOrTrunc ?

deadalnix updated this revision to Diff 98022.May 5 2017, 2:25 PM

Use getBoolExtOrTrunc .

spatel added inline comments.May 8 2017, 3:13 PM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
2146 ↗(On Diff #97982)

The bug may not be visible if x86 is the only target to allow ISD::ADDCARRY so far because x86 has "ZeroOrOneBooleanContent" for scalars.

So we're seeing something like this:

  t23: i64,i8 = uaddo t2, t4
t24: i64,i8 = addcarry Constant:i64<0>, Constant:i64<0>, t23:1

The 2nd result of the uaddo is an i8 which is "a boolean that indicates if an overflow occurred".
Now, if the target has "ZeroOrNegativeOneBooleanContent", that result will be "i8 -1" when overflow occurred for the uaddo. If you then use getBoolExtOrTrunc() on that to get the i64, you'll get a sign-extend to i64 -1. That's wrong - we wanted a "i64 1" as the first result of the addcarry. We must mask ('and x, 1') or trunc the CarryIn operand before casting it to N0.getValueType().

deadalnix added inline comments.May 8 2017, 3:47 PM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
2146 ↗(On Diff #97982)

Good catch, I need to fix this. Thanks.

deadalnix updated this revision to Diff 98230.May 8 2017, 4:27 PM

Add an and in the pattern as to make the the value is 0 or 1.

spatel edited edge metadata.May 9 2017, 8:53 AM

See inline for a few nits, but I think this makes sense now. If I'm seeing the diffs correctly, there was no codegen difference for x86 after adding the 'and' mask, so we must be recognizing and optimizing that pattern. @RKSimon - do you see any other problems?

lib/CodeGen/SelectionDAG/DAGCombiner.cpp
2148–2149 ↗(On Diff #98230)

It's borderline, but I think we prefer to use the actual type (EVT) in code like this.

2151 ↗(On Diff #98230)

Do you need to explicitly add to worklist? I thought this gets handled automatically by the combiner.

test/CodeGen/X86/mul-i1024.ll
5 ↗(On Diff #98230)

What does testing with i1024 tell us that testing with i512 (or i256 for that matter) would not?
This is a massive file and a massive diff, and I have no idea how to make sense of it. If others can follow what's happening in this codegen, I suppose we should leave it. Otherwise, let's just get rid of this file?

RKSimon edited edge metadata.May 9 2017, 1:13 PM

See inline for a few nits, but I think this makes sense now. If I'm seeing the diffs correctly, there was no codegen difference for x86 after adding the 'and' mask, so we must be recognizing and optimizing that pattern. @RKSimon - do you see any other problems?

This looks right to me now - thanks for reviewing the getBoolExtOrTrunc usage.

Other than what's to be done with the mul-i1024 this looks good to go.

lib/CodeGen/SelectionDAG/DAGCombiner.cpp
2151 ↗(On Diff #98230)

Yes, this should be kept.

deadalnix added inline comments.May 10 2017, 1:24 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
2151 ↗(On Diff #98230)

You need to add when you aren't returning the node for replacement, which is the case here.

deadalnix updated this revision to Diff 98415.May 10 2017, 1:42 AM

Add explicit EVT types.

filcab added a subscriber: filcab.May 10 2017, 7:13 AM
filcab added inline comments.
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
2140 ↗(On Diff #98415)

Please split these changes before committing the patch.

deadalnix added inline comments.May 10 2017, 10:21 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
2140 ↗(On Diff #98415)

I put these change with the patch because there is no need to precompute DL if we don't have the patch. If you feel strongly about this I can just commit this as a NFC and rebase the patch.

deadalnix updated this revision to Diff 98494.May 10 2017, 11:14 AM

Remove NFC changes.

deadalnix updated this revision to Diff 98990.May 15 2017, 5:54 AM

Monday ping.

spatel added inline comments.May 15 2017, 6:12 AM
test/CodeGen/X86/mul-i1024.ll
5 ↗(On Diff #98230)

I didn't see an answer to this question. Do we need to test i512 / i1024? How are they different than i256?

deadalnix added inline comments.May 17 2017, 11:07 PM
test/CodeGen/X86/mul-i1024.ll
5 ↗(On Diff #98230)

Really, I have no idea. These tests weren't added for this diff specifically, so that's a bit outside of the scope of this change.

I suggest that we just delete mul-i1024.ll before accepting this patch - any objections?

spatel added a subscriber: chfast.

The test file(s) were added with D24478 / rL281403. cc'ing @chfast to see if it's necessary.

PR30354 ( https://bugs.llvm.org//show_bug.cgi?id=30354 ) specified i1024, but do we actually need such a large value to trigger the bug?

spatel accepted this revision.May 19 2017, 10:41 AM

I agree that we shouldn't gate this patch waiting for a reply to the i1024 question. If you're confident that the code in those super-wide cases is still correct, then LGTM.

This revision is now accepted and ready to land.May 19 2017, 10:41 AM

I looked at the codegen for the mul and it looks correct. There are very little actual changes, but, they do cause registers to be allocated differently, so the diff ends up being huge.

This revision was automatically updated to reflect the committed changes.
llvm/trunk/test/CodeGen/X86/addcarry.ll