Page MenuHomePhabricator

[SelectionDAG] Combine U{ADD,SUB}O diamonds into {ADD,SUB}CARRY
ClosedPublic

Authored by davezarzycki on Nov 11 2019, 6:28 AM.

Details

Summary

Convert (uaddo (uaddo x, y), carryIn) into addcarry x, y, carryIn if-and-only-if the carry flags of the first two uaddo are merged via OR or XOR.

Work remaining: match ADD, etc.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

Hi everybody. Feedback is still appreciated. What's blocking this change from being committed?

lebedev.ri added inline comments.Nov 18 2019, 12:04 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
2844–2849

I find this confusing.
Which one of Carry0 and Carry1 is the first one?
Can this be simplified to something like

// ???
if (Carry1.getOperand(0) != Carry0.getValue(0))
  std::swap(Carry0, Carry1);
if (Carry1.getOperand(0) != Carry0.getValue(0))
  return SDValue();

This would be more idiomatic..

2854–2856

Do we care what Carry0.getOperand(1) is, what produces it?

davezarzycki added inline comments.Nov 18 2019, 12:45 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
2844–2849

Sure. I can refactor that. I guess the comment wasn't clear enough. The code wants the UADDO (or USUBO) that does the primary arithmetic to be Carry0. I.e. the node at the top of the ASCII art. The UADDO (or USUBO) that does the carry/borrow in is Carry1. I.e. the node in the middle of the ASCII art.

2854–2856

We want that operand to be a carry bit (or at least plausibly so), so that we don't create bogus addcarry/subcarry nodes.

davezarzycki edited the summary of this revision. (Show Details)
  1. Check TLI.isOperationLegalOrCustom().
  2. Handle AND being used to merge flags.
  3. More source comments.
lebedev.ri added inline comments.Nov 18 2019, 3:04 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
2821–2822

Since you've drawn the original graph, also showing the result might be good?

2844–2849

Please mark done comments as done.

2848

Ah, so the numbering is from bottom to top, i expected the other way around.

2855–2857

s/ISDOp/NewOp/

2863–2867

There's getAsCarry() should it be used here, or are we okay with some other i1?
(can we have some other i1 by now?)

2873–2876

This could use a few comments

davezarzycki marked 13 inline comments as done.Nov 18 2019, 4:29 AM
davezarzycki added inline comments.
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
2848

I've swapped the ordering to match your expectations.

2863–2867

That function is non-recursive, so it doesn't work. In other words, it doesn't know how to handle carry flags being merged. I can change that if people are okay with that.

lebedev.ri added inline comments.Nov 18 2019, 5:17 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
2848

(It's just that i think this ordering is more prevalent, i've practically not seen the opposite ordering in the codebase)

2863–2867

No real opinion from me here.

2875

I'm not following this bit, why do we do this only for ISD::AND?

davezarzycki marked 4 inline comments as done and an inline comment as not done.Nov 18 2019, 5:33 AM
davezarzycki added inline comments.
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
2875

As the comment above tries to explain, we can prove in this diamond UADDO/USUBO scenario that it is impossible for both UADDO/USUBO nodes to overflow at the same time. Either one does, or the other, therefore using AND to merge the carry flags will always return zero.

chfast added inline comments.
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
2850

Have you seen the getAsCarry() helper? It might help simplifying matching here.

2856

The ADDCARRY rather replaces UADDO so you should use the debug location of the Carry1 node.

2875

I think you should use CombineTo() here to bump stats counters, produce some debug logs, updated worklist and delete replaced nodes.

It can be something like:

CombineTo(Carry1.getNode(), Merged.getValue(0), undef);
return CombineTo(N, Merged.getValue(1));

Both Carry1 and N should be deleted by CombineTo().

davezarzycki marked 4 inline comments as done.Nov 18 2019, 6:15 AM
davezarzycki added inline comments.
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
2850

As mentioned elsewhere in this thread, getAsCarry() would need to be updated in order to be used here. In particular, it would need become aware of this proposed optimization because the carry flag might not have been merged yet.

2856

The ADDCARRY replaces three operations. Why is one node better than another for the debug location?

2875

Okay. I'll look into that.

lebedev.ri added inline comments.Nov 18 2019, 6:37 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
2856

I'd say the current debuginfo is correct.

2875

As the comment above tries to explain, we can prove in this diamond UADDO/USUBO scenario that it is impossible for both UADDO/USUBO nodes to overflow at the same time. Either one does, or the other, therefore using AND to merge the carry flags will always return zero.

Yes, but that wasn't the question i was asking.
My question was - what happens with uses of Carry1.getValue(0) if we have non-AND?
Since you don't replace, they will still use old nodes,
which means the old instruction chain will stay?
I think this may missing more tests.

davezarzycki marked an inline comment as done.Nov 18 2019, 7:55 AM
davezarzycki added inline comments.
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
2875

There is a test for the zero case. See bogus_add_U320_without_i128_and in addcarry.ll.

In any case, I'll switch to CombineTo

Switch to DAGCombiner::CombineTo

davezarzycki marked 4 inline comments as done.Nov 18 2019, 9:14 AM

With the exception of some debate among the reviewers about what the right debug location should be, I think I've addressed all actionable comments so far.

lebedev.ri added inline comments.Nov 18 2019, 9:52 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
2875

FWIW the previous version (no CombineTo(), only a singled RAUW call) looked more better to me,
but maybe i'm missing the point as to why that was wrong.

2876–2877

I'm sorry, i still don't understand why this is being done only in ISD::AND case.
Please explain that in a new comment in the code.

lebedev.ri added inline comments.Nov 18 2019, 10:02 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
2876–2877

To be specific, i don't understand why this isn't:

DAG.ReplaceAllUsesOfValueWith(Carry1.getValue(0), Merged.getValue(0)); // debatable, but done always
if (N->getOpcode() == ISD::AND) // Both carrys can't overflow at the same time
  return DAG.getConstant(0, DL, MVT::i1);
return Merged.getValue(1);
davezarzycki marked 2 inline comments as done.Nov 19 2019, 12:29 AM
davezarzycki added inline comments.
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
2876–2877

That also works. I'll switch to that.

Why in your mind is the first line of your suggestion "debatable"?

lebedev.ri marked an inline comment as done.Nov 19 2019, 12:40 AM
lebedev.ri added inline comments.
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
2876–2877

Because i do not understand why currently Combiner.CombineTo() is used,
maybe using DAG.ReplaceAllUsesOfValueWith() would be incorrect there.

chfast added inline comments.Nov 19 2019, 12:40 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
2875

To my understanding CombineTo() is a rich version of RAUW. It does RAUW, but also can add new nodes to the DAG Combiner worklist, updates DAG combiner stats, prints debug logs about which nodes has been combined into which, handles removed nodes.

lebedev.ri added inline comments.Nov 19 2019, 12:58 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
2875

I would think simply returning the new value would be enough,
and whatever code called us will do all the same anyway?

davezarzycki marked 2 inline comments as done.Nov 19 2019, 1:14 AM
davezarzycki added inline comments.
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
2876–2877

I tried this:

-  Combiner.CombineTo(Carry1.getNode(), Merged.getValue(0),
-                     DAG.getUNDEF(Merged.getValue(1).getValueType()));
+  DAG.ReplaceAllUsesOfValueWith(Carry1.getValue(0), Merged.getValue(0));
   if (N->getOpcode() == ISD::AND)
-    return Combiner.CombineTo(N, DAG.getConstant(0, DL, MVT::i1));
-  return Combiner.CombineTo(N, Merged.getValue(1));
+    return DAG.getConstant(0, DL, MVT::i1);
+  return Merged.getValue(1);

And it also worked. Who is responsible for this code and decide which approach is preferable?

davezarzycki marked an inline comment as done.Nov 19 2019, 10:26 PM
davezarzycki added inline comments.
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
2876–2877

Ping. @RKSimon @spatel @craig.topper – As the top three committers to DAGCombiner.cpp in the last two years, do you have any preference about DAG.ReplaceAllUsesOfValueWith(…) versus Combiner.CombineTo(…)? Otherwise I believe that all feedback has been addressed to date. What's blocking approval? Thanks! :-)

Why does Carry1's result 0 not get replaced for the xor/or case?

Unless I'm misunderstanding you, I think the many rounds of review feedback are clouding the code review. Lines 2874 and 2875 replace Carry1's result 0 in all cases (OR, XOR, and AND).

Unless I'm misunderstanding you, I think the many rounds of review feedback are clouding the code review. Lines 2874 and 2875 replace Carry1's result 0 in all cases (OR, XOR, and AND).

Doesn't line 2873 return if the opcode is not ISD::AND?

Unless I'm misunderstanding you, I think the many rounds of review feedback are clouding the code review. Lines 2874 and 2875 replace Carry1's result 0 in all cases (OR, XOR, and AND).

Doesn't line 2873 return if the opcode is not ISD::AND?

Ah. Oops. I paused on uploading new patches while @chfast and @lebedev.ri were debating whether DAG.ReplaceAllUsesOfValueWith(…) or Combiner.CombineTo(…) was better. (Do you have a preference?) I'll update the patch now with the former.

Update the patch now that the DAG RAUW versus Combiner discussion has settled down. I personally like the former because it's simpler, but I do like the arguments in favor of the DAGCombiner. Either way, I'll defer to a code owner.

Almost LG to me, but apparently i have more comments.

lib/CodeGen/SelectionDAG/DAGCombiner.cpp
2812

Wait, In is some third carry-in bit, right? This diagram doesn't really make that obvious.

2820

Out = (addcarry A, B, In).carry

2837–2840

Why can Carry0.getValue(0) only be in Carry1.getOperand(0), but not in Carry1.getOperand(1)?
I think there is no such restriction at least for add.

2846–2848

We are replacing and/or/xor node, and we only create a single replacement node,
so i don't think we need any one-use checks here?

2849

I think it will be helpful to

SDValue CarryBit = Carry1.getOperand(1);
davezarzycki marked 10 inline comments as done.Nov 20 2019, 5:01 AM
davezarzycki added inline comments.
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
2812

No. There is one carry in bit in this graph: In. There are two carry out bits being merged in a way that implies an addcarry or subcarry is possible. I can try to make this more obvious.

2820

Sure. In the next patch.

2837–2840

Sure. I'll make it more robust in the next patch. For whatever it may be worth, this is the order output by CodeGenPrepare when creates implicit UADDO/USUBO nodes. It's also happens to be the order that clang generates via its addc/subc builtins.

2846–2848

I was trying to be conservative. After some thought, I think dropping the one-use tests is harmless. I'll remove them from the next patch.

2849

Sure. In the next patch.

davezarzycki marked 5 inline comments as done.
  1. Prettier ASCII art.
  2. Add robustness against the carry in bit swapping positions during optimization passes.
  3. Dropped use-once checks on the first uaddo/usubo node.

This patch fixes a bug introduced during the last round of feedback. By making the patch robust against the optimizer swapping operands to UADDO, the patch started tolerating the carry in bit being on the lefthand side of USUBO. That is wrong, and it's now fixed.

lebedev.ri accepted this revision.Nov 20 2019, 6:05 AM

With latest update i think this is basically LG, last few non-functional comments.
But please wait a bit for @deadalnix / @craig.topper / @RKSimon.

This patch fixes a bug introduced during the last round of feedback. By making the patch robust against the optimizer swapping operands to UADDO, the patch started tolerating the carry in bit being on the lefthand side of USUBO. That is wrong, and it's now fixed.

Thank you! I was just figuring out if that bug *wasn't* there.

lib/CodeGen/SelectionDAG/DAGCombiner.cpp
2815

Since you are drawing add, you could swap it's operands, and thus CarryIn wouldn't have to intercept PartialCarryOutX. Just a thought, not sure it matters.

2861–2864
if (CarryIn.getOpcode() != ISD::ZERO_EXTEND)
  return SDValue();
CarryIn = CarryIn.getOperand(0);
if (CarryIn.getValueType() != MVT::i1)
  return SDValue();

It's better than seeing CarryIn.getOperand(0) later and going "wait what why we use first operand from CarryIn instruction?"

2869

CarryIn);

This revision is now accepted and ready to land.Nov 20 2019, 6:05 AM
lebedev.ri added inline comments.Nov 20 2019, 6:18 AM
test/CodeGen/X86/addcarry.ll
623

Please precommit all tests first.

davezarzycki marked 5 inline comments as done.Nov 20 2019, 6:22 AM
davezarzycki added inline comments.
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
2815

I'd really rather not. The current diagram matches the UADDO/USUBO generated by CodeGenPrepare from plain IR. It also matches the UADDO/USUBO that clang emits as a part of its builtin addc/subc intrinsics. Also, if one substitutes usubo for uaddo, the diagram still holds true, whereas if the add operands were swapped, the diagram would be wrong for usubo.

2861–2864

Sure. I'll change that before committing.

lebedev.ri marked an inline comment as done.Nov 20 2019, 6:28 AM
lebedev.ri added inline comments.
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
2815

Sure, that was only an observation.

davezarzycki marked 3 inline comments as done.Nov 20 2019, 6:30 AM
davezarzycki added inline comments.
test/CodeGen/X86/addcarry.ll
623

Oh, whoops. I missed this feedback once I saw the "ready to land" status change. Sorry. At least all of the other tests were pre-committed.

lebedev.ri added inline comments.Nov 20 2019, 6:35 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
2831

Just make it a member function?

2839–2840

Very last note, i think it would be best to also

if (N->getOpcode() != ISD::AND && N->getOpcode() != ISD::OR && N->getOpcode() != ISD::XOR)
  return SDValue();

(or maybe that should be an assert)

davezarzycki added a comment.EditedNov 20 2019, 6:54 AM

There was this line:

But please wait a bit for @deadalnix / @craig.topper / @RKSimon.

Oh, sorry, I missed that there was a comment above the quoted text in addition to below it. I'm happy to revert this if somebody wants, or fix things in a followup patch.

I'm a bit late to the party, but this is nothing short of amazing. Congratulations!