This is an archive of the discontinued LLVM Phabricator instance.

[DAGCombiner] mulhi + 1 never overflow.
ClosedPublic

Authored by deadalnix on Feb 6 2017, 9:13 AM.

Details

Diff Detail

Repository
rL LLVM

Event Timeline

deadalnix created this revision.Feb 6 2017, 9:13 AM

Is it possible to write a testcase which isn't quite so big?

lib/CodeGen/SelectionDAG/SelectionDAG.cpp
2779 ↗(On Diff #87241)

Missing a check for which result you're using from the UMUL_LOHI. (I think this transform is only valid for the high result.)

2780 ↗(On Diff #87241)

Could you just write this as "~N1Zero == 1"?

There is a nice small test case in adde-carry.ll , but it needs D29528 in addition to that one to trigger.

lib/CodeGen/SelectionDAG/SelectionDAG.cpp
2779 ↗(On Diff #87241)

You are absolutely right.

2780 ↗(On Diff #87241)

I was wondering if I should match 0 or not. I went for the safe road and matched 0.

deadalnix updated this revision to Diff 87305.Feb 6 2017, 2:23 PM

Only match mulhi

efriedma added inline comments.Feb 6 2017, 2:39 PM
lib/CodeGen/SelectionDAG/SelectionDAG.cpp
2781 ↗(On Diff #87305)

Another question: is there some reason to expect that N0 is the multiply, rather than N1?

deadalnix added inline comments.Feb 6 2017, 2:45 PM
lib/CodeGen/SelectionDAG/SelectionDAG.cpp
2781 ↗(On Diff #87305)

The specific case I'm looking at is fma like construct on big ints. After legalization you end up with combination of adde and mulhi no the left hand side. But indeed, that's an option to do it the other way around.

deadalnix added inline comments.Feb 6 2017, 4:11 PM
lib/CodeGen/SelectionDAG/SelectionDAG.cpp
2781 ↗(On Diff #87305)

So i tried to do it both ways. I see no difference in any of my tests cases, but, on the other hand, I see no reason not to do it. What do you think ?

deadalnix updated this revision to Diff 87533.Feb 7 2017, 3:02 PM

Match the pattern both ways.

efriedma added inline comments.Feb 15 2017, 1:44 PM
lib/CodeGen/SelectionDAG/SelectionDAG.cpp
2781 ↗(On Diff #87305)

I don't really like duplicating the code like that... but I guess we need to do it that way to avoid calling computeKnownBits when we don't use the result?

You can probably construct a smaller testcase to trigger this; maybe something like (uint64_t)(c == 0) + (((uint64_t)a * (uint64_t)b) >> 32), where a, b, and c are 32-bit integers.

deadalnix updated this revision to Diff 90233.Mar 1 2017, 2:46 PM

Update so it doesn't depend on D29565 anymore and has its own test.

efriedma accepted this revision.Mar 1 2017, 2:59 PM

LGTM.

test/CodeGen/X86/overflow.ll
38 ↗(On Diff #90233)

Maybe change this test to return %7 rather than just one bit of the high half of %7, so the code isn't completely dead? Not really important.

This revision is now accepted and ready to land.Mar 1 2017, 2:59 PM
deadalnix added inline comments.Mar 1 2017, 3:10 PM
test/CodeGen/X86/overflow.ll
38 ↗(On Diff #90233)

The code is dead either way, we shift right to get the carry.

efriedma added inline comments.Mar 1 2017, 3:13 PM
test/CodeGen/X86/overflow.ll
38 ↗(On Diff #90233)

I mean make the function return i128, and make the last instruction "ret i128 %7", so we return the 64-bit sum in RAX, and zero in RDX.

deadalnix added inline comments.Mar 1 2017, 3:34 PM
test/CodeGen/X86/overflow.ll
38 ↗(On Diff #90233)

Patch incoming.

deadalnix updated this revision to Diff 90250.Mar 1 2017, 3:51 PM

Improve test case

This revision was automatically updated to reflect the committed changes.