This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Custom Lower MULLH{S,U} for v16i8, v8i16, and v4i32
ClosedPublic

Authored by zatrazz on Apr 24 2018, 6:44 AM.

Details

Summary

This patch adds a custom lowering for ISD::MULH{S,U} used on divide by
constant optimization (DAGCombiner::BuildSDIV and DAGCombiner::BuildUDIV).

New patterns for smull and umull are added, so AArch64ISD::{S,U}MULL
can be correctly lowered to smull2 and umull2.

Diff Detail

Event Timeline

zatrazz created this revision.Apr 24 2018, 6:44 AM
SjoerdMeijer added inline comments.Apr 27 2018, 1:47 AM
lib/Target/AArch64/AArch64ISelLowering.cpp
2557

Do you need to pass the 'sign' boolean? Can you not look at the opcode and check for ISD::MULHS or ISD::MULHU?

2749

If you check the opcode inside LowerMULH, this can be simplified a bit, both these cases can fallthrough and call LowerMULH.

test/CodeGen/AArch64/arm64-neon-mul-div-cte.ll
4

nit: this is mul16xi8

15

nit: this is mul8xi16

zatrazz added inline comments.Apr 30 2018, 5:39 AM
lib/Target/AArch64/AArch64ISelLowering.cpp
2557

Yes, you can divise the correct ISD from SDValue opcode, I will change it.

test/CodeGen/AArch64/arm64-neon-mul-div-cte.ll
4

Ack.

15

Ack.

zatrazz updated this revision to Diff 144544.Apr 30 2018, 5:40 AM

Update patch from previous comments.

Sorry, I needed to get up to speed here, also with this hacker's delight trick.
I think it would be good if you add some comments to LowerMULH what we are trying to do here.

I think I have some more comments, mainly about the tests:

  • you're not checking the magic number (e.g. in the first test "movi v1.16b, #57"), but I think that is quite relevant here.
  • you're only using constant vectors with value 9. I think it would be good to vary here.
  • I think you should also define the operands of the smull2 and smull instructions with regexps, and use them.
  • nit: some functions still have inconsistent names, e.g. function "umul8xi16" tests "udiv <16 x i8>"

Sorry, I needed to get up to speed here, also with this hacker's delight trick.
I think it would be good if you add some comments to LowerMULH what we are trying to do here.

Ack.

I think I have some more comments, mainly about the tests:

  • you're not checking the magic number (e.g. in the first test "movi v1.16b, #57"), but I think that is quite relevant here.

Right, but the change itself is to correct lower the multiply high part of the divide by constant optimization. I have adjusted the next patch version to match the expected constant as well.

  • you're only using constant vectors with value 9. I think it would be good to vary here.

Ack.

  • I think you should also define the operands of the smull2 and smull instructions with regexps, and use them.

Ack.

  • nit: some functions still have inconsistent names, e.g. function "umul8xi16" tests "udiv <16 x i8>"

Ack, I have changed to div*.

zatrazz updated this revision to Diff 144900.May 2 2018, 10:20 AM

Update patch from previous comments.

SjoerdMeijer accepted this revision.May 3 2018, 1:07 AM

Thanks. This looks OK to me.

This revision is now accepted and ready to land.May 3 2018, 1:07 AM
This revision was automatically updated to reflect the committed changes.