This is an archive of the discontinued LLVM Phabricator instance.

[DAGCombiner] Add support for mulhi const folding in DAGCombiner
ClosedPublic

Authored by dstuttard on May 28 2021, 8:42 AM.

Diff Detail

Event Timeline

dstuttard created this revision.May 28 2021, 8:42 AM
dstuttard requested review of this revision.May 28 2021, 8:42 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 28 2021, 8:42 AM
RKSimon added inline comments.May 28 2021, 8:50 AM
llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
5086

APInt::extractBits ?

5092

APInt::extractBits ?

Test cases would be useful as well of course

foad added inline comments.May 28 2021, 9:03 AM
llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
5084

You don't need "OrTrunc" here and below.

dstuttard updated this revision to Diff 348550.May 28 2021, 9:14 AM

Thanks for reviews.
Made suggested changes.

I'll add some test cases as well - but I'm off for a few days. I'll do it on return.

dstuttard marked 3 inline comments as done.May 28 2021, 9:14 AM
craig.topper added inline comments.
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
4467

Use DL variable that already exists

4519

Use DL variable that already exists

dstuttard updated this revision to Diff 352637.Jun 17 2021, 1:14 AM

Updating for review comments.

Tests still to be added...

dstuttard marked 2 inline comments as done.Jun 17 2021, 1:14 AM

Updating for review comments.

Tests still to be added...

Do you have any in mind? I can add SSE2 vector examples if that would help? But scalars are trickier

foad added a comment.Jun 18 2021, 3:57 AM

You can simplify the test case to:

define amdgpu_cs i64 @main(i64 %arg) {
entry:
  %d = udiv i64 %arg, 100000
  ret i64 %d
}

and it still shows the effect. Surely there are already some tests for i64 divide-by-constant that you could tweak, rather than adding a whole new file.

llvm/test/CodeGen/AMDGPU/dagcombine-mulhs-const.ll
5 ↗(On Diff #352960)

Obviously folding the mul_hi is good, but the s_add that you check for looks like this:

s_mov_b32 s0, 0x346d900
...
s_add_u32 s0, 0x4237, s0

so it should also be folded to a constant!

@dstuttard Please can you rebase? rGcc38f8939da4aec85e7d0ef4de412e30d4de5a14 should give you vector coverage

dstuttard updated this revision to Diff 356448.Jul 5 2021, 1:47 AM

Updated existing test based on feedback

Rebase and adjust X86 test that now folds

@dstuttard Please can you rebase? rGcc38f8939da4aec85e7d0ef4de412e30d4de5a14 should give you vector coverage

Thanks - have rebased and updated the test

dstuttard added inline comments.Jul 5 2021, 1:50 AM
llvm/test/CodeGen/AMDGPU/dagcombine-mulhs-const.ll
5 ↗(On Diff #352960)

Yes - that could be another one to do - then fix up this test (or not worry about it at all given that there's now an X86 test that tests this, thanks to Simon)

Cheers!

llvm/test/CodeGen/AMDGPU/udiv.ll
206

Possibly pre-commit this to show current codegen? Do we need a GFX1030-NOT v_mul_hi_u32 check of some kind?

212

Fix missing newline

dstuttard updated this revision to Diff 356464.Jul 5 2021, 3:27 AM

Pre-committed test and rebased on top

See D105424

dstuttard marked 2 inline comments as done.Jul 5 2021, 3:28 AM
dstuttard added inline comments.
llvm/test/CodeGen/AMDGPU/udiv.ll
206

Good idea. See latest change.

dstuttard marked an inline comment as done.Jul 5 2021, 3:28 AM
RKSimon accepted this revision.Jul 5 2021, 3:58 AM

LGTM - cheers!

This revision is now accepted and ready to land.Jul 5 2021, 3:58 AM
This revision was landed with ongoing or failed builds.Jul 5 2021, 4:08 AM
This revision was automatically updated to reflect the committed changes.