This is an archive of the discontinued LLVM Phabricator instance.

X86ISelDAGToDAG: Transform TEST + MOV64ri to SHR + TEST
ClosedPublic

Authored by MatzeB on Mar 9 2022, 11:35 AM.

Details

Summary

Optimize a pattern where a sequence of 8/16 or 32 bits is tested for zero: LLVM normalizes this towards and AND with mask which is usually good, but does not work well on X86 when the mask does not fit into a 64bit register. This DagToDAG peephole transforms sequences like:

movabsq $562941363486720, %rax # imm = 0x1FFFE00000000
testq %rax, %rdi

to

shrq $33, %rdi
testw %di, %di

the result has a shorter encoding and saves a register if the tested value isn't used otherwise.

Diff Detail

Event Timeline

MatzeB created this revision.Mar 9 2022, 11:35 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 9 2022, 11:35 AM
MatzeB requested review of this revision.Mar 9 2022, 11:35 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 9 2022, 11:35 AM
craig.topper added inline comments.Mar 9 2022, 11:45 AM
llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
5620

Is this code already doing something similar? Sanjay had another patch to this code D121147 recently.

MatzeB added inline comments.Mar 9 2022, 3:03 PM
llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
5620

Interesting. I believe this case covers only the cases where the mask covers the highest bit in the register and so allows us to get rid of the test completely, while my case covers cases of the mask being "in-between" and not covering the highest bit so we still need the test...

That said let me dig deeper whether there is code to be shared/reorganized/merged...

MatzeB updated this revision to Diff 414235.Mar 9 2022, 4:04 PM

Changed the code to hook into the existing similar case in X86DAGToDAGISel::Select.

MatzeB marked an inline comment as done.Mar 9 2022, 4:04 PM
MatzeB updated this revision to Diff 414238.Mar 9 2022, 4:10 PM

Linted and added some comments.

MatzeB updated this revision to Diff 414240.Mar 9 2022, 4:31 PM
craig.topper added inline comments.Mar 9 2022, 4:39 PM
llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
5631

Are we emitting TEST+SHR rather than SHR by itself.

5636

Same question.

5683

Initialize TestOpcode to X86::TEST64rr instead of DELETED_NODE. Then this becomes

if (SubRegIdx != 0)
  Shift = CurDAG->getTargetExtractSubreg(SubRegIdx, dl, SubRegVT, Shift);
Test = CurDAG->getMachineNode(TestOpcode, dl, MVT::i32, Shift, Shift);
ReplaceNode(Node, Test)
MatzeB added inline comments.Mar 9 2022, 4:43 PM
llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
5631

It seems we emit a TEST here but a separate transformation will identify the TEST as redundant and re-use the flags of the shift instruction.

5636

as above.

MatzeB updated this revision to Diff 414243.Mar 9 2022, 4:49 PM

Address review feedback

MatzeB marked 3 inline comments as done.Mar 9 2022, 4:50 PM
MatzeB updated this revision to Diff 414423.Mar 10 2022, 10:25 AM

There was a question about the effect of extra uses within this pattern, so I added some tests to show the effect on the existing code:
ca808e89242f

Can you add some tests similar to those, so we'll have more visibility into what this change will do?

MatzeB updated this revision to Diff 415492.Mar 15 2022, 10:27 AM

rebase; add extra hasOneUseCheck for the constant.

spatel accepted this revision.Mar 15 2022, 12:42 PM

LGTM

This revision is now accepted and ready to land.Mar 15 2022, 12:42 PM
This revision was landed with ongoing or failed builds.Mar 15 2022, 2:18 PM
This revision was automatically updated to reflect the committed changes.