This is an archive of the discontinued LLVM Phabricator instance.

[DAG] Conservatively check the VSELECT Operation Action in tryToFoldExtendSelectLoad
ClosedPublic

Authored by xiangzhangllvm on Mar 22 2023, 8:17 PM.

Details

Summary

tryToFoldExtendSelectLoad can be called in DAG combine2 which has already finish legalization
So we should conservatively check the Operation Action for new VSELECT.

Diff Detail

Event Timeline

xiangzhangllvm created this revision.Mar 22 2023, 8:17 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 22 2023, 8:17 PM
xiangzhangllvm requested review of this revision.Mar 22 2023, 8:17 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 22 2023, 8:17 PM
LuoYuanke added inline comments.Mar 23 2023, 4:11 AM
llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
1043

vXi16?

llvm/test/CodeGen/X86/vselect-post-combine.ll
5

Can we simplify the variable name?

RKSimon retitled this revision from [DAG Combine] Conservatively check the VSELECT Operation Action in tryToFoldExtendSelectLoad to [DAG] Conservatively check the VSELECT Operation Action in tryToFoldExtendSelectLoad.Mar 23 2023, 4:53 AM
RKSimon added inline comments.
llvm/test/CodeGen/X86/vselect-post-combine.ll
27

You might be able to further simplify this test?

LuoYuanke added inline comments.Mar 23 2023, 5:41 AM
llvm/test/CodeGen/X86/select-ext.ll
97 ↗(On Diff #507607)

It seems generate more code. Probably you need check "Level <= AfterLegalizeVectorOps" in tryToFoldExtendSelectLoad(), so that there is more combine opportunity.

LuoYuanke added inline comments.Mar 23 2023, 5:51 AM
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
12246

Check "Level <= AfterLegalizeVectorOps" instead?

LuoYuanke added inline comments.Mar 23 2023, 6:50 AM
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
12246

Maybe changed as below.

if ((N0->getOpcode() == ISD::VSELECT && Level > AfterLegalizeVectorOps &&
     TLI.getOperationAction(ISD::VSELECT, VT) != TargetLowering::Legal
     ))
  return SDValue();
LuoYuanke added inline comments.Mar 23 2023, 6:56 AM
llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
1043

On the other hand, why can't use VPBLENDVB ymm1, ymm2, ymm3/m256 (https://www.felixcloutier.com/x86/pblendvb) to vselect v16i16? We can bitcast v16i16 to v32i8.

pengfei added inline comments.Mar 23 2023, 8:15 AM
llvm/test/CodeGen/X86/vselect-post-combine.ll
3

This problem is not limited to Windows. Linux has the some problem if passing a ptr and load it to <32 x i8>.

xiangzhangllvm added inline comments.Mar 23 2023, 6:12 PM
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
12246

Yes, I also had a quick think about it, just need to pass Level here.

llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
1043

Yes, but not only bitcast, it also need to bring new instructions to transfer value of ymm1.

llvm/test/CodeGen/X86/vselect-post-combine.ll
3

Yes, I tended to rm windows then I found it is not triggered, duo to the previous optimization difference.

5

Sure

27

Let me have a try, in fact I spend a lot time to simplify the raw test to current status : )

pengfei added inline comments.Mar 23 2023, 6:34 PM
llvm/test/CodeGen/X86/vselect-post-combine.ll
3

The difference first comes from ABI, so you can remove it by

define ptr @test_mul(ptr %0) {
%vecinit.i.i.i.i.i92 = load <32 x i8>, ptr %0
xiangzhangllvm marked 4 inline comments as done.
xiangzhangllvm added inline comments.Mar 23 2023, 7:20 PM
llvm/test/CodeGen/X86/vselect-post-combine.ll
3

Sorry, there is a delay for me to see this comment, I have update it : ), the test options is mainly to trigger the combine, now we can trigger it. Can we go on now?

LuoYuanke added inline comments.Mar 23 2023, 7:25 PM
llvm/test/CodeGen/X86/vselect-post-combine.ll
19

Could you simplify %shuffle.i.i.i6.i.i as well?

LuoYuanke accepted this revision.Mar 23 2023, 7:32 PM

LGMT with NIT comments.

This revision is now accepted and ready to land.Mar 23 2023, 7:32 PM
pengfei accepted this revision.Mar 23 2023, 8:05 PM
xiangzhangllvm marked 2 inline comments as done.Mar 23 2023, 8:06 PM
This revision was automatically updated to reflect the committed changes.