We only need to merge memory operands for instructions that access
memory. This slightly reduces the number of actions executed.
Details
Diff Detail
Event Timeline
Limiting memory operands to instructions involving memory is a change I want to make but before we do that, could you elaborate a bit on the spurious memory operand you're seeing? In particular, was that memory operand present on any of the instructions covered by the match? I'm wondering if you've hit the bug that will be fixed by https://reviews.llvm.org/D36094.
Yes, D36094 fixes the problem too! Here's what's going on in the test case, without any of the 2 patches:
Legalize Machine IR for: test_stack_args_i32 Legalizing: %vreg6<def>(p0) = G_FRAME_INDEX <fi#-1>; .. Already legal Legalizing: %vreg4<def>(s32) = G_LOAD %vreg6; mem:LD4[FixedStack-1](align=0) .. Already legal Legalizing: %vreg7<def>(p0) = G_FRAME_INDEX <fi#-2>; .. Already legal Legalizing: %vreg5<def>(s32) = G_LOAD %vreg7; mem:LD4[FixedStack-2](align=0) .. Already legal Legalizing: %vreg8<def>(s32) = G_ADD %vreg2, %vreg5; .. Already legal ...... 34378: Resume at 34378 (0 try-blocks remain) 34379: Begin try-block 34382: GIM_CheckFeatures(ExpectedBitsetID=5) 34385: GIM_CheckNumOperands(MIs[0], Expected=3) 34388: GIM_CheckOpcode(MIs[0], ExpectedOpcode=30) // Got=30 34392: GIM_CheckType(MIs[0]->getOperand(0), TypeID=21) 34396: GIM_CheckRegBankForClass(MIs[0]->getOperand(0), RCEnum=1) 34400: GIM_CheckType(MIs[0]->getOperand(1), TypeID=21) 34404: GIM_CheckRegBankForClass(MIs[0]->getOperand(1), RCEnum=1) 34408: GIM_CheckType(MIs[0]->getOperand(2), TypeID=21) 34412: GIM_CheckRegBankForClass(MIs[0]->getOperand(2), RCEnum=1) 34415: GIR_BuildMI(OutMIs[0], 112) 34419: GIR_Copy(OutMIs[0], MIs[0], 0) 34423: GIR_Copy(OutMIs[0], MIs[0], 1) 34427: GIR_Copy(OutMIs[0], MIs[0], 2) 34430: GIR_AddImm(OutMIs[0], 14) 34433: GIR_AddRegister(OutMIs[0], 0) 34436: GIR_AddRegister(OutMIs[0], 0) 34438: GIR_MergeMemOperands(OutMIs[0]) 34440: GIR_EraseFromParent(MIs[0]) Converting operand: %vreg8<def> Converting operand: %vreg2 Converting operand: %vreg5 Converting operand: %noreg Converting operand: %noreg 34442: GIR_ConstrainSelectedInstOperands(OutMIs[0]) 34443: GIR_DoneInto: %vreg8<def>(s32) = ADDrr %vreg2, %vreg5, pred:14, pred:%noreg, opt:%noreg; mem:LD4[FixedStack-2](align=0) GPR:%vreg8,%vreg2,%vreg5
With the MachineVerifier enabled for ARM, this causes it to fail on the test case.
Would it still make sense to only emit GIR_MergeMemOperands for instructions that mayLoad/mayStore? Unless I am missing something, I think not having this action when it's not required would speed up GlobalISel a tiny bit.
Great! In that case I'll update the summary of that patch to indicate that arm-isel.ll shows the same problem in-tree.
Would it still make sense to only emit GIR_MergeMemOperands for instructions that mayLoad/mayStore? Unless I am missing something, I think not having this action when it's not required would speed up GlobalISel a tiny bit.
Yes, it does. As you say, not using GIR_MergeMemOperands unnecessarily will speed up the matcher. It will also become important later on when load/store is importable and we support multi-instruction emission since something will have to ensure the memory operands are added to appropriate instructions.
This patch LGTM. Ideally, D36094 would land first since that fixes the underlying problem but that hasn't been reviewed yet and I see no reason to require that it lands first.