Page MenuHomePhabricator

[llvm-mca] Fix processing thumb instruction set
ClosedPublic

Authored by evgeny777 on Nov 18 2020, 6:08 AM.

Details

Summary

ATM, there are two kind of problems with thumb instructions (both cause assertion in InstrBuilder)

  1. Incorrect detection of variadic ops

Some thumb instructions (like tMOVSr) have less number of operands in their MCInstrDesc than MCInst.getNumOperands(). This makes llvm-mca beleive those are variadic, while they're not.

  1. Detcting more explicit defs than there actually are.

This happens because Thumb target defines CPSR (s_cc_out) as output operand (ARM target defines it as input),

Diff Detail

Event Timeline

evgeny777 created this revision.Nov 18 2020, 6:08 AM
evgeny777 requested review of this revision.Nov 18 2020, 6:08 AM
andreadb added inline comments.Nov 18 2020, 4:34 PM
llvm/lib/MCA/InstrBuilder.cpp
306–307

Does it mean that the OptionalDef operand for Thumb instructions is always at index NumExplicitDefs-1 (rather than at index MCDesc.getNumOperands() - 1)?
Is the OptionalDef already counted as an explicit operand? That's really odd...

This change breaks the assumptions made by the loop at line 348. The assumption is that implicit definitions always follow explicit definitions. By decreasing NumExplicitDefs you are also affecting how that loop iterates over the implicit definitions.

About the code comment: it should mention the issue with the optional def being already counted as an explicit definition.

307

See my previous comment about the value ot TotalDefs.

309

Shouldn't we update also TotalDefs? Otherwise we reserve the wrong number of writes at line 312.

311–312

Some Thumb instructions (like tMOVSr) have less number of operands in their MCInstrDesc than MCInst.getNumOperands()

Could you elaborate this a bit more? It is a bit concerning that there is a mismatch in the number of operands. What are exactly these extra operands?

I need to understand if the assumptions made by this algorithm are still valid. Specifically, I need to understand if all of the assumptions described in the comment starting at line 256 are still valid. Can we still guarantee that "Uses always start at index #(MCDesc.getNumDefs())"? Is the order of writes still preserved?

350

After your change, NumExplicitDefs may no longer match the value of MCDesc.getNumDefs().
The value of Index is incorrect computed for Thumb instructions that have an optional def. For those instructions, when CurrentDef is zero, Index references the OptionalDef operand.

evgeny777 added inline comments.Nov 23 2020, 1:13 AM
llvm/lib/MCA/InstrBuilder.cpp
306–307

Does it mean that the OptionalDef operand for Thumb instructions is always at index NumExplicitDefs-1

Yes, for Thumb1 only. Thumb2 instructions follow ARM rules
Just look at any Thumb1 instruction definition, which modifies CPSR:

def tADC {  // InstructionEncoding
//...
  dag OutOperandList = (outs tGPR:$Rdn, s_cc_out:$s);
//...
}

Unfortunately moving s_cc_out from the end of output operand list to the beginning of input operand list breaks too many things

Is the OptionalDef already counted as an explicit operand? That's really odd.

Yes, because it's in the output operand list

This change breaks the assumptions made by the loop at line 348. The assumption is that implicit definitions always follow explicit definitions.

Thumb1 instructions with optional def do not have implicit operands, however I'll fix this in upcoming update

309

I think we do a fixup in the end of populateWrites function

ID.Writes.resize(CurrentDef);
311–312

Could you elaborate this a bit more? It is a bit concerning that there is a mismatch in the number of operands. What are exactly these extra operands?

If you look at tMOVSr you'll find out that it has 4 operands. For instance movs r2, r3 is assembled to:

<MCInst #4275 tMOVSr <MCOperand Reg:74> <MCOperand Reg:75> <MCOperand Imm:14> <MCOperand Reg:0>>

The last two args are, I beleive, built-in ARMCC::AL predicate

However tMOVSr definition lists only 2 arguments:

def tMOVSr {  // InstructionEncoding
// ...
  dag OutOperandList = (outs tGPR:$Rd);
  dag InOperandList = (ins tGPR:$Rm);
// ...
}

InstrBuilder incorrectly treats tMOVSr as variadic, because 4 != 2

evgeny777 updated this revision to Diff 306982.Nov 23 2020, 1:27 AM
  • Added check for Thumb1, because Thumb2 instructions follow different encoding rules
  • Added comments
dmgreen added inline comments.Nov 23 2020, 2:19 AM
llvm/lib/MCA/InstrBuilder.cpp
251

Please don't rely on ThumbFrm, a lot of those flags are legacy and should be removed.

I was hoping that this wouldn't have to say "if (thumb)" anywhere, even if the instructions are a little odd. Can it check the position of the optional def instead? So it detects instructions like this, without having to be target dependent.

306–307

Yes. Optional Defs are a little unfortunate, but have been around for a long time and I wouldn't recommend the major changes needed to alter how they work.

311–312

This one sounds like a bug to me. Maybe in the AsmParsing? MOVS can't be used in IT blocks, so probably shouldn't use predicated. If you look at PR36658.mir, it looks like $r1 = tMOVSr $r0, implicit-def dead $cpsr

evgeny777 added inline comments.Nov 23 2020, 3:36 AM
llvm/lib/MCA/InstrBuilder.cpp
251

Can it check the position of the optional def instead

Do you have an idea how to do this? One has to distinguish
a) Instruction that writes to single register and optionally modifies CPSR (Thumb1)
b) Instruction that writes to two registers and optionally modifies CPSR (ARM, Thumb2)

311–312

This one sounds like a bug to me

May be. I tried to address this in disassembler, but it broke a lot of stuff.

Here is a piece from ARMDisassembler.cpp

// A few instructions actually have predicates encoded in them.  Don't
// try to overwrite it if we're seeing one of those.
switch (MI.getOpcode()) {
  case ARM::tBcc:
  case ARM::t2Bcc:
  case ARM::tCBZ:
  case ARM::tCBNZ:
  case ARM::tCPS:
  case ARM::t2CPS3p:
  case ARM::t2CPS2p:
  case ARM::t2CPS1p:
  case ARM::t2CSEL:
  case ARM::t2CSINC:
  case ARM::t2CSINV:
  case ARM::t2CSNEG:
  case ARM::tMOVSr:
  case ARM::tSETEND:

Thanks for the updated patch.

I wonder if we could use MCOperandInfo::isOptionalDef() to identify the index of an optional definition.
That being said, I leave the review of IsThumb1OptDefInst to you and @dmgreen as I am not very familiar with the MC lowering of Thumb1 instructions.
The rest of the patch looks good to me.

llvm/lib/MCA/InstrBuilder.cpp
309

You are right. That should be enough. Thanks for checking.

dmgreen added inline comments.Nov 23 2020, 4:22 AM
llvm/lib/MCA/InstrBuilder.cpp
251

Can we just check which operand it is? Use MCDesc.OpInfo[i].isOptionalDef()? Can we just skip it in that case?

311–312

It's possibly this code from ARMAsmParser:

TmpInst.setOpcode(Inst.getOperand(4).getReg() ? ARM::tMOVSr : ARM::tMOVr);
TmpInst.addOperand(Inst.getOperand(0));
TmpInst.addOperand(Inst.getOperand(1));
TmpInst.addOperand(Inst.getOperand(2));
TmpInst.addOperand(Inst.getOperand(3));

Addressed. Thanks for insights!

dmgreen added inline comments.Nov 24 2020, 12:29 AM
llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp
10312 ↗(On Diff #307254)

Do you mind pulling this out into it's own review. I think it's fine, but just in case it probably deserves to be it's own patch.

The test can stay here, so long as there is some test for movs rm, rn.

A couple of minor nits. Otherwise the llvm-mca change looks good.
Thanks.

llvm/lib/MCA/InstrBuilder.cpp
261–264

This comment should be updated now.

By default, the optional register definition is still expected to be the last operand. However, it can also be one of the explicit defs.
The rest of this logic still works under the assumption that instructions can only declare a single optional register definition at most.

312

Instead of -1, you could just default to MCDesc.getNumOperands() - 1.
That way, the statement at line 381 becomes a simple assignment with no conditional expression.

evgeny777 updated this revision to Diff 307345.Nov 24 2020, 7:05 AM

Addressed. Revision now depends on D92029

andreadb accepted this revision.Nov 24 2020, 7:16 AM

LGTM

llvm/lib/MCA/InstrBuilder.cpp
319–320

This comment can be removed. It is no longer a Thumb1 thing only.

This revision is now accepted and ready to land.Nov 24 2020, 7:16 AM
This revision was automatically updated to reflect the committed changes.