This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Support "call" pseudoinstruction in the MC layer
ClosedPublic

Authored by shiva0217 on Apr 19 2018, 8:07 PM.

Details

Summary

To do this:

  1. Add PseudoCALLIndirct to match indirect function call.
  2. Add PseudoCALL to support parsing and print pseudo call in assembly
  3. Expand PseudoCALL to the following form with R_RISCV_CALL relocation type while encoding: auipc ra, func jalr ra, ra, 0

If we expand PseudoCALL before emitting assembly, we will see auipc and jalr pair when compiling with -S.
It's hard for assembly parser to parsing this pair and identify it's semantic is function call and then insert R_RISCV_CALL relocation type. Although we could insert R_RISCV_PCREL_HI20 and R_RISCV_PCREL_LO12_I relocation types instead of R_RISCV_CALL. Due to RISCV relocation design, auipc and jalr pair only can relax to jal with R_RISCV_CALL + R_RISCV_RELAX relocation types.

We expand PseudoCALL as late as encoding(RISCVMCCodeEmitter) instead of before emitting assembly(RISCVAsmPrinter) because we want to preserve call pseudoinstruction in assembly code. It's more readable and assembly parser could identify call assembly and insert R_RISCV_CALL relocation type.

Diff Detail

Repository
rL LLVM

Event Timeline

shiva0217 created this revision.Apr 19 2018, 8:07 PM
asb requested changes to this revision.Apr 23 2018, 7:49 AM
asb added inline comments.
lib/Target/RISCV/MCTargetDesc/RISCVELFObjectWriter.cpp
26–27 ↗(On Diff #143221)

I'd phrase this as "Return true if the given relocation must be with a symbol rather than section plus offset".

lib/Target/RISCV/MCTargetDesc/RISCVFixupKinds.h
51 ↗(On Diff #143221)

It would be less ambiguous to say "auipc instruction in a pair composed of adjacent auipc+jalr instructions".

lib/Target/RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp
89 ↗(On Diff #143221)

clang-format suggests some slightly different indentation within this function.

91 ↗(On Diff #143221)

By expanding the 'call' at this late stage, there's no opportunity for the instruction compression MC->MC transform (which you might care about with -mno-relax and a fully resolved target). Of course there is no compressed auipc, and you'd have to be rather lucky to end up with jalr ra, ra, 0. A comment to document this limitation might be worthwhile.

lib/Target/RISCV/MCTargetDesc/RISCVMCExpr.cpp
35–36 ↗(On Diff #143221)

Is there any way we can test this has the desired effect? I had a quick play around, and it seems quite difficult to trigger this code path, but perhaps I'm missing something obvious.

test/MC/RISCV/function-call.s
8 ↗(On Diff #143221)

This needs more tests, e.g. call 1234 will trigger an assertion in the current patch.

I think we want test coverage for at least call 1234, calll %pcrel_lo(1234), call %pcrel_lo(foo), call %lo(1234), call %pcrel_lo(foo).

This revision now requires changes to proceed.Apr 23 2018, 7:49 AM
shiva0217 updated this revision to Diff 143688.Apr 24 2018, 1:43 AM

Update patch to address Alex's comments.

shiva0217 added inline comments.Apr 24 2018, 1:51 AM
lib/Target/RISCV/MCTargetDesc/RISCVMCExpr.cpp
35–36 ↗(On Diff #143221)

Hi Alex. The code path can be trigger by llvm-mc -show-encoding and I have added the testing in function-call.s.

test/MC/RISCV/function-call.s
8 ↗(On Diff #143221)

Hi Alex. I have added the new predicate call_target and relative parsing functions to verify call operand. Thanks for your comments to cover this.

asb added a comment.Apr 24 2018, 6:36 AM

Thanks for the updates Shiva. I've added a few more suggested changes, and I think we're almost there.

I think the patch description also needs to be updated to justify why the CALL->AUIPC+JALR instruction should be as late as RISCVMCCodeEmitter and can't be earlier (e.g. RISCVAsmPrinter).

lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
174–186 ↗(On Diff #143688)

When possible, the backend tries to use generic names for operand types and predicates. e.g. isSimm21. Lets make this isBareSymbol. this can be implemented as below:

bool isBareSymbol() const {
  int64_t Imm;
  RISCVMCExpr::VariantKind VK;
  // Must be of 'immediate' type but not a constant.
  if (!isImm() || evaluateConstantImm(Imm, VK))
    return false;
  return RISCVAsmParser::classifySymbolRef(getImm(), VK, Imm) &&
         VK == RISCVMCExpr::VK_RISCV_None;
}
495–499 ↗(On Diff #143688)

Duplicates addImmOperands. Delete.

725–730 ↗(On Diff #143688)
case Match_InvalidBareSymbol: {
  SMLoc ErrorLoc = ((RISCVOperand &)*Operands[ErrorInfo]).getStartLoc();
  return Error(ErrorLoc, "operand must be a bare symbol name");
}
lib/Target/RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp
89 ↗(On Diff #143688)

So -> meaning

90 ↗(On Diff #143688)

"It's ok because" -> "This is acceptable because".

103 ↗(On Diff #143688)

assert(Func.isExpr() && "Expected expression");

lib/Target/RISCV/RISCVInstrInfo.td
156–169 ↗(On Diff #143688)
def BareSymbol : AsmOperandClass {
  let Name = "BareSymbol";
  let RenderMethod = "addImmOperands";
  let DiagnosticType = "InvalidBareSymbol";
}

// A bare symbol.
def bare_symbol : Operand<XLenVT> {
  let ParserMatchClass = BareSymbol;
  let MCOperandPredicate = [{
     return MCOp.isBareSymbolRef();
  }];
}
shiva0217 updated this revision to Diff 143856.Apr 24 2018, 7:53 PM
shiva0217 edited the summary of this revision. (Show Details)

Update patch to address Alex's comments.

asb accepted this revision.Apr 25 2018, 3:54 AM

Thanks Shiva, this is looking good to me.

In the process of committing, it's probably worth adding a call bar to function-call.s with the same set of checks as for call foo just so there's coverage for a call to an external symbol.

This revision is now accepted and ready to land.Apr 25 2018, 3:54 AM
This revision was automatically updated to reflect the committed changes.