This is an archive of the discontinued LLVM Phabricator instance.

[BPF] Fix in/out argument constraints for CORE_MEM instructions
ClosedPublic

Authored by eddyz87 on Aug 12 2023, 6:16 PM.

Details

Summary

When LLVM is build with LLVM_ENABLE_EXPENSIVE_CHECKS=ON option the
following C code snippet:

struct t {
  int a;
} __attribute__((preserve_access_index));

void test(struct t *t) {
  t->a = 42;
}

Causes an assertion:

llvm
$ clang -g -O2 -c --target=bpf -mcpu=v2 t.c -o /dev/null

Function Live Ins: $r1 in %0

bb.0.entry:
  liveins: $r1
  DBG_VALUE $r1, $noreg, !"t", ...
  %0:gpr = COPY $r1
  DBG_VALUE %0:gpr, $noreg, !"t", ...
  %1:gpr = LD_imm64 @"llvm.t:0:0$0:0"
  %3:gpr = ADD_rr %0:gpr(tied-def 0), killed %1:gpr
  %4:gpr = MOV_ri 42
  CORE_MEM killed %4:gpr, 411, %0:gpr, @"llvm.t:0:0$0:0", ...
  RET debug-location !25; t.c:7:1

*** Bad machine code: Explicit definition marked as use ***
- function:    test
- basic block: %bb.0 entry (0x6210000d8a90)
- instruction: CORE_MEM killed %4:gpr, 411, %0:gpr, @"llvm.t:0:0$0:0", ...
- operand 0:   killed %4:gpr

This happens because CORE_MEM instruction is defined to have output
operands:

tablegen
  def CORE_MEM : TYPE_LD_ST<BPF_MEM.Value, BPF_W.Value,
                            (outs GPR:$dst),
                            (ins u64imm:$opcode, GPR:$src, u64imm:$offset),
                            "$dst = core_mem($opcode, $src, $offset)",
                            []>;

As documented in [1]:

By convention, the LLVM code generator orders instruction operands
so that all register definitions come before the register uses, even
on architectures that are normally printed in other orders.

In other words, the first argument for CORE_MEM is considered to be
a "def", while in reality it is "use":

llvm
  %1:gpr = LD_imm64 @"llvm.t:0:0$0:0"
  %3:gpr = ADD_rr %0:gpr(tied-def 0), killed %1:gpr
  %4:gpr = MOV_ri 42
   '---------------.
                   v
  CORE_MEM killed %4:gpr, 411, %0:gpr, @"llvm.t:0:0$0:0", ...

Here is how CORE_MEM is constructed in
BPFMISimplifyPatchable::checkADDrr():

cpp
    BuildMI(*DefInst->getParent(), *DefInst, DefInst->getDebugLoc(), TII->get(COREOp))
        .add(DefInst->getOperand(0)).addImm(Opcode).add(*BaseOp)
        .addGlobalAddress(GVal);

Note that first operand is constructed as .add(DefInst->getOperand(0)).

For LD{D,W,H,B} instructions the DefInst->getOperand(0) is a
destination register of a load, so instruction is constructed in
accordance with outs declaration.

For ST{D,W,H,B} instructions the DefInst->getOperand(0) is a
source register of a store (value to be stored), so instruction
violates the outs declaration.

This commit fixes the issue by splitting CORE_MEM in three
instructions: CORE_ST, CORE_LD64, CORE_LD32 with correct outs
specifications.

[1] https://llvm.org/docs/CodeGenerator.html#the-machineinstr-class

Diff Detail

Event Timeline

eddyz87 created this revision.Aug 12 2023, 6:16 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 12 2023, 6:16 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
eddyz87 published this revision for review.Aug 13 2023, 7:26 AM
eddyz87 added a reviewer: yonghong-song.

Hi Yonghong,

Could you please take a look at this revision?
All BPF Kernel selftests are passing and there are no changes in the generated object files (I checked for cpuv4).
This is the reason for testbot failures reported for D140804.
(The test case in D140804 was written in a way that triggers this failure).

Herald added a project: Restricted Project. · View Herald TranscriptAug 13 2023, 7:26 AM
eddyz87 planned changes to this revision.Aug 13 2023, 9:55 AM

Hold-on, I need to update this one to be more in line with changes necessary for D140804 functionality.

eddyz87 updated this revision to Diff 549735.Aug 13 2023, 11:44 AM
eddyz87 edited the summary of this revision. (Show Details)

It is possible to avoid separate versions for CORE_ST32 and CORE_ST64, merging these simplifies implementation for D140804.

Hi @yonghong-song,

I have finished changes for this revision, please take a look.

Your approach seems fine and probably better esp. for CORE for stores. Another solution is similar to D157805 where we can clear 'killed' for the 'out' parameter, right?

... Another solution is similar to D157805 where we can clear 'killed' for the 'out' parameter, right?

The error is reported by the following code in the MachineVerifier.cpp:

void
MachineVerifier::visitMachineOperand(const MachineOperand *MO, unsigned MONum) {
  const MachineInstr *MI = MO->getParent();
  const MCInstrDesc &MCID = MI->getDesc();
  unsigned NumDefs = MCID.getNumDefs();
  if (MCID.getOpcode() == TargetOpcode::PATCHPOINT)
    NumDefs = (MONum == 0 && MO->isReg()) ? NumDefs : 0;

  // The first MCID.NumDefs operands must be explicit register defines
  if (MONum < NumDefs) {
    const MCOperandInfo &MCOI = MCID.operands()[MONum];
    if (!MO->isReg())
      report("Explicit definition must be a register", MO, MONum);
    else if (!MO->isDef() && !MCOI.isOptionalDef())
      report("Explicit definition marked as use", MO, MONum);  <---- error is reported here
    else if (MO->isImplicit())
      report("Explicit definition marked as implicit", MO, MONum);
  } else if (...) {
    ...
  }
  ...
}

As far as I understand this code:

  • MCID is a static information about instruction, generated from the td file;
  • MCID.getNumDefs() is the number of values specified in the outs part of the pattern (e.g. for CORE_MEM outs defined as (outs GPR:$dst), hence getNumDefs() returns 1);
  • to satisfy the above check on the machine operand level the following change is needed:
@@ -146,8 +147,11 @@ void BPFMISimplifyPatchable::checkADDrr(MachineRegisterInfo *MRI,
         continue;
     }
 
+    MachineOperand &FirstMO = DefInst->getOperand(0);
+    FirstMO.setIsKill(false); // Can't setIsDef(true) if kill is set
+    FirstMO.setIsDef(true);
     BuildMI(*DefInst->getParent(), *DefInst, DefInst->getDebugLoc(), TII->get(COREOp))
-        .add(DefInst->getOperand(0)).addImm(Opcode).add(*BaseOp)
+        .add(FirstMO).addImm(Opcode).add(*BaseOp)
         .addGlobalAddress(GVal);
     DefInst->eraseFromParent();
   }

But this is not conservative for stores (as store does not define it's first operand) and also leads to incorrect SSA form, which is reported by the MachineVerifier:

bb.0.entry:
  liveins: $r1
  DBG_VALUE $r1, $noreg, !"t", !DIExpression(), debug-location !18; t.c:0 line no:5
  %0:gpr = COPY $r1
  DBG_VALUE %0:gpr, $noreg, !"t", !DIExpression(), debug-location !18; t.c:0 line no:5
  %1:gpr = LD_imm64 @"llvm.t:0:0$0:0"
  %3:gpr = ADD_rr %0:gpr(tied-def 0), killed %1:gpr
  %4:gpr = MOV_ri 42
  %4:gpr = CORE_MEM 411, %0:gpr, @"llvm.t:0:0$0:0", debug-location !19; t.c:6:8
  RET debug-location !25; t.c:7:1

# End machine code for function test.

*** Bad machine code: Multiple virtual register defs in SSA form ***
- function:    test
- basic block: %bb.0 entry (0x621000068290)
- instruction: %4:gpr = MOV_ri 42
- operand 0:   %4:gpr

*** Bad machine code: Multiple virtual register defs in SSA form ***
- function:    test
- basic block: %bb.0 entry (0x621000068290)
- instruction: %4:gpr = CORE_MEM 411, %0:gpr, @"llvm.t:0:0$0:0", debug-location !19; t.c:6:8
- operand 0:   %4:gpr

Which complains about two definitions for %4.

yonghong-song accepted this revision.Aug 13 2023, 4:02 PM

Okay, I see. The key is Multiple virtual register defs in SSA form and the resulted IR is broken w.r.t. SSA.
Ya, I mentioned earlier, indeed your change is good and new IR is much more intuitive from data flow perspective.

This revision is now accepted and ready to land.Aug 13 2023, 4:02 PM