This is an archive of the discontinued LLVM Phabricator instance.

Differential D156559

[BPF] Consolidate 32-bit and 64-bit LDX/STX operations
Draft · Public

Authored by eddyz87 on Jul 28 2023, 9:50 AM.

This is a draft revision that has not yet been submitted for review.

Details

Reviewers: None

Summary

In the BPF instruction set [1], load and store instructions are the same
for both ALU and ALU64 modes. The current STW/STW32, STH/STH32, STB/STB32,
LDW/LDW32, LDH/LDH32 and LDB/LDB32 instruction pairs have identical
byte-code representations.

However, at the assembly level the current BPF backend uses different
syntax for the two members of each pair, e.g.:

in -mcpu=v2 mode: r0 = *(u32 *)(r1 + 42)
  vs
in -mcpu=v3 mode: w0 = *(u32 *)(r1 + 42)

This discrepancy is discussed in a recent LKML mail thread [2].
This commit removes the ST*32/LD*32 instructions, replacing them with
the 64-bit versions combined with EXTRACT_SUBREG where appropriate,
thus removing the notational difference.

E.g. for the code snippet below:

void bar(unsigned int *a, unsigned int *b) { *a = *b; }

The output of clang -O2 -mcpu=v3 --target=bpf -S -o - t.c changes as
follows:

before                   after
------                   -----
w2 = *(u32 *)(r2 + 0)    r2 = *(u32 *)(r2 + 0)
*(u32 *)(r1 + 0) = w2    *(u32 *)(r1 + 0) = r2

[1] https://www.kernel.org/doc/html/latest/bpf/instruction-set.html#load-and-store-instructions
[2] https://lore.kernel.org/bpf/87ila7dhmp.fsf@oracle.com/
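
To illustrate the byte-code point made in the summary, here is a minimal C++ sketch that encodes a register load or store following the layout described in [1]. It is not taken from the patch: the helper name encodeLdSt and the constant names are assumptions made for this example, a little-endian layout is assumed, and only the opcode values come from the instruction-set document. The register fields hold plain numbers, so nothing in the encoding could distinguish w0 from r0.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Opcode fields as listed in the BPF instruction-set document [1].
constexpr uint8_t BPF_LDX = 0x01; // class: load from memory into a register
constexpr uint8_t BPF_STX = 0x03; // class: store a register to memory
constexpr uint8_t BPF_W   = 0x00; // size: 32-bit word
constexpr uint8_t BPF_MEM = 0x60; // mode: plain load/store

// Hypothetical helper, not an LLVM API: encodes one 8-byte load/store
// instruction in little-endian layout. Registers are plain numbers,
// so "w0" and "r0" are indistinguishable at this level.
std::array<uint8_t, 8> encodeLdSt(uint8_t Class, uint8_t Size, unsigned Dst,
                                  unsigned Src, int16_t Off) {
  std::array<uint8_t, 8> Insn{}; // imm (bytes 4..7) stays zero
  const uint16_t UOff = static_cast<uint16_t>(Off);
  Insn[0] = static_cast<uint8_t>(Class | Size | BPF_MEM);
  Insn[1] = static_cast<uint8_t>(((Src & 0xf) << 4) | (Dst & 0xf));
  Insn[2] = static_cast<uint8_t>(UOff & 0xff);
  Insn[3] = static_cast<uint8_t>(UOff >> 8);
  return Insn;
}

// Checks the encoding of the load from the summary: *(u32 *)(r1 + 42)
// read into register 0, whether that register is spelled r0 or w0.
void checkSummaryExample() {
  const std::array<uint8_t, 8> Expected = {0x61, 0x10, 0x2a, 0, 0, 0, 0, 0};
  assert(encodeLdSt(BPF_LDX, BPF_W, /*Dst=*/0, /*Src=*/1, 42) == Expected);
}
```

For a store, the base register goes in the destination field: encodeLdSt(BPF_STX, BPF_W, 1, 2, 0) gives the bytes for *(u32 *)(r1 + 0) = r2, which begin with 0x63 0x21.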

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

eddyz87 created this revision. Jul 28 2023, 9:50 AM

Herald added a project: Restricted Project. Jul 28 2023, 9:50 AM

Herald added a subscriber: hiraditya.

Harbormaster completed remote builds in B248881: Diff 545208. Jul 28 2023, 1:16 PM

yonghong-song added a subscriber: yonghong-song. Jul 28 2023, 4:23 PM

yonghong-song added inline comments.

llvm/test/MC/BPF/load-store-32.s
Line 8: If I understand correctly, the asm syntax `w5 = *(u8 *)(r0 + 0)` is no longer supported with this patch. That means that if users write inline asm in their code like ... w5 = *(u8 *)(r0 + 0) ..., the inline asm compiles successfully with llvm17/llvm16 etc. but will fail to compile with llvm18 (with this patch). Is this what we want?

eddyz87 added inline comments. Jul 30 2023, 5:14 AM

llvm/test/MC/BPF/load-store-32.s

You are correct. I'm trying to fix this using InstAlias. So far I have the incantation below, which works but is kind of ugly:

foreach I = 0-11 in {
  def : InstAlias<!strconcat("w"#I, " = *(u8 *)($src $offset)"),
                  (LDB !cast<Ri>("R"#I), GPR:$src, i16imm:$offset)>;
  def : InstAlias<!strconcat("w"#I, " = *(u16 *)($src $offset)"),
                  (LDH !cast<Ri>("R"#I), GPR:$src, i16imm:$offset)>;
  def : InstAlias<!strconcat("w"#I, " = *(u32 *)($src $offset)"),
                  (LDW !cast<Ri>("R"#I), GPR:$src, i16imm:$offset)>;
  def : InstAlias<!strconcat("*(u8 *)($dst $offset) = ", "w"#I),
                  (STB !cast<Ri>("R"#I), GPR:$dst, i16imm:$offset)>;
  def : InstAlias<!strconcat("*(u16 *)($dst $offset) = ", "w"#I),
                  (STH !cast<Ri>("R"#I), GPR:$dst, i16imm:$offset)>;
  def : InstAlias<!strconcat("*(u32 *)($dst $offset) = ", "w"#I),
                  (STW !cast<Ri>("R"#I), GPR:$dst, i16imm:$offset)>;
}

I'd prefer to have something like the following instead:

def : InstAlias<"$dst = *(u8 *)($src $offset)",
  (LDB (Reg32To64 GPR32:$dst).Reg64, GPR:$src, i16imm:$offset)>;

But I can't figure out how to define Reg32To64 at the moment. I'm still trying to figure it out, so if you have a suggestion, please share.
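
The foreach incantation in this comment can be read as a plain loop over register indices. The C++ sketch below is a hypothetical model of that expansion for the load case. It is neither TableGen nor part of the patch, and expandLoadAliases is an invented name. It pairs each w-register spelling with the 64-bit register instruction the alias would map to.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the TableGen foreach: for every register index,
// pair the 32-bit assembly spelling with the instruction the alias targets.
std::vector<std::pair<std::string, std::string>>
expandLoadAliases(const std::string &Width, const std::string &Opcode) {
  std::vector<std::pair<std::string, std::string>> Out;
  for (int I = 0; I <= 11; ++I) { // "0-11" in TableGen is an inclusive range
    const std::string N = std::to_string(I);
    Out.emplace_back("w" + N + " = *(" + Width + " *)($src $offset)",
                     Opcode + " R" + N + ", GPR:$src, i16imm:$offset");
  }
  return Out;
}

// The u8 load aliases cover w0 through w11, one alias per register.
void checkLoadAliases() {
  const auto Aliases = expandLoadAliases("u8", "LDB");
  assert(Aliases.size() == 12);
  assert(Aliases.front().first == "w0 = *(u8 *)($src $offset)");
}
```

Each alias only changes how the destination register is spelled; the instruction it resolves to is the ordinary 64-bit LDB/LDH/LDW.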

Use InstAlias to preserve w0 = *(u8*)(r1 + 2) syntax.

Harbormaster completed remote builds in B249097: Diff 545492. Jul 30 2023, 6:15 PM

Mark zero-extension as free for LOAD instructions, see is_zext_free2.ll.
Use INSERT_SUBREG instead of SUBREG_TO_REG for ST{B,H,W} GPR32 patterns.
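
The two changelog lines above rest on how sub-64-bit loads and stores treat the upper register bits. The following is a minimal C++ model, assuming a little-endian target and using the invented helper names loadZext and storeTrunc. A u8/u16/u32 load leaves the upper bits of the 64-bit destination zero, so a following zero-extension does nothing. A narrow store writes only the low bytes, so whatever sits in the upper half of the source register (IMPLICIT_DEF in the new patterns) is never observed.

```cpp
#include <cassert>
#include <cstdint>

// Models `dst = *(uN *)(addr)` for N = 8, 16 or 32 (Bytes = 1, 2 or 4):
// the loaded bytes fill the low part of the 64-bit register, the rest is zero.
uint64_t loadZext(const uint8_t *Mem, unsigned Bytes) {
  uint64_t Reg = 0;
  for (unsigned I = 0; I < Bytes; ++I)
    Reg |= static_cast<uint64_t>(Mem[I]) << (8 * I);
  return Reg;
}

// Models `*(uN *)(addr) = src`: only the low Bytes bytes of the register
// reach memory, so the upper register bits are never observed.
void storeTrunc(uint8_t *Mem, uint64_t Reg, unsigned Bytes) {
  for (unsigned I = 0; I < Bytes; ++I)
    Mem[I] = static_cast<uint8_t>(Reg >> (8 * I));
}

// An explicit zero-extension of a freshly loaded u32 changes nothing.
void checkZextIsNoOp() {
  const uint8_t Mem[4] = {0xff, 0xff, 0xff, 0xff};
  const uint64_t Reg = loadZext(Mem, 4);
  assert(static_cast<uint64_t>(static_cast<uint32_t>(Reg)) == Reg);
}
```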

Harbormaster completed remote builds in B250653: Diff 547618. Aug 6 2023, 6:24 PM

Added isSmallLoad() check for BPFMIPeephole::eliminateZExtSeq() to eliminate unnecessary ZEXT after CORE instructions.
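
A simplified model of the isSmallLoad() check may help here. It uses stand-in types rather than LLVM's MachineInstr, and the Opcode enum and Instr struct are invented for this sketch. In the patch, CORE_MEM carries the opcode of the load or store it replaced as an immediate operand, so the check looks through CORE_MEM to that wrapped opcode.

```cpp
#include <cassert>

// Hypothetical stand-ins for a few BPF opcodes; not the LLVM enum values.
enum Opcode : unsigned { LDB, LDH, LDW, LDD, CORE_MEM, MOV_rr };

// Hypothetical stand-in for a machine instruction. Wrapped is only
// meaningful when Op is CORE_MEM (operand 1 in the patch).
struct Instr {
  Opcode Op;
  Opcode Wrapped;
};

// True for 8/16/32-bit loads, including ones hidden behind CORE_MEM.
bool isSmallLoad(const Instr &MI) {
  const Opcode Effective = MI.Op == CORE_MEM ? MI.Wrapped : MI.Op;
  return Effective == LDB || Effective == LDH || Effective == LDW;
}

// A CORE_MEM wrapping a 16-bit load counts; one wrapping a 64-bit load does not.
void checkCoreMemUnwrapping() {
  assert(isSmallLoad(Instr{CORE_MEM, LDH}));
  assert(!isSmallLoad(Instr{CORE_MEM, LDD}));
}
```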

Harbormaster completed remote builds in B252367: Diff 549952. Aug 14 2023, 10:23 AM

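Revision Contents

Path                                                    Size
----                                                    ----
llvm/lib/Target/BPF/BPFISelLowering.h                   1 line
llvm/lib/Target/BPF/BPFISelLowering.cpp                 12 lines
llvm/lib/Target/BPF/BPFInstrInfo.cpp                    4 lines
llvm/lib/Target/BPF/BPFInstrInfo.td                     167 lines
llvm/lib/Target/BPF/BPFMIPeephole.cpp                   21 lines
llvm/lib/Target/BPF/BPFMISimplifyPatchable.cpp          10 lines
llvm/lib/Target/BPF/BPFRegisterInfo.td                  8 lines
llvm/lib/Target/BPF/BTFDebug.cpp                        2 lines
llvm/lib/Target/BPF/Disassembler/BPFDisassembler.cpp    3 lines
llvm/test/CodeGen/BPF/32-bit-subreg-load-store.ll       12 lines
llvm/test/CodeGen/BPF/CORE/offset-reloc-end-load.ll     7 lines
llvm/test/CodeGen/BPF/CORE/small-core-load.ll           118 lines
llvm/test/CodeGen/BPF/assembler-disassembler.s          20 lines
llvm/test/CodeGen/BPF/disassemble-mcpu-v3.s             2 lines
llvm/test/CodeGen/BPF/is_trunc_free.ll                  2 lines
llvm/test/CodeGen/BPF/is_zext_free2.ll                  22 lines
llvm/test/CodeGen/BPF/ldsx.ll                           2 lines
llvm/test/CodeGen/BPF/memcmp.ll                         2 lines
llvm/test/CodeGen/BPF/remove_truncate_7.ll              4 lines
llvm/test/CodeGen/BPF/rodata_5.ll                       4 lines
llvm/test/MC/BPF/insn-unit.s                            18 lines
llvm/test/MC/BPF/load-store-32.s                        16 lines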

Diff 549952

llvm/lib/Target/BPF/BPFISelLowering.h

[138 lines hidden; context: private:]
   // type Ty1 to type Ty2. e.g. On BPF at alu32 mode, it's free to truncate
   // a i64 value in register R1 to i32 by referencing its sub-register W1.
   bool isTruncateFree(Type *Ty1, Type *Ty2) const override;
   bool isTruncateFree(EVT VT1, EVT VT2) const override;

   // For 32bit ALU result zext to 64bit is free.
   bool isZExtFree(Type *Ty1, Type *Ty2) const override;
   bool isZExtFree(EVT VT1, EVT VT2) const override;
+  bool isZExtFree(SDValue Val, EVT VT2) const override;

   unsigned EmitSubregExt(MachineInstr &MI, MachineBasicBlock *BB, unsigned Reg,
                          bool isSigned) const;

   MachineBasicBlock * EmitInstrWithCustomInserterMemcpy(MachineInstr &MI,
                                                         MachineBasicBlock *BB)
                                                         const;

 };
 }

 #endif

llvm/lib/Target/BPF/BPFISelLowering.cpp

[218 lines hidden]
 bool BPFTargetLowering::isZExtFree(EVT VT1, EVT VT2) const {
   if (!getHasAlu32() || !VT1.isInteger() || !VT2.isInteger())
     return false;
   unsigned NumBits1 = VT1.getSizeInBits();
   unsigned NumBits2 = VT2.getSizeInBits();
   return NumBits1 == 32 && NumBits2 == 64;
 }

+bool BPFTargetLowering::isZExtFree(SDValue Val, EVT VT2) const {
+  EVT VT1 = Val.getValueType();
+  if (Val.getOpcode() == ISD::LOAD && VT1.isSimple() && VT2.isSimple()) {
+    MVT MT1 = VT1.getSimpleVT().SimpleTy;
+    MVT MT2 = VT2.getSimpleVT().SimpleTy;
+    if ((MT1 == MVT::i8 || MT1 == MVT::i16 || MT1 == MVT::i32) &&
+        (MT2 == MVT::i32 || MT2 == MVT::i64))
+      return true;
+  }
+  return TargetLoweringBase::isZExtFree(Val, VT2);
+}
+
 BPFTargetLowering::ConstraintType
 BPFTargetLowering::getConstraintType(StringRef Constraint) const {
   if (Constraint.size() == 1) {
     switch (Constraint[0]) {
     default:
       break;
     case 'w':
       return C_RegisterClass;
[672 lines hidden]

llvm/lib/Target/BPF/BPFInstrInfo.cpp

[131 lines hidden; context: if (I != MBB.end())]
     DL = I->getDebugLoc();

   if (RC == &BPF::GPRRegClass)
     BuildMI(MBB, I, DL, get(BPF::STD))
         .addReg(SrcReg, getKillRegState(IsKill))
         .addFrameIndex(FI)
         .addImm(0);
   else if (RC == &BPF::GPR32RegClass)
-    BuildMI(MBB, I, DL, get(BPF::STW32))
+    BuildMI(MBB, I, DL, get(BPF::STW))
         .addReg(SrcReg, getKillRegState(IsKill))
         .addFrameIndex(FI)
         .addImm(0);
   else
     llvm_unreachable("Can't store this register to stack slot");
 }

 void BPFInstrInfo::loadRegFromStackSlot(MachineBasicBlock &MBB,
                                         MachineBasicBlock::iterator I,
                                         Register DestReg, int FI,
                                         const TargetRegisterClass *RC,
                                         const TargetRegisterInfo *TRI,
                                         Register VReg) const {
   DebugLoc DL;
   if (I != MBB.end())
     DL = I->getDebugLoc();

   if (RC == &BPF::GPRRegClass)
     BuildMI(MBB, I, DL, get(BPF::LDD), DestReg).addFrameIndex(FI).addImm(0);
   else if (RC == &BPF::GPR32RegClass)
-    BuildMI(MBB, I, DL, get(BPF::LDW32), DestReg).addFrameIndex(FI).addImm(0);
+    BuildMI(MBB, I, DL, get(BPF::LDW), DestReg).addFrameIndex(FI).addImm(0);
   else
     llvm_unreachable("Can't load this register from stack slot");
 }

 bool BPFInstrInfo::analyzeBranch(MachineBasicBlock &MBB,
                                  MachineBasicBlock *&TBB,
                                  MachineBasicBlock *&FBB,
                                  SmallVectorImpl<MachineOperand> &Cond,
[92 lines hidden]

llvm/lib/Target/BPF/BPFInstrInfo.td

[425 lines hidden; context: def LD_pseudo]
   let Inst{51-48} = dst;
   let Inst{55-52} = pseudo;
   let Inst{47-32} = 0;
   let Inst{31-0} = imm{31-0};
   let BPFClass = BPF_LD;
 }

 // STORE instructions
-class STORE<BPFWidthModifer SizeOp, string OpcodeStr, list<dag> Pattern>
+class STORE_reg<BPFWidthModifer SizeOp, string OpcodeStr, PatFrag OpNode>
     : TYPE_LD_ST<BPF_MEM.Value, SizeOp.Value,
                  (outs),
                  (ins GPR:$src, MEMri:$addr),
                  "*("#OpcodeStr#" *)($addr) = $src",
-                 Pattern> {
+                 [(OpNode GPR:$src, ADDRri:$addr)]> {
   bits<4> src;
   bits<20> addr;

   let Inst{51-48} = addr{19-16}; // base reg
   let Inst{55-52} = src;
   let Inst{47-32} = addr{15-0}; // offset
   let BPFClass = BPF_STX;
 }

-class STOREi64<BPFWidthModifer Opc, string OpcodeStr, PatFrag OpNode>
-    : STORE<Opc, OpcodeStr, [(OpNode i64:$src, ADDRri:$addr)]>;
+def STD : STORE_reg<BPF_DW, "u64", store>;
+def STW : STORE_reg<BPF_W, "u32", truncstorei32>;
+def STH : STORE_reg<BPF_H, "u16", truncstorei16>;
+def STB : STORE_reg<BPF_B, "u8" , truncstorei8>;
+
+def : Pat<(store i64:$src, ADDRri:$dst), (STD i64:$src, ADDRri:$dst)>;
+def : Pat<(truncstorei32 i64:$src, ADDRri:$dst), (STW i64:$src, ADDRri:$dst)>;
+def : Pat<(truncstorei16 i64:$src, ADDRri:$dst), (STH i64:$src, ADDRri:$dst)>;
+def : Pat<(truncstorei8 i64:$src, ADDRri:$dst), (STB i64:$src, ADDRri:$dst)>;

-let Predicates = [BPFNoALU32] in {
-  def STW : STOREi64<BPF_W, "u32", truncstorei32>;
-  def STH : STOREi64<BPF_H, "u16", truncstorei16>;
-  def STB : STOREi64<BPF_B, "u8", truncstorei8>;
+let Predicates = [BPFHasALU32] in {
+  def : Pat<(store GPR32:$src, ADDRri:$dst),
+            (STW (INSERT_SUBREG (i64 (IMPLICIT_DEF)), GPR32:$src, sub_32), ADDRri:$dst)>;
+  def : Pat<(truncstorei16 GPR32:$src, ADDRri:$dst),
+            (STH (INSERT_SUBREG (i64 (IMPLICIT_DEF)), GPR32:$src, sub_32), ADDRri:$dst)>;
+  def : Pat<(truncstorei8 GPR32:$src, ADDRri:$dst),
+            (STB (INSERT_SUBREG (i64 (IMPLICIT_DEF)), GPR32:$src, sub_32), ADDRri:$dst)>;
 }
-def STD : STOREi64<BPF_DW, "u64", store>;

 // LOAD instructions
-class LOAD<BPFWidthModifer SizeOp, BPFModeModifer ModOp, string OpcodeStr, list<dag> Pattern>
+class LOAD<BPFWidthModifer SizeOp, BPFModeModifer ModOp, string OpcodeStr, PatFrag OpNode>
     : TYPE_LD_ST<ModOp.Value, SizeOp.Value,
                  (outs GPR:$dst),
                  (ins MEMri:$addr),
                  "$dst = *("#OpcodeStr#" *)($addr)",
-                 Pattern> {
+                 [(set i64:$dst, (OpNode ADDRri:$addr))]> {
   bits<4> dst;
   bits<20> addr;

   let Inst{51-48} = dst;
   let Inst{55-52} = addr{19-16};
   let Inst{47-32} = addr{15-0};
   let BPFClass = BPF_LDX;
 }

-class LOADi64<BPFWidthModifer SizeOp, BPFModeModifer ModOp, string OpcodeStr, PatFrag OpNode>
-    : LOAD<SizeOp, ModOp, OpcodeStr, [(set i64:$dst, (OpNode ADDRri:$addr))]>;
+def LDD : LOAD<BPF_DW, BPF_MEM, "u64", load>;
+def LDW : LOAD<BPF_W, BPF_MEM, "u32", zextloadi32>;
+def LDH : LOAD<BPF_H, BPF_MEM, "u16", zextloadi16>;
+def LDB : LOAD<BPF_B, BPF_MEM, "u8", zextloadi8>;
+
+let Predicates = [BPFHasALU32] in {
+  def : Pat<(i32 (load ADDRri:$src)), (EXTRACT_SUBREG (LDW ADDRri:$src), sub_32)>;
+
+  def : Pat<(i32 (zextloadi16 ADDRri:$src)), (EXTRACT_SUBREG (LDH ADDRri:$src), sub_32)>;
+  def : Pat<(i32 (zextloadi8 ADDRri:$src)), (EXTRACT_SUBREG (LDB ADDRri:$src), sub_32)>;
+
+  def : Pat<(i32 (extloadi16 ADDRri:$src)), (EXTRACT_SUBREG (LDH ADDRri:$src), sub_32)>;
+  def : Pat<(i32 (extloadi8 ADDRri:$src)), (EXTRACT_SUBREG (LDB ADDRri:$src), sub_32)>;
+
+  def : Pat<(i64 (extloadi32 ADDRri:$src)), (LDW ADDRri:$src)>;
+  def : Pat<(i64 (extloadi16 ADDRri:$src)), (LDH ADDRri:$src)>;
+  def : Pat<(i64 (extloadi8 ADDRri:$src)), (LDB ADDRri:$src)>;
+
+  foreach P = AllRegs in {
+    def : InstAlias<P.R32#" = *(u8 *)($src $offset)", (LDB P.R64, GPR:$src, i16imm:$offset)>;
+    def : InstAlias<P.R32#" = *(u16 *)($src $offset)", (LDH P.R64, GPR:$src, i16imm:$offset)>;
+    def : InstAlias<P.R32#" = *(u32 *)($src $offset)", (LDW P.R64, GPR:$src, i16imm:$offset)>;
+    def : InstAlias<"*(u8 *)($dst $offset) = "#P.R32, (STB P.R64, GPR:$dst, i16imm:$offset)>;
+    def : InstAlias<"*(u16 *)($dst $offset) = "#P.R32, (STH P.R64, GPR:$dst, i16imm:$offset)>;
+    def : InstAlias<"*(u32 *)($dst $offset) = "#P.R32, (STW P.R64, GPR:$dst, i16imm:$offset)>;
+  }
+}
+
+let Predicates = [BPFHasLdsx] in {
+  def LDWSX : LOAD<BPF_W, BPF_MEMSX, "s32", sextloadi32>;
+  def LDHSX : LOAD<BPF_H, BPF_MEMSX, "s16", sextloadi16>;
+  def LDBSX : LOAD<BPF_B, BPF_MEMSX, "s8", sextloadi8>;
+}
+
+let Predicates = [BPFHasALU32, BPFHasLdsx] in {
+  def : Pat<(i32 (sextloadi8 ADDRri:$src)), (EXTRACT_SUBREG (LDBSX ADDRri:$src), sub_32)>;
+  def : Pat<(i32 (sextloadi16 ADDRri:$src)), (EXTRACT_SUBREG (LDHSX ADDRri:$src), sub_32)>;
+}

 let isCodeGenOnly = 1 in {
   def CORE_MEM : TYPE_LD_ST<BPF_MEM.Value, BPF_W.Value,
                             (outs GPR:$dst),
                             (ins u64imm:$opcode, GPR:$src, u64imm:$offset),
                             "$dst = core_mem($opcode, $src, $offset)",
                             []>;
-  def CORE_ALU32_MEM : TYPE_LD_ST<BPF_MEM.Value, BPF_W.Value,
-                                  (outs GPR32:$dst),
-                                  (ins u64imm:$opcode, GPR:$src, u64imm:$offset),
-                                  "$dst = core_alu32_mem($opcode, $src, $offset)",
-                                  []>;
   let Constraints = "$dst = $src" in {
     def CORE_SHIFT : ALU_RR<BPF_ALU64, BPF_LSH, 0,
                             (outs GPR:$dst),
                             (ins u64imm:$opcode, GPR:$src, u64imm:$offset),
                             "$dst = core_shift($opcode, $src, $offset)",
                             []>;
   }
 }

-let Predicates = [BPFNoALU32] in {
-  def LDW : LOADi64<BPF_W, BPF_MEM, "u32", zextloadi32>;
-  def LDH : LOADi64<BPF_H, BPF_MEM, "u16", zextloadi16>;
-  def LDB : LOADi64<BPF_B, BPF_MEM, "u8", zextloadi8>;
-}
-
-let Predicates = [BPFHasLdsx] in {
-  def LDWSX : LOADi64<BPF_W, BPF_MEMSX, "s32", sextloadi32>;
-  def LDHSX : LOADi64<BPF_H, BPF_MEMSX, "s16", sextloadi16>;
-  def LDBSX : LOADi64<BPF_B, BPF_MEMSX, "s8", sextloadi8>;
-}
-
-def LDD : LOADi64<BPF_DW, BPF_MEM, "u64", load>;
-
 class BRANCH<BPFJumpOp Opc, string OpcodeStr, list<dag> Pattern>
     : TYPE_ALU_JMP<Opc.Value, BPF_K.Value,
                    (outs),
                    (ins brtarget:$BrDst),
                    !strconcat(OpcodeStr, " $BrDst"),
                    Pattern> {
   bits<16> BrDst;

[461 lines hidden]
 // For i64 -> i32 truncation, use the 32-bit subregister directly.
 def : Pat<(i32 (trunc GPR:$src)),
           (i32 (EXTRACT_SUBREG GPR:$src, sub_32))>;

 // For i32 -> i64 anyext, we don't care about the high bits.
 def : Pat<(i64 (anyext GPR32:$src)),
           (INSERT_SUBREG (i64 (IMPLICIT_DEF)), GPR32:$src, sub_32)>;

-class STORE32<BPFWidthModifer SizeOp, string OpcodeStr, list<dag> Pattern>
-    : TYPE_LD_ST<BPF_MEM.Value, SizeOp.Value,
-                 (outs),
-                 (ins GPR32:$src, MEMri:$addr),
-                 "*("#OpcodeStr#" *)($addr) = $src",
-                 Pattern> {
-  bits<4> src;
-  bits<20> addr;
-
-  let Inst{51-48} = addr{19-16}; // base reg
-  let Inst{55-52} = src;
-  let Inst{47-32} = addr{15-0}; // offset
-  let BPFClass = BPF_STX;
-}
-
-class STOREi32<BPFWidthModifer Opc, string OpcodeStr, PatFrag OpNode>
-    : STORE32<Opc, OpcodeStr, [(OpNode i32:$src, ADDRri:$addr)]>;
-
-let Predicates = [BPFHasALU32], DecoderNamespace = "BPFALU32" in {
-  def STW32 : STOREi32<BPF_W, "u32", store>;
-  def STH32 : STOREi32<BPF_H, "u16", truncstorei16>;
-  def STB32 : STOREi32<BPF_B, "u8", truncstorei8>;
-}
-
-class LOAD32<BPFWidthModifer SizeOp, BPFModeModifer ModOp, string OpcodeStr, list<dag> Pattern>
-    : TYPE_LD_ST<ModOp.Value, SizeOp.Value,
-                 (outs GPR32:$dst),
-                 (ins MEMri:$addr),
-                 "$dst = *("#OpcodeStr#" *)($addr)",
-                 Pattern> {
-  bits<4> dst;
-  bits<20> addr;
-
-  let Inst{51-48} = dst;
-  let Inst{55-52} = addr{19-16};
-  let Inst{47-32} = addr{15-0};
-  let BPFClass = BPF_LDX;
-}
-
-class LOADi32<BPFWidthModifer SizeOp, BPFModeModifer ModOp, string OpcodeStr, PatFrag OpNode>
-    : LOAD32<SizeOp, ModOp, OpcodeStr, [(set i32:$dst, (OpNode ADDRri:$addr))]>;
-
-let Predicates = [BPFHasALU32], DecoderNamespace = "BPFALU32" in {
-  def LDW32 : LOADi32<BPF_W, BPF_MEM, "u32", load>;
-  def LDH32 : LOADi32<BPF_H, BPF_MEM, "u16", zextloadi16>;
-  def LDB32 : LOADi32<BPF_B, BPF_MEM, "u8", zextloadi8>;
-}
-
-let Predicates = [BPFHasALU32] in {
-  def : Pat<(truncstorei8 GPR:$src, ADDRri:$dst),
-            (STB32 (EXTRACT_SUBREG GPR:$src, sub_32), ADDRri:$dst)>;
-  def : Pat<(truncstorei16 GPR:$src, ADDRri:$dst),
-            (STH32 (EXTRACT_SUBREG GPR:$src, sub_32), ADDRri:$dst)>;
-  def : Pat<(truncstorei32 GPR:$src, ADDRri:$dst),
-            (STW32 (EXTRACT_SUBREG GPR:$src, sub_32), ADDRri:$dst)>;
-  def : Pat<(i32 (extloadi8 ADDRri:$src)), (i32 (LDB32 ADDRri:$src))>;
-  def : Pat<(i32 (extloadi16 ADDRri:$src)), (i32 (LDH32 ADDRri:$src))>;
-
-  let Predicates = [BPFHasLdsx] in {
-    def : Pat<(i32 (sextloadi8 ADDRri:$src)), (EXTRACT_SUBREG (LDBSX ADDRri:$src), sub_32)>;
-    def : Pat<(i32 (sextloadi16 ADDRri:$src)), (EXTRACT_SUBREG (LDHSX ADDRri:$src), sub_32)>;
-  }
-
-  def : Pat<(i64 (zextloadi8 ADDRri:$src)),
-            (SUBREG_TO_REG (i64 0), (LDB32 ADDRri:$src), sub_32)>;
-  def : Pat<(i64 (zextloadi16 ADDRri:$src)),
-            (SUBREG_TO_REG (i64 0), (LDH32 ADDRri:$src), sub_32)>;
-  def : Pat<(i64 (zextloadi32 ADDRri:$src)),
-            (SUBREG_TO_REG (i64 0), (LDW32 ADDRri:$src), sub_32)>;
-  def : Pat<(i64 (extloadi8 ADDRri:$src)),
-            (SUBREG_TO_REG (i64 0), (LDB32 ADDRri:$src), sub_32)>;
-  def : Pat<(i64 (extloadi16 ADDRri:$src)),
-            (SUBREG_TO_REG (i64 0), (LDH32 ADDRri:$src), sub_32)>;
-  def : Pat<(i64 (extloadi32 ADDRri:$src)),
-            (SUBREG_TO_REG (i64 0), (LDW32 ADDRri:$src), sub_32)>;
-}
-
 let usesCustomInserter = 1, isCodeGenOnly = 1 in {
   def MEMCPY : Pseudo<
     (outs),
     (ins GPR:$dst, GPR:$src, i64imm:$len, i64imm:$align, variable_ops),
     "#memcpy dst: $dst, src: $src, len: $len, align: $align",
     [(BPFmemcpy GPR:$dst, GPR:$src, imm:$len, imm:$align)]>;
 }

llvm/lib/Target/BPF/BPFMIPeephole.cpp

[85 lines hidden]
 // Initialize class variables.
 void BPFMIPeephole::initialize(MachineFunction &MFParm) {
   MF = &MFParm;
   MRI = &MF->getRegInfo();
   TII = MF->getSubtarget<BPFSubtarget>().getInstrInfo();
   LLVM_DEBUG(dbgs() << "* BPF MachineSSA ZEXT Elim peephole pass *\n\n");
 }

+static bool isSmallLoad(MachineInstr *MI) {
+  unsigned Opcode = MI->getOpcode() == BPF::CORE_MEM
+                        ? MI->getOperand(1).getImm()
+                        : MI->getOpcode();
+  return Opcode == BPF::LDB || Opcode == BPF::LDH || Opcode == BPF::LDW;
+}
+
 bool BPFMIPeephole::isCopyFrom32Def(MachineInstr *CopyMI)
 {
   MachineOperand &opnd = CopyMI->getOperand(1);

   if (!opnd.isReg())
     return false;

   // Return false if getting value from a 32bit physical register.
   // Most likely, this physical register is aliased to
   // function call return value or current function parameters.
   Register Reg = opnd.getReg();
   if (!Reg.isVirtual())
     return false;

+  MachineInstr *DefInsn = MRI->getVRegDef(Reg);
+
+  // 8/16/32-bit loads have GPRRegclass but there is no need to
+  // zero-extend values coming from such loads.
+  if (isSmallLoad(DefInsn))
+    return true;
+
   if (MRI->getRegClass(Reg) == &BPF::GPRRegClass)
     return false;

-  MachineInstr *DefInsn = MRI->getVRegDef(Reg);
   if (!isInsnFrom32Def(DefInsn))
     return false;

   return true;
 }

 bool BPFMIPeephole::isPhiFrom32Def(MachineInstr *PhiMI)
 {
[519 lines hidden; context: bool runOnMachineFunction(MachineFunction &MF) override {]

     return eliminateTruncSeq();
   }
 };

 static bool TruncSizeCompatible(int TruncSize, unsigned opcode)
 {
   if (TruncSize == 1)
-    return opcode == BPF::LDB || opcode == BPF::LDB32;
+    return opcode == BPF::LDB;

   if (TruncSize == 2)
-    return opcode == BPF::LDH || opcode == BPF::LDH32;
+    return opcode == BPF::LDH;

   if (TruncSize == 4)
-    return opcode == BPF::LDW || opcode == BPF::LDW32;
+    return opcode == BPF::LDW;

   return false;
 }

 // Initialize class variables.
 void BPFMIPeepholeTruncElim::initialize(MachineFunction &MFParm) {
   MF = &MFParm;
   MRI = &MF->getRegInfo();
[124 lines hidden]

llvm/lib/Target/BPF/BPFMISimplifyPatchable.cpp

[89 lines hidden]
 void BPFMISimplifyPatchable::initialize(MachineFunction &MFParm) {
   MF = &MFParm;
   TII = MF->getSubtarget<BPFSubtarget>().getInstrInfo();
   LLVM_DEBUG(dbgs() << "* BPF simplify patchable insts pass *\n\n");
 }

 bool BPFMISimplifyPatchable::isLoadInst(unsigned Opcode) {
   return Opcode == BPF::LDD || Opcode == BPF::LDW || Opcode == BPF::LDH ||
-         Opcode == BPF::LDB || Opcode == BPF::LDW32 || Opcode == BPF::LDH32 ||
-         Opcode == BPF::LDB32 || Opcode == BPF::LDWSX || Opcode == BPF::LDHSX ||
+         Opcode == BPF::LDB || Opcode == BPF::LDWSX || Opcode == BPF::LDHSX ||
          Opcode == BPF::LDBSX;
 }

 void BPFMISimplifyPatchable::checkADDrr(MachineRegisterInfo *MRI,
     MachineOperand *RelocOp, const GlobalValue *GVal) {
   const MachineInstr *Inst = RelocOp->getParent();
   const MachineOperand *Op1 = &Inst->getOperand(1);
   const MachineOperand *Op2 = &Inst->getOperand(2);
[10 lines hidden; context: for (MachineOperand &MO :]
     MachineInstr *DefInst = MO.getParent();
     unsigned Opcode = DefInst->getOpcode();
     unsigned COREOp;
     if (Opcode == BPF::LDB || Opcode == BPF::LDH || Opcode == BPF::LDW ||
         Opcode == BPF::LDD || Opcode == BPF::STB || Opcode == BPF::STH ||
         Opcode == BPF::STW || Opcode == BPF::STD || Opcode == BPF::LDWSX ||
         Opcode == BPF::LDHSX || Opcode == BPF::LDBSX)
       COREOp = BPF::CORE_MEM;
-    else if (Opcode == BPF::LDB32 || Opcode == BPF::LDH32 ||
-             Opcode == BPF::LDW32 || Opcode == BPF::STB32 ||
-             Opcode == BPF::STH32 || Opcode == BPF::STW32)
-      COREOp = BPF::CORE_ALU32_MEM;
     else
       continue;

     // It must be a form of %2 = *(type *)(%1 + 0) or *(type *)(%1 + 0) = %2.
     const MachineOperand &ImmOp = DefInst->getOperand(2);
     if (!ImmOp.isImm() || ImmOp.getImm() != 0)
       continue;

     // Reject the form:
     //   %1 = ADD_rr %2, %3
     //   *(type *)(%2 + 0) = %1
     if (Opcode == BPF::STB || Opcode == BPF::STH || Opcode == BPF::STW ||
-        Opcode == BPF::STD || Opcode == BPF::STB32 || Opcode == BPF::STH32 ||
-        Opcode == BPF::STW32) {
+        Opcode == BPF::STD) {
       const MachineOperand &Opnd = DefInst->getOperand(0);
       if (Opnd.isReg() && Opnd.getReg() == MO.getReg())
         continue;
     }

     BuildMI(*DefInst->getParent(), *DefInst, DefInst->getDebugLoc(), TII->get(COREOp))
         .add(DefInst->getOperand(0)).addImm(Opcode).add(*BaseOp)
         .addGlobalAddress(GVal);
[179 lines hidden]

llvm/lib/Target/BPF/BPFRegisterInfo.td

[29 lines hidden]

 foreach I = 0-11 in {
   // 32-bit Integer (alias to low part of 64-bit register).
   def W#I : Wi<I, "w"#I>, DwarfRegNum<[I]>;
   // 64-bit Integer registers
   def R#I : Ri<I, "r"#I, [!cast<Wi>("W"#I)]>, DwarfRegNum<[I]>;
 }

+class RegPair<Wi w, Ri r> {
+  Wi R32 = w;
+  Ri R64 = r;
+}
+
+defvar AllRegs =
+  !foreach(I, !range(0, 11), RegPair<!cast<Wi>("W"#I), !cast<Ri>("R"#I)>);
+
 // Register classes.
 def GPR32 : RegisterClass<"BPF", [i32], 64, (add
   (sequence "W%u", 1, 9),
   W0, // Return value
   W11, // Stack Ptr
   W10 // Frame Ptr
 )>;

 def GPR : RegisterClass<"BPF", [i64], 64, (add
   (sequence "R%u", 1, 9),
   R0, // Return value
   R11, // Stack Ptr
   R10 // Frame Ptr
 )>;
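
The AllRegs list in this file is built with !range(0, 11), while the register definitions use foreach I = 0-11. Assuming TableGen's documented semantics, where !range(a, b) is half-open and the 0-11 range form is inclusive, the two cover different index sets. The C++ sketch below models the half-open case with invented names (RegPair mirrors the TableGen class, makeRegPairs is hypothetical). Under that assumption AllRegs holds the pairs W0/R0 through W10/R10.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Mirrors the TableGen RegPair class: a 32-bit register name together
// with the 64-bit register it is the low half of.
struct RegPair {
  std::string R32;
  std::string R64;
};

// Hypothetical model of !foreach(I, !range(Begin, End), RegPair<...>),
// assuming !range yields the half-open sequence Begin, ..., End - 1.
std::vector<RegPair> makeRegPairs(int Begin, int End) {
  std::vector<RegPair> Pairs;
  for (int I = Begin; I < End; ++I)
    Pairs.push_back({"W" + std::to_string(I), "R" + std::to_string(I)});
  return Pairs;
}

// With the bounds used in the patch, the last pair is W10/R10.
void checkAllRegsBounds() {
  const std::vector<RegPair> AllRegs = makeRegPairs(0, 11);
  assert(AllRegs.size() == 11);
  assert(AllRegs.back().R32 == "W10" && AllRegs.back().R64 == "R10");
}
```

In this model, an upper bound of 12 would be needed for the list to include W11/R11. Whether that register should get the w-form aliases is not stated in the revision.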

llvm/lib/Target/BPF/BTFDebug.cpp

Show First 20 Lines • Show All 1,344 Lines • ▼ Show 20 Lines	if (MI->getOpcode() == BPF::LD_imm64) {
// Later, the insn is replaced with "r2 = <offset>"		// Later, the insn is replaced with "r2 = <offset>"
// where "<offset>" equals to the offset based on current		// where "<offset>" equals to the offset based on current
// type definitions.		// type definitions.
//		//
// If the insn is "r2 = LD_imm64 @<an TypeIdAttr global>",		// If the insn is "r2 = LD_imm64 @<an TypeIdAttr global>",
// The LD_imm64 result will be replaced with a btf type id.		// The LD_imm64 result will be replaced with a btf type id.
processGlobalValue(MI->getOperand(1));		processGlobalValue(MI->getOperand(1));
} else if (MI->getOpcode() == BPF::CORE_MEM \|\|		} else if (MI->getOpcode() == BPF::CORE_MEM \|\|
MI->getOpcode() == BPF::CORE_ALU32_MEM \|\|
MI->getOpcode() == BPF::CORE_SHIFT) {		MI->getOpcode() == BPF::CORE_SHIFT) {
// relocation insn is a load, store or shift insn.		// relocation insn is a load, store or shift insn.
processGlobalValue(MI->getOperand(3));		processGlobalValue(MI->getOperand(3));
} else if (MI->getOpcode() == BPF::JAL) {		} else if (MI->getOpcode() == BPF::JAL) {
// check extern function references		// check extern function references
const MachineOperand &MO = MI->getOperand(0);		const MachineOperand &MO = MI->getOperand(0);
if (MO.isGlobal()) {		if (MO.isGlobal()) {
processFuncPrototypes(dyn_cast<Function>(MO.getGlobal()));		processFuncPrototypes(dyn_cast<Function>(MO.getGlobal()));
▲ Show 20 Lines • Show All 163 Lines • ▼ Show 20 Lines	if (MO.isGlobal()) {
else		else
OutMI.setOpcode(BPF::MOV_ri);		OutMI.setOpcode(BPF::MOV_ri);
OutMI.addOperand(MCOperand::createReg(MI->getOperand(0).getReg()));		OutMI.addOperand(MCOperand::createReg(MI->getOperand(0).getReg()));
OutMI.addOperand(MCOperand::createImm(Imm));		OutMI.addOperand(MCOperand::createImm(Imm));
return true;		return true;
}		}
}		}
} else if (MI->getOpcode() == BPF::CORE_MEM ||		} else if (MI->getOpcode() == BPF::CORE_MEM ||
MI->getOpcode() == BPF::CORE_ALU32_MEM ||
MI->getOpcode() == BPF::CORE_SHIFT) {		MI->getOpcode() == BPF::CORE_SHIFT) {
const MachineOperand &MO = MI->getOperand(3);		const MachineOperand &MO = MI->getOperand(3);
if (MO.isGlobal()) {		if (MO.isGlobal()) {
const GlobalValue *GVal = MO.getGlobal();		const GlobalValue *GVal = MO.getGlobal();
auto *GVar = dyn_cast<GlobalVariable>(GVal);		auto *GVar = dyn_cast<GlobalVariable>(GVal);
if (GVar && GVar->hasAttribute(BPFCoreSharedInfo::AmaAttr)) {		if (GVar && GVar->hasAttribute(BPFCoreSharedInfo::AmaAttr)) {
uint32_t Imm = PatchImms[GVar].first;		uint32_t Imm = PatchImms[GVar].first;
OutMI.setOpcode(MI->getOperand(1).getImm());		OutMI.setOpcode(MI->getOperand(1).getImm());
▲ Show 20 Lines • Show All 97 Lines • Show Last 20 Lines

llvm/lib/Target/BPF/Disassembler/BPFDisassembler.cpp

Show First 20 Lines • Show All 170 Lines • ▼ Show 20 Lines	DecodeStatus BPFDisassembler::getInstruction(MCInst &Instr, uint64_t &Size,
DecodeStatus Result;		DecodeStatus Result;

Result = readInstruction64(Bytes, Address, Size, Insn, IsLittleEndian);		Result = readInstruction64(Bytes, Address, Size, Insn, IsLittleEndian);
if (Result == MCDisassembler::Fail) return MCDisassembler::Fail;		if (Result == MCDisassembler::Fail) return MCDisassembler::Fail;

uint8_t InstClass = getInstClass(Insn);		uint8_t InstClass = getInstClass(Insn);
uint8_t InstMode = getInstMode(Insn);		uint8_t InstMode = getInstMode(Insn);
if ((InstClass == BPF_LDX || InstClass == BPF_STX) &&		if ((InstClass == BPF_LDX || InstClass == BPF_STX) &&
getInstSize(Insn) != BPF_DW &&		InstMode == BPF_ATOMIC && getInstSize(Insn) != BPF_DW &&
(InstMode == BPF_MEM || InstMode == BPF_ATOMIC) &&
STI.hasFeature(BPF::ALU32))		STI.hasFeature(BPF::ALU32))
Result = decodeInstruction(DecoderTableBPFALU3264, Instr, Insn, Address,		Result = decodeInstruction(DecoderTableBPFALU3264, Instr, Insn, Address,
this, STI);		this, STI);
else		else
Result = decodeInstruction(DecoderTableBPF64, Instr, Insn, Address, this,		Result = decodeInstruction(DecoderTableBPF64, Instr, Insn, Address, this,
STI);		STI);

if (Result == MCDisassembler::Fail) return MCDisassembler::Fail;		if (Result == MCDisassembler::Fail) return MCDisassembler::Fail;
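The effect of the changed condition above can be sketched as a pair of predicates over the opcode byte. This is only an illustration, not code from the patch: the helper names `usesAlu32TableBefore` and `usesAlu32TableAfter` are hypothetical, and the field layout (class in bits 0-2, size in bits 3-4, mode in bits 5-7) is assumed from the BPF instruction set documentation.

```cpp
#include <cassert>
#include <cstdint>

// Assumed opcode-byte layout: mode[7:5] size[4:3] class[2:0].
enum : uint8_t { BPF_LDX = 0x1, BPF_STX = 0x3 };    // instruction class
enum : uint8_t { BPF_DW = 0x3 };                    // access size
enum : uint8_t { BPF_MEM = 0x3, BPF_ATOMIC = 0x6 }; // addressing mode

constexpr uint8_t instClass(uint8_t Op) { return Op & 0x7; }
constexpr uint8_t instSize(uint8_t Op) { return (Op >> 3) & 0x3; }
constexpr uint8_t instMode(uint8_t Op) { return (Op >> 5) & 0x7; }

// Hypothetical helper mirroring the old condition: plain and atomic
// non-64-bit loads/stores are decoded via the ALU32 table.
constexpr bool usesAlu32TableBefore(uint8_t Op, bool HasAlu32) {
  return (instClass(Op) == BPF_LDX || instClass(Op) == BPF_STX) &&
         instSize(Op) != BPF_DW &&
         (instMode(Op) == BPF_MEM || instMode(Op) == BPF_ATOMIC) && HasAlu32;
}

// Hypothetical helper mirroring the new condition: only atomic
// non-64-bit operations still go through the ALU32 table.
constexpr bool usesAlu32TableAfter(uint8_t Op, bool HasAlu32) {
  return (instClass(Op) == BPF_LDX || instClass(Op) == BPF_STX) &&
         instMode(Op) == BPF_ATOMIC && instSize(Op) != BPF_DW && HasAlu32;
}
```

Under these assumptions, opcode `0x61` (a 32-bit `BPF_LDX | BPF_MEM` load) selects the ALU32 table only with the old condition, while `0xc3` (a 32-bit atomic) selects it with both.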
Show All 36 Lines

llvm/test/CodeGen/BPF/32-bit-subreg-load-store.ll

	Show All 37 Lines
	; void storeu64(unsigned long long *p, unsigned long long v)			; void storeu64(unsigned long long *p, unsigned long long v)
	; {			; {
	; *p = v;			; *p = v;
	; }			; }
	; Function Attrs: norecurse nounwind readonly			; Function Attrs: norecurse nounwind readonly
	define dso_local zeroext i8 @loadu8(ptr nocapture readonly %p) local_unnamed_addr #0 {			define dso_local zeroext i8 @loadu8(ptr nocapture readonly %p) local_unnamed_addr #0 {
	entry:			entry:
	%0 = load i8, ptr %p, align 1			%0 = load i8, ptr %p, align 1
	; CHECK: w{{[0-9]+}} = *(u8 *)(r{{[0-9]+}} + 0)			; CHECK: r{{[0-9]+}} = *(u8 *)(r{{[0-9]+}} + 0)
	ret i8 %0			ret i8 %0
	}			}

	; Function Attrs: norecurse nounwind readonly			; Function Attrs: norecurse nounwind readonly
	define dso_local zeroext i16 @loadu16(ptr nocapture readonly %p) local_unnamed_addr #0 {			define dso_local zeroext i16 @loadu16(ptr nocapture readonly %p) local_unnamed_addr #0 {
	entry:			entry:
	%0 = load i16, ptr %p, align 2			%0 = load i16, ptr %p, align 2
	; CHECK: w{{[0-9]+}} = *(u16 *)(r{{[0-9]+}} + 0)			; CHECK: r{{[0-9]+}} = *(u16 *)(r{{[0-9]+}} + 0)
	ret i16 %0			ret i16 %0
	}			}

	; Function Attrs: norecurse nounwind readonly			; Function Attrs: norecurse nounwind readonly
	define dso_local i32 @loadu32(ptr nocapture readonly %p) local_unnamed_addr #0 {			define dso_local i32 @loadu32(ptr nocapture readonly %p) local_unnamed_addr #0 {
	entry:			entry:
	%0 = load i32, ptr %p, align 4			%0 = load i32, ptr %p, align 4
	; CHECK: w{{[0-9]+}} = *(u32 *)(r{{[0-9]+}} + 0)			; CHECK: r{{[0-9]+}} = *(u32 *)(r{{[0-9]+}} + 0)
	ret i32 %0			ret i32 %0
	}			}

	; Function Attrs: norecurse nounwind readonly			; Function Attrs: norecurse nounwind readonly
	define dso_local i64 @loadu64(ptr nocapture readonly %p) local_unnamed_addr #0 {			define dso_local i64 @loadu64(ptr nocapture readonly %p) local_unnamed_addr #0 {
	entry:			entry:
	%0 = load i64, ptr %p, align 8			%0 = load i64, ptr %p, align 8
	; CHECK: r{{[0-9]+}} = *(u64 *)(r{{[0-9]+}} + 0)			; CHECK: r{{[0-9]+}} = *(u64 *)(r{{[0-9]+}} + 0)
	ret i64 %0			ret i64 %0
	}			}

	; Function Attrs: norecurse nounwind			; Function Attrs: norecurse nounwind
	define dso_local void @storeu8(ptr nocapture %p, i64 %v) local_unnamed_addr #1 {			define dso_local void @storeu8(ptr nocapture %p, i64 %v) local_unnamed_addr #1 {
	entry:			entry:
	%conv = trunc i64 %v to i8			%conv = trunc i64 %v to i8
	store i8 %conv, ptr %p, align 1			store i8 %conv, ptr %p, align 1
	; CHECK: *(u8 *)(r{{[0-9]+}} + 0) = w{{[0-9]+}}			; CHECK: *(u8 *)(r{{[0-9]+}} + 0) = r{{[0-9]+}}
	ret void			ret void
	}			}

	; Function Attrs: norecurse nounwind			; Function Attrs: norecurse nounwind
	define dso_local void @storeu16(ptr nocapture %p, i64 %v) local_unnamed_addr #1 {			define dso_local void @storeu16(ptr nocapture %p, i64 %v) local_unnamed_addr #1 {
	entry:			entry:
	%conv = trunc i64 %v to i16			%conv = trunc i64 %v to i16
	store i16 %conv, ptr %p, align 2			store i16 %conv, ptr %p, align 2
	; CHECK: *(u16 *)(r{{[0-9]+}} + 0) = w{{[0-9]+}}			; CHECK: *(u16 *)(r{{[0-9]+}} + 0) = r{{[0-9]+}}
	ret void			ret void
	}			}

	; Function Attrs: norecurse nounwind			; Function Attrs: norecurse nounwind
	define dso_local void @storeu32(ptr nocapture %p, i64 %v) local_unnamed_addr #1 {			define dso_local void @storeu32(ptr nocapture %p, i64 %v) local_unnamed_addr #1 {
	entry:			entry:
	%conv = trunc i64 %v to i32			%conv = trunc i64 %v to i32
	store i32 %conv, ptr %p, align 4			store i32 %conv, ptr %p, align 4
	; CHECK: *(u32 *)(r{{[0-9]+}} + 0) = w{{[0-9]+}}			; CHECK: *(u32 *)(r{{[0-9]+}} + 0) = r{{[0-9]+}}
	ret void			ret void
	}			}

	; Function Attrs: norecurse nounwind			; Function Attrs: norecurse nounwind
	define dso_local void @storeu64(ptr nocapture %p, i64 %v) local_unnamed_addr #1 {			define dso_local void @storeu64(ptr nocapture %p, i64 %v) local_unnamed_addr #1 {
	entry:			entry:
	store i64 %v, ptr %p, align 8			store i64 %v, ptr %p, align 8
	; CHECK: *(u64 *)(r{{[0-9]+}} + 0) = r{{[0-9]+}}			; CHECK: *(u64 *)(r{{[0-9]+}} + 0) = r{{[0-9]+}}
	ret void			ret void
	}			}

llvm/test/CodeGen/BPF/CORE/offset-reloc-end-load.ll

	; RUN: opt -O2 %s | llvm-dis > %t1			; RUN: opt -O2 %s | llvm-dis > %t1
	; RUN: llc -filetype=asm -o - %t1 | FileCheck -check-prefixes=CHECK,CHECK-ALU64 %s			; RUN: llc -filetype=asm -o - %t1 | FileCheck %s
	; RUN: llc -mattr=+alu32 -filetype=asm -o - %t1 | FileCheck -check-prefixes=CHECK,CHECK-ALU32 %s			; RUN: llc -mattr=+alu32 -filetype=asm -o - %t1 | FileCheck %s
	;			;
	; Source Code:			; Source Code:
	; #define _(x) (__builtin_preserve_access_index(x))			; #define _(x) (__builtin_preserve_access_index(x))
	; struct s {int a; int b;};			; struct s {int a; int b;};
	; int test(struct s *arg) { return *(const int *)_(&arg->b); }			; int test(struct s *arg) { return *(const int *)_(&arg->b); }
	; Compiler flag to generate IR:			; Compiler flag to generate IR:
	; clang -target bpf -S -O2 -g -emit-llvm -Xclang -disable-llvm-passes test.c			; clang -target bpf -S -O2 -g -emit-llvm -Xclang -disable-llvm-passes test.c

	target triple = "bpf"			target triple = "bpf"

	%struct.s = type { i32, i32 }			%struct.s = type { i32, i32 }

	; Function Attrs: nounwind readonly			; Function Attrs: nounwind readonly
	define dso_local i32 @test(ptr readonly %arg) local_unnamed_addr #0 !dbg !11 {			define dso_local i32 @test(ptr readonly %arg) local_unnamed_addr #0 !dbg !11 {
	entry:			entry:
	call void @llvm.dbg.value(metadata ptr %arg, metadata !20, metadata !DIExpression()), !dbg !21			call void @llvm.dbg.value(metadata ptr %arg, metadata !20, metadata !DIExpression()), !dbg !21
	%0 = tail call ptr @llvm.preserve.struct.access.index.p0.p0.ss(ptr elementtype(%struct.s) %arg, i32 1, i32 1), !dbg !22, !llvm.preserve.access.index !15			%0 = tail call ptr @llvm.preserve.struct.access.index.p0.p0.ss(ptr elementtype(%struct.s) %arg, i32 1, i32 1), !dbg !22, !llvm.preserve.access.index !15
	%1 = load i32, ptr %0, align 4, !dbg !23, !tbaa !24			%1 = load i32, ptr %0, align 4, !dbg !23, !tbaa !24
	ret i32 %1, !dbg !28			ret i32 %1, !dbg !28
	}			}

	; CHECK-LABEL: test			; CHECK-LABEL: test
	; CHECK-ALU64: r0 = *(u32 *)(r1 + 4)			; CHECK: r0 = *(u32 *)(r1 + 4)
	; CHECK-ALU32: w0 = *(u32 *)(r1 + 4)
	; CHECK: exit			; CHECK: exit
	;			;
	; CHECK: .long 1 # BTF_KIND_STRUCT(id = 2)			; CHECK: .long 1 # BTF_KIND_STRUCT(id = 2)
	;			;
	; CHECK: .byte 115 # string offset=1			; CHECK: .byte 115 # string offset=1
	; CHECK: .ascii ".text" # string offset=20			; CHECK: .ascii ".text" # string offset=20
	; CHECK: .ascii "0:1" # string offset=26			; CHECK: .ascii "0:1" # string offset=26
	;			;
	▲ Show 20 Lines • Show All 51 Lines • Show Last 20 Lines

llvm/test/CodeGen/BPF/CORE/small-core-load.ll

This file was added.

				; RUN: llc -mtriple=bpfel -mcpu=v4 -filetype=obj < %s \
				; RUN: | llvm-objdump --no-show-raw-insn --no-addresses -d - \
				; RUN: | FileCheck %s

				; Check that BPFMIPeephole::eliminateZExtSeq() knows how to handle
				; "small" (32/16/8-bit) loads from CORE instructions.
				;
				; Generated from the following C code:
				; struct t {
				; unsigned char a;
				; } __attribute__((preserve_access_index));
				;
				; unsigned int foo(struct t *t, unsigned long b, unsigned long *p) {
				; unsigned int a;
				; if (b)
				; a = t->a;
				; else
				; a = 0;
				; *p = a;
				; return a;
				; }
				;
				; Using the following command:
				; clang -g -O2 -emit-llvm -S --target=bpf t.c -o t.ll

				@"llvm.t:0:0$0:0" = external global i64, !llvm.preserve.access.index !0 #0

				; Function Attrs: nofree nosync nounwind memory(read, argmem: readwrite, inaccessiblemem: none)
				define dso_local i32 @foo(ptr noundef readonly %t, i64 noundef %b, ptr nocapture noundef writeonly %p) local_unnamed_addr #1 !dbg !12 {
				entry:
				call void @llvm.dbg.value(metadata ptr %t, metadata !20, metadata !DIExpression()), !dbg !24
				call void @llvm.dbg.value(metadata i64 %b, metadata !21, metadata !DIExpression()), !dbg !24
				call void @llvm.dbg.value(metadata ptr %p, metadata !22, metadata !DIExpression()), !dbg !24
				%tobool.not = icmp eq i64 %b, 0, !dbg !25
				br i1 %tobool.not, label %if.end, label %if.then, !dbg !27

				if.then: ; preds = %entry
				%0 = load i64, ptr @"llvm.t:0:0$0:0", align 8
				%1 = getelementptr i8, ptr %t, i64 %0
				%2 = tail call ptr @llvm.bpf.passthrough.p0.p0(i32 0, ptr %1)
				%3 = load i8, ptr %2, align 1, !dbg !28, !tbaa !29
				%conv = zext i8 %3 to i32, !dbg !33
				call void @llvm.dbg.value(metadata i32 %conv, metadata !23, metadata !DIExpression()), !dbg !24
				br label %if.end, !dbg !34

				if.end: ; preds = %entry, %if.then
				%a.0 = phi i32 [ %conv, %if.then ], [ 0, %entry ], !dbg !35
				call void @llvm.dbg.value(metadata i32 %a.0, metadata !23, metadata !DIExpression()), !dbg !24
				%conv1 = zext i32 %a.0 to i64, !dbg !36
				store i64 %conv1, ptr %p, align 8, !dbg !37, !tbaa !38
				ret i32 %a.0, !dbg !40
				}

				; CHECK: <foo>:
				; CHECK-NEXT: w0 = 0x0
				; CHECK-NEXT: if r2 == 0x0 goto +0x1 <[[L:.*]]>
				; CHECK-NEXT: r0 = *(u8 *)(r1 + 0x0)
				; CHECK-EMPTY:
				; CHECK-NEXT: <[[L]]>:
				; CHECK-NEXT: *(u64 *)(r3 + 0x0) = r0
				; CHECK-NEXT: exit

				; Function Attrs: nofree nosync nounwind memory(none)
				declare ptr @llvm.bpf.passthrough.p0.p0(i32, ptr) #2

				; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare void @llvm.dbg.value(metadata, metadata, metadata) #3

				attributes #0 = { "btf_ama" }
				attributes #1 = { nofree nosync nounwind memory(read, argmem: readwrite, inaccessiblemem: none) "frame-pointer"="all" "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #2 = { nofree nosync nounwind memory(none) }
				attributes #3 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }

				!llvm.dbg.cu = !{!5}
				!llvm.module.flags = !{!6, !7, !8, !9, !10}
				!llvm.ident = !{!11}

				!0 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "t", file: !1, line: 1, size: 8, elements: !2)
				!1 = !DIFile(filename: "t.c", directory: "/home/eddy/work/tmp", checksumkind: CSK_MD5, checksum: "6232d59853f85f13ad6bb49cfe4de63d")
				!2 = !{!3}
				!3 = !DIDerivedType(tag: DW_TAG_member, name: "a", scope: !0, file: !1, line: 2, baseType: !4, size: 8)
				!4 = !DIBasicType(name: "unsigned char", size: 8, encoding: DW_ATE_unsigned_char)
				!5 = distinct !DICompileUnit(language: DW_LANG_C11, file: !1, producer: "clang version 18.0.0 (/home/eddy/work/llvm-project/clang cf42dc00d29d1b1cc97262051fef95237e9c2fe3)", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, splitDebugInlining: false, nameTableKind: None)
				!6 = !{i32 7, !"Dwarf Version", i32 5}
				!7 = !{i32 2, !"Debug Info Version", i32 3}
				!8 = !{i32 1, !"wchar_size", i32 4}
				!9 = !{i32 7, !"frame-pointer", i32 2}
				!10 = !{i32 7, !"debug-info-assignment-tracking", i1 true}
				!11 = !{!"clang version 18.0.0 (/home/eddy/work/llvm-project/clang cf42dc00d29d1b1cc97262051fef95237e9c2fe3)"}
				!12 = distinct !DISubprogram(name: "foo", scope: !1, file: !1, line: 5, type: !13, scopeLine: 5, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !5, retainedNodes: !19)
				!13 = !DISubroutineType(types: !14)
				!14 = !{!15, !16, !17, !18}
				!15 = !DIBasicType(name: "unsigned int", size: 32, encoding: DW_ATE_unsigned)
				!16 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !0, size: 64)
				!17 = !DIBasicType(name: "unsigned long", size: 64, encoding: DW_ATE_unsigned)
				!18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !17, size: 64)
				!19 = !{!20, !21, !22, !23}
				!20 = !DILocalVariable(name: "t", arg: 1, scope: !12, file: !1, line: 5, type: !16)
				!21 = !DILocalVariable(name: "b", arg: 2, scope: !12, file: !1, line: 5, type: !17)
				!22 = !DILocalVariable(name: "p", arg: 3, scope: !12, file: !1, line: 5, type: !18)
				!23 = !DILocalVariable(name: "a", scope: !12, file: !1, line: 6, type: !15)
				!24 = !DILocation(line: 0, scope: !12)
				!25 = !DILocation(line: 7, column: 7, scope: !26)
				!26 = distinct !DILexicalBlock(scope: !12, file: !1, line: 7, column: 7)
				!27 = !DILocation(line: 7, column: 7, scope: !12)
				!28 = !DILocation(line: 8, column: 12, scope: !26)
				!29 = !{!30, !31, i64 0}
				!30 = !{!"t", !31, i64 0}
				!31 = !{!"omnipotent char", !32, i64 0}
				!32 = !{!"Simple C/C++ TBAA"}
				!33 = !DILocation(line: 8, column: 9, scope: !26)
				!34 = !DILocation(line: 8, column: 5, scope: !26)
				!35 = !DILocation(line: 0, scope: !26)
				!36 = !DILocation(line: 11, column: 8, scope: !12)
				!37 = !DILocation(line: 11, column: 6, scope: !12)
				!38 = !{!39, !39, i64 0}
				!39 = !{!"long", !31, i64 0}
				!40 = !DILocation(line: 12, column: 3, scope: !12)

llvm/test/CodeGen/BPF/assembler-disassembler.s

	Show First 20 Lines • Show All 203 Lines • ▼ Show 20 Lines
	if w1 s<= 42 goto +0			if w1 s<= 42 goto +0
	if w1 s<= w2 goto +0			if w1 s<= w2 goto +0

	// CHECK: 85 00 00 00 2a 00 00 00 call 0x2a			// CHECK: 85 00 00 00 2a 00 00 00 call 0x2a
	call +42			call +42
	// CHECK: 95 00 00 00 00 00 00 00 exit			// CHECK: 95 00 00 00 00 00 00 00 exit
	exit			exit

	// Note: For the group below w1 is used as a destination for sizes u8, u16, u32.			// CHECK: 71 21 2a 00 00 00 00 00 r1 = *(u8 *)(r2 + 0x2a)
	// This is disassembler quirk, but is technically not wrong, as there are			// CHECK: 69 21 2a 00 00 00 00 00 r1 = *(u16 *)(r2 + 0x2a)
	// no different encodings for 'r1 = load' vs 'w1 = load'.			// CHECK: 61 21 2a 00 00 00 00 00 r1 = *(u32 *)(r2 + 0x2a)
	//
	// CHECK: 71 21 2a 00 00 00 00 00 w1 = *(u8 *)(r2 + 0x2a)
	// CHECK: 69 21 2a 00 00 00 00 00 w1 = *(u16 *)(r2 + 0x2a)
	// CHECK: 61 21 2a 00 00 00 00 00 w1 = *(u32 *)(r2 + 0x2a)
	// CHECK: 79 21 2a 00 00 00 00 00 r1 = *(u64 *)(r2 + 0x2a)			// CHECK: 79 21 2a 00 00 00 00 00 r1 = *(u64 *)(r2 + 0x2a)
	r1 = *(u8*)(r2 + 42)			r1 = *(u8*)(r2 + 42)
	r1 = *(u16*)(r2 + 42)			r1 = *(u16*)(r2 + 42)
	r1 = *(u32*)(r2 + 42)			r1 = *(u32*)(r2 + 42)
	r1 = *(u64*)(r2 + 42)			r1 = *(u64*)(r2 + 42)

	// Note: For the group below w1 is used as a source for sizes u8, u16, u32.			// CHECK: 73 12 2a 00 00 00 00 00 *(u8 *)(r2 + 0x2a) = r1
	// This is disassembler quirk, but is technically not wrong, as there are			// CHECK: 6b 12 2a 00 00 00 00 00 *(u16 *)(r2 + 0x2a) = r1
	// no different encodings for 'store r1' vs 'store w1'.			// CHECK: 63 12 2a 00 00 00 00 00 *(u32 *)(r2 + 0x2a) = r1
	//
	// CHECK: 73 12 2a 00 00 00 00 00 *(u8 *)(r2 + 0x2a) = w1
	// CHECK: 6b 12 2a 00 00 00 00 00 *(u16 *)(r2 + 0x2a) = w1
	// CHECK: 63 12 2a 00 00 00 00 00 *(u32 *)(r2 + 0x2a) = w1
	// CHECK: 7b 12 2a 00 00 00 00 00 *(u64 *)(r2 + 0x2a) = r1			// CHECK: 7b 12 2a 00 00 00 00 00 *(u64 *)(r2 + 0x2a) = r1
	*(u8*)(r2 + 42) = r1			*(u8*)(r2 + 42) = r1
	*(u16*)(r2 + 42) = r1			*(u16*)(r2 + 42) = r1
	*(u32*)(r2 + 42) = r1			*(u32*)(r2 + 42) = r1
	*(u64*)(r2 + 42) = r1			*(u64*)(r2 + 42) = r1

	// CHECK: c3 21 01 00 00 00 00 00 lock *(u32 *)(r1 + 0x1) += w2			// CHECK: c3 21 01 00 00 00 00 00 lock *(u32 *)(r1 + 0x1) += w2
	// CHECK: c3 21 01 00 50 00 00 00 lock *(u32 *)(r1 + 0x1) &= w2			// CHECK: c3 21 01 00 50 00 00 00 lock *(u32 *)(r1 + 0x1) &= w2
	▲ Show 20 Lines • Show All 52 Lines • Show Last 20 Lines

llvm/test/CodeGen/BPF/disassemble-mcpu-v3.s

	// Make sure that llvm-objdump --mcpu=v3 enables ALU32 feature.			// Make sure that llvm-objdump --mcpu=v3 enables ALU32 feature.
	//			//
	// Only test a few instructions here, assembler-disassembler.s is more			// Only test a few instructions here, assembler-disassembler.s is more
	// comprehensive but uses --mattr=+alu32 option.			// comprehensive but uses --mattr=+alu32 option.
	//			//
	// RUN: llvm-mc -triple bpfel --mcpu=v3 --assemble --filetype=obj %s -o %t			// RUN: llvm-mc -triple bpfel --mcpu=v3 --assemble --filetype=obj %s -o %t
	// RUN: llvm-objdump -d --mcpu=v2 %t | FileCheck %s --check-prefix=V2			// RUN: llvm-objdump -d --mcpu=v2 %t | FileCheck %s --check-prefix=V2
	// RUN: llvm-objdump -d --mcpu=v3 %t | FileCheck %s --check-prefix=V3			// RUN: llvm-objdump -d --mcpu=v3 %t | FileCheck %s --check-prefix=V3

	w0 = *(u32 *)(r1 + 0)			w0 = *(u32 *)(r1 + 0)
	lock *(u32 *)(r1 + 0x1) &= w2			lock *(u32 *)(r1 + 0x1) &= w2


	// V2: 61 10 00 00 00 00 00 00 r0 = *(u32 *)(r1 + 0x0)			// V2: 61 10 00 00 00 00 00 00 r0 = *(u32 *)(r1 + 0x0)
	// V2: c3 21 01 00 50 00 00 00 <unknown>			// V2: c3 21 01 00 50 00 00 00 <unknown>

	// V3: 61 10 00 00 00 00 00 00 w0 = *(u32 *)(r1 + 0x0)			// V3: 61 10 00 00 00 00 00 00 r0 = *(u32 *)(r1 + 0x0)
	// V3: c3 21 01 00 50 00 00 00 lock *(u32 *)(r1 + 0x1) &= w2			// V3: c3 21 01 00 50 00 00 00 lock *(u32 *)(r1 + 0x1) &= w2

llvm/test/CodeGen/BPF/is_trunc_free.ll

Show First 20 Lines • Show All 51 Lines • ▼ Show 20 Lines	if.end10: ; preds = %entry
%call = tail call i32 @work(ptr nonnull %skb, i32 %conv13) #2		%call = tail call i32 @work(ptr nonnull %skb, i32 %conv13) #2
br label %cleanup		br label %cleanup

cleanup: ; preds = %entry, %if.end10		cleanup: ; preds = %entry, %if.end10
%retval.0 = phi i32 [ %call, %if.end10 ], [ 0, %entry ]		%retval.0 = phi i32 [ %call, %if.end10 ], [ 0, %entry ]
ret i32 %retval.0		ret i32 %retval.0
}		}

; CHECK: w{{[0-9]+}} = *(u32 *)(r{{[0-9]+}} + 0)		; CHECK: r{{[0-9]+}} = *(u32 *)(r{{[0-9]+}} + 0)
; CHECK-NOT: w{{[0-9]+}} = w{{[0-9]+}}		; CHECK-NOT: w{{[0-9]+}} = w{{[0-9]+}}

declare dso_local i32 @work(ptr, i32) local_unnamed_addr #1		declare dso_local i32 @work(ptr, i32) local_unnamed_addr #1

attributes #0 = { nounwind "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }		attributes #0 = { nounwind "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = { "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }		attributes #1 = { "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #2 = { nounwind }		attributes #2 = { nounwind }

Show All 11 Lines

llvm/test/CodeGen/BPF/is_zext_free2.ll

This file was added.

				; RUN: llc -march=bpfel -mattr=+alu32 < %s | FileCheck %s

				; Check that zero extension is considered free for load instructions.
				; The test case is derived from a bigger C program using llvm-reduce
				; and manual simplifications.

				target datalayout = "e-m:e-p:64:64-i64:64-i128:128-n32:64-S128"

				define i1 @foo(ptr %ptr) {
				entry:
				%byte = load volatile i8, ptr %ptr, align 1
				br label %next

				; Jump to the new basic block is important, because it creates a COPY
				; instruction for %byte, which might be constructed differently
				; depending on TLI.isZExtFree() results, see RegsForValue::getCopyToRegs().
				next:
				; The 'icmp eq i8' requires second argument to be zero extended.
				%cmp = icmp eq i8 12, %byte
				; CHECK-NOT: {{[rw][0-9]}} &= 255
				ret i1 %cmp
				}

llvm/test/CodeGen/BPF/ldsx.ll

	Show First 20 Lines • Show All 42 Lines • ▼ Show 20 Lines
	; CHECK: r0 = *(s16 *)(r1 + 0) # encoding: [0x89,0x10,0x00,0x00,0x00,0x00,0x00,0x00]			; CHECK: r0 = *(s16 *)(r1 + 0) # encoding: [0x89,0x10,0x00,0x00,0x00,0x00,0x00,0x00]

	; Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn memory(argmem: read)			; Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn memory(argmem: read)
	define dso_local i32 @f3(ptr nocapture noundef readonly %p) local_unnamed_addr #0 {			define dso_local i32 @f3(ptr nocapture noundef readonly %p) local_unnamed_addr #0 {
	entry:			entry:
	%0 = load i32, ptr %p, align 4, !tbaa !8			%0 = load i32, ptr %p, align 4, !tbaa !8
	ret i32 %0			ret i32 %0
	}			}
	; CHECK: w0 = *(u32 *)(r1 + 0) # encoding: [0x61,0x10,0x00,0x00,0x00,0x00,0x00,0x00]			; CHECK: r0 = *(u32 *)(r1 + 0) # encoding: [0x61,0x10,0x00,0x00,0x00,0x00,0x00,0x00]

	; Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn memory(argmem: read)			; Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn memory(argmem: read)
	define dso_local i64 @f4(ptr nocapture noundef readonly %p) local_unnamed_addr #0 {			define dso_local i64 @f4(ptr nocapture noundef readonly %p) local_unnamed_addr #0 {
	entry:			entry:
	%0 = load i8, ptr %p, align 1, !tbaa !3			%0 = load i8, ptr %p, align 1, !tbaa !3
	%conv = sext i8 %0 to i64			%conv = sext i8 %0 to i64
	ret i64 %conv			ret i64 %conv
	}			}
	▲ Show 20 Lines • Show All 45 Lines • Show Last 20 Lines

llvm/test/CodeGen/BPF/memcmp.ll

	Show First 20 Lines • Show All 43 Lines • ▼ Show 20 Lines
	; CHECK-DAG: *(u32 *)(r1 + 0)			; CHECK-DAG: *(u32 *)(r1 + 0)
	; CHECK-DAG: *(u32 *)(r1 + 4)			; CHECK-DAG: *(u32 *)(r1 + 4)
	; CHECK-DAG: *(u32 *)(r10 - 16)			; CHECK-DAG: *(u32 *)(r10 - 16)
	; CHECK-DAG: *(u32 *)(r10 - 20)			; CHECK-DAG: *(u32 *)(r10 - 20)
	; CHECK-DAG: *(u32 *)(r10 - 8)			; CHECK-DAG: *(u32 *)(r10 - 8)
	; CHECK-DAG: *(u32 *)(r10 - 12)			; CHECK-DAG: *(u32 *)(r10 - 12)
	; CHECK-DAG: *(u32 *)(r1 + 8)			; CHECK-DAG: *(u32 *)(r1 + 8)
	; CHECK-DAG: *(u32 *)(r1 + 12)			; CHECK-DAG: *(u32 *)(r1 + 12)
	; CHECK-DAG: *(u32 *)(r2 + 16)			; CHECK-DAG: *(u32 *)(r{{[23]}} + 16)
	; CHECK-DAG: *(u32 *)(r10 - 4)			; CHECK-DAG: *(u32 *)(r10 - 4)

	; Function Attrs: argmemonly mustprogress nofree nosync nounwind willreturn			; Function Attrs: argmemonly mustprogress nofree nosync nounwind willreturn
	declare void @llvm.lifetime.start.p0(i64 immarg, ptr nocapture) #1			declare void @llvm.lifetime.start.p0(i64 immarg, ptr nocapture) #1

	declare dso_local void @bar1(ptr noundef) local_unnamed_addr #2			declare dso_local void @bar1(ptr noundef) local_unnamed_addr #2

	; Function Attrs: argmemonly mustprogress nofree nounwind readonly willreturn			; Function Attrs: argmemonly mustprogress nofree nounwind readonly willreturn
	Show All 17 Lines

llvm/test/CodeGen/BPF/remove_truncate_7.ll

	Show First 20 Lines • Show All 42 Lines • ▼ Show 20 Lines

	if.end:			if.end:
	%p.0.in.in = phi i32 [ %0, %if.then ], [ %1, %if.else ]			%p.0.in.in = phi i32 [ %0, %if.then ], [ %1, %if.else ]
	%p.0.in = zext i32 %p.0.in.in to i64			%p.0.in = zext i32 %p.0.in.in to i64
	%p.0 = inttoptr i64 %p.0.in to ptr			%p.0 = inttoptr i64 %p.0.in to ptr
	ret ptr %p.0			ret ptr %p.0
	}			}

	; CHECK: w0 = *(u32 *)(r2 + 0)			; CHECK: r0 = *(u32 *)(r2 + 0)
	; CHECK: w0 = *(u32 *)(r2 + 4)			; CHECK: r0 = *(u32 *)(r2 + 4)
	; CHECK-NOT: r[[#]] = w[[#]]			; CHECK-NOT: r[[#]] = w[[#]]
	; CHECK: exit			; CHECK: exit

llvm/test/CodeGen/BPF/rodata_5.ll

Show All 28 Lines	entry:
call void @llvm.lifetime.start.p0(i64 3, ptr nonnull %v1)		call void @llvm.lifetime.start.p0(i64 3, ptr nonnull %v1)
call void @llvm.memcpy.p0.p0.i64(ptr nonnull align 1 dereferenceable(3) %v1, ptr nonnull align 1 dereferenceable(3) @__const.test.v, i64 3, i1 false)		call void @llvm.memcpy.p0.p0.i64(ptr nonnull align 1 dereferenceable(3) %v1, ptr nonnull align 1 dereferenceable(3) @__const.test.v, i64 3, i1 false)
call void @foo(ptr nonnull %v1)		call void @foo(ptr nonnull %v1)
call void @llvm.lifetime.end.p0(i64 3, ptr nonnull %v1)		call void @llvm.lifetime.end.p0(i64 3, ptr nonnull %v1)
ret i32 0		ret i32 0
}		}
; CHECK-NOT: w{{[0-9]+}} = *(u16 *)		; CHECK-NOT: w{{[0-9]+}} = *(u16 *)
; CHECK-NOT: w{{[0-9]+}} = *(u8 *)		; CHECK-NOT: w{{[0-9]+}} = *(u8 *)
; CHECK: *(u16 *)(r10 - 4) = w{{[0-9]+}}		; CHECK: *(u8 *)(r10 - 2) = r{{[0-9]+}}
; CHECK: *(u8 *)(r10 - 2) = w{{[0-9]+}}		; CHECK: *(u16 *)(r10 - 4) = r{{[0-9]+}}

; Function Attrs: argmemonly nounwind willreturn		; Function Attrs: argmemonly nounwind willreturn
declare void @llvm.lifetime.start.p0(i64 immarg, ptr nocapture)		declare void @llvm.lifetime.start.p0(i64 immarg, ptr nocapture)

; Function Attrs: argmemonly nounwind willreturn		; Function Attrs: argmemonly nounwind willreturn
declare void @llvm.memcpy.p0.p0.i64(ptr noalias nocapture writeonly, ptr noalias nocapture readonly, i64, i1 immarg)		declare void @llvm.memcpy.p0.p0.i64(ptr noalias nocapture writeonly, ptr noalias nocapture readonly, i64, i1 immarg)

declare dso_local void @foo(ptr) local_unnamed_addr		declare dso_local void @foo(ptr) local_unnamed_addr

; Function Attrs: argmemonly nounwind willreturn		; Function Attrs: argmemonly nounwind willreturn
declare void @llvm.lifetime.end.p0(i64 immarg, ptr nocapture)		declare void @llvm.lifetime.end.p0(i64 immarg, ptr nocapture)

llvm/test/MC/BPF/insn-unit.s

	Show All 28 Lines
	// CHECK: 18 09 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r9 = 0 ll			// CHECK: 18 09 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r9 = 0 ll
	// CHECK: 0000000000000060: R_BPF_64_64 dummy_map			// CHECK: 0000000000000060: R_BPF_64_64 dummy_map

	// ======== BPF_LDX Class ========			// ======== BPF_LDX Class ========
	r5 = *(u8 *)(r0 + 0) // BPF_LDX | BPF_B			r5 = *(u8 *)(r0 + 0) // BPF_LDX | BPF_B
	r6 = *(u16 *)(r1 + 8) // BPF_LDX | BPF_H			r6 = *(u16 *)(r1 + 8) // BPF_LDX | BPF_H
	r7 = *(u32 *)(r2 + 16) // BPF_LDX | BPF_W			r7 = *(u32 *)(r2 + 16) // BPF_LDX | BPF_W
	r8 = *(u64 *)(r3 - 30) // BPF_LDX | BPF_DW			r8 = *(u64 *)(r3 - 30) // BPF_LDX | BPF_DW
	// CHECK-64: 71 05 00 00 00 00 00 00 r5 = *(u8 *)(r0 + 0)			// CHECK: 71 05 00 00 00 00 00 00 r5 = *(u8 *)(r0 + 0)
	// CHECK-64: 69 16 08 00 00 00 00 00 r6 = *(u16 *)(r1 + 8)			// CHECK: 69 16 08 00 00 00 00 00 r6 = *(u16 *)(r1 + 8)
	// CHECK-64: 61 27 10 00 00 00 00 00 r7 = *(u32 *)(r2 + 16)			// CHECK: 61 27 10 00 00 00 00 00 r7 = *(u32 *)(r2 + 16)
	// CHECK-32: 71 05 00 00 00 00 00 00 w5 = *(u8 *)(r0 + 0)
	// CHECK-32: 69 16 08 00 00 00 00 00 w6 = *(u16 *)(r1 + 8)
	// CHECK-32: 61 27 10 00 00 00 00 00 w7 = *(u32 *)(r2 + 16)
	// CHECK: 79 38 e2 ff 00 00 00 00 r8 = *(u64 *)(r3 - 30)			// CHECK: 79 38 e2 ff 00 00 00 00 r8 = *(u64 *)(r3 - 30)

	// ======== BPF_STX Class ========			// ======== BPF_STX Class ========
	*(u8 *)(r0 + 0) = r7 // BPF_STX | BPF_B			*(u8 *)(r0 + 0) = r7 // BPF_STX | BPF_B
	*(u16 *)(r1 + 8) = r8 // BPF_STX | BPF_H			*(u16 *)(r1 + 8) = r8 // BPF_STX | BPF_H
	*(u32 *)(r2 + 16) = r9 // BPF_STX | BPF_W			*(u32 *)(r2 + 16) = r9 // BPF_STX | BPF_W
	*(u64 *)(r3 - 30) = r10 // BPF_STX | BPF_DW			*(u64 *)(r3 - 30) = r10 // BPF_STX | BPF_DW
	// CHECK-64: 73 70 00 00 00 00 00 00 *(u8 *)(r0 + 0) = r7			// CHECK: 73 70 00 00 00 00 00 00 *(u8 *)(r0 + 0) = r7
	// CHECK-64: 6b 81 08 00 00 00 00 00 *(u16 *)(r1 + 8) = r8			// CHECK: 6b 81 08 00 00 00 00 00 *(u16 *)(r1 + 8) = r8
	// CHECK-64: 63 92 10 00 00 00 00 00 *(u32 *)(r2 + 16) = r9			// CHECK: 63 92 10 00 00 00 00 00 *(u32 *)(r2 + 16) = r9
	// CHECK-32: 73 70 00 00 00 00 00 00 *(u8 *)(r0 + 0) = w7
	// CHECK-32: 6b 81 08 00 00 00 00 00 *(u16 *)(r1 + 8) = w8
	// CHECK-32: 63 92 10 00 00 00 00 00 *(u32 *)(r2 + 16) = w9
	// CHECK: 7b a3 e2 ff 00 00 00 00 *(u64 *)(r3 - 30) = r10			// CHECK: 7b a3 e2 ff 00 00 00 00 *(u64 *)(r3 - 30) = r10

	lock *(u32 *)(r2 + 16) += r9 // BPF_STX | BPF_W | BPF_XADD			lock *(u32 *)(r2 + 16) += r9 // BPF_STX | BPF_W | BPF_XADD
	lock *(u64 *)(r3 - 30) += r10 // BPF_STX | BPF_DW | BPF_XADD			lock *(u64 *)(r3 - 30) += r10 // BPF_STX | BPF_DW | BPF_XADD
	// CHECK-64: c3 92 10 00 00 00 00 00 lock *(u32 *)(r2 + 16) += r9			// CHECK-64: c3 92 10 00 00 00 00 00 lock *(u32 *)(r2 + 16) += r9
	// CHECK-32: c3 92 10 00 00 00 00 00 lock *(u32 *)(r2 + 16) += w9			// CHECK-32: c3 92 10 00 00 00 00 00 lock *(u32 *)(r2 + 16) += w9
	// CHECK: db a3 e2 ff 00 00 00 00 lock *(u64 *)(r3 - 30) += r10			// CHECK: db a3 e2 ff 00 00 00 00 lock *(u64 *)(r3 - 30) += r10

	▲ Show 20 Lines • Show All 119 Lines • Show Last 20 Lines

llvm/test/MC/BPF/load-store-32.s

	# RUN: llvm-mc -triple bpfel -filetype=obj -o %t %s			# RUN: llvm-mc -triple bpfel -filetype=obj -o %t %s
	# RUN: llvm-objdump --no-print-imm-hex --mattr=+alu32 -d -r %t | FileCheck --check-prefix=CHECK-32 %s			# RUN: llvm-objdump --no-print-imm-hex --mattr=+alu32 -d -r %t \
	# RUN: llvm-objdump --no-print-imm-hex -d -r %t | FileCheck %s			# RUN: | FileCheck --check-prefixes=CHECK,CHECK-32 %s
				# RUN: llvm-objdump --no-print-imm-hex -d -r %t \
				# RUN: | FileCheck --check-prefixes=CHECK,CHECK-64 %s

	// ======== BPF_LDX Class ========			// ======== BPF_LDX Class ========
	w5 = *(u8 *)(r0 + 0) // BPF_LDX | BPF_B			w5 = *(u8 *)(r0 + 0) // BPF_LDX | BPF_B
	w6 = *(u16 *)(r1 + 8) // BPF_LDX | BPF_H			w6 = *(u16 *)(r1 + 8) // BPF_LDX | BPF_H
	w7 = *(u32 *)(r2 + 16) // BPF_LDX | BPF_W			w7 = *(u32 *)(r2 + 16) // BPF_LDX | BPF_W
	yonghong-song (inline comment, Not Done): If I understand correctly, the asm syntax `w5 = *(u8 *)(r0 + 0)` is not supported any more with this patch. That means, if users write inline asm in their code like

	    ...
	    w5 = *(u8 *)(r0 + 0)
	    ...

	the inline asm can be successfully compiled with llvm17/llvm16 etc. but will fail compilation with llvm18 (with this patch). Is this what we want?
	eddyz87 (author, inline comment, Done): You are correct. I'm trying to fix this using `InstAlias`. So far I have the incantation below, which works but is kind of ugly:

	    foreach I = 0-11 in {
	      def : InstAlias<!strconcat("w"#I, " = *(u8 *)($src $offset)"),
	                      (LDB !cast<Ri>("R"#I), GPR:$src, i16imm:$offset)>;
	      def : InstAlias<!strconcat("w"#I, " = *(u16 *)($src $offset)"),
	                      (LDH !cast<Ri>("R"#I), GPR:$src, i16imm:$offset)>;
	      def : InstAlias<!strconcat("w"#I, " = *(u32 *)($src $offset)"),
	                      (LDW !cast<Ri>("R"#I), GPR:$src, i16imm:$offset)>;
	      def : InstAlias<!strconcat("*(u8 *)($dst $offset) = ", "w"#I),
	                      (STB !cast<Ri>("R"#I), GPR:$dst, i16imm:$offset)>;
	      def : InstAlias<!strconcat("*(u16 *)($dst $offset) = ", "w"#I),
	                      (STH !cast<Ri>("R"#I), GPR:$dst, i16imm:$offset)>;
	      def : InstAlias<!strconcat("*(u32 *)($dst $offset) = ", "w"#I),
	                      (STW !cast<Ri>("R"#I), GPR:$dst, i16imm:$offset)>;
	    }

	I'd prefer to have something like the following instead:

	    def : InstAlias<"$dst = *(u8 *)($src $offset)",
	                    (LDB (Reg32To64 GPR32:$dst).Reg64, GPR:$src, i16imm:$offset)>;

	But I can't figure out how to define `Reg32To64` at the moment. I'm still trying to figure it out, but if you have a suggestion, please share.
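	The effect the aliases in the comment above are after can be sketched outside of TableGen. The Python below is only an illustration under stated assumptions: `canonicalize` is a hypothetical helper, not part of LLVM, and it models the intended outcome (a legacy `wN` spelling of an 8/16/32-bit load or store is accepted and treated as the `rN` form) rather than how the generated asm matcher works. The register range 0-11 is taken from the `foreach I = 0-11` loop.

```python
import re

# Hypothetical model of the alias behaviour: legacy "wN" operands on
# 8/16/32-bit loads and stores map to the same instruction as "rN".
_LOAD = re.compile(r"^w(\d+)( = \*\(u(?:8|16|32) \*\)\(.*\))$")
_STORE = re.compile(r"^(\*\(u(?:8|16|32) \*\)\(.*\) = )w(\d+)$")


def canonicalize(insn):
    """Return the rN spelling of a legacy wN load/store, else the input."""
    insn = insn.strip()
    m = _LOAD.match(insn)
    if m and int(m.group(1)) <= 11:      # registers 0..11, as in the foreach
        return f"r{m.group(1)}{m.group(2)}"
    m = _STORE.match(insn)
    if m and int(m.group(2)) <= 11:
        return f"{m.group(1)}r{m.group(2)}"
    return insn                          # anything else is left untouched


for line in ("w5 = *(u8 *)(r0 + 0)",
             "*(u32 *)(r2 + 16) = w9",
             "lock *(u32 *)(r2 + 16) += w9"):
    print(line, "->", canonicalize(line))
```

	The `lock ... += w9` form is deliberately left alone, since the test expectations in this diff still print `w9` for the 32-bit atomic add.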
	// CHECK-32: 71 05 00 00 00 00 00 00 w5 = *(u8 *)(r0 + 0)
	// CHECK-32: 69 16 08 00 00 00 00 00 w6 = *(u16 *)(r1 + 8)
	// CHECK-32: 61 27 10 00 00 00 00 00 w7 = *(u32 *)(r2 + 16)
	// CHECK: 71 05 00 00 00 00 00 00 r5 = *(u8 *)(r0 + 0)			// CHECK: 71 05 00 00 00 00 00 00 r5 = *(u8 *)(r0 + 0)
	// CHECK: 69 16 08 00 00 00 00 00 r6 = *(u16 *)(r1 + 8)			// CHECK: 69 16 08 00 00 00 00 00 r6 = *(u16 *)(r1 + 8)
	// CHECK: 61 27 10 00 00 00 00 00 r7 = *(u32 *)(r2 + 16)			// CHECK: 61 27 10 00 00 00 00 00 r7 = *(u32 *)(r2 + 16)

	// ======== BPF_STX Class ========			// ======== BPF_STX Class ========
	*(u8 *)(r0 + 0) = w7 // BPF_STX | BPF_B			*(u8 *)(r0 + 0) = w7 // BPF_STX | BPF_B
	*(u16 *)(r1 + 8) = w8 // BPF_STX | BPF_H			*(u16 *)(r1 + 8) = w8 // BPF_STX | BPF_H
	*(u32 *)(r2 + 16) = w9 // BPF_STX | BPF_W			*(u32 *)(r2 + 16) = w9 // BPF_STX | BPF_W
	lock *(u32 *)(r2 + 16) += w9 // BPF_STX | BPF_W | BPF_XADD			lock *(u32 *)(r2 + 16) += w9 // BPF_STX | BPF_W | BPF_XADD
	// CHECK-32: 73 70 00 00 00 00 00 00 *(u8 *)(r0 + 0) = w7
	// CHECK-32: 6b 81 08 00 00 00 00 00 *(u16 *)(r1 + 8) = w8
	// CHECK-32: 63 92 10 00 00 00 00 00 *(u32 *)(r2 + 16) = w9
	// CHECK-32: c3 92 10 00 00 00 00 00 lock *(u32 *)(r2 + 16) += w9
	// CHECK: 73 70 00 00 00 00 00 00 *(u8 *)(r0 + 0) = r7			// CHECK: 73 70 00 00 00 00 00 00 *(u8 *)(r0 + 0) = r7
	// CHECK: 6b 81 08 00 00 00 00 00 *(u16 *)(r1 + 8) = r8			// CHECK: 6b 81 08 00 00 00 00 00 *(u16 *)(r1 + 8) = r8
	// CHECK: 63 92 10 00 00 00 00 00 *(u32 *)(r2 + 16) = r9			// CHECK: 63 92 10 00 00 00 00 00 *(u32 *)(r2 + 16) = r9
	// CHECK: c3 92 10 00 00 00 00 00 lock *(u32 *)(r2 + 16) += r9			// CHECK-32: c3 92 10 00 00 00 00 00 lock *(u32 *)(r2 + 16) += w9
				// CHECK-64: c3 92 10 00 00 00 00 00 lock *(u32 *)(r2 + 16) += r9
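
The CHECK lines above pair one eight-byte encoding with two possible spellings. A minimal Python sketch of why that works, assuming the little-endian (`bpfel`) layout from the BPF instruction-set documentation: opcode byte, then a register byte with the destination in the low nibble and the source in the high nibble, then a signed 16-bit offset and a 32-bit immediate. `disasm` and its `legacy_alu32` flag are hypothetical names used only for illustration; the flag mimics the pre-patch `CHECK-32` output and is not how the LLVM disassembler is implemented.

```python
import struct

# Field values from the BPF instruction-set documentation.
BPF_LDX, BPF_STX = 0x01, 0x03   # instruction class (low 3 bits of opcode)
BPF_MEM = 0x60                  # addressing mode (high 3 bits of opcode)
SIZES = {0x00: "u32", 0x08: "u16", 0x10: "u8", 0x18: "u64"}


def disasm(hex_bytes, legacy_alu32=False):
    """Print a plain LDX/STX instruction; only the register prefix varies."""
    opcode, regs, off, _imm = struct.unpack("<BBhi", bytes.fromhex(hex_bytes))
    dst, src = regs & 0x0F, regs >> 4
    cls, size, mode = opcode & 0x07, opcode & 0x18, opcode & 0xE0
    if mode != BPF_MEM or cls not in (BPF_LDX, BPF_STX):
        raise ValueError(f"not a plain LDX/STX instruction: opcode {opcode:#04x}")
    ty = SIZES[size]
    # The encoding has no bit that distinguishes wN from rN: the choice of
    # prefix is purely a printing decision.
    prefix = "w" if legacy_alu32 and ty != "u64" else "r"
    sign = "+" if off >= 0 else "-"
    if cls == BPF_LDX:
        return f"{prefix}{dst} = *({ty} *)(r{src} {sign} {abs(off)})"
    return f"*({ty} *)(r{dst} {sign} {abs(off)}) = {prefix}{src}"


print(disasm("71 05 00 00 00 00 00 00"))
print(disasm("71 05 00 00 00 00 00 00", legacy_alu32=True))
print(disasm("7b a3 e2 ff 00 00 00 00"))
```

The sketch rejects the `lock` (XADD) encodings such as `c3 92 10 ...`, which use a different mode field and keep separate `CHECK-32`/`CHECK-64` expectations in this diff.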

Diff 549952
