This is an archive of the discontinued LLVM Phabricator instance.

Differential D144829

[BPF] Add a few new insns under cpu=v4
ClosedPublic

Authored by yonghong-song on Feb 26 2023, 9:51 AM.

Details

Reviewers

ast
jemarch
eddyz87

Commits

rG6c412b6c6faa: [BPF] Add a few new insns under cpu=v4

Summary

In [1], a few new insns are proposed to extend BPF ISA to

. fixing the limitation of existing insn (e.g., 16bit jmp offset)
. adding new insns which may improve code quality (sign_ext_ld, sign_ext_mov, st) 
. feature complete (sdiv, smod)
. better user experience (bswap)

This patch implemented insn encoding for

. sign-extended load
. sign-extended mov 
. sdiv/smod
. bswap insns
. unconditional jump with 32bit offset

The new bswap insns are generated under cpu=v4 for builtin_bswap.
For cpu=v3 or earlier, builtin_bswap is lowered to be or le insns,
which is not intuitive for the user.
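
To make the intended semantics concrete, here is a minimal C++ sketch of the value an unconditional byte swap of each width produces. The function names are hypothetical helpers, not LLVM or kernel APIs, and the zeroed upper bits for the 16-bit and 32-bit forms are an assumption of this sketch rather than something stated in this review.

```cpp
#include <cstdint>

// Hypothetical helpers modelling an unconditional byte swap of the low
// 16/32/64 bits of a 64-bit register value.
// Assumption of this sketch: bits above the swapped width become zero.
constexpr uint64_t bswap16(uint64_t v) {
  return ((v & 0x00ffu) << 8) | ((v & 0xff00u) >> 8);
}

constexpr uint64_t bswap32(uint64_t v) {
  return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
         ((v & 0x00ff0000u) >> 8) | ((v & 0xff000000u) >> 24);
}

constexpr uint64_t bswap64(uint64_t v) {
  // Swap each 32-bit half, then exchange the halves.
  return (bswap32(v & 0xffffffffu) << 32) | bswap32(v >> 32);
}
```

Unlike be/le, whose effect depends on the target byte order, these helpers always reverse the bytes, which is the behaviour a user of builtin_bswap would expect.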

To support 32-bit branch offset, a 32-bit ja (JMPL) insn is implemented.
For a conditional branch whose target is beyond the 16-bit offset range, llvm
does the transformation 'cond_jmp' -> 'cond_jmp + jmpl' to simulate a 32-bit
conditional jmp. See BPFMIPeephole.cpp for details. The algorithm is
heuristic based. I have tested the bpf selftest pyperf600 with unroll count
600, which can indeed generate a 32-bit jump insn, e.g.,

13:       06 00 00 00 9b cd 00 00 gotol +0xcd9b <LBB0_6619>
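
The range decision behind this can be sketched in a few lines of C++. This is only an illustration of the 16-bit versus 32-bit choice, not the code in BPFMIPeephole.cpp; `JumpKind` and `pickJump` are hypothetical names, and the distance is assumed to be counted in instructions.

```cpp
#include <cstdint>

// Hypothetical illustration: which unconditional jump form can encode a
// given branch distance (assumed to be measured in instructions).
enum class JumpKind { Goto16, Gotol32 };

constexpr JumpKind pickJump(int64_t dist) {
  // The classic 'goto' has a signed 16-bit offset; anything outside that
  // range needs the 32-bit 'gotol' form.
  return (dist >= INT16_MIN && dist <= INT16_MAX) ? JumpKind::Goto16
                                                  : JumpKind::Gotol32;
}
```

With the offset from the example above, `pickJump(0xcd9b)` selects the 32-bit form, because 0xcd9b (52635) exceeds INT16_MAX (32767).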

Eduard is working on adding the 'st' insn to cpu=v4.

A list of llc flags:

disable-ldsx, disable-movsx, disable-bswap,
disable-sdiv-smod, disable-gotol

can be used to disable a particular insn for cpu v4.
For example, user can do:

llc -march=bpf -mcpu=v4 -disable-movsx t.ll

to enable cpu v4 without movsx insns.

References:

[1] https://lore.kernel.org/bpf/4bfe98be-5333-1c7e-2f6d-42486c8ec039@meta.com/

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

yonghong-song created this revision. Feb 26 2023, 9:51 AM

Herald added a project: Restricted Project. Feb 26 2023, 9:51 AM

Herald added a subscriber: hiraditya.

yonghong-song requested review of this revision. Feb 26 2023, 9:51 AM

Herald added projects: Restricted Project, Restricted Project. Feb 26 2023, 9:51 AM

Herald added subscribers: llvm-commits, cfe-commits.

yonghong-song edited the summary of this revision. Feb 26 2023, 10:35 AM

yonghong-song added reviewers: ast, jemarch.

yonghong-song added a subscriber: anakryiko.

Harbormaster completed remote builds in B216086: Diff 500597. Feb 26 2023, 10:43 AM

yonghong-song added a reviewer: eddyz87. Feb 26 2023, 11:00 AM

Hi Yonghong,

Left a few nitpicks and one comment in BPFMIPreEmitPeephole::adjustBranch() that I think points to a bug.
Overall, the adjustBranch() algorithm looks good. It would be great to have some test cases for it, e.g. preprocess a test .ll by replacing some template with a bunch of no-ops or something like that.
The instruction encodings seem to match the mailing list description.
Also, one check-all test is failing for me:

home/eddy/work/llvm-project/clang/test/Misc/target-invalid-cpu-note.c:76:14: error: BPF-NEXT: expected string not found in input
// BPF-NEXT: note: valid target CPU values are: generic, v1, v2, v3, probe{{$}}

llvm/lib/Target/BPF/BPFInstrFormats.td
93	Nitpick: the mailing list doc refers to this as `BPF_SMEM`.
llvm/lib/Target/BPF/BPFMIPeephole.cpp
335	Nitpick: this would not be executed for `-O0`, but is required for correct execution.
	void BPFPassConfig::addPreEmitPass() {
	  addPass(createBPFMIPreEmitCheckingPass());
	  if (getOptLevel() != CodeGenOpt::None)
	    if (!DisableMIPeephole)
	      addPass(createBPFMIPreEmitPeepholePass());
	}
456	As far as I understand: `SoFarNumInsns[JmpBB]` is the number of instructions from the function start to the end of `JmpBB`; `CurrNumInsns` is the number of instructions from the function start to the end of `MBB`. So, `SoFarNumInsns[JmpBB] - CurrNumInsns` gives the distance between the basic block ends. However, the jump goes to the basic block start, so the actual distance should be computed as `SoFarNumInsns[JmpBB] - JmpBB.size() - CurrNumInsns`. Am I confused?
483	Is it possible to rewrite as below instead?
	B2: ...
	    if (!cond) goto B3
	    gotol B5
	B3: ...
	Seems to be equivalent but with fewer instructions.
553	Nitpick: `(Dist <= INT16_MAX && Dist >= INT16_MIN)` is used in the previous two cases.
llvm/lib/Target/BPF/MCTargetDesc/BPFAsmBackend.cpp
108	This is because `Value` is in bytes, right? Could you please drop a comment here.
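
The distance question raised for line 456 can be checked with a small standalone C++ sketch. It is not the adjustBranch() code; `cumulativeCounts` and `distanceToBlockStart` are hypothetical names, and the sketch assumes the jump is the last instruction of its block and that distances are counted in instructions.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// soFar[i] = number of instructions from function start through the END of
// block i (the role SoFarNumInsns plays in the comment above).
std::vector<int64_t> cumulativeCounts(const std::vector<int64_t> &sizes) {
  std::vector<int64_t> soFar;
  int64_t total = 0;
  for (int64_t n : sizes) {
    total += n;
    soFar.push_back(total);
  }
  return soFar;
}

// Distance from the end of block `from` to the START of block `to`,
// following the formula proposed in the comment:
//   SoFarNumInsns[JmpBB] - JmpBB.size() - CurrNumInsns
int64_t distanceToBlockStart(const std::vector<int64_t> &sizes,
                             std::size_t from, std::size_t to) {
  const std::vector<int64_t> soFar = cumulativeCounts(sizes);
  return soFar[to] - sizes[to] - soFar[from];
}
```

For blocks of 3, 4 and 2 instructions, a jump from the end of block 0 to block 2 must skip exactly the 4 instructions of block 1; the end-to-end difference (9 - 3 = 6) would overshoot by the size of the target block.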

Fixed issues reported by Eduard
llvm-objdump issue (as stated in 'Summary') is not resolved yet.

Harbormaster completed remote builds in B232769: Diff 523253. May 18 2023, 3:27 AM

Fixed previous llvm-objdump issue for '> 16bit' 'gotol' insns.
Now basic functionality for cpu=v4 should be complete for llvm, further work will focus on kernel.

Harbormaster completed remote builds in B233021: Diff 523569. May 18 2023, 10:32 PM

added support of new instructions in inline assembly.

Harbormaster completed remote builds in B241045: Diff 534361. Jun 25 2023, 11:25 AM

avoid changing conditions during JMP -> JMPL conversion. Otherwise, verification may fail in some cases.

Harbormaster completed remote builds in B241346: Diff 534801. Jun 26 2023, 6:56 PM

Hi Yonghong,

What is the current plan for these changes?
I'd like to refresh D140804 to make BPF_ST instruction available for cpu v4.
I see that the latest CI run failed because of libcxx issue, I think it is completely unrelated to this revision, see here.

Eduard, please go ahead and refresh the BPF_ST patch (D140804). For this patch, we will not land it until the kernel patch has addressed all major comments. When it is ready to merge, I will comment and let you know.

Rename some insn names or mode names (movs -> movsx, lds -> ldsx, MEMS -> MEMSX, etc.) to be consistent with the kernel.
Add 5 llc flags to turn each kind of insn on or off (sdiv/smod, ldsx, movsx, bswap, gotol) for debugging purposes.

Harbormaster completed remote builds in B244987: Diff 539842. Jul 13 2023, 1:41 AM

ast added inline comments. Jul 14 2023, 8:49 AM

llvm/lib/Target/BPF/BPFInstrInfo.td
56	Here and elsewhere... let's drop the CPUv4 mid prefix. IMO the extra verbosity doesn't improve readability. Same with the flag: disable-cpuv4-movsx. It can be disable-movsx. s/BPFHasCPUv4_ldsx/BPFHasLdsx/ s/getCPUv4_bswap/getHasBswap/ or even shorter hasBswap?

yonghong-song added inline comments. Jul 14 2023, 4:17 PM

llvm/lib/Target/BPF/BPFInstrInfo.td
56	Makes sense. Will do. Yes, hasBswap is good enough to capture what it means.

Dropping 'CPUv4' in some variable/function names and also in debug flags.

overall looks good. one small nit.

llvm/lib/Target/BPF/MCTargetDesc/BPFMCFixups.h
17	a little bit too much of copy paste :)

This revision is now accepted and ready to land. Jul 17 2023, 7:20 PM

remove a copy-paste comment from s390 arch

yonghong-song retitled this revision from [WIP][BPF] Add a few new insns under cpu=v4 to [BPF] Add a few new insns under cpu=v4. Jul 19 2023, 4:48 PM

lgtm. @eddyz87 pls take a look

Harbormaster completed remote builds in B246716: Diff 542245. Jul 19 2023, 6:52 PM

I tried adding a test similar to assemble-disassemble.ll:

// RUN: llvm-mc -triple bpfel --mcpu=v4 --assemble --filetype=obj %s \
// RUN:   | llvm-objdump -d --mattr=+alu32 - \
// RUN:   | FileCheck %s

// CHECK: d7 01 00 00 10 00 00 00	r1 = bswap16 r1
// CHECK: d7 02 00 00 20 00 00 00	r2 = bswap32 r2
// CHECK: d7 03 00 00 40 00 00 00	r3 = bswap64 r3
r1 = bswap16 r1
r2 = bswap32 r2
r3 = bswap64 r3

// CHECK: 91 41 00 00 00 00 00 00	r1 = *(s8 *)(r4 + 0x0)
// CHECK: 89 52 04 00 00 00 00 00	r2 = *(s16 *)(r5 + 0x4)
// CHECK: 81 63 08 00 00 00 00 00	r3 = *(s32 *)(r6 + 0x8)
r1 = *(s8 *)(r4 + 0)
r2 = *(s16 *)(r5 + 4)
r3 = *(s32 *)(r6 + 8)

// CHECK: 91 41 00 00 00 00 00 00	w1 = *(s8 *)(r4 + 0x0)
// CHECK: 89 52 04 00 00 00 00 00	w2 = *(s16 *)(r5 + 0x4)
w1 = *(s8 *)(r4 + 0)
w2 = *(s16 *)(r5 + 4)

// CHECK: bf 41 08 00 00 00 00 00	r1 = (s8)r4
// CHECK: bf 52 10 00 00 00 00 00	r2 = (s16)r5
// CHECK: bf 63 20 00 00 00 00 00	r3 = (s32)w6
r1 = (s8)r4
r2 = (s16)r5
r3 = (s32)w6
// Should this work as well: r3 = (s32)r6 ?

// CHECK: bc 31 08 00 00 00 00 00	w1 = (s8)w3
// CHECK: bc 42 10 00 00 00 00 00	w2 = (s16)w4
w1 = (s8)w3
w2 = (s16)w4

// CHECK: 3f 31 01 00 00 00 00 00	r1 s/= r3
// CHECK: 9f 42 01 00 00 00 00 00	r2 s%= r4
r1 s/= r3
r2 s%= r4

// CHECK: 3c 31 01 00 00 00 00 00	w1 s/= w3
// CHECK: 9c 42 01 00 00 00 00 00	w2 s%= w4
w1 s/= w3
w2 s%= w4

And it looks like some instructions are not printed correctly:

$ llvm-mc -triple bpfel --mcpu=v4 --assemble --filetype=obj /home/eddy/work/llvm-project/llvm/test/CodeGen/BPF/assembler-disassembler-v4.s | llvm-objdump -d --mattr=+alu32 -

<stdin>:	file format elf64-bpf

Disassembly of section .text:

0000000000000000 <.text>:
       0:	d7 01 00 00 10 00 00 00	r1 = bswap16 r1
       1:	d7 02 00 00 20 00 00 00	r2 = bswap32 r2
       2:	d7 03 00 00 40 00 00 00	r3 = bswap64 r3
       3:	91 41 00 00 00 00 00 00	w1 = *(s8 *)(r4 + 0x0)
       4:	89 52 04 00 00 00 00 00	w2 = *(s16 *)(r5 + 0x4)
       5:	81 63 08 00 00 00 00 00	<unknown>
       6:	91 41 00 00 00 00 00 00	w1 = *(s8 *)(r4 + 0x0)
       7:	89 52 04 00 00 00 00 00	w2 = *(s16 *)(r5 + 0x4)
       8:	bf 41 08 00 00 00 00 00	r1 = (s8)r4
       9:	bf 52 10 00 00 00 00 00	r2 = (s16)r5
      10:	bf 63 20 00 00 00 00 00	r3 = (s32)w6
      11:	bc 31 08 00 00 00 00 00	w1 = (s8)w3
      12:	bc 42 10 00 00 00 00 00	w2 = (s16)w4
      13:	3f 31 01 00 00 00 00 00	r1 s/= r3
      14:	9f 42 01 00 00 00 00 00	r2 s%= r4
      15:	3c 31 01 00 00 00 00 00	w1 s/= w3
      16:	9c 42 01 00 00 00 00 00	w2 s%= w4

I'm not sure if this is an issue with disassembler or some additional --mattr options are needed.
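
When comparing listings like the one above by hand, it can help to split each 8-byte encoding into its fields. The following C++ sketch assumes the usual little-endian BPF layout (opcode byte, dst register in the low nibble and src register in the high nibble of the second byte, 16-bit offset, 32-bit immediate); `BpfInsn` and `decodeLE` are hypothetical names, not part of the BPF disassembler.

```cpp
#include <array>
#include <cstdint>

// Hypothetical field view of one 8-byte BPF instruction.
struct BpfInsn {
  uint8_t opcode; // byte 0
  uint8_t dst;    // low nibble of byte 1
  uint8_t src;    // high nibble of byte 1
  int16_t off;    // bytes 2-3, little-endian
  int32_t imm;    // bytes 4-7, little-endian
};

// Decode a little-endian (bpfel) instruction as printed by llvm-objdump.
BpfInsn decodeLE(const std::array<uint8_t, 8> &b) {
  BpfInsn insn{};
  insn.opcode = b[0];
  insn.dst = b[1] & 0x0f;
  insn.src = b[1] >> 4;
  insn.off = static_cast<int16_t>(static_cast<uint16_t>(b[2] | (b[3] << 8)));
  const uint32_t imm = uint32_t(b[4]) | (uint32_t(b[5]) << 8) |
                       (uint32_t(b[6]) << 16) | (uint32_t(b[7]) << 24);
  insn.imm = static_cast<int32_t>(imm);
  return insn;
}
```

Applied to `bf 41 08 00 00 00 00 00`, this gives opcode 0xbf, dst 1, src 4 and off 8, which lines up with the printed `r1 = (s8)r4`; for `d7 01 00 00 10 00 00 00` the immediate is 16, matching `bswap16`.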

llvm/lib/Target/BPF/BPFInstrInfo.td
379	I think it is possible to avoid matching expansion pattern `(sra (shl GPR:$src, (i64 56))` here, and instead turn off the expansion when `movsx` is available. I tried the change below and all BPF codegen tests are passing. Do I miss something?
	diff --git a/llvm/lib/Target/BPF/BPFISelLowering.cpp b/llvm/lib/Target/BPF/BPFISelLowering.cpp
	index 9a7357d6ad04..5e84af009591 100644
	--- a/llvm/lib/Target/BPF/BPFISelLowering.cpp
	+++ b/llvm/lib/Target/BPF/BPFISelLowering.cpp
	@@ -132,9 +132,11 @@ BPFTargetLowering::BPFTargetLowering(const TargetMachine &TM,
	   setOperationAction(ISD::CTLZ_ZERO_UNDEF, MVT::i64, Custom);

	   setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i1, Expand);
	-  setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i8, Expand);
	-  setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i16, Expand);
	-  setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i32, Expand);
	+  if (!STI.hasMovsx()) {
	+    setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i8, Expand);
	+    setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i16, Expand);
	+    setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i32, Expand);
	+  }

	   // Extended load operations for i1 types must be promoted
	   for (MVT VT : MVT::integer_valuetypes()) {
	diff --git a/llvm/lib/Target/BPF/BPFInstrInfo.td b/llvm/lib/Target/BPF/BPFInstrInfo.td
	index a1d532e60db2..29bec72aa92d 100644
	--- a/llvm/lib/Target/BPF/BPFInstrInfo.td
	+++ b/llvm/lib/Target/BPF/BPFInstrInfo.td
	@@ -376,11 +376,11 @@ let Predicates = [BPFHasMovsx] in {
	   def MOVSX_rr_8 : ALU_RR<BPF_ALU64, BPF_MOV, 8,
	                       (outs GPR:$dst), (ins GPR:$src),
	                       "$dst = (s8)$src",
	-                      [(set GPR:$dst, (sra (shl GPR:$src, (i64 56)), (i64 56)))]>;
	+                      [(set GPR:$dst, (sext_inreg GPR:$src, i8))]>;
	   def MOVSX_rr_16 : ALU_RR<BPF_ALU64, BPF_MOV, 16,
	                       (outs GPR:$dst), (ins GPR:$src),
	                       "$dst = (s16)$src",
	-                      [(set GPR:$dst, (sra (shl GPR:$src, (i64 48)), (i64 48)))]>;
	+                      [(set GPR:$dst, (sext_inreg GPR:$src, i16))]>;
	   def MOVSX_rr_32 : ALU_RR<BPF_ALU64, BPF_MOV, 32,
	                       (outs GPR:$dst), (ins GPR32:$src),
	                       "$dst = (s32)$src",
	@@ -388,11 +388,11 @@ let Predicates = [BPFHasMovsx] in {
	   def MOVSX_rr_32_8 : ALU_RR<BPF_ALU, BPF_MOV, 8,
	                       (outs GPR32:$dst), (ins GPR32:$src),
	                       "$dst = (s8)$src",
	-                      [(set GPR32:$dst, (sra (shl GPR32:$src, (i32 24)), (i32 24)))]>;
	+                      [(set GPR32:$dst, (sext_inreg GPR32:$src, i8))]>;
	   def MOVSX_rr_32_16 : ALU_RR<BPF_ALU, BPF_MOV, 16,
	                       (outs GPR32:$dst), (ins GPR32:$src),
	                       "$dst = (s16)$src",
	-                      [(set GPR32:$dst, (sra (shl GPR32:$src, (i32 16)), (i32 16)))]>;
	+                      [(set GPR32:$dst, (sext_inreg GPR32:$src, i16))]>;
	 }
	 }
llvm/lib/Target/BPF/BPFMIPeephole.cpp
321	Is this map unused?
412	Nitpick: Fangrui suggested in my llvm-objdump revisions to use `DenseMap` in most cases (as `std::map` allocates for each pair).
llvm/test/CodeGen/BPF/movsx.ll
31	This does not seem right, as it does not sign extend 8-bit argument to 16-bit value.
39	Shouldn't this be `w0 = (s8)w1`? A few checks below also look strange.
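
The two pattern forms discussed for BPFInstrInfo.td line 379 describe the same value. A short C++ sketch of that equivalence for the 8-bit case follows; the function names are hypothetical, and the code relies on arithmetic right shift and modular narrowing of signed values, which C++20 guarantees and mainstream compilers already provide for earlier standards.

```cpp
#include <cstdint>

// Shift form, as in (sra (shl x, 56), 56): move the low byte to the top,
// then shift back arithmetically so the sign bit is replicated.
constexpr int64_t sextViaShifts8(uint64_t x) {
  return static_cast<int64_t>(x << 56) >> 56;
}

// Direct form, as (sext_inreg x, i8) describes: reinterpret the low 8 bits
// as a signed byte and widen to 64 bits.
constexpr int64_t sextInReg8(uint64_t x) {
  return static_cast<int64_t>(static_cast<int8_t>(x & 0xff));
}
```

Both helpers ignore everything above bit 7 of the input, so for example 0x1ff yields -1 and 0x17f yields 127 in either form.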

There is a problem in the td file for the 32-bit signed load. The current definition is not quite right since it is supposed to sign-extend all the way to 64 bits. I will fix it in the next revision.

yonghong-song added inline comments. Jul 24 2023, 12:26 AM

llvm/lib/Target/BPF/BPFInstrInfo.td
379	This indeed can simplify the code. I will incorporate your change into the patch. Thanks!
llvm/lib/Target/BPF/BPFMIPeephole.cpp
321	No. This is a leftover. Will remove.
412	Will try to use DenseMap.
llvm/test/CodeGen/BPF/movsx.ll
31	This is probably due to ABI. For example,
	$ cat t1.c
	__attribute__((noinline)) short f1(char a) {
	  return a * a;
	}
	int f2(int a) {
	  return f1(a);
	}
	$ clang --target=bpf -O2 -mcpu=v4 -S t1.c
	f1:                                     # @f1
	# %bb.0:                                # %entry
	        w0 = w1
	        w0 *= w0
	        exit
	.Lfunc_end0:
	        .size f1, .Lfunc_end0-f1
	                                        # -- End function
	        .globl f2                       # -- Begin function f2
	        .p2align 3
	        .type f2,@function
	f2:                                     # @f2
	# %bb.0:                                # %entry
	        w1 = (s8)w1
	        call f1
	        w0 = (s16)w0
	        exit
	You can see that in function f2() the sign extension has been done properly, and that is probably the reason why in f1() the compiler didn't generate sign-extension code. I will modify the test to generate proper sign extension like in f2() above.

Hi Yonghong,

Thank you for the comments.
Could you please also add a few tests for gotol?
Sorry, I should have asked for those last week.

Could you please also add a few tests for gotol?

Will do!

Three major changes in this patch:

For ldsx insns, remove the 32bit ldsx insns (1-byte and 2-byte sign extension), since the ldsx insn is expected to sign-extend all the way up to 8 bytes while a normal 32bit insn (e.g. BPF_ALU) is expected to zero out the top bits. Instead, do a ldbsx/ldhsx and then take the lower 4 bytes to extract the 32bit value. This also resolves one disasm issue reported by Eduard.
For movsx insns, for 32bit sign extension to 64bit, match both "sext_inreg GPR:$src, i32" (left and right shifting) and "sext GPR32:$src".
Add an internal flag to control when to generate gotol insns in BPFMIPeephole.cpp. This permits a simpler test for gotol insns.
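
The first point can be illustrated with a small C++ model of the register values involved. This is only a sketch of the reasoning, with hypothetical helper names; it assumes a 64-bit register whose 32-bit view is the low half with the upper half zeroed.

```cpp
#include <cstdint>

// Model of a sign-extending byte load: the loaded byte is sign-extended
// all the way to 64 bits.
constexpr uint64_t loadS8To64(uint8_t byteInMemory) {
  return static_cast<uint64_t>(
      static_cast<int64_t>(static_cast<int8_t>(byteInMemory)));
}

// Model of taking the lower 4 bytes as the 32-bit value: the upper 32 bits
// are zero, as expected from a normal 32-bit insn.
constexpr uint64_t low32(uint64_t reg) {
  return reg & 0xffffffffull;
}
```

For the byte 0x80, the 64-bit load yields 0xffffffffffffff80, while the 32-bit view of the same register is 0xffffff80 with zeroed upper bits. A single insn cannot satisfy both expectations at once, which is why the 32-bit value is derived in a second step.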

With the above changes, the following change is needed:

diff --git a/tools/testing/selftests/bpf/progs/verifier_movsx.c b/tools/testing/selftests/bpf/progs/verifier_movsx.c
index 5ee7d004f8ba..e27bfa11c9b3 100644
--- a/tools/testing/selftests/bpf/progs/verifier_movsx.c
+++ b/tools/testing/selftests/bpf/progs/verifier_movsx.c
@@ -59,7 +59,7 @@ __naked void mov64sx_s32(void)
 {
        asm volatile ("                                 \
        r0 = 0xfffffffe;                                \
-       r0 = (s32)w0;                                   \
+       r0 = (s32)r0;                                   \
        r0 >>= 1;                                       \
        exit;                                           \
 "      ::: __clobber_all);
@@ -181,7 +181,7 @@ __naked void mov64sx_s32_range(void)
 {
        asm volatile ("                                 \
        call %[bpf_get_prandom_u32];                    \
-       r1 = (s32)w0;                                   \
+       r1 = (s32)r0;                                   \
        /* r1 with s32 range */                         \
        if r1 s> 0x7fffffff goto l0_%=;                 \
        if r1 s< -0x80000000 goto l0_%=;                \

in order to compile kernel cpu v4 support (patch series v3)

https://lore.kernel.org/bpf/20230720000103.99949-1-yhs@fb.com/

I will update the kernel side once we resolved all llvm issues.

Harbormaster completed remote builds in B247801: Diff 543709. Jul 25 2023, 12:34 AM

Hi Yonghong,

Looks good to me, thanks!
Before landing this, could you please adjust tests a little bit more?

Extend assembler-disassembler-v4.s with signed div and mod, e.g.:

// CHECK: 3f 31 01 00 00 00 00 00	r1 s/= r3
// CHECK: 9f 42 01 00 00 00 00 00	r2 s%= r4
r1 s/= r3
r2 s%= r4

// CHECK: 3c 31 01 00 00 00 00 00	w1 s/= w3
// CHECK: 9c 42 01 00 00 00 00 00	w2 s%= w4
w1 s/= w3
w2 s%= w4

For gotol add a test case which tries each possibility in BPFMIPreEmitPeephole::adjustBranch()

Add more tests in assembler-disassembler-v4.s and gotol.ll.

Harbormaster completed remote builds in B248050: Diff 544055. Jul 25 2023, 10:15 PM

This revision was landed with ongoing or failed builds. Jul 26 2023, 8:37 AM

Closed by commit rG6c412b6c6faa: [BPF] Add a few new insns under cpu=v4 (authored by yonghong-song).

This revision was automatically updated to reflect the committed changes.

yonghong-song added a commit: rG6c412b6c6faa: [BPF] Add a few new insns under cpu=v4.

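Revision Contents

Path                                                          Size
clang/lib/Basic/Targets/BPF.h                                 2 lines
clang/lib/Basic/Targets/BPF.cpp                               2 lines
clang/test/Misc/target-invalid-cpu-note.c                     2 lines
llvm/lib/Target/BPF/AsmParser/BPFAsmParser.cpp                8 lines
llvm/lib/Target/BPF/BPF.td                                    3 lines
llvm/lib/Target/BPF/BPFISelDAGToDAG.cpp                       20 lines
llvm/lib/Target/BPF/BPFISelLowering.h                         1 line
llvm/lib/Target/BPF/BPFISelLowering.cpp                       26 lines
llvm/lib/Target/BPF/BPFInstrFormats.td                        1 line
llvm/lib/Target/BPF/BPFInstrInfo.td                           171 lines
llvm/lib/Target/BPF/BPFMIPeephole.cpp                         224 lines
llvm/lib/Target/BPF/BPFMISimplifyPatchable.cpp                8 lines
llvm/lib/Target/BPF/BPFSubtarget.h                            8 lines
llvm/lib/Target/BPF/BPFSubtarget.cpp                          27 lines
llvm/lib/Target/BPF/Disassembler/BPFDisassembler.cpp          5 lines
llvm/lib/Target/BPF/MCTargetDesc/BPFAsmBackend.cpp            26 lines
llvm/lib/Target/BPF/MCTargetDesc/BPFInstPrinter.cpp           11 lines
llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp         3 lines
llvm/lib/Target/BPF/MCTargetDesc/BPFMCFixups.h                27 lines
llvm/lib/Target/BPF/MCTargetDesc/BPFMCTargetDesc.cpp          13 lines
llvm/test/CodeGen/BPF/bswap.ll                                47 lines
llvm/test/CodeGen/BPF/ldsx.ll                                 104 lines
llvm/test/CodeGen/BPF/movsx.ll                                79 lines
llvm/test/CodeGen/BPF/sdiv_smod.ll                            77 lines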

Diff 541265

clang/lib/Basic/Targets/BPF.h

@@ CallingConvCheckResult checkCallingConvention(CallingConv CC) const override {
     }
   }

   bool isValidCPUName(StringRef Name) const override;

   void fillValidCPUList(SmallVectorImpl<StringRef> &Values) const override;

   bool setCPU(const std::string &Name) override {
-    if (Name == "v3") {
+    if (Name == "v3" || Name == "v4") {
       HasAlu32 = true;
     }

     StringRef CPUName(Name);
     return isValidCPUName(CPUName);
   }
 };
 } // namespace targets
 } // namespace clang
 #endif // LLVM_CLANG_LIB_BASIC_TARGETS_BPF_H

clang/lib/Basic/Targets/BPF.cpp

 void BPFTargetInfo::getTargetDefines(const LangOptions &Opts,
                                      MacroBuilder &Builder) const {
   Builder.defineMacro("__bpf__");
   Builder.defineMacro("__BPF__");
 }

 static constexpr llvm::StringLiteral ValidCPUNames[] = {"generic", "v1", "v2",
-                                                        "v3", "probe"};
+                                                        "v3", "v4", "probe"};

 bool BPFTargetInfo::isValidCPUName(StringRef Name) const {
   return llvm::is_contained(ValidCPUNames, Name);
 }

 void BPFTargetInfo::fillValidCPUList(SmallVectorImpl<StringRef> &Values) const {
   Values.append(std::begin(ValidCPUNames), std::end(ValidCPUNames));
 }

clang/test/Misc/target-invalid-cpu-note.c

	Show First 20 Lines • Show All 67 Lines • ▼ Show 20 Lines
	// LANAI-NEXT: note: valid target CPU values are: v11{{$}}			// LANAI-NEXT: note: valid target CPU values are: v11{{$}}

	// RUN: not %clang_cc1 -triple hexagon--- -target-cpu not-a-cpu -fsyntax-only %s 2>&1 \| FileCheck %s --check-prefix HEXAGON			// RUN: not %clang_cc1 -triple hexagon--- -target-cpu not-a-cpu -fsyntax-only %s 2>&1 \| FileCheck %s --check-prefix HEXAGON
	// HEXAGON: error: unknown target CPU 'not-a-cpu'			// HEXAGON: error: unknown target CPU 'not-a-cpu'
	// HEXAGON-NEXT: note: valid target CPU values are: hexagonv5, hexagonv55, hexagonv60, hexagonv62, hexagonv65, hexagonv66, hexagonv67, hexagonv67t, hexagonv68, hexagonv69, hexagonv71, hexagonv71t, hexagonv73{{$}}			// HEXAGON-NEXT: note: valid target CPU values are: hexagonv5, hexagonv55, hexagonv60, hexagonv62, hexagonv65, hexagonv66, hexagonv67, hexagonv67t, hexagonv68, hexagonv69, hexagonv71, hexagonv71t, hexagonv73{{$}}

	// RUN: not %clang_cc1 -triple bpf--- -target-cpu not-a-cpu -fsyntax-only %s 2>&1 \| FileCheck %s --check-prefix BPF			// RUN: not %clang_cc1 -triple bpf--- -target-cpu not-a-cpu -fsyntax-only %s 2>&1 \| FileCheck %s --check-prefix BPF
	// BPF: error: unknown target CPU 'not-a-cpu'			// BPF: error: unknown target CPU 'not-a-cpu'
	// BPF-NEXT: note: valid target CPU values are: generic, v1, v2, v3, probe{{$}}			// BPF-NEXT: note: valid target CPU values are: generic, v1, v2, v3, v4, probe{{$}}

	// RUN: not %clang_cc1 -triple avr--- -target-cpu not-a-cpu -fsyntax-only %s 2>&1 \| FileCheck %s --check-prefix AVR			// RUN: not %clang_cc1 -triple avr--- -target-cpu not-a-cpu -fsyntax-only %s 2>&1 \| FileCheck %s --check-prefix AVR
	// AVR: error: unknown target CPU 'not-a-cpu'			// AVR: error: unknown target CPU 'not-a-cpu'
	// AVR-NEXT: note: valid target CPU values are: avr1, at90s1200, attiny11, attiny12, attiny15, attiny28, avr2, at90s2313, at90s2323, at90s2333, at90s2343, attiny22, attiny26, at86rf401, at90s4414, at90s4433, at90s4434, at90s8515, at90c8534, at90s8535, avr25, ata5272, ata6616c, attiny13, attiny13a, attiny2313, attiny2313a, attiny24, attiny24a, attiny4313, attiny44, attiny44a, attiny84, attiny84a, attiny25, attiny45, attiny85, attiny261, attiny261a, attiny441, attiny461, attiny461a, attiny841, attiny861, attiny861a, attiny87, attiny43u, attiny48, attiny88, attiny828, avr3, at43usb355, at76c711, avr31, atmega103, at43usb320, avr35, attiny167, at90usb82, at90usb162, ata5505, ata6617c, ata664251, atmega8u2, atmega16u2, atmega32u2, attiny1634, avr4, atmega8, ata6289, atmega8a, ata6285, ata6286, ata6612c, atmega48, atmega48a, atmega48pa, atmega48pb, atmega48p, atmega88, atmega88a, atmega88p, atmega88pa, atmega88pb, atmega8515, atmega8535, atmega8hva, at90pwm1, at90pwm2, at90pwm2b, at90pwm3, at90pwm3b, at90pwm81, avr5, ata5702m322, ata5782, ata5790, ata5790n, ata5791, ata5795, ata5831, ata6613c, ata6614q, ata8210, ata8510, atmega16, atmega16a, atmega161, atmega162, atmega163, atmega164a, atmega164p, atmega164pa, atmega165, atmega165a, atmega165p, atmega165pa, atmega168, atmega168a, atmega168p, atmega168pa, atmega168pb, atmega169, atmega169a, atmega169p, atmega169pa, atmega32, atmega32a, atmega323, atmega324a, atmega324p, atmega324pa, atmega324pb, atmega325, atmega325a, atmega325p, atmega325pa, atmega3250, atmega3250a, atmega3250p, atmega3250pa, atmega328, atmega328p, atmega328pb, atmega329, atmega329a, atmega329p, atmega329pa, atmega3290, atmega3290a, atmega3290p, atmega3290pa, atmega406, atmega64, atmega64a, atmega640, atmega644, atmega644a, atmega644p, atmega644pa, atmega645, atmega645a, atmega645p, atmega649, atmega649a, atmega649p, atmega6450, atmega6450a, atmega6450p, atmega6490, atmega6490a, atmega6490p, atmega64rfr2, atmega644rfr2, atmega16hva, atmega16hva2, 
atmega16hvb, atmega16hvbrevb, atmega32hvb, atmega32hvbrevb, atmega64hve, atmega64hve2, at90can32, at90can64, at90pwm161, at90pwm216, at90pwm316, atmega32c1, atmega64c1, atmega16m1, atmega32m1, atmega64m1, atmega16u4, atmega32u4, atmega32u6, at90usb646, at90usb647, at90scr100, at94k, m3000, avr51, atmega128, atmega128a, atmega1280, atmega1281, atmega1284, atmega1284p, atmega128rfa1, atmega128rfr2, atmega1284rfr2, at90can128, at90usb1286, at90usb1287, avr6, atmega2560, atmega2561, atmega256rfr2, atmega2564rfr2, avrxmega2, atxmega16a4, atxmega16a4u, atxmega16c4, atxmega16d4, atxmega32a4, atxmega32a4u, atxmega32c3, atxmega32c4, atxmega32d3, atxmega32d4, atxmega32e5, atxmega16e5, atxmega8e5, avrxmega4, atxmega64a3, atxmega64a3u, atxmega64a4u, atxmega64b1, atxmega64b3, atxmega64c3, atxmega64d3, atxmega64d4, avrxmega5, atxmega64a1, atxmega64a1u, avrxmega6, atxmega128a3, atxmega128a3u, atxmega128b1, atxmega128b3, atxmega128c3, atxmega128d3, atxmega128d4, atxmega192a3, atxmega192a3u, atxmega192c3, atxmega192d3, atxmega256a3, atxmega256a3u, atxmega256a3b, atxmega256a3bu, atxmega256c3, atxmega256d3, atxmega384c3, atxmega384d3, avrxmega7, atxmega128a1, atxmega128a1u, atxmega128a4u, avrtiny, attiny4, attiny5, attiny9, attiny10, attiny20, attiny40, attiny102, attiny104, avrxmega3, attiny202, attiny402, attiny204, attiny404, attiny804, attiny1604, attiny406, attiny806, attiny1606, attiny807, attiny1607, attiny212, attiny412, attiny214, attiny414, attiny814, attiny1614, attiny416, attiny816, attiny1616, attiny3216, attiny417, attiny817, attiny1617, attiny3217, attiny1624, attiny1626, attiny1627, atmega808, atmega809, atmega1608, atmega1609, atmega3208, atmega3209, atmega4808, atmega4809			// AVR-NEXT: note: valid target CPU values are: avr1, at90s1200, attiny11, attiny12, attiny15, attiny28, avr2, at90s2313, at90s2323, at90s2333, at90s2343, attiny22, attiny26, at86rf401, at90s4414, at90s4433, at90s4434, at90s8515, at90c8534, at90s8535, avr25, ata5272, ata6616c, attiny13, 
attiny13a, attiny2313, attiny2313a, attiny24, attiny24a, attiny4313, attiny44, attiny44a, attiny84, attiny84a, attiny25, attiny45, attiny85, attiny261, attiny261a, attiny441, attiny461, attiny461a, attiny841, attiny861, attiny861a, attiny87, attiny43u, attiny48, attiny88, attiny828, avr3, at43usb355, at76c711, avr31, atmega103, at43usb320, avr35, attiny167, at90usb82, at90usb162, ata5505, ata6617c, ata664251, atmega8u2, atmega16u2, atmega32u2, attiny1634, avr4, atmega8, ata6289, atmega8a, ata6285, ata6286, ata6612c, atmega48, atmega48a, atmega48pa, atmega48pb, atmega48p, atmega88, atmega88a, atmega88p, atmega88pa, atmega88pb, atmega8515, atmega8535, atmega8hva, at90pwm1, at90pwm2, at90pwm2b, at90pwm3, at90pwm3b, at90pwm81, avr5, ata5702m322, ata5782, ata5790, ata5790n, ata5791, ata5795, ata5831, ata6613c, ata6614q, ata8210, ata8510, atmega16, atmega16a, atmega161, atmega162, atmega163, atmega164a, atmega164p, atmega164pa, atmega165, atmega165a, atmega165p, atmega165pa, atmega168, atmega168a, atmega168p, atmega168pa, atmega168pb, atmega169, atmega169a, atmega169p, atmega169pa, atmega32, atmega32a, atmega323, atmega324a, atmega324p, atmega324pa, atmega324pb, atmega325, atmega325a, atmega325p, atmega325pa, atmega3250, atmega3250a, atmega3250p, atmega3250pa, atmega328, atmega328p, atmega328pb, atmega329, atmega329a, atmega329p, atmega329pa, atmega3290, atmega3290a, atmega3290p, atmega3290pa, atmega406, atmega64, atmega64a, atmega640, atmega644, atmega644a, atmega644p, atmega644pa, atmega645, atmega645a, atmega645p, atmega649, atmega649a, atmega649p, atmega6450, atmega6450a, atmega6450p, atmega6490, atmega6490a, atmega6490p, atmega64rfr2, atmega644rfr2, atmega16hva, atmega16hva2, atmega16hvb, atmega16hvbrevb, atmega32hvb, atmega32hvbrevb, atmega64hve, atmega64hve2, at90can32, at90can64, at90pwm161, at90pwm216, at90pwm316, atmega32c1, atmega64c1, atmega16m1, atmega32m1, atmega64m1, atmega16u4, atmega32u4, atmega32u6, at90usb646, at90usb647, at90scr100, at94k, m3000, 
avr51, atmega128, atmega128a, atmega1280, atmega1281, atmega1284, atmega1284p, atmega128rfa1, atmega128rfr2, atmega1284rfr2, at90can128, at90usb1286, at90usb1287, avr6, atmega2560, atmega2561, atmega256rfr2, atmega2564rfr2, avrxmega2, atxmega16a4, atxmega16a4u, atxmega16c4, atxmega16d4, atxmega32a4, atxmega32a4u, atxmega32c3, atxmega32c4, atxmega32d3, atxmega32d4, atxmega32e5, atxmega16e5, atxmega8e5, avrxmega4, atxmega64a3, atxmega64a3u, atxmega64a4u, atxmega64b1, atxmega64b3, atxmega64c3, atxmega64d3, atxmega64d4, avrxmega5, atxmega64a1, atxmega64a1u, avrxmega6, atxmega128a3, atxmega128a3u, atxmega128b1, atxmega128b3, atxmega128c3, atxmega128d3, atxmega128d4, atxmega192a3, atxmega192a3u, atxmega192c3, atxmega192d3, atxmega256a3, atxmega256a3u, atxmega256a3b, atxmega256a3bu, atxmega256c3, atxmega256d3, atxmega384c3, atxmega384d3, avrxmega7, atxmega128a1, atxmega128a1u, atxmega128a4u, avrtiny, attiny4, attiny5, attiny9, attiny10, attiny20, attiny40, attiny102, attiny104, avrxmega3, attiny202, attiny402, attiny204, attiny404, attiny804, attiny1604, attiny406, attiny806, attiny1606, attiny807, attiny1607, attiny212, attiny412, attiny214, attiny414, attiny814, attiny1614, attiny416, attiny816, attiny1616, attiny3216, attiny417, attiny817, attiny1617, attiny3217, attiny1624, attiny1626, attiny1627, atmega808, atmega809, atmega1608, atmega1609, atmega3208, atmega3209, atmega4808, atmega4809

	// RUN: not %clang_cc1 -triple riscv32 -target-cpu not-a-cpu -fsyntax-only %s 2>&1 | FileCheck %s --check-prefix RISCV32			// RUN: not %clang_cc1 -triple riscv32 -target-cpu not-a-cpu -fsyntax-only %s 2>&1 | FileCheck %s --check-prefix RISCV32
	// RISCV32: error: unknown target CPU 'not-a-cpu'			// RISCV32: error: unknown target CPU 'not-a-cpu'
	// RISCV32-NEXT: note: valid target CPU values are: generic-rv32, rocket-rv32, sifive-e20, sifive-e21, sifive-e24, sifive-e31, sifive-e34, sifive-e76, syntacore-scr1-base, syntacore-scr1-max{{$}}			// RISCV32-NEXT: note: valid target CPU values are: generic-rv32, rocket-rv32, sifive-e20, sifive-e21, sifive-e24, sifive-e31, sifive-e34, sifive-e76, syntacore-scr1-base, syntacore-scr1-max{{$}}
	[12 lines elided]

llvm/lib/Target/BPF/AsmParser/BPFAsmParser.cpp

[221 lines elided]	public:
}		}

// Identifiers that can be used at the start of a statment.		// Identifiers that can be used at the start of a statment.
static bool isValidIdAtStart(StringRef Name) {		static bool isValidIdAtStart(StringRef Name) {
return StringSwitch<bool>(Name.lower())		return StringSwitch<bool>(Name.lower())
.Case("if", true)		.Case("if", true)
.Case("call", true)		.Case("call", true)
.Case("goto", true)		.Case("goto", true)
		.Case("gotol", true)
.Case("*", true)		.Case("*", true)
.Case("exit", true)		.Case("exit", true)
.Case("lock", true)		.Case("lock", true)
.Case("ld_pseudo", true)		.Case("ld_pseudo", true)
.Default(false);		.Default(false);
}		}

// Identifiers that can be used in the middle of a statment.		// Identifiers that can be used in the middle of a statment.
static bool isValidIdInMiddle(StringRef Name) {		static bool isValidIdInMiddle(StringRef Name) {
return StringSwitch<bool>(Name.lower())		return StringSwitch<bool>(Name.lower())
.Case("u64", true)		.Case("u64", true)
.Case("u32", true)		.Case("u32", true)
.Case("u16", true)		.Case("u16", true)
.Case("u8", true)		.Case("u8", true)
		.Case("s32", true)
		.Case("s16", true)
		.Case("s8", true)
.Case("be64", true)		.Case("be64", true)
.Case("be32", true)		.Case("be32", true)
.Case("be16", true)		.Case("be16", true)
.Case("le64", true)		.Case("le64", true)
.Case("le32", true)		.Case("le32", true)
.Case("le16", true)		.Case("le16", true)
		.Case("bswap16", true)
		.Case("bswap32", true)
		.Case("bswap64", true)
.Case("goto", true)		.Case("goto", true)
		.Case("gotol", true)
.Case("ll", true)		.Case("ll", true)
.Case("skb", true)		.Case("skb", true)
.Case("s", true)		.Case("s", true)
.Case("atomic_fetch_add", true)		.Case("atomic_fetch_add", true)
.Case("atomic_fetch_and", true)		.Case("atomic_fetch_and", true)
.Case("atomic_fetch_or", true)		.Case("atomic_fetch_or", true)
.Case("atomic_fetch_xor", true)		.Case("atomic_fetch_xor", true)
.Case("xchg_64", true)		.Case("xchg_64", true)
[263 lines elided]

llvm/lib/Target/BPF/BPF.td

	[24 lines elided]

	def DwarfRIS: SubtargetFeature<"dwarfris", "UseDwarfRIS", "true",			def DwarfRIS: SubtargetFeature<"dwarfris", "UseDwarfRIS", "true",
	"Disable MCAsmInfo DwarfUsesRelocationsAcrossSections">;			"Disable MCAsmInfo DwarfUsesRelocationsAcrossSections">;

	def : Proc<"generic", []>;			def : Proc<"generic", []>;
	def : Proc<"v1", []>;			def : Proc<"v1", []>;
	def : Proc<"v2", []>;			def : Proc<"v2", []>;
	def : Proc<"v3", [ALU32]>;			def : Proc<"v3", [ALU32]>;
				def : Proc<"v4", [ALU32]>;
	def : Proc<"probe", []>;			def : Proc<"probe", []>;

	def BPFInstPrinter : AsmWriter {			def BPFInstPrinter : AsmWriter {
	string AsmWriterClassName = "InstPrinter";			string AsmWriterClassName = "InstPrinter";
	bit isMCAsmWriter = 1;			bit isMCAsmWriter = 1;
	}			}

	def BPFAsmParser : AsmParser {			def BPFAsmParser : AsmParser {
	bit HasMnemonicFirst = 0;			bit HasMnemonicFirst = 0;
	}			}

	def BPFAsmParserVariant : AsmParserVariant {			def BPFAsmParserVariant : AsmParserVariant {
	int Variant = 0;			int Variant = 0;
	string Name = "BPF";			string Name = "BPF";
	string BreakCharacters = ".";			string BreakCharacters = ".";
	string TokenizingCharacters = "#()[]=:.<>!+*";			string TokenizingCharacters = "#()[]=:.<>!+*%/";
	}			}

	def BPF : Target {			def BPF : Target {
	let InstructionSet = BPFInstrInfo;			let InstructionSet = BPFInstrInfo;
	let AssemblyWriters = [BPFInstPrinter];			let AssemblyWriters = [BPFInstPrinter];
	let AssemblyParsers = [BPFAsmParser];			let AssemblyParsers = [BPFAsmParser];
	let AssemblyParserVariants = [BPFAsmParserVariant];			let AssemblyParserVariants = [BPFAsmParserVariant];
	}			}

llvm/lib/Target/BPF/BPFISelDAGToDAG.cpp

[186 lines elided]	if (Node->isMachineOpcode()) {
return;		return;
}		}

// tablegen selection should be handled here.		// tablegen selection should be handled here.
switch (Opcode) {		switch (Opcode) {
default:		default:
break;		break;
case ISD::SDIV: {		case ISD::SDIV: {
		if (!Subtarget->hasSdivSmod()) {
DebugLoc Empty;		DebugLoc Empty;
const DebugLoc &DL = Node->getDebugLoc();		const DebugLoc &DL = Node->getDebugLoc();
if (DL != Empty)		if (DL != Empty)
errs() << "Error at line " << DL.getLine() << ": ";		errs() << "Error at line " << DL.getLine() << ": ";
else		else
errs() << "Error: ";		errs() << "Error: ";
errs() << "Unsupport signed division for DAG: ";		errs() << "Unsupport signed division for DAG: ";
Node->print(errs(), CurDAG);		Node->print(errs(), CurDAG);
errs() << "Please convert to unsigned div/mod.\n";		errs() << "Please convert to unsigned div/mod.\n";
		}
break;		break;
}		}
case ISD::INTRINSIC_W_CHAIN: {		case ISD::INTRINSIC_W_CHAIN: {
unsigned IntNo = cast<ConstantSDNode>(Node->getOperand(1))->getZExtValue();		unsigned IntNo = cast<ConstantSDNode>(Node->getOperand(1))->getZExtValue();
switch (IntNo) {		switch (IntNo) {
case Intrinsic::bpf_load_byte:		case Intrinsic::bpf_load_byte:
case Intrinsic::bpf_load_half:		case Intrinsic::bpf_load_half:
case Intrinsic::bpf_load_word: {		case Intrinsic::bpf_load_word: {
[291 lines elided]

llvm/lib/Target/BPF/BPFISelLowering.h

[65 lines elided]	public:

MVT getScalarShiftAmountTy(const DataLayout &, EVT) const override;		MVT getScalarShiftAmountTy(const DataLayout &, EVT) const override;

private:		private:
// Control Instruction Selection Features		// Control Instruction Selection Features
bool HasAlu32;		bool HasAlu32;
bool HasJmp32;		bool HasJmp32;
bool HasJmpExt;		bool HasJmpExt;
		bool HasMovsx;

SDValue LowerBR_CC(SDValue Op, SelectionDAG &DAG) const;		SDValue LowerBR_CC(SDValue Op, SelectionDAG &DAG) const;
SDValue LowerSELECT_CC(SDValue Op, SelectionDAG &DAG) const;		SDValue LowerSELECT_CC(SDValue Op, SelectionDAG &DAG) const;
SDValue LowerGlobalAddress(SDValue Op, SelectionDAG &DAG) const;		SDValue LowerGlobalAddress(SDValue Op, SelectionDAG &DAG) const;

// Lower the result values of a call, copying them out of physregs into vregs		// Lower the result values of a call, copying them out of physregs into vregs
SDValue LowerCallResult(SDValue Chain, SDValue InGlue,		SDValue LowerCallResult(SDValue Chain, SDValue InGlue,
CallingConv::ID CallConv, bool IsVarArg,		CallingConv::ID CallConv, bool IsVarArg,
[76 lines elided]

llvm/lib/Target/BPF/BPFISelLowering.cpp

[96 lines elided]	BPFTargetLowering::BPFTargetLowering(const TargetMachine &TM,
}		}

for (auto VT : { MVT::i32, MVT::i64 }) {		for (auto VT : { MVT::i32, MVT::i64 }) {
if (VT == MVT::i32 && !STI.getHasAlu32())		if (VT == MVT::i32 && !STI.getHasAlu32())
continue;		continue;

setOperationAction(ISD::SDIVREM, VT, Expand);		setOperationAction(ISD::SDIVREM, VT, Expand);
setOperationAction(ISD::UDIVREM, VT, Expand);		setOperationAction(ISD::UDIVREM, VT, Expand);
		if (!STI.hasSdivSmod())
setOperationAction(ISD::SREM, VT, Expand);		setOperationAction(ISD::SREM, VT, Expand);
setOperationAction(ISD::MULHU, VT, Expand);		setOperationAction(ISD::MULHU, VT, Expand);
setOperationAction(ISD::MULHS, VT, Expand);		setOperationAction(ISD::MULHS, VT, Expand);
setOperationAction(ISD::UMUL_LOHI, VT, Expand);		setOperationAction(ISD::UMUL_LOHI, VT, Expand);
setOperationAction(ISD::SMUL_LOHI, VT, Expand);		setOperationAction(ISD::SMUL_LOHI, VT, Expand);
setOperationAction(ISD::ROTR, VT, Expand);		setOperationAction(ISD::ROTR, VT, Expand);
setOperationAction(ISD::ROTL, VT, Expand);		setOperationAction(ISD::ROTL, VT, Expand);
setOperationAction(ISD::SHL_PARTS, VT, Expand);		setOperationAction(ISD::SHL_PARTS, VT, Expand);
setOperationAction(ISD::SRL_PARTS, VT, Expand);		setOperationAction(ISD::SRL_PARTS, VT, Expand);
[22 lines elided]	BPFTargetLowering::BPFTargetLowering(const TargetMachine &TM,
setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i32, Expand);		setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i32, Expand);

// Extended load operations for i1 types must be promoted		// Extended load operations for i1 types must be promoted
for (MVT VT : MVT::integer_valuetypes()) {		for (MVT VT : MVT::integer_valuetypes()) {
setLoadExtAction(ISD::EXTLOAD, VT, MVT::i1, Promote);		setLoadExtAction(ISD::EXTLOAD, VT, MVT::i1, Promote);
setLoadExtAction(ISD::ZEXTLOAD, VT, MVT::i1, Promote);		setLoadExtAction(ISD::ZEXTLOAD, VT, MVT::i1, Promote);
setLoadExtAction(ISD::SEXTLOAD, VT, MVT::i1, Promote);		setLoadExtAction(ISD::SEXTLOAD, VT, MVT::i1, Promote);

		if (!STI.hasLdsx()) {
setLoadExtAction(ISD::SEXTLOAD, VT, MVT::i8, Expand);		setLoadExtAction(ISD::SEXTLOAD, VT, MVT::i8, Expand);
setLoadExtAction(ISD::SEXTLOAD, VT, MVT::i16, Expand);		setLoadExtAction(ISD::SEXTLOAD, VT, MVT::i16, Expand);
setLoadExtAction(ISD::SEXTLOAD, VT, MVT::i32, Expand);		setLoadExtAction(ISD::SEXTLOAD, VT, MVT::i32, Expand);
}		}
		}

setBooleanContents(ZeroOrOneBooleanContent);		setBooleanContents(ZeroOrOneBooleanContent);

// Function alignments		// Function alignments
setMinFunctionAlignment(Align(8));		setMinFunctionAlignment(Align(8));
setPrefFunctionAlignment(Align(8));		setPrefFunctionAlignment(Align(8));

if (BPFExpandMemcpyInOrder) {		if (BPFExpandMemcpyInOrder) {
[22 lines elided]	if (BPFExpandMemcpyInOrder) {
MaxStoresPerMemmove = MaxStoresPerMemmoveOptSize = CommonMaxStores;		MaxStoresPerMemmove = MaxStoresPerMemmoveOptSize = CommonMaxStores;
MaxLoadsPerMemcmp = MaxLoadsPerMemcmpOptSize = CommonMaxStores;		MaxLoadsPerMemcmp = MaxLoadsPerMemcmpOptSize = CommonMaxStores;
}		}

// CPU/Feature control		// CPU/Feature control
HasAlu32 = STI.getHasAlu32();		HasAlu32 = STI.getHasAlu32();
HasJmp32 = STI.getHasJmp32();		HasJmp32 = STI.getHasJmp32();
HasJmpExt = STI.getHasJmpExt();		HasJmpExt = STI.getHasJmpExt();
		HasMovsx = STI.hasMovsx();
}		}

bool BPFTargetLowering::isOffsetFoldingLegal(const GlobalAddressSDNode *GA) const {		bool BPFTargetLowering::isOffsetFoldingLegal(const GlobalAddressSDNode *GA) const {
return false;		return false;
}		}

bool BPFTargetLowering::isTruncateFree(Type Ty1, Type Ty2) const {		bool BPFTargetLowering::isTruncateFree(Type Ty1, Type Ty2) const {
if (!Ty1->isIntegerTy() || !Ty2->isIntegerTy())		if (!Ty1->isIntegerTy() || !Ty2->isIntegerTy())
[474 lines elided]	BPFTargetLowering::EmitSubregExt(MachineInstr &MI, MachineBasicBlock *BB,
if (!isSigned) {		if (!isSigned) {
Register PromotedReg0 = RegInfo.createVirtualRegister(RC);		Register PromotedReg0 = RegInfo.createVirtualRegister(RC);
BuildMI(BB, DL, TII.get(BPF::MOV_32_64), PromotedReg0).addReg(Reg);		BuildMI(BB, DL, TII.get(BPF::MOV_32_64), PromotedReg0).addReg(Reg);
return PromotedReg0;		return PromotedReg0;
}		}
Register PromotedReg0 = RegInfo.createVirtualRegister(RC);		Register PromotedReg0 = RegInfo.createVirtualRegister(RC);
Register PromotedReg1 = RegInfo.createVirtualRegister(RC);		Register PromotedReg1 = RegInfo.createVirtualRegister(RC);
Register PromotedReg2 = RegInfo.createVirtualRegister(RC);		Register PromotedReg2 = RegInfo.createVirtualRegister(RC);
		if (HasMovsx) {
		BuildMI(BB, DL, TII.get(BPF::MOVSX_rr_32), PromotedReg0).addReg(Reg);
		} else {
BuildMI(BB, DL, TII.get(BPF::MOV_32_64), PromotedReg0).addReg(Reg);		BuildMI(BB, DL, TII.get(BPF::MOV_32_64), PromotedReg0).addReg(Reg);
BuildMI(BB, DL, TII.get(BPF::SLL_ri), PromotedReg1)		BuildMI(BB, DL, TII.get(BPF::SLL_ri), PromotedReg1)
.addReg(PromotedReg0).addImm(32);		.addReg(PromotedReg0).addImm(32);
BuildMI(BB, DL, TII.get(RShiftOp), PromotedReg2)		BuildMI(BB, DL, TII.get(RShiftOp), PromotedReg2)
.addReg(PromotedReg1).addImm(32);		.addReg(PromotedReg1).addImm(32);
		}

return PromotedReg2;		return PromotedReg2;
}		}

MachineBasicBlock *		MachineBasicBlock *
BPFTargetLowering::EmitInstrWithCustomInserterMemcpy(MachineInstr &MI,		BPFTargetLowering::EmitInstrWithCustomInserterMemcpy(MachineInstr &MI,
MachineBasicBlock *BB)		MachineBasicBlock *BB)
const {		const {
[196 lines elided]
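
The `EmitSubregExt` change above replaces a three-instruction sign extension (`MOV_32_64`, shift left by 32, shift right by 32) with a single `MOVSX_rr_32` when `HasMovsx` is set. A minimal C++ sketch of why the two sequences produce the same value is given below. The function names are hypothetical, the right shift is assumed to be the arithmetic one used for the signed case, and the code relies on the arithmetic behaviour of `>>` on signed integers that C++20 guarantees and mainstream compilers already provide.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the pre-v4 sequence for a signed 32-bit subregister.
inline int64_t sextViaShifts(uint32_t Sub) {
  uint64_t Promoted = Sub;                    // MOV_32_64: zero-extend to 64 bits
  uint64_t Shifted = Promoted << 32;          // SLL_ri ..., 32
  return static_cast<int64_t>(Shifted) >> 32; // arithmetic right shift by 32
}

// Hypothetical model of a single "$dst = (s32)$src" (MOVSX_rr_32).
inline int64_t sextViaMovsx(uint32_t Sub) {
  return static_cast<int64_t>(static_cast<int32_t>(Sub));
}
```

Both helpers agree for every 32-bit input, for example `0x80000000` maps to the same negative 64-bit value in each.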

llvm/lib/Target/BPF/BPFInstrFormats.td

	[84 lines elided]
	class BPFModeModifer<bits<3> val> {			class BPFModeModifer<bits<3> val> {
	bits<3> Value = val;			bits<3> Value = val;
	}			}

	def BPF_IMM : BPFModeModifer<0x0>;			def BPF_IMM : BPFModeModifer<0x0>;
	def BPF_ABS : BPFModeModifer<0x1>;			def BPF_ABS : BPFModeModifer<0x1>;
	def BPF_IND : BPFModeModifer<0x2>;			def BPF_IND : BPFModeModifer<0x2>;
	def BPF_MEM : BPFModeModifer<0x3>;			def BPF_MEM : BPFModeModifer<0x3>;
				def BPF_MEMSX : BPFModeModifer<0x4>;
				eddyz87: Nitpick: the mailing list doc refers to this as `BPF_SMEM`.
	def BPF_ATOMIC : BPFModeModifer<0x6>;			def BPF_ATOMIC : BPFModeModifer<0x6>;

	class BPFAtomicFlag<bits<4> val> {			class BPFAtomicFlag<bits<4> val> {
	bits<4> Value = val;			bits<4> Value = val;
	}			}

	def BPF_FETCH : BPFAtomicFlag<0x1>;			def BPF_FETCH : BPFAtomicFlag<0x1>;

	[24 lines elided]
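
The new `BPF_MEMSX` mode value (0x4) occupies the top three bits of the load/store opcode byte. The sketch below shows how that byte could be composed for the sign-extending loads. The mode values come from the listing above; the size and class constants are assumptions taken from the standard BPF encoding and do not appear in this diff, and `ldStOpcode` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>

// Mode values as listed in BPFInstrFormats.td (3-bit field).
constexpr uint8_t kModeMem = 0x3;   // BPF_MEM
constexpr uint8_t kModeMemSX = 0x4; // BPF_MEMSX (new in this patch)

// Assumed from the standard BPF encoding, not shown in this diff.
constexpr uint8_t kSizeW = 0x0;    // 32-bit
constexpr uint8_t kSizeH = 0x1;    // 16-bit
constexpr uint8_t kSizeB = 0x2;    // 8-bit
constexpr uint8_t kClassLDX = 0x1; // BPF_LDX

// Hypothetical helper: mode in bits 7..5, size in bits 4..3, class in bits 2..0.
constexpr uint8_t ldStOpcode(uint8_t Mode, uint8_t Size, uint8_t Class) {
  return static_cast<uint8_t>((Mode << 5) | (Size << 3) | Class);
}
```

Under these assumptions a zero-extending byte load uses opcode byte 0x71, while the sign-extending variant differs only in the mode bits and uses 0x91.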

llvm/lib/Target/BPF/BPFInstrInfo.td

[47 lines elided]
def BPFWrapper : SDNode<"BPFISD::Wrapper", SDT_BPFWrapper>;		def BPFWrapper : SDNode<"BPFISD::Wrapper", SDT_BPFWrapper>;
def BPFmemcpy : SDNode<"BPFISD::MEMCPY", SDT_BPFMEMCPY,		def BPFmemcpy : SDNode<"BPFISD::MEMCPY", SDT_BPFMEMCPY,
[SDNPHasChain, SDNPInGlue, SDNPOutGlue,		[SDNPHasChain, SDNPInGlue, SDNPOutGlue,
SDNPMayStore, SDNPMayLoad]>;		SDNPMayStore, SDNPMayLoad]>;
def BPFIsLittleEndian : Predicate<"CurDAG->getDataLayout().isLittleEndian()">;		def BPFIsLittleEndian : Predicate<"CurDAG->getDataLayout().isLittleEndian()">;
def BPFIsBigEndian : Predicate<"!CurDAG->getDataLayout().isLittleEndian()">;		def BPFIsBigEndian : Predicate<"!CurDAG->getDataLayout().isLittleEndian()">;
def BPFHasALU32 : Predicate<"Subtarget->getHasAlu32()">;		def BPFHasALU32 : Predicate<"Subtarget->getHasAlu32()">;
def BPFNoALU32 : Predicate<"!Subtarget->getHasAlu32()">;		def BPFNoALU32 : Predicate<"!Subtarget->getHasAlu32()">;
		def BPFHasLdsx : Predicate<"Subtarget->hasLdsx()">;
		ast: Here and elsewhere... let's drop CPUv4 mid prefix. imo the extra verbosity doesn't improve readability. Same with the flag: disable-cpuv4-movsx. It can be disable-movsx. s/BPFHasCPUv4_ldsx/BPFHasLdsx/ s/getCPUv4_bswap/getHasBswap/ or even shorter hasBswap ?
		yonghong-song: Makes sense. Will do. Ya, hasBswap is good enough to capture what it means.
		def BPFHasMovsx : Predicate<"Subtarget->hasMovsx()">;
		def BPFHasBswap : Predicate<"Subtarget->hasBswap()">;
		def BPFHasSdivSmod : Predicate<"Subtarget->hasSdivSmod()">;
		def BPFNoMovsx : Predicate<"!Subtarget->hasMovsx()">;
		def BPFNoBswap : Predicate<"!Subtarget->hasBswap()">;

def brtarget : Operand<OtherVT> {		def brtarget : Operand<OtherVT> {
let PrintMethod = "printBrTargetOperand";		let PrintMethod = "printBrTargetOperand";
}		}
def calltarget : Operand<i64>;		def calltarget : Operand<i64>;

def u64imm : Operand<i64> {		def u64imm : Operand<i64> {
let PrintMethod = "printImm64Operand";		let PrintMethod = "printImm64Operand";
[171 lines elided]
defm JSGE : J<BPF_JSGE, "s>=", BPF_CC_GE, BPF_CC_GE_32>;		defm JSGE : J<BPF_JSGE, "s>=", BPF_CC_GE, BPF_CC_GE_32>;
defm JULT : J<BPF_JLT, "<", BPF_CC_LTU, BPF_CC_LTU_32>;		defm JULT : J<BPF_JLT, "<", BPF_CC_LTU, BPF_CC_LTU_32>;
defm JULE : J<BPF_JLE, "<=", BPF_CC_LEU, BPF_CC_LEU_32>;		defm JULE : J<BPF_JLE, "<=", BPF_CC_LEU, BPF_CC_LEU_32>;
defm JSLT : J<BPF_JSLT, "s<", BPF_CC_LT, BPF_CC_LT_32>;		defm JSLT : J<BPF_JSLT, "s<", BPF_CC_LT, BPF_CC_LT_32>;
defm JSLE : J<BPF_JSLE, "s<=", BPF_CC_LE, BPF_CC_LE_32>;		defm JSLE : J<BPF_JSLE, "s<=", BPF_CC_LE, BPF_CC_LE_32>;
}		}

// ALU instructions		// ALU instructions
class ALU_RI<BPFOpClass Class, BPFArithOp Opc,		class ALU_RI<BPFOpClass Class, BPFArithOp Opc, int off,
dag outs, dag ins, string asmstr, list<dag> pattern>		dag outs, dag ins, string asmstr, list<dag> pattern>
: TYPE_ALU_JMP<Opc.Value, BPF_K.Value, outs, ins, asmstr, pattern> {		: TYPE_ALU_JMP<Opc.Value, BPF_K.Value, outs, ins, asmstr, pattern> {
bits<4> dst;		bits<4> dst;
bits<32> imm;		bits<32> imm;

let Inst{51-48} = dst;		let Inst{51-48} = dst;
		let Inst{47-32} = off;
let Inst{31-0} = imm;		let Inst{31-0} = imm;
let BPFClass = Class;		let BPFClass = Class;
}		}

class ALU_RR<BPFOpClass Class, BPFArithOp Opc,		class ALU_RR<BPFOpClass Class, BPFArithOp Opc, int off,
dag outs, dag ins, string asmstr, list<dag> pattern>		dag outs, dag ins, string asmstr, list<dag> pattern>
: TYPE_ALU_JMP<Opc.Value, BPF_X.Value, outs, ins, asmstr, pattern> {		: TYPE_ALU_JMP<Opc.Value, BPF_X.Value, outs, ins, asmstr, pattern> {
bits<4> dst;		bits<4> dst;
bits<4> src;		bits<4> src;

let Inst{55-52} = src;		let Inst{55-52} = src;
let Inst{51-48} = dst;		let Inst{51-48} = dst;
		let Inst{47-32} = off;
let BPFClass = Class;		let BPFClass = Class;
}		}

multiclass ALU<BPFArithOp Opc, string OpcodeStr, SDNode OpNode> {		multiclass ALU<BPFArithOp Opc, int off, string OpcodeStr, SDNode OpNode> {
def _rr : ALU_RR<BPF_ALU64, Opc,		def _rr : ALU_RR<BPF_ALU64, Opc, off,
(outs GPR:$dst),		(outs GPR:$dst),
(ins GPR:$src2, GPR:$src),		(ins GPR:$src2, GPR:$src),
"$dst "#OpcodeStr#" $src",		"$dst "#OpcodeStr#" $src",
[(set GPR:$dst, (OpNode i64:$src2, i64:$src))]>;		[(set GPR:$dst, (OpNode i64:$src2, i64:$src))]>;
def _ri : ALU_RI<BPF_ALU64, Opc,		def _ri : ALU_RI<BPF_ALU64, Opc, off,
(outs GPR:$dst),		(outs GPR:$dst),
(ins GPR:$src2, i64imm:$imm),		(ins GPR:$src2, i64imm:$imm),
"$dst "#OpcodeStr#" $imm",		"$dst "#OpcodeStr#" $imm",
[(set GPR:$dst, (OpNode GPR:$src2, i64immSExt32:$imm))]>;		[(set GPR:$dst, (OpNode GPR:$src2, i64immSExt32:$imm))]>;
def _rr_32 : ALU_RR<BPF_ALU, Opc,		def _rr_32 : ALU_RR<BPF_ALU, Opc, off,
(outs GPR32:$dst),		(outs GPR32:$dst),
(ins GPR32:$src2, GPR32:$src),		(ins GPR32:$src2, GPR32:$src),
"$dst "#OpcodeStr#" $src",		"$dst "#OpcodeStr#" $src",
[(set GPR32:$dst, (OpNode i32:$src2, i32:$src))]>;		[(set GPR32:$dst, (OpNode i32:$src2, i32:$src))]>;
def _ri_32 : ALU_RI<BPF_ALU, Opc,		def _ri_32 : ALU_RI<BPF_ALU, Opc, off,
(outs GPR32:$dst),		(outs GPR32:$dst),
(ins GPR32:$src2, i32imm:$imm),		(ins GPR32:$src2, i32imm:$imm),
"$dst "#OpcodeStr#" $imm",		"$dst "#OpcodeStr#" $imm",
[(set GPR32:$dst, (OpNode GPR32:$src2, i32immSExt32:$imm))]>;		[(set GPR32:$dst, (OpNode GPR32:$src2, i32immSExt32:$imm))]>;
}		}

let Constraints = "$dst = $src2" in {		let Constraints = "$dst = $src2" in {
let isAsCheapAsAMove = 1 in {		let isAsCheapAsAMove = 1 in {
defm ADD : ALU<BPF_ADD, "+=", add>;		defm ADD : ALU<BPF_ADD, 0, "+=", add>;
defm SUB : ALU<BPF_SUB, "-=", sub>;		defm SUB : ALU<BPF_SUB, 0, "-=", sub>;
defm OR : ALU<BPF_OR, "|=", or>;		defm OR : ALU<BPF_OR, 0, "|=", or>;
defm AND : ALU<BPF_AND, "&=", and>;		defm AND : ALU<BPF_AND, 0, "&=", and>;
defm SLL : ALU<BPF_LSH, "<<=", shl>;		defm SLL : ALU<BPF_LSH, 0, "<<=", shl>;
defm SRL : ALU<BPF_RSH, ">>=", srl>;		defm SRL : ALU<BPF_RSH, 0, ">>=", srl>;
defm XOR : ALU<BPF_XOR, "^=", xor>;		defm XOR : ALU<BPF_XOR, 0, "^=", xor>;
defm SRA : ALU<BPF_ARSH, "s>>=", sra>;		defm SRA : ALU<BPF_ARSH, 0, "s>>=", sra>;
}		}
defm MUL : ALU<BPF_MUL, "*=", mul>;		defm MUL : ALU<BPF_MUL, 0, "*=", mul>;
defm DIV : ALU<BPF_DIV, "/=", udiv>;		defm DIV : ALU<BPF_DIV, 0, "/=", udiv>;
defm MOD : ALU<BPF_MOD, "%=", urem>;		defm MOD : ALU<BPF_MOD, 0, "%=", urem>;

		let Predicates = [BPFHasSdivSmod] in {
		defm SDIV : ALU<BPF_DIV, 1, "s/=", sdiv>;
		defm SMOD : ALU<BPF_MOD, 1, "s%=", srem>;
		}
}		}
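
The `off` parameter threaded through `ALU_RI`, `ALU_RR` and `ALU` above is what distinguishes `SDIV`/`SMOD` from `DIV`/`MOD`: the opcode stays the same and only `Inst{47-32}` changes from 0 to 1. A minimal sketch of the resulting byte layout follows, assuming a little-endian target. The numeric class, source and opcode constants are assumptions from the standard BPF encoding (they are not part of this diff), and `encodeAlu64RR` is a hypothetical helper.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Assumed values from the standard BPF instruction encoding.
constexpr uint8_t kClassALU64 = 0x07; // BPF_ALU64
constexpr uint8_t kSrcX = 0x08;       // BPF_X: second operand is a register
constexpr uint8_t kOpDiv = 0x30;      // BPF_DIV
constexpr uint8_t kOpMod = 0x90;      // BPF_MOD

// Hypothetical helper: encodes a 64-bit register-register ALU instruction
// as 8 little-endian bytes: opcode, src/dst nibbles, 16-bit off, 32-bit imm.
std::array<uint8_t, 8> encodeAlu64RR(uint8_t Op, unsigned Dst, unsigned Src,
                                     uint16_t Off) {
  std::array<uint8_t, 8> Bytes{}; // imm (bytes 4..7) stays zero
  Bytes[0] = static_cast<uint8_t>(Op | kSrcX | kClassALU64);
  Bytes[1] = static_cast<uint8_t>(((Src & 0xf) << 4) | (Dst & 0xf));
  Bytes[2] = static_cast<uint8_t>(Off & 0xff);
  Bytes[3] = static_cast<uint8_t>(Off >> 8);
  return Bytes;
}
```

With these assumptions, `r1 /= r2` and `r1 s/= r2` share the opcode byte 0x3f and differ only in the offset field (0 versus 1).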

class NEG_RR<BPFOpClass Class, BPFArithOp Opc,		class NEG_RR<BPFOpClass Class, BPFArithOp Opc,
dag outs, dag ins, string asmstr, list<dag> pattern>		dag outs, dag ins, string asmstr, list<dag> pattern>
: TYPE_ALU_JMP<Opc.Value, 0, outs, ins, asmstr, pattern> {		: TYPE_ALU_JMP<Opc.Value, 0, outs, ins, asmstr, pattern> {
bits<4> dst;		bits<4> dst;

let Inst{51-48} = dst;		let Inst{51-48} = dst;
[23 lines elided]	class LD_IMM64<bits<4> Pseudo, string OpcodeStr>
let Inst{55-52} = Pseudo;		let Inst{55-52} = Pseudo;
let Inst{47-32} = 0;		let Inst{47-32} = 0;
let Inst{31-0} = imm{31-0};		let Inst{31-0} = imm{31-0};
let BPFClass = BPF_LD;		let BPFClass = BPF_LD;
}		}

let isReMaterializable = 1, isAsCheapAsAMove = 1 in {		let isReMaterializable = 1, isAsCheapAsAMove = 1 in {
def LD_imm64 : LD_IMM64<0, "=">;		def LD_imm64 : LD_IMM64<0, "=">;
def MOV_rr : ALU_RR<BPF_ALU64, BPF_MOV,		def MOV_rr : ALU_RR<BPF_ALU64, BPF_MOV, 0,
(outs GPR:$dst),		(outs GPR:$dst),
(ins GPR:$src),		(ins GPR:$src),
"$dst = $src",		"$dst = $src",
[]>;		[]>;
def MOV_ri : ALU_RI<BPF_ALU64, BPF_MOV,		def MOV_ri : ALU_RI<BPF_ALU64, BPF_MOV, 0,
(outs GPR:$dst),		(outs GPR:$dst),
(ins i64imm:$imm),		(ins i64imm:$imm),
"$dst = $imm",		"$dst = $imm",
[(set GPR:$dst, (i64 i64immSExt32:$imm))]>;		[(set GPR:$dst, (i64 i64immSExt32:$imm))]>;
def MOV_rr_32 : ALU_RR<BPF_ALU, BPF_MOV,		def MOV_rr_32 : ALU_RR<BPF_ALU, BPF_MOV, 0,
(outs GPR32:$dst),		(outs GPR32:$dst),
(ins GPR32:$src),		(ins GPR32:$src),
"$dst = $src",		"$dst = $src",
[]>;		[]>;
def MOV_ri_32 : ALU_RI<BPF_ALU, BPF_MOV,		def MOV_ri_32 : ALU_RI<BPF_ALU, BPF_MOV, 0,
(outs GPR32:$dst),		(outs GPR32:$dst),
(ins i32imm:$imm),		(ins i32imm:$imm),
"$dst = $imm",		"$dst = $imm",
[(set GPR32:$dst, (i32 i32immSExt32:$imm))]>;		[(set GPR32:$dst, (i32 i32immSExt32:$imm))]>;

		let Predicates = [BPFHasMovsx] in {
		def MOVSX_rr_8 : ALU_RR<BPF_ALU64, BPF_MOV, 8,
		(outs GPR:$dst), (ins GPR:$src),
		"$dst = (s8)$src",
		[(set GPR:$dst, (sra (shl GPR:$src, (i64 56)), (i64 56)))]>;
		eddyz87: I think it is possible to avoid matching expansion pattern `(sra (shl GPR:$src, (i64 56))` here, and instead turn off the expansion when `movsx` is available. I tried the change below and all BPF codegen tests are passing. Do I miss something?

		diff --git a/llvm/lib/Target/BPF/BPFISelLowering.cpp b/llvm/lib/Target/BPF/BPFISelLowering.cpp
		index 9a7357d6ad04..5e84af009591 100644
		--- a/llvm/lib/Target/BPF/BPFISelLowering.cpp
		+++ b/llvm/lib/Target/BPF/BPFISelLowering.cpp
		@@ -132,9 +132,11 @@ BPFTargetLowering::BPFTargetLowering(const TargetMachine &TM,
		   setOperationAction(ISD::CTLZ_ZERO_UNDEF, MVT::i64, Custom);
		   setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i1, Expand);
		-  setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i8, Expand);
		-  setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i16, Expand);
		-  setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i32, Expand);
		+  if (!STI.hasMovsx()) {
		+    setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i8, Expand);
		+    setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i16, Expand);
		+    setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i32, Expand);
		+  }
		   // Extended load operations for i1 types must be promoted
		   for (MVT VT : MVT::integer_valuetypes()) {
		diff --git a/llvm/lib/Target/BPF/BPFInstrInfo.td b/llvm/lib/Target/BPF/BPFInstrInfo.td
		index a1d532e60db2..29bec72aa92d 100644
		--- a/llvm/lib/Target/BPF/BPFInstrInfo.td
		+++ b/llvm/lib/Target/BPF/BPFInstrInfo.td
		@@ -376,11 +376,11 @@ let Predicates = [BPFHasMovsx] in {
		   def MOVSX_rr_8 : ALU_RR<BPF_ALU64, BPF_MOV, 8,
		     (outs GPR:$dst), (ins GPR:$src),
		     "$dst = (s8)$src",
		-    [(set GPR:$dst, (sra (shl GPR:$src, (i64 56)), (i64 56)))]>;
		+    [(set GPR:$dst, (sext_inreg GPR:$src, i8))]>;
		   def MOVSX_rr_16 : ALU_RR<BPF_ALU64, BPF_MOV, 16,
		     (outs GPR:$dst), (ins GPR:$src),
		     "$dst = (s16)$src",
		-    [(set GPR:$dst, (sra (shl GPR:$src, (i64 48)), (i64 48)))]>;
		+    [(set GPR:$dst, (sext_inreg GPR:$src, i16))]>;
		   def MOVSX_rr_32 : ALU_RR<BPF_ALU64, BPF_MOV, 32,
		     (outs GPR:$dst), (ins GPR32:$src),
		     "$dst = (s32)$src",
		@@ -388,11 +388,11 @@ let Predicates = [BPFHasMovsx] in {
		   def MOVSX_rr_32_8 : ALU_RR<BPF_ALU, BPF_MOV, 8,
		     (outs GPR32:$dst), (ins GPR32:$src),
		     "$dst = (s8)$src",
		-    [(set GPR32:$dst, (sra (shl GPR32:$src, (i32 24)), (i32 24)))]>;
		+    [(set GPR32:$dst, (sext_inreg GPR32:$src, i8))]>;
		   def MOVSX_rr_32_16 : ALU_RR<BPF_ALU, BPF_MOV, 16,
		     (outs GPR32:$dst), (ins GPR32:$src),
		     "$dst = (s16)$src",
		-    [(set GPR32:$dst, (sra (shl GPR32:$src, (i32 16)), (i32 16)))]>;
		+    [(set GPR32:$dst, (sext_inreg GPR32:$src, i16))]>;
		   }
		 }
		yonghong-song: This indeed can simplify the code. I will incorporate your change into the patch. Thanks!
		def MOVSX_rr_16 : ALU_RR<BPF_ALU64, BPF_MOV, 16,
		(outs GPR:$dst), (ins GPR:$src),
		"$dst = (s16)$src",
		[(set GPR:$dst, (sra (shl GPR:$src, (i64 48)), (i64 48)))]>;
		def MOVSX_rr_32 : ALU_RR<BPF_ALU64, BPF_MOV, 32,
		(outs GPR:$dst), (ins GPR32:$src),
		"$dst = (s32)$src",
		[(set GPR:$dst, (sext GPR32:$src))]>;
		def MOVSX_rr_32_8 : ALU_RR<BPF_ALU, BPF_MOV, 8,
		(outs GPR32:$dst), (ins GPR32:$src),
		"$dst = (s8)$src",
		[(set GPR32:$dst, (sra (shl GPR32:$src, (i32 24)), (i32 24)))]>;
		def MOVSX_rr_32_16 : ALU_RR<BPF_ALU, BPF_MOV, 16,
		(outs GPR32:$dst), (ins GPR32:$src),
		"$dst = (s16)$src",
		[(set GPR32:$dst, (sra (shl GPR32:$src, (i32 16)), (i32 16)))]>;
		}
}		}

def FI_ri		def FI_ri
: TYPE_LD_ST<BPF_IMM.Value, BPF_DW.Value,		: TYPE_LD_ST<BPF_IMM.Value, BPF_DW.Value,
(outs GPR:$dst),		(outs GPR:$dst),
(ins MEMri:$addr),		(ins MEMri:$addr),
"lea\t$dst, $addr",		"lea\t$dst, $addr",
[(set i64:$dst, FIri:$addr)]> {		[(set i64:$dst, FIri:$addr)]> {
[47 lines elided]
let Predicates = [BPFNoALU32] in {		let Predicates = [BPFNoALU32] in {
def STW : STOREi64<BPF_W, "u32", truncstorei32>;		def STW : STOREi64<BPF_W, "u32", truncstorei32>;
def STH : STOREi64<BPF_H, "u16", truncstorei16>;		def STH : STOREi64<BPF_H, "u16", truncstorei16>;
def STB : STOREi64<BPF_B, "u8", truncstorei8>;		def STB : STOREi64<BPF_B, "u8", truncstorei8>;
}		}
def STD : STOREi64<BPF_DW, "u64", store>;		def STD : STOREi64<BPF_DW, "u64", store>;

// LOAD instructions		// LOAD instructions
class LOAD<BPFWidthModifer SizeOp, string OpcodeStr, list<dag> Pattern>		class LOAD<BPFWidthModifer SizeOp, BPFModeModifer ModOp, string OpcodeStr, list<dag> Pattern>
: TYPE_LD_ST<BPF_MEM.Value, SizeOp.Value,		: TYPE_LD_ST<ModOp.Value, SizeOp.Value,
(outs GPR:$dst),		(outs GPR:$dst),
(ins MEMri:$addr),		(ins MEMri:$addr),
"$dst = ("#OpcodeStr#" )($addr)",		"$dst = ("#OpcodeStr#" )($addr)",
Pattern> {		Pattern> {
bits<4> dst;		bits<4> dst;
bits<20> addr;		bits<20> addr;

let Inst{51-48} = dst;		let Inst{51-48} = dst;
let Inst{55-52} = addr{19-16};		let Inst{55-52} = addr{19-16};
let Inst{47-32} = addr{15-0};		let Inst{47-32} = addr{15-0};
let BPFClass = BPF_LDX;		let BPFClass = BPF_LDX;
}		}

class LOADi64<BPFWidthModifer SizeOp, string OpcodeStr, PatFrag OpNode>		class LOADi64<BPFWidthModifer SizeOp, BPFModeModifer ModOp, string OpcodeStr, PatFrag OpNode>
: LOAD<SizeOp, OpcodeStr, [(set i64:$dst, (OpNode ADDRri:$addr))]>;		: LOAD<SizeOp, ModOp, OpcodeStr, [(set i64:$dst, (OpNode ADDRri:$addr))]>;

let isCodeGenOnly = 1 in {		let isCodeGenOnly = 1 in {
def CORE_MEM : TYPE_LD_ST<BPF_MEM.Value, BPF_W.Value,		def CORE_MEM : TYPE_LD_ST<BPF_MEM.Value, BPF_W.Value,
(outs GPR:$dst),		(outs GPR:$dst),
(ins u64imm:$opcode, GPR:$src, u64imm:$offset),		(ins u64imm:$opcode, GPR:$src, u64imm:$offset),
"$dst = core_mem($opcode, $src, $offset)",		"$dst = core_mem($opcode, $src, $offset)",
[]>;		[]>;
def CORE_ALU32_MEM : TYPE_LD_ST<BPF_MEM.Value, BPF_W.Value,		def CORE_ALU32_MEM : TYPE_LD_ST<BPF_MEM.Value, BPF_W.Value,
(outs GPR32:$dst),		(outs GPR32:$dst),
(ins u64imm:$opcode, GPR:$src, u64imm:$offset),		(ins u64imm:$opcode, GPR:$src, u64imm:$offset),
"$dst = core_alu32_mem($opcode, $src, $offset)",		"$dst = core_alu32_mem($opcode, $src, $offset)",
[]>;		[]>;
let Constraints = "$dst = $src" in {		let Constraints = "$dst = $src" in {
def CORE_SHIFT : ALU_RR<BPF_ALU64, BPF_LSH,		def CORE_SHIFT : ALU_RR<BPF_ALU64, BPF_LSH, 0,
(outs GPR:$dst),		(outs GPR:$dst),
(ins u64imm:$opcode, GPR:$src, u64imm:$offset),		(ins u64imm:$opcode, GPR:$src, u64imm:$offset),
"$dst = core_shift($opcode, $src, $offset)",		"$dst = core_shift($opcode, $src, $offset)",
[]>;		[]>;
}		}
}		}

let Predicates = [BPFNoALU32] in {		let Predicates = [BPFNoALU32] in {
def LDW : LOADi64<BPF_W, "u32", zextloadi32>;		def LDW : LOADi64<BPF_W, BPF_MEM, "u32", zextloadi32>;
def LDH : LOADi64<BPF_H, "u16", zextloadi16>;		def LDH : LOADi64<BPF_H, BPF_MEM, "u16", zextloadi16>;
def LDB : LOADi64<BPF_B, "u8", zextloadi8>;		def LDB : LOADi64<BPF_B, BPF_MEM, "u8", zextloadi8>;
		}

		let Predicates = [BPFHasLdsx] in {
		def LDWSX : LOADi64<BPF_W, BPF_MEMSX, "s32", sextloadi32>;
		def LDHSX : LOADi64<BPF_H, BPF_MEMSX, "s16", sextloadi16>;
		def LDBSX : LOADi64<BPF_B, BPF_MEMSX, "s8", sextloadi8>;
}		}

def LDD : LOADi64<BPF_DW, "u64", load>;		def LDD : LOADi64<BPF_DW, BPF_MEM, "u64", load>;

class BRANCH<BPFJumpOp Opc, string OpcodeStr, list<dag> Pattern>		class BRANCH<BPFJumpOp Opc, string OpcodeStr, list<dag> Pattern>
: TYPE_ALU_JMP<Opc.Value, BPF_K.Value,		: TYPE_ALU_JMP<Opc.Value, BPF_K.Value,
(outs),		(outs),
(ins brtarget:$BrDst),		(ins brtarget:$BrDst),
!strconcat(OpcodeStr, " $BrDst"),		!strconcat(OpcodeStr, " $BrDst"),
Pattern> {		Pattern> {
bits<16> BrDst;		bits<16> BrDst;

let Inst{47-32} = BrDst;		let Inst{47-32} = BrDst;
let BPFClass = BPF_JMP;		let BPFClass = BPF_JMP;
}		}

		class BRANCH_LONG<BPFJumpOp Opc, string OpcodeStr, list<dag> Pattern>
		: TYPE_ALU_JMP<Opc.Value, BPF_K.Value,
		(outs),
		(ins brtarget:$BrDst),
		!strconcat(OpcodeStr, " $BrDst"),
		Pattern> {
		bits<32> BrDst;

		let Inst{31-0} = BrDst;
		let BPFClass = BPF_JMP32;
		}

class CALL<string OpcodeStr>		class CALL<string OpcodeStr>
: TYPE_ALU_JMP<BPF_CALL.Value, BPF_K.Value,		: TYPE_ALU_JMP<BPF_CALL.Value, BPF_K.Value,
(outs),		(outs),
(ins calltarget:$BrDst),		(ins calltarget:$BrDst),
!strconcat(OpcodeStr, " $BrDst"),		!strconcat(OpcodeStr, " $BrDst"),
[]> {		[]> {
bits<32> BrDst;		bits<32> BrDst;

[11 lines elided]	class CALLX<string OpcodeStr>

let Inst{31-0} = BrDst;		let Inst{31-0} = BrDst;
let BPFClass = BPF_JMP;		let BPFClass = BPF_JMP;
}		}

// Jump always		// Jump always
let isBranch = 1, isTerminator = 1, hasDelaySlot=0, isBarrier = 1 in {		let isBranch = 1, isTerminator = 1, hasDelaySlot=0, isBarrier = 1 in {
def JMP : BRANCH<BPF_JA, "goto", [(br bb:$BrDst)]>;		def JMP : BRANCH<BPF_JA, "goto", [(br bb:$BrDst)]>;
		def JMPL : BRANCH_LONG<BPF_JA, "gotol", []>;
}		}
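
`BRANCH` stores its target in the 16-bit `off` field (`Inst{47-32}`) with class `BPF_JMP`, whereas the new `BRANCH_LONG` stores a 32-bit target in the `imm` field (`Inst{31-0}`) with class `BPF_JMP32`. The sketch below models both layouts for a little-endian target. The numeric class and opcode values are assumptions from the standard BPF encoding, and `encodeGoto`/`encodeGotol` are hypothetical helpers, not LLVM APIs.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Assumed values from the standard BPF instruction encoding.
constexpr uint8_t kClassJMP = 0x05;   // BPF_JMP
constexpr uint8_t kClassJMP32 = 0x06; // BPF_JMP32
constexpr uint8_t kOpJA = 0x00;       // BPF_JA

// "goto": 16-bit offset in bytes 2..3, imm left zero.
std::array<uint8_t, 8> encodeGoto(int16_t Off) {
  std::array<uint8_t, 8> Bytes{};
  const uint16_t U = static_cast<uint16_t>(Off);
  Bytes[0] = static_cast<uint8_t>(kOpJA | kClassJMP);
  Bytes[2] = static_cast<uint8_t>(U & 0xff);
  Bytes[3] = static_cast<uint8_t>(U >> 8);
  return Bytes;
}

// "gotol": 32-bit offset in bytes 4..7, off left zero.
std::array<uint8_t, 8> encodeGotol(int32_t Off) {
  std::array<uint8_t, 8> Bytes{};
  const uint32_t U = static_cast<uint32_t>(Off);
  Bytes[0] = static_cast<uint8_t>(kOpJA | kClassJMP32);
  for (int I = 0; I < 4; ++I)
    Bytes[4 + I] = static_cast<uint8_t>((U >> (8 * I)) & 0xff);
  return Bytes;
}
```

Calling `encodeGotol(0xcd9b)` yields the bytes `06 00 00 00 9b cd 00 00`, which matches the `gotol +0xcd9b` disassembly quoted in the summary; that offset is too large for the signed 16-bit field used by `goto`.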

// Jump and link		// Jump and link
let isCall=1, hasDelaySlot=0, Uses = [R11],		let isCall=1, hasDelaySlot=0, Uses = [R11],
// Potentially clobbered registers		// Potentially clobbered registers
Defs = [R0, R1, R2, R3, R4, R5] in {		Defs = [R0, R1, R2, R3, R4, R5] in {
def JAL : CALL<"call">;		def JAL : CALL<"call">;
def JALX : CALLX<"callx">;		def JALX : CALLX<"callx">;
▲ Show 20 Lines • Show All 313 Lines • ▼ Show 20 Lines	let Predicates = [BPFHasALU32], Defs = [W0], Uses = [W0],
def CMPXCHGW32 : CMPXCHG32<BPF_W, "32", atomic_cmp_swap_32>;		def CMPXCHGW32 : CMPXCHG32<BPF_W, "32", atomic_cmp_swap_32>;
}		}

let Defs = [R0], Uses = [R0] in {		let Defs = [R0], Uses = [R0] in {
def CMPXCHGD : CMPXCHG<BPF_DW, "64", atomic_cmp_swap_64>;		def CMPXCHGD : CMPXCHG<BPF_DW, "64", atomic_cmp_swap_64>;
}		}

// bswap16, bswap32, bswap64		// bswap16, bswap32, bswap64
class BSWAP<bits<32> SizeOp, string OpcodeStr, BPFSrcType SrcType, list<dag> Pattern>		class BSWAP<BPFOpClass Class, bits<32> SizeOp, string OpcodeStr, BPFSrcType SrcType, list<dag> Pattern>
: TYPE_ALU_JMP<BPF_END.Value, SrcType.Value,		: TYPE_ALU_JMP<BPF_END.Value, SrcType.Value,
(outs GPR:$dst),		(outs GPR:$dst),
(ins GPR:$src),		(ins GPR:$src),
"$dst = "#OpcodeStr#" $src",		"$dst = "#OpcodeStr#" $src",
Pattern> {		Pattern> {
bits<4> dst;		bits<4> dst;

let Inst{51-48} = dst;		let Inst{51-48} = dst;
let Inst{31-0} = SizeOp;		let Inst{31-0} = SizeOp;
let BPFClass = BPF_ALU;		let BPFClass = Class;
}		}


let Constraints = "$dst = $src" in {		let Constraints = "$dst = $src" in {
		let Predicates = [BPFHasBswap] in {
		def BSWAP16 : BSWAP<BPF_ALU64, 16, "bswap16", BPF_TO_LE, [(set GPR:$dst, (srl (bswap GPR:$src), (i64 48)))]>;
		def BSWAP32 : BSWAP<BPF_ALU64, 32, "bswap32", BPF_TO_LE, [(set GPR:$dst, (srl (bswap GPR:$src), (i64 32)))]>;
		def BSWAP64 : BSWAP<BPF_ALU64, 64, "bswap64", BPF_TO_LE, [(set GPR:$dst, (bswap GPR:$src))]>;
		}

		let Predicates = [BPFNoBswap] in {
let Predicates = [BPFIsLittleEndian] in {		let Predicates = [BPFIsLittleEndian] in {
def BE16 : BSWAP<16, "be16", BPF_TO_BE, [(set GPR:$dst, (srl (bswap GPR:$src), (i64 48)))]>;		def BE16 : BSWAP<BPF_ALU, 16, "be16", BPF_TO_BE, [(set GPR:$dst, (srl (bswap GPR:$src), (i64 48)))]>;
def BE32 : BSWAP<32, "be32", BPF_TO_BE, [(set GPR:$dst, (srl (bswap GPR:$src), (i64 32)))]>;		def BE32 : BSWAP<BPF_ALU, 32, "be32", BPF_TO_BE, [(set GPR:$dst, (srl (bswap GPR:$src), (i64 32)))]>;
def BE64 : BSWAP<64, "be64", BPF_TO_BE, [(set GPR:$dst, (bswap GPR:$src))]>;		def BE64 : BSWAP<BPF_ALU, 64, "be64", BPF_TO_BE, [(set GPR:$dst, (bswap GPR:$src))]>;
}		}
let Predicates = [BPFIsBigEndian] in {		let Predicates = [BPFIsBigEndian] in {
def LE16 : BSWAP<16, "le16", BPF_TO_LE, [(set GPR:$dst, (srl (bswap GPR:$src), (i64 48)))]>;		def LE16 : BSWAP<BPF_ALU, 16, "le16", BPF_TO_LE, [(set GPR:$dst, (srl (bswap GPR:$src), (i64 48)))]>;
def LE32 : BSWAP<32, "le32", BPF_TO_LE, [(set GPR:$dst, (srl (bswap GPR:$src), (i64 32)))]>;		def LE32 : BSWAP<BPF_ALU, 32, "le32", BPF_TO_LE, [(set GPR:$dst, (srl (bswap GPR:$src), (i64 32)))]>;
def LE64 : BSWAP<64, "le64", BPF_TO_LE, [(set GPR:$dst, (bswap GPR:$src))]>;		def LE64 : BSWAP<BPF_ALU, 64, "le64", BPF_TO_LE, [(set GPR:$dst, (bswap GPR:$src))]>;
		}
}		}
}		}
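
A minimal sketch of what the `BSWAP16`/`BSWAP32`/`BSWAP64` selection patterns above compute, assuming the `bswap` DAG node is a full 64-bit byte swap and `srl` is a logical right shift. The function names `bswap64`, `bswap16Pattern` and `bswap32Pattern` are hypothetical and exist only for this illustration; they are not part of the patch.

```cpp
#include <cstdint>

// Portable 64-bit byte swap; stands in for the `bswap` DAG node.
uint64_t bswap64(uint64_t V) {
  uint64_t R = 0;
  for (int I = 0; I < 8; ++I) {
    R = (R << 8) | (V & 0xff); // move the lowest byte of V to the bottom of R
    V >>= 8;
  }
  return R;
}

// (srl (bswap $src), 48): the swapped low 16 bits, zero-extended.
uint64_t bswap16Pattern(uint64_t Src) { return bswap64(Src) >> 48; }

// (srl (bswap $src), 32): the swapped low 32 bits, zero-extended.
uint64_t bswap32Pattern(uint64_t Src) { return bswap64(Src) >> 32; }
```

The shift discards whatever was in the upper bits of the source register, which is why the 16-bit and 32-bit variants can be matched from the same 64-bit `bswap` node.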

let Defs = [R0, R1, R2, R3, R4, R5], Uses = [R6], hasSideEffects = 1,		let Defs = [R0, R1, R2, R3, R4, R5], Uses = [R6], hasSideEffects = 1,
hasExtraDefRegAllocReq = 1, hasExtraSrcRegAllocReq = 1, mayLoad = 1 in {		hasExtraDefRegAllocReq = 1, hasExtraSrcRegAllocReq = 1, mayLoad = 1 in {
class LOAD_ABS<BPFWidthModifer SizeOp, string OpcodeStr, Intrinsic OpNode>		class LOAD_ABS<BPFWidthModifer SizeOp, string OpcodeStr, Intrinsic OpNode>
: TYPE_LD_ST<BPF_ABS.Value, SizeOp.Value,		: TYPE_LD_ST<BPF_ABS.Value, SizeOp.Value,
(outs),		(outs),
(ins GPR:$skb, i64imm:$imm),		(ins GPR:$skb, i64imm:$imm),
Show All 22 Lines
def LD_ABS_H : LOAD_ABS<BPF_H, "u16", int_bpf_load_half>;		def LD_ABS_H : LOAD_ABS<BPF_H, "u16", int_bpf_load_half>;
def LD_ABS_W : LOAD_ABS<BPF_W, "u32", int_bpf_load_word>;		def LD_ABS_W : LOAD_ABS<BPF_W, "u32", int_bpf_load_word>;

def LD_IND_B : LOAD_IND<BPF_B, "u8", int_bpf_load_byte>;		def LD_IND_B : LOAD_IND<BPF_B, "u8", int_bpf_load_byte>;
def LD_IND_H : LOAD_IND<BPF_H, "u16", int_bpf_load_half>;		def LD_IND_H : LOAD_IND<BPF_H, "u16", int_bpf_load_half>;
def LD_IND_W : LOAD_IND<BPF_W, "u32", int_bpf_load_word>;		def LD_IND_W : LOAD_IND<BPF_W, "u32", int_bpf_load_word>;

let isCodeGenOnly = 1 in {		let isCodeGenOnly = 1 in {
def MOV_32_64 : ALU_RR<BPF_ALU, BPF_MOV,		def MOV_32_64 : ALU_RR<BPF_ALU, BPF_MOV, 0,
(outs GPR:$dst), (ins GPR32:$src),		(outs GPR:$dst), (ins GPR32:$src),
"$dst = $src", []>;		"$dst = $src", []>;
}		}

		let Predicates = [BPFNoMovsx] in {
def : Pat<(i64 (sext GPR32:$src)),		def : Pat<(i64 (sext GPR32:$src)),
(SRA_ri (SLL_ri (MOV_32_64 GPR32:$src), 32), 32)>;		(SRA_ri (SLL_ri (MOV_32_64 GPR32:$src), 32), 32)>;
		}

def : Pat<(i64 (zext GPR32:$src)), (MOV_32_64 GPR32:$src)>;		def : Pat<(i64 (zext GPR32:$src)), (MOV_32_64 GPR32:$src)>;

// For i64 -> i32 truncation, use the 32-bit subregister directly.		// For i64 -> i32 truncation, use the 32-bit subregister directly.
def : Pat<(i32 (trunc GPR:$src)),		def : Pat<(i32 (trunc GPR:$src)),
(i32 (EXTRACT_SUBREG GPR:$src, sub_32))>;		(i32 (EXTRACT_SUBREG GPR:$src, sub_32))>;

// For i32 -> i64 anyext, we don't care about the high bits.		// For i32 -> i64 anyext, we don't care about the high bits.
Show All 19 Lines	class STOREi32<BPFWidthModifer Opc, string OpcodeStr, PatFrag OpNode>
: STORE32<Opc, OpcodeStr, [(OpNode i32:$src, ADDRri:$addr)]>;		: STORE32<Opc, OpcodeStr, [(OpNode i32:$src, ADDRri:$addr)]>;

let Predicates = [BPFHasALU32], DecoderNamespace = "BPFALU32" in {		let Predicates = [BPFHasALU32], DecoderNamespace = "BPFALU32" in {
def STW32 : STOREi32<BPF_W, "u32", store>;		def STW32 : STOREi32<BPF_W, "u32", store>;
def STH32 : STOREi32<BPF_H, "u16", truncstorei16>;		def STH32 : STOREi32<BPF_H, "u16", truncstorei16>;
def STB32 : STOREi32<BPF_B, "u8", truncstorei8>;		def STB32 : STOREi32<BPF_B, "u8", truncstorei8>;
}		}

class LOAD32<BPFWidthModifer SizeOp, string OpcodeStr, list<dag> Pattern>		class LOAD32<BPFWidthModifer SizeOp, BPFModeModifer ModOp, string OpcodeStr, list<dag> Pattern>
: TYPE_LD_ST<BPF_MEM.Value, SizeOp.Value,		: TYPE_LD_ST<ModOp.Value, SizeOp.Value,
(outs GPR32:$dst),		(outs GPR32:$dst),
(ins MEMri:$addr),		(ins MEMri:$addr),
"$dst = ("#OpcodeStr#" )($addr)",		"$dst = ("#OpcodeStr#" )($addr)",
Pattern> {		Pattern> {
bits<4> dst;		bits<4> dst;
bits<20> addr;		bits<20> addr;

let Inst{51-48} = dst;		let Inst{51-48} = dst;
let Inst{55-52} = addr{19-16};		let Inst{55-52} = addr{19-16};
let Inst{47-32} = addr{15-0};		let Inst{47-32} = addr{15-0};
let BPFClass = BPF_LDX;		let BPFClass = BPF_LDX;
}		}

class LOADi32<BPFWidthModifer SizeOp, string OpcodeStr, PatFrag OpNode>		class LOADi32<BPFWidthModifer SizeOp, BPFModeModifer ModOp, string OpcodeStr, PatFrag OpNode>
: LOAD32<SizeOp, OpcodeStr, [(set i32:$dst, (OpNode ADDRri:$addr))]>;		: LOAD32<SizeOp, ModOp, OpcodeStr, [(set i32:$dst, (OpNode ADDRri:$addr))]>;

let Predicates = [BPFHasALU32], DecoderNamespace = "BPFALU32" in {		let Predicates = [BPFHasALU32], DecoderNamespace = "BPFALU32" in {
def LDW32 : LOADi32<BPF_W, "u32", load>;		def LDW32 : LOADi32<BPF_W, BPF_MEM, "u32", load>;
def LDH32 : LOADi32<BPF_H, "u16", zextloadi16>;		def LDH32 : LOADi32<BPF_H, BPF_MEM, "u16", zextloadi16>;
def LDB32 : LOADi32<BPF_B, "u8", zextloadi8>;		def LDB32 : LOADi32<BPF_B, BPF_MEM, "u8", zextloadi8>;
		}

		let Predicates = [BPFHasLdsx], DecoderNamespace = "BPFALU32" in {
		def LDH32SX : LOADi32<BPF_H, BPF_MEMSX, "s16", sextloadi16>;
		def LDB32SX : LOADi32<BPF_B, BPF_MEMSX, "s8", sextloadi8>;
}		}

let Predicates = [BPFHasALU32] in {		let Predicates = [BPFHasALU32] in {
def : Pat<(truncstorei8 GPR:$src, ADDRri:$dst),		def : Pat<(truncstorei8 GPR:$src, ADDRri:$dst),
(STB32 (EXTRACT_SUBREG GPR:$src, sub_32), ADDRri:$dst)>;		(STB32 (EXTRACT_SUBREG GPR:$src, sub_32), ADDRri:$dst)>;
def : Pat<(truncstorei16 GPR:$src, ADDRri:$dst),		def : Pat<(truncstorei16 GPR:$src, ADDRri:$dst),
(STH32 (EXTRACT_SUBREG GPR:$src, sub_32), ADDRri:$dst)>;		(STH32 (EXTRACT_SUBREG GPR:$src, sub_32), ADDRri:$dst)>;
def : Pat<(truncstorei32 GPR:$src, ADDRri:$dst),		def : Pat<(truncstorei32 GPR:$src, ADDRri:$dst),
(STW32 (EXTRACT_SUBREG GPR:$src, sub_32), ADDRri:$dst)>;		(STW32 (EXTRACT_SUBREG GPR:$src, sub_32), ADDRri:$dst)>;
def : Pat<(i32 (extloadi8 ADDRri:$src)), (i32 (LDB32 ADDRri:$src))>;		def : Pat<(i32 (extloadi8 ADDRri:$src)), (i32 (LDB32 ADDRri:$src))>;
def : Pat<(i32 (extloadi16 ADDRri:$src)), (i32 (LDH32 ADDRri:$src))>;		def : Pat<(i32 (extloadi16 ADDRri:$src)), (i32 (LDH32 ADDRri:$src))>;

def : Pat<(i64 (zextloadi8 ADDRri:$src)),		def : Pat<(i64 (zextloadi8 ADDRri:$src)),
(SUBREG_TO_REG (i64 0), (LDB32 ADDRri:$src), sub_32)>;		(SUBREG_TO_REG (i64 0), (LDB32 ADDRri:$src), sub_32)>;
def : Pat<(i64 (zextloadi16 ADDRri:$src)),		def : Pat<(i64 (zextloadi16 ADDRri:$src)),
(SUBREG_TO_REG (i64 0), (LDH32 ADDRri:$src), sub_32)>;		(SUBREG_TO_REG (i64 0), (LDH32 ADDRri:$src), sub_32)>;
def : Pat<(i64 (zextloadi32 ADDRri:$src)),		def : Pat<(i64 (zextloadi32 ADDRri:$src)),
(SUBREG_TO_REG (i64 0), (LDW32 ADDRri:$src), sub_32)>;		(SUBREG_TO_REG (i64 0), (LDW32 ADDRri:$src), sub_32)>;
def : Pat<(i64 (extloadi8 ADDRri:$src)),		def : Pat<(i64 (extloadi8 ADDRri:$src)),
(SUBREG_TO_REG (i64 0), (LDB32 ADDRri:$src), sub_32)>;		(SUBREG_TO_REG (i64 0), (LDB32 ADDRri:$src), sub_32)>;
Show All 13 Lines
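
As a rough illustration of the `BRANCH_LONG` layout above (class `BPF_JMP32`, 32-bit target in `Inst{31-0}`), the sketch below builds the 8-byte little-endian wire encoding of `gotol`. The helper name `encodeGotol` is hypothetical, and the byte layout (opcode, registers, 16-bit offset, 32-bit immediate) is the standard BPF instruction format rather than something spelled out in this diff. It reproduces the `06 00 00 00 9b cd 00 00` bytes quoted in the summary.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper (not part of the patch): wire encoding of `gotol +Imm`
// for a little-endian BPF target.
std::array<uint8_t, 8> encodeGotol(int32_t Imm) {
  constexpr uint8_t BPF_JMP32 = 0x06; // instruction class, low 3 bits of the opcode
  constexpr uint8_t BPF_K = 0x00;     // source bit
  constexpr uint8_t BPF_JA = 0x00;    // operation, high 4 bits of the opcode
  const uint32_t U = static_cast<uint32_t>(Imm);
  return {static_cast<uint8_t>(BPF_JA | BPF_K | BPF_JMP32),
          0x00,       // dst/src register nibbles: unused
          0x00, 0x00, // 16-bit offset field: unused by gotol
          static_cast<uint8_t>(U & 0xff),
          static_cast<uint8_t>((U >> 8) & 0xff),
          static_cast<uint8_t>((U >> 16) & 0xff),
          static_cast<uint8_t>((U >> 24) & 0xff)};
}
```

The 16-bit `goto` would instead place its target in the offset field (`Inst{47-32}` in `BRANCH`), which is what limits it to the -0x8000 to 0x7fff range.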

llvm/lib/Target/BPF/BPFMIPeephole.cpp

Show All 23 Lines
#include "BPFInstrInfo.h"		#include "BPFInstrInfo.h"
#include "BPFTargetMachine.h"		#include "BPFTargetMachine.h"
#include "llvm/ADT/Statistic.h"		#include "llvm/ADT/Statistic.h"
#include "llvm/CodeGen/MachineFunctionPass.h"		#include "llvm/CodeGen/MachineFunctionPass.h"
#include "llvm/CodeGen/MachineInstrBuilder.h"		#include "llvm/CodeGen/MachineInstrBuilder.h"
#include "llvm/CodeGen/MachineRegisterInfo.h"		#include "llvm/CodeGen/MachineRegisterInfo.h"
#include "llvm/Support/Debug.h"		#include "llvm/Support/Debug.h"
#include <set>		#include <set>
		#include <map>

using namespace llvm;		using namespace llvm;

#define DEBUG_TYPE "bpf-mi-zext-elim"		#define DEBUG_TYPE "bpf-mi-zext-elim"

STATISTIC(ZExtElemNum, "Number of zero extension shifts eliminated");		STATISTIC(ZExtElemNum, "Number of zero extension shifts eliminated");

namespace {		namespace {
▲ Show 20 Lines • Show All 257 Lines • ▼ Show 20 Lines

namespace {		namespace {

struct BPFMIPreEmitPeephole : public MachineFunctionPass {		struct BPFMIPreEmitPeephole : public MachineFunctionPass {

static char ID;		static char ID;
MachineFunction *MF;		MachineFunction *MF;
const TargetRegisterInfo *TRI;		const TargetRegisterInfo *TRI;
		const BPFInstrInfo *TII;
		bool SupportGotol;

BPFMIPreEmitPeephole() : MachineFunctionPass(ID) {		BPFMIPreEmitPeephole() : MachineFunctionPass(ID) {
initializeBPFMIPreEmitPeepholePass(*PassRegistry::getPassRegistry());		initializeBPFMIPreEmitPeepholePass(*PassRegistry::getPassRegistry());
}		}

private:		private:
// Initialize class variables.		// Initialize class variables.
void initialize(MachineFunction &MFParm);		void initialize(MachineFunction &MFParm);

		bool in16BitRange(int Num);
bool eliminateRedundantMov();		bool eliminateRedundantMov();
		bool adjustBranch();

		std::map<unsigned, unsigned> ReverseCondOpMap;
		eddyz87: Is this map unused?
		yonghong-song (author): No. This is a leftover. Will remove.

public:		public:

// Main entry point for this pass.		// Main entry point for this pass.
bool runOnMachineFunction(MachineFunction &MF) override {		bool runOnMachineFunction(MachineFunction &MF) override {
if (skipFunction(MF.getFunction()))		if (skipFunction(MF.getFunction()))
return false;		return false;

initialize(MF);		initialize(MF);

return eliminateRedundantMov();		bool Changed;
		Changed = eliminateRedundantMov();
		if (SupportGotol)
		Changed = adjustBranch() || Changed;
		eddyz87: Nitpick: this would not be executed for `-O0`, but is required for correct execution: `void BPFPassConfig::addPreEmitPass() { addPass(createBPFMIPreEmitCheckingPass()); if (getOptLevel() != CodeGenOpt::None) if (!DisableMIPeephole) addPass(createBPFMIPreEmitPeepholePass()); }`
		return Changed;
}		}
};		};

// Initialize class variables.		// Initialize class variables.
void BPFMIPreEmitPeephole::initialize(MachineFunction &MFParm) {		void BPFMIPreEmitPeephole::initialize(MachineFunction &MFParm) {
MF = &MFParm;		MF = &MFParm;
		TII = MF->getSubtarget<BPFSubtarget>().getInstrInfo();
TRI = MF->getSubtarget<BPFSubtarget>().getRegisterInfo();		TRI = MF->getSubtarget<BPFSubtarget>().getRegisterInfo();
		SupportGotol = MF->getSubtarget<BPFSubtarget>().hasGotol();
LLVM_DEBUG(dbgs() << "* BPF PreEmit peephole pass *\n\n");		LLVM_DEBUG(dbgs() << "* BPF PreEmit peephole pass *\n\n");
}		}

bool BPFMIPreEmitPeephole::eliminateRedundantMov() {		bool BPFMIPreEmitPeephole::eliminateRedundantMov() {
MachineInstr* ToErase = nullptr;		MachineInstr* ToErase = nullptr;
bool Eliminated = false;		bool Eliminated = false;

for (MachineBasicBlock &MBB : *MF) {		for (MachineBasicBlock &MBB : *MF) {
Show All 28 Lines	for (MachineInstr &MI : MBB) {
Eliminated = true;		Eliminated = true;
}		}
}		}
}		}

return Eliminated;		return Eliminated;
}		}

		bool BPFMIPreEmitPeephole::in16BitRange(int Num) {
		// Well, the cut-off is not precisely at 16bit range since
		// new codes are added during the transformation. So let us
		// be a little bit conservative.
		return Num >= (INT16_MIN >> 1) && Num <= (INT16_MAX >> 1);
		}

		// Before cpu=v4, only 16bit branch target offset (-0x8000 to 0x7fff)
		// is supported for both unconditional (JMP) and conditional (JEQ, JSGT,
		// etc.) branches. In certain cases, e.g., full unrolling, the branch
		// target offset might exceed 16bit range. If this happens, the llvm
		// will generate incorrect code as the offset is truncated to 16bit.
		//
		// To fix this rare case, a new insn JMPL is introduced. This new
		// insn supports 32bit branch target offset. The compiler
		// does not use this insn during insn selection. Rather, BPF backend
		// will estimate the branch target offset and do JMP -> JMPL and
		// JEQ -> JEQ + JMPL conversion if the estimated branch target offset
		// is beyond 16bit.
		bool BPFMIPreEmitPeephole::adjustBranch() {
		bool Changed = false;
		int CurrNumInsns = 0;
		std::map<MachineBasicBlock *, int> SoFarNumInsns;
		eddyz87: Nitpick: Fangrui suggested in my llvm-objdump revisions to use `DenseMap` in most cases (as `std::map` allocates for each pair).
		yonghong-song (author): Will try to use DenseMap.
		std::map<MachineBasicBlock *, MachineBasicBlock *> FollowThroughBB;
		std::vector<MachineBasicBlock *> MBBs;

		MachineBasicBlock *PrevBB = nullptr;
		for (MachineBasicBlock &MBB : *MF) {
		// MBB.size() is the number of insns in this basic block, including some
		// debug info, e.g., DEBUG_VALUE, so we may over-count a little bit.
		// Typically we have way more normal insns than DEBUG_VALUE insns.
		// Also, if we indeed need to convert conditional branch like JEQ to
		// JEQ + JMPL, we actually introduced some new insns like below.
		CurrNumInsns += (int)MBB.size();
		SoFarNumInsns[&MBB] = CurrNumInsns;
		if (PrevBB != nullptr)
		FollowThroughBB[PrevBB] = &MBB;
		PrevBB = &MBB;
		// A list of original BBs to make later traversal easier.
		MBBs.push_back(&MBB);
		}
		FollowThroughBB[PrevBB] = nullptr;

		for (unsigned i = 0; i < MBBs.size(); i++) {
		// We have four cases here:
		// (1). no terminator, simple follow through.
		// (2). jmp to another bb.
		// (3). conditional jmp to another bb or follow through.
		// (4). conditional jmp followed by an unconditional jmp.
		MachineInstr *CondJmp = nullptr, *UncondJmp = nullptr;

		MachineBasicBlock *MBB = MBBs[i];
		for (MachineInstr &Term : MBB->terminators()) {
		if (Term.isConditionalBranch()) {
		assert(CondJmp == nullptr);
		CondJmp = &Term;
		} else if (Term.isUnconditionalBranch()) {
		assert(UncondJmp == nullptr);
		UncondJmp = &Term;
		}
		}

		// (1). no terminator, simple follow through.
		if (!CondJmp && !UncondJmp)
		continue;

		MachineBasicBlock *CondTargetBB, *JmpBB;
		eddyz87: As far as I understand: `SoFarNumInsns[JmpBB]` is the number of instructions from function start till the end of `JmpBB`; `CurrNumInsns` is the number of instructions from function start till the end of `MBB`. So, `SoFarNumInsns[JmpBB] - CurrNumInsns` gives the distance between basic block ends. However, the jump would happen to the basic block start, so the actual distance should be computed as `SoFarNumInsns[JmpBB] - JmpBB.size() - CurrNumInsns`. Am I confused?
		CurrNumInsns = SoFarNumInsns[MBB];

		// (2). jmp to another bb.
		if (!CondJmp && UncondJmp) {
		JmpBB = UncondJmp->getOperand(0).getMBB();
		if (in16BitRange(SoFarNumInsns[JmpBB] - JmpBB->size() - CurrNumInsns))
		continue;

		// replace this insn as a JMPL.
		BuildMI(MBB, UncondJmp->getDebugLoc(), TII->get(BPF::JMPL)).addMBB(JmpBB);
		UncondJmp->eraseFromParent();
		Changed = true;
		continue;
		}

		const BasicBlock *TermBB = MBB->getBasicBlock();
		int Dist;

		// (3). conditional jmp to another bb or follow through.
		if (!UncondJmp) {
		CondTargetBB = CondJmp->getOperand(2).getMBB();
		MachineBasicBlock *FollowBB = FollowThroughBB[MBB];
		Dist = SoFarNumInsns[CondTargetBB] - CondTargetBB->size() - CurrNumInsns;
		if (in16BitRange(Dist))
		continue;

		// We have
		eddyz87: Is it possible to rewrite as below instead?
		    B2: ...
		        if (!cond) goto B3
		        gotol B5
		    B3: ...
		Seems to be equivalent but with less instructions.
		// B2: ...
		// if (cond) goto B5
		// B3: ...
		// where B2 -> B5 is beyond 16bit range.
		//
		// We do not have 32bit cond jmp insn. So we try to do
		// the following.
		// B2: ...
		// if (cond) goto New_B1
		// New_B0: goto B3
		// New_B1: gotol B5
		// B3: ...
		// Basically two new basic blocks are created.
		MachineBasicBlock *New_B0 = MF->CreateMachineBasicBlock(TermBB);
		MachineBasicBlock *New_B1 = MF->CreateMachineBasicBlock(TermBB);

		// Insert New_B0 and New_B1 into function block list.
		MachineFunction::iterator MBB_I = ++MBB->getIterator();
		MF->insert(MBB_I, New_B0);
		MF->insert(MBB_I, New_B1);

		// replace B2 cond jump
		if (CondJmp->getOperand(1).isReg())
		BuildMI(MBB, MachineBasicBlock::iterator(CondJmp), CondJmp->getDebugLoc(), TII->get(CondJmp->getOpcode()))
		.addReg(CondJmp->getOperand(0).getReg())
		.addReg(CondJmp->getOperand(1).getReg())
		.addMBB(New_B1);
		else
		BuildMI(MBB, MachineBasicBlock::iterator(CondJmp), CondJmp->getDebugLoc(), TII->get(CondJmp->getOpcode()))
		.addReg(CondJmp->getOperand(0).getReg())
		.addImm(CondJmp->getOperand(1).getImm())
		.addMBB(New_B1);

		// it is possible that CondTargetBB and FollowBB are the same. But the
		// above Dist checking should have already filtered out this case.
		MBB->removeSuccessor(CondTargetBB);
		MBB->removeSuccessor(FollowBB);
		MBB->addSuccessor(New_B0);
		MBB->addSuccessor(New_B1);

		// Populate insns in New_B0 and New_B1.
		BuildMI(New_B0, CondJmp->getDebugLoc(), TII->get(BPF::JMP)).addMBB(FollowBB);
		BuildMI(New_B1, CondJmp->getDebugLoc(), TII->get(BPF::JMPL))
		.addMBB(CondTargetBB);

		New_B0->addSuccessor(FollowBB);
		New_B1->addSuccessor(CondTargetBB);
		CondJmp->eraseFromParent();
		Changed = true;
		continue;
		}

		// (4). conditional jmp followed by an unconditional jmp.
		CondTargetBB = CondJmp->getOperand(2).getMBB();
		JmpBB = UncondJmp->getOperand(0).getMBB();

		// We have
		// B2: ...
		// if (cond) goto B5
		// JMP B7
		// B3: ...
		//
		// If only B2->B5 is out of 16bit range, we can do
		// B2: ...
		// if (cond) goto New_B
		// JMP B7
		// New_B: gotol B5
		// B3: ...
		//
		// If only 'JMP B7' is out of 16bit range, we can replace
		eddyz87: Nitpick: `(Dist <= INT16_MAX && Dist >= INT16_MIN)` is used in the previous two cases.
		// 'JMP B7' with 'JMPL B7'.
		//
		// If both B2->B5 and 'JMP B7' are out of range, just do
		// both the above transformations.
		Dist = SoFarNumInsns[CondTargetBB] - CondTargetBB->size() - CurrNumInsns;
		if (!in16BitRange(Dist)) {
		MachineBasicBlock *New_B = MF->CreateMachineBasicBlock(TermBB);

		// Insert New_B into function block list.
		MF->insert(++MBB->getIterator(), New_B);

		// replace B2 cond jump
		if (CondJmp->getOperand(1).isReg())
		BuildMI(MBB, MachineBasicBlock::iterator(CondJmp), CondJmp->getDebugLoc(), TII->get(CondJmp->getOpcode()))
		.addReg(CondJmp->getOperand(0).getReg())
		.addReg(CondJmp->getOperand(1).getReg())
		.addMBB(New_B);
		else
		BuildMI(MBB, MachineBasicBlock::iterator(CondJmp), CondJmp->getDebugLoc(), TII->get(CondJmp->getOpcode()))
		.addReg(CondJmp->getOperand(0).getReg())
		.addImm(CondJmp->getOperand(1).getImm())
		.addMBB(New_B);

		if (CondTargetBB != JmpBB)
		MBB->removeSuccessor(CondTargetBB);
		MBB->addSuccessor(New_B);

		// Populate insn in New_B.
		BuildMI(New_B, CondJmp->getDebugLoc(), TII->get(BPF::JMPL)).addMBB(CondTargetBB);

		New_B->addSuccessor(CondTargetBB);
		CondJmp->eraseFromParent();
		Changed = true;
		}

		if (!in16BitRange(SoFarNumInsns[JmpBB] - CurrNumInsns)) {
		BuildMI(MBB, UncondJmp->getDebugLoc(), TII->get(BPF::JMPL)).addMBB(JmpBB);
		UncondJmp->eraseFromParent();
		Changed = true;
		}
		}

		return Changed;
		}

} // end default namespace		} // end default namespace

INITIALIZE_PASS(BPFMIPreEmitPeephole, "bpf-mi-pemit-peephole",		INITIALIZE_PASS(BPFMIPreEmitPeephole, "bpf-mi-pemit-peephole",
"BPF PreEmit Peephole Optimization", false, false)		"BPF PreEmit Peephole Optimization", false, false)

char BPFMIPreEmitPeephole::ID = 0;		char BPFMIPreEmitPeephole::ID = 0;
FunctionPass* llvm::createBPFMIPreEmitPeepholePass()		FunctionPass* llvm::createBPFMIPreEmitPeepholePass()
{		{
▲ Show 20 Lines • Show All 179 Lines • Show Last 20 Lines
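
The distance estimate used by `adjustBranch()` can be sketched in isolation as follows. This is only an illustration under simplifying assumptions: basic blocks are reduced to a vector of instruction counts, and the helper names `cumulativeEnds` and `estimateDist` are hypothetical. The cut-off uses `/ 2` in place of the patch's `>> 1`; both give the range -16384 to 16383.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Conservative cut-off: half of the signed 16-bit range, leaving headroom
// for instructions added by the transformation itself.
bool in16BitRange(int Num) {
  return Num >= INT16_MIN / 2 && Num <= INT16_MAX / 2;
}

// Ends[i] = number of insns from function start through the end of block i
// (the role SoFarNumInsns plays in the pass).
std::vector<int> cumulativeEnds(const std::vector<int> &Sizes) {
  std::vector<int> Ends;
  int Total = 0;
  for (int S : Sizes) {
    Total += S;
    Ends.push_back(Total);
  }
  return Ends;
}

// Estimated distance from the end of block From to the start of block To.
int estimateDist(const std::vector<int> &Sizes, std::size_t From,
                 std::size_t To) {
  assert(From < Sizes.size() && To < Sizes.size());
  const std::vector<int> Ends = cumulativeEnds(Sizes);
  return Ends[To] - Sizes[To] - Ends[From];
}
```

Subtracting `Sizes[To]` is the correction raised in the review: a branch lands at the start of its target block, not at its end. A branch whose estimate falls outside `in16BitRange` is the one that gets rewritten to use `gotol`.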

llvm/lib/Target/BPF/BPFMISimplifyPatchable.cpp

Show First 20 Lines • Show All 90 Lines • ▼ Show 20 Lines	void BPFMISimplifyPatchable::initialize(MachineFunction &MFParm) {
MF = &MFParm;		MF = &MFParm;
TII = MF->getSubtarget<BPFSubtarget>().getInstrInfo();		TII = MF->getSubtarget<BPFSubtarget>().getInstrInfo();
LLVM_DEBUG(dbgs() << "* BPF simplify patchable insts pass *\n\n");		LLVM_DEBUG(dbgs() << "* BPF simplify patchable insts pass *\n\n");
}		}

bool BPFMISimplifyPatchable::isLoadInst(unsigned Opcode) {		bool BPFMISimplifyPatchable::isLoadInst(unsigned Opcode) {
return Opcode == BPF::LDD || Opcode == BPF::LDW || Opcode == BPF::LDH ||		return Opcode == BPF::LDD || Opcode == BPF::LDW || Opcode == BPF::LDH ||
Opcode == BPF::LDB || Opcode == BPF::LDW32 || Opcode == BPF::LDH32 ||		Opcode == BPF::LDB || Opcode == BPF::LDW32 || Opcode == BPF::LDH32 ||
Opcode == BPF::LDB32;		Opcode == BPF::LDB32 || Opcode == BPF::LDWSX || Opcode == BPF::LDHSX ||
		Opcode == BPF::LDBSX || Opcode == BPF::LDH32SX ||
		Opcode == BPF::LDB32SX;
}		}

void BPFMISimplifyPatchable::checkADDrr(MachineRegisterInfo *MRI,		void BPFMISimplifyPatchable::checkADDrr(MachineRegisterInfo *MRI,
MachineOperand *RelocOp, const GlobalValue *GVal) {		MachineOperand *RelocOp, const GlobalValue *GVal) {
const MachineInstr *Inst = RelocOp->getParent();		const MachineInstr *Inst = RelocOp->getParent();
const MachineOperand *Op1 = &Inst->getOperand(1);		const MachineOperand *Op1 = &Inst->getOperand(1);
const MachineOperand *Op2 = &Inst->getOperand(2);		const MachineOperand *Op2 = &Inst->getOperand(2);
const MachineOperand *BaseOp = (RelocOp == Op1) ? Op2 : Op1;		const MachineOperand *BaseOp = (RelocOp == Op1) ? Op2 : Op1;

// Go through all uses of %1 as in %1 = ADD_rr %2, %3		// Go through all uses of %1 as in %1 = ADD_rr %2, %3
const MachineOperand Op0 = Inst->getOperand(0);		const MachineOperand Op0 = Inst->getOperand(0);
for (MachineOperand &MO :		for (MachineOperand &MO :
llvm::make_early_inc_range(MRI->use_operands(Op0.getReg()))) {		llvm::make_early_inc_range(MRI->use_operands(Op0.getReg()))) {
// The candidate needs to have a unique definition.		// The candidate needs to have a unique definition.
if (!MRI->getUniqueVRegDef(MO.getReg()))		if (!MRI->getUniqueVRegDef(MO.getReg()))
continue;		continue;

MachineInstr *DefInst = MO.getParent();		MachineInstr *DefInst = MO.getParent();
unsigned Opcode = DefInst->getOpcode();		unsigned Opcode = DefInst->getOpcode();
unsigned COREOp;		unsigned COREOp;
if (Opcode == BPF::LDB || Opcode == BPF::LDH || Opcode == BPF::LDW ||		if (Opcode == BPF::LDB || Opcode == BPF::LDH || Opcode == BPF::LDW ||
Opcode == BPF::LDD || Opcode == BPF::STB || Opcode == BPF::STH ||		Opcode == BPF::LDD || Opcode == BPF::STB || Opcode == BPF::STH ||
Opcode == BPF::STW || Opcode == BPF::STD)		Opcode == BPF::STW || Opcode == BPF::STD || Opcode == BPF::LDWSX ||
		Opcode == BPF::LDHSX || Opcode == BPF::LDBSX || Opcode == BPF::LDH32SX ||
		Opcode == BPF::LDB32SX)
COREOp = BPF::CORE_MEM;		COREOp = BPF::CORE_MEM;
else if (Opcode == BPF::LDB32 || Opcode == BPF::LDH32 ||		else if (Opcode == BPF::LDB32 || Opcode == BPF::LDH32 ||
Opcode == BPF::LDW32 || Opcode == BPF::STB32 ||		Opcode == BPF::LDW32 || Opcode == BPF::STB32 ||
Opcode == BPF::STH32 || Opcode == BPF::STW32)		Opcode == BPF::STH32 || Opcode == BPF::STW32)
COREOp = BPF::CORE_ALU32_MEM;		COREOp = BPF::CORE_ALU32_MEM;
else		else
continue;		continue;

▲ Show 20 Lines • Show All 198 Lines • Show Last 20 Lines

llvm/lib/Target/BPF/BPFSubtarget.h

Show First 20 Lines • Show All 50 Lines • ▼ Show 20 Lines	protected:
bool HasJmp32;		bool HasJmp32;

// whether the cpu supports alu32 instructions.		// whether the cpu supports alu32 instructions.
bool HasAlu32;		bool HasAlu32;

// whether we should enable MCAsmInfo DwarfUsesRelocationsAcrossSections		// whether we should enable MCAsmInfo DwarfUsesRelocationsAcrossSections
bool UseDwarfRIS;		bool UseDwarfRIS;

		// whether cpu v4 insns are enabled.
		bool HasLdsx, HasMovsx, HasBswap, HasSdivSmod, HasGotol;

public:		public:
// This constructor initializes the data members to match that		// This constructor initializes the data members to match that
// of the specified triple.		// of the specified triple.
BPFSubtarget(const Triple &TT, const std::string &CPU, const std::string &FS,		BPFSubtarget(const Triple &TT, const std::string &CPU, const std::string &FS,
const TargetMachine &TM);		const TargetMachine &TM);

BPFSubtarget &initializeSubtargetDependencies(StringRef CPU, StringRef FS);		BPFSubtarget &initializeSubtargetDependencies(StringRef CPU, StringRef FS);

// ParseSubtargetFeatures - Parses features string setting specified		// ParseSubtargetFeatures - Parses features string setting specified
// subtarget options. Definition of function is auto generated by tblgen.		// subtarget options. Definition of function is auto generated by tblgen.
void ParseSubtargetFeatures(StringRef CPU, StringRef TuneCPU, StringRef FS);		void ParseSubtargetFeatures(StringRef CPU, StringRef TuneCPU, StringRef FS);
bool getHasJmpExt() const { return HasJmpExt; }		bool getHasJmpExt() const { return HasJmpExt; }
bool getHasJmp32() const { return HasJmp32; }		bool getHasJmp32() const { return HasJmp32; }
bool getHasAlu32() const { return HasAlu32; }		bool getHasAlu32() const { return HasAlu32; }
bool getUseDwarfRIS() const { return UseDwarfRIS; }		bool getUseDwarfRIS() const { return UseDwarfRIS; }
		bool hasLdsx() const { return HasLdsx; }
		bool hasMovsx() const { return HasMovsx; }
		bool hasBswap() const { return HasBswap; }
		bool hasSdivSmod() const { return HasSdivSmod; }
		bool hasGotol() const { return HasGotol; }

const BPFInstrInfo *getInstrInfo() const override { return &InstrInfo; }		const BPFInstrInfo *getInstrInfo() const override { return &InstrInfo; }
const BPFFrameLowering *getFrameLowering() const override {		const BPFFrameLowering *getFrameLowering() const override {
return &FrameLowering;		return &FrameLowering;
}		}
const BPFTargetLowering *getTargetLowering() const override {		const BPFTargetLowering *getTargetLowering() const override {
return &TLInfo;		return &TLInfo;
}		}
Show All 10 Lines

llvm/lib/Target/BPF/BPFSubtarget.cpp

	Show All 17 Lines
	using namespace llvm;			using namespace llvm;

	#define DEBUG_TYPE "bpf-subtarget"			#define DEBUG_TYPE "bpf-subtarget"

	#define GET_SUBTARGETINFO_TARGET_DESC			#define GET_SUBTARGETINFO_TARGET_DESC
	#define GET_SUBTARGETINFO_CTOR			#define GET_SUBTARGETINFO_CTOR
	#include "BPFGenSubtargetInfo.inc"			#include "BPFGenSubtargetInfo.inc"

				static cl::opt<bool> Disable_ldsx("disable-ldsx", cl::Hidden, cl::init(false),
				cl::desc("Disable ldsx insns"));
				static cl::opt<bool> Disable_movsx("disable-movsx", cl::Hidden, cl::init(false),
				cl::desc("Disable movsx insns"));
				static cl::opt<bool> Disable_bswap("disable-bswap", cl::Hidden, cl::init(false),
				cl::desc("Disable bswap insns"));
				static cl::opt<bool> Disable_sdiv_smod("disable-sdiv-smod", cl::Hidden,
				cl::init(false), cl::desc("Disable sdiv/smod insns"));
				static cl::opt<bool> Disable_gotol("disable-gotol", cl::Hidden, cl::init(false),
				cl::desc("Disable gotol insn"));

	void BPFSubtarget::anchor() {}			void BPFSubtarget::anchor() {}

	BPFSubtarget &BPFSubtarget::initializeSubtargetDependencies(StringRef CPU,			BPFSubtarget &BPFSubtarget::initializeSubtargetDependencies(StringRef CPU,
	StringRef FS) {			StringRef FS) {
	initializeEnvironment();			initializeEnvironment();
	initSubtargetFeatures(CPU, FS);			initSubtargetFeatures(CPU, FS);
	ParseSubtargetFeatures(CPU, /*TuneCPU*/ CPU, FS);			ParseSubtargetFeatures(CPU, /*TuneCPU*/ CPU, FS);
	return *this;			return *this;
	}			}

	void BPFSubtarget::initializeEnvironment() {			void BPFSubtarget::initializeEnvironment() {
	HasJmpExt = false;			HasJmpExt = false;
	HasJmp32 = false;			HasJmp32 = false;
	HasAlu32 = false;			HasAlu32 = false;
	UseDwarfRIS = false;			UseDwarfRIS = false;
				HasLdsx = false;
				HasMovsx = false;
				HasBswap = false;
				HasSdivSmod = false;
				HasGotol = false;
	}			}

	void BPFSubtarget::initSubtargetFeatures(StringRef CPU, StringRef FS) {			void BPFSubtarget::initSubtargetFeatures(StringRef CPU, StringRef FS) {
	if (CPU == "probe")			if (CPU == "probe")
	CPU = sys::detail::getHostCPUNameForBPF();			CPU = sys::detail::getHostCPUNameForBPF();
	if (CPU == "generic" || CPU == "v1")			if (CPU == "generic" || CPU == "v1")
	return;			return;
	if (CPU == "v2") {			if (CPU == "v2") {
	HasJmpExt = true;			HasJmpExt = true;
	return;			return;
	}			}
	if (CPU == "v3") {			if (CPU == "v3") {
	HasJmpExt = true;			HasJmpExt = true;
	HasJmp32 = true;			HasJmp32 = true;
	HasAlu32 = true;			HasAlu32 = true;
	return;			return;
	}			}
				if (CPU == "v4") {
				HasJmpExt = true;
				HasJmp32 = true;
				HasAlu32 = true;
				HasLdsx = !Disable_ldsx;
				HasMovsx = !Disable_movsx;
				HasBswap = !Disable_bswap;
				HasSdivSmod = !Disable_sdiv_smod;
				HasGotol = !Disable_gotol;
				return;
				}
	}			}

	BPFSubtarget::BPFSubtarget(const Triple &TT, const std::string &CPU,			BPFSubtarget::BPFSubtarget(const Triple &TT, const std::string &CPU,
	const std::string &FS, const TargetMachine &TM)			const std::string &FS, const TargetMachine &TM)
	: BPFGenSubtargetInfo(TT, CPU, /*TuneCPU*/ CPU, FS),			: BPFGenSubtargetInfo(TT, CPU, /*TuneCPU*/ CPU, FS),
	FrameLowering(initializeSubtargetDependencies(CPU, FS)),			FrameLowering(initializeSubtargetDependencies(CPU, FS)),
	TLInfo(TM, *this) {}			TLInfo(TM, *this) {}

llvm/lib/Target/BPF/Disassembler/BPFDisassembler.cpp

(51 lines not shown)	enum BPF_SIZE {
BPF_DW = 0x3		BPF_DW = 0x3
};		};

enum BPF_MODE {		enum BPF_MODE {
BPF_IMM = 0x0,		BPF_IMM = 0x0,
BPF_ABS = 0x1,		BPF_ABS = 0x1,
BPF_IND = 0x2,		BPF_IND = 0x2,
BPF_MEM = 0x3,		BPF_MEM = 0x3,
BPF_LEN = 0x4,		BPF_MEMSX = 0x4,
BPF_MSH = 0x5,
BPF_ATOMIC = 0x6		BPF_ATOMIC = 0x6
};		};

BPFDisassembler(const MCSubtargetInfo &STI, MCContext &Ctx)		BPFDisassembler(const MCSubtargetInfo &STI, MCContext &Ctx)
: MCDisassembler(STI, Ctx) {}		: MCDisassembler(STI, Ctx) {}
~BPFDisassembler() override = default;		~BPFDisassembler() override = default;

DecodeStatus getInstruction(MCInst &Instr, uint64_t &Size,		DecodeStatus getInstruction(MCInst &Instr, uint64_t &Size,
(103 lines not shown)	DecodeStatus BPFDisassembler::getInstruction(MCInst &Instr, uint64_t &Size,

Result = readInstruction64(Bytes, Address, Size, Insn, IsLittleEndian);		Result = readInstruction64(Bytes, Address, Size, Insn, IsLittleEndian);
if (Result == MCDisassembler::Fail) return MCDisassembler::Fail;		if (Result == MCDisassembler::Fail) return MCDisassembler::Fail;

uint8_t InstClass = getInstClass(Insn);		uint8_t InstClass = getInstClass(Insn);
uint8_t InstMode = getInstMode(Insn);		uint8_t InstMode = getInstMode(Insn);
if ((InstClass == BPF_LDX || InstClass == BPF_STX) &&		if ((InstClass == BPF_LDX || InstClass == BPF_STX) &&
getInstSize(Insn) != BPF_DW &&		getInstSize(Insn) != BPF_DW &&
(InstMode == BPF_MEM || InstMode == BPF_ATOMIC) &&		(InstMode == BPF_MEM || InstMode == BPF_MEMSX || InstMode == BPF_ATOMIC) &&
STI.hasFeature(BPF::ALU32))		STI.hasFeature(BPF::ALU32))
Result = decodeInstruction(DecoderTableBPFALU3264, Instr, Insn, Address,		Result = decodeInstruction(DecoderTableBPFALU3264, Instr, Insn, Address,
this, STI);		this, STI);
else		else
Result = decodeInstruction(DecoderTableBPF64, Instr, Insn, Address, this,		Result = decodeInstruction(DecoderTableBPF64, Instr, Insn, Address, this,
STI);		STI);

if (Result == MCDisassembler::Fail) return MCDisassembler::Fail;		if (Result == MCDisassembler::Fail) return MCDisassembler::Fail;
(36 lines not shown)

llvm/lib/Target/BPF/MCTargetDesc/BPFAsmBackend.cpp

//===-- BPFAsmBackend.cpp - BPF Assembler Backend -------------------------===//		//===-- BPFAsmBackend.cpp - BPF Assembler Backend -------------------------===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

		#include "MCTargetDesc/BPFMCFixups.h"
#include "MCTargetDesc/BPFMCTargetDesc.h"		#include "MCTargetDesc/BPFMCTargetDesc.h"
#include "llvm/ADT/StringRef.h"		#include "llvm/ADT/StringRef.h"
#include "llvm/MC/MCAsmBackend.h"		#include "llvm/MC/MCAsmBackend.h"
#include "llvm/MC/MCAssembler.h"		#include "llvm/MC/MCAssembler.h"
#include "llvm/MC/MCContext.h"		#include "llvm/MC/MCContext.h"
#include "llvm/MC/MCFixup.h"		#include "llvm/MC/MCFixup.h"
		#include "llvm/MC/MCFixupKindInfo.h"
#include "llvm/MC/MCObjectWriter.h"		#include "llvm/MC/MCObjectWriter.h"
#include "llvm/Support/EndianStream.h"		#include "llvm/Support/EndianStream.h"
#include <cassert>		#include <cassert>
#include <cstdint>		#include <cstdint>

using namespace llvm;		using namespace llvm;

namespace {		namespace {
(13 lines not shown)	public:

// No instruction requires relaxation		// No instruction requires relaxation
bool fixupNeedsRelaxation(const MCFixup &Fixup, uint64_t Value,		bool fixupNeedsRelaxation(const MCFixup &Fixup, uint64_t Value,
const MCRelaxableFragment *DF,		const MCRelaxableFragment *DF,
const MCAsmLayout &Layout) const override {		const MCAsmLayout &Layout) const override {
return false;		return false;
}		}

unsigned getNumFixupKinds() const override { return 1; }		unsigned getNumFixupKinds() const override {
		return BPF::NumTargetFixupKinds;
		}
		const MCFixupKindInfo &getFixupKindInfo(MCFixupKind Kind) const override;

bool writeNopData(raw_ostream &OS, uint64_t Count,		bool writeNopData(raw_ostream &OS, uint64_t Count,
const MCSubtargetInfo *STI) const override;		const MCSubtargetInfo *STI) const override;
};		};

} // end anonymous namespace		} // end anonymous namespace

		const MCFixupKindInfo &
		BPFAsmBackend::getFixupKindInfo(MCFixupKind Kind) const {
		const static MCFixupKindInfo Infos[BPF::NumTargetFixupKinds] = {
		{ "FK_BPF_PCRel_4", 0, 32, MCFixupKindInfo::FKF_IsPCRel },
		};

		if (Kind < FirstTargetFixupKind)
		return MCAsmBackend::getFixupKindInfo(Kind);

		assert(unsigned(Kind - FirstTargetFixupKind) < getNumFixupKinds() &&
		"Invalid kind!");
		return Infos[Kind - FirstTargetFixupKind];
		}

bool BPFAsmBackend::writeNopData(raw_ostream &OS, uint64_t Count,		bool BPFAsmBackend::writeNopData(raw_ostream &OS, uint64_t Count,
const MCSubtargetInfo *STI) const {		const MCSubtargetInfo *STI) const {
if ((Count % 8) != 0)		if ((Count % 8) != 0)
return false;		return false;

for (uint64_t i = 0; i < Count; i += 8)		for (uint64_t i = 0; i < Count; i += 8)
support::endian::write<uint64_t>(OS, 0x15000000, Endian);		support::endian::write<uint64_t>(OS, 0x15000000, Endian);

(20 lines not shown)	if (Fixup.getKind() == FK_SecRel_8) {
Value = (uint32_t)((Value - 8) / 8);		Value = (uint32_t)((Value - 8) / 8);
if (Endian == support::little) {		if (Endian == support::little) {
Data[Fixup.getOffset() + 1] = 0x10;		Data[Fixup.getOffset() + 1] = 0x10;
support::endian::write32le(&Data[Fixup.getOffset() + 4], Value);		support::endian::write32le(&Data[Fixup.getOffset() + 4], Value);
} else {		} else {
Data[Fixup.getOffset() + 1] = 0x1;		Data[Fixup.getOffset() + 1] = 0x1;
support::endian::write32be(&Data[Fixup.getOffset() + 4], Value);		support::endian::write32be(&Data[Fixup.getOffset() + 4], Value);
}		}
		} else if (Fixup.getTargetKind() == BPF::FK_BPF_PCRel_4) {
		// The input Value represents the number of bytes.
		eddyz87 (inline comment): This is because `Value` is in bytes, right? Could you please drop a comment here.
		Value = (uint32_t)((Value - 8) / 8);
		support::endian::write<uint32_t>(&Data[Fixup.getOffset() + 4], Value,
		Endian);
} else {		} else {
assert(Fixup.getKind() == FK_PCRel_2);		assert(Fixup.getKind() == FK_PCRel_2);

int64_t ByteOff = (int64_t)Value - 8;		int64_t ByteOff = (int64_t)Value - 8;
if (ByteOff > INT16_MAX * 8 || ByteOff < INT16_MIN * 8)		if (ByteOff > INT16_MAX * 8 || ByteOff < INT16_MIN * 8)
report_fatal_error("Branch target out of insn range");		report_fatal_error("Branch target out of insn range");

Value = (uint16_t)((Value - 8) / 8);		Value = (uint16_t)((Value - 8) / 8);
(23 lines not shown)
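
For readers following the fixup arithmetic in BPFAsmBackend.cpp, the conversion from a byte distance to an instruction offset can be sketched as below. This is a minimal stand-alone illustration, not code from the patch: the helper names `toInsnOffset16` and `toInsnOffset32` are hypothetical, and the sketch assumes the input is the byte distance from the start of the jump instruction to its target. It uses signed arithmetic for readability, whereas the patch operates on an unsigned `Value`.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper (not in the patch): convert a PC-relative byte distance
// into a 16-bit BPF instruction offset, mirroring (Value - 8) / 8.
// Returns std::nullopt where the backend would report
// "Branch target out of insn range".
std::optional<int16_t> toInsnOffset16(int64_t ByteDistance) {
  assert(ByteDistance % 8 == 0 && "BPF instructions are 8 bytes wide");
  int64_t ByteOff = ByteDistance - 8; // offsets are relative to the next insn
  if (ByteOff > INT16_MAX * 8 || ByteOff < INT16_MIN * 8)
    return std::nullopt;
  return static_cast<int16_t>(ByteOff / 8);
}

// Hypothetical 32-bit counterpart, corresponding to the FK_BPF_PCRel_4 case
// that the patch adds for the 32-bit jump (JMPL, printed as gotol).
int32_t toInsnOffset32(int64_t ByteDistance) {
  assert(ByteDistance % 8 == 0 && "BPF instructions are 8 bytes wide");
  return static_cast<int32_t>((ByteDistance - 8) / 8);
}
```

With the `gotol +0xcd9b` example from the summary, a byte distance of 421088 maps to instruction offset 0xcd9b in the 32-bit helper, while the 16-bit helper rejects it as out of range.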

llvm/lib/Target/BPF/MCTargetDesc/BPFInstPrinter.cpp

//===-- BPFInstPrinter.cpp - Convert BPF MCInst to asm syntax -------------===//		//===-- BPFInstPrinter.cpp - Convert BPF MCInst to asm syntax -------------===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
// This class prints an BPF MCInst to a .s file.		// This class prints an BPF MCInst to a .s file.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//


		#include "BPF.h"
#include "MCTargetDesc/BPFInstPrinter.h"		#include "MCTargetDesc/BPFInstPrinter.h"
#include "llvm/MC/MCAsmInfo.h"		#include "llvm/MC/MCAsmInfo.h"
#include "llvm/MC/MCExpr.h"		#include "llvm/MC/MCExpr.h"
#include "llvm/MC/MCInst.h"		#include "llvm/MC/MCInst.h"
#include "llvm/MC/MCRegister.h"		#include "llvm/MC/MCRegister.h"
#include "llvm/MC/MCSymbol.h"		#include "llvm/MC/MCSymbol.h"
#include "llvm/Support/Casting.h"		#include "llvm/Support/Casting.h"
#include "llvm/Support/ErrorHandling.h"		#include "llvm/Support/ErrorHandling.h"
(74 lines not shown)	void BPFInstPrinter::printImm64Operand(const MCInst *MI, unsigned OpNo,
else		else
O << Op;		O << Op;
}		}

void BPFInstPrinter::printBrTargetOperand(const MCInst *MI, unsigned OpNo,		void BPFInstPrinter::printBrTargetOperand(const MCInst *MI, unsigned OpNo,
raw_ostream &O) {		raw_ostream &O) {
const MCOperand &Op = MI->getOperand(OpNo);		const MCOperand &Op = MI->getOperand(OpNo);
if (Op.isImm()) {		if (Op.isImm()) {
		if (MI->getOpcode() == BPF::JMPL) {
		int32_t Imm = Op.getImm();
		O << ((Imm >= 0) ? "+" : "") << formatImm(Imm);
		} else {
int16_t Imm = Op.getImm();		int16_t Imm = Op.getImm();
O << ((Imm >= 0) ? "+" : "") << formatImm(Imm);		O << ((Imm >= 0) ? "+" : "") << formatImm(Imm);
		}
} else if (Op.isExpr()) {		} else if (Op.isExpr()) {
printExpr(Op.getExpr(), O);		printExpr(Op.getExpr(), O);
} else {		} else {
O << Op;		O << Op;
}		}
}		}

llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp

//===-- BPFMCCodeEmitter.cpp - Convert BPF code to machine code -----------===//		//===-- BPFMCCodeEmitter.cpp - Convert BPF code to machine code -----------===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
// This file implements the BPFMCCodeEmitter class.		// This file implements the BPFMCCodeEmitter class.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

		#include "MCTargetDesc/BPFMCFixups.h"
#include "MCTargetDesc/BPFMCTargetDesc.h"		#include "MCTargetDesc/BPFMCTargetDesc.h"
#include "llvm/ADT/SmallVector.h"		#include "llvm/ADT/SmallVector.h"
#include "llvm/MC/MCCodeEmitter.h"		#include "llvm/MC/MCCodeEmitter.h"
#include "llvm/MC/MCExpr.h"		#include "llvm/MC/MCExpr.h"
#include "llvm/MC/MCFixup.h"		#include "llvm/MC/MCFixup.h"
#include "llvm/MC/MCInst.h"		#include "llvm/MC/MCInst.h"
#include "llvm/MC/MCInstrInfo.h"		#include "llvm/MC/MCInstrInfo.h"
#include "llvm/MC/MCRegisterInfo.h"		#include "llvm/MC/MCRegisterInfo.h"
(69 lines not shown)	unsigned BPFMCCodeEmitter::getMachineOpValue(const MCInst &MI,

assert(Expr->getKind() == MCExpr::SymbolRef);		assert(Expr->getKind() == MCExpr::SymbolRef);

if (MI.getOpcode() == BPF::JAL)		if (MI.getOpcode() == BPF::JAL)
// func call name		// func call name
Fixups.push_back(MCFixup::create(0, Expr, FK_PCRel_4));		Fixups.push_back(MCFixup::create(0, Expr, FK_PCRel_4));
else if (MI.getOpcode() == BPF::LD_imm64)		else if (MI.getOpcode() == BPF::LD_imm64)
Fixups.push_back(MCFixup::create(0, Expr, FK_SecRel_8));		Fixups.push_back(MCFixup::create(0, Expr, FK_SecRel_8));
		else if (MI.getOpcode() == BPF::JMPL)
		Fixups.push_back(MCFixup::create(0, Expr, (MCFixupKind)BPF::FK_BPF_PCRel_4));
else		else
// bb label		// bb label
Fixups.push_back(MCFixup::create(0, Expr, FK_PCRel_2));		Fixups.push_back(MCFixup::create(0, Expr, FK_PCRel_2));

return 0;		return 0;
}		}

static uint8_t SwapBits(uint8_t Val)		static uint8_t SwapBits(uint8_t Val)
(64 lines not shown)

llvm/lib/Target/BPF/MCTargetDesc/BPFMCFixups.h

This file was added.

				//=======-- BPFMCFixups.h - BPF-specific fixup entries ------- C++ --=======//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_LIB_TARGET_BPF_MCTARGETDESC_SYSTEMZMCFIXUPS_H
				#define LLVM_LIB_TARGET_BPF_MCTARGETDESC_SYSTEMZMCFIXUPS_H

				#include "llvm/MC/MCFixup.h"

				namespace llvm {
				namespace BPF {
				enum FixupKind {
				// These correspond directly to R_390_* relocations.
				ast (inline comment): a little bit too much of copy paste :)
				FK_BPF_PCRel_4 = FirstTargetFixupKind,

				// Marker
				LastTargetFixupKind,
				NumTargetFixupKinds = LastTargetFixupKind - FirstTargetFixupKind
				};
				} // end namespace BPF
				} // end namespace llvm

				#endif

llvm/lib/Target/BPF/MCTargetDesc/BPFMCTargetDesc.cpp

	(73 lines not shown)
	class BPFMCInstrAnalysis : public MCInstrAnalysis {			class BPFMCInstrAnalysis : public MCInstrAnalysis {
	public:			public:
	explicit BPFMCInstrAnalysis(const MCInstrInfo *Info)			explicit BPFMCInstrAnalysis(const MCInstrInfo *Info)
	: MCInstrAnalysis(Info) {}			: MCInstrAnalysis(Info) {}

	bool evaluateBranch(const MCInst &Inst, uint64_t Addr, uint64_t Size,			bool evaluateBranch(const MCInst &Inst, uint64_t Addr, uint64_t Size,
	uint64_t &Target) const override {			uint64_t &Target) const override {
	// The target is the 3rd operand of cond inst and the 1st of uncond inst.			// The target is the 3rd operand of cond inst and the 1st of uncond inst.
	int16_t Imm;			int32_t Imm;
	if (isConditionalBranch(Inst)) {			if (isConditionalBranch(Inst)) {
	Imm = Inst.getOperand(2).getImm();			Imm = (short)Inst.getOperand(2).getImm();
	} else if (isUnconditionalBranch(Inst))			} else if (isUnconditionalBranch(Inst)) {
	Imm = Inst.getOperand(0).getImm();			if (Inst.getOpcode() == BPF::JMP)
				Imm = (short)Inst.getOperand(0).getImm();
	else			else
				Imm = (int)Inst.getOperand(0).getImm();
				} else
	return false;			return false;

	Target = Addr + Size + Imm * Size;			Target = Addr + Size + Imm * Size;
	return true;			return true;
	}			}
	};			};

	} // end anonymous namespace			} // end anonymous namespace
	(56 lines not shown)
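
The effect of widening `Imm` in `evaluateBranch` can be illustrated with a small stand-alone sketch. The function name `branchTarget` and the `IsGotol` flag are hypothetical and stand in for the opcode check in the patch; the fixed 8-byte instruction size is an assumption that matches the BPF encoding.

```cpp
#include <cstdint>

// Assumed fixed BPF instruction width in bytes.
constexpr uint64_t kInsnSize = 8;

// Hypothetical stand-alone version of the target computation:
// Target = Addr + Size + Imm * Size.
// The immediate is truncated to 16 bits for ordinary jumps and to 32 bits
// for the 32-bit jump, then sign-extended before scaling.
uint64_t branchTarget(uint64_t Addr, int64_t RawImm, bool IsGotol) {
  int32_t Imm = IsGotol ? static_cast<int32_t>(RawImm)
                        : static_cast<int32_t>(static_cast<int16_t>(RawImm));
  // Unsigned wrap-around makes negative offsets subtract correctly.
  return Addr + kInsnSize + static_cast<uint64_t>(static_cast<int64_t>(Imm)) * kInsnSize;
}
```

A backward jump with offset -1 lands on the jump instruction itself, and a 32-bit jump with offset 0xcd9b from address 0 lands at byte 421088.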

llvm/test/CodeGen/BPF/bswap.ll

This file was added.

				; RUN: llc -march=bpfel -mcpu=v4 -verify-machineinstrs -show-mc-encoding < %s | FileCheck %s
				; Source:
				; long foo(int a, int b, long c) {
				; a = __builtin_bswap16(a);
				; b = __builtin_bswap32(b);
				; c = __builtin_bswap64(c);
				; return a + b + c;
				; }
				; Compilation flags:
				; clang -target bpf -O2 -S -emit-llvm t.c

				; Function Attrs: mustprogress nofree nosync nounwind willreturn memory(none)
				define dso_local i64 @foo(i32 noundef %a, i32 noundef %b, i64 noundef %c) local_unnamed_addr #0 {
				entry:
				%conv = trunc i32 %a to i16
				%0 = tail call i16 @llvm.bswap.i16(i16 %conv)
				%conv1 = zext i16 %0 to i32
				%1 = tail call i32 @llvm.bswap.i32(i32 %b)
				%2 = tail call i64 @llvm.bswap.i64(i64 %c)
				%add = add nsw i32 %1, %conv1
				%conv2 = sext i32 %add to i64
				%add3 = add nsw i64 %2, %conv2
				ret i64 %add3
				}

				; CHECK: r1 = bswap16 r1 # encoding: [0xd7,0x01,0x00,0x00,0x10,0x00,0x00,0x00]
				; CHECK: r2 = bswap32 r2 # encoding: [0xd7,0x02,0x00,0x00,0x20,0x00,0x00,0x00]
				; CHECK: r0 = bswap64 r0 # encoding: [0xd7,0x00,0x00,0x00,0x40,0x00,0x00,0x00]

				; Function Attrs: mustprogress nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare i16 @llvm.bswap.i16(i16) #1

				; Function Attrs: mustprogress nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare i32 @llvm.bswap.i32(i32) #1

				; Function Attrs: mustprogress nocallback nofree nosync nounwind speculatable willreturn memory(none)
				declare i64 @llvm.bswap.i64(i64) #1

				attributes #0 = { mustprogress nofree nosync nounwind willreturn memory(none) "frame-pointer"="all" "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
				attributes #1 = { mustprogress nocallback nofree nosync nounwind speculatable willreturn memory(none) }

				!llvm.module.flags = !{!0, !1}
				!llvm.ident = !{!2}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{i32 7, !"frame-pointer", i32 2}
				!2 = !{!"clang version 17.0.0 (https://github.com/llvm/llvm-project.git a2913a8a2bfe572d2f1bfea950ab9b0848373648)"}

llvm/test/CodeGen/BPF/ldsx.ll

This file was added.

				; RUN: llc -march=bpfel -mcpu=v4 -verify-machineinstrs -show-mc-encoding < %s | FileCheck %s
				; Source:
				; int f1(char *p) {
				; return *p;
				; }
				; int f2(short *p) {
				; return *p;
				; }
				; int f3(int *p) {
				; return *p;
				; }
				; long f4(char *p) {
				; return *p;
				; }
				; long f5(short *p) {
				; return *p;
				; }
				; long f6(int *p) {
				; return *p;
				; }
				; long f7(long *p) {
				; return *p;
				; }
				; Compilation flags:
				; clang -target bpf -O2 -S -emit-llvm -Xclang -disable-llvm-passes t.c

				; Function Attrs: argmemonly mustprogress nofree norecurse nosync nounwind readonly willreturn
				define dso_local i32 @f1(ptr nocapture noundef readonly %p) local_unnamed_addr #0 {
				entry:
				%0 = load i8, ptr %p, align 1, !tbaa !3
				%conv = sext i8 %0 to i32
				; CHECK: w0 = *(s8 *)(r1 + 0) # encoding: [0x91,0x10,0x00,0x00,0x00,0x00,0x00,0x00]
				ret i32 %conv
				}

				; Function Attrs: argmemonly mustprogress nofree norecurse nosync nounwind readonly willreturn
				define dso_local i32 @f2(ptr nocapture noundef readonly %p) local_unnamed_addr #0 {
				entry:
				%0 = load i16, ptr %p, align 2, !tbaa !6
				%conv = sext i16 %0 to i32
				; CHECK: w0 = *(s16 *)(r1 + 0) # encoding: [0x89,0x10,0x00,0x00,0x00,0x00,0x00,0x00]
				ret i32 %conv
				}

				; Function Attrs: argmemonly mustprogress nofree norecurse nosync nounwind readonly willreturn
				define dso_local i32 @f3(ptr nocapture noundef readonly %p) local_unnamed_addr #0 {
				entry:
				%0 = load i32, ptr %p, align 4, !tbaa !8
				; CHECK: w0 = *(u32 *)(r1 + 0) # encoding: [0x61,0x10,0x00,0x00,0x00,0x00,0x00,0x00]
				ret i32 %0
				}

				; Function Attrs: argmemonly mustprogress nofree norecurse nosync nounwind readonly willreturn
				define dso_local i64 @f4(ptr nocapture noundef readonly %p) local_unnamed_addr #0 {
				entry:
				%0 = load i8, ptr %p, align 1, !tbaa !3
				%conv = sext i8 %0 to i64
				ret i64 %conv
				; CHECK: r0 = *(s8 *)(r1 + 0) # encoding: [0x91,0x10,0x00,0x00,0x00,0x00,0x00,0x00]
				}

				; Function Attrs: argmemonly mustprogress nofree norecurse nosync nounwind readonly willreturn
				define dso_local i64 @f5(ptr nocapture noundef readonly %p) local_unnamed_addr #0 {
				entry:
				%0 = load i16, ptr %p, align 2, !tbaa !6
				%conv = sext i16 %0 to i64
				ret i64 %conv
				; CHECK: r0 = *(s16 *)(r1 + 0) # encoding: [0x89,0x10,0x00,0x00,0x00,0x00,0x00,0x00]
				}

				; Function Attrs: argmemonly mustprogress nofree norecurse nosync nounwind readonly willreturn
				define dso_local i64 @f6(ptr nocapture noundef readonly %p) local_unnamed_addr #0 {
				entry:
				%0 = load i32, ptr %p, align 4, !tbaa !8
				%conv = sext i32 %0 to i64
				ret i64 %conv
				; CHECK: r0 = *(s32 *)(r1 + 0) # encoding: [0x81,0x10,0x00,0x00,0x00,0x00,0x00,0x00]
				}

				; Function Attrs: argmemonly mustprogress nofree norecurse nosync nounwind readonly willreturn
				define dso_local i64 @f7(ptr nocapture noundef readonly %p) local_unnamed_addr #0 {
				entry:
				%0 = load i64, ptr %p, align 8, !tbaa !10
				ret i64 %0
				; CHECK: r0 = *(u64 *)(r1 + 0) # encoding: [0x79,0x10,0x00,0x00,0x00,0x00,0x00,0x00]
				}

				attributes #0 = { argmemonly mustprogress nofree norecurse nosync nounwind readonly willreturn "frame-pointer"="all" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" }

				!llvm.module.flags = !{!0, !1}
				!llvm.ident = !{!2}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{i32 7, !"frame-pointer", i32 2}
				!2 = !{!"clang version 16.0.0 (https://github.com/llvm/llvm-project.git 68665544c7d59735e9c0bb32b08829c006c7c594)"}
				!3 = !{!4, !4, i64 0}
				!4 = !{!"omnipotent char", !5, i64 0}
				!5 = !{!"Simple C/C++ TBAA"}
				!6 = !{!7, !7, i64 0}
				!7 = !{!"short", !4, i64 0}
				!8 = !{!9, !9, i64 0}
				!9 = !{!"int", !4, i64 0}
				!10 = !{!11, !11, i64 0}
				!11 = !{!"long", !4, i64 0}

llvm/test/CodeGen/BPF/movsx.ll

This file was added.

				; RUN: llc -march=bpfel -mcpu=v4 -verify-machineinstrs -show-mc-encoding < %s | FileCheck %s
				; Source:
				; short f1(char a) {
				; return a;
				; }
				; int f2(char a) {
				; return a;
				; }
				; long f3(char a) {
				; return a;
				; }
				; int f4(short a) {
				; return a;
				; }
				; long f5(short a) {
				; return a;
				; }
				; long f6(int a) {
				; return a;
				; }
				; Compilation flags:
				; clang -target bpf -O2 -S -emit-llvm t.c

				; Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn memory(none)
				define dso_local i16 @f1(i8 noundef signext %a) local_unnamed_addr #0 {
				entry:
				%conv = sext i8 %a to i16
				ret i16 %conv
				}
				; CHECK: w0 = w1 # encoding: [0xbc,0x10,0x00,0x00,0x00,0x00,0x00,0x00]

				eddyz87 (inline comment): This does not seem right, as it does not sign extend 8-bit argument to 16-bit value.
				yonghong-song (author, inline reply): This is probably due to ABI. For example,
				  $ cat t1.c
				  __attribute__((noinline)) short f1(char a) { return a * a; }
				  int f2(int a) { return f1(a); }
				  $ clang --target=bpf -O2 -mcpu=v4 -S t1.c
				  f1: # @f1
				  # %bb.0: # %entry
				    w0 = w1
				    w0 = w0
				    exit
				  .Lfunc_end0:
				    .size f1, .Lfunc_end0-f1
				    # -- End function
				    .globl f2 # -- Begin function f2
				    .p2align 3
				    .type f2,@function
				  f2: # @f2
				  # %bb.0: # %entry
				    w1 = (s8)w1
				    call f1
				    w0 = (s16)w0
				    exit
				You can see that in function f2() the sign extension has been done properly, and that is probably the reason why in f1() the compiler didn't generate proper sign-extension code. I will modify the test to generate proper sign extension like the above f2().
				; Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn memory(none)
				define dso_local i32 @f2(i8 noundef signext %a) local_unnamed_addr #0 {
				entry:
				%conv = sext i8 %a to i32
				ret i32 %conv
				}
				; CHECK: w0 = w1 # encoding: [0xbc,0x10,0x00,0x00,0x00,0x00,0x00,0x00]

				eddyz87 (inline comment): Shouldn't this be `w0 = (s8)w1`? A few checks below also look strange.
				; Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn memory(none)
				define dso_local i64 @f3(i8 noundef signext %a) local_unnamed_addr #0 {
				entry:
				%conv = sext i8 %a to i64
				ret i64 %conv
				}
				; CHECK: r0 = (s32)w1 # encoding: [0xbf,0x10,0x20,0x00,0x00,0x00,0x00,0x00]

				; Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn memory(none)
				define dso_local i32 @f4(i16 noundef signext %a) local_unnamed_addr #0 {
				entry:
				%conv = sext i16 %a to i32
				ret i32 %conv
				}
				; CHECK: w0 = w1 # encoding: [0xbc,0x10,0x00,0x00,0x00,0x00,0x00,0x00]

				; Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn memory(none)
				define dso_local i64 @f5(i16 noundef signext %a) local_unnamed_addr #0 {
				entry:
				%conv = sext i16 %a to i64
				ret i64 %conv
				}
				; CHECK: r0 = (s32)w1 # encoding: [0xbf,0x10,0x20,0x00,0x00,0x00,0x00,0x00]

				; Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn memory(none)
				define dso_local i64 @f6(i32 noundef %a) local_unnamed_addr #0 {
				entry:
				%conv = sext i32 %a to i64
				ret i64 %conv
				}
				; CHECK: r0 = (s32)w1 # encoding: [0xbf,0x10,0x20,0x00,0x00,0x00,0x00,0x00]

				attributes #0 = { mustprogress nofree norecurse nosync nounwind willreturn memory(none) "frame-pointer"="all" "no-trapping-math"="true" "stack-protector-buffer-size"="8" }

				!llvm.module.flags = !{!0, !1}
				!llvm.ident = !{!2}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{i32 7, !"frame-pointer", i32 2}
				!2 = !{!"clang version 17.0.0 (https://github.com/llvm/llvm-project.git a2913a8a2bfe572d2f1bfea950ab9b0848373648)"}

llvm/test/CodeGen/BPF/sdiv_smod.ll

This file was added.

				; RUN: llc -march=bpfel -mcpu=v4 -verify-machineinstrs -show-mc-encoding < %s | FileCheck %s
				; Source:
				; int foo(int a, int b, int c) {
				; return a/b + a%c;
				; }
				; long bar(long a, long b, long c) {
				; return a/b + a%c;
				; }
				; Compilation flags:
				; clang -target bpf -O2 -S -emit-llvm -Xclang -disable-llvm-passes t.c

				; Function Attrs: nounwind
				define dso_local i32 @foo(i32 noundef %a, i32 noundef %b, i32 noundef %c) #0 {
				entry:
				%a.addr = alloca i32, align 4
				%b.addr = alloca i32, align 4
				%c.addr = alloca i32, align 4
				store i32 %a, ptr %a.addr, align 4, !tbaa !3
				store i32 %b, ptr %b.addr, align 4, !tbaa !3
				store i32 %c, ptr %c.addr, align 4, !tbaa !3
				%0 = load i32, ptr %a.addr, align 4, !tbaa !3
				%1 = load i32, ptr %b.addr, align 4, !tbaa !3
				%div = sdiv i32 %0, %1
				%2 = load i32, ptr %a.addr, align 4, !tbaa !3
				%3 = load i32, ptr %c.addr, align 4, !tbaa !3
				%rem = srem i32 %2, %3
				%add = add nsw i32 %div, %rem
				ret i32 %add
				}

				; CHECK: w0 = w1
				; CHECK-NEXT: *(u32 *)(r10 - 8) = w2
				; CHECK-NEXT: *(u32 *)(r10 - 4) = w0
				; CHECK-NEXT: *(u32 *)(r10 - 12) = w3
				; CHECK-NEXT: w1 s%= w3 # encoding: [0x9c,0x31,0x01,0x00,0x00,0x00,0x00,0x00]
				; CHECK-NEXT: w0 s/= w2 # encoding: [0x3c,0x20,0x01,0x00,0x00,0x00,0x00,0x00]

				; Function Attrs: nounwind
				define dso_local i64 @bar(i64 noundef %a, i64 noundef %b, i64 noundef %c) #0 {
				entry:
				%a.addr = alloca i64, align 8
				%b.addr = alloca i64, align 8
				%c.addr = alloca i64, align 8
				store i64 %a, ptr %a.addr, align 8, !tbaa !7
				store i64 %b, ptr %b.addr, align 8, !tbaa !7
				store i64 %c, ptr %c.addr, align 8, !tbaa !7
				%0 = load i64, ptr %a.addr, align 8, !tbaa !7
				%1 = load i64, ptr %b.addr, align 8, !tbaa !7
				%div = sdiv i64 %0, %1
				%2 = load i64, ptr %a.addr, align 8, !tbaa !7
				%3 = load i64, ptr %c.addr, align 8, !tbaa !7
				%rem = srem i64 %2, %3
				%add = add nsw i64 %div, %rem
				ret i64 %add
				}

				; CHECK: r0 = r1
				; CHECK-NEXT: *(u64 *)(r10 - 16) = r2
				; CHECK-NEXT: *(u64 *)(r10 - 8) = r0
				; CHECK-NEXT: *(u64 *)(r10 - 24) = r3
				; CHECK-NEXT: r1 s%= r3 # encoding: [0x9f,0x31,0x01,0x00,0x00,0x00,0x00,0x00]
				; CHECK-NEXT: r0 s/= r2 # encoding: [0x3f,0x20,0x01,0x00,0x00,0x00,0x00,0x00]

				attributes #0 = { nounwind "frame-pointer"="all" "no-trapping-math"="true" "stack-protector-buffer-size"="8" }

				!llvm.module.flags = !{!0, !1}
				!llvm.ident = !{!2}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{i32 7, !"frame-pointer", i32 2}
				!2 = !{!"clang version 17.0.0 (https://github.com/llvm/llvm-project.git 569bd3b841e3167ddd7c6ceeddb282d3c280e761)"}
				!3 = !{!4, !4, i64 0}
				!4 = !{!"int", !5, i64 0}
				!5 = !{!"omnipotent char", !6, i64 0}
				!6 = !{!"Simple C/C++ TBAA"}
				!7 = !{!8, !8, i64 0}
				!8 = !{!"long", !5, i64 0}
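
The `# encoding:` byte lists in the tests above can be read field by field. The sketch below is an illustration only, not part of the patch: the struct `BPFInsnFields` and the function `decodeLE` are hypothetical names, and the layout assumed is the little-endian (bpfel) BPF instruction format with one opcode byte, one register byte (destination in the low nibble, source in the high nibble), a 16-bit offset, and a 32-bit immediate.

```cpp
#include <array>
#include <cstdint>

// Hypothetical container for the fields of one 8-byte BPF instruction.
struct BPFInsnFields {
  uint8_t Opcode;
  uint8_t Dst;
  uint8_t Src;
  int16_t Off;
  int32_t Imm;
};

// Hypothetical decoder for the little-endian (bpfel) layout assumed above.
BPFInsnFields decodeLE(const std::array<uint8_t, 8> &B) {
  BPFInsnFields F;
  F.Opcode = B[0];
  F.Dst = B[1] & 0x0f; // low nibble: destination register
  F.Src = B[1] >> 4;   // high nibble: source register
  F.Off = static_cast<int16_t>(
      static_cast<uint16_t>(B[2] | (static_cast<uint16_t>(B[3]) << 8)));
  F.Imm = static_cast<int32_t>(static_cast<uint32_t>(B[4]) |
                               (static_cast<uint32_t>(B[5]) << 8) |
                               (static_cast<uint32_t>(B[6]) << 16) |
                               (static_cast<uint32_t>(B[7]) << 24));
  return F;
}
```

Under these assumptions, the `bswap16 r1` bytes decode to opcode 0xd7 with destination register 1 and immediate 16, the `r0 = (s32)w1` bytes decode to opcode 0xbf with offset 32, and the `w0 s/= w2` bytes decode to opcode 0x3c with offset 1. This suggests that the new moves and signed divisions are distinguished from their existing counterparts by the offset field.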
