This is an archive of the discontinued LLVM Phabricator instance.

[BPF] Add a few new insns under cpu=v4
ClosedPublic

Authored by yonghong-song on Feb 26 2023, 9:51 AM.

Details

Summary

In [1], a few new insns are proposed to extend the BPF ISA to

. fix the limitation of existing insns (e.g., the 16-bit jmp offset)
. add new insns which may improve code quality (sign_ext_ld, sign_ext_mov, st)
. complete the feature set (sdiv, smod)
. improve user experience (bswap)

This patch implements insn encoding for

. sign-extended load
. sign-extended mov 
. sdiv/smod
. bswap insns
. unconditional jump with 32bit offset

The new bswap insns are generated under cpu=v4 for __builtin_bswap.
For cpu=v3 or earlier, __builtin_bswap generates be or le insns,
which is not intuitive for the user.

To support 32-bit branch offsets, a 32-bit ja (JMPL) insn is implemented.
For a conditional branch whose target is beyond the 16-bit offset range, llvm
will do the transformation 'cond_jmp' -> 'cond_jmp + jmpl' to simulate a 32-bit
conditional jmp. See BPFMIPeephole.cpp for details. The algorithm is
heuristic based. I have tested bpf selftest pyperf600 with unroll count
600, which can indeed generate a 32-bit jump insn, e.g.,

13:       06 00 00 00 9b cd 00 00 gotol +0xcd9b <LBB0_6619>

Eduard is working on adding the 'st' insn to cpu=v4.

A list of llc flags:

disable-ldsx, disable-movsx, disable-bswap,
disable-sdiv-smod, disable-gotol

can be used to disable a particular insn for cpu v4.
For example, a user can run:

llc -march=bpf -mcpu=v4 -disable-movsx t.ll

to enable cpu v4 without movsx insns.

References:

[1] https://lore.kernel.org/bpf/4bfe98be-5333-1c7e-2f6d-42486c8ec039@meta.com/

Diff Detail

Event Timeline

yonghong-song created this revision.Feb 26 2023, 9:51 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 26 2023, 9:51 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
yonghong-song requested review of this revision.Feb 26 2023, 9:51 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptFeb 26 2023, 9:51 AM
yonghong-song edited the summary of this revision. (Show Details)Feb 26 2023, 10:35 AM
yonghong-song added reviewers: ast, jemarch.
yonghong-song added a subscriber: anakryiko.

Hi Yonghong,

Left a few nitpicks and one comment in BPFMIPreEmitPeephole::adjustBranch() that I think points to a bug.
Overall the adjustBranch() algorithm looks good. It would be great to have some test cases for it, e.g. preprocess a test .ll by replacing some template with a bunch of noops or something like this.
The instruction encoding seems to match the mailing list description.
Also, one check-all test is failing for me:

home/eddy/work/llvm-project/clang/test/Misc/target-invalid-cpu-note.c:76:14: error: BPF-NEXT: expected string not found in input
// BPF-NEXT: note: valid target CPU values are: generic, v1, v2, v3, probe{{$}}
llvm/lib/Target/BPF/BPFInstrFormats.td
93

Nitpick: the mailing list doc refers to this as BPF_SMEM.

llvm/lib/Target/BPF/BPFMIPeephole.cpp
335

Nitpick: this would not be executed for -O0, but is required for correct execution.

void BPFPassConfig::addPreEmitPass() {
  addPass(createBPFMIPreEmitCheckingPass());
  if (getOptLevel() != CodeGenOpt::None)
    if (!DisableMIPeephole)
      addPass(createBPFMIPreEmitPeepholePass());
}
456

As far as I understand:

  • SoFarNumInsns[JmpBB] is the number of instructions from the function start till the end of JmpBB;
  • CurrNumInsns is the number of instructions from the function start till the end of MBB.

So, SoFarNumInsns[JmpBB] - CurrNumInsns gives the distance between the basic block ends. However, the jump would happen to the basic block start, so the actual distance should be computed as SoFarNumInsns[JmpBB] - JmpBB.size() - CurrNumInsns.

Am I confused?

483

Is it possible to rewrite as below instead?

B2:     ...
        if (!cond) goto B3
        gotol B5
B3: ...

Seems to be equivalent but with fewer instructions.

553

Nitpick: (Dist <= INT16_MAX && Dist >= INT16_MIN) is used in the previous two cases.

llvm/lib/Target/BPF/MCTargetDesc/BPFAsmBackend.cpp
108

This is because Value is in bytes, right?
Could you please drop a comment here.

  • Fixed issues reported by Eduard
  • llvm-objdump issue (as stated in 'Summary') is not resolved yet.
yonghong-song edited the summary of this revision. (Show Details)
  • Fixed previous llvm-objdump issue for '> 16bit' 'gotol' insns.
  • Now basic functionality for cpu=v4 should be complete in llvm; further work will focus on the kernel.
  • added support for new instructions in inline assembly.
  • avoid changing conditions during JMP -> JMPL conversion. Otherwise, verification may fail in some cases.

Hi Yonghong,

What is the current plan for these changes?
I'd like to refresh D140804 to make BPF_ST instruction available for cpu v4.
I see that the latest CI run failed because of libcxx issue, I think it is completely unrelated to this revision, see here.

Eduard, please go ahead and refresh the BPF_ST patch (D140804). For this patch, we will hold off landing it until the kernel patch has addressed all major comments. When it is ready to merge, I will comment and let you know.

yonghong-song edited the summary of this revision. (Show Details)
  • rename some insn names or mode names (movs -> movsx, lds -> ldsx, MEMS -> MEMSX) etc. to be consistent with the kernel.
  • add 5 llc flags to control on/off for each kind of insn (sdiv/smod, ldsx, movsx, bswap, gotol) for debugging purposes.
ast added inline comments.Jul 14 2023, 8:49 AM
llvm/lib/Target/BPF/BPFInstrInfo.td
56

Here and elsewhere... let's drop the CPUv4 mid prefix. imo the extra verbosity doesn't improve readability.
Same with the flag: disable-cpuv4-movsx. It can be disable-movsx.

s/BPFHasCPUv4_ldsx/BPFHasLdsx/
s/getCPUv4_bswap/getHasBswap/ or even shorter hasBswap ?

yonghong-song added inline comments.Jul 14 2023, 4:17 PM
llvm/lib/Target/BPF/BPFInstrInfo.td
56

Makes sense. Will do. Ya, hasBswap is good enough to capture what it means.

yonghong-song edited the summary of this revision. (Show Details)
  • Dropping 'CPUv4' in some variable/function names and also in debug flags.
ast accepted this revision.Jul 17 2023, 7:20 PM

overall looks good. one small nit.

llvm/lib/Target/BPF/MCTargetDesc/BPFMCFixups.h
18

a little bit too much of copy paste :)

This revision is now accepted and ready to land.Jul 17 2023, 7:20 PM
  • remove a copy-paste comment from s390 arch
yonghong-song retitled this revision from [WIP][BPF] Add a few new insns under cpu=v4 to [BPF] Add a few new insns under cpu=v4.Jul 19 2023, 4:48 PM
ast accepted this revision.Jul 19 2023, 5:07 PM

lgtm. @eddyz87 pls take a look

eddyz87 added a comment.EditedJul 20 2023, 7:33 AM

I tried adding a test similar to assemble-disassemble.ll:

// RUN: llvm-mc -triple bpfel --mcpu=v4 --assemble --filetype=obj %s \
// RUN:   | llvm-objdump -d --mattr=+alu32 - \
// RUN:   | FileCheck %s

// CHECK: d7 01 00 00 10 00 00 00	r1 = bswap16 r1
// CHECK: d7 02 00 00 20 00 00 00	r2 = bswap32 r2
// CHECK: d7 03 00 00 40 00 00 00	r3 = bswap64 r3
r1 = bswap16 r1
r2 = bswap32 r2
r3 = bswap64 r3

// CHECK: 91 41 00 00 00 00 00 00	r1 = *(s8 *)(r4 + 0x0)
// CHECK: 89 52 04 00 00 00 00 00	r2 = *(s16 *)(r5 + 0x4)
// CHECK: 81 63 08 00 00 00 00 00	r3 = *(s32 *)(r6 + 0x8)
r1 = *(s8 *)(r4 + 0)
r2 = *(s16 *)(r5 + 4)
r3 = *(s32 *)(r6 + 8)

// CHECK: 91 41 00 00 00 00 00 00	w1 = *(s8 *)(r4 + 0x0)
// CHECK: 89 52 04 00 00 00 00 00	w2 = *(s16 *)(r5 + 0x4)
w1 = *(s8 *)(r4 + 0)
w2 = *(s16 *)(r5 + 4)

// CHECK: bf 41 08 00 00 00 00 00	r1 = (s8)r4
// CHECK: bf 52 10 00 00 00 00 00	r2 = (s16)r5
// CHECK: bf 63 20 00 00 00 00 00	r3 = (s32)w6
r1 = (s8)r4
r2 = (s16)r5
r3 = (s32)w6
// Should this work as well: r3 = (s32)r6 ?

// CHECK: bc 31 08 00 00 00 00 00	w1 = (s8)w3
// CHECK: bc 42 10 00 00 00 00 00	w2 = (s16)w4
w1 = (s8)w3
w2 = (s16)w4

// CHECK: 3f 31 01 00 00 00 00 00	r1 s/= r3
// CHECK: 9f 42 01 00 00 00 00 00	r2 s%= r4
r1 s/= r3
r2 s%= r4

// CHECK: 3c 31 01 00 00 00 00 00	w1 s/= w3
// CHECK: 9c 42 01 00 00 00 00 00	w2 s%= w4
w1 s/= w3
w2 s%= w4

And it looks like some instructions are not printed correctly:

$ llvm-mc -triple bpfel --mcpu=v4 --assemble --filetype=obj /home/eddy/work/llvm-project/llvm/test/CodeGen/BPF/assembler-disassembler-v4.s | llvm-objdump -d --mattr=+alu32 -

<stdin>:	file format elf64-bpf

Disassembly of section .text:

0000000000000000 <.text>:
       0:	d7 01 00 00 10 00 00 00	r1 = bswap16 r1
       1:	d7 02 00 00 20 00 00 00	r2 = bswap32 r2
       2:	d7 03 00 00 40 00 00 00	r3 = bswap64 r3
       3:	91 41 00 00 00 00 00 00	w1 = *(s8 *)(r4 + 0x0)
       4:	89 52 04 00 00 00 00 00	w2 = *(s16 *)(r5 + 0x4)
       5:	81 63 08 00 00 00 00 00	<unknown>
       6:	91 41 00 00 00 00 00 00	w1 = *(s8 *)(r4 + 0x0)
       7:	89 52 04 00 00 00 00 00	w2 = *(s16 *)(r5 + 0x4)
       8:	bf 41 08 00 00 00 00 00	r1 = (s8)r4
       9:	bf 52 10 00 00 00 00 00	r2 = (s16)r5
      10:	bf 63 20 00 00 00 00 00	r3 = (s32)w6
      11:	bc 31 08 00 00 00 00 00	w1 = (s8)w3
      12:	bc 42 10 00 00 00 00 00	w2 = (s16)w4
      13:	3f 31 01 00 00 00 00 00	r1 s/= r3
      14:	9f 42 01 00 00 00 00 00	r2 s%= r4
      15:	3c 31 01 00 00 00 00 00	w1 s/= w3
      16:	9c 42 01 00 00 00 00 00	w2 s%= w4

I'm not sure if this is an issue with disassembler or some additional --mattr options are needed.

llvm/lib/Target/BPF/BPFInstrInfo.td
379

I think it is possible to avoid matching the expansion pattern (sra (shl GPR:$src, (i64 56))) here, and instead turn off the expansion when movsx is available.

I tried the change below and all BPF codegen tests are passing. Am I missing something?


diff --git a/llvm/lib/Target/BPF/BPFISelLowering.cpp b/llvm/lib/Target/BPF/BPFISelLowering.cpp
index 9a7357d6ad04..5e84af009591 100644
--- a/llvm/lib/Target/BPF/BPFISelLowering.cpp
+++ b/llvm/lib/Target/BPF/BPFISelLowering.cpp
@@ -132,9 +132,11 @@ BPFTargetLowering::BPFTargetLowering(const TargetMachine &TM,
   setOperationAction(ISD::CTLZ_ZERO_UNDEF, MVT::i64, Custom);
 
   setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i1, Expand);
-  setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i8, Expand);
-  setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i16, Expand);
-  setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i32, Expand);
+  if (!STI.hasMovsx()) {
+    setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i8, Expand);
+    setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i16, Expand);
+    setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i32, Expand);
+  }
 
   // Extended load operations for i1 types must be promoted
   for (MVT VT : MVT::integer_valuetypes()) {
diff --git a/llvm/lib/Target/BPF/BPFInstrInfo.td b/llvm/lib/Target/BPF/BPFInstrInfo.td
index a1d532e60db2..29bec72aa92d 100644
--- a/llvm/lib/Target/BPF/BPFInstrInfo.td
+++ b/llvm/lib/Target/BPF/BPFInstrInfo.td
@@ -376,11 +376,11 @@ let Predicates = [BPFHasMovsx] in {
   def MOVSX_rr_8 : ALU_RR<BPF_ALU64, BPF_MOV, 8,
                       (outs GPR:$dst), (ins GPR:$src),
                       "$dst = (s8)$src",
-                      [(set GPR:$dst, (sra (shl GPR:$src, (i64 56)), (i64 56)))]>;
+                      [(set GPR:$dst, (sext_inreg GPR:$src, i8))]>;
   def MOVSX_rr_16 : ALU_RR<BPF_ALU64, BPF_MOV, 16,
                       (outs GPR:$dst), (ins GPR:$src),
                       "$dst = (s16)$src",
-                      [(set GPR:$dst, (sra (shl GPR:$src, (i64 48)), (i64 48)))]>;
+                      [(set GPR:$dst, (sext_inreg GPR:$src, i16))]>;
   def MOVSX_rr_32 : ALU_RR<BPF_ALU64, BPF_MOV, 32,
                       (outs GPR:$dst), (ins GPR32:$src),
                       "$dst = (s32)$src",
@@ -388,11 +388,11 @@ let Predicates = [BPFHasMovsx] in {
   def MOVSX_rr_32_8 : ALU_RR<BPF_ALU, BPF_MOV, 8,
                       (outs GPR32:$dst), (ins GPR32:$src),
                       "$dst = (s8)$src",
-                      [(set GPR32:$dst, (sra (shl GPR32:$src, (i32 24)), (i32 24)))]>;
+                      [(set GPR32:$dst, (sext_inreg GPR32:$src, i8))]>;
   def MOVSX_rr_32_16 : ALU_RR<BPF_ALU, BPF_MOV, 16,
                       (outs GPR32:$dst), (ins GPR32:$src),
                       "$dst = (s16)$src",
-                      [(set GPR32:$dst, (sra (shl GPR32:$src, (i32 16)), (i32 16)))]>;
+                      [(set GPR32:$dst, (sext_inreg GPR32:$src, i16))]>;
 }
 }
llvm/lib/Target/BPF/BPFMIPeephole.cpp
321

Is this map unused?

412

Nitpick: Fangrui suggested in my llvm-objdump revisions to use DenseMap in most cases (as std::map allocates for each pair).

llvm/test/CodeGen/BPF/movsx.ll
31

This does not seem right, as it does not sign extend the 8-bit argument to a 16-bit value.

39

Shouldn't this be w0 = (s8)w1?
A few checks below also look strange.

There is a problem in the td file for the 32-bit signed load. The current definition is not quite right since it is supposed to sign-extend all the way to 64 bits. I will fix it in the next revision.

yonghong-song added inline comments.Jul 24 2023, 12:26 AM
llvm/lib/Target/BPF/BPFInstrInfo.td
379

This indeed can simplify the code. I will incorporate your change into the patch. Thanks!

llvm/lib/Target/BPF/BPFMIPeephole.cpp
321

No. This is a leftover. Will remove.

412

Will try to use DenseMap.

llvm/test/CodeGen/BPF/movsx.ll
31

This is probably due to ABI. For example,

$ cat t1.c
__attribute__((noinline)) short f1(char a) {
  return a * a;
}

int f2(int a) {
  return f1(a);
}


$ clang --target=bpf -O2 -mcpu=v4 -S t1.c

f1:                                     # @f1
# %bb.0:                                # %entry
        w0 = w1
        w0 *= w0
        exit
.Lfunc_end0:
        .size   f1, .Lfunc_end0-f1
                                        # -- End function
        .globl  f2                              # -- Begin function f2
        .p2align        3
        .type   f2,@function
f2:                                     # @f2
# %bb.0:                                # %entry
        w1 = (s8)w1
        call f1
        w0 = (s16)w0
        exit

You can see that in function f2(), the sign extension has been done properly, and that is probably the reason the compiler didn't generate proper sign extension code in f1().

I will modify the test to generate proper sign extension like the above f2().

Hi Yonghong,

Thank you for the comments.
Could you please also add a few tests for gotol?
Sorry, I should have asked for those last week.

Could you please also add a few tests for gotol?

Will do!

Three major changes in this patch:

  • for ldsx insns, remove the 32-bit ldsx insns (1-byte and 2-byte sign extension) since an ldsx insn is expected to sign extend all the way up to 8 bytes, while a normal 32-bit insn (e.g. BPF_ALU) is expected to zero out the top bits. Instead, do a ldbsx/ldhsx and then take the lower 4 bytes to extract the 32-bit value. This also resolved one disasm issue reported by Eduard.
  • for the movsx insn, for 32-bit sign extension to 64-bit, match both "sext_inreg GPR:$src, i32" (left and right shifting) and "sext GPR32:$src".
  • Add an internal flag to control when to generate gotol insns in BPFMIPeephole.cpp. This permits a simpler test for gotol insns.

With the above changes, the following change is needed:

diff --git a/tools/testing/selftests/bpf/progs/verifier_movsx.c b/tools/testing/selftests/bpf/progs/verifier_movsx.c
index 5ee7d004f8ba..e27bfa11c9b3 100644
--- a/tools/testing/selftests/bpf/progs/verifier_movsx.c
+++ b/tools/testing/selftests/bpf/progs/verifier_movsx.c
@@ -59,7 +59,7 @@ __naked void mov64sx_s32(void)
 {
        asm volatile ("                                 \
        r0 = 0xfffffffe;                                \
-       r0 = (s32)w0;                                   \
+       r0 = (s32)r0;                                   \
        r0 >>= 1;                                       \
        exit;                                           \
 "      ::: __clobber_all);
@@ -181,7 +181,7 @@ __naked void mov64sx_s32_range(void)
 {
        asm volatile ("                                 \
        call %[bpf_get_prandom_u32];                    \
-       r1 = (s32)w0;                                   \
+       r1 = (s32)r0;                                   \
        /* r1 with s32 range */                         \
        if r1 s> 0x7fffffff goto l0_%=;                 \
        if r1 s< -0x80000000 goto l0_%=;                \

in order to compile kernel cpu v4 support (patch series v3)

https://lore.kernel.org/bpf/20230720000103.99949-1-yhs@fb.com/

I will update the kernel side once all llvm issues are resolved.

eddyz87 accepted this revision.Jul 25 2023, 8:27 AM

Hi Yonghong,

Looks good to me, thanks!
Before landing this, could you please adjust the tests a little bit more?

  • Extend assembler-disassembler-v4.s with signed div and mod, e.g.:
// CHECK: 3f 31 01 00 00 00 00 00	r1 s/= r3
// CHECK: 9f 42 01 00 00 00 00 00	r2 s%= r4
r1 s/= r3
r2 s%= r4

// CHECK: 3c 31 01 00 00 00 00 00	w1 s/= w3
// CHECK: 9c 42 01 00 00 00 00 00	w2 s%= w4
w1 s/= w3
w2 s%= w4
  • For gotol add a test case which tries each possibility in BPFMIPreEmitPeephole::adjustBranch()
  • Add more tests in assembler-disassembler-v4.s and gotol.ll.
This revision was landed with ongoing or failed builds.Jul 26 2023, 8:37 AM
This revision was automatically updated to reflect the committed changes.