This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Add new pass to transform undef to pseudo for vector values.
ClosedPublic

Authored by BeMg on Jul 14 2022, 12:38 AM.

Details

Summary

RISC-V vector instruction has register overlapping constraint for certain
instructions, and will cause illegal instruction trap if violated, we use
early clobber to model this constraint, but it can't prevent register allocator
allocated same or overlapped if the input register is undef value, so convert
IMPLICIT_DEF to temporary pseudo could prevent that happen, it's not best way
to resolve this. Ideally we should model the constraint right, but before we
model the constraint right, it's the approach to prevent that happen.

See also: https://github.com/llvm/llvm-project/issues/50157

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
BeMg added a comment.Oct 21 2022, 12:39 AM

Does this patch work for this test case

define internal void @foo() {
loopIR.preheader.i.i:
  %v15 = tail call <vscale x 1 x i16> @llvm.experimental.stepvector.nxv1i16()
  %v17 = tail call <vscale x 8 x i16> @llvm.vector.insert.nxv8i16.nxv1i16(<vscale x 8 x i16> poison, <vscale x 1 x i16> %v15, i64 0)
  %vs12.i.i.i = add <vscale x 1 x i16> %v15, shufflevector (<vscale x 1 x i16> insertelement (<vscale x 1 x i16> poison, i16 1, i32 0), <vscale x 1 x i16> poison, <vscale x 1 x i32> zeroinitializer)
  %v18 = tail call <vscale x 8 x i16> @llvm.vector.insert.nxv8i16.nxv1i16(<vscale x 8 x i16> poison, <vscale x 1 x i16> %vs12.i.i.i, i64 0)
  %vs16.i.i.i = add <vscale x 1 x i16> %v15, shufflevector (<vscale x 1 x i16> insertelement (<vscale x 1 x i16> poison, i16 3, i32 0), <vscale x 1 x i16> poison, <vscale x 1 x i32> zeroinitializer)
  %v20 = tail call <vscale x 8 x i16> @llvm.vector.insert.nxv8i16.nxv1i16(<vscale x 8 x i16> poison, <vscale x 1 x i16> %vs16.i.i.i, i64 0)
  br label %loopIR3.i.i

loopIR3.i.i:                                      ; preds = %loopIR3.i.i, %loopIR.preheader.i.i
  %v37 = load <vscale x 8 x i8>, ptr addrspace(1) null, align 8
  %v38 = tail call <vscale x 8 x i8> @llvm.riscv.vrgatherei16.vv.nxv8i8.i64(<vscale x 8 x i8> undef, <vscale x 8 x i8> %v37, <vscale x 8 x i16> %v17, i64 4)
  %v40 = tail call <vscale x 8 x i8> @llvm.riscv.vrgatherei16.vv.nxv8i8.i64(<vscale x 8 x i8> undef, <vscale x 8 x i8> %v37, <vscale x 8 x i16> %v18, i64 4)
  %v42 = and <vscale x 8 x i8> %v38, %v40
  %v46 = tail call <vscale x 8 x i8> @llvm.riscv.vrgatherei16.vv.nxv8i8.i64(<vscale x 8 x i8> undef, <vscale x 8 x i8> %v37, <vscale x 8 x i16> %v20, i64 4)
  %v60 = and <vscale x 8 x i8> %v42, %v46
  store <vscale x 8 x i8> %v60, ptr addrspace(1) null, align 4
  br label %loopIR3.i.i
}

declare <vscale x 1 x i16> @llvm.experimental.stepvector.nxv1i16()

declare <vscale x 8 x i16> @llvm.vector.insert.nxv8i16.nxv1i16(<vscale x 8 x i16>, <vscale x 1 x i16>, i64 immarg) 

declare <vscale x 8 x i8> @llvm.riscv.vrgatherei16.vv.nxv8i8.i64(<vscale x 8 x i8>, <vscale x 8 x i8>, <vscale x 8 x i16>, i64)

This IR doesn't generate the undef+early-clobber situation, so this pass will not work on it.

RA result will not break the early-clobber constraint in current compiler.

Corrsponding MachineInst before RA place below:

********** MACHINEINSTRS **********
# Machine code for function foo: NoPHIs, TracksLiveness, TiedOpsRewritten, TracksDebugUserValues

0B      bb.0.loopIR.preheader.i.i:
          successors: %bb.1(0x80000000); %bb.1(100.00%)

16B       dead %16:gpr = PseudoVSETVLIX0 $x0, 206, implicit-def $vl, implicit-def $vtype
32B       undef %0.sub_vrm1_0:vrm2 = PseudoVID_V_MF4 -1, 4, implicit $vl, implicit $vtype
64B       undef %1.sub_vrm1_0:vrm2 = PseudoVADD_VI_MF4 %0.sub_vrm1_0:vrm2, 1, -1, 4, implicit $vl, implicit $vtype
96B       undef %2.sub_vrm1_0:vrm2 = PseudoVADD_VI_MF4 %0.sub_vrm1_0:vrm2, 3, -1, 4, implicit $vl, implicit $vtype

128B    bb.1.loopIR3.i.i:
        ; predecessors: %bb.0, %bb.1
          successors: %bb.1(0x80000000); %bb.1(100.00%)

160B      %10:vr = VL1RE8_V $x0 :: (load unknown-size from `ptr addrspace(1) null`, align 8, addrspace 1)
176B      dead $x0 = PseudoVSETIVLI 4, 192, implicit-def $vl, implicit-def $vtype
192B      early-clobber %11:vr = PseudoVRGATHEREI16_VV_M1_M2 %10:vr, %0:vrm2, 4, 3, implicit $vl, implicit $vtype
208B      early-clobber %12:vr = PseudoVRGATHEREI16_VV_M1_M2 %10:vr, %1:vrm2, 4, 3, implicit $vl, implicit $vtype
224B      dead %17:gpr = PseudoVSETVLIX0 $x0, 192, implicit-def $vl, implicit-def $vtype
240B      %13:vr = PseudoVAND_VV_M1 %11:vr, %12:vr, -1, 3, implicit $vl, implicit $vtype
256B      dead $x0 = PseudoVSETIVLI 4, 192, implicit-def $vl, implicit-def $vtype
272B      early-clobber %14:vr = PseudoVRGATHEREI16_VV_M1_M2 %10:vr, %2:vrm2, 4, 3, implicit $vl, implicit $vtype
288B      dead %18:gpr = PseudoVSETVLIX0 $x0, 192, implicit-def $vl, implicit-def $vtype
304B      %15:vr = PseudoVAND_VV_M1 %13:vr, %14:vr, -1, 3, implicit $vl, implicit $vtype
320B      VS1R_V %15:vr, $x0 :: (store unknown-size into `ptr addrspace(1) null`, align 4, addrspace 1)
336B      PseudoBR %bb.1

Generated assembly see the note inline.

foo:                                    # @foo
	.cfi_startproc
# %bb.0:                                # %loopIR.preheader.i.i
	vsetvli	a0, zero, e16, mf4, ta, ma
	vid.v	v8
	vadd.vi	v10, v8, 1
	vadd.vi	v12, v8, 3
.LBB0_1:                                # %loopIR3.i.i
                                        # =>This Inner Loop Header: Depth=1
	vl1r.v	v14, (zero)
	vsetivli	zero, 4, e8, m1, ta, ma
	vrgatherei16.vv	v15, v14, v8  <- The v14 here is LMUL=2 so it's v14 and v15. This means writing v15 violated the early clobber constraint.
	vrgatherei16.vv	v16, v14, v10
	vsetvli	a0, zero, e8, m1, ta, ma
	vand.vv	v15, v15, v16
	vsetivli	zero, 4, e8, m1, ta, ma
	vrgatherei16.vv	v16, v14, v12
	vsetvli	a0, zero, e8, m1, ta, ma
	vand.vv	v14, v15, v16
	vs1r.v	v14, (zero)
	j	.LBB0_1
.Lfunc_end0:
	.size	foo, .Lfunc_end0-foo
	.cfi_endproc
                                        # -- End function
	.section	".note.GNU-stack","",@progbits

Is this MIR wrong here? Does %10 should be marked as vrm2?
Compiler treat %11 and %10 as M1, %0 as M2, so i think RA doesn't violated the early clobber constraint.

192B      early-clobber %11:vr = PseudoVRGATHEREI16_VV_M1_M2 %10:vr, %0:vrm2, 4, 3, implicit $vl, implicit $vtype
vrgatherei16.vv	v15, v14, v8 
// %11 -> v15
// %10 -> v14
// %0 -> v8

Compiler Register allocation result:

********** REWRITE VIRTUAL REGISTERS **********
********** Function: foo
********** REGISTER MAP **********
[%0 -> $v8m2] VRM2
[%1 -> $v10m2] VRM2
[%2 -> $v12m2] VRM2
[%10 -> $v14] VR
[%11 -> $v15] VR
[%12 -> $v16] VR
[%13 -> $v15] VR
[%14 -> $v16] VR
[%15 -> $v14] VR
[%16 -> $x10] GPR
[%17 -> $x10] GPR
[%18 -> $x10] GPR
BeMg updated this revision to Diff 469502.Oct 21 2022, 1:25 AM
  1. Update the approach that finding the VR super class.

Does this patch work for this test case

define internal void @foo() {
loopIR.preheader.i.i:
  %v15 = tail call <vscale x 1 x i16> @llvm.experimental.stepvector.nxv1i16()
  %v17 = tail call <vscale x 8 x i16> @llvm.vector.insert.nxv8i16.nxv1i16(<vscale x 8 x i16> poison, <vscale x 1 x i16> %v15, i64 0)
  %vs12.i.i.i = add <vscale x 1 x i16> %v15, shufflevector (<vscale x 1 x i16> insertelement (<vscale x 1 x i16> poison, i16 1, i32 0), <vscale x 1 x i16> poison, <vscale x 1 x i32> zeroinitializer)
  %v18 = tail call <vscale x 8 x i16> @llvm.vector.insert.nxv8i16.nxv1i16(<vscale x 8 x i16> poison, <vscale x 1 x i16> %vs12.i.i.i, i64 0)
  %vs16.i.i.i = add <vscale x 1 x i16> %v15, shufflevector (<vscale x 1 x i16> insertelement (<vscale x 1 x i16> poison, i16 3, i32 0), <vscale x 1 x i16> poison, <vscale x 1 x i32> zeroinitializer)
  %v20 = tail call <vscale x 8 x i16> @llvm.vector.insert.nxv8i16.nxv1i16(<vscale x 8 x i16> poison, <vscale x 1 x i16> %vs16.i.i.i, i64 0)
  br label %loopIR3.i.i

loopIR3.i.i:                                      ; preds = %loopIR3.i.i, %loopIR.preheader.i.i
  %v37 = load <vscale x 8 x i8>, ptr addrspace(1) null, align 8
  %v38 = tail call <vscale x 8 x i8> @llvm.riscv.vrgatherei16.vv.nxv8i8.i64(<vscale x 8 x i8> undef, <vscale x 8 x i8> %v37, <vscale x 8 x i16> %v17, i64 4)
  %v40 = tail call <vscale x 8 x i8> @llvm.riscv.vrgatherei16.vv.nxv8i8.i64(<vscale x 8 x i8> undef, <vscale x 8 x i8> %v37, <vscale x 8 x i16> %v18, i64 4)
  %v42 = and <vscale x 8 x i8> %v38, %v40
  %v46 = tail call <vscale x 8 x i8> @llvm.riscv.vrgatherei16.vv.nxv8i8.i64(<vscale x 8 x i8> undef, <vscale x 8 x i8> %v37, <vscale x 8 x i16> %v20, i64 4)
  %v60 = and <vscale x 8 x i8> %v42, %v46
  store <vscale x 8 x i8> %v60, ptr addrspace(1) null, align 4
  br label %loopIR3.i.i
}

declare <vscale x 1 x i16> @llvm.experimental.stepvector.nxv1i16()

declare <vscale x 8 x i16> @llvm.vector.insert.nxv8i16.nxv1i16(<vscale x 8 x i16>, <vscale x 1 x i16>, i64 immarg) 

declare <vscale x 8 x i8> @llvm.riscv.vrgatherei16.vv.nxv8i8.i64(<vscale x 8 x i8>, <vscale x 8 x i8>, <vscale x 8 x i16>, i64)

This IR doesn't generate the undef+early-clobber situation, so this pass will not work on it.

RA result will not break the early-clobber constraint in current compiler.

Corrsponding MachineInst before RA place below:

********** MACHINEINSTRS **********
# Machine code for function foo: NoPHIs, TracksLiveness, TiedOpsRewritten, TracksDebugUserValues

0B      bb.0.loopIR.preheader.i.i:
          successors: %bb.1(0x80000000); %bb.1(100.00%)

16B       dead %16:gpr = PseudoVSETVLIX0 $x0, 206, implicit-def $vl, implicit-def $vtype
32B       undef %0.sub_vrm1_0:vrm2 = PseudoVID_V_MF4 -1, 4, implicit $vl, implicit $vtype
64B       undef %1.sub_vrm1_0:vrm2 = PseudoVADD_VI_MF4 %0.sub_vrm1_0:vrm2, 1, -1, 4, implicit $vl, implicit $vtype
96B       undef %2.sub_vrm1_0:vrm2 = PseudoVADD_VI_MF4 %0.sub_vrm1_0:vrm2, 3, -1, 4, implicit $vl, implicit $vtype

128B    bb.1.loopIR3.i.i:
        ; predecessors: %bb.0, %bb.1
          successors: %bb.1(0x80000000); %bb.1(100.00%)

160B      %10:vr = VL1RE8_V $x0 :: (load unknown-size from `ptr addrspace(1) null`, align 8, addrspace 1)
176B      dead $x0 = PseudoVSETIVLI 4, 192, implicit-def $vl, implicit-def $vtype
192B      early-clobber %11:vr = PseudoVRGATHEREI16_VV_M1_M2 %10:vr, %0:vrm2, 4, 3, implicit $vl, implicit $vtype
208B      early-clobber %12:vr = PseudoVRGATHEREI16_VV_M1_M2 %10:vr, %1:vrm2, 4, 3, implicit $vl, implicit $vtype
224B      dead %17:gpr = PseudoVSETVLIX0 $x0, 192, implicit-def $vl, implicit-def $vtype
240B      %13:vr = PseudoVAND_VV_M1 %11:vr, %12:vr, -1, 3, implicit $vl, implicit $vtype
256B      dead $x0 = PseudoVSETIVLI 4, 192, implicit-def $vl, implicit-def $vtype
272B      early-clobber %14:vr = PseudoVRGATHEREI16_VV_M1_M2 %10:vr, %2:vrm2, 4, 3, implicit $vl, implicit $vtype
288B      dead %18:gpr = PseudoVSETVLIX0 $x0, 192, implicit-def $vl, implicit-def $vtype
304B      %15:vr = PseudoVAND_VV_M1 %13:vr, %14:vr, -1, 3, implicit $vl, implicit $vtype
320B      VS1R_V %15:vr, $x0 :: (store unknown-size into `ptr addrspace(1) null`, align 4, addrspace 1)
336B      PseudoBR %bb.1

Generated assembly see the note inline.

foo:                                    # @foo
	.cfi_startproc
# %bb.0:                                # %loopIR.preheader.i.i
	vsetvli	a0, zero, e16, mf4, ta, ma
	vid.v	v8
	vadd.vi	v10, v8, 1
	vadd.vi	v12, v8, 3
.LBB0_1:                                # %loopIR3.i.i
                                        # =>This Inner Loop Header: Depth=1
	vl1r.v	v14, (zero)
	vsetivli	zero, 4, e8, m1, ta, ma
	vrgatherei16.vv	v15, v14, v8  <- The v14 here is LMUL=2 so it's v14 and v15. This means writing v15 violated the early clobber constraint.
	vrgatherei16.vv	v16, v14, v10
	vsetvli	a0, zero, e8, m1, ta, ma
	vand.vv	v15, v15, v16
	vsetivli	zero, 4, e8, m1, ta, ma
	vrgatherei16.vv	v16, v14, v12
	vsetvli	a0, zero, e8, m1, ta, ma
	vand.vv	v14, v15, v16
	vs1r.v	v14, (zero)
	j	.LBB0_1
.Lfunc_end0:
	.size	foo, .Lfunc_end0-foo
	.cfi_endproc
                                        # -- End function
	.section	".note.GNU-stack","",@progbits

Is this MIR wrong here? Does %10 should be marked as vrm2?
Compiler treat %11 and %10 as M1, %0 as M2, so i think RA doesn't violated the early clobber constraint.

192B      early-clobber %11:vr = PseudoVRGATHEREI16_VV_M1_M2 %10:vr, %0:vrm2, 4, 3, implicit $vl, implicit $vtype
vrgatherei16.vv	v15, v14, v8 
// %11 -> v15
// %10 -> v14
// %0 -> v8

Compiler Register allocation result:

********** REWRITE VIRTUAL REGISTERS **********
********** Function: foo
********** REGISTER MAP **********
[%0 -> $v8m2] VRM2
[%1 -> $v10m2] VRM2
[%2 -> $v12m2] VRM2
[%10 -> $v14] VR
[%11 -> $v15] VR
[%12 -> $v16] VR
[%13 -> $v15] VR
[%14 -> $v16] VR
[%15 -> $v14] VR
[%16 -> $x10] GPR
[%17 -> $x10] GPR
[%18 -> $x10] GPR

You're right. I mixed up the operand order. I need to go look at this test again. It used to fail. Maybe we fixed it some other way.

I remember now. It only miscompiles with -riscv-enable-subreg-liveness

That produces

foo:                                    # @foo
        .cfi_startproc
# %bb.0:                                # %loopIR.preheader.i.i
        vsetvli a0, zero, e16, mf4, ta, ma
        vid.v   v8
        vadd.vi v10, v8, 1
        vadd.vi v12, v8, 3
.LBB0_1:                                # %loopIR3.i.i
                                        # =>This Inner Loop Header: Depth=1
        vl1r.v  v9, (zero)
        vsetivli        zero, 4, e8, m1, ta, ma
        vrgatherei16.vv v11, v9, v8
        vrgatherei16.vv v13, v9, v10
        vsetvli a0, zero, e8, m1, ta, ma
        vand.vv v11, v11, v13
        vsetivli        zero, 4, e8, m1, ta, ma
        vrgatherei16.vv v13, v9, v12 <- this instruction violates the early clobber constraint
        vsetvli a0, zero, e8, m1, ta, ma
        vand.vv v9, v11, v13
        vs1r.v  v9, (zero)
        j       .LBB0_1
.Lfunc_end0:
        .size   foo, .Lfunc_end0-foo
        .cfi_endproc
                                        # -- End function                        
        .section        ".note.GNU-stack","",@progbits
BeMg updated this revision to Diff 470444.Oct 25 2022, 4:29 AM

Replace Insert_subreg with PesudoInit when subreg liveness is enabled.

BeMg added a comment.Oct 25 2022, 4:47 AM

I remember now. It only miscompiles with -riscv-enable-subreg-liveness

That produces

foo:                                    # @foo
        .cfi_startproc
# %bb.0:                                # %loopIR.preheader.i.i
        vsetvli a0, zero, e16, mf4, ta, ma
        vid.v   v8
        vadd.vi v10, v8, 1
        vadd.vi v12, v8, 3
.LBB0_1:                                # %loopIR3.i.i
                                        # =>This Inner Loop Header: Depth=1
        vl1r.v  v9, (zero)
        vsetivli        zero, 4, e8, m1, ta, ma
        vrgatherei16.vv v11, v9, v8
        vrgatherei16.vv v13, v9, v10
        vsetvli a0, zero, e8, m1, ta, ma
        vand.vv v11, v11, v13
        vsetivli        zero, 4, e8, m1, ta, ma
        vrgatherei16.vv v13, v9, v12 <- this instruction violates the early clobber constraint
        vsetvli a0, zero, e8, m1, ta, ma
        vand.vv v9, v11, v13
        vs1r.v  v9, (zero)
        j       .LBB0_1
.Lfunc_end0:
        .size   foo, .Lfunc_end0-foo
        .cfi_endproc
                                        # -- End function                        
        .section        ".note.GNU-stack","",@progbits

Add one more condtion for subreg-liveness.

The assembly show as below:

        .p2align        2                               # -- Begin function foo
        .type   foo,@function
foo:                                    # @foo
        .cfi_startproc
# %bb.0:                                # %loopIR.preheader.i.i
        vsetvli a0, zero, e16, mf4, ta, ma
        vid.v   v14
        vadd.vi v15, v14, 1
        vadd.vi v16, v14, 3
        vmv1r.v v8, v14
        vmv1r.v v10, v15
        vmv1r.v v12, v16
.LBB0_1:                                # %loopIR3.i.i
                                        # =>This Inner Loop Header: Depth=1
        vl1r.v  v14, (zero)
        vsetivli        zero, 4, e8, m1, ta, ma
        vrgatherei16.vv v15, v14, v8
        vrgatherei16.vv v16, v14, v10
        vsetvli a0, zero, e8, m1, ta, ma
        vand.vv v15, v15, v16
        vsetivli        zero, 4, e8, m1, ta, ma
        vrgatherei16.vv v16, v14, v12
        vsetvli a0, zero, e8, m1, ta, ma
        vand.vv v14, v15, v16
        vs1r.v  v14, (zero)
        j       .LBB0_1
.Lfunc_end0:
        .size   foo, .Lfunc_end0-foo
        .cfi_endproc
                                        # -- End function
        .section        ".note.GNU-stack","",@progbits
BeMg updated this revision to Diff 474153.Nov 8 2022, 10:10 PM

Handle subregister liveness situation

BeMg updated this revision to Diff 474165.Nov 8 2022, 11:21 PM

Avoid infinite loop by DenseMap

BeMg retitled this revision from [WIP][RISCV] Add new pass to transform undef to pesudo for vector values. to [RISCV] Add new pass to transform undef to pesudo for vector values..Nov 9 2022, 5:13 PM
kito-cheng added inline comments.Nov 9 2022, 5:26 PM
llvm/test/CodeGen/RISCV/rvv/undef-earlyclobber-chain.ll
1

Could you add a pre-commit patch for this testcase so that could easier demonstrate what's get fixed?

kito-cheng added inline comments.Nov 9 2022, 5:39 PM
llvm/lib/Target/RISCV/RISCVRVVInitUndef.cpp
51

SmallPtrSet for Seen

BeMg updated this revision to Diff 474426.Nov 9 2022, 9:25 PM

Change DenseMap into SmallPrtSet

BeMg marked an inline comment as done.Nov 9 2022, 9:27 PM
BeMg added inline comments.
llvm/test/CodeGen/RISCV/rvv/undef-earlyclobber-chain.ll
1

New patch https://reviews.llvm.org/D137763 for test only and set up the dependency.

Should I remove the testcase in this patch?

BeMg updated this revision to Diff 474443.Nov 9 2022, 10:26 PM

Update testcase

BeMg updated this revision to Diff 476831.Nov 21 2022, 2:08 AM
  1. Record PHI node subregister change
  2. New INSERT_SUBREG insert before early-clobber instruciton
  3. Subreg index computation include PHI node
BeMg added a comment.Nov 21 2022, 2:44 AM

Handle Sub-register undef+early-clobber

For sub-registers, there is the same issue. The register allocator will also generate the program that breaks the early-clobber constraint. The reason for this situation is that the partial register used in instruction (with early-clobber flag) is undef. For example:

early-clobber %12:vr = PseudoVRGATHEREI16_VV_M1_M2 %10:vr, %1:vrm2, 4, 3, implicit $vl, implicit $vtype

->

vrgatherei16.vv v13, v9, v12

v12 is selected as VRM2, it will occupy the v12~v13. The register allocator still allocates the v13 for %12:vrm2 due to the v13 is undef for %1:vr in the register allocation stage. This is an example of how an undef subregister breaks the early-clobber constraint in the register allocation stage.

Here we propose an approach to fix this problem. The concept is the same as a normal undef register situation. We define the sub-register with pseudo instruction and remove it in the later pass (after RA).

There are three steps for this approach:

  1. Select the def-use chain from implicit_def to the first user with early-clobber constraint
  2. Compute the undef sub-register index from collecting information from INSERT_SUBREG and PHI node
  3. Insert the PseudoInit and INSERT_SUBREG for undefined sub-register after the last INSERT_SUGREG that updates the sub-register

Here we show the example with the pattern that will trigger undef+early-clobber issue.

Step 1

There are three def-use chains we need to care about in this program.

The pattern will look like

v0 = Implicit_def
…
INSERT_SUBREG | COPY | PHI
…
early-clobber rd = Op vN

Step 2

The INSERT_SUBREG node third operand is subregister index. It shows that this node defines which sub-register in the whole register. We can use the information to construct the sub-register that is undefined.

We use the LaneBitMask for this purpose.

LaneBitmask == 0xC for whole VRM2 register
LaneBitmask == 0x4 for %subreg.sub_vrm1_0 
LaneBitmask == 0x8 for %subreg.sub_vrm1_1

If we get the following def-use chain in step1

%4:vrm2 = Implicit_def
%0:vrm2 = INSERT_SUBREG %4, %subreg.sub_vrm1_0
early-clobber %11:vr = Op %0

0xC is VRM2’s LaneBitMask and 0x4 is already defined by INSERT_SUBREG in the program.

0xC & ~0x4 = 0x8 -> subreg.sub_vrm1_1

In this case, subreg.sub_vrm1_1 is the undefined sub-register before being used by early-clobber instruction.

Step 3

We can define a sub-register by INSERT_SUBREG between the last INSERT_SUBREG and the user with early-clobber. Our goal is to make sure the sub-registers are all defined before being used by early-clobber instruction.

%4:vrm2 = Implicit_def
%0:vrm2 = INSERT_SUBREG %4, %subreg.sub_vrm1_0
early-clobber %11:vr = Op %0

->

%4:vrm2 = Implicit_def
%0:vrm2 = INSERT_SUBREG %4, %subreg.sub_vrm1_0
%21:vr = PseudoRVVInitUndefM1
%22:vrm2 = INSERT_SUBREG %1:vrm2, %21:vr, %subreg.sub_vrm1_1
early-clobber %11:vr = Op %22

PHI in def-use chain

In Step 2, PHI will be seen as another instruction that will change the subregister defined region. The PHINodeLaneBitRecord will record the LaneBitMask from both predecessors, and insert the INSERT_SUBREG with this information.

BeMg updated this revision to Diff 478437.Nov 28 2022, 6:47 PM

Update testcase

pseudo is misspelled in the title

craig.topper added inline comments.Nov 29 2022, 4:41 PM
llvm/lib/Target/RISCV/RISCVRVVInitUndef.cpp
2

pesudo -> pseudo

11

"pesudo" -> "pseudo" in two places

19

pesudo -> pseudo

19

latter -> later

246

Pesudo -> Pseudo

247

regitser -> register

289

Candidata -> Candidate?

llvm/test/CodeGen/RISCV/regalloc-last-chance-recoloring-failure.ll
29

Are we treating insert_subreg for segment load tuples the same as inserting a small LMUL into a wider LMUL?

craig.topper added inline comments.Nov 29 2022, 4:44 PM
llvm/test/CodeGen/RISCV/rvv/subregister-undef-early-clobber-vrm4.mir
118 ↗(On Diff #478437)

Why is there an PseudoRVVInitUnde pseudo in the IR before the pass runs?

craig.topper added inline comments.Nov 29 2022, 4:54 PM
llvm/lib/Target/RISCV/RISCVRVVInitUndef.cpp
74

You can pass Register by value

106

origin -> original

111

Can't we do something like

VRRegClass.hasSubClassEq(MRI->getRegClass(R)) ||
VRM2RegClass.hasSubClassEq(MRI->getRegClass(R)) ||
VRM4RegClass.hasSubClassEq(MRI->getRegClass(R)) ||
VRM8RegClass.hasSubClassEq(MRI->getRegClass(R))

why do we need to use getVRLargestSuperClass?

BeMg retitled this revision from [RISCV] Add new pass to transform undef to pesudo for vector values. to [RISCV] Add new pass to transform undef to pseudo for vector values..Nov 30 2022, 12:40 AM
BeMg updated this revision to Diff 478877.Nov 30 2022, 3:15 AM
  1. Fix spell
  2. Update isVectorRegClass
BeMg updated this revision to Diff 479164.Nov 30 2022, 9:45 PM
  1. Rplace PseudoRVVInitUndef with VLE in MIR test
  2. Move MIR test into precommit
BeMg added inline comments.Dec 13 2022, 6:18 AM
llvm/test/CodeGen/RISCV/regalloc-last-chance-recoloring-failure.ll
29

This pass doesn't consider segment load as instruction that assign sub-register.

The following Insert_subreg work like put %5:vrm2 into %6:vrm4

%1:vrm4 = IMPLICIT_DEF
%5:vrm2 = PseudoVLE32_V_M2 killed %4, 0, 5 /* e32 */
%6:vrm4 = INSERT_SUBREG %1, %5, %subreg.sub_vrm2_0

Do we should treat vloxseg2ei32 as INSERT_SUBREG in this patch?

craig.topper added inline comments.Dec 13 2022, 2:40 PM
llvm/lib/Target/RISCV/RISCVRVVInitUndef.cpp
172

Why passing Insts by value? That will make a copy but it doesn't look like we need a copy.

182

Why passing Insts by value? That will make a copy but it doesn't look like we need a copy.

183

unsigned -> Register

197

unsigned -> Register

197

Why passing Insts by value? That will make a copy but it doesn't look like we need a copy.

200

unsigned -> Register

299

unsigned -> Register

300

unsigned -> Register

313

createVirtualRegister returns Register not unsigned

317

createVirtualRegister returns Register not unsigned

llvm/test/CodeGen/RISCV/regalloc-last-chance-recoloring-failure.ll
29

Nevermind, I didn't realize this test had so many undef and poison operands. I suspect llvm-reduce or bugpoint. I dislike tests with undef/poison operands. It makes things very fragile. It would be legal for DAG combine to delete a large portion of this test.

craig.topper added inline comments.Dec 13 2022, 2:52 PM
llvm/lib/Target/RISCV/RISCVRVVInitUndef.cpp
279

"the " is unnecessary in this sentence

craig.topper added inline comments.Dec 13 2022, 2:54 PM
llvm/lib/Target/RISCV/RISCVRVVInitUndef.cpp
28

"is be occupied" -> "is occupied"

Is it possible to insert the PseudoRVVInitUndef instructions after we've left SSA and the LiveIntervals have been built. Would that make it easier to find the undef lanes?

BeMg updated this revision to Diff 482694.Dec 13 2022, 8:14 PM
  1. unsigned -> Register
  2. use const std::vector<MachineInstr *> & instead of call by value
BeMg updated this revision to Diff 483741.Dec 17 2022, 4:25 AM

Reuse DetectDeadLanes pass info for undefined subregister

BeMg updated this revision to Diff 484196.Dec 20 2022, 2:08 AM

Move DetectDeadLanes pass change into another patch

craig.topper added inline comments.Dec 21 2022, 10:35 AM
llvm/lib/Target/RISCV/RISCVRVVInitUndef.cpp
202

Isn't this calculating for the entire function? But handleSubReg is called for individual instructions. So we'll be recomputing information right?

This could certainly use some new MIR tests. I didn't look super closely but I'm not sure you're correctly handling undef vs. not-undef subreg defs

llvm/lib/Target/RISCV/RISCVRVVInitUndef.cpp
54

DenseMap

91–101

I don't understand the point of this function, getSuperClasses is already sorted by largest. You can just take the first?

106

Could this use the new getPhysRegBaseClass?

108

Don't call getRegClass for each use

142

Reg.isVirtual()

201

unique_ptr

BeMg updated this revision to Diff 484791.Dec 22 2022, 3:36 AM
  1. Use unique_ptr for VRegInfo
  2. Only run once DetectDeadLanes for each function
  3. Remove Seen and PHINodeLaneBitRecord
  4. Only call getRegClass once
BeMg marked 4 inline comments as done.Dec 22 2022, 3:39 AM
BeMg added inline comments.
llvm/lib/Target/RISCV/RISCVRVVInitUndef.cpp
91–101

When this function take VRNoV0RegClassID as input. the getSuperClasses will return as following order.

AnyRegRegClassID 1
VMRegClassID 20
VRRegClassID 21 <- stop here
...

This patch only want those four RegClass as result.

106

Could this hook use for virtual register? It seem only for physical register but this pass run before register allocation.

202

Yes, you're right. we don't need recompute this info for each instruction. It call only once now.

BeMg updated this revision to Diff 485949.Jan 3 2023, 3:15 AM

Add more subregister testcase

craig.topper added inline comments.Jan 3 2023, 9:31 AM
llvm/lib/Target/RISCV/RISCVRVVInitUndef.cpp
39

Are there any maps in the code anymore?

llvm/test/CodeGen/RISCV/O3-pipeline.ll
109

If I understand correctly, we're effectively running DetectDeadLanes inside of RISCV init undef pass and then running the real DetectDeadLanes pass which won't do anything because we already did it?

craig.topper added inline comments.Jan 3 2023, 10:35 AM
llvm/test/CodeGen/RISCV/O3-pipeline.ll
109

Can we run DetectDeadLanes, then run our pass and just use the portion of the DetectDeadLanes that computes the Lane Masks in our pass?

BeMg updated this revision to Diff 486465.Jan 4 2023, 8:55 PM
  1. Move init undef pass after DetectDeadLanes
  2. Remove <map>
craig.topper added inline comments.Jan 4 2023, 9:09 PM
llvm/lib/Target/RISCV/RISCVRVVInitUndef.cpp
282

Should this be std::unique_ptr<VRegInfo[]> since it points to an array?

BeMg updated this revision to Diff 486467.Jan 4 2023, 9:26 PM

Update std::unique_ptr<VRegInfo> with std::unique_ptr<VRegInfo[]>

BeMg updated this revision to Diff 487364.Jan 9 2023, 3:01 AM

rebase

BeMg updated this revision to Diff 487663.Jan 9 2023, 8:43 PM

rebaseY

luke957 removed a subscriber: luke957.Jan 21 2023, 7:51 AM
BeMg updated this revision to Diff 494165.Feb 1 2023, 10:02 PM

Use D141993 as DetectDeadLane user interface

craig.topper added inline comments.Feb 13 2023, 3:13 PM
llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp
130

I kind of think we should keep these instructions all the way to RISCVAsmPrinter::emitInstruction. Setting the operands to "undef" says it is ok to change the operands after this pass runs, but its not. RISCVExpandPseudo runs late enough there is probably no pass that will change them. Keeping the pseudo all the way to RISCVAsmPrinter::emitInstruction removes any possibility.

llvm/lib/Target/RISCV/RISCVRVVInitUndef.cpp
88

Can this be

if (RISCV::VRM8RegClass.hasSubClassEq(RC))
  return &RISCV::VRM8RegClass;
if (RISCV::VRM4RegClass.hasSubClassEq(RC))
  return &RISCV::VRM4RegClass;
if (RISCV::VRM2RegClass.hasSubClassEq(RC))
  return &RISCV::VRM2RegClass;
if (RISCV::VRRegClass.hasSubClassEq(RC))
  return &RISCV::VRRegClass;
return RC;
117

return RISCV::PseudoRVVInitUndefM1;

152

Do we need to scan all operands? Can we check if MO is a tied use and find the operand it is tied to check the early clobber?

188

Use defs()?

188

Replace the place with llvm::any_of?

201

Can we scan uses() instead of operands()?

206

I think we wouldn't need this if we only checked uses?

229

Can we put Info.UsedLanes & ~Info.DefinedLanes into a variable? We use that expression twice

232

Lastest isn't a word

BeMg updated this revision to Diff 497281.Feb 14 2023, 5:09 AM
  1. Skip undef-init pseudo in ASMEmitter instead of removing it in PseudoExpend pass
  2. Update operands with defs() or uses()
  3. Fix some typo
  4. Use variable to represent Info.UsedLanes & ~Info.DefinedLanes
BeMg updated this revision to Diff 497285.Feb 14 2023, 5:26 AM

Remove removeTempRVVInitUndef declaration

BeMg marked 8 inline comments as done.Feb 14 2023, 6:15 AM
BeMg added inline comments.
llvm/lib/Target/RISCV/RISCVRVVInitUndef.cpp
152
0B      bb.0.entry:
16B       LIFETIME_START %stack.0.dst
32B       %1:vr = IMPLICIT_DEF
48B       early-clobber %0:vr = PseudoVRGATHER_VI_M1 killed %1:vr, 0, 0, 5
64B       %2:gpr = ADDI %stack.0.dst, 0
80B       PseudoVSE32_V_M1 killed %0:vr, killed %2:gpr, 0, 5
96B       LIFETIME_END %stack.0.dst
112B      %3:gpr = COPY $x0
128B      $x10 = COPY %3:gpr
144B      PseudoRET implicit $x10

Here we maybe need scan both defs and uses operand, because the early-clobber operand not always exist tied-to operand.

BeMg added inline comments.Feb 14 2023, 6:25 AM
llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp
130

Skip PseudoRVVInitUndefM1|2|4|8 in RISCVAsmPrinter::emitInstruction and remove undef-init relate function in RISCVExpandPseudo.

craig.topper accepted this revision.Feb 14 2023, 2:32 PM

LGTM

llvm/lib/Target/RISCV/RISCVRVVInitUndef.cpp
152

Right. I wasn't thinking about that right.

This revision is now accepted and ready to land.Feb 14 2023, 2:32 PM
This revision was landed with ongoing or failed builds.Feb 14 2023, 7:51 PM
This revision was automatically updated to reflect the committed changes.
fmayer added a subscriber: fmayer.Feb 15 2023, 11:28 AM

It seems like this breaks under ASan: https://lab.llvm.org/buildbot/#/builders/5/builds/31529/steps/13/logs/stdio

FAIL: LLVM :: CodeGen/RISCV/regalloc-last-chance-recoloring-failure.ll (35389 of 72944)
******************** TEST 'LLVM :: CodeGen/RISCV/regalloc-last-chance-recoloring-failure.ll' FAILED ********************
Script:
--
: 'RUN: at line 2';   /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan/bin/llc -mtriple=riscv64 -mattr=+f,+m,+zfh,+experimental-zvfh    -riscv-enable-subreg-liveness=false < /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/CodeGen/RISCV/regalloc-last-chance-recoloring-failure.ll | /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/CodeGen/RISCV/regalloc-last-chance-recoloring-failure.ll
: 'RUN: at line 4';   /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan/bin/llc -mtriple=riscv64 -mattr=+f,+m,+zfh,+experimental-zvfh < /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/CodeGen/RISCV/regalloc-last-chance-recoloring-failure.ll    -riscv-enable-subreg-liveness=true| /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/CodeGen/RISCV/regalloc-last-chance-recoloring-failure.ll --check-prefix=SUBREGLIVENESS
--
Exit Code: 2
Command Output (stderr):
--
=================================================================
==924420==ERROR: AddressSanitizer: use-after-poison on address 0x62100002da58 at pc 0x564ad06d209e bp 0x7ffe116375f0 sp 0x7ffe116375e8
READ of size 8 at 0x62100002da58 thread T0
    #0 0x564ad06d209d in operands_begin /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/CodeGen/MachineInstr.h:635:42
    #1 0x564ad06d209d in defs /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/CodeGen/MachineInstr.h:679:23
    #2 0x564ad06d209d in isEarlyClobberMI /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/RISCV/RISCVRVVInitUndef.cpp:175:26
    #3 0x564ad06d209d in processBasicBlock /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/RISCV/RISCVRVVInitUndef.cpp:250:39
    #4 0x564ad06d209d in (anonymous namespace)::RISCVInitUndef::runOnMachineFunction(llvm::MachineFunction&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/RISCV/RISCVRVVInitUndef.cpp:270:16
    #5 0x564ad2aa92d2 in llvm::MachineFunctionPass::runOnFunction(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/MachineFunctionPass.cpp:91:13
    #6 0x564ad39ba4a3 in llvm::FPPassManager::runOnFunction(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1430:27
    #7 0x564ad39d4140 in llvm::FPPassManager::runOnModule(llvm::Module&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1476:16
    #8 0x564ad39bc312 in runOnModule /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1545:27
    #9 0x564ad39bc312 in llvm::legacy::PassManagerImpl::run(llvm::Module&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:535:44
    #10 0x564acd741a10 in compileModule /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/llc/llc.cpp:733:8
    #11 0x564acd741a10 in main /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/llc/llc.cpp:420:22
    #12 0x7f2985940d8f  (/lib/x86_64-linux-gnu/libc.so.6+0x29d8f) (BuildId: 69389d485a9793dbe873f0ea2c93e02efaa9aa3d)
    #13 0x7f2985940e3f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x29e3f) (BuildId: 69389d485a9793dbe873f0ea2c93e02efaa9aa3d)
    #14 0x564acd66dba4 in _start (/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan/bin/llc+0x7670ba4)
0x62100002da58 is located 2392 bytes inside of 4096-byte region [0x62100002d100,0x62100002e100)
allocated by thread T0 here:
    #0 0x564acd72db32 in operator new(unsigned long, std::align_val_t) /b/sanitizer-x86_64-linux-fast/build/llvm-project/compiler-rt/lib/asan/asan_new_delete.cpp:107:3
    #1 0x564acdc2908d in Allocate /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/Support/AllocatorBase.h:86:12
    #2 0x564acdc2908d in StartNewSlab /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/Support/Allocator.h:339:42
    #3 0x564acdc2908d in llvm::BumpPtrAllocatorImpl<llvm::MallocAllocator, 4096ul, 4096ul, 128ul>::Allocate(unsigned long, llvm::Align) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/Support/Allocator.h:195:5
    #4 0x564ad2a81869 in Allocate /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/Support/Allocator.h:209:12
    #5 0x564ad2a81869 in operator new<llvm::MallocAllocator, 4096UL, 4096UL, 128UL> /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/Support/Allocator.h:443:20
    #6 0x564ad2a81869 in llvm::MachineFunction::init() /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/MachineFunction.cpp:185:15
    #7 0x564ad2b43c17 in llvm::MachineModuleInfo::getOrCreateMachineFunction(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/MachineModuleInfo.cpp:108:14
    #8 0x564ad2aa8def in llvm::MachineFunctionPass::runOnFunction(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/MachineFunctionPass.cpp:46:29
    #9 0x564ad39ba4a3 in llvm::FPPassManager::runOnFunction(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1430:27
    #10 0x564ad39d4140 in llvm::FPPassManager::runOnModule(llvm::Module&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1476:16
    #11 0x564ad39bc312 in runOnModule /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1545:27
    #12 0x564ad39bc312 in llvm::legacy::PassManagerImpl::run(llvm::Module&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:535:44
    #13 0x564acd741a10 in compileModule /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/llc/llc.cpp:733:8
    #14 0x564acd741a10 in main /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/llc/llc.cpp:420:22
    #15 0x7f2985940d8f  (/lib/x86_64-linux-gnu/libc.so.6+0x29d8f) (BuildId: 69389d485a9793dbe873f0ea2c93e02efaa9aa3d)
SUMMARY: AddressSanitizer: use-after-poison /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/CodeGen/MachineInstr.h:635:42 in operands_begin
Shadow bytes around the buggy address:
  0x0c427fffdaf0: 00 f7 00 00 00 00 00 00 00 00 00 f7 00 00 00 00
  0x0c427fffdb00: 00 00 00 00 00 00 00 00 00 00 00 00 f7 00 00 00
  0x0c427fffdb10: 00 00 00 00 00 00 f7 00 00 00 00 00 00 00 00 00
  0x0c427fffdb20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c427fffdb30: 00 00 00 00 00 00 00 f7 00 00 00 00 00 00 00 00
=>0x0c427fffdb40: 00 f7 00 00 00 00 f7 f7 f7 f7 f7[f7]f7 f7 f7 f7
  0x0c427fffdb50: f7 00 00 00 00 f7 00 00 00 00 00 00 00 00 00 f7
  0x0c427fffdb60: 00 00 00 00 00 00 00 00 f7 00 00 00 00 00 00 00
  0x0c427fffdb70: 00 00 f7 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c427fffdb80: 00 00 00 f7 00 00 00 00 00 00 00 00 00 f7 00 00
  0x0c427fffdb90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f7 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==924420==ABORTING
FileCheck error: '<stdin>' is empty.
FileCheck command line:  /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/CodeGen/RISCV/regalloc-last-chance-recoloring-failure.ll --check-prefix=SUBREGLIVENESS
--
********************
Testing:  0.. 10.. 20.. 30.. 40..
FAIL: LLVM :: CodeGen/RISCV/rvv/undef-earlyclobber-chain.ll (35659 of 72944)
******************** TEST 'LLVM :: CodeGen/RISCV/rvv/undef-earlyclobber-chain.ll' FAILED ********************
Script:
--
: 'RUN: at line 2';   /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan/bin/llc -mtriple riscv64 -mattr=+v -riscv-enable-subreg-liveness < /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/CodeGen/RISCV/rvv/undef-earlyclobber-chain.ll  | /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/CodeGen/RISCV/rvv/undef-earlyclobber-chain.ll
--
Exit Code: 2
Command Output (stderr):
--
=================================================================
==928893==ERROR: AddressSanitizer: use-after-poison on address 0x62100005a980 at pc 0x55a20925a09e bp 0x7ffebd9ed930 sp 0x7ffebd9ed928
READ of size 8 at 0x62100005a980 thread T0
    #0 0x55a20925a09d in operands_begin /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/CodeGen/MachineInstr.h:635:42
    #1 0x55a20925a09d in defs /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/CodeGen/MachineInstr.h:679:23
    #2 0x55a20925a09d in isEarlyClobberMI /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/RISCV/RISCVRVVInitUndef.cpp:175:26
    #3 0x55a20925a09d in processBasicBlock /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/RISCV/RISCVRVVInitUndef.cpp:250:39
    #4 0x55a20925a09d in (anonymous namespace)::RISCVInitUndef::runOnMachineFunction(llvm::MachineFunction&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/RISCV/RISCVRVVInitUndef.cpp:270:16
    #5 0x55a20b6312d2 in llvm::MachineFunctionPass::runOnFunction(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/MachineFunctionPass.cpp:91:13
    #6 0x55a20c5424a3 in llvm::FPPassManager::runOnFunction(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1430:27
    #7 0x55a20c55c140 in llvm::FPPassManager::runOnModule(llvm::Module&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1476:16
    #8 0x55a20c544312 in runOnModule /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1545:27
    #9 0x55a20c544312 in llvm::legacy::PassManagerImpl::run(llvm::Module&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:535:44
    #10 0x55a2062c9a10 in compileModule /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/llc/llc.cpp:733:8
    #11 0x55a2062c9a10 in main /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/llc/llc.cpp:420:22
    #12 0x7f2447396d8f  (/lib/x86_64-linux-gnu/libc.so.6+0x29d8f) (BuildId: 69389d485a9793dbe873f0ea2c93e02efaa9aa3d)
    #13 0x7f2447396e3f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x29e3f) (BuildId: 69389d485a9793dbe873f0ea2c93e02efaa9aa3d)
    #14 0x55a2061f5ba4 in _start (/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan/bin/llc+0x7670ba4)
0x62100005a980 is located 2176 bytes inside of 4096-byte region [0x62100005a100,0x62100005b100)
allocated by thread T0 here:
    #0 0x55a2062b5b32 in operator new(unsigned long, std::align_val_t) /b/sanitizer-x86_64-linux-fast/build/llvm-project/compiler-rt/lib/asan/asan_new_delete.cpp:107:3
    #1 0x55a2067b108d in Allocate /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/Support/AllocatorBase.h:86:12
    #2 0x55a2067b108d in StartNewSlab /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/Support/Allocator.h:339:42
    #3 0x55a2067b108d in llvm::BumpPtrAllocatorImpl<llvm::MallocAllocator, 4096ul, 4096ul, 128ul>::Allocate(unsigned long, llvm::Align) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/Support/Allocator.h:195:5
    #4 0x55a20b609869 in Allocate /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/Support/Allocator.h:209:12
    #5 0x55a20b609869 in operator new<llvm::MallocAllocator, 4096UL, 4096UL, 128UL> /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/Support/Allocator.h:443:20
    #6 0x55a20b609869 in llvm::MachineFunction::init() /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/MachineFunction.cpp:185:15
    #7 0x55a20b6cbc17 in llvm::MachineModuleInfo::getOrCreateMachineFunction(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/MachineModuleInfo.cpp:108:14
    #8 0x55a20b630def in llvm::MachineFunctionPass::runOnFunction(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/MachineFunctionPass.cpp:46:29
    #9 0x55a20c5424a3 in llvm::FPPassManager::runOnFunction(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1430:27
    #10 0x55a20c55c140 in llvm::FPPassManager::runOnModule(llvm::Module&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1476:16
    #11 0x55a20c544312 in runOnModule /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1545:27
    #12 0x55a20c544312 in llvm::legacy::PassManagerImpl::run(llvm::Module&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:535:44
    #13 0x55a2062c9a10 in compileModule /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/llc/llc.cpp:733:8
    #14 0x55a2062c9a10 in main /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/llc/llc.cpp:420:22
    #15 0x7f2447396d8f  (/lib/x86_64-linux-gnu/libc.so.6+0x29d8f) (BuildId: 69389d485a9793dbe873f0ea2c93e02efaa9aa3d)
SUMMARY: AddressSanitizer: use-after-poison /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/CodeGen/MachineInstr.h:635:42 in operands_begin
Shadow bytes around the buggy address:
  0x0c42800034e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c42800034f0: 00 f7 00 00 00 00 00 00 00 00 00 f7 00 00 00 00
  0x0c4280003500: f7 00 00 00 00 00 00 00 00 00 f7 00 00 00 00 00
  0x0c4280003510: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c4280003520: 00 00 00 00 00 00 00 00 00 00 00 f7 f7 f7 f7 f7
=>0x0c4280003530:[f7]f7 f7 f7 f7 f7 00 00 00 00 f7 00 00 00 00 00
  0x0c4280003540: 00 00 00 00 f7 00 00 00 00 00 00 00 00 00 00 00
  0x0c4280003550: 00 00 00 00 00 f7 00 00 00 00 00 00 00 00 00 f7
  0x0c4280003560: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c4280003570: f7 00 00 00 00 00 00 00 00 00 f7 00 00 00 00 f7
  0x0c4280003580: 00 00 00 00 00 00 00 00 00 f7 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==928893==ABORTING
FileCheck error: '<stdin>' is empty.
FileCheck command line:  /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/CodeGen/RISCV/rvv/undef-earlyclobber-chain.ll
MaskRay added inline comments.Feb 15 2023, 11:50 AM
llvm/lib/Target/RISCV/RISCVRVVInitUndef.cpp
2

RISCVRVVInitUndef.cpp

216

delete blank line

craig.topper reopened this revision.Feb 15 2023, 12:49 PM
This revision is now accepted and ready to land.Feb 15 2023, 12:49 PM
craig.topper requested changes to this revision.Feb 15 2023, 12:50 PM

Removing my approval until the asan failure is fixed

This revision now requires changes to proceed.Feb 15 2023, 12:50 PM
BeMg updated this revision to Diff 497869.Feb 15 2023, 7:33 PM
  1. Update comment
  2. Remove blank line
  3. Use the old version isEarlyClobberMI
craig.topper added inline comments.Feb 15 2023, 7:36 PM
llvm/lib/Target/RISCV/RISCVRVVInitUndef.cpp
176

Any idea why the llvm::any_of didn't work?

jrtc27 added inline comments.Feb 15 2023, 8:51 PM
llvm/lib/Target/RISCV/RISCVRVVInitUndef.cpp
176

Because the thing that calls this is erasing from the MBB whilst it iterates over it... so this is the wrong fix and probably doesn't even work either

BeMg updated this revision to Diff 497878.Feb 15 2023, 8:53 PM
  1. Restore any_of version isEarlyClobberMI
  2. handleImplicitDef will remove ImplicitDef MI, and be use in following for loop body. I think this is the reason it trigger ASan fail.
craig.topper added inline comments.Feb 16 2023, 12:37 AM
llvm/lib/Target/RISCV/RISCVRVVInitUndef.cpp
176

Do we need [&] here or would [] work?

BeMg updated this revision to Diff 497924.Feb 16 2023, 1:11 AM
  1. Rename some variables name (UserMOs -> UseMOs, UseMO -> UserMO)
  2. Reorder if statement in processBasicBlock
  3. Use [] instead of [&] because the body of the function doesn't use any variables other than DefMO.
This revision is now accepted and ready to land.Feb 20 2023, 5:00 PM

Illegal instructions "vslideup.vi v8, v8, 5" and "vslideup.vi v8, v8, 2" are generated.

Illegal instructions "vslideup.vi v8, v8, 5" and "vslideup.vi v8, v8, 2" are generated.

Looks like the undef operand is not the passthru operand in this case. It's the value to slide. This pass only fixed the case where the passthru was undef. Can you file a new issue?

evandro removed a subscriber: evandro.Sep 12 2023, 5:49 PM