Immediate must be in an integer range [0,255] for umin/umax instruction.
Extend pattern matching helper SelectSVEArithImm() to take in value type
bitwidth when checking immediate value is in range or not.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Current upstream mis-compile, take t.ll , run "llc -mtriple=aarch64-linux-gnu -mattr=+sve < t.ll"
define <vscale x 4 x i32> @test(<vscale x 4 x i32> %a) { %pg = shufflevector <vscale x 4 x i1> insertelement (<vscale x 4 x i1> undef, i1 true, i32 0), <vscale x 4 x i1> undef, <vscale x 4 x i32> zeroinitializer %elt = insertelement <vscale x 4 x i32> undef, i32 257, i32 0 %splat = shufflevector <vscale x 4 x i32> %elt, <vscale x 4 x i32> undef, <vscale x 4 x i32> zeroinitializer %umin = call <vscale x 4 x i32> @llvm.aarch64.sve.umin.nxv4i32(<vscale x 4 x i1> %pg, <vscale x 4 x i32> %a, <vscale x 4 x i32> %splat) ret <vscale x 4 x i32> %umin } declare <vscale x 4 x i32> @llvm.aarch64.sve.umin.nxv4i32(<vscale x 4 x i1>, <vscale x 4 x i32>, <vscale x 4 x i32>)
Then you see it's mis-compiled into
// %bb.0:
umin z0.s, z0.s, #1
This patch fixes this error, and generate
// %bb.0:
mov w8, #257 ptrue p0.s mov z1.s, w8 umin z0.s, p0/m, z0.s, z1.s
llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp | ||
---|---|---|
3146 | Maybe I'm missing something obvious, but I don't see why any masking is needed. |
llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp | ||
---|---|---|
3146 | We need to apply masking "ImmVal & 0xFF" for i8 at least. Otherwise we got regression. Take t.ll below, if we don't apply the mask for i8, then we generate "mov z1.b, #-127 ". define <vscale x 16 x i8> @umax_i8_large(<vscale x 16 x i8> %a) { %elt = insertelement <vscale x 16 x i8> undef, i8 129, i32 0 %splat = shufflevector <vscale x 16 x i8> %elt, <vscale x 16 x i8> undef, <vscale x 16 x i32> zeroinitializer %cmp = icmp ugt <vscale x 16 x i8> %a, %splat %res = select <vscale x 16 x i1> %cmp, <vscale x 16 x i8> %a, <vscale x 16 x i8> %splat ret <vscale x 16 x i8> %res } |
llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp | ||
---|---|---|
3146 | To add to this, the SelectionDAG node for the operand looks something like the following: nxv16i8 AArch64ISD::DUP(i32 -127). |
nit: DL can be inlined directly into its single use, i.e.