This is an archive of the discontinued LLVM Phabricator instance.

Differential D87771

[AArch64] Emit zext move when the source of the zext is AssertZext or AssertSext
Closed, Public

Authored by wwei on Sep 16 2020, 8:51 AM.

Details

Reviewers

t.p.northover
paulwalker-arm
efriedma
dmgreen
samparker

Commits

rG992698cfbc89: [AArch64] Emit zext move when the source of the zext is AssertZext or AssertSext

Summary

When the source of a zext is AssertZext or AssertSext, nothing is known about the upper 32 bits of the register,
so we should insert a zext move before emitting SUBREG_TO_REG to define the lower 32 bits.

Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=47543
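
To make the summary concrete, here is a minimal C++ sketch (not LLVM code) of why the explicit move matters. `XReg`, `movW`, `subregToReg`, and `zextI32` are hypothetical names introduced only for illustration. The model assumes two things: writing a W register clears bits 63:32 of the corresponding X register, and SUBREG_TO_REG emits no instruction but merely asserts that the upper bits are already zero.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a 64-bit X register; the low 32 bits are the W view.
struct XReg {
  std::uint64_t bits;
};

// Models `mov wD, wS`: a write to a W register zeroes bits 63:32.
inline XReg movW(XReg src) { return XReg{src.bits & 0xFFFFFFFFull}; }

// Models SUBREG_TO_REG: no instruction is emitted, the value is simply
// reinterpreted as 64-bit on the promise that the upper half is zero.
inline XReg subregToReg(XReg src) { return src; }

// What `zext i32 %v to i64` must produce according to IR semantics.
inline std::uint64_t zextI32(std::uint32_t v) { return v; }

// Returns true when the selected sequence yields the IR-mandated result.
inline bool zextIsCorrect(XReg src, bool emitMov) {
  XReg widened = subregToReg(emitMov ? movW(src) : src);
  return widened.bits == zextI32(static_cast<std::uint32_t>(src.bits));
}
```

With a register holding `0xDEADBEEF00000001`, `zextIsCorrect(r, false)` is false and `zextIsCorrect(r, true)` is true. That is the situation the patch guards against when the i32 value arrives through an AssertZext/AssertSext whose real producer lives in another block.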

Event Timeline

wwei created this revision. Sep 16 2020, 8:51 AM

Herald added subscribers: llvm-commits, danielkiss, hiraditya, kristof.beyls. Sep 16 2020, 8:51 AM

wwei requested review of this revision. Sep 16 2020, 8:51 AM

Harbormaster completed remote builds in B71885: Diff 292236. Sep 16 2020, 9:32 AM

efriedma added a subscriber: spop. Sep 16 2020, 12:31 PM

efriedma added inline comments.

llvm/lib/Target/AArch64/AArch64ISelLowering.h
425	AssertSext/AssertZext don't really indicate anything either way; it would make sense to check the operand, maybe? That said, I guess the operand will usually be a CopyFromReg, so maybe it doesn't matter much.

More generally, I don't like the general approach of guessing what isel will do, as opposed to examining what isel actually did. Due to the way the dot product patterns were written, it's actually possible for an i32 EXTRACT_VECTOR_ELT to produce a value that isn't zero-extended (testcase follows). And I'm not confident there aren't other weird edge cases. I'd be happier doing this after isel, when we can tell what instruction actually produced the value in question.

    define i64 @test_udot_v8i8(i8* nocapture readonly %a, i8* nocapture readonly %b) {
    entry:
    ; CHECK-LABEL: test_udot_v8i8:
    ; CHECK: udot {{v[0-9]+}}.2s, {{v[0-9]+}}.8b, {{v[0-9]+}}.8b
      %0 = bitcast i8* %a to <8 x i8>*
      %1 = load <8 x i8>, <8 x i8>* %0
      %2 = zext <8 x i8> %1 to <8 x i32>
      %3 = bitcast i8* %b to <8 x i8>*
      %4 = load <8 x i8>, <8 x i8>* %3
      %5 = zext <8 x i8> %4 to <8 x i32>
      %6 = mul nuw nsw <8 x i32> %5, %2
      %7 = call i32 @llvm.experimental.vector.reduce.add.v8i32(<8 x i32> %6)
      %8 = zext i32 %7 to i64
      ret i64 %8
    }

    declare i32 @llvm.experimental.vector.reduce.add.v8i32(<8 x i32>)
llvm/test/CodeGen/AArch64/arm64-assert-zext-sext.ll
1	This testcase is way too complicated; can you extract out the bit that actually triggers this issue?
llvm/test/CodeGen/AArch64/shift_minsize.ll
62	This mov shouldn't be necessary, but the reason isn't really related to this patch, I guess.

wwei updated this revision to Diff 292542. Sep 17 2020, 9:27 AM

wwei added inline comments. Sep 17 2020, 9:30 AM

llvm/test/CodeGen/AArch64/arm64-assert-zext-sext.ll
1	Done, test cases have been simplified.

wwei added inline comments. Sep 17 2020, 10:10 AM

llvm/lib/Target/AArch64/AArch64ISelLowering.h
425	I basically agree with you; doing it after isel seems more reasonable. Because DAG isel works at basic-block scope, it is difficult to consider global data/control flow. For example, in the `arm64-assert-zext-sext.ll` case, both `AssertSext` and `AssertZext` will have a `CopyFromReg` operand, but what really matters is where the operand of `CopyFromReg` (maybe EXTRACT_SUBREG or Truncate) comes from. Unfortunately, those operands come from other blocks. Therefore, during isel we can only perform conservative instruction selection. This is the right way to fix the bug in pr47543.

I have also considered adding a check for the operand of `AssertSext/AssertZext`, like below:

    static inline bool isDef32(const SDNode &N) {
      unsigned Opc = N.getOpcode();
      if (Opc == ISD::AssertSext || Opc == ISD::AssertZext)
        Opc = N.getOperand(0).getOpcode();
      return Opc != ISD::TRUNCATE && Opc != TargetOpcode::EXTRACT_SUBREG &&
             Opc != ISD::CopyFromReg;
    }

But I'm not sure whether the check is required. In the end, I followed the implementation in x86:

    def def32 : PatLeaf<(i32 GR32:$src), [{
      return N->getOpcode() != ISD::TRUNCATE &&
             N->getOpcode() != TargetOpcode::EXTRACT_SUBREG &&
             N->getOpcode() != ISD::CopyFromReg &&
             N->getOpcode() != ISD::AssertSext &&
             N->getOpcode() != ISD::AssertZext;
    }]>;

Doing it after isel would be better. However, many scenarios may need to be considered, and it is not clear what edge cases or strange cases may occur. The current patch is a trade-off solution.
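
The difference between the two predicates discussed in this comment can be sketched outside LLVM as follows. This is only an illustrative model: `Opcode`, `Node`, `isDef32Committed`, and `isDef32LookThrough` are hypothetical stand-ins for the real `ISD`/`TargetOpcode` opcodes, `SDNode`, and `isDef32`, and the null-operand guard exists only because the mock node has no invariants.

```cpp
#include <cassert>

// Hypothetical stand-ins for the SelectionDAG opcodes named in the review.
enum class Opcode { Add, Truncate, ExtractSubreg, CopyFromReg, AssertSext, AssertZext };

// Hypothetical stand-in for SDNode: an opcode plus its first operand.
struct Node {
  Opcode opcode;
  const Node *operand0;
};

// Shared core: these producers may leave the upper 32 bits undefined.
inline bool opcodeDefinesUpperBits(Opcode opc) {
  return opc != Opcode::Truncate && opc != Opcode::ExtractSubreg &&
         opc != Opcode::CopyFromReg;
}

// Mirrors the committed patch: any Assert* source is treated as unknown.
inline bool isDef32Committed(const Node &n) {
  return opcodeDefinesUpperBits(n.opcode) && n.opcode != Opcode::AssertSext &&
         n.opcode != Opcode::AssertZext;
}

// Mirrors the alternative from the comment: look through Assert* and
// classify the operand instead.
inline bool isDef32LookThrough(const Node &n) {
  Opcode opc = n.opcode;
  if (opc == Opcode::AssertSext || opc == Opcode::AssertZext) {
    if (!n.operand0)
      return false; // mock-only guard: stay conservative without an operand
    opc = n.operand0->opcode;
  }
  return opcodeDefinesUpperBits(opc);
}
```

In this model the two variants agree whenever the Assert* node wraps a `CopyFromReg`, which is the case both reviewers expect to be typical. They differ only when it wraps some other node, where the committed form is more conservative and would still emit the move.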

LGTM

I'm okay with taking this as an incremental improvement, and then looking into a different approach later.

llvm/lib/Target/AArch64/AArch64ISelLowering.h
425	I don't anticipate any complex issues performing the transform after isel, but I could be missing something.

This revision is now accepted and ready to land. Sep 17 2020, 11:29 AM

Closed by commit rG992698cfbc89: [AArch64] Emit zext move when the source of the zext is AssertZext or AssertSext (authored by wwei). Sep 17 2020, 10:09 PM

This revision was automatically updated to reflect the committed changes.

wwei added a commit: rG992698cfbc89: [AArch64] Emit zext move when the source of the zext is AssertZext or AssertSext.

guopeilin mentioned this in D91512: [AArch64][Isel] Avoid implicit zext for SIGN_EXTEND_INREG (TRUNCATE). Nov 20 2020, 12:52 AM

yutsumi added a subscriber: yutsumi. Nov 4 2021, 5:58 PM

yutsumi mentioned this in D111035: [AArch64] Fix incorrect removal of COPYs in AArch64RedundantCopyElimination. Nov 8 2021, 1:29 AM

Revision Contents

Path	Size
llvm/lib/Target/AArch64/AArch64ISelLowering.h	6 lines
llvm/test/CodeGen/AArch64/arm64-assert-zext-sext.ll	101 lines
llvm/test/CodeGen/AArch64/shift_minsize.ll	6 lines

Diff 292236

llvm/lib/Target/AArch64/AArch64ISelLowering.h

[409 lines not shown]

 } // end namespace AArch64ISD

 namespace {

 // Any instruction that defines a 32-bit result zeros out the high half of the
 // register. Truncate can be lowered to EXTRACT_SUBREG. CopyFromReg may
 // be copying from a truncate. But any other 32-bit operation will zero-extend
-// up to 64 bits.
+// up to 64 bits. AssertSext/AssertZext aren't saying anything about the upper
+// 32 bits, they're probably just qualifying a CopyFromReg.
 // FIXME: X86 also checks for CMOV here. Do we need something similar?
 static inline bool isDef32(const SDNode &N) {
   unsigned Opc = N.getOpcode();
   return Opc != ISD::TRUNCATE && Opc != TargetOpcode::EXTRACT_SUBREG &&
-         Opc != ISD::CopyFromReg;
+         Opc != ISD::CopyFromReg && Opc != ISD::AssertSext &&
+         Opc != ISD::AssertZext;
 }

 } // end anonymous namespace

 class AArch64Subtarget;
 class AArch64TargetMachine;

 class AArch64TargetLowering : public TargetLowering {

[572 lines not shown]

llvm/test/CodeGen/AArch64/arm64-assert-zext-sext.ll

This file was added.

				; RUN: llc -O2 -mtriple=aarch64-linux-gnu < %s | FileCheck %s

				%struct.a = type { i32 }

				@a = internal unnamed_addr global i1 false, align 8
				@b = common local_unnamed_addr global i32 0, align 4
				@e = common local_unnamed_addr global i32 0, align 4
				@d = common local_unnamed_addr global i32 0, align 4
				@.str = private unnamed_addr constant [6 x i8] c"%llu\0A\00", align 1
				@g = global i32 193, align 4
				@f = local_unnamed_addr global i8* bitcast (i32* @g to i8*), align 8
				@j = common global i32 0, align 4
				@k = global i16* bitcast (i32* @j to i16*), align 8
				@n = local_unnamed_addr global i32*** bitcast (i16** @k to i32***), align 8
				@m = common local_unnamed_addr global %struct.a zeroinitializer, align 4

				declare i32 @printf(i8* nocapture readonly, ...) local_unnamed_addr

				define i32 @assertzext() local_unnamed_addr {
				entry:
				%.b = load i1, i1* @a, align 8
				%i = select i1 %.b, i64 0, i64 66296709418
				store i1 true, i1* @a, align 8
				%conv.i = trunc i64 %i to i32
				%.pr.i = load i32, i32* @b, align 4
				%cmp11.i = icmp eq i32 %.pr.i, 0
				br i1 %cmp11.i, label %end, label %for.body.lr.ph.i

				for.body.lr.ph.i: ; preds = %entry
				%i1 = load i32, i32* @e, align 4
				%sext.i = and i64 %i, 1872199978
				%inc.i.peel = add nsw i32 %.pr.i, 1
				%cmp.i.peel = icmp eq i32 %inc.i.peel, 0
				br i1 %cmp.i.peel, label %for.cond.for.end_crit_edge.i, label %for.body.i

				for.body.i: ; preds = %for.body.i, %for.body.lr.ph.i
				%i2 = phi i32 [ %inc.i, %for.body.i ], [ %inc.i.peel, %for.body.lr.ph.i ]
				%inc.i = add nsw i32 %i2, 1
				%cmp.i = icmp eq i32 %inc.i, 0
				br i1 %cmp.i, label %for.cond.for.end_crit_edge.i, label %for.body.i

				for.cond.for.end_crit_edge.i: ; preds = %for.body.i, %for.body.lr.ph.i
				%n.012.i.lcssa = phi i64 [ %sext.i, %for.body.lr.ph.i ], [ 0, %for.body.i ]
				%conv2.i = sext i32 %i1 to i64
				%i3 = inttoptr i64 %conv2.i to i64*
				store i64 %n.012.i.lcssa, i64* %i3, align 8
				store i32 0, i32* @b, align 4
				br label %end

				end: ; preds = %for.cond.for.end_crit_edge.i, %entry
				%n.0.lcssa.i = phi i32 [ 0, %for.cond.for.end_crit_edge.i ], [ %conv.i, %entry ]
				%call.i = tail call i32 (i8*, ...) @printf(i8* nonnull dereferenceable(1) getelementptr inbounds ([6 x i8], [6 x i8]* @.str, i64 0, i64 0), i32 %n.0.lcssa.i)
				%conv4.i = sext i32 %n.0.lcssa.i to i64
				%call5.i = tail call i32 (i8*, ...) @printf(i8* nonnull dereferenceable(1) getelementptr inbounds ([6 x i8], [6 x i8]* @.str, i64 0, i64 0), i64 %conv4.i)
				ret i32 0
				; CHECK-LABEL: end
				; CHECK: mov w{{[0-9]+}}, w{{[0-9]+}}
				; CHECK: bl printf
				; CHECK: mov w{{[0-9]+}}, w{{[0-9]+}}
				; CHECK: bl printf
				}

				declare i64 @test(i64)

				define i32 @assertsext() {
				entry:
				%i = load i8*, i8** @f, align 8
				%i1 = load i8, i8* %i, align 1
				%conv.i.i = sext i8 %i1 to i32
				%i2 = load i32, i32* getelementptr inbounds (%struct.a, %struct.a* @m, i64 0, i32 0), align 4
				%mul = mul nsw i32 %i2, %conv.i.i
				%xor = xor i32 %mul, 3
				%tobool = icmp eq i32 %xor, 0
				br i1 %tobool, label %t.exit, label %land.end

				land.end: ; preds = %entry
				%conv1 = zext i32 %conv.i.i to i64
				%conv.i.i.i = sext i32 %xor to i64
				%div.i7 = udiv i64 2036854775807, %conv1
				%cmp = icmp slt i64 %div.i7, %conv.i.i.i
				%spec.select = select i1 %cmp, i32 1, i32 %conv.i.i
				br label %t.exit
				; CHECK-LABEL: land.end
				; CHECK: mov w{{[0-9]+}}, w{{[0-9]+}}
				; CHECK: udiv x{{[0-9]+}}, x{{[0-9]+}}, x{{[0-9]+}}

				t.exit: ; preds = %land.end, %entry
				%i3 = phi i32 [ %conv.i.i, %entry ], [ %spec.select, %land.end ]
				%conv3.i.i.i = trunc i32 %i3 to i16
				%i4 = load i16*, i16** @k, align 8
				store i16 %conv3.i.i.i, i16* %i4, align 2
				%i5 = load i32**, i32*** @n, align 8
				%i6 = load i32*, i32** %i5, align 8
				%i7 = load i32, i32* %i6, align 4
				%xor.i.i = xor i32 %i7, 8
				store i32 %xor.i.i, i32* %i6, align 4
				%call2.i = tail call i64 bitcast (i64 (i64)* @test to i64 (i32)*)(i32 undef)
				%i8 = load i32, i32* @j, align 4
				%call1 = tail call i32 (i8*, ...) @printf(i8* nonnull dereferenceable(1) getelementptr inbounds ([6 x i8], [6 x i8]* @.str, i64 0, i64 0), i32 %i8)
				ret i32 0
				}

llvm/test/CodeGen/AArch64/shift_minsize.ll

[53 lines not shown]

 }

 define dso_local { i64, i64 } @shl128(i64 %x.coerce0, i64 %x.coerce1, i8 signext %y) minsize optsize {
 ; CHECK-LABEL: shl128:
 ; CHECK: // %bb.0: // %entry
 ; CHECK-NEXT: str x30, [sp, #-16]! // 8-byte Folded Spill
 ; CHECK-NEXT: .cfi_def_cfa_offset 16
 ; CHECK-NEXT: .cfi_offset w30, -16
-; CHECK-NEXT: // kill: def $w2 killed $w2 def $x2
+; CHECK-NEXT: mov w2, w2
 ; CHECK-NEXT: bl __ashlti3
 ; CHECK-NEXT: ldr x30, [sp], #16 // 8-byte Folded Reload
 ; CHECK-NEXT: ret

 entry:
   %x.sroa.2.0.insert.ext = zext i64 %x.coerce1 to i128
   %x.sroa.2.0.insert.shift = shl nuw i128 %x.sroa.2.0.insert.ext, 64
   %x.sroa.0.0.insert.ext = zext i64 %x.coerce0 to i128
[10 lines not shown]
 }

 define dso_local { i64, i64 } @ashr128(i64 %x.coerce0, i64 %x.coerce1, i8 signext %y) minsize optsize {
 ; CHECK-LABEL: ashr128:
 ; CHECK: // %bb.0: // %entry
 ; CHECK-NEXT: str x30, [sp, #-16]! // 8-byte Folded Spill
 ; CHECK-NEXT: .cfi_def_cfa_offset 16
 ; CHECK-NEXT: .cfi_offset w30, -16
-; CHECK-NEXT: // kill: def $w2 killed $w2 def $x2
+; CHECK-NEXT: mov w2, w2
 ; CHECK-NEXT: bl __ashrti3
 ; CHECK-NEXT: ldr x30, [sp], #16 // 8-byte Folded Reload
 ; CHECK-NEXT: ret
 entry:
   %x.sroa.2.0.insert.ext = zext i64 %x.coerce1 to i128
   %x.sroa.2.0.insert.shift = shl nuw i128 %x.sroa.2.0.insert.ext, 64
   %x.sroa.0.0.insert.ext = zext i64 %x.coerce0 to i128
   %x.sroa.0.0.insert.insert = or i128 %x.sroa.2.0.insert.shift, %x.sroa.0.0.insert.ext
[9 lines not shown]
 }

 define dso_local { i64, i64 } @lshr128(i64 %x.coerce0, i64 %x.coerce1, i8 signext %y) minsize optsize {
 ; CHECK-LABEL: lshr128:
 ; CHECK: // %bb.0: // %entry
 ; CHECK-NEXT: str x30, [sp, #-16]! // 8-byte Folded Spill
 ; CHECK-NEXT: .cfi_def_cfa_offset 16
 ; CHECK-NEXT: .cfi_offset w30, -16
-; CHECK-NEXT: // kill: def $w2 killed $w2 def $x2
+; CHECK-NEXT: mov w2, w2
 ; CHECK-NEXT: bl __lshrti3
 ; CHECK-NEXT: ldr x30, [sp], #16 // 8-byte Folded Reload
 ; CHECK-NEXT: ret
 entry:
   %x.sroa.2.0.insert.ext = zext i64 %x.coerce1 to i128
   %x.sroa.2.0.insert.shift = shl nuw i128 %x.sroa.2.0.insert.ext, 64
   %x.sroa.0.0.insert.ext = zext i64 %x.coerce0 to i128
   %x.sroa.0.0.insert.insert = or i128 %x.sroa.2.0.insert.shift, %x.sroa.0.0.insert.ext
[10 lines not shown]