This is an archive of the discontinued LLVM Phabricator instance.

Differential D54803

[x86] promote all multiply i8 by constant to i32
ClosedPublic

Authored by spatel on Nov 21 2018, 8:35 AM.

Details

Reviewers

craig.topper
RKSimon
lebedev.ri

Commits

rGd31220e0de0d: [x86] promote all multiply i8 by constant to i32
rL347557: [x86] promote all multiply i8 by constant to i32

Summary

This is an alternative implementation of where D54770 would likely end up, so I'll abandon that if this is preferred.

We have these 2 "isDesirable" promotion hooks (I'm not sure why we need both of them, but that's independent of this patch), and we can adjust them to promote "mul i8 X, C" to i32. Then, all of our existing LEA and other multiply expansion magic happens as it would for i32 ops.

Some of the test diffs show that we could end up with an actual 32-bit mul instruction here because we choose not to expand to simpler ops. That instruction could be slower depending on the subtarget. On the plus side, this means we don't need a separate instruction to load the constant operand and possibly an extra instruction to move the result. If we need to tune mul i32 further, we could add a later transform that tries to shrink it back to i8 based on subtarget timing.
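The load-scalar-as-vector.ll diff in this revision shows one such case: `urem i8 %x, 42` now ends in `imull $42` instead of `movb $42` plus `mulb`. As a rough sketch (the function names below are made up for illustration, and this replays only the arithmetic of the new check lines, not the compiler's lowering code), the sequence can be checked against `%` for every 8-bit input:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: replays the arithmetic of the new SSE/AVX check lines
// (shrb, movzbl, imull $49, shrl $10, imull $42, subb) on plain integers.
uint8_t urem42_promoted(uint8_t x) {
  uint32_t t = static_cast<uint8_t>(x >> 1); // shrb %cl ; movzbl %cl, %ecx
  t *= 49;                                   // imull $49, %ecx, %ecx
  t >>= 10;                                  // shrl $10, %ecx   (t == x / 42)
  t *= 42;                                   // imull $42, %ecx, %ecx
  return static_cast<uint8_t>(x - static_cast<uint8_t>(t)); // subb %cl, %al
}

// Exhaustively compares the replayed sequence with the reference remainder.
void verify_urem42() {
  for (unsigned x = 0; x < 256; ++x)
    assert(urem42_promoted(static_cast<uint8_t>(x)) == x % 42);
}
```

Only the low 8 bits of the 32-bit product feed the final `subb`, which is why the wider multiply can stand in for `mulb` here.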

I did not bother to duplicate all of the 32-bit test file RUNs and target settings that exist to test whether LEA expansion is cheap or not. The diffs here assume a default target, so that means LEA is generally cheap.
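To make the LEA expansions in mul-constant-i8.ll easier to follow, here is a small sketch that replays a few of the new instruction sequences on plain integers. The helper `lea` and the `mulNN` names are assumptions introduced for this example; the instruction sequences in the comments are copied from the updated check lines for the constants 11, 14, 23, and 29.

```cpp
#include <cassert>
#include <cstdint>

// Models `leal (base,index,scale), dst`: dst = base + index * scale (mod 2^32).
uint32_t lea(uint32_t base, uint32_t index, uint32_t scale) {
  return base + index * scale;
}

uint8_t mul11(uint8_t x) {
  uint32_t a = lea(x, x, 4);                  // leal (%rdi,%rdi,4), %eax -> 5x
  return static_cast<uint8_t>(lea(x, a, 2));  // leal (%rdi,%rax,2), %eax -> 11x
}

uint8_t mul14(uint8_t x) {
  uint32_t a = static_cast<uint32_t>(x) << 4; // shll $4, %eax  -> 16x
  a -= x;                                     // subl %edi, %eax -> 15x
  a -= x;                                     // subl %edi, %eax -> 14x
  return static_cast<uint8_t>(a);
}

uint8_t mul23(uint8_t x) {
  uint32_t a = lea(x, x, 2);                  // leal (%rdi,%rdi,2), %eax -> 3x
  a <<= 3;                                    // shll $3, %eax  -> 24x
  a -= x;                                     // subl %edi, %eax -> 23x
  return static_cast<uint8_t>(a);
}

uint8_t mul29(uint8_t x) {
  uint32_t a = lea(x, x, 8);                  // leal (%rdi,%rdi,8), %eax -> 9x
  a = lea(a, a, 2);                           // leal (%rax,%rax,2), %eax -> 27x
  a += x;                                     // addl %edi, %eax -> 28x
  a += x;                                     // addl %edi, %eax -> 29x
  return static_cast<uint8_t>(a);
}

// Checks each expansion against a plain 8-bit multiply for all inputs.
void verify_lea_expansions() {
  for (unsigned v = 0; v < 256; ++v) {
    uint8_t x = static_cast<uint8_t>(v);
    assert(mul11(x) == static_cast<uint8_t>(v * 11));
    assert(mul14(x) == static_cast<uint8_t>(v * 14));
    assert(mul23(x) == static_cast<uint8_t>(v * 23));
    assert(mul29(x) == static_cast<uint8_t>(v * 29));
  }
}
```

The final truncation to `uint8_t` corresponds to reading only `%al` from the 32-bit result.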

Diff Detail

Repository: rL LLVM

Event Timeline

spatel created this revision. Nov 21 2018, 8:35 AM

Herald added a subscriber: mcrosier. Nov 21 2018, 8:35 AM

This makes some sense to me. There is no 8-bit multiply-with-immediate instruction anyway, so we have to load the constant into a register, and the 8-bit multiply instruction has bad register allocation constraints. On Intel CPUs, the 8-bit multiply with a 16-bit result in AL/AH has the same latency/throughput as a 32-bit multiply with a 32-bit result. The annoying thing is that we sometimes emit an explicit movzbl for the anyextend i8->i32 to do the promotion and prevent a partial register access. We explicitly don't do that for the i16->i32 anyextend due to the heavy promotion of i16 types.
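As a side note on why an anyextend is enough for correctness here: the low 8 bits of a product depend only on the low 8 bits of the operands, so whatever ends up in bits 31:8 of the promoted value cannot change the truncated result. A minimal check of that property follows; the function names and the chosen "garbage" patterns are arbitrary and exist only for this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if, for every 8-bit x, the low byte of a 32-bit multiply by c
// is the same no matter what bits 31:8 of the (any-extended) operand hold.
bool lowByteIgnoresUpperBits(uint8_t c) {
  // Arbitrary upper-bit patterns standing in for an unspecified anyextend.
  const uint32_t garbage[] = {0x00000000u, 0x00000100u, 0xDEADBE00u,
                              0xFFFFFF00u};
  for (unsigned v = 0; v < 256; ++v) {
    uint8_t expected = static_cast<uint8_t>(v * c); // the i8 multiply result
    for (uint32_t g : garbage) {
      uint32_t wide = g | v;                        // promoted operand
      if (static_cast<uint8_t>(wide * c) != expected)
        return false;
    }
  }
  return true;
}

// Runs the check for every 8-bit constant.
void verify_anyextend_is_enough() {
  for (unsigned c = 0; c < 256; ++c)
    assert(lowByteIgnoresUpperBits(static_cast<uint8_t>(c)));
}
```

This says nothing about performance; the explicit movzbl mentioned above concerns partial register accesses, not the value computed.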

spatel mentioned this in D54770: [x86] try to lower multiply i8 with constant to LEA. Nov 21 2018, 9:32 AM

lebedev.ri added inline comments. Nov 24 2018, 9:37 AM

lib/Target/X86/X86ISelLowering.cpp
41066–41077 ↗	(On Diff #174934)	I'm trying to parse this and I'm failing. The old code was: if the type is not `i16`, then it is desirable (so i8/i32/i64); else, see the `switch`. The new code says: if the type is not `i16` (so i8/i32/i64) and either the type is not `i8` (so i32/i64) or [the type is `i8` and] the opcode is not `ISD::MUL`, then it is desirable; else, see the `switch`. Doesn't this not only mark i8 MULs as undesirable, but also mark all other i8 ops as desirable? Can this code please at least be uncondensed?

spatel marked an inline comment as done. Nov 24 2018, 9:59 AM

spatel added inline comments.

lib/Target/X86/X86ISelLowering.cpp
41066–41077 ↗	(On Diff #174934)	Your description sounds right - all i8 ops besides mul are still desirable - and that's the intended logic. I think it's just a De Morgan illusion that makes it confusing (and I probably got it wrong in my initial draft locally!). I'll rearrange it to try to avoid that problem.

Patch updated:
After looking again, Roman's analysis was correct - the previous logic was bogus. But I'm not sure if we could expose the bug because of the interaction of these 2 hooks. I.e., the test diffs in this version of the patch are unchanged from before.

In any case, updated the code and comments to hopefully be clearer now.
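For readers following the discussion, the before/after behavior of `isTypeDesirableForOp` in the committed diff can be modeled in a few lines. This is a simplified sketch, not the LLVM code: the `Ty` and `Op` enums are stand-ins for `MVT` and `ISD` opcodes, `Op::Other` represents any opcode not listed in the switch, and the `isTypeLegal` and vXi8 shift checks are left out.

```cpp
#include <cassert>

// Stand-ins for MVT::i8..i64 and for the ISD opcodes named in the diff.
enum class Ty { i8, i16, i32, i64 };
enum class Op { Load, SignExtend, ZeroExtend, AnyExtend, Shl, Srl,
                Sub, Add, Mul, And, Or, Xor, Other };

// True for the opcodes that the switch in the diff maps to `return false`.
bool inI16Switch(Op opc) { return opc != Op::Other; }

// Model of the code before the patch (left-hand side of the diff).
bool desirableOld(Op opc, Ty vt) {
  if (vt != Ty::i16)
    return true;
  return !inI16Switch(opc);
}

// Model of the code after the patch (right-hand side of the diff).
bool desirableNew(Op opc, Ty vt) {
  if (opc == Op::Mul && vt == Ty::i8)
    return false;
  if (vt == Ty::i16 && inI16Switch(opc))
    return false;
  return true; // any type not accounted for above is desirable
}

// The two models should disagree only for an i8 multiply.
void verify_only_i8_mul_changes() {
  const Ty tys[] = {Ty::i8, Ty::i16, Ty::i32, Ty::i64};
  const Op ops[] = {Op::Load, Op::SignExtend, Op::ZeroExtend, Op::AnyExtend,
                    Op::Shl,  Op::Srl,        Op::Sub,        Op::Add,
                    Op::Mul,  Op::And,        Op::Or,         Op::Xor,
                    Op::Other};
  for (Ty t : tys)
    for (Op o : ops) {
      bool changed = desirableOld(o, t) != desirableNew(o, t);
      assert(changed == (o == Op::Mul && t == Ty::i8));
    }
}
```

Writing the i8 multiply case as its own early return avoids the negated compound condition that prompted the question above.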

OK, I think this now looks good to me.

LGTM

This revision is now accepted and ready to land. Nov 25 2018, 7:19 PM

Closed by commit rL347557: [x86] promote all multiply i8 by constant to i32 (authored by spatel). Nov 26 2018, 7:25 AM

This revision was automatically updated to reflect the committed changes.

spatel mentioned this in D54640: [DAGCombiner] narrow truncated binops. Nov 26 2018, 9:03 AM

Intel Core CPUs from Sandy Bridge on always store bits 63:16 and bits 7:0 in the same physical register file entry. Only bits 15:8 of EAX/EBX/ECX/EDX can be separated due to a write to AH/BH/CH/DH. For most binary arithmetic operations, one of the input registers is also the output register, so it's easy to pass the upper bits through without modifying them. So "add %al, %bl" reads all 64 bits of %rax and %rbx (ignoring that %AH and %BH could have been written separately) and leaves bits 63:8 of %rbx unmodified.

Instructions that write only bits 7:0 or bits 15:8 of a register and don't also read part of the same register trigger a merge uop to be inserted. This would be instructions like a load into %al or %ax. I believe move immediate into %al or %ax doesn't have a separate merge uop; its single uop just reads the whole destination register and merges the immediate into the lower bits. 16-bit popcnt/lzcnt/tzcnt also have a false dependency on the upper bits so the single uop can do the merge. MOVSX/MOVZX from 8-bit to 16-bit are similar so that the upper bits can be preserved. If bits 15:8 have been separated and an instruction is issued that needs bits 15:8 and any of the other bits, then a merge is inserted to join them.
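The merging described above exists because, architecturally, a narrow write on x86-64 must leave the rest of the register intact (only a 32-bit write clears the upper half). A minimal model of those architectural results is sketched below. The `Reg64` struct and its method names are invented for this illustration, and the model says nothing about uops, renaming, or timing.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of one 64-bit general-purpose register such as RAX.
struct Reg64 {
  uint64_t bits = 0;

  // Write to AL: bits 63:8 are preserved.
  void writeLow8(uint8_t v) { bits = (bits & ~0xFFull) | v; }
  // Write to AH: only bits 15:8 change.
  void writeHigh8(uint8_t v) {
    bits = (bits & ~0xFF00ull) | (static_cast<uint64_t>(v) << 8);
  }
  // Write to AX: bits 63:16 are preserved.
  void write16(uint16_t v) { bits = (bits & ~0xFFFFull) | v; }
  // Write to EAX: bits 63:32 are zeroed, so no old bits need merging.
  void write32(uint32_t v) { bits = v; }
};

// Walks one register through each kind of write and checks the result.
void verify_reg_model() {
  Reg64 rax;
  rax.bits = 0x1122334455667788ull;
  rax.writeLow8(0xAA);
  assert(rax.bits == 0x11223344556677AAull);
  rax.writeHigh8(0xBB);
  assert(rax.bits == 0x112233445566BBAAull);
  rax.write16(0xCCDD);
  assert(rax.bits == 0x112233445566CCDDull);
  rax.write32(0x01020304u);
  assert(rax.bits == 0x0000000001020304ull);
}
```

In this model every write except `write32` reads the old value of `bits`, which mirrors why 8-bit and 16-bit destinations carry a dependency on the previous register contents.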

Revision Contents

Path	Size
llvm/trunk/lib/Target/X86/X86ISelLowering.cpp	63 lines
llvm/trunk/test/CodeGen/X86/ipra-reg-alias.ll	12 lines
llvm/trunk/test/CodeGen/X86/load-scalar-as-vector.ll	36 lines
llvm/trunk/test/CodeGen/X86/mul-constant-i8.ll	196 lines
llvm/trunk/test/CodeGen/X86/urem-i8-constant.ll	13 lines

Diff 175251

llvm/trunk/lib/Target/X86/X86ISelLowering.cpp

 SDValue X86TargetLowering::PerformDAGCombine(SDNode *N,
 [...]
   case X86ISD::PCMPGT: return combineVectorCompare(N, DAG, Subtarget);
   case X86ISD::PMULDQ:
   case X86ISD::PMULUDQ: return combinePMULDQ(N, DAG, DCI);
   }

   return SDValue();
 }

 /// Return true if the target has native support for the specified value type
 /// and it is 'desirable' to use the type for the given node type. e.g. On x86
 /// i16 is legal, but undesirable since i16 instruction encodings are longer and
 /// some i16 instructions are slow.
 bool X86TargetLowering::isTypeDesirableForOp(unsigned Opc, EVT VT) const {
   if (!isTypeLegal(VT))
     return false;

   // There are no vXi8 shifts.
   if (Opc == ISD::SHL && VT.isVector() && VT.getVectorElementType() == MVT::i8)
     return false;

-  if (VT != MVT::i16)
-    return true;
+  // 8-bit multiply is probably not much cheaper than 32-bit multiply, and
+  // we have specializations to turn 32-bit multiply into LEA or other ops.
+  // Also, see the comment in "IsDesirableToPromoteOp" - where we additionally
+  // check for a constant operand to the multiply.
+  if (Opc == ISD::MUL && VT == MVT::i8)
+    return false;

+  // i16 instruction encodings are longer and some i16 instructions are slow,
+  // so those are not desirable.
+  if (VT == MVT::i16) {
     switch (Opc) {
     default:
-      return true;
+      break;
     case ISD::LOAD:
     case ISD::SIGN_EXTEND:
     case ISD::ZERO_EXTEND:
     case ISD::ANY_EXTEND:
     case ISD::SHL:
     case ISD::SRL:
     case ISD::SUB:
     case ISD::ADD:
     case ISD::MUL:
     case ISD::AND:
     case ISD::OR:
     case ISD::XOR:
       return false;
     }
+  }
+
+  // Any legal type not explicitly accounted for above here is desirable.
+  return true;
 }

 SDValue X86TargetLowering::expandIndirectJTBranch(const SDLoc& dl,
                                                   SDValue Value, SDValue Addr,
                                                   SelectionDAG &DAG) const {
   const Module *M = DAG.getMachineFunction().getMMI().getModule();
   Metadata *IsCFProtectionSupported = M->getModuleFlag("cf-protection-branch");
   if (IsCFProtectionSupported) {
     // In case control-flow branch protection is enabled, we need to add
     // notrack prefix to the indirect branch.
     // In order to do that we create NT_BRIND SDNode.
     // Upon ISEL, the pattern will convert it to jmp with NoTrack prefix.
     return DAG.getNode(X86ISD::NT_BRIND, dl, MVT::Other, Value, Addr);
   }

   return TargetLowering::expandIndirectJTBranch(dl, Value, Addr, DAG);
 }

 /// This method query the target whether it is beneficial for dag combiner to
 /// promote the specified node. If true, it should return the desired promotion
 /// type by reference.
 bool X86TargetLowering::IsDesirableToPromoteOp(SDValue Op, EVT &PVT) const {
   EVT VT = Op.getValueType();
-  if (VT != MVT::i16)
+  bool Is8BitMulByConstant = VT == MVT::i8 && Op.getOpcode() == ISD::MUL &&
+                             isa<ConstantSDNode>(Op.getOperand(1));
+
+  // i16 is legal, but undesirable since i16 instruction encodings are longer
+  // and some i16 instructions are slow.
+  // 8-bit multiply-by-constant can usually be expanded to something cheaper
+  // using LEA and/or other ALU ops.
+  if (VT != MVT::i16 && !Is8BitMulByConstant)
     return false;

   auto IsFoldableRMW = [](SDValue Load, SDValue Op) {
     if (!Op.hasOneUse())
       return false;
     SDNode *User = *Op->use_begin();
     if (!ISD::isNormalStore(User))
       return false;
 [...]

llvm/trunk/test/CodeGen/X86/ipra-reg-alias.ll

 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc -mtriple=x86_64-- -enable-ipra -print-regusage -o - 2>&1 < %s | FileCheck %s --check-prefix=DEBUG
 ; RUN: llc -mtriple=x86_64-- -enable-ipra -o - < %s | FileCheck %s

-; Here only CL is clobbered so CH should not be clobbred, but CX, ECX and RCX
-; should be clobbered.
-; DEBUG: main Clobbered Registers: $ah $al $ax $cl $cx $eax $ecx $eflags $hax $rax $rcx
-
 define i8 @main(i8 %X) {
 ; CHECK-LABEL: main:
 ; CHECK: # %bb.0:
-; CHECK-NEXT: movl %edi, %eax
-; CHECK-NEXT: movb $5, %cl
-; CHECK-NEXT: # kill: def $al killed $al killed $eax
-; CHECK-NEXT: mulb %cl
+; CHECK-NEXT: # kill: def $edi killed $edi def $rdi
+; CHECK-NEXT: leal (%rdi,%rdi,4), %eax
 ; CHECK-NEXT: addb $5, %al
+; CHECK-NEXT: # kill: def $al killed $al killed $eax
 ; CHECK-NEXT: retq
   %inc = add i8 %X, 1
   %inc2 = mul i8 %inc, 5
   ret i8 %inc2
 }

llvm/trunk/test/CodeGen/X86/load-scalar-as-vector.ll

 [...]
 ; AVX-NEXT: retq
   %b = urem i64 42, %x
   %r = insertelement <2 x i64> undef, i64 %b, i32 0
   ret <2 x i64> %r
 }

 define <16 x i8> @urem_op1_constant(i8* %p) nounwind {
 ; SSE-LABEL: urem_op1_constant:
 ; SSE: # %bb.0:
-; SSE-NEXT: movb (%rdi), %cl
-; SSE-NEXT: movl %ecx, %eax
-; SSE-NEXT: shrb %al
+; SSE-NEXT: movb (%rdi), %al
+; SSE-NEXT: movl %eax, %ecx
+; SSE-NEXT: shrb %cl
+; SSE-NEXT: movzbl %cl, %ecx
+; SSE-NEXT: imull $49, %ecx, %ecx
+; SSE-NEXT: shrl $10, %ecx
+; SSE-NEXT: imull $42, %ecx, %ecx
+; SSE-NEXT: subb %cl, %al
 ; SSE-NEXT: movzbl %al, %eax
-; SSE-NEXT: imull $49, %eax, %eax
-; SSE-NEXT: shrl $10, %eax
-; SSE-NEXT: movb $42, %dl
-; SSE-NEXT: # kill: def $al killed $al killed $eax
-; SSE-NEXT: mulb %dl
-; SSE-NEXT: subb %al, %cl
-; SSE-NEXT: movzbl %cl, %eax
 ; SSE-NEXT: movd %eax, %xmm0
 ; SSE-NEXT: retq
 ;
 ; AVX-LABEL: urem_op1_constant:
 ; AVX: # %bb.0:
-; AVX-NEXT: movb (%rdi), %cl
-; AVX-NEXT: movl %ecx, %eax
-; AVX-NEXT: shrb %al
+; AVX-NEXT: movb (%rdi), %al
+; AVX-NEXT: movl %eax, %ecx
+; AVX-NEXT: shrb %cl
+; AVX-NEXT: movzbl %cl, %ecx
+; AVX-NEXT: imull $49, %ecx, %ecx
+; AVX-NEXT: shrl $10, %ecx
+; AVX-NEXT: imull $42, %ecx, %ecx
+; AVX-NEXT: subb %cl, %al
 ; AVX-NEXT: movzbl %al, %eax
-; AVX-NEXT: imull $49, %eax, %eax
-; AVX-NEXT: shrl $10, %eax
-; AVX-NEXT: movb $42, %dl
-; AVX-NEXT: # kill: def $al killed $al killed $eax
-; AVX-NEXT: mulb %dl
-; AVX-NEXT: subb %al, %cl
-; AVX-NEXT: movzbl %cl, %eax
 ; AVX-NEXT: vmovd %eax, %xmm0
 ; AVX-NEXT: retq
   %x = load i8, i8* %p
   %b = urem i8 %x, 42
   %r = insertelement <16 x i8> undef, i8 %b, i32 0
   ret <16 x i8> %r
 }

 [...]

llvm/trunk/test/CodeGen/X86/mul-constant-i8.ll

 [...]
 ; X64-NEXT: retq
   %m = mul i8 %x, 2
   ret i8 %m
 }

 define i8 @test_mul_by_3(i8 %x) {
 ; X64-LABEL: test_mul_by_3:
 ; X64: # %bb.0:
-; X64-NEXT: movl %edi, %eax
-; X64-NEXT: movb $3, %cl
+; X64-NEXT: # kill: def $edi killed $edi def $rdi
+; X64-NEXT: leal (%rdi,%rdi,2), %eax
 ; X64-NEXT: # kill: def $al killed $al killed $eax
-; X64-NEXT: mulb %cl
 ; X64-NEXT: retq
   %m = mul i8 %x, 3
   ret i8 %m
 }

 define i8 @test_mul_by_4(i8 %x) {
 ; X64-LABEL: test_mul_by_4:
 ; X64: # %bb.0:
 ; X64-NEXT: movl %edi, %eax
 ; X64-NEXT: shlb $2, %al
 ; X64-NEXT: # kill: def $al killed $al killed $eax
 ; X64-NEXT: retq
   %m = mul i8 %x, 4
   ret i8 %m
 }

 define i8 @test_mul_by_5(i8 %x) {
 ; X64-LABEL: test_mul_by_5:
 ; X64: # %bb.0:
-; X64-NEXT: movl %edi, %eax
-; X64-NEXT: movb $5, %cl
+; X64-NEXT: # kill: def $edi killed $edi def $rdi
+; X64-NEXT: leal (%rdi,%rdi,4), %eax
 ; X64-NEXT: # kill: def $al killed $al killed $eax
-; X64-NEXT: mulb %cl
 ; X64-NEXT: retq
   %m = mul i8 %x, 5
   ret i8 %m
 }

 define i8 @test_mul_by_6(i8 %x) {
 ; X64-LABEL: test_mul_by_6:
 ; X64: # %bb.0:
-; X64-NEXT: movl %edi, %eax
-; X64-NEXT: movb $6, %cl
+; X64-NEXT: # kill: def $edi killed $edi def $rdi
+; X64-NEXT: addl %edi, %edi
+; X64-NEXT: leal (%rdi,%rdi,2), %eax
 ; X64-NEXT: # kill: def $al killed $al killed $eax
-; X64-NEXT: mulb %cl
 ; X64-NEXT: retq
   %m = mul i8 %x, 6
   ret i8 %m
 }

 define i8 @test_mul_by_7(i8 %x) {
 ; X64-LABEL: test_mul_by_7:
 ; X64: # %bb.0:
-; X64-NEXT: movl %edi, %eax
-; X64-NEXT: movb $7, %cl
+; X64-NEXT: # kill: def $edi killed $edi def $rdi
+; X64-NEXT: leal (,%rdi,8), %eax
+; X64-NEXT: subl %edi, %eax
 ; X64-NEXT: # kill: def $al killed $al killed $eax
-; X64-NEXT: mulb %cl
 ; X64-NEXT: retq
   %m = mul i8 %x, 7
   ret i8 %m
 }

 define i8 @test_mul_by_8(i8 %x) {
 ; X64-LABEL: test_mul_by_8:
 ; X64: # %bb.0:
 ; X64-NEXT: movl %edi, %eax
 ; X64-NEXT: shlb $3, %al
 ; X64-NEXT: # kill: def $al killed $al killed $eax
 ; X64-NEXT: retq
   %m = mul i8 %x, 8
   ret i8 %m
 }

 define i8 @test_mul_by_9(i8 %x) {
 ; X64-LABEL: test_mul_by_9:
 ; X64: # %bb.0:
-; X64-NEXT: movl %edi, %eax
-; X64-NEXT: movb $9, %cl
+; X64-NEXT: # kill: def $edi killed $edi def $rdi
+; X64-NEXT: leal (%rdi,%rdi,8), %eax
 ; X64-NEXT: # kill: def $al killed $al killed $eax
-; X64-NEXT: mulb %cl
 ; X64-NEXT: retq
   %m = mul i8 %x, 9
   ret i8 %m
 }

 define i8 @test_mul_by_10(i8 %x) {
 ; X64-LABEL: test_mul_by_10:
 ; X64: # %bb.0:
-; X64-NEXT: movl %edi, %eax
-; X64-NEXT: movb $10, %cl
+; X64-NEXT: # kill: def $edi killed $edi def $rdi
+; X64-NEXT: addl %edi, %edi
+; X64-NEXT: leal (%rdi,%rdi,4), %eax
 ; X64-NEXT: # kill: def $al killed $al killed $eax
-; X64-NEXT: mulb %cl
 ; X64-NEXT: retq
   %m = mul i8 %x, 10
   ret i8 %m
 }

 define i8 @test_mul_by_11(i8 %x) {
 ; X64-LABEL: test_mul_by_11:
 ; X64: # %bb.0:
-; X64-NEXT: movl %edi, %eax
-; X64-NEXT: movb $11, %cl
+; X64-NEXT: # kill: def $edi killed $edi def $rdi
+; X64-NEXT: leal (%rdi,%rdi,4), %eax
+; X64-NEXT: leal (%rdi,%rax,2), %eax
 ; X64-NEXT: # kill: def $al killed $al killed $eax
-; X64-NEXT: mulb %cl
 ; X64-NEXT: retq
   %m = mul i8 %x, 11
   ret i8 %m
 }

 define i8 @test_mul_by_12(i8 %x) {
 ; X64-LABEL: test_mul_by_12:
 ; X64: # %bb.0:
-; X64-NEXT: movl %edi, %eax
-; X64-NEXT: movb $12, %cl
+; X64-NEXT: # kill: def $edi killed $edi def $rdi
+; X64-NEXT: shll $2, %edi
+; X64-NEXT: leal (%rdi,%rdi,2), %eax
 ; X64-NEXT: # kill: def $al killed $al killed $eax
-; X64-NEXT: mulb %cl
 ; X64-NEXT: retq
   %m = mul i8 %x, 12
   ret i8 %m
 }

 define i8 @test_mul_by_13(i8 %x) {
 ; X64-LABEL: test_mul_by_13:
 ; X64: # %bb.0:
-; X64-NEXT: movl %edi, %eax
-; X64-NEXT: movb $13, %cl
+; X64-NEXT: # kill: def $edi killed $edi def $rdi
+; X64-NEXT: leal (%rdi,%rdi,2), %eax
+; X64-NEXT: leal (%rdi,%rax,4), %eax
 ; X64-NEXT: # kill: def $al killed $al killed $eax
-; X64-NEXT: mulb %cl
 ; X64-NEXT: retq
   %m = mul i8 %x, 13
   ret i8 %m
 }

 define i8 @test_mul_by_14(i8 %x) {
 ; X64-LABEL: test_mul_by_14:
 ; X64: # %bb.0:
 ; X64-NEXT: movl %edi, %eax
-; X64-NEXT: movb $14, %cl
+; X64-NEXT: shll $4, %eax
+; X64-NEXT: subl %edi, %eax
+; X64-NEXT: subl %edi, %eax
 ; X64-NEXT: # kill: def $al killed $al killed $eax
-; X64-NEXT: mulb %cl
 ; X64-NEXT: retq
   %m = mul i8 %x, 14
   ret i8 %m
 }

 define i8 @test_mul_by_15(i8 %x) {
 ; X64-LABEL: test_mul_by_15:
 ; X64: # %bb.0:
-; X64-NEXT: movl %edi, %eax
-; X64-NEXT: movb $15, %cl
+; X64-NEXT: # kill: def $edi killed $edi def $rdi
+; X64-NEXT: leal (%rdi,%rdi,4), %eax
+; X64-NEXT: leal (%rax,%rax,2), %eax
 ; X64-NEXT: # kill: def $al killed $al killed $eax
-; X64-NEXT: mulb %cl
 ; X64-NEXT: retq
   %m = mul i8 %x, 15
   ret i8 %m
 }

 define i8 @test_mul_by_16(i8 %x) {
 ; X64-LABEL: test_mul_by_16:
 ; X64: # %bb.0:
 ; X64-NEXT: movl %edi, %eax
 ; X64-NEXT: shlb $4, %al
 ; X64-NEXT: # kill: def $al killed $al killed $eax
 ; X64-NEXT: retq
   %m = mul i8 %x, 16
   ret i8 %m
 }

 define i8 @test_mul_by_17(i8 %x) {
 ; X64-LABEL: test_mul_by_17:
 ; X64: # %bb.0:
+; X64-NEXT: # kill: def $edi killed $edi def $rdi
 ; X64-NEXT: movl %edi, %eax
-; X64-NEXT: movb $17, %cl
+; X64-NEXT: shll $4, %eax
+; X64-NEXT: leal (%rax,%rdi), %eax
 ; X64-NEXT: # kill: def $al killed $al killed $eax
-; X64-NEXT: mulb %cl
 ; X64-NEXT: retq
   %m = mul i8 %x, 17
   ret i8 %m
 }

 define i8 @test_mul_by_18(i8 %x) {
 ; X64-LABEL: test_mul_by_18:
 ; X64: # %bb.0:
-; X64-NEXT: movl %edi, %eax
-; X64-NEXT: movb $18, %cl
+; X64-NEXT: # kill: def $edi killed $edi def $rdi
+; X64-NEXT: addl %edi, %edi
+; X64-NEXT: leal (%rdi,%rdi,8), %eax
 ; X64-NEXT: # kill: def $al killed $al killed $eax
-; X64-NEXT: mulb %cl
 ; X64-NEXT: retq
   %m = mul i8 %x, 18
   ret i8 %m
 }

 define i8 @test_mul_by_19(i8 %x) {
 ; X64-LABEL: test_mul_by_19:
 ; X64: # %bb.0:
-; X64-NEXT: movl %edi, %eax
-; X64-NEXT: movb $19, %cl
+; X64-NEXT: # kill: def $edi killed $edi def $rdi
+; X64-NEXT: leal (%rdi,%rdi,8), %eax
+; X64-NEXT: leal (%rdi,%rax,2), %eax
 ; X64-NEXT: # kill: def $al killed $al killed $eax
-; X64-NEXT: mulb %cl
 ; X64-NEXT: retq
   %m = mul i8 %x, 19
   ret i8 %m
 }

 define i8 @test_mul_by_20(i8 %x) {
 ; X64-LABEL: test_mul_by_20:
 ; X64: # %bb.0:
-; X64-NEXT: movl %edi, %eax
-; X64-NEXT: movb $20, %cl
+; X64-NEXT: # kill: def $edi killed $edi def $rdi
+; X64-NEXT: shll $2, %edi
+; X64-NEXT: leal (%rdi,%rdi,4), %eax
 ; X64-NEXT: # kill: def $al killed $al killed $eax
-; X64-NEXT: mulb %cl
 ; X64-NEXT: retq
   %m = mul i8 %x, 20
   ret i8 %m
 }

 define i8 @test_mul_by_21(i8 %x) {
 ; X64-LABEL: test_mul_by_21:
 ; X64: # %bb.0:
-; X64-NEXT: movl %edi, %eax
-; X64-NEXT: movb $21, %cl
+; X64-NEXT: # kill: def $edi killed $edi def $rdi
+; X64-NEXT: leal (%rdi,%rdi,4), %eax
+; X64-NEXT: leal (%rdi,%rax,4), %eax
 ; X64-NEXT: # kill: def $al killed $al killed $eax
-; X64-NEXT: mulb %cl
 ; X64-NEXT: retq
   %m = mul i8 %x, 21
   ret i8 %m
 }

 define i8 @test_mul_by_22(i8 %x) {
 ; X64-LABEL: test_mul_by_22:
 ; X64: # %bb.0:
-; X64-NEXT: movl %edi, %eax
-; X64-NEXT: movb $22, %cl
+; X64-NEXT: # kill: def $edi killed $edi def $rdi
+; X64-NEXT: leal (%rdi,%rdi,4), %eax
+; X64-NEXT: leal (%rdi,%rax,4), %eax
+; X64-NEXT: addl %edi, %eax
 ; X64-NEXT: # kill: def $al killed $al killed $eax
-; X64-NEXT: mulb %cl
 ; X64-NEXT: retq
   %m = mul i8 %x, 22
   ret i8 %m
 }

 define i8 @test_mul_by_23(i8 %x) {
 ; X64-LABEL: test_mul_by_23:
 ; X64: # %bb.0:
-; X64-NEXT: movl %edi, %eax
-; X64-NEXT: movb $23, %cl
+; X64-NEXT: # kill: def $edi killed $edi def $rdi
+; X64-NEXT: leal (%rdi,%rdi,2), %eax
+; X64-NEXT: shll $3, %eax
+; X64-NEXT: subl %edi, %eax
 ; X64-NEXT: # kill: def $al killed $al killed $eax
-; X64-NEXT: mulb %cl
 ; X64-NEXT: retq
   %m = mul i8 %x, 23
   ret i8 %m
 }

 define i8 @test_mul_by_24(i8 %x) {
 ; X64-LABEL: test_mul_by_24:
 ; X64: # %bb.0:
-; X64-NEXT: movl %edi, %eax
-; X64-NEXT: movb $24, %cl
+; X64-NEXT: # kill: def $edi killed $edi def $rdi
+; X64-NEXT: shll $3, %edi
+; X64-NEXT: leal (%rdi,%rdi,2), %eax
 ; X64-NEXT: # kill: def $al killed $al killed $eax
-; X64-NEXT: mulb %cl
 ; X64-NEXT: retq
   %m = mul i8 %x, 24
   ret i8 %m
 }

 define i8 @test_mul_by_25(i8 %x) {
 ; X64-LABEL: test_mul_by_25:
 ; X64: # %bb.0:
-; X64-NEXT: movl %edi, %eax
-; X64-NEXT: movb $25, %cl
+; X64-NEXT: # kill: def $edi killed $edi def $rdi
+; X64-NEXT: leal (%rdi,%rdi,4), %eax
+; X64-NEXT: leal (%rax,%rax,4), %eax
 ; X64-NEXT: # kill: def $al killed $al killed $eax
-; X64-NEXT: mulb %cl
 ; X64-NEXT: retq
   %m = mul i8 %x, 25
   ret i8 %m
 }

 define i8 @test_mul_by_26(i8 %x) {
 ; X64-LABEL: test_mul_by_26:
 ; X64: # %bb.0:
-; X64-NEXT: movl %edi, %eax
-; X64-NEXT: movb $26, %cl
+; X64-NEXT: # kill: def $edi killed $edi def $rdi
+; X64-NEXT: leal (%rdi,%rdi,4), %eax
+; X64-NEXT: leal (%rax,%rax,4), %eax
+; X64-NEXT: addl %edi, %eax
 ; X64-NEXT: # kill: def $al killed $al killed $eax
-; X64-NEXT: mulb %cl
 ; X64-NEXT: retq
   %m = mul i8 %x, 26
   ret i8 %m
 }

 define i8 @test_mul_by_27(i8 %x) {
 ; X64-LABEL: test_mul_by_27:
 ; X64: # %bb.0:
-; X64-NEXT: movl %edi, %eax
-; X64-NEXT: movb $27, %cl
+; X64-NEXT: # kill: def $edi killed $edi def $rdi
+; X64-NEXT: leal (%rdi,%rdi,8), %eax
+; X64-NEXT: leal (%rax,%rax,2), %eax
 ; X64-NEXT: # kill: def $al killed $al killed $eax
-; X64-NEXT: mulb %cl
 ; X64-NEXT: retq
   %m = mul i8 %x, 27
   ret i8 %m
 }

 define i8 @test_mul_by_28(i8 %x) {
 ; X64-LABEL: test_mul_by_28:
 ; X64: # %bb.0:
-; X64-NEXT: movl %edi, %eax
-; X64-NEXT: movb $28, %cl
+; X64-NEXT: # kill: def $edi killed $edi def $rdi
+; X64-NEXT: leal (%rdi,%rdi,8), %eax
+; X64-NEXT: leal (%rax,%rax,2), %eax
+; X64-NEXT: addl %edi, %eax
 ; X64-NEXT: # kill: def $al killed $al killed $eax
-; X64-NEXT: mulb %cl
 ; X64-NEXT: retq
   %m = mul i8 %x, 28
   ret i8 %m
 }

 define i8 @test_mul_by_29(i8 %x) {
 ; X64-LABEL: test_mul_by_29:
 ; X64: # %bb.0:
-; X64-NEXT: movl %edi, %eax
-; X64-NEXT: movb $29, %cl
+; X64-NEXT: # kill: def $edi killed $edi def $rdi
+; X64-NEXT: leal (%rdi,%rdi,8), %eax
+; X64-NEXT: leal (%rax,%rax,2), %eax
+; X64-NEXT: addl %edi, %eax
+; X64-NEXT: addl %edi, %eax
 ; X64-NEXT: # kill: def $al killed $al killed $eax
-; X64-NEXT: mulb %cl
 ; X64-NEXT: retq
   %m = mul i8 %x, 29
   ret i8 %m
 }

 define i8 @test_mul_by_30(i8 %x) {
 ; X64-LABEL: test_mul_by_30:
 ; X64: # %bb.0:
 ; X64-NEXT: movl %edi, %eax
-; X64-NEXT: movb $30, %cl
+; X64-NEXT: shll $5, %eax
+; X64-NEXT: subl %edi, %eax
+; X64-NEXT: subl %edi, %eax
 ; X64-NEXT: # kill: def $al killed $al killed $eax
-; X64-NEXT: mulb %cl
 ; X64-NEXT: retq
   %m = mul i8 %x, 30
   ret i8 %m
 }

 define i8 @test_mul_by_31(i8 %x) {
 ; X64-LABEL: test_mul_by_31:
 ; X64: # %bb.0:
 ; X64-NEXT: movl %edi, %eax
-; X64-NEXT: movb $31, %cl
+; X64-NEXT: shll $5, %eax
+; X64-NEXT: subl %edi, %eax
 ; X64-NEXT: # kill: def $al killed $al killed $eax
-; X64-NEXT: mulb %cl
 ; X64-NEXT: retq
   %m = mul i8 %x, 31
   ret i8 %m
 }

 define i8 @test_mul_by_32(i8 %x) {
 ; X64-LABEL: test_mul_by_32:
 ; X64: # %bb.0:
 ; X64-NEXT: movl %edi, %eax
 ; X64-NEXT: shlb $5, %al
 ; X64-NEXT: # kill: def $al killed $al killed $eax
 ; X64-NEXT: retq
   %m = mul i8 %x, 32
   ret i8 %m
 }

 define i8 @test_mul_by_37(i8 %x) {
 ; X64-LABEL: test_mul_by_37:
 ; X64: # %bb.0:
-; X64-NEXT: movl %edi, %eax
-; X64-NEXT: movb $37, %cl
+; X64-NEXT: # kill: def $edi killed $edi def $rdi
+; X64-NEXT: leal (%rdi,%rdi,8), %eax
+; X64-NEXT: leal (%rdi,%rax,4), %eax
 ; X64-NEXT: # kill: def $al killed $al killed $eax
-; X64-NEXT: mulb %cl
 ; X64-NEXT: retq
   %m = mul i8 %x, 37
   ret i8 %m
 }

 define i8 @test_mul_by_41(i8 %x) {
 ; X64-LABEL: test_mul_by_41:
 ; X64: # %bb.0:
-; X64-NEXT: movl %edi, %eax
-; X64-NEXT: movb $41, %cl
+; X64-NEXT: # kill: def $edi killed $edi def $rdi
+; X64-NEXT: leal (%rdi,%rdi,4), %eax
+; X64-NEXT: leal (%rdi,%rax,8), %eax
 ; X64-NEXT: # kill: def $al killed $al killed $eax
-; X64-NEXT: mulb %cl
 ; X64-NEXT: retq
   %m = mul i8 %x, 41
   ret i8 %m
 }

 define i8 @test_mul_by_62(i8 %x) {
 ; X64-LABEL: test_mul_by_62:
 ; X64: # %bb.0:
 ; X64-NEXT: movl %edi, %eax
-; X64-NEXT: movb $62, %cl
+; X64-NEXT: shll $6, %eax
+; X64-NEXT: subl %edi, %eax
+; X64-NEXT: subl %edi, %eax
 ; X64-NEXT: # kill: def $al killed $al killed $eax
-; X64-NEXT: mulb %cl
 ; X64-NEXT: retq
   %m = mul i8 %x, 62
   ret i8 %m
 }

 define i8 @test_mul_by_66(i8 %x) {
 ; X64-LABEL: test_mul_by_66:
 ; X64: # %bb.0:
+; X64-NEXT: # kill: def $edi killed $edi def $rdi
	; X64-NEXT: movl %edi, %eax			; X64-NEXT: movl %edi, %eax
	; X64-NEXT: movb $66, %cl			; X64-NEXT: shll $6, %eax
				; X64-NEXT: leal (%rax,%rdi,2), %eax
	; X64-NEXT: # kill: def $al killed $al killed $eax			; X64-NEXT: # kill: def $al killed $al killed $eax
	; X64-NEXT: mulb %cl
	; X64-NEXT: retq			; X64-NEXT: retq
	%m = mul i8 %x, 66			%m = mul i8 %x, 66
	ret i8 %m			ret i8 %m
	}			}

	define i8 @test_mul_by_73(i8 %x) {			define i8 @test_mul_by_73(i8 %x) {
	; X64-LABEL: test_mul_by_73:			; X64-LABEL: test_mul_by_73:
	; X64: # %bb.0:			; X64: # %bb.0:
	; X64-NEXT: movl %edi, %eax			; X64-NEXT: # kill: def $edi killed $edi def $rdi
	; X64-NEXT: movb $73, %cl			; X64-NEXT: leal (%rdi,%rdi,8), %eax
				; X64-NEXT: leal (%rdi,%rax,8), %eax
	; X64-NEXT: # kill: def $al killed $al killed $eax			; X64-NEXT: # kill: def $al killed $al killed $eax
	; X64-NEXT: mulb %cl
	; X64-NEXT: retq			; X64-NEXT: retq
	%m = mul i8 %x, 73			%m = mul i8 %x, 73
	ret i8 %m			ret i8 %m
	}			}

	define i8 @test_mul_by_520(i8 %x) {			define i8 @test_mul_by_520(i8 %x) {
	; X64-LABEL: test_mul_by_520:			; X64-LABEL: test_mul_by_520:
	; X64: # %bb.0:			; X64: # %bb.0:
	; X64-NEXT: movl %edi, %eax			; X64-NEXT: movl %edi, %eax
	; X64-NEXT: shlb $3, %al			; X64-NEXT: shlb $3, %al
	; X64-NEXT: # kill: def $al killed $al killed $eax			; X64-NEXT: # kill: def $al killed $al killed $eax
	; X64-NEXT: retq			; X64-NEXT: retq
	%m = mul i8 %x, 520			%m = mul i8 %x, 520
	ret i8 %m			ret i8 %m
	}			}

	define i8 @test_mul_by_neg10(i8 %x) {			define i8 @test_mul_by_neg10(i8 %x) {
	; X64-LABEL: test_mul_by_neg10:			; X64-LABEL: test_mul_by_neg10:
	; X64: # %bb.0:			; X64: # %bb.0:
	; X64-NEXT: movl %edi, %eax			; X64-NEXT: # kill: def $edi killed $edi def $rdi
	; X64-NEXT: movb $-10, %cl			; X64-NEXT: addl %edi, %edi
				; X64-NEXT: leal (%rdi,%rdi,4), %eax
				; X64-NEXT: negl %eax
	; X64-NEXT: # kill: def $al killed $al killed $eax			; X64-NEXT: # kill: def $al killed $al killed $eax
	; X64-NEXT: mulb %cl
	; X64-NEXT: retq			; X64-NEXT: retq
	%m = mul i8 %x, -10			%m = mul i8 %x, -10
	ret i8 %m			ret i8 %m
	}			}

	define i8 @test_mul_by_neg36(i8 %x) {			define i8 @test_mul_by_neg36(i8 %x) {
	; X64-LABEL: test_mul_by_neg36:			; X64-LABEL: test_mul_by_neg36:
	; X64: # %bb.0:			; X64: # %bb.0:
	; X64-NEXT: movl %edi, %eax			; X64-NEXT: # kill: def $edi killed $edi def $rdi
	; X64-NEXT: movb $-36, %cl			; X64-NEXT: shll $2, %edi
				; X64-NEXT: leal (%rdi,%rdi,8), %eax
				; X64-NEXT: negl %eax
	; X64-NEXT: # kill: def $al killed $al killed $eax			; X64-NEXT: # kill: def $al killed $al killed $eax
	; X64-NEXT: mulb %cl
	; X64-NEXT: retq			; X64-NEXT: retq
	%m = mul i8 %x, -36			%m = mul i8 %x, -36
	ret i8 %m			ret i8 %m
	}			}
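
The new (right-hand) check lines above replace `mulb` with shift/LEA/add sequences on 32-bit registers. As a sanity check, the following Python sketch models a handful of those sequences and compares their low 8 bits against a plain i8 multiply. It is an illustration only and not part of the patch: the helper names (`lea`, `SEQUENCES`, `mismatches`) are hypothetical, registers are modeled as 32-bit integers, and the upper 24 bits of the incoming argument are filled with arbitrary values on the assumption that the i8 argument is any-extended.

```python
MASK32 = 0xFFFFFFFF

def lea(base, index, scale):
    """Model `leal (base,index,scale), dst` with 32-bit wraparound."""
    return (base + index * scale) & MASK32

def add(a, b):
    return (a + b) & MASK32

def sub(a, b):
    return (a - b) & MASK32

def shl(v, n):
    return (v << n) & MASK32

def neg(v):
    return (-v) & MASK32

def mul29(edi):
    eax = lea(edi, edi, 8)      # 9x
    eax = lea(eax, eax, 2)      # 27x
    eax = add(eax, edi)         # 28x
    return add(eax, edi)        # 29x

def mul30(edi):
    eax = shl(edi, 5)           # 32x
    eax = sub(eax, edi)         # 31x
    return sub(eax, edi)        # 30x

def mul37(edi):
    eax = lea(edi, edi, 8)      # 9x
    return lea(edi, eax, 4)     # x + 36x

def mul66(edi):
    eax = shl(edi, 6)           # 64x
    return lea(eax, edi, 2)     # 64x + 2x

def mul_neg10(edi):
    edi = add(edi, edi)         # 2x
    eax = lea(edi, edi, 4)      # 10x
    return neg(eax)             # -10x

def mul_neg36(edi):
    edi = shl(edi, 2)           # 4x
    eax = lea(edi, edi, 8)      # 36x
    return neg(eax)             # -36x

# Constant -> modeled instruction sequence (taken from the new check lines).
SEQUENCES = {29: mul29, 30: mul30, 37: mul37, 66: mul66,
             -10: mul_neg10, -36: mul_neg36}

def mismatches():
    """Return every (constant, x, garbage) where the model disagrees with i8 mul."""
    bad = []
    for c, seq in SEQUENCES.items():
        for x in range(256):
            for garbage in (0, 0xABCDEF):
                edi = (garbage << 8) | x       # upper bits are don't-care
                if seq(edi) & 0xFF != (x * c) & 0xFF:
                    bad.append((c, x, garbage))
    return bad

if __name__ == "__main__":
    print("mismatches:", len(mismatches()))
```

The check passes with garbage in the upper bits because the low 8 bits of a sum, difference, shift-left, or negation depend only on the low 8 bits of the inputs, which is why only `%al` is read at the end of each sequence.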

llvm/trunk/test/CodeGen/X86/urem-i8-constant.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc < %s -mtriple=i386-unknown-unknown | FileCheck %s			; RUN: llc < %s -mtriple=i386-unknown-unknown | FileCheck %s

	; computeKnownBits determines that we don't need a mask op that is required in the general case.			; computeKnownBits determines that we don't need a mask op that is required in the general case.

	define i8 @foo(i8 %tmp325) {			define i8 @foo(i8 %tmp325) {
	; CHECK-LABEL: foo:			; CHECK-LABEL: foo:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-NEXT: movzbl {{[0-9]+}}(%esp), %ecx			; CHECK-NEXT: movzbl {{[0-9]+}}(%esp), %eax
	; CHECK-NEXT: imull $111, %ecx, %eax			; CHECK-NEXT: imull $111, %eax, %ecx
	; CHECK-NEXT: shrl $12, %eax			; CHECK-NEXT: shrl $12, %ecx
	; CHECK-NEXT: movb $37, %dl			; CHECK-NEXT: leal (%ecx,%ecx,8), %edx
				; CHECK-NEXT: leal (%ecx,%edx,4), %ecx
				; CHECK-NEXT: subb %cl, %al
	; CHECK-NEXT: # kill: def $al killed $al killed $eax			; CHECK-NEXT: # kill: def $al killed $al killed $eax
	; CHECK-NEXT: mulb %dl
	; CHECK-NEXT: subb %al, %cl
	; CHECK-NEXT: movl %ecx, %eax
	; CHECK-NEXT: retl			; CHECK-NEXT: retl
	%t546 = urem i8 %tmp325, 37			%t546 = urem i8 %tmp325, 37
	ret i8 %t546			ret i8 %t546
	}			}
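
The new `urem i8 %tmp325, 37` sequence computes the quotient with a multiply by 111 and a shift by 12, then rebuilds `37 * q` with two LEAs instead of `mulb`. The Python sketch below models that sequence step by step and compares it with `x % 37` for every 8-bit input. The function name `urem37_new` is hypothetical, and the register modeling is an assumption made for illustration, not something taken from the patch.

```python
MASK32 = 0xFFFFFFFF

def urem37_new(x):
    """Model the new CHECK sequence for `urem i8 x, 37`."""
    eax = x & 0xFF                    # movzbl: zero-extend the i8 argument
    ecx = (eax * 111) & MASK32        # imull $111, %eax, %ecx
    ecx >>= 12                        # shrl $12, %ecx   -> q = x / 37
    edx = (ecx + ecx * 8) & MASK32    # leal (%ecx,%ecx,8), %edx -> 9q
    ecx = (ecx + edx * 4) & MASK32    # leal (%ecx,%edx,4), %ecx -> 37q
    return (eax - ecx) & 0xFF         # subb %cl, %al    -> x - 37q

if __name__ == "__main__":
    ok = all(urem37_new(x) == x % 37 for x in range(256))
    print("matches x % 37 for all i8 inputs:", ok)
```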