This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contents

-
llvm/
-
lib/Target/X86/
-
Target/
-
X86/
-
X86ISelLowering.h
27/29
X86ISelLowering.cpp
-
test/CodeGen/X86/
-
CodeGen/
-
X86/
-
avx512-calling-conv.ll
5
bitreverse.ll
1/1
bmi-x86_64.ll
-
bmi.ll
5/9
btc_bts_btr.ll
3
combine-bitreverse.ll
-
combine-rotates.ll
1/2
const-shift-of-constmasked.ll
1
const-shift-with-and.ll
-
fold-and-shift.ll
-
limited-prec.ll
-
movmsk-cmp.ll
-
pr15267.ll
-
pr26350.ll
-
pr32282.ll
-
pr45995.ll
-
pull-binop-through-shift.ll
-
rev16.ll
-
rotate-extract.ll
-
selectcc-to-shiftand.ll
-
setcc.ll
-
shift-amount-mod.ll
-
shift-mask.ll
-
sttni.ll
-
tbm_patterns.ll
-
udiv_fix.ll
-
udiv_fix_sat.ll
-
urem-seteq-illegal-types.ll
-
vselect.ll
-
zext-logicop-shift-load.ll

Differential D141653

[X86] Improve instruction ordering of constant `srl/shl` with `and` to get better and-masks
Accepted · Public

Authored by goldstein.w.n on Jan 12 2023, 9:34 PM.


Details

Reviewers

pengfei
RKSimon
t.p.northover

Summary

This moves some logic from `combineShiftLeft` and generalizes it to
work for `shl` and `and`. It also improves the non-mask constant
generation (reducing based on `getSignificantBits` instead of
`countTrailingOnes`) so that we can catch some cases where an imm64 shrinks
to a sign-extended imm32/imm8, or an imm32 to a sign-extended imm8.
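To make the immediate-shrinking point concrete, here is a minimal standalone sketch; it is not code from the patch. `significantBits` and `classifyImm` are hypothetical helpers: the first is assumed to mirror what `APInt::getSignificantBits` reports for a 64-bit value, and the second assumes the x86 rule that `and` takes a sign-extended imm8 or imm32 but no imm64. The mask `0x7FFFFFFF00` is made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for APInt::getSignificantBits() on a 64-bit value:
// the minimum number of bits needed to hold V as a two's-complement integer.
inline unsigned significantBits(int64_t V) {
  // For negative values, count the bits of ~V (the non-sign payload).
  uint64_t Magnitude = static_cast<uint64_t>(V < 0 ? ~V : V);
  unsigned Bits = 1; // the sign bit is always needed
  for (; Magnitude != 0; Magnitude >>= 1)
    ++Bits;
  return Bits;
}

enum class ImmKind { Imm8, Imm32, Imm64 };

// Hypothetical classifier: smallest sign-extended immediate that can hold V.
inline ImmKind classifyImm(int64_t V) {
  unsigned Bits = significantBits(V);
  if (Bits <= 8)
    return ImmKind::Imm8;
  if (Bits <= 32)
    return ImmKind::Imm32;
  return ImmKind::Imm64;
}

// Worked example: moving the mask across a shift by 8 shrinks the constant.
inline void checkWorkedExample() {
  assert(classifyImm(0x7FFFFFFF00LL) == ImmKind::Imm64);
  assert(classifyImm(0x7FFFFFFF00LL >> 8) == ImmKind::Imm32);
}
```

Under these assumptions, `srl (and X, 0x7FFFFFFF00), 8` needs a 64-bit constant for the mask, while the reordered `and (srl X, 8), 0x7FFFFFFF` fits a sign-extended imm32. A constant such as `0xFFFFFFFFFFFFFF00` has no trailing ones at all, yet it has only 9 significant bits, which is the kind of case a trailing-ones test would not see.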

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

goldstein.w.n created this revision. Jan 12 2023, 9:34 PM

Herald added a project: Restricted Project. Jan 12 2023, 9:34 PM

Herald added subscribers: pengfei, hiraditya.

goldstein.w.n requested review of this revision. Jan 12 2023, 9:34 PM

Herald added a project: Restricted Project. Jan 12 2023, 9:34 PM

Herald added a subscriber: llvm-commits.

goldstein.w.n added a parent revision: D141652: [X86] Add additional tests for constant `srl/shl` + `and`; NFC. Jan 12 2023, 9:34 PM

Harbormaster completed remote builds in B207539: Diff 488861. Jan 12 2023, 9:35 PM

goldstein.w.n added reviewers: pengfei, RKSimon. Jan 12 2023, 9:35 PM

pengfei added inline comments. Jan 13 2023, 6:23 PM

llvm/test/CodeGen/X86/bitreverse.ll
538	IIRC, LEA is expensive, so this looks like a good deal.
llvm/test/CodeGen/X86/bmi-x86_64.ll
131–132	Remove `BEXTR-SLOW` to eliminate the message.
llvm/test/CodeGen/X86/btc_bts_btr.ll
988	One more `andb`
1011	ditto.
llvm/test/CodeGen/X86/combine-bitreverse.ll
238	One more `shrl`?
289	ditto.
327	ditto.
llvm/test/CodeGen/X86/const-shift-of-constmasked.ll
656	You can add option `--no_x86_scrub_sp` when updating test.
1195	ditto.
llvm/test/CodeGen/X86/const-shift-with-and.ll
5	What are these tests used for? They are not changed in this patch. Besides, do you still need a new test case for this patch? Are these covered by the other changes already?

goldstein.w.n added inline comments. Jan 13 2023, 10:04 PM

llvm/test/CodeGen/X86/btc_bts_btr.ll
988	> One more `andb`
	Sorry, didn't see this earlier. I think this breaks a few combine patterns that probably relied on semi-canonicalization of (and (shift x, y), z) instead of the other way around. Fixing the cases caught here is doable, but is it possible to maybe move this to later in the process? Currently it's guarded behind AfterLegalize (for the exact reason of not breaking other patterns), but is there a higher level or different pass this might be better placed in?

craig.topper added a subscriber: craig.topper. Jan 13 2023, 10:13 PM

craig.topper added inline comments.

llvm/test/CodeGen/X86/bitreverse.ll
538	I thought LEA was only expensive if it uses 3 sources.

craig.topper added inline comments. Jan 13 2023, 10:21 PM

llvm/lib/Target/X86/X86ISelLowering.cpp
47494	You could check `!VT.isScalarInteger()` instead.
47554	Drop else after return.
56345	Why checking both operands of the shift? Isn't only the first one interesting? An AND on the shift amount is very different.

pengfei added inline comments. Jan 14 2023, 2:05 AM

llvm/lib/Target/X86/X86ISelLowering.cpp
47679	Add one space after `if`.
49129	ditto.
llvm/test/CodeGen/X86/bitreverse.ll
538	You are right. SOM says only 3 operand LEA is slow. But I found MCA cannot reflect the difference. So this is yet another regression. old: https://godbolt.org/z/dPW93v8zT new: https://godbolt.org/z/hrEovKW4f

pengfei added inline comments. Jan 14 2023, 2:05 AM

llvm/lib/Target/X86/X86ISelLowering.cpp
47438	results
47438	One more space. You may run clang-format to help solving the format nits.
47533	Use curly braces for consistency.

craig.topper added inline comments. Jan 14 2023, 10:11 AM

llvm/test/CodeGen/X86/bitreverse.ll
538	Looks like the scheduler models make the distinction for AMD CPUs but not Intel. Small snippet of the predicate check:
	X86SchedPredicates.td 61:def IsThreeOperandsLEAFn :
	X86ScheduleBdVer2.td 560: IsThreeOperandsLEAFn,
	X86ScheduleBtVer2.td 945: IsThreeOperandsLEAFn,
	X86ScheduleZnver3.td 590: IsThreeOperandsLEAFn,

pengfei added inline comments. Jan 14 2023, 11:00 PM

llvm/test/CodeGen/X86/bitreverse.ll
538	Thanks for the information! We may need to do the same thing for Intel. Filed #60043 for it.

Fix many a nit

goldstein.w.n marked 7 inline comments as done. Jan 15 2023, 3:10 PM

goldstein.w.n added inline comments.

llvm/test/CodeGen/X86/btc_bts_btr.ll
988	Do you know if there is a later stage I can move this transform to so that it won't break any other patterns?

Harbormaster completed remote builds in B207941: Diff 489399. Jan 15 2023, 4:05 PM

pengfei added inline comments. Jan 15 2023, 6:12 PM

llvm/test/CodeGen/X86/btc_bts_btr.ll
988	No, I don't. Maybe do it in a peephole pass after ISel?

craig.topper added inline comments. Jan 15 2023, 6:57 PM

llvm/test/CodeGen/X86/btc_bts_btr.ll
988	We have something similarish implemented during isel in `X86DAGToDAGISel::tryShrinkShlLogicImm`.

goldstein.w.n added inline comments. Jan 15 2023, 7:05 PM

llvm/test/CodeGen/X86/btc_bts_btr.ll
988	> We have something similarish implemented during isel in `X86DAGToDAGISel::tryShrinkShlLogicImm`.
	I did in fact try moving it there and it did clean up some of the missed optimizations but added some (more severe) other missed optimizations.
988	> No, I don't. Maybe do it in a peephole pass after ISel?
	Is there an example file / pass you could point to for me to emulate?

RKSimon added inline comments. Jan 16 2023, 6:29 AM

llvm/lib/Target/X86/X86ISelLowering.cpp
47444	a lot of the if-else complexity is coming from you trying to handle and(shift(x,c1),c2) and shift(and(x,c2),c3) permutations in the same code - can you not have 2 variants in separate wrapper functions and then have a simpler core function that both call?
47454	(style) assert message
47480	(style) assert messages

goldstein.w.n added inline comments. Jan 16 2023, 6:49 AM

llvm/lib/Target/X86/X86ISelLowering.cpp
47444	> a lot of the if-else complexity is coming from you trying to handle and(shift(x,c1),c2) and shift(and(x,c2),c3) permutations in the same code - can you not have 2 variants in separate wrapper functions and then have a simpler core function that both call?
	It seemed to me there was enough duplication / not "too" complex to justify one function, but can ofc split.
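The permutations discussed in this thread, an `and` of a shift versus a shift of an `and`, compute the same value once the mask is shifted to match. A minimal standalone check, assuming 64-bit unsigned values and logical shifts; the helper names are hypothetical and none of this is taken from the patch:

```cpp
#include <cassert>
#include <cstdint>

// srl(and(x, m), c) and its reordered form and(srl(x, c), m >> c).
inline uint64_t srlOfAnd(uint64_t X, uint64_t M, unsigned C) {
  assert(C < 64 && "shift amount must be in range");
  return (X & M) >> C;
}
inline uint64_t andOfSrl(uint64_t X, uint64_t M, unsigned C) {
  assert(C < 64 && "shift amount must be in range");
  return (X >> C) & (M >> C);
}

// and(shl(x, c), m) and its reordered form shl(and(x, m >> c), c).
inline uint64_t andOfShl(uint64_t X, uint64_t M, unsigned C) {
  assert(C < 64 && "shift amount must be in range");
  return (X << C) & M;
}
inline uint64_t shlOfAnd(uint64_t X, uint64_t M, unsigned C) {
  assert(C < 64 && "shift amount must be in range");
  return (X & (M >> C)) << C;
}

// Compare both pairs over a handful of values, masks and shift amounts.
inline bool reorderingsAgree() {
  const uint64_t Values[] = {0, 1, 0xFF, 0x12345678, 0xDEADBEEFCAFEF00DULL,
                             ~0ULL};
  const uint64_t Masks[] = {0xFF0, 0xFFFF00, 0x7FFFFFFF00ULL,
                            0xFFFFFFFFFFFFFF00ULL};
  for (uint64_t X : Values)
    for (uint64_t M : Masks)
      for (unsigned C = 0; C < 64; ++C)
        if (srlOfAnd(X, M, C) != andOfSrl(X, M, C) ||
            andOfShl(X, M, C) != shlOfAnd(X, M, C))
          return false;
  return true;
}
```

Which side of each pair is preferable then depends only on which mask constant is cheaper to materialize, for example `0xFF0` before the shift versus `0xFF` after it.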

Fix regressions + more format issues

Herald added a subscriber: ecnelises. Jan 16 2023, 3:22 PM

@pengfei fixed the added `and` and shift in btc_bts_btr.ll and combine-bitreverse.ll respectively. The diffs are gone so your comments don't show up anymore.

Fix andn + remove bextr + no-scrub-sp

goldstein.w.n marked 2 inline comments as done. Jan 16 2023, 5:07 PM

Harbormaster completed remote builds in B208125: Diff 489655. Jan 16 2023, 11:14 PM

pengfei added inline comments. Jan 18 2023, 7:32 AM

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
9310 ↗	(On Diff #489655)	So it is only profitable when `MASK` is a constant? Why don't use `c3` in the comments?
9318 ↗	(On Diff #489655)	Duplicated define.
llvm/lib/Target/X86/X86ISelLowering.cpp
47459–47460	You could use `ISD::isExtOpcode(Opc)`.
47479	I'd say the implementation is too complicated for me to understand. And the overall improvement doesn't look worth the complexity. Not to mention there's still a regression in bitreverse.ll. I may not try to understand the code here; I'll leave it for other reviewers.

goldstein.w.n added inline comments. Jan 18 2023, 9:02 AM

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
9310 ↗	(On Diff #489655)	> So it is only profitable when `MASK` is a constant? Why don't use `c3` in the comments?
llvm/lib/Target/X86/X86ISelLowering.cpp
47479	> I'd say the implementation is too complicated for me to understand. And the overall improvement doesn't look worth the complexity. Not to mention there's still a regression in bitreverse.ll.
	I think the bitreverse.ll regressions have been fixed (that was the point of the new patterns in `DAGCombiner.cpp`).
	> I may not try to understand the code here; I'll leave it for other reviewers.
	Okay, I will split into 2 functions as Simon suggested, which I think will decrease the complexity.

Split and/shift cases (readability). Fix some nits

pengfei added inline comments. Jan 18 2023, 6:31 PM

llvm/lib/Target/X86/X86ISelLowering.cpp
47479	> I think the bitreverse.ll regressions have been fixed (that was the point of the new patterns in `DAGCombiner.cpp`).
	I think you fixed combine-bitreverse.ll rather than bitreverse.ll. The problem in bitreverse.ll is 2 lea is replaced by 2 mov + and + shl. I think it is a minor regression due to the increased 2 mov, so I didn't insist on that. Is the TODO in the file to address this problem?

More cleanup

goldstein.w.n added inline comments. Jan 18 2023, 8:05 PM

llvm/lib/Target/X86/X86ISelLowering.cpp
47479	> I'd say the implementation is too complicated for me to understand. And the overall improvement doesn't look worth the complexity. Not to mention there's still a regression in bitreverse.ll. I may not try to understand the code here; I'll leave it for other reviewers.
	As Simon suggested, split the two implementations. Think it's a lot more readable now.
47479	> > I think the bitreverse.ll regressions have been fixed (that was the point of the new patterns in `DAGCombiner.cpp`).
	> I think you fixed combine-bitreverse.ll rather than bitreverse.ll. The problem in bitreverse.ll is 2 lea is replaced by 2 mov + and + shl. I think it is a minor regression due to the increased 2 mov, so I didn't insist on that.
	I see. I think this is because of the reordering `(and (shl X..)) -> (shl (and X..))` when `X` persists (then we need a `mov`). We could prevent it with a guard in `combineAndWithLogicalShift` that only did the transform if this was the last use of `X` the shiftop was `shl 1/2/3` (can become lea). Is there a way to query "isLastUse(X)" for an `SDValue` or do we just need to do `hasOneUse`? Also note other places in the patch get the inverse behavior, i.e. `llvm/test/CodeGen/X86/combine-rotates.ll:rotl_merge_i5`.
	> Is the TODO in the file to address this problem?
	Not directly, no. Maybe by accident.

Fix regression in bitreverse

goldstein.w.n marked an inline comment as done. Jan 18 2023, 9:20 PM

goldstein.w.n added inline comments.

llvm/lib/Target/X86/X86ISelLowering.cpp
47479	> > I think the bitreverse.ll regressions have been fixed (that was the point of the new patterns in `DAGCombiner.cpp`).
	> I think you fixed combine-bitreverse.ll rather than bitreverse.ll. The problem in bitreverse.ll is 2 lea is replaced by 2 mov + and + shl. I think it is a minor regression due to the increased 2 mov, so I didn't insist on that. Is the TODO in the file to address this problem?
	Fixed.

The new code is a bit easier to understand now, at least to me. The idea makes sense to me though I still think the logic is a bit verbose.
I'm not going to block this patch, but please wait some days for other reviewers.

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
9414 ↗	(On Diff #490373)	Unintended change.
llvm/lib/Target/X86/X86ISelLowering.cpp
47663	I found we prefer `const APInt`.
47674	`countTrailingOnes`?
47676	Should it just check `MaskCnt == 8/16/32`? Or even `AndMask == 0xff/0xffff/0xffffffff`?
47706	`countTrailingOnes`?
47735	Duplicated.
47735	it's
47847	is
56681	Add a comment to explain this is for `combineLogicalShiftWithAnd`. Why does `combineAndWithLogicalShift` not need this?
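The mask check questioned in the comments above can be written either as "a low mask whose width is a power of two and at least 8" or as a direct comparison against `0xff/0xffff/0xffffffff`. A minimal standalone sketch comparing the two forms on 64-bit values; all helper names are hypothetical and this is not the patch's code:

```cpp
#include <cassert>
#include <cstdint>

// True if V is a run of ones starting at bit 0, e.g. 0x00FF.
inline bool isLowMask(uint64_t V) { return V != 0 && ((V + 1) & V) == 0; }

// Number of set bits in V.
inline unsigned popCount(uint64_t V) {
  unsigned N = 0;
  for (; V != 0; V &= V - 1)
    ++N;
  return N;
}

// Generic form: low mask whose width is a power of two and at least 8.
inline bool isZExtMaskGeneric(uint64_t M) {
  if (!isLowMask(M))
    return false;
  unsigned Cnt = popCount(M);
  return Cnt >= 8 && (Cnt & (Cnt - 1)) == 0;
}

// Explicit form suggested in review: compare against the three masks.
inline bool isZExtMaskExplicit(uint64_t M) {
  return M == 0xFF || M == 0xFFFF || M == 0xFFFFFFFFULL;
}

// The two forms agree for every low mask narrower than 64 bits.
inline bool formsAgreeBelow64() {
  for (unsigned Width = 1; Width < 64; ++Width) {
    uint64_t M = (uint64_t{1} << Width) - 1;
    if (isZExtMaskGeneric(M) != isZExtMaskExplicit(M))
      return false;
  }
  return true;
}

inline void checkExamples() {
  assert(isZExtMaskGeneric(0xFFFF) && isZExtMaskExplicit(0xFFFF));
  assert(!isZExtMaskGeneric(0xFFF) && !isZExtMaskExplicit(0xFFF));
}
```

In this sketch the only disagreement is the all-ones 64-bit mask, which the generic form accepts and the explicit form rejects; an `and` with all ones is a no-op, so that case presumably never reaches the check in practice.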

This revision is now accepted and ready to land. Jan 19 2023, 1:20 AM

Harbormaster completed remote builds in B208655: Diff 490373. Jan 19 2023, 3:43 AM

goldstein.w.n marked 6 inline comments as done. Jan 19 2023, 9:52 AM

goldstein.w.n added inline comments.

llvm/lib/Target/X86/X86ISelLowering.cpp
56681	> Add a comment to explain this is for `combineLogicalShiftWithAnd`. Why `combineAndWithLogicalShift` not need this?
	This is only called for shift ops (i.e. `VisitSHL` and `visitShiftByConstant`).

Fix nits + improve comments

Harbormaster completed remote builds in B208792: Diff 490567. Jan 19 2023, 4:13 PM

Update tests for all backends

goldstein.w.n added a reviewer: t.p.northover. Jan 20 2023, 10:09 PM

goldstein.w.n added inline comments. Jan 20 2023, 10:12 PM

llvm/test/CodeGen/AArch64/arm64-bitfield-extract.ll
942 ↗	(On Diff #491035)	@t.p.northover the changes to the `DagCombiner` (allowing `shl/shr` to combine through an `and`) seem to cause a regression on AArch64. Going to update with a `TargetLowering` flag unless this is in fact preferable.

Put the DAGCombiner changes behind TLI flag as it seems to cause
minor regression on aarch64

goldstein.w.n added inline comments. Jan 21 2023, 12:16 AM

llvm/test/CodeGen/AArch64/arm64-bitfield-extract.ll
942 ↗	(On Diff #491035)	> @t.p.northover the changes to the `DagCombiner` (allowing `shl/shr` to combine through an `and`) seem to cause a regression on AArch64. Going to update with a `TargetLowering` flag unless this is in fact preferable.
	I put it behind `TLI.isDesirableToCombineShiftsAcross`, which is disabled by default, so there should be no issue.
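The "disabled by default" arrangement described here is the usual opt-in hook pattern: the generic combine asks the target and does nothing unless the target overrides the default. A minimal standalone sketch of that pattern follows. `ShiftNode`, `TargetHooks`, `X86LikeHooks` and `tryCombineShiftsAcrossAnd` are hypothetical stand-ins, not LLVM's real classes; only the hook name comes from the comment above, and its parameter list is an assumption.

```cpp
#include <cassert>

// Hypothetical stand-in for a shift node; not an LLVM type.
struct ShiftNode {
  unsigned Amount;
};

// Hypothetical stand-in for the target hook interface.
struct TargetHooks {
  virtual ~TargetHooks() = default;
  // Default: opt out, so targets that did not ask for the combine see no change.
  virtual bool isDesirableToCombineShiftsAcross(const ShiftNode &,
                                                const ShiftNode &) const {
    return false;
  }
};

// A target that opts in (assumed signature, for illustration only).
struct X86LikeHooks : TargetHooks {
  bool isDesirableToCombineShiftsAcross(const ShiftNode &,
                                        const ShiftNode &) const override {
    return true;
  }
};

// The generic combine consults the hook before doing anything.
inline bool tryCombineShiftsAcrossAnd(const TargetHooks &TLI,
                                      const ShiftNode &OuterShift,
                                      const ShiftNode &InnerShift) {
  if (!TLI.isDesirableToCombineShiftsAcross(OuterShift, InnerShift))
    return false;
  // ... the actual rewrite would happen here ...
  return true;
}

inline void checkDefaultIsOff() {
  assert(!tryCombineShiftsAcrossAnd(TargetHooks{}, ShiftNode{3}, ShiftNode{5}));
}
```

With this shape a backend such as AArch64 keeps its existing output simply by not overriding the hook.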

Made some changes (TLI flag) since this was accepted. Can someone do a quick pass of the most recent changes to re-verify?

Harbormaster completed remote builds in B209128: Diff 491045. Jan 21 2023, 8:21 AM

pengfei added inline comments. Jan 29 2023, 6:19 PM

llvm/include/llvm/CodeGen/TargetLowering.h
4013–4015 ↗	(On Diff #491045)	Is `Shift1` inner and `Shift2` outer?
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
9414 ↗	(On Diff #490373)	This is still not changed.
llvm/lib/Target/X86/X86ISelLowering.cpp
56685–56686	The arguments are not used. Leave it `void` for now?

goldstein.w.n marked 6 inline comments as done. Jan 29 2023, 6:42 PM

goldstein.w.n added inline comments.

llvm/include/llvm/CodeGen/TargetLowering.h
4013–4015 ↗	(On Diff #491045)	> Is `Shift1` inner and `Shift2` outer?
	Yes, renamed to `OuterShift` and `InnerShift` to be clearer.

Fix style, issue, rename TLI helper, void variables

pengfei added inline comments. Jan 29 2023, 7:01 PM

llvm/include/llvm/CodeGen/TargetLowering.h
4020 ↗	(On Diff #493177)	Is the comment still incorrect?

Fix comment

pengfei accepted this revision. Jan 29 2023, 7:39 PM

Harbormaster completed remote builds in B210681: Diff 493184. Jan 30 2023, 5:41 AM

goldstein.w.n added inline comments. Jan 30 2023, 10:08 AM

llvm/include/llvm/CodeGen/TargetLowering.h
4020 ↗	(On Diff #493177)	> Is the comment still incorrect?
	Oof, fixed.

goldstein.w.n mentioned this in D146121: [DAG] Move lshr narrowing from visitANDLike to SimplifyDemandedBits. Mar 15 2023, 11:47 AM

@goldstein.w.n What is happening with this patch? After they leave my "Ready to Review" list I tend to lose track......

In D141653#4198903, @RKSimon wrote:

@goldstein.w.n What is happening with this patch? After they leave my "Ready to Review" list I tend to lose track......

When I tested on a bootstrap build it caused an infinite loop. I think this approach is inherently brittle and susceptible
to that sort of bug. I haven't quite abandoned it as I think some of the shl/shr improvements can be salvaged. But
the imm improvements I want to move to a new pass that runs at the very end of DAG lowering.

goldstein.w.n mentioned this in D150143: [X86] Add X86FixupVectorConstantsPass to fold vectors constant loads as broadcasts (WIP). May 8 2023, 4:55 PM

Revision Contents

Path	Size
llvm/lib/Target/X86/
X86ISelLowering.h	3 lines
X86ISelLowering.cpp	189 lines
llvm/test/CodeGen/X86/
avx512-calling-conv.ll	123 lines
	72 lines
	36 lines
	164 lines
	6 lines
combine-bitreverse.ll	48 lines
combine-rotates.ll	10 lines
const-shift-of-constmasked.ll	202 lines
const-shift-with-and.ll	33 lines
	11 lines
	18 lines
	8 lines
	10 lines
	2 lines
	9 lines
	12 lines
pull-binop-through-shift.ll	8 lines
rev16.ll	30 lines
rotate-extract.ll	4 lines
selectcc-to-shiftand.ll	5 lines
	5 lines
	3 lines
	77 lines
	28 lines
	28 lines
	3 lines
	15 lines
urem-seteq-illegal-types.ll	6 lines
vselect.ll	8 lines
zext-logicop-shift-load.ll	2 lines

Diff 488861

llvm/lib/Target/X86/X86ISelLowering.h

Show First 20 Lines • Show All 1,012 Lines • ▼ Show 20 Lines	public:
/// Replace the results of node with an illegal result		/// Replace the results of node with an illegal result
/// type with new values built out of custom code.		/// type with new values built out of custom code.
///		///
void ReplaceNodeResults(SDNode *N, SmallVectorImpl<SDValue>&Results,		void ReplaceNodeResults(SDNode *N, SmallVectorImpl<SDValue>&Results,
SelectionDAG &DAG) const override;		SelectionDAG &DAG) const override;

SDValue PerformDAGCombine(SDNode *N, DAGCombinerInfo &DCI) const override;		SDValue PerformDAGCombine(SDNode *N, DAGCombinerInfo &DCI) const override;

		bool isDesirableToCommuteWithShift(const SDNode *N,
		CombineLevel Level) const override;

/// Return true if the target has native support for		/// Return true if the target has native support for
/// the specified value type and it is 'desirable' to use the type for the		/// the specified value type and it is 'desirable' to use the type for the
/// given node type. e.g. On x86 i16 is legal, but undesirable since i16		/// given node type. e.g. On x86 i16 is legal, but undesirable since i16
/// instruction encodings are longer and some i16 instructions are slow.		/// instruction encodings are longer and some i16 instructions are slow.
bool isTypeDesirableForOp(unsigned Opc, EVT VT) const override;		bool isTypeDesirableForOp(unsigned Opc, EVT VT) const override;

/// Return true if the target has native support for the		/// Return true if the target has native support for the
/// specified value type and it is 'desirable' to use the type. e.g. On x86		/// specified value type and it is 'desirable' to use the type. e.g. On x86
▲ Show 20 Lines • Show All 801 Lines • Show Last 20 Lines

llvm/lib/Target/X86/X86ISelLowering.cpp


	Show First 20 Lines • Show All 32,759 Lines • ▼ Show 20 Lines

	unsigned Opc = ExtOpc == ISD::SIGN_EXTEND ? ISD::MULHS : ISD::MULHU;			unsigned Opc = ExtOpc == ISD::SIGN_EXTEND ? ISD::MULHS : ISD::MULHU;
	SDValue Mulh = DAG.getNode(Opc, DL, MulVT, LHS, RHS);			SDValue Mulh = DAG.getNode(Opc, DL, MulVT, LHS, RHS);

	ExtOpc = N->getOpcode() == ISD::SRA ? ISD::SIGN_EXTEND : ISD::ZERO_EXTEND;			ExtOpc = N->getOpcode() == ISD::SRA ? ISD::SIGN_EXTEND : ISD::ZERO_EXTEND;
	return DAG.getNode(ExtOpc, DL, VT, Mulh);			return DAG.getNode(ExtOpc, DL, VT, Mulh);
	}			}

	static SDValue combineShiftLeft(SDNode *N, SelectionDAG &DAG) {			// Try and re-ordering an `and` and `srl/shl` if it result in a better constant
				// for the `and`. Note this only tries to optimize by re-ordering, other
				// patterns like (and (shl x, 4), 240) -> (and (shl x, 4), 255) (for the movzbl)
				// are handled elsewhere.
				static SDValue
				combineLogicalShiftWithAnd(SDNode *N, SelectionDAG &DAG,
				TargetLowering::DAGCombinerInfo &DCI) {
				// Only do this on the last DAG combine as it can interfere with other
				// combines. This is also necessary to avoid and infinite loop between this
				// and `DAGCombiner::visitShiftByConstant`.
				if (!DCI.isAfterLegalizeDAG())
				return SDValue();

				SDNode *ShiftOp, *AndOp;
				SDValue RawVal;
				assert(N->getOpcode() == ISD::SRL || N->getOpcode() == ISD::SHL ||
				N->getOpcode() == ISD::AND);

				// Get ShiftOp, AndOp and RawVal (RawVal being the shifted amount assuming no
				// and).
				if (N->getOpcode() == ISD::SRL || N->getOpcode() == ISD::SHL) {
				ShiftOp = N;
				AndOp = N->getOperand(0).getNode();
				if (AndOp->getOpcode() != ISD::AND || !AndOp->hasOneUse())
				return SDValue();
				} else {
				AndOp = N;
				unsigned Idx;
				for (Idx = 0; Idx < 2; ++Idx) {
				ShiftOp = N->getOperand(Idx).getNode();
				if ((ShiftOp->getOpcode() == ISD::SRL ||
				ShiftOp->getOpcode() == ISD::SHL) &&
				ShiftOp->hasOneUse()) {
				RawVal = ShiftOp->getOperand(0);
				break;
				}
				}
				if (Idx == 2)
				return SDValue();
				}

				assert(ShiftOp->getOpcode() == ISD::SRL || ShiftOp->getOpcode() == ISD::SHL);
				assert(AndOp->getOpcode() == ISD::AND);

				// Get the `and` mask and RawVal if we didn't get it earlier.
				auto *AndC = dyn_cast<ConstantSDNode>(AndOp->getOperand(0));
				if (AndC == nullptr) {
				AndC = dyn_cast<ConstantSDNode>(AndOp->getOperand(1));
				if (!RawVal)
				RawVal = AndOp->getOperand(0);
				} else if (!RawVal) {
				RawVal = AndOp->getOperand(1);
				}
				EVT VT = RawVal.getValueType();
				// TODO: Makes sense to do this on vector types if it allows us to use a mask
				// thats easier to create.
				if (VT != MVT::i8 && VT != MVT::i16 && VT != MVT::i32 && VT != MVT::i64)
				return SDValue();

				SDLoc DL(N);
				// Get the `srl` amount, only proceed if both `srl` amt and `and` mask are
				// constant.
				auto *ShiftC = dyn_cast<ConstantSDNode>(ShiftOp->getOperand(1));
				if (ShiftC == nullptr || AndC == nullptr)
				return SDValue();

				APInt AndMask = AndC->getAPIntValue();
				unsigned ShiftCnt = ShiftC->getZExtValue();

				// If `AndMask` is already in form for `movl/movzwl/movzbl` then nothing to
				// do.
				unsigned MaskCnt = AndMask.getBitWidth() - AndMask.countLeadingZeros();

				if (AndMask.isMask()) {
				assert(MaskCnt == AndMask.countPopulation());
				if (MaskCnt >= 8 && isPowerOf2_32(MaskCnt))
				return SDValue();
				}

				for (unsigned MaskIdx = 0; MaskIdx < 2; ++MaskIdx) {
				// Determine Mask if we swap order of `srl/shl` and `and`.
				APInt NewAndMask;
				if (MaskIdx) {
				if (N->getOpcode() == ISD::AND && ShiftOp->getOpcode() == ISD::SHL)
				NewAndMask = AndMask.lshr(ShiftCnt);
				else
				break;
				} else if (N->getOpcode() == ISD::AND || N->getOpcode() == ISD::SHL) {
				// Will never be beneficial if we can't extend a mask.
				if (!AndMask.isMask())
				continue;

				NewAndMask =
				APInt::getAllOnes(ShiftCnt + MaskCnt).zext(AndMask.getBitWidth());
				} else
				NewAndMask = AndMask.lshr(ShiftCnt);

				// If we can build a mask that can be `movl/movzwl/movzbl` OR just shrink
				// the mask (potentially getting better encoding) then do so.
				bool SwapOrder = false;
				if (NewAndMask.isMask()) {
				unsigned NewMaskCnt = NewAndMask.countPopulation();
				SwapOrder = isPowerOf2_32(NewMaskCnt) && NewMaskCnt >= 8;
				}
				if (!SwapOrder)
				SwapOrder =
				NewAndMask.getSignificantBits() < AndMask.getSignificantBits();
				if (!SwapOrder)
				continue;

				SDValue ret;
				if (N->getOpcode() == ISD::SRL || N->getOpcode() == ISD::SHL)
				return DAG.getNode(ISD::AND, DL, VT,
				DAG.getNode(N->getOpcode(), DL, VT, RawVal,
				DAG.getConstant(ShiftCnt, DL, VT)),
				DAG.getConstant(NewAndMask, DL, VT));
				else
				return DAG.getNode(ShiftOp->getOpcode(), DL, VT,
				DAG.getNode(ISD::AND, DL, VT, RawVal,
				DAG.getConstant(NewAndMask, DL, VT)),
				DAG.getConstant(ShiftCnt, DL, VT));
				}
				return SDValue();
				}

				static SDValue combineShiftLeft(SDNode *N, SelectionDAG &DAG,
				TargetLowering::DAGCombinerInfo &DCI) {
	SDValue N0 = N->getOperand(0);			SDValue N0 = N->getOperand(0);
	SDValue N1 = N->getOperand(1);			SDValue N1 = N->getOperand(1);
	ConstantSDNode *N1C = dyn_cast<ConstantSDNode>(N1);			ConstantSDNode *N1C = dyn_cast<ConstantSDNode>(N1);
	EVT VT = N0.getValueType();			EVT VT = N0.getValueType();

	// fold (shl (and (setcc_c), c1), c2) -> (and setcc_c, (c1 << c2))			// fold (shl (and (setcc_c), c1), c2) -> (and setcc_c, (c1 << c2))
	// since the result of setcc_c is all zero's or all ones.			// since the result of setcc_c is all zero's or all ones.
	if (VT.isInteger() && !VT.isVector() &&			if (VT.isInteger() && !VT.isVector() &&
	Show All 25 Lines
	MaskOK = Mask.isIntN(N00.getOperand(0).getValueSizeInBits());			MaskOK = Mask.isIntN(N00.getOperand(0).getValueSizeInBits());
	}			}
	if (MaskOK && Mask != 0) {			if (MaskOK && Mask != 0) {
	SDLoc DL(N);			SDLoc DL(N);
	return DAG.getNode(ISD::AND, DL, VT, N00, DAG.getConstant(Mask, DL, VT));			return DAG.getNode(ISD::AND, DL, VT, N00, DAG.getConstant(Mask, DL, VT));
	}			}
	}			}

				if (SDValue V = combineLogicalShiftWithAnd(N, DAG, DCI))
				return V;

	return SDValue();			return SDValue();
	}			}

	static SDValue combineShiftRightArithmetic(SDNode *N, SelectionDAG &DAG,			static SDValue combineShiftRightArithmetic(SDNode *N, SelectionDAG &DAG,
	const X86Subtarget &Subtarget) {			const X86Subtarget &Subtarget) {
	SDValue N0 = N->getOperand(0);			SDValue N0 = N->getOperand(0);
	SDValue N1 = N->getOperand(1);			SDValue N1 = N->getOperand(1);
	EVT VT = N0.getValueType();			EVT VT = N0.getValueType();
	Show All 38 Lines
	DAG.getNode(ISD::SIGN_EXTEND_INREG, DL, VT, N00, DAG.getValueType(SVT));			DAG.getNode(ISD::SIGN_EXTEND_INREG, DL, VT, N00, DAG.getValueType(SVT));
	SarConst = SarConst - (Size - ShiftSize);			SarConst = SarConst - (Size - ShiftSize);
	if (SarConst == 0)			if (SarConst == 0)
	return NN;			return NN;
	if (SarConst.isNegative())			if (SarConst.isNegative())
	return DAG.getNode(ISD::SHL, DL, VT, NN,			return DAG.getNode(ISD::SHL, DL, VT, NN,
	DAG.getConstant(-SarConst, DL, CVT));			DAG.getConstant(-SarConst, DL, CVT));
	return DAG.getNode(ISD::SRA, DL, VT, NN,			return DAG.getNode(ISD::SRA, DL, VT, NN,
	DAG.getConstant(SarConst, DL, CVT));			DAG.getConstant(SarConst, DL, CVT));
				pengfei [Done]: I found we prefer to use `const APInt`.
	}			}
	return SDValue();			return SDValue();
	}			}

	static SDValue combineShiftRightLogical(SDNode *N, SelectionDAG &DAG,			static SDValue combineShiftRightLogical(SDNode *N, SelectionDAG &DAG,
	TargetLowering::DAGCombinerInfo &DCI,			TargetLowering::DAGCombinerInfo &DCI,
	const X86Subtarget &Subtarget) {			const X86Subtarget &Subtarget) {
	SDValue N0 = N->getOperand(0);
	SDValue N1 = N->getOperand(1);
	EVT VT = N0.getValueType();

	if (SDValue V = combineShiftToPMULH(N, DAG, Subtarget))			if (SDValue V = combineShiftToPMULH(N, DAG, Subtarget))
	return V;			return V;

	// Only do this on the last DAG combine as it can interfere with other			// Only do this on the last DAG combine as it can interfere with other
				pengfei [Done]: `countTrailingOnes`?
	// combines.			// combines.
	if (!DCI.isAfterLegalizeDAG())			if (!DCI.isAfterLegalizeDAG())
				pengfei [Done]: Should it just need to check `MaskCnt == 8/16/32`? Or even `AndMask == 0xff/0xffff/0xffffffff`?
	return SDValue();			return SDValue();

	// Try to improve a sequence of srl (and X, C1), C2 by inverting the order.			if(SDValue V = combineLogicalShiftWithAnd(N, DAG, DCI))
				pengfei [Done]: Add one space after `if`.
	// TODO: This is a generic DAG combine that became an x86-only combine to			return V;
	// avoid shortcomings in other folds such as bswap, bit-test ('bt'), and
	// and-not ('andn').
	if (N0.getOpcode() != ISD::AND || !N0.hasOneUse())
	return SDValue();

	auto *ShiftC = dyn_cast<ConstantSDNode>(N1);
	auto *AndC = dyn_cast<ConstantSDNode>(N0.getOperand(1));
	if (!ShiftC || !AndC)
	return SDValue();

	// If we can shrink the constant mask below 8-bits or 32-bits, then this
	// transform should reduce code size. It may also enable secondary transforms
	// from improved known-bits analysis or instruction selection.
	APInt MaskVal = AndC->getAPIntValue();

	// If this can be matched by a zero extend, don't optimize.
	if (MaskVal.isMask()) {
	unsigned TO = MaskVal.countTrailingOnes();
	if (TO >= 8 && isPowerOf2_32(TO))
	return SDValue();
	}

	APInt NewMaskVal = MaskVal.lshr(ShiftC->getAPIntValue());
	unsigned OldMaskSize = MaskVal.getMinSignedBits();
	unsigned NewMaskSize = NewMaskVal.getMinSignedBits();
	if ((OldMaskSize > 8 && NewMaskSize <= 8) ||
	(OldMaskSize > 32 && NewMaskSize <= 32)) {
	// srl (and X, AndC), ShiftC --> and (srl X, ShiftC), (AndC >> ShiftC)
	SDLoc DL(N);
	SDValue NewMask = DAG.getConstant(NewMaskVal, DL, VT);
	SDValue NewShift = DAG.getNode(ISD::SRL, DL, VT, N0.getOperand(0), N1);
	return DAG.getNode(ISD::AND, DL, VT, NewShift, NewMask);
	}
	return SDValue();			return SDValue();
	}			}
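
The srl/and reordering shown on the left-hand side above (`srl (and X, AndC), ShiftC --> and (srl X, ShiftC), (AndC >> ShiftC)`) is only applied when the shifted mask fits a smaller immediate. The following is a minimal standalone sketch of that check in plain C++, not LLVM code. `minSignedBits` is a hypothetical stand-in for `APInt::getMinSignedBits()` fixed at 64 bits, the other helper names are also hypothetical, the shift amount is assumed to be below 64, and the early exit for masks that match a zero extend is not modeled.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for APInt::getMinSignedBits() on a 64-bit value: the number of
// bits needed to represent v as a signed integer.
unsigned minSignedBits(uint64_t v) {
  const uint64_t sign = v >> 63;
  unsigned signBits = 1; // the sign bit itself
  // Count further leading bits that are copies of the sign bit.
  while (signBits < 64 && ((v >> (63 - signBits)) & 1) == sign)
    ++signBits;
  return 64 - signBits + 1;
}

// Mirrors the size comparison: the new mask must drop to a sign-extended
// 8-bit or 32-bit immediate where the old mask did not fit.
bool shrinksImmediate(uint64_t mask, unsigned shift) {
  const unsigned oldSize = minSignedBits(mask);
  const unsigned newSize = minSignedBits(mask >> shift);
  return (oldSize > 8 && newSize <= 8) || (oldSize > 32 && newSize <= 32);
}

// Original order: srl (and X, AndC), ShiftC.
uint64_t srlOfAnd(uint64_t x, uint64_t mask, unsigned shift) {
  return (x & mask) >> shift;
}

// Reordered: and (srl X, ShiftC), (AndC >> ShiftC).
uint64_t andOfSrl(uint64_t x, uint64_t mask, unsigned shift) {
  return (x >> shift) & (mask >> shift);
}
```

With these definitions a mask of `0x7F00` shifted by 8 qualifies, since the new mask `0x7F` fits in a signed 8-bit immediate, while `0xFF00` shifted by 8 does not, since `0xFF` needs 9 signed bits.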

	static SDValue combineHorizOpWithShuffle(SDNode *N, SelectionDAG &DAG,			static SDValue combineHorizOpWithShuffle(SDNode *N, SelectionDAG &DAG,
	const X86Subtarget &Subtarget) {			const X86Subtarget &Subtarget) {
	unsigned Opcode = N->getOpcode();			unsigned Opcode = N->getOpcode();
	assert(isHorizOp(Opcode) && "Unexpected hadd/hsub/pack opcode");			assert(isHorizOp(Opcode) && "Unexpected hadd/hsub/pack opcode");

	SDLoc DL(N);			SDLoc DL(N);
	EVT VT = N->getValueType(0);			EVT VT = N->getValueType(0);
	SDValue N0 = N->getOperand(0);			SDValue N0 = N->getOperand(0);
	SDValue N1 = N->getOperand(1);			SDValue N1 = N->getOperand(1);
	EVT SrcVT = N0.getValueType();			EVT SrcVT = N0.getValueType();

	SDValue BC0 =			SDValue BC0 =
	N->isOnlyUserOf(N0.getNode()) ? peekThroughOneUseBitcasts(N0) : N0;			N->isOnlyUserOf(N0.getNode()) ? peekThroughOneUseBitcasts(N0) : N0;
	SDValue BC1 =			SDValue BC1 =
	N->isOnlyUserOf(N1.getNode()) ? peekThroughOneUseBitcasts(N1) : N1;			N->isOnlyUserOf(N1.getNode()) ? peekThroughOneUseBitcasts(N1) : N1;

	// Attempt to fold HOP(LOSUBVECTOR(SHUFFLE(X)),HISUBVECTOR(SHUFFLE(X)))			// Attempt to fold HOP(LOSUBVECTOR(SHUFFLE(X)),HISUBVECTOR(SHUFFLE(X)))
	// to SHUFFLE(HOP(LOSUBVECTOR(X),HISUBVECTOR(X))), this is mainly for			// to SHUFFLE(HOP(LOSUBVECTOR(X),HISUBVECTOR(X))), this is mainly for
	// truncation trees that help us avoid lane crossing shuffles.			// truncation trees that help us avoid lane crossing shuffles.
	// TODO: There's a lot more we can do for PACK/HADD style shuffle combines.			// TODO: There's a lot more we can do for PACK/HADD style shuffle combines.
	// TODO: We don't handle vXf64 shuffles yet.			// TODO: We don't handle vXf64 shuffles yet.
	if (VT.is128BitVector() && SrcVT.getScalarSizeInBits() <= 32) {			if (VT.is128BitVector() && SrcVT.getScalarSizeInBits() <= 32) {
				pengfei [Not Done]: `countTrailingOnes`?
	if (SDValue BCSrc = getSplitVectorSrc(BC0, BC1, false)) {			if (SDValue BCSrc = getSplitVectorSrc(BC0, BC1, false)) {
	SmallVector<SDValue> ShuffleOps;			SmallVector<SDValue> ShuffleOps;
	SmallVector<int> ShuffleMask, ScaledMask;			SmallVector<int> ShuffleMask, ScaledMask;
	SDValue Vec = peekThroughBitcasts(BCSrc);			SDValue Vec = peekThroughBitcasts(BCSrc);
	if (getTargetShuffleInputs(Vec, ShuffleOps, ShuffleMask, DAG)) {			if (getTargetShuffleInputs(Vec, ShuffleOps, ShuffleMask, DAG)) {
	resolveTargetShuffleInputsAndMask(ShuffleOps, ShuffleMask);			resolveTargetShuffleInputsAndMask(ShuffleOps, ShuffleMask);
	// To keep the HOP LHS/RHS coherency, we must be able to scale the unary			// To keep the HOP LHS/RHS coherency, we must be able to scale the unary
	// shuffle to a v4X64 width - we can probably relax this in the future.			// shuffle to a v4X64 width - we can probably relax this in the future.
	Show All 12 Lines
	}			}
	}			}
	}			}
	}			}

	// Attempt to fold HOP(SHUFFLE(X,Y),SHUFFLE(Z,W)) -> SHUFFLE(HOP()).			// Attempt to fold HOP(SHUFFLE(X,Y),SHUFFLE(Z,W)) -> SHUFFLE(HOP()).
	if (VT.is128BitVector() && SrcVT.getScalarSizeInBits() <= 32) {			if (VT.is128BitVector() && SrcVT.getScalarSizeInBits() <= 32) {
	// If either/both ops are a shuffle that can scale to v2x64,			// If either/both ops are a shuffle that can scale to v2x64,
	// then see if we can perform this as a v4x32 post shuffle.			// then see if we can perform this as a v4x32 post shuffle.
				pengfei [Done]: Duplicated.
				pengfei [Done]: it's
	SmallVector<SDValue> Ops0, Ops1;			SmallVector<SDValue> Ops0, Ops1;
	SmallVector<int> Mask0, Mask1, ScaledMask0, ScaledMask1;			SmallVector<int> Mask0, Mask1, ScaledMask0, ScaledMask1;
	bool IsShuf0 =			bool IsShuf0 =
	getTargetShuffleInputs(BC0, Ops0, Mask0, DAG) && !isAnyZero(Mask0) &&			getTargetShuffleInputs(BC0, Ops0, Mask0, DAG) && !isAnyZero(Mask0) &&
	scaleShuffleElements(Mask0, 2, ScaledMask0) &&			scaleShuffleElements(Mask0, 2, ScaledMask0) &&
	all_of(Ops0, [](SDValue Op) { return Op.getValueSizeInBits() == 128; });			all_of(Ops0, [](SDValue Op) { return Op.getValueSizeInBits() == 128; });
	bool IsShuf1 =			bool IsShuf1 =
	getTargetShuffleInputs(BC1, Ops1, Mask1, DAG) && !isAnyZero(Mask1) &&			getTargetShuffleInputs(BC1, Ops1, Mask1, DAG) && !isAnyZero(Mask1) &&
	▲ Show 20 Lines • Show All 95 Lines • ▼ Show 20 Lines
	unsigned NumDstElts = VT.getVectorNumElements();			unsigned NumDstElts = VT.getVectorNumElements();
	unsigned DstBitsPerElt = VT.getScalarSizeInBits();			unsigned DstBitsPerElt = VT.getScalarSizeInBits();
	unsigned SrcBitsPerElt = 2 * DstBitsPerElt;			unsigned SrcBitsPerElt = 2 * DstBitsPerElt;
	assert(N0.getScalarValueSizeInBits() == SrcBitsPerElt &&			assert(N0.getScalarValueSizeInBits() == SrcBitsPerElt &&
	N1.getScalarValueSizeInBits() == SrcBitsPerElt &&			N1.getScalarValueSizeInBits() == SrcBitsPerElt &&
	"Unexpected PACKSS/PACKUS input type");			"Unexpected PACKSS/PACKUS input type");

	bool IsSigned = (X86ISD::PACKSS == Opcode);			bool IsSigned = (X86ISD::PACKSS == Opcode);

				pengfei [Not Done]: is
	// Constant Folding.			// Constant Folding.
	APInt UndefElts0, UndefElts1;			APInt UndefElts0, UndefElts1;
	SmallVector<APInt, 32> EltBits0, EltBits1;			SmallVector<APInt, 32> EltBits0, EltBits1;
	if ((N0.isUndef() || N->isOnlyUserOf(N0.getNode())) &&			if ((N0.isUndef() || N->isOnlyUserOf(N0.getNode())) &&
	(N1.isUndef() || N->isOnlyUserOf(N1.getNode())) &&			(N1.isUndef() || N->isOnlyUserOf(N1.getNode())) &&
	getTargetConstantBitsFromNode(N0, SrcBitsPerElt, UndefElts0, EltBits0) &&			getTargetConstantBitsFromNode(N0, SrcBitsPerElt, UndefElts0, EltBits0) &&
	getTargetConstantBitsFromNode(N1, SrcBitsPerElt, UndefElts1, EltBits1)) {			getTargetConstantBitsFromNode(N1, SrcBitsPerElt, UndefElts1, EltBits1)) {
	unsigned NumLanes = VT.getSizeInBits() / 128;			unsigned NumLanes = VT.getSizeInBits() / 128;
	▲ Show 20 Lines • Show All 1,265 Lines • ▼ Show 20 Lines
	X86::MaxShuffleCombineDepth,			X86::MaxShuffleCombineDepth,
	/*HasVarMask*/ false, /*AllowVarCrossLaneMask*/ true,			/*HasVarMask*/ false, /*AllowVarCrossLaneMask*/ true,
	/*AllowVarPerLaneMask*/ true, DAG, Subtarget))			/*AllowVarPerLaneMask*/ true, DAG, Subtarget))
	return DAG.getNode(ISD::EXTRACT_VECTOR_ELT, dl, VT, Shuffle,			return DAG.getNode(ISD::EXTRACT_VECTOR_ELT, dl, VT, Shuffle,
	N0.getOperand(1));			N0.getOperand(1));
	}			}
	}			}

				if(SDValue V = combineLogicalShiftWithAnd(N, DAG, DCI))
				pengfei [Done]: ditto.
				return V;

	return SDValue();			return SDValue();
	}			}

	// Canonicalize OR(AND(X,C),AND(Y,~C)) -> OR(AND(X,C),ANDNP(C,Y))			// Canonicalize OR(AND(X,C),AND(Y,~C)) -> OR(AND(X,C),ANDNP(C,Y))
	static SDValue canonicalizeBitSelect(SDNode *N, SelectionDAG &DAG,			static SDValue canonicalizeBitSelect(SDNode *N, SelectionDAG &DAG,
	const X86Subtarget &Subtarget) {			const X86Subtarget &Subtarget) {
	assert(N->getOpcode() == ISD::OR && "Unexpected Opcode");			assert(N->getOpcode() == ISD::OR && "Unexpected Opcode");

	▲ Show 20 Lines • Show All 7,036 Lines • ▼ Show 20 Lines
	case X86ISD::CMP: return combineCMP(N, DAG);			case X86ISD::CMP: return combineCMP(N, DAG);
	case ISD::ADD: return combineAdd(N, DAG, DCI, Subtarget);			case ISD::ADD: return combineAdd(N, DAG, DCI, Subtarget);
	case ISD::SUB: return combineSub(N, DAG, DCI, Subtarget);			case ISD::SUB: return combineSub(N, DAG, DCI, Subtarget);
	case X86ISD::ADD:			case X86ISD::ADD:
	case X86ISD::SUB: return combineX86AddSub(N, DAG, DCI);			case X86ISD::SUB: return combineX86AddSub(N, DAG, DCI);
	case X86ISD::SBB: return combineSBB(N, DAG);			case X86ISD::SBB: return combineSBB(N, DAG);
	case X86ISD::ADC: return combineADC(N, DAG, DCI);			case X86ISD::ADC: return combineADC(N, DAG, DCI);
	case ISD::MUL: return combineMul(N, DAG, DCI, Subtarget);			case ISD::MUL: return combineMul(N, DAG, DCI, Subtarget);
	case ISD::SHL: return combineShiftLeft(N, DAG);			case ISD::SHL: return combineShiftLeft(N, DAG, DCI);
	case ISD::SRA: return combineShiftRightArithmetic(N, DAG, Subtarget);			case ISD::SRA: return combineShiftRightArithmetic(N, DAG, Subtarget);
	case ISD::SRL: return combineShiftRightLogical(N, DAG, DCI, Subtarget);			case ISD::SRL: return combineShiftRightLogical(N, DAG, DCI, Subtarget);
	case ISD::AND: return combineAnd(N, DAG, DCI, Subtarget);			case ISD::AND: return combineAnd(N, DAG, DCI, Subtarget);
	case ISD::OR: return combineOr(N, DAG, DCI, Subtarget);			case ISD::OR: return combineOr(N, DAG, DCI, Subtarget);
	case ISD::XOR: return combineXor(N, DAG, DCI, Subtarget);			case ISD::XOR: return combineXor(N, DAG, DCI, Subtarget);
	case X86ISD::BEXTR:			case X86ISD::BEXTR:
	case X86ISD::BEXTRI: return combineBEXTR(N, DAG, DCI, Subtarget);			case X86ISD::BEXTRI: return combineBEXTR(N, DAG, DCI, Subtarget);
	case ISD::LOAD: return combineLoad(N, DAG, DCI, Subtarget);			case ISD::LOAD: return combineLoad(N, DAG, DCI, Subtarget);
	▲ Show 20 Lines • Show All 138 Lines • ▼ Show 20 Lines
	case X86ISD::SUBV_BROADCAST_LOAD: return combineBROADCAST_LOAD(N, DAG, DCI);			case X86ISD::SUBV_BROADCAST_LOAD: return combineBROADCAST_LOAD(N, DAG, DCI);
	case X86ISD::MOVDQ2Q: return combineMOVDQ2Q(N, DAG);			case X86ISD::MOVDQ2Q: return combineMOVDQ2Q(N, DAG);
	case X86ISD::PDEP: return combinePDEP(N, DAG, DCI);			case X86ISD::PDEP: return combinePDEP(N, DAG, DCI);
	}			}

	return SDValue();			return SDValue();
	}			}

				bool X86TargetLowering::isDesirableToCommuteWithShift(
				const SDNode *N, CombineLevel Level) const {
				if (Level < AfterLegalizeDAG)
				return true;

				if (N->getOpcode() == ISD::SRL || N->getOpcode() == ISD::SHL)
				for (unsigned OpIdx = 0; OpIdx < 2; ++OpIdx)
				craig.topper [Done]: Why check both operands of the shift? Isn't only the first one interesting? An AND on the shift amount is very different.
				if (N->getOperand(OpIdx).getOpcode() == ISD::AND)
				return false;

				return true;
				}

	bool X86TargetLowering::isTypeDesirableForOp(unsigned Opc, EVT VT) const {			bool X86TargetLowering::isTypeDesirableForOp(unsigned Opc, EVT VT) const {
	if (!isTypeLegal(VT))			if (!isTypeLegal(VT))
	return false;			return false;

	// There are no vXi8 shifts.			// There are no vXi8 shifts.
	if (Opc == ISD::SHL && VT.isVector() && VT.getVectorElementType() == MVT::i8)			if (Opc == ISD::SHL && VT.isVector() && VT.getVectorElementType() == MVT::i8)
	return false;			return false;

	▲ Show 20 Lines • Show All 313 Lines • ▼ Show 20 Lines
	case 'G':			case 'G':
	case 'L':			case 'L':
	case 'M':			case 'M':
	return C_Immediate;			return C_Immediate;
	case 'C':			case 'C':
	case 'e':			case 'e':
	case 'Z':			case 'Z':
	return C_Other;			return C_Other;
	default:			default:
				pengfei [Done]: Add a comment to explain this is for `combineLogicalShiftWithAnd`. Why does `combineAndWithLogicalShift` not need this?
				goldstein.w.n (author) [Done]: This is only called for shift ops (i.e. `VisitSHL` and `visitShiftByConstant`).
	break;			break;
	}			}
	}			}
	else if (Constraint.size() == 2) {			else if (Constraint.size() == 2) {
	switch (Constraint[0]) {			switch (Constraint[0]) {
				pengfei [Done]: The arguments are not used. Leave it `void` for now?
	default:			default:
	break;			break;
	case 'Y':			case 'Y':
	switch (Constraint[1]) {			switch (Constraint[1]) {
	default:			default:
	break;			break;
	case 'z':			case 'z':
	return C_Register;			return C_Register;
	▲ Show 20 Lines • Show All 929 Lines • Show Last 20 Lines

llvm/test/CodeGen/X86/avx512-calling-conv.ll

	Show First 20 Lines • Show All 953 Lines • ▼ Show 20 Lines
	; KNL-NEXT: andl $1, %r9d			; KNL-NEXT: andl $1, %r9d
	; KNL-NEXT: shll $4, %r9d			; KNL-NEXT: shll $4, %r9d
	; KNL-NEXT: orl %ecx, %r9d			; KNL-NEXT: orl %ecx, %r9d
	; KNL-NEXT: andl $1, %r8d			; KNL-NEXT: andl $1, %r8d
	; KNL-NEXT: shll $5, %r8d			; KNL-NEXT: shll $5, %r8d
	; KNL-NEXT: orl %r9d, %r8d			; KNL-NEXT: orl %r9d, %r8d
	; KNL-NEXT: andl $1, %r10d			; KNL-NEXT: andl $1, %r10d
	; KNL-NEXT: shll $6, %r10d			; KNL-NEXT: shll $6, %r10d
	; KNL-NEXT: andl $1, %r11d
	; KNL-NEXT: shll $7, %r11d			; KNL-NEXT: shll $7, %r11d
	; KNL-NEXT: orl %r10d, %r11d			; KNL-NEXT: movzbl %r11b, %ecx
				; KNL-NEXT: orl %r10d, %ecx
	; KNL-NEXT: andl $1, %ebx			; KNL-NEXT: andl $1, %ebx
	; KNL-NEXT: shll $8, %ebx			; KNL-NEXT: shll $8, %ebx
	; KNL-NEXT: orl %r11d, %ebx			; KNL-NEXT: orl %ecx, %ebx
	; KNL-NEXT: andl $1, %r14d			; KNL-NEXT: andl $1, %r14d
	; KNL-NEXT: shll $9, %r14d			; KNL-NEXT: shll $9, %r14d
	; KNL-NEXT: orl %ebx, %r14d			; KNL-NEXT: orl %ebx, %r14d
	; KNL-NEXT: andl $1, %ebp			; KNL-NEXT: andl $1, %ebp
	; KNL-NEXT: shll $10, %ebp			; KNL-NEXT: shll $10, %ebp
	; KNL-NEXT: orl %r14d, %ebp			; KNL-NEXT: orl %r14d, %ebp
	; KNL-NEXT: orl %r8d, %ebp			; KNL-NEXT: orl %r8d, %ebp
	; KNL-NEXT: andl $1, %r15d			; KNL-NEXT: andl $1, %r15d
	; KNL-NEXT: shll $11, %r15d			; KNL-NEXT: shll $11, %r15d
	; KNL-NEXT: andl $1, %r12d			; KNL-NEXT: andl $1, %r12d
	; KNL-NEXT: shll $12, %r12d			; KNL-NEXT: shll $12, %r12d
	; KNL-NEXT: orl %r15d, %r12d			; KNL-NEXT: orl %r15d, %r12d
	; KNL-NEXT: andl $1, %r13d			; KNL-NEXT: andl $1, %r13d
	; KNL-NEXT: shll $13, %r13d			; KNL-NEXT: shll $13, %r13d
	; KNL-NEXT: orl %r12d, %r13d			; KNL-NEXT: orl %r12d, %r13d
	; KNL-NEXT: andl $1, %edx			; KNL-NEXT: andl $1, %edx
	; KNL-NEXT: shll $14, %edx			; KNL-NEXT: shll $14, %edx
	; KNL-NEXT: orl %r13d, %edx			; KNL-NEXT: orl %r13d, %edx
	; KNL-NEXT: andl $1, %esi
	; KNL-NEXT: shll $15, %esi			; KNL-NEXT: shll $15, %esi
	; KNL-NEXT: orl %edx, %esi			; KNL-NEXT: movzwl %si, %ecx
	; KNL-NEXT: orl %ebp, %esi			; KNL-NEXT: orl %edx, %ecx
	; KNL-NEXT: movw %si, (%rax)			; KNL-NEXT: orl %ebp, %ecx
				; KNL-NEXT: movw %cx, (%rax)
	; KNL-NEXT: popq %rbx			; KNL-NEXT: popq %rbx
	; KNL-NEXT: popq %r12			; KNL-NEXT: popq %r12
	; KNL-NEXT: popq %r13			; KNL-NEXT: popq %r13
	; KNL-NEXT: popq %r14			; KNL-NEXT: popq %r14
	; KNL-NEXT: popq %r15			; KNL-NEXT: popq %r15
	; KNL-NEXT: popq %rbp			; KNL-NEXT: popq %rbp
	; KNL-NEXT: retq			; KNL-NEXT: retq
	;			;
	▲ Show 20 Lines • Show All 268 Lines • ▼ Show 20 Lines
	; SKX-NEXT: andl $1, %r9d			; SKX-NEXT: andl $1, %r9d
	; SKX-NEXT: shll $4, %r9d			; SKX-NEXT: shll $4, %r9d
	; SKX-NEXT: orl %ecx, %r9d			; SKX-NEXT: orl %ecx, %r9d
	; SKX-NEXT: andl $1, %r8d			; SKX-NEXT: andl $1, %r8d
	; SKX-NEXT: shll $5, %r8d			; SKX-NEXT: shll $5, %r8d
	; SKX-NEXT: orl %r9d, %r8d			; SKX-NEXT: orl %r9d, %r8d
	; SKX-NEXT: andl $1, %r10d			; SKX-NEXT: andl $1, %r10d
	; SKX-NEXT: shll $6, %r10d			; SKX-NEXT: shll $6, %r10d
	; SKX-NEXT: andl $1, %r11d
	; SKX-NEXT: shll $7, %r11d			; SKX-NEXT: shll $7, %r11d
	; SKX-NEXT: orl %r10d, %r11d			; SKX-NEXT: movzbl %r11b, %ecx
				; SKX-NEXT: orl %r10d, %ecx
	; SKX-NEXT: andl $1, %ebx			; SKX-NEXT: andl $1, %ebx
	; SKX-NEXT: shll $8, %ebx			; SKX-NEXT: shll $8, %ebx
	; SKX-NEXT: orl %r11d, %ebx			; SKX-NEXT: orl %ecx, %ebx
	; SKX-NEXT: andl $1, %r14d			; SKX-NEXT: andl $1, %r14d
	; SKX-NEXT: shll $9, %r14d			; SKX-NEXT: shll $9, %r14d
	; SKX-NEXT: orl %ebx, %r14d			; SKX-NEXT: orl %ebx, %r14d
	; SKX-NEXT: andl $1, %ebp			; SKX-NEXT: andl $1, %ebp
	; SKX-NEXT: shll $10, %ebp			; SKX-NEXT: shll $10, %ebp
	; SKX-NEXT: orl %r14d, %ebp			; SKX-NEXT: orl %r14d, %ebp
	; SKX-NEXT: orl %r8d, %ebp			; SKX-NEXT: orl %r8d, %ebp
	; SKX-NEXT: andl $1, %r15d			; SKX-NEXT: andl $1, %r15d
	; SKX-NEXT: shll $11, %r15d			; SKX-NEXT: shll $11, %r15d
	; SKX-NEXT: andl $1, %r12d			; SKX-NEXT: andl $1, %r12d
	; SKX-NEXT: shll $12, %r12d			; SKX-NEXT: shll $12, %r12d
	; SKX-NEXT: orl %r15d, %r12d			; SKX-NEXT: orl %r15d, %r12d
	; SKX-NEXT: andl $1, %r13d			; SKX-NEXT: andl $1, %r13d
	; SKX-NEXT: shll $13, %r13d			; SKX-NEXT: shll $13, %r13d
	; SKX-NEXT: orl %r12d, %r13d			; SKX-NEXT: orl %r12d, %r13d
	; SKX-NEXT: andl $1, %edx			; SKX-NEXT: andl $1, %edx
	; SKX-NEXT: shll $14, %edx			; SKX-NEXT: shll $14, %edx
	; SKX-NEXT: orl %r13d, %edx			; SKX-NEXT: orl %r13d, %edx
	; SKX-NEXT: andl $1, %esi
	; SKX-NEXT: shll $15, %esi			; SKX-NEXT: shll $15, %esi
	; SKX-NEXT: orl %edx, %esi			; SKX-NEXT: movzwl %si, %ecx
	; SKX-NEXT: orl %ebp, %esi			; SKX-NEXT: orl %edx, %ecx
	; SKX-NEXT: movw %si, (%rax)			; SKX-NEXT: orl %ebp, %ecx
				; SKX-NEXT: movw %cx, (%rax)
	; SKX-NEXT: popq %rbx			; SKX-NEXT: popq %rbx
	; SKX-NEXT: popq %r12			; SKX-NEXT: popq %r12
	; SKX-NEXT: popq %r13			; SKX-NEXT: popq %r13
	; SKX-NEXT: popq %r14			; SKX-NEXT: popq %r14
	; SKX-NEXT: popq %r15			; SKX-NEXT: popq %r15
	; SKX-NEXT: popq %rbp			; SKX-NEXT: popq %rbp
	; SKX-NEXT: retq			; SKX-NEXT: retq
	;			;
	▲ Show 20 Lines • Show All 234 Lines • ▼ Show 20 Lines
	; KNL_X32-NEXT: kmovw %edx, %k1			; KNL_X32-NEXT: kmovw %edx, %k1
	; KNL_X32-NEXT: kshiftlw $15, %k1, %k1			; KNL_X32-NEXT: kshiftlw $15, %k1, %k1
	; KNL_X32-NEXT: korw %k1, %k0, %k0			; KNL_X32-NEXT: korw %k1, %k0, %k0
	; KNL_X32-NEXT: kmovw %ecx, %k1			; KNL_X32-NEXT: kmovw %ecx, %k1
	; KNL_X32-NEXT: kmovw (%esp), %k2 ## 2-byte Reload			; KNL_X32-NEXT: kmovw (%esp), %k2 ## 2-byte Reload
	; KNL_X32-NEXT: kandw %k2, %k0, %k0			; KNL_X32-NEXT: kandw %k2, %k0, %k0
	; KNL_X32-NEXT: kmovw %eax, %k2			; KNL_X32-NEXT: kmovw %eax, %k2
	; KNL_X32-NEXT: kandw %k1, %k2, %k1			; KNL_X32-NEXT: kandw %k1, %k2, %k1
	; KNL_X32-NEXT: movl {{[0-9]+}}(%esp), %eax			; KNL_X32-NEXT: movl {{[0-9]+}}(%esp), %ebp
	; KNL_X32-NEXT: kmovw %k1, %ebx			; KNL_X32-NEXT: kmovw %k1, %ebx
	; KNL_X32-NEXT: kshiftrw $1, %k0, %k1			; KNL_X32-NEXT: kshiftrw $1, %k0, %k1
	; KNL_X32-NEXT: kmovw %k1, %ebp			; KNL_X32-NEXT: kmovw %k1, %eax
	; KNL_X32-NEXT: kshiftrw $2, %k0, %k1			; KNL_X32-NEXT: kshiftrw $2, %k0, %k1
	; KNL_X32-NEXT: kmovw %k1, %esi
	; KNL_X32-NEXT: kshiftrw $3, %k0, %k1
	; KNL_X32-NEXT: kmovw %k1, %edi			; KNL_X32-NEXT: kmovw %k1, %edi
				; KNL_X32-NEXT: kshiftrw $3, %k0, %k1
				; KNL_X32-NEXT: kmovw %k1, %esi
	; KNL_X32-NEXT: kshiftrw $4, %k0, %k1			; KNL_X32-NEXT: kshiftrw $4, %k0, %k1
	; KNL_X32-NEXT: kmovw %k1, %edx			; KNL_X32-NEXT: kmovw %k1, %edx
	; KNL_X32-NEXT: kshiftrw $5, %k0, %k1			; KNL_X32-NEXT: kshiftrw $5, %k0, %k1
	; KNL_X32-NEXT: kmovw %k1, %ecx			; KNL_X32-NEXT: kmovw %k1, %ecx
	; KNL_X32-NEXT: kshiftrw $6, %k0, %k1			; KNL_X32-NEXT: kshiftrw $6, %k0, %k1
	; KNL_X32-NEXT: andl $1, %ebx			; KNL_X32-NEXT: andl $1, %ebx
	; KNL_X32-NEXT: movb %bl, 2(%eax)			; KNL_X32-NEXT: movb %bl, 2(%ebp)
	; KNL_X32-NEXT: kmovw %k0, %ebx			; KNL_X32-NEXT: kmovw %k0, %ebx
	; KNL_X32-NEXT: andl $1, %ebx			; KNL_X32-NEXT: andl $1, %ebx
	; KNL_X32-NEXT: andl $1, %ebp			; KNL_X32-NEXT: andl $1, %eax
	; KNL_X32-NEXT: leal (%ebx,%ebp,2), %ebx			; KNL_X32-NEXT: leal (%ebx,%eax,2), %eax
	; KNL_X32-NEXT: kmovw %k1, %ebp			; KNL_X32-NEXT: kmovw %k1, %ebx
	; KNL_X32-NEXT: kshiftrw $7, %k0, %k1			; KNL_X32-NEXT: kshiftrw $7, %k0, %k1
				; KNL_X32-NEXT: andl $1, %edi
				; KNL_X32-NEXT: leal (%eax,%edi,4), %edi
				; KNL_X32-NEXT: kmovw %k1, %eax
				; KNL_X32-NEXT: kshiftrw $8, %k0, %k1
	; KNL_X32-NEXT: andl $1, %esi			; KNL_X32-NEXT: andl $1, %esi
	; KNL_X32-NEXT: leal (%ebx,%esi,4), %ebx			; KNL_X32-NEXT: leal (%edi,%esi,8), %edi
	; KNL_X32-NEXT: kmovw %k1, %esi			; KNL_X32-NEXT: kmovw %k1, %esi
	; KNL_X32-NEXT: kshiftrw $8, %k0, %k1
	; KNL_X32-NEXT: andl $1, %edi
	; KNL_X32-NEXT: leal (%ebx,%edi,8), %ebx
	; KNL_X32-NEXT: kmovw %k1, %edi
	; KNL_X32-NEXT: kshiftrw $9, %k0, %k1			; KNL_X32-NEXT: kshiftrw $9, %k0, %k1
	; KNL_X32-NEXT: andl $1, %edx			; KNL_X32-NEXT: andl $1, %edx
	; KNL_X32-NEXT: shll $4, %edx			; KNL_X32-NEXT: shll $4, %edx
	; KNL_X32-NEXT: orl %ebx, %edx			; KNL_X32-NEXT: orl %edi, %edx
	; KNL_X32-NEXT: kmovw %k1, %ebx			; KNL_X32-NEXT: kmovw %k1, %edi
	; KNL_X32-NEXT: kshiftrw $10, %k0, %k1			; KNL_X32-NEXT: kshiftrw $10, %k0, %k1
	; KNL_X32-NEXT: andl $1, %ecx			; KNL_X32-NEXT: andl $1, %ecx
	; KNL_X32-NEXT: shll $5, %ecx			; KNL_X32-NEXT: shll $5, %ecx
	; KNL_X32-NEXT: orl %edx, %ecx			; KNL_X32-NEXT: orl %edx, %ecx
	; KNL_X32-NEXT: kmovw %k1, %edx			; KNL_X32-NEXT: kmovw %k1, %edx
	; KNL_X32-NEXT: kshiftrw $11, %k0, %k1			; KNL_X32-NEXT: kshiftrw $11, %k0, %k1
	; KNL_X32-NEXT: andl $1, %ebp			; KNL_X32-NEXT: andl $1, %ebx
	; KNL_X32-NEXT: shll $6, %ebp			; KNL_X32-NEXT: shll $6, %ebx
				; KNL_X32-NEXT: shll $7, %eax
				; KNL_X32-NEXT: movzbl %al, %eax
				; KNL_X32-NEXT: orl %ebx, %eax
				; KNL_X32-NEXT: kmovw %k1, %ebx
				; KNL_X32-NEXT: kshiftrw $12, %k0, %k1
	; KNL_X32-NEXT: andl $1, %esi			; KNL_X32-NEXT: andl $1, %esi
	; KNL_X32-NEXT: shll $7, %esi			; KNL_X32-NEXT: shll $8, %esi
	; KNL_X32-NEXT: orl %ebp, %esi			; KNL_X32-NEXT: orl %eax, %esi
	; KNL_X32-NEXT: kmovw %k1, %ebp			; KNL_X32-NEXT: kmovw %k1, %ebp
	; KNL_X32-NEXT: kshiftrw $12, %k0, %k1			; KNL_X32-NEXT: kshiftrw $13, %k0, %k1
	; KNL_X32-NEXT: andl $1, %edi			; KNL_X32-NEXT: andl $1, %edi
	; KNL_X32-NEXT: shll $8, %edi			; KNL_X32-NEXT: shll $9, %edi
	; KNL_X32-NEXT: orl %esi, %edi			; KNL_X32-NEXT: orl %esi, %edi
	; KNL_X32-NEXT: kmovw %k1, %esi			; KNL_X32-NEXT: kmovw %k1, %eax
	; KNL_X32-NEXT: kshiftrw $13, %k0, %k1
	; KNL_X32-NEXT: andl $1, %ebx
	; KNL_X32-NEXT: shll $9, %ebx
	; KNL_X32-NEXT: orl %edi, %ebx
	; KNL_X32-NEXT: kmovw %k1, %edi
	; KNL_X32-NEXT: kshiftrw $14, %k0, %k1			; KNL_X32-NEXT: kshiftrw $14, %k0, %k1
	; KNL_X32-NEXT: andl $1, %edx			; KNL_X32-NEXT: andl $1, %edx
	; KNL_X32-NEXT: shll $10, %edx			; KNL_X32-NEXT: shll $10, %edx
	; KNL_X32-NEXT: orl %ebx, %edx			; KNL_X32-NEXT: orl %edi, %edx
	; KNL_X32-NEXT: kmovw %k1, %ebx			; KNL_X32-NEXT: kmovw %k1, %esi
	; KNL_X32-NEXT: kshiftrw $15, %k0, %k0			; KNL_X32-NEXT: kshiftrw $15, %k0, %k0
	; KNL_X32-NEXT: orl %ecx, %edx			; KNL_X32-NEXT: orl %ecx, %edx
	; KNL_X32-NEXT: kmovw %k0, %ecx			; KNL_X32-NEXT: kmovw %k0, %ecx
				; KNL_X32-NEXT: andl $1, %ebx
				; KNL_X32-NEXT: shll $11, %ebx
	; KNL_X32-NEXT: andl $1, %ebp			; KNL_X32-NEXT: andl $1, %ebp
	; KNL_X32-NEXT: shll $11, %ebp			; KNL_X32-NEXT: shll $12, %ebp
				; KNL_X32-NEXT: orl %ebx, %ebp
				; KNL_X32-NEXT: andl $1, %eax
				; KNL_X32-NEXT: shll $13, %eax
				; KNL_X32-NEXT: orl %ebp, %eax
	; KNL_X32-NEXT: andl $1, %esi			; KNL_X32-NEXT: andl $1, %esi
	; KNL_X32-NEXT: shll $12, %esi			; KNL_X32-NEXT: shll $14, %esi
	; KNL_X32-NEXT: orl %ebp, %esi			; KNL_X32-NEXT: orl %eax, %esi
	; KNL_X32-NEXT: andl $1, %edi
	; KNL_X32-NEXT: shll $13, %edi
	; KNL_X32-NEXT: orl %esi, %edi
	; KNL_X32-NEXT: andl $1, %ebx
	; KNL_X32-NEXT: shll $14, %ebx
	; KNL_X32-NEXT: orl %edi, %ebx
	; KNL_X32-NEXT: andl $1, %ecx
	; KNL_X32-NEXT: shll $15, %ecx			; KNL_X32-NEXT: shll $15, %ecx
	; KNL_X32-NEXT: orl %ebx, %ecx			; KNL_X32-NEXT: movzwl %cx, %ecx
				; KNL_X32-NEXT: orl %esi, %ecx
	; KNL_X32-NEXT: orl %edx, %ecx			; KNL_X32-NEXT: orl %edx, %ecx
				; KNL_X32-NEXT: movl {{[0-9]+}}(%esp), %eax
	; KNL_X32-NEXT: movw %cx, (%eax)			; KNL_X32-NEXT: movw %cx, (%eax)
	; KNL_X32-NEXT: addl $16, %esp			; KNL_X32-NEXT: addl $16, %esp
	; KNL_X32-NEXT: popl %esi			; KNL_X32-NEXT: popl %esi
	; KNL_X32-NEXT: popl %edi			; KNL_X32-NEXT: popl %edi
	; KNL_X32-NEXT: popl %ebx			; KNL_X32-NEXT: popl %ebx
	; KNL_X32-NEXT: popl %ebp			; KNL_X32-NEXT: popl %ebp
	; KNL_X32-NEXT: retl $4			; KNL_X32-NEXT: retl $4
	;			;
	▲ Show 20 Lines • Show All 268 Lines • ▼ Show 20 Lines
	; FASTISEL-NEXT: andl $1, %r9d			; FASTISEL-NEXT: andl $1, %r9d
	; FASTISEL-NEXT: shll $4, %r9d			; FASTISEL-NEXT: shll $4, %r9d
	; FASTISEL-NEXT: orl %ecx, %r9d			; FASTISEL-NEXT: orl %ecx, %r9d
	; FASTISEL-NEXT: andl $1, %r8d			; FASTISEL-NEXT: andl $1, %r8d
	; FASTISEL-NEXT: shll $5, %r8d			; FASTISEL-NEXT: shll $5, %r8d
	; FASTISEL-NEXT: orl %r9d, %r8d			; FASTISEL-NEXT: orl %r9d, %r8d
	; FASTISEL-NEXT: andl $1, %r10d			; FASTISEL-NEXT: andl $1, %r10d
	; FASTISEL-NEXT: shll $6, %r10d			; FASTISEL-NEXT: shll $6, %r10d
	; FASTISEL-NEXT: andl $1, %r11d
	; FASTISEL-NEXT: shll $7, %r11d			; FASTISEL-NEXT: shll $7, %r11d
	; FASTISEL-NEXT: orl %r10d, %r11d			; FASTISEL-NEXT: movzbl %r11b, %ecx
				; FASTISEL-NEXT: orl %r10d, %ecx
	; FASTISEL-NEXT: andl $1, %ebx			; FASTISEL-NEXT: andl $1, %ebx
	; FASTISEL-NEXT: shll $8, %ebx			; FASTISEL-NEXT: shll $8, %ebx
	; FASTISEL-NEXT: orl %r11d, %ebx			; FASTISEL-NEXT: orl %ecx, %ebx
	; FASTISEL-NEXT: andl $1, %r14d			; FASTISEL-NEXT: andl $1, %r14d
	; FASTISEL-NEXT: shll $9, %r14d			; FASTISEL-NEXT: shll $9, %r14d
	; FASTISEL-NEXT: orl %ebx, %r14d			; FASTISEL-NEXT: orl %ebx, %r14d
	; FASTISEL-NEXT: andl $1, %ebp			; FASTISEL-NEXT: andl $1, %ebp
	; FASTISEL-NEXT: shll $10, %ebp			; FASTISEL-NEXT: shll $10, %ebp
	; FASTISEL-NEXT: orl %r14d, %ebp			; FASTISEL-NEXT: orl %r14d, %ebp
	; FASTISEL-NEXT: orl %r8d, %ebp			; FASTISEL-NEXT: orl %r8d, %ebp
	; FASTISEL-NEXT: andl $1, %r15d			; FASTISEL-NEXT: andl $1, %r15d
	; FASTISEL-NEXT: shll $11, %r15d			; FASTISEL-NEXT: shll $11, %r15d
	; FASTISEL-NEXT: andl $1, %r12d			; FASTISEL-NEXT: andl $1, %r12d
	; FASTISEL-NEXT: shll $12, %r12d			; FASTISEL-NEXT: shll $12, %r12d
	; FASTISEL-NEXT: orl %r15d, %r12d			; FASTISEL-NEXT: orl %r15d, %r12d
	; FASTISEL-NEXT: andl $1, %r13d			; FASTISEL-NEXT: andl $1, %r13d
	; FASTISEL-NEXT: shll $13, %r13d			; FASTISEL-NEXT: shll $13, %r13d
	; FASTISEL-NEXT: orl %r12d, %r13d			; FASTISEL-NEXT: orl %r12d, %r13d
	; FASTISEL-NEXT: andl $1, %edx			; FASTISEL-NEXT: andl $1, %edx
	; FASTISEL-NEXT: shll $14, %edx			; FASTISEL-NEXT: shll $14, %edx
	; FASTISEL-NEXT: orl %r13d, %edx			; FASTISEL-NEXT: orl %r13d, %edx
	; FASTISEL-NEXT: andl $1, %esi
	; FASTISEL-NEXT: shll $15, %esi			; FASTISEL-NEXT: shll $15, %esi
	; FASTISEL-NEXT: orl %edx, %esi			; FASTISEL-NEXT: movzwl %si, %ecx
	; FASTISEL-NEXT: orl %ebp, %esi			; FASTISEL-NEXT: orl %edx, %ecx
	; FASTISEL-NEXT: movw %si, (%rax)			; FASTISEL-NEXT: orl %ebp, %ecx
				; FASTISEL-NEXT: movw %cx, (%rax)
	; FASTISEL-NEXT: popq %rbx			; FASTISEL-NEXT: popq %rbx
	; FASTISEL-NEXT: popq %r12			; FASTISEL-NEXT: popq %r12
	; FASTISEL-NEXT: popq %r13			; FASTISEL-NEXT: popq %r13
	; FASTISEL-NEXT: popq %r14			; FASTISEL-NEXT: popq %r14
	; FASTISEL-NEXT: popq %r15			; FASTISEL-NEXT: popq %r15
	; FASTISEL-NEXT: popq %rbp			; FASTISEL-NEXT: popq %rbp
	; FASTISEL-NEXT: retq			; FASTISEL-NEXT: retq
	%c = and <17 x i1> %a, %b			%c = and <17 x i1> %a, %b
	▲ Show 20 Lines • Show All 2,235 Lines • Show Last 20 Lines

llvm/test/CodeGen/X86/bitreverse.ll

Show First 20 Lines • Show All 507 Lines • ▼ Show 20 Lines	; GFNI-NEXT: retq
ret i8 %b		ret i8 %b
}		}

declare i4 @llvm.bitreverse.i4(i4) readnone		declare i4 @llvm.bitreverse.i4(i4) readnone

define i4 @test_bitreverse_i4(i4 %a) {		define i4 @test_bitreverse_i4(i4 %a) {
; X86-LABEL: test_bitreverse_i4:		; X86-LABEL: test_bitreverse_i4:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movzbl {{[0-9]+}}(%esp), %ecx		; X86-NEXT: movzbl {{[0-9]+}}(%esp), %eax
; X86-NEXT: movl %ecx, %eax		; X86-NEXT: movl %eax, %ecx
; X86-NEXT: andb $8, %al		; X86-NEXT: andb $2, %cl
; X86-NEXT: movl %ecx, %edx		; X86-NEXT: addb %cl, %cl
; X86-NEXT: addb %cl, %dl		; X86-NEXT: movl %eax, %edx
; X86-NEXT: andb $4, %dl		; X86-NEXT: andb $1, %dl
; X86-NEXT: movb %cl, %ah		; X86-NEXT: shlb $3, %dl
; X86-NEXT: shlb $3, %ah		; X86-NEXT: orb %cl, %dl
; X86-NEXT: andb $8, %ah		; X86-NEXT: movl %eax, %ecx
; X86-NEXT: orb %dl, %ah
; X86-NEXT: shrb %cl		; X86-NEXT: shrb %cl
; X86-NEXT: andb $2, %cl		; X86-NEXT: andb $2, %cl
; X86-NEXT: orb %ah, %cl		; X86-NEXT: orb %dl, %cl
; X86-NEXT: shrb $3, %al		; X86-NEXT: shrb $3, %al
		; X86-NEXT: andb $1, %al
; X86-NEXT: orb %cl, %al		; X86-NEXT: orb %cl, %al
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: test_bitreverse_i4:		; X64-LABEL: test_bitreverse_i4:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: # kill: def $edi killed $edi def $rdi
; X64-NEXT: movl %edi, %eax		; X64-NEXT: movl %edi, %eax
; X64-NEXT: andb $8, %al		; X64-NEXT: andb $2, %al
; X64-NEXT: leal (%rdi,%rdi), %ecx		; X64-NEXT: addb %al, %al
pengfei [Not Done]: IIRC, LEA is expensive, so this looks like a good deal.
craig.topper [Not Done]: I thought LEA was only expensive if it uses 3 sources.
pengfei [Not Done]: You are right. SOM says only 3 operand LEA is slow. But I found MCA cannot reflect the difference. So this is yet another regression. old: https://godbolt.org/z/dPW93v8zT new: https://godbolt.org/z/hrEovKW4f
craig.topper [Not Done]: Looks like the scheduler models make the distinction for AMD CPUs but not Intel. Small snippet of the predicate check:
X86SchedPredicates.td 61:def IsThreeOperandsLEAFn :
X86ScheduleBdVer2.td 560: IsThreeOperandsLEAFn,
X86ScheduleBtVer2.td 945: IsThreeOperandsLEAFn,
X86ScheduleZnver3.td 590: IsThreeOperandsLEAFn,
pengfei [Not Done]: Thanks for the information! We may need to do the same thing for Intel. Filed #60043 for it.
; X64-NEXT: andb $4, %cl		; X64-NEXT: movl %edi, %ecx
; X64-NEXT: leal (,%rdi,8), %edx		; X64-NEXT: andb $1, %cl
; X64-NEXT: andb $8, %dl		; X64-NEXT: shlb $3, %cl
; X64-NEXT: orb %cl, %dl		; X64-NEXT: orb %al, %cl
; X64-NEXT: shrb %dil		; X64-NEXT: movl %edi, %eax
; X64-NEXT: andb $2, %dil		; X64-NEXT: shrb %al
; X64-NEXT: orb %dil, %dl		; X64-NEXT: andb $2, %al
; X64-NEXT: shrb $3, %al		; X64-NEXT: orb %cl, %al
; X64-NEXT: orb %dl, %al		; X64-NEXT: shrb $3, %dil
		; X64-NEXT: andb $1, %dil
		; X64-NEXT: orb %dil, %al
; X64-NEXT: retq		; X64-NEXT: retq
;		;
; X86XOP-LABEL: test_bitreverse_i4:		; X86XOP-LABEL: test_bitreverse_i4:
; X86XOP: # %bb.0:		; X86XOP: # %bb.0:
; X86XOP-NEXT: vmovd {{.*#+}} xmm0 = mem[0],zero,zero,zero		; X86XOP-NEXT: vmovd {{.*#+}} xmm0 = mem[0],zero,zero,zero
; X86XOP-NEXT: vpperm {{\.?LCPI[0-9]+_[0-9]+}}, %xmm0, %xmm0, %xmm0		; X86XOP-NEXT: vpperm {{\.?LCPI[0-9]+_[0-9]+}}, %xmm0, %xmm0, %xmm0
; X86XOP-NEXT: vmovd %xmm0, %eax		; X86XOP-NEXT: vmovd %xmm0, %eax
; X86XOP-NEXT: shrb $4, %al		; X86XOP-NEXT: shrb $4, %al
; X86XOP-NEXT: # kill: def $al killed $al killed $eax		; X86XOP-NEXT: # kill: def $al killed $al killed $eax
; X86XOP-NEXT: retl		; X86XOP-NEXT: retl
;		;
; GFNI-LABEL: test_bitreverse_i4:		; GFNI-LABEL: test_bitreverse_i4:
; GFNI: # %bb.0:		; GFNI: # %bb.0:
; GFNI-NEXT: # kill: def $edi killed $edi def $rdi
; GFNI-NEXT: movl %edi, %eax		; GFNI-NEXT: movl %edi, %eax
; GFNI-NEXT: andb $8, %al		; GFNI-NEXT: andb $2, %al
; GFNI-NEXT: leal (%rdi,%rdi), %ecx		; GFNI-NEXT: addb %al, %al
; GFNI-NEXT: andb $4, %cl		; GFNI-NEXT: movl %edi, %ecx
; GFNI-NEXT: leal (,%rdi,8), %edx		; GFNI-NEXT: andb $1, %cl
; GFNI-NEXT: andb $8, %dl		; GFNI-NEXT: shlb $3, %cl
; GFNI-NEXT: orb %cl, %dl		; GFNI-NEXT: orb %al, %cl
; GFNI-NEXT: shrb %dil		; GFNI-NEXT: movl %edi, %eax
; GFNI-NEXT: andb $2, %dil		; GFNI-NEXT: shrb %al
; GFNI-NEXT: orb %dil, %dl		; GFNI-NEXT: andb $2, %al
; GFNI-NEXT: shrb $3, %al		; GFNI-NEXT: orb %cl, %al
; GFNI-NEXT: orb %dl, %al		; GFNI-NEXT: shrb $3, %dil
		; GFNI-NEXT: andb $1, %dil
		; GFNI-NEXT: orb %dil, %al
; GFNI-NEXT: retq		; GFNI-NEXT: retq
%b = call i4 @llvm.bitreverse.i4(i4 %a)		%b = call i4 @llvm.bitreverse.i4(i4 %a)
ret i4 %b		ret i4 %b
}		}
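
The old and new X64 sequences for `test_bitreverse_i4` above differ mainly in whether the `and` is applied before or after the shift, which changes the mask constants (8/4/8/2 before, 2/1/2/1 after). The following is a minimal sketch that models both sequences on 8-bit values and checks that they agree. The function names `old_x64`, `new_x64` and `bitreverse4` are hypothetical and exist only for this illustration; the register comments are transcribed from the CHECK lines above.

```python
def old_x64(x):
    """Model of the old X64 sequence (mask after shifting)."""
    x &= 0xFF                    # only the low byte is ever used
    al = x & 8                   # andb $8, %al
    cl = (2 * x) & 4             # leal (%rdi,%rdi), %ecx ; andb $4, %cl
    dl = (8 * x) & 8             # leal (,%rdi,8), %edx ; andb $8, %dl
    dl |= cl                     # orb %cl, %dl
    dil = (x >> 1) & 2           # shrb %dil ; andb $2, %dil
    dl |= dil                    # orb %dil, %dl
    al >>= 3                     # shrb $3, %al
    return al | dl               # orb %dl, %al


def new_x64(x):
    """Model of the new X64 sequence (mask before shifting where possible)."""
    x &= 0xFF
    al = ((x & 2) * 2) & 0xFF    # andb $2, %al ; addb %al, %al
    cl = ((x & 1) << 3) & 0xFF   # andb $1, %cl ; shlb $3, %cl
    cl |= al                     # orb %al, %cl
    al = (x >> 1) & 2            # shrb %al ; andb $2, %al
    al |= cl                     # orb %cl, %al
    dil = (x >> 3) & 1           # shrb $3, %dil ; andb $1, %dil
    return al | dil              # orb %dil, %al


def bitreverse4(x):
    """Reference: reverse the low 4 bits of x."""
    return int(format(x & 0xF, "04b")[::-1], 2)


# Exhaustive check over every possible low byte, including garbage high bits.
for value in range(256):
    assert old_x64(value) == new_x64(value) == bitreverse4(value)
```

The check is exhaustive over all 256 byte values, so it also covers inputs whose upper four bits are not zero.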

; These tests check that bitreverse(constant) calls are folded		; These tests check that bitreverse(constant) calls are folded

define <2 x i16> @fold_v2i16() {		define <2 x i16> @fold_v2i16() {
[...]

llvm/test/CodeGen/X86/bmi-x86_64.ll

	[...]
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-NEXT: bextrq %rsi, %rdi, %rax			; CHECK-NEXT: bextrq %rsi, %rdi, %rax
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	%tmp = tail call i64 @llvm.x86.bmi.bextr.64(i64 %x, i64 %y)			%tmp = tail call i64 @llvm.x86.bmi.bextr.64(i64 %x, i64 %y)
	ret i64 %tmp			ret i64 %tmp
	}			}

	define i64 @bextr64b(i64 %x) uwtable ssp {			define i64 @bextr64b(i64 %x) uwtable ssp {
	; BEXTR-SLOW-LABEL: bextr64b:			; CHECK-LABEL: bextr64b:
	; BEXTR-SLOW: # %bb.0:			; CHECK: # %bb.0:
	; BEXTR-SLOW-NEXT: movq %rdi, %rax			; CHECK-NEXT: movzwl %di, %eax
	; BEXTR-SLOW-NEXT: shrl $4, %eax			; CHECK-NEXT: shrl $4, %eax
	; BEXTR-SLOW-NEXT: andl $4095, %eax # imm = 0xFFF			; CHECK-NEXT: retq
	; BEXTR-SLOW-NEXT: retq
	;
	; BEXTR-FAST-LABEL: bextr64b:
	; BEXTR-FAST: # %bb.0:
	; BEXTR-FAST-NEXT: movl $3076, %eax # imm = 0xC04
	; BEXTR-FAST-NEXT: bextrl %eax, %edi, %eax
	; BEXTR-FAST-NEXT: retq
	%1 = lshr i64 %x, 4			%1 = lshr i64 %x, 4
	%2 = and i64 %1, 4095			%2 = and i64 %1, 4095
	ret i64 %2			ret i64 %2
	}			}

	; Make sure we still use the AH subreg trick to extract 15:8			; Make sure we still use the AH subreg trick to extract 15:8
	define i64 @bextr64_subreg(i64 %x) uwtable ssp {			define i64 @bextr64_subreg(i64 %x) uwtable ssp {
	; CHECK-LABEL: bextr64_subreg:			; CHECK-LABEL: bextr64_subreg:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-NEXT: movq %rdi, %rax			; CHECK-NEXT: movq %rdi, %rax
	; CHECK-NEXT: movzbl %ah, %eax			; CHECK-NEXT: movzbl %ah, %eax
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	%1 = lshr i64 %x, 8			%1 = lshr i64 %x, 8
	%2 = and i64 %1, 255			%2 = and i64 %1, 255
	ret i64 %2			ret i64 %2
	}			}

	define i64 @bextr64b_load(ptr %x) {			define i64 @bextr64b_load(ptr %x) {
	; BEXTR-SLOW-LABEL: bextr64b_load:			; CHECK-LABEL: bextr64b_load:
	; BEXTR-SLOW: # %bb.0:			; CHECK: # %bb.0:
	; BEXTR-SLOW-NEXT: movl (%rdi), %eax			; CHECK-NEXT: movzwl (%rdi), %eax
	; BEXTR-SLOW-NEXT: shrl $4, %eax			; CHECK-NEXT: shrl $4, %eax
	; BEXTR-SLOW-NEXT: andl $4095, %eax # imm = 0xFFF			; CHECK-NEXT: retq
	; BEXTR-SLOW-NEXT: retq
	;
	; BEXTR-FAST-LABEL: bextr64b_load:
	; BEXTR-FAST: # %bb.0:
	; BEXTR-FAST-NEXT: movl $3076, %eax # imm = 0xC04
	; BEXTR-FAST-NEXT: bextrl %eax, (%rdi), %eax
	; BEXTR-FAST-NEXT: retq
	%1 = load i64, ptr %x, align 8			%1 = load i64, ptr %x, align 8
	%2 = lshr i64 %1, 4			%2 = lshr i64 %1, 4
	%3 = and i64 %2, 4095			%3 = and i64 %2, 4095
	ret i64 %3			ret i64 %3
	}			}

	; PR34042			; PR34042
	define i64 @bextr64c(i64 %x, i32 %y) {			define i64 @bextr64c(i64 %x, i32 %y) {
	[...]
	; CHECK-NEXT: movabsq $8589934590, %rax # imm = 0x1FFFFFFFE			; CHECK-NEXT: movabsq $8589934590, %rax # imm = 0x1FFFFFFFE
	; CHECK-NEXT: andq %rdi, %rax			; CHECK-NEXT: andq %rdi, %rax
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	entry:			entry:
	%shr = lshr i64 %x, 2			%shr = lshr i64 %x, 2
	%and = and i64 %shr, 8589934590			%and = and i64 %shr, 8589934590
	ret i64 %and			ret i64 %and
	}			}
				;; NOTE: These prefixes are unused and the list is autogenerated. Do not add tests below this line:
				; BEXTR-SLOW: {{.*}}
				pengfei: Remove `BEXTR-SLOW` to eliminate the message.
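
The `bextr64b` change above replaces a shift followed by `andl $4095` with `movzwl` followed by `shrl $4`. A minimal sketch of why the two forms compute the same value: masking after the shift with `m` is the same as masking before the shift with `m << s`, and the low `s` bits can be set freely because the shift discards them. The helper `widened_pre_shift_mask` is a hypothetical name for this illustration and is not the patch's actual logic.

```python
MASK64 = (1 << 64) - 1


def shift_then_mask(x):
    # IR form: and (lshr i64 %x, 4), 4095
    return ((x & MASK64) >> 4) & 0xFFF


def zext16_then_shift(x):
    # New CHECK lines: movzwl %di, %eax ; shrl $4, %eax
    return (x & 0xFFFF) >> 4


def widened_pre_shift_mask(shift, mask):
    # Move the mask in front of the shift, then fill in the low bits
    # that the shift throws away anyway.
    return ((mask << shift) | ((1 << shift) - 1)) & MASK64


samples = [0, 1, 0xFFF0, 0xFFFF, 0x12345678, 0xDEADBEEFCAFEF00D, MASK64]
for value in samples:
    assert shift_then_mask(value) == zext16_then_shift(value)

# Shift by 4 with mask 0xFFF widens to exactly a 16-bit zero-extend mask.
assert widened_pre_shift_mask(4, 0xFFF) == 0xFFFF
# Shift by 3 with the same mask does not: bit 15 is missing.
assert widened_pre_shift_mask(3, 0xFFF) == 0x7FFF
```

For shift 4 the pre-shift mask is exactly the zero-extension performed by `movzwl`, which is why no separate `and` remains in the new output.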

llvm/test/CodeGen/X86/bmi.ll

[...]
; X64-NEXT: bextrl %esi, (%rdi), %eax		; X64-NEXT: bextrl %esi, (%rdi), %eax
; X64-NEXT: retq		; X64-NEXT: retq
%x1 = load i32, ptr %x		%x1 = load i32, ptr %x
%tmp = tail call i32 @llvm.x86.bmi.bextr.32(i32 %x1, i32 %y)		%tmp = tail call i32 @llvm.x86.bmi.bextr.32(i32 %x1, i32 %y)
ret i32 %tmp		ret i32 %tmp
}		}

define i32 @bextr32b(i32 %x) uwtable ssp {		define i32 @bextr32b(i32 %x) uwtable ssp {
; X86-SLOW-BEXTR-LABEL: bextr32b:		; X86-LABEL: bextr32b:
		; X86: # %bb.0:
		; X86-NEXT: movzwl {{[0-9]+}}(%esp), %eax
		; X86-NEXT: shrl $4, %eax
		; X86-NEXT: retl
		;
		; X64-LABEL: bextr32b:
		; X64: # %bb.0:
		; X64-NEXT: movzwl %di, %eax
		; X64-NEXT: shrl $4, %eax
		; X64-NEXT: retq
		%1 = lshr i32 %x, 4
		%2 = and i32 %1, 4095
		ret i32 %2
		}

		define i32 @bextr32b_no_mov(i32 %x) uwtable ssp {
		; X86-SLOW-BEXTR-LABEL: bextr32b_no_mov:
; X86-SLOW-BEXTR: # %bb.0:		; X86-SLOW-BEXTR: # %bb.0:
; X86-SLOW-BEXTR-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-SLOW-BEXTR-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-SLOW-BEXTR-NEXT: shrl $4, %eax		; X86-SLOW-BEXTR-NEXT: shrl $3, %eax
; X86-SLOW-BEXTR-NEXT: andl $4095, %eax # imm = 0xFFF		; X86-SLOW-BEXTR-NEXT: andl $4095, %eax # imm = 0xFFF
; X86-SLOW-BEXTR-NEXT: retl		; X86-SLOW-BEXTR-NEXT: retl
;		;
; X64-SLOW-BEXTR-LABEL: bextr32b:		; X64-SLOW-BEXTR-LABEL: bextr32b_no_mov:
; X64-SLOW-BEXTR: # %bb.0:		; X64-SLOW-BEXTR: # %bb.0:
; X64-SLOW-BEXTR-NEXT: movl %edi, %eax		; X64-SLOW-BEXTR-NEXT: movl %edi, %eax
; X64-SLOW-BEXTR-NEXT: shrl $4, %eax		; X64-SLOW-BEXTR-NEXT: shrl $3, %eax
; X64-SLOW-BEXTR-NEXT: andl $4095, %eax # imm = 0xFFF		; X64-SLOW-BEXTR-NEXT: andl $4095, %eax # imm = 0xFFF
; X64-SLOW-BEXTR-NEXT: retq		; X64-SLOW-BEXTR-NEXT: retq
;		;
; X86-FAST-BEXTR-LABEL: bextr32b:		; X86-FAST-BEXTR-LABEL: bextr32b_no_mov:
; X86-FAST-BEXTR: # %bb.0:		; X86-FAST-BEXTR: # %bb.0:
; X86-FAST-BEXTR-NEXT: movl $3076, %eax # imm = 0xC04		; X86-FAST-BEXTR-NEXT: movl $3075, %eax # imm = 0xC03
; X86-FAST-BEXTR-NEXT: bextrl %eax, {{[0-9]+}}(%esp), %eax		; X86-FAST-BEXTR-NEXT: bextrl %eax, {{[0-9]+}}(%esp), %eax
; X86-FAST-BEXTR-NEXT: retl		; X86-FAST-BEXTR-NEXT: retl
;		;
; X64-FAST-BEXTR-LABEL: bextr32b:		; X64-FAST-BEXTR-LABEL: bextr32b_no_mov:
; X64-FAST-BEXTR: # %bb.0:		; X64-FAST-BEXTR: # %bb.0:
; X64-FAST-BEXTR-NEXT: movl $3076, %eax # imm = 0xC04		; X64-FAST-BEXTR-NEXT: movl $3075, %eax # imm = 0xC03
; X64-FAST-BEXTR-NEXT: bextrl %eax, %edi, %eax		; X64-FAST-BEXTR-NEXT: bextrl %eax, %edi, %eax
; X64-FAST-BEXTR-NEXT: retq		; X64-FAST-BEXTR-NEXT: retq
%1 = lshr i32 %x, 4		%1 = lshr i32 %x, 3
%2 = and i32 %1, 4095		%2 = and i32 %1, 4095
ret i32 %2		ret i32 %2
}		}

; Make sure we still use AH subreg trick to extract 15:8		; Make sure we still use AH subreg trick to extract 15:8
define i32 @bextr32_subreg(i32 %x) uwtable ssp {		define i32 @bextr32_subreg(i32 %x) uwtable ssp {
; X86-LABEL: bextr32_subreg:		; X86-LABEL: bextr32_subreg:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movzbl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movzbl {{[0-9]+}}(%esp), %eax
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: bextr32_subreg:		; X64-LABEL: bextr32_subreg:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movl %edi, %eax		; X64-NEXT: movl %edi, %eax
; X64-NEXT: movzbl %ah, %eax		; X64-NEXT: movzbl %ah, %eax
; X64-NEXT: retq		; X64-NEXT: retq
%1 = lshr i32 %x, 8		%1 = lshr i32 %x, 8
%2 = and i32 %1, 255		%2 = and i32 %1, 255
ret i32 %2		ret i32 %2
}		}

define i32 @bextr32b_load(ptr %x) uwtable ssp {		define i32 @bextr32b_load(ptr %x) uwtable ssp {
; X86-SLOW-BEXTR-LABEL: bextr32b_load:		; X86-LABEL: bextr32b_load:
		; X86: # %bb.0:
		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
		; X86-NEXT: movzwl (%eax), %eax
		; X86-NEXT: shrl $4, %eax
		; X86-NEXT: retl
		;
		; X64-LABEL: bextr32b_load:
		; X64: # %bb.0:
		; X64-NEXT: movzwl (%rdi), %eax
		; X64-NEXT: shrl $4, %eax
		; X64-NEXT: retq
		%1 = load i32, ptr %x
		%2 = lshr i32 %1, 4
		%3 = and i32 %2, 4095
		ret i32 %3
		}

		define i32 @bextr32_load_no_mov(ptr %x) uwtable ssp {
		; X86-SLOW-BEXTR-LABEL: bextr32_load_no_mov:
; X86-SLOW-BEXTR: # %bb.0:		; X86-SLOW-BEXTR: # %bb.0:
; X86-SLOW-BEXTR-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-SLOW-BEXTR-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-SLOW-BEXTR-NEXT: movl (%eax), %eax		; X86-SLOW-BEXTR-NEXT: movl (%eax), %eax
; X86-SLOW-BEXTR-NEXT: shrl $4, %eax		; X86-SLOW-BEXTR-NEXT: shrl $3, %eax
; X86-SLOW-BEXTR-NEXT: andl $4095, %eax # imm = 0xFFF		; X86-SLOW-BEXTR-NEXT: andl $4095, %eax # imm = 0xFFF
; X86-SLOW-BEXTR-NEXT: retl		; X86-SLOW-BEXTR-NEXT: retl
;		;
; X64-SLOW-BEXTR-LABEL: bextr32b_load:		; X64-SLOW-BEXTR-LABEL: bextr32_load_no_mov:
; X64-SLOW-BEXTR: # %bb.0:		; X64-SLOW-BEXTR: # %bb.0:
; X64-SLOW-BEXTR-NEXT: movl (%rdi), %eax		; X64-SLOW-BEXTR-NEXT: movl (%rdi), %eax
; X64-SLOW-BEXTR-NEXT: shrl $4, %eax		; X64-SLOW-BEXTR-NEXT: shrl $3, %eax
; X64-SLOW-BEXTR-NEXT: andl $4095, %eax # imm = 0xFFF		; X64-SLOW-BEXTR-NEXT: andl $4095, %eax # imm = 0xFFF
; X64-SLOW-BEXTR-NEXT: retq		; X64-SLOW-BEXTR-NEXT: retq
;		;
; X86-FAST-BEXTR-LABEL: bextr32b_load:		; X86-FAST-BEXTR-LABEL: bextr32_load_no_mov:
; X86-FAST-BEXTR: # %bb.0:		; X86-FAST-BEXTR: # %bb.0:
; X86-FAST-BEXTR-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-FAST-BEXTR-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-FAST-BEXTR-NEXT: movl $3076, %ecx # imm = 0xC04		; X86-FAST-BEXTR-NEXT: movl $3075, %ecx # imm = 0xC03
; X86-FAST-BEXTR-NEXT: bextrl %ecx, (%eax), %eax		; X86-FAST-BEXTR-NEXT: bextrl %ecx, (%eax), %eax
; X86-FAST-BEXTR-NEXT: retl		; X86-FAST-BEXTR-NEXT: retl
;		;
; X64-FAST-BEXTR-LABEL: bextr32b_load:		; X64-FAST-BEXTR-LABEL: bextr32_load_no_mov:
; X64-FAST-BEXTR: # %bb.0:		; X64-FAST-BEXTR: # %bb.0:
; X64-FAST-BEXTR-NEXT: movl $3076, %eax # imm = 0xC04		; X64-FAST-BEXTR-NEXT: movl $3075, %eax # imm = 0xC03
; X64-FAST-BEXTR-NEXT: bextrl %eax, (%rdi), %eax		; X64-FAST-BEXTR-NEXT: bextrl %eax, (%rdi), %eax
; X64-FAST-BEXTR-NEXT: retq		; X64-FAST-BEXTR-NEXT: retq
%1 = load i32, ptr %x		%1 = load i32, ptr %x
%2 = lshr i32 %1, 4		%2 = lshr i32 %1, 3
%3 = and i32 %2, 4095		%3 = and i32 %2, 4095
ret i32 %3		ret i32 %3
}		}
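
The FAST-BEXTR checks above load a control word (`0xC04` before, `0xC03` in the `_no_mov` variants) and pass it to `bextrl`. As a minimal sketch, assuming the documented BEXTR encoding (start bit in bits 7:0 of the control, field length in bits 15:8), the constants decode to "12 bits starting at bit 4" and "12 bits starting at bit 3". The names `make_control` and `bextr` are hypothetical helpers for this illustration.

```python
def make_control(start, length):
    # BEXTR control word: start in bits 7:0, length in bits 15:8.
    return ((length & 0xFF) << 8) | (start & 0xFF)


def bextr(src, control, width=32):
    # Simplified software model of BEXTR on a `width`-bit register.
    start = control & 0xFF
    length = (control >> 8) & 0xFF
    src &= (1 << width) - 1
    if start >= width:
        return 0
    field = (src >> start) & ((1 << length) - 1)
    return field & ((1 << width) - 1)


# The immediates in the CHECK lines.
assert make_control(4, 12) == 3076 == 0xC04
assert make_control(3, 12) == 3075 == 0xC03

# BEXTR with these controls matches the IR pattern and (lshr x, s), 4095.
for x in (0, 0xFFFFFFFF, 0x12345678, 0x80000001):
    assert bextr(x, 0xC04) == (x >> 4) & 0xFFF
    assert bextr(x, 0xC03) == (x >> 3) & 0xFFF
```

The model ignores flag results and only covers the value computation.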

; PR34042		; PR34042
define i32 @bextr32c(i32 %x, i16 zeroext %y) {		define i32 @bextr32c(i32 %x, i16 zeroext %y) {
; X86-LABEL: bextr32c:		; X86-LABEL: bextr32c:
; X86: # %bb.0:		; X86: # %bb.0:
[...]	; X64-NEXT: retq
%tmp2 = and i32 %x1, %tmp		%tmp2 = and i32 %x1, %tmp
ret i32 %tmp2		ret i32 %tmp2
}		}

define i32 @blsi32_z(i32 %a, i32 %b) nounwind {		define i32 @blsi32_z(i32 %a, i32 %b) nounwind {
; X86-LABEL: blsi32_z:		; X86-LABEL: blsi32_z:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: blsil {{[0-9]+}}(%esp), %eax		; X86-NEXT: blsil {{[0-9]+}}(%esp), %eax
; X86-NEXT: jne .LBB25_2		; X86-NEXT: jne .LBB27_2
; X86-NEXT: # %bb.1:		; X86-NEXT: # %bb.1:
; X86-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: .LBB25_2:		; X86-NEXT: .LBB27_2:
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: blsi32_z:		; X64-LABEL: blsi32_z:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: blsil %edi, %eax		; X64-NEXT: blsil %edi, %eax
; X64-NEXT: cmovel %esi, %eax		; X64-NEXT: cmovel %esi, %eax
; X64-NEXT: retq		; X64-NEXT: retq
%t0 = sub i32 0, %a		%t0 = sub i32 0, %a
[...]
; X86-NEXT: xorl %edx, %edx		; X86-NEXT: xorl %edx, %edx
; X86-NEXT: movl %ecx, %eax		; X86-NEXT: movl %ecx, %eax
; X86-NEXT: negl %eax		; X86-NEXT: negl %eax
; X86-NEXT: sbbl %esi, %edx		; X86-NEXT: sbbl %esi, %edx
; X86-NEXT: andl %esi, %edx		; X86-NEXT: andl %esi, %edx
; X86-NEXT: andl %ecx, %eax		; X86-NEXT: andl %ecx, %eax
; X86-NEXT: movl %eax, %ecx		; X86-NEXT: movl %eax, %ecx
; X86-NEXT: orl %edx, %ecx		; X86-NEXT: orl %edx, %ecx
; X86-NEXT: jne .LBB29_2		; X86-NEXT: jne .LBB31_2
; X86-NEXT: # %bb.1:		; X86-NEXT: # %bb.1:
; X86-NEXT: movl {{[0-9]+}}(%esp), %edx		; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: .LBB29_2:		; X86-NEXT: .LBB31_2:
; X86-NEXT: popl %esi		; X86-NEXT: popl %esi
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: blsi64_z:		; X64-LABEL: blsi64_z:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: blsiq %rdi, %rax		; X64-NEXT: blsiq %rdi, %rax
; X64-NEXT: cmoveq %rsi, %rax		; X64-NEXT: cmoveq %rsi, %rax
; X64-NEXT: retq		; X64-NEXT: retq
[...]	; X64-NEXT: retq
%tmp2 = xor i32 %x1, %tmp		%tmp2 = xor i32 %x1, %tmp
ret i32 %tmp2		ret i32 %tmp2
}		}

define i32 @blsmsk32_z(i32 %a, i32 %b) nounwind {		define i32 @blsmsk32_z(i32 %a, i32 %b) nounwind {
; X86-LABEL: blsmsk32_z:		; X86-LABEL: blsmsk32_z:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: blsmskl {{[0-9]+}}(%esp), %eax		; X86-NEXT: blsmskl {{[0-9]+}}(%esp), %eax
; X86-NEXT: jne .LBB34_2		; X86-NEXT: jne .LBB36_2
; X86-NEXT: # %bb.1:		; X86-NEXT: # %bb.1:
; X86-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: .LBB34_2:		; X86-NEXT: .LBB36_2:
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: blsmsk32_z:		; X64-LABEL: blsmsk32_z:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: blsmskl %edi, %eax		; X64-NEXT: blsmskl %edi, %eax
; X64-NEXT: cmovel %esi, %eax		; X64-NEXT: cmovel %esi, %eax
; X64-NEXT: retq		; X64-NEXT: retq
%t0 = sub i32 %a, 1		%t0 = sub i32 %a, 1
[...]
; X86-NEXT: movl %ecx, %eax		; X86-NEXT: movl %ecx, %eax
; X86-NEXT: addl $-1, %eax		; X86-NEXT: addl $-1, %eax
; X86-NEXT: movl %esi, %edx		; X86-NEXT: movl %esi, %edx
; X86-NEXT: adcl $-1, %edx		; X86-NEXT: adcl $-1, %edx
; X86-NEXT: xorl %ecx, %eax		; X86-NEXT: xorl %ecx, %eax
; X86-NEXT: xorl %esi, %edx		; X86-NEXT: xorl %esi, %edx
; X86-NEXT: movl %eax, %ecx		; X86-NEXT: movl %eax, %ecx
; X86-NEXT: orl %edx, %ecx		; X86-NEXT: orl %edx, %ecx
; X86-NEXT: jne .LBB38_2		; X86-NEXT: jne .LBB40_2
; X86-NEXT: # %bb.1:		; X86-NEXT: # %bb.1:
; X86-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: movl {{[0-9]+}}(%esp), %edx		; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-NEXT: .LBB38_2:		; X86-NEXT: .LBB40_2:
; X86-NEXT: popl %esi		; X86-NEXT: popl %esi
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: blsmsk64_z:		; X64-LABEL: blsmsk64_z:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: blsmskq %rdi, %rax		; X64-NEXT: blsmskq %rdi, %rax
; X64-NEXT: cmoveq %rsi, %rax		; X64-NEXT: cmoveq %rsi, %rax
; X64-NEXT: retq		; X64-NEXT: retq
[...]	; X64-NEXT: retq
%tmp2 = and i32 %x1, %tmp		%tmp2 = and i32 %x1, %tmp
ret i32 %tmp2		ret i32 %tmp2
}		}

define i32 @blsr32_z(i32 %a, i32 %b) nounwind {		define i32 @blsr32_z(i32 %a, i32 %b) nounwind {
; X86-LABEL: blsr32_z:		; X86-LABEL: blsr32_z:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: blsrl {{[0-9]+}}(%esp), %eax		; X86-NEXT: blsrl {{[0-9]+}}(%esp), %eax
; X86-NEXT: jne .LBB43_2		; X86-NEXT: jne .LBB45_2
; X86-NEXT: # %bb.1:		; X86-NEXT: # %bb.1:
; X86-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: .LBB43_2:		; X86-NEXT: .LBB45_2:
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: blsr32_z:		; X64-LABEL: blsr32_z:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: blsrl %edi, %eax		; X64-NEXT: blsrl %edi, %eax
; X64-NEXT: cmovel %esi, %eax		; X64-NEXT: cmovel %esi, %eax
; X64-NEXT: retq		; X64-NEXT: retq
%t0 = sub i32 %a, 1		%t0 = sub i32 %a, 1
[...]
; X86-NEXT: movl %ecx, %eax		; X86-NEXT: movl %ecx, %eax
; X86-NEXT: addl $-1, %eax		; X86-NEXT: addl $-1, %eax
; X86-NEXT: movl %esi, %edx		; X86-NEXT: movl %esi, %edx
; X86-NEXT: adcl $-1, %edx		; X86-NEXT: adcl $-1, %edx
; X86-NEXT: andl %ecx, %eax		; X86-NEXT: andl %ecx, %eax
; X86-NEXT: andl %esi, %edx		; X86-NEXT: andl %esi, %edx
; X86-NEXT: movl %eax, %ecx		; X86-NEXT: movl %eax, %ecx
; X86-NEXT: orl %edx, %ecx		; X86-NEXT: orl %edx, %ecx
; X86-NEXT: jne .LBB47_2		; X86-NEXT: jne .LBB49_2
; X86-NEXT: # %bb.1:		; X86-NEXT: # %bb.1:
; X86-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: movl {{[0-9]+}}(%esp), %edx		; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-NEXT: .LBB47_2:		; X86-NEXT: .LBB49_2:
; X86-NEXT: popl %esi		; X86-NEXT: popl %esi
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: blsr64_z:		; X64-LABEL: blsr64_z:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: blsrq %rdi, %rax		; X64-NEXT: blsrq %rdi, %rax
; X64-NEXT: cmoveq %rsi, %rax		; X64-NEXT: cmoveq %rsi, %rax
; X64-NEXT: retq		; X64-NEXT: retq
[...]
}		}

define void @pr40060(i32, i32) {		define void @pr40060(i32, i32) {
; X86-LABEL: pr40060:		; X86-LABEL: pr40060:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: bextrl %eax, {{[0-9]+}}(%esp), %eax		; X86-NEXT: bextrl %eax, {{[0-9]+}}(%esp), %eax
; X86-NEXT: testl %eax, %eax		; X86-NEXT: testl %eax, %eax
; X86-NEXT: js .LBB52_1		; X86-NEXT: js .LBB54_1
; X86-NEXT: # %bb.2:		; X86-NEXT: # %bb.2:
; X86-NEXT: jmp bar # TAILCALL		; X86-NEXT: jmp bar # TAILCALL
; X86-NEXT: .LBB52_1:		; X86-NEXT: .LBB54_1:
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: pr40060:		; X64-LABEL: pr40060:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: bextrl %esi, %edi, %eax		; X64-NEXT: bextrl %esi, %edi, %eax
; X64-NEXT: testl %eax, %eax		; X64-NEXT: testl %eax, %eax
; X64-NEXT: js .LBB52_1		; X64-NEXT: js .LBB54_1
; X64-NEXT: # %bb.2:		; X64-NEXT: # %bb.2:
; X64-NEXT: jmp bar # TAILCALL		; X64-NEXT: jmp bar # TAILCALL
; X64-NEXT: .LBB52_1:		; X64-NEXT: .LBB54_1:
; X64-NEXT: retq		; X64-NEXT: retq
%3 = tail call i32 @llvm.x86.bmi.bextr.32(i32 %0, i32 %1)		%3 = tail call i32 @llvm.x86.bmi.bextr.32(i32 %0, i32 %1)
%4 = icmp sgt i32 %3, -1		%4 = icmp sgt i32 %3, -1
br i1 %4, label %5, label %6		br i1 %4, label %5, label %6

tail call void @bar()		tail call void @bar()
br label %6		br label %6

ret void		ret void
}		}

define i32 @blsr32_branch(i32 %x) {		define i32 @blsr32_branch(i32 %x) {
; X86-LABEL: blsr32_branch:		; X86-LABEL: blsr32_branch:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: pushl %esi		; X86-NEXT: pushl %esi
; X86-NEXT: .cfi_def_cfa_offset 8		; X86-NEXT: .cfi_def_cfa_offset 8
; X86-NEXT: .cfi_offset %esi, -8		; X86-NEXT: .cfi_offset %esi, -8
; X86-NEXT: blsrl {{[0-9]+}}(%esp), %esi		; X86-NEXT: blsrl {{[0-9]+}}(%esp), %esi
; X86-NEXT: jne .LBB53_2		; X86-NEXT: jne .LBB55_2
; X86-NEXT: # %bb.1:		; X86-NEXT: # %bb.1:
; X86-NEXT: calll bar		; X86-NEXT: calll bar
; X86-NEXT: .LBB53_2:		; X86-NEXT: .LBB55_2:
; X86-NEXT: movl %esi, %eax		; X86-NEXT: movl %esi, %eax
; X86-NEXT: popl %esi		; X86-NEXT: popl %esi
; X86-NEXT: .cfi_def_cfa_offset 4		; X86-NEXT: .cfi_def_cfa_offset 4
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: blsr32_branch:		; X64-LABEL: blsr32_branch:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: pushq %rbx		; X64-NEXT: pushq %rbx
; X64-NEXT: .cfi_def_cfa_offset 16		; X64-NEXT: .cfi_def_cfa_offset 16
; X64-NEXT: .cfi_offset %rbx, -16		; X64-NEXT: .cfi_offset %rbx, -16
; X64-NEXT: blsrl %edi, %ebx		; X64-NEXT: blsrl %edi, %ebx
; X64-NEXT: jne .LBB53_2		; X64-NEXT: jne .LBB55_2
; X64-NEXT: # %bb.1:		; X64-NEXT: # %bb.1:
; X64-NEXT: callq bar		; X64-NEXT: callq bar
; X64-NEXT: .LBB53_2:		; X64-NEXT: .LBB55_2:
; X64-NEXT: movl %ebx, %eax		; X64-NEXT: movl %ebx, %eax
; X64-NEXT: popq %rbx		; X64-NEXT: popq %rbx
; X64-NEXT: .cfi_def_cfa_offset 8		; X64-NEXT: .cfi_def_cfa_offset 8
; X64-NEXT: retq		; X64-NEXT: retq
%tmp = sub i32 %x, 1		%tmp = sub i32 %x, 1
%tmp2 = and i32 %x, %tmp		%tmp2 = and i32 %x, %tmp
%cmp = icmp eq i32 %tmp2, 0		%cmp = icmp eq i32 %tmp2, 0
br i1 %cmp, label %1, label %2		br i1 %cmp, label %1, label %2
[...]
; X86-NEXT: movl %eax, %esi		; X86-NEXT: movl %eax, %esi
; X86-NEXT: addl $-1, %esi		; X86-NEXT: addl $-1, %esi
; X86-NEXT: movl %ecx, %edi		; X86-NEXT: movl %ecx, %edi
; X86-NEXT: adcl $-1, %edi		; X86-NEXT: adcl $-1, %edi
; X86-NEXT: andl %eax, %esi		; X86-NEXT: andl %eax, %esi
; X86-NEXT: andl %ecx, %edi		; X86-NEXT: andl %ecx, %edi
; X86-NEXT: movl %esi, %eax		; X86-NEXT: movl %esi, %eax
; X86-NEXT: orl %edi, %eax		; X86-NEXT: orl %edi, %eax
; X86-NEXT: jne .LBB54_2		; X86-NEXT: jne .LBB56_2
; X86-NEXT: # %bb.1:		; X86-NEXT: # %bb.1:
; X86-NEXT: calll bar		; X86-NEXT: calll bar
; X86-NEXT: .LBB54_2:		; X86-NEXT: .LBB56_2:
; X86-NEXT: movl %esi, %eax		; X86-NEXT: movl %esi, %eax
; X86-NEXT: movl %edi, %edx		; X86-NEXT: movl %edi, %edx
; X86-NEXT: popl %esi		; X86-NEXT: popl %esi
; X86-NEXT: .cfi_def_cfa_offset 8		; X86-NEXT: .cfi_def_cfa_offset 8
; X86-NEXT: popl %edi		; X86-NEXT: popl %edi
; X86-NEXT: .cfi_def_cfa_offset 4		; X86-NEXT: .cfi_def_cfa_offset 4
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: blsr64_branch:		; X64-LABEL: blsr64_branch:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: pushq %rbx		; X64-NEXT: pushq %rbx
; X64-NEXT: .cfi_def_cfa_offset 16		; X64-NEXT: .cfi_def_cfa_offset 16
; X64-NEXT: .cfi_offset %rbx, -16		; X64-NEXT: .cfi_offset %rbx, -16
; X64-NEXT: blsrq %rdi, %rbx		; X64-NEXT: blsrq %rdi, %rbx
; X64-NEXT: jne .LBB54_2		; X64-NEXT: jne .LBB56_2
; X64-NEXT: # %bb.1:		; X64-NEXT: # %bb.1:
; X64-NEXT: callq bar		; X64-NEXT: callq bar
; X64-NEXT: .LBB54_2:		; X64-NEXT: .LBB56_2:
; X64-NEXT: movq %rbx, %rax		; X64-NEXT: movq %rbx, %rax
; X64-NEXT: popq %rbx		; X64-NEXT: popq %rbx
; X64-NEXT: .cfi_def_cfa_offset 8		; X64-NEXT: .cfi_def_cfa_offset 8
; X64-NEXT: retq		; X64-NEXT: retq
%tmp = sub i64 %x, 1		%tmp = sub i64 %x, 1
%tmp2 = and i64 %x, %tmp		%tmp2 = and i64 %x, %tmp
%cmp = icmp eq i64 %tmp2, 0		%cmp = icmp eq i64 %tmp2, 0
br i1 %cmp, label %1, label %2		br i1 %cmp, label %1, label %2

tail call void @bar()		tail call void @bar()
br label %2		br label %2
ret i64 %tmp2		ret i64 %tmp2
}		}

define i32 @blsi32_branch(i32 %x) {		define i32 @blsi32_branch(i32 %x) {
; X86-LABEL: blsi32_branch:		; X86-LABEL: blsi32_branch:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: pushl %esi		; X86-NEXT: pushl %esi
; X86-NEXT: .cfi_def_cfa_offset 8		; X86-NEXT: .cfi_def_cfa_offset 8
; X86-NEXT: .cfi_offset %esi, -8		; X86-NEXT: .cfi_offset %esi, -8
; X86-NEXT: blsil {{[0-9]+}}(%esp), %esi		; X86-NEXT: blsil {{[0-9]+}}(%esp), %esi
; X86-NEXT: jne .LBB55_2		; X86-NEXT: jne .LBB57_2
; X86-NEXT: # %bb.1:		; X86-NEXT: # %bb.1:
; X86-NEXT: calll bar		; X86-NEXT: calll bar
; X86-NEXT: .LBB55_2:		; X86-NEXT: .LBB57_2:
; X86-NEXT: movl %esi, %eax		; X86-NEXT: movl %esi, %eax
; X86-NEXT: popl %esi		; X86-NEXT: popl %esi
; X86-NEXT: .cfi_def_cfa_offset 4		; X86-NEXT: .cfi_def_cfa_offset 4
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: blsi32_branch:		; X64-LABEL: blsi32_branch:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: pushq %rbx		; X64-NEXT: pushq %rbx
; X64-NEXT: .cfi_def_cfa_offset 16		; X64-NEXT: .cfi_def_cfa_offset 16
; X64-NEXT: .cfi_offset %rbx, -16		; X64-NEXT: .cfi_offset %rbx, -16
; X64-NEXT: blsil %edi, %ebx		; X64-NEXT: blsil %edi, %ebx
; X64-NEXT: jne .LBB55_2		; X64-NEXT: jne .LBB57_2
; X64-NEXT: # %bb.1:		; X64-NEXT: # %bb.1:
; X64-NEXT: callq bar		; X64-NEXT: callq bar
; X64-NEXT: .LBB55_2:		; X64-NEXT: .LBB57_2:
; X64-NEXT: movl %ebx, %eax		; X64-NEXT: movl %ebx, %eax
; X64-NEXT: popq %rbx		; X64-NEXT: popq %rbx
; X64-NEXT: .cfi_def_cfa_offset 8		; X64-NEXT: .cfi_def_cfa_offset 8
; X64-NEXT: retq		; X64-NEXT: retq
%tmp = sub i32 0, %x		%tmp = sub i32 0, %x
%tmp2 = and i32 %x, %tmp		%tmp2 = and i32 %x, %tmp
%cmp = icmp eq i32 %tmp2, 0		%cmp = icmp eq i32 %tmp2, 0
br i1 %cmp, label %1, label %2		br i1 %cmp, label %1, label %2
[...]
; X86-NEXT: xorl %esi, %esi		; X86-NEXT: xorl %esi, %esi
; X86-NEXT: movl %eax, %edi		; X86-NEXT: movl %eax, %edi
; X86-NEXT: negl %edi		; X86-NEXT: negl %edi
; X86-NEXT: sbbl %ecx, %esi		; X86-NEXT: sbbl %ecx, %esi
; X86-NEXT: andl %ecx, %esi		; X86-NEXT: andl %ecx, %esi
; X86-NEXT: andl %eax, %edi		; X86-NEXT: andl %eax, %edi
; X86-NEXT: movl %edi, %eax		; X86-NEXT: movl %edi, %eax
; X86-NEXT: orl %esi, %eax		; X86-NEXT: orl %esi, %eax
; X86-NEXT: jne .LBB56_2		; X86-NEXT: jne .LBB58_2
; X86-NEXT: # %bb.1:		; X86-NEXT: # %bb.1:
; X86-NEXT: calll bar		; X86-NEXT: calll bar
; X86-NEXT: .LBB56_2:		; X86-NEXT: .LBB58_2:
; X86-NEXT: movl %edi, %eax		; X86-NEXT: movl %edi, %eax
; X86-NEXT: movl %esi, %edx		; X86-NEXT: movl %esi, %edx
; X86-NEXT: popl %esi		; X86-NEXT: popl %esi
; X86-NEXT: .cfi_def_cfa_offset 8		; X86-NEXT: .cfi_def_cfa_offset 8
; X86-NEXT: popl %edi		; X86-NEXT: popl %edi
; X86-NEXT: .cfi_def_cfa_offset 4		; X86-NEXT: .cfi_def_cfa_offset 4
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: blsi64_branch:		; X64-LABEL: blsi64_branch:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: pushq %rbx		; X64-NEXT: pushq %rbx
; X64-NEXT: .cfi_def_cfa_offset 16		; X64-NEXT: .cfi_def_cfa_offset 16
; X64-NEXT: .cfi_offset %rbx, -16		; X64-NEXT: .cfi_offset %rbx, -16
; X64-NEXT: blsiq %rdi, %rbx		; X64-NEXT: blsiq %rdi, %rbx
; X64-NEXT: jne .LBB56_2		; X64-NEXT: jne .LBB58_2
; X64-NEXT: # %bb.1:		; X64-NEXT: # %bb.1:
; X64-NEXT: callq bar		; X64-NEXT: callq bar
; X64-NEXT: .LBB56_2:		; X64-NEXT: .LBB58_2:
; X64-NEXT: movq %rbx, %rax		; X64-NEXT: movq %rbx, %rax
; X64-NEXT: popq %rbx		; X64-NEXT: popq %rbx
; X64-NEXT: .cfi_def_cfa_offset 8		; X64-NEXT: .cfi_def_cfa_offset 8
; X64-NEXT: retq		; X64-NEXT: retq
%tmp = sub i64 0, %x		%tmp = sub i64 0, %x
%tmp2 = and i64 %x, %tmp		%tmp2 = and i64 %x, %tmp
%cmp = icmp eq i64 %tmp2, 0		%cmp = icmp eq i64 %tmp2, 0
br i1 %cmp, label %1, label %2		br i1 %cmp, label %1, label %2

tail call void @bar()		tail call void @bar()
br label %2		br label %2
ret i64 %tmp2		ret i64 %tmp2
}		}

declare dso_local void @bar()		declare dso_local void @bar()

define void @pr42118_i32(i32 %x) {		define void @pr42118_i32(i32 %x) {
; X86-LABEL: pr42118_i32:		; X86-LABEL: pr42118_i32:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: blsrl {{[0-9]+}}(%esp), %eax		; X86-NEXT: blsrl {{[0-9]+}}(%esp), %eax
; X86-NEXT: jne .LBB57_1		; X86-NEXT: jne .LBB59_1
; X86-NEXT: # %bb.2:		; X86-NEXT: # %bb.2:
; X86-NEXT: jmp bar # TAILCALL		; X86-NEXT: jmp bar # TAILCALL
; X86-NEXT: .LBB57_1:		; X86-NEXT: .LBB59_1:
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: pr42118_i32:		; X64-LABEL: pr42118_i32:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: blsrl %edi, %eax		; X64-NEXT: blsrl %edi, %eax
; X64-NEXT: jne .LBB57_1		; X64-NEXT: jne .LBB59_1
; X64-NEXT: # %bb.2:		; X64-NEXT: # %bb.2:
; X64-NEXT: jmp bar # TAILCALL		; X64-NEXT: jmp bar # TAILCALL
; X64-NEXT: .LBB57_1:		; X64-NEXT: .LBB59_1:
; X64-NEXT: retq		; X64-NEXT: retq
%tmp = sub i32 0, %x		%tmp = sub i32 0, %x
%tmp1 = and i32 %tmp, %x		%tmp1 = and i32 %tmp, %x
%cmp = icmp eq i32 %tmp1, %x		%cmp = icmp eq i32 %tmp1, %x
br i1 %cmp, label %1, label %2		br i1 %cmp, label %1, label %2

tail call void @bar()		tail call void @bar()
br label %2		br label %2
[...]
; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx		; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NEXT: movl %eax, %edx		; X86-NEXT: movl %eax, %edx
; X86-NEXT: addl $-1, %edx		; X86-NEXT: addl $-1, %edx
; X86-NEXT: movl %ecx, %esi		; X86-NEXT: movl %ecx, %esi
; X86-NEXT: adcl $-1, %esi		; X86-NEXT: adcl $-1, %esi
; X86-NEXT: andl %eax, %edx		; X86-NEXT: andl %eax, %edx
; X86-NEXT: andl %ecx, %esi		; X86-NEXT: andl %ecx, %esi
; X86-NEXT: orl %edx, %esi		; X86-NEXT: orl %edx, %esi
; X86-NEXT: jne .LBB58_1		; X86-NEXT: jne .LBB60_1
; X86-NEXT: # %bb.2:		; X86-NEXT: # %bb.2:
; X86-NEXT: popl %esi		; X86-NEXT: popl %esi
; X86-NEXT: .cfi_def_cfa_offset 4		; X86-NEXT: .cfi_def_cfa_offset 4
; X86-NEXT: jmp bar # TAILCALL		; X86-NEXT: jmp bar # TAILCALL
; X86-NEXT: .LBB58_1:		; X86-NEXT: .LBB60_1:
; X86-NEXT: .cfi_def_cfa_offset 8		; X86-NEXT: .cfi_def_cfa_offset 8
; X86-NEXT: popl %esi		; X86-NEXT: popl %esi
; X86-NEXT: .cfi_def_cfa_offset 4		; X86-NEXT: .cfi_def_cfa_offset 4
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: pr42118_i64:		; X64-LABEL: pr42118_i64:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: blsrq %rdi, %rax		; X64-NEXT: blsrq %rdi, %rax
; X64-NEXT: jne .LBB58_1		; X64-NEXT: jne .LBB60_1
; X64-NEXT: # %bb.2:		; X64-NEXT: # %bb.2:
; X64-NEXT: jmp bar # TAILCALL		; X64-NEXT: jmp bar # TAILCALL
; X64-NEXT: .LBB58_1:		; X64-NEXT: .LBB60_1:
; X64-NEXT: retq		; X64-NEXT: retq
%tmp = sub i64 0, %x		%tmp = sub i64 0, %x
%tmp1 = and i64 %tmp, %x		%tmp1 = and i64 %tmp, %x
%cmp = icmp eq i64 %tmp1, %x		%cmp = icmp eq i64 %tmp1, %x
br i1 %cmp, label %1, label %2		br i1 %cmp, label %1, label %2

tail call void @bar()		tail call void @bar()
br label %2		br label %2

ret void		ret void
}		}

define i32 @blsi_cflag_32(i32 %x, i32 %y) nounwind {		define i32 @blsi_cflag_32(i32 %x, i32 %y) nounwind {
; X86-LABEL: blsi_cflag_32:		; X86-LABEL: blsi_cflag_32:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: testl %eax, %eax		; X86-NEXT: testl %eax, %eax
; X86-NEXT: jne .LBB59_1		; X86-NEXT: jne .LBB61_1
; X86-NEXT: # %bb.2:		; X86-NEXT: # %bb.2:
; X86-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: retl		; X86-NEXT: retl
; X86-NEXT: .LBB59_1:		; X86-NEXT: .LBB61_1:
; X86-NEXT: blsil %eax, %eax		; X86-NEXT: blsil %eax, %eax
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: blsi_cflag_32:		; X64-LABEL: blsi_cflag_32:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: blsil %edi, %eax		; X64-NEXT: blsil %edi, %eax
; X64-NEXT: cmovael %esi, %eax		; X64-NEXT: cmovael %esi, %eax
; X64-NEXT: retq		; X64-NEXT: retq
[12 lines not shown]
; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx		; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NEXT: movl {{[0-9]+}}(%esp), %esi		; X86-NEXT: movl {{[0-9]+}}(%esp), %esi
; X86-NEXT: xorl %edx, %edx		; X86-NEXT: xorl %edx, %edx
; X86-NEXT: movl %ecx, %eax		; X86-NEXT: movl %ecx, %eax
; X86-NEXT: negl %eax		; X86-NEXT: negl %eax
; X86-NEXT: sbbl %esi, %edx		; X86-NEXT: sbbl %esi, %edx
; X86-NEXT: movl %ecx, %edi		; X86-NEXT: movl %ecx, %edi
; X86-NEXT: orl %esi, %edi		; X86-NEXT: orl %esi, %edi
; X86-NEXT: jne .LBB60_1		; X86-NEXT: jne .LBB62_1
; X86-NEXT: # %bb.2:		; X86-NEXT: # %bb.2:
; X86-NEXT: movl {{[0-9]+}}(%esp), %edx		; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: jmp .LBB60_3		; X86-NEXT: jmp .LBB62_3
; X86-NEXT: .LBB60_1:		; X86-NEXT: .LBB62_1:
; X86-NEXT: andl %esi, %edx		; X86-NEXT: andl %esi, %edx
; X86-NEXT: andl %ecx, %eax		; X86-NEXT: andl %ecx, %eax
; X86-NEXT: .LBB60_3:		; X86-NEXT: .LBB62_3:
; X86-NEXT: popl %esi		; X86-NEXT: popl %esi
; X86-NEXT: popl %edi		; X86-NEXT: popl %edi
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: blsi_cflag_64:		; X64-LABEL: blsi_cflag_64:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: blsiq %rdi, %rax		; X64-NEXT: blsiq %rdi, %rax
; X64-NEXT: cmovaeq %rsi, %rax		; X64-NEXT: cmovaeq %rsi, %rax
; X64-NEXT: retq		; X64-NEXT: retq
%tobool = icmp eq i64 %x, 0		%tobool = icmp eq i64 %x, 0
%sub = sub nsw i64 0, %x		%sub = sub nsw i64 0, %x
%and = and i64 %sub, %x		%and = and i64 %sub, %x
%cond = select i1 %tobool, i64 %y, i64 %and		%cond = select i1 %tobool, i64 %y, i64 %and
ret i64 %cond		ret i64 %cond
}		}

llvm/test/CodeGen/X86/btc_bts_btr.ll

	[977 lines not shown]
	; X64: # %bb.0:			; X64: # %bb.0:
	; X64-NEXT: movl %edi, %eax			; X64-NEXT: movl %edi, %eax
	; X64-NEXT: shll $2, %esi			; X64-NEXT: shll $2, %esi
	; X64-NEXT: btsl %esi, %eax			; X64-NEXT: btsl %esi, %eax
	; X64-NEXT: retq			; X64-NEXT: retq
	;			;
	; X86-LABEL: bts_32_mask_zeros:			; X86-LABEL: bts_32_mask_zeros:
	; X86: # %bb.0:			; X86: # %bb.0:
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X86-NEXT: movzbl {{[0-9]+}}(%esp), %ecx			; X86-NEXT: movzbl {{[0-9]+}}(%esp), %ecx
				; X86-NEXT: andb $7, %cl
				pengfei: One more `andb`.
				goldstein.w.n (author), replying to "One more `andb`": Sorry, I didn't see this earlier. I think this breaks a few combine patterns that probably relied on the semi-canonicalization of (and (shift x, y), z) instead of the other way around. Fixing the cases caught here is doable, but is it possible to move this to later in the process? Currently it's guarded behind AfterLegalize (for the exact reason of not breaking other patterns), but is there a higher level or a different pass this might be better placed in?
				goldstein.w.n (author): Do you know if there is a later stage I can move this transform to so that it won't break any other patterns?
				goldstein.w.n (author), quoting the previous comment: "Do you know if there is a later stage I can move this transform to so that it won't break any other patterns?"
				pengfei: No, I don't. Maybe do it in a peephole pass after ISel?
				craig.topper: We have something similar implemented during isel in `X86DAGToDAGISel::tryShrinkShlLogicImm`.
				goldstein.w.n (author), replying to "We have something similar implemented during isel in `X86DAGToDAGISel::tryShrinkShlLogicImm`": I did in fact try moving it there. It cleaned up some of the missed optimizations but added other (more severe) missed optimizations.
				goldstein.w.n (author), replying to "No, I don't. Maybe do it in a peephole pass after ISel?": Is there an example file or pass you could point to for me to emulate?
	; X86-NEXT: shlb $2, %cl			; X86-NEXT: shlb $2, %cl
	; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X86-NEXT: btsl %ecx, %eax			; X86-NEXT: btsl %ecx, %eax
	; X86-NEXT: retl			; X86-NEXT: retl
	%1 = shl i32 %n, 2			%1 = shl i32 %n, 2
	%2 = and i32 %1, 31			%2 = and i32 %1, 31
	%3 = shl i32 1, %2			%3 = shl i32 1, %2
	%4 = or i32 %x, %3			%4 = or i32 %x, %3
	ret i32 %4			ret i32 %4
	}			}
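
Reviewer's note on the extra `andb $7, %cl` flagged above: the following is a minimal Python sketch, not part of the patch, that models the old and new X86 sequences for `bts_32_mask_zeros`. It assumes the documented behaviour of register-form `btsl`, which takes the bit index modulo 32; the helper names (`bts32`, `old_x86`, `new_x86`, `reference`) are hypothetical. Under that assumption both sequences agree with the IR, which suggests the added `andb` is redundant rather than wrong.

```python
MASK32 = 0xFFFFFFFF

def bts32(x, count):
    """Model of register-form `btsl %ecx, %eax`: bit index is taken modulo 32."""
    return (x | (1 << (count % 32))) & MASK32

def old_x86(x, n):
    cl = n & 0xFF            # movzbl: low byte of n, zero-extended
    cl = (cl << 2) & 0xFF    # shlb $2, %cl
    return bts32(x, cl)      # btsl %ecx, %eax

def new_x86(x, n):
    cl = n & 0xFF            # movzbl
    cl &= 7                  # andb $7, %cl (the extra instruction)
    cl = (cl << 2) & 0xFF    # shlb $2, %cl
    return bts32(x, cl)      # btsl %ecx, %eax

def reference(x, n):
    # IR: %1 = shl i32 %n, 2 ; %2 = and i32 %1, 31 ; %3 = shl i32 1, %2 ; or
    idx = ((n << 2) & MASK32) & 31
    return (x | (1 << idx)) & MASK32
```

Only the low three bits of `%n` can reach the five index bits that `btsl` reads, so masking with 7 before the shift does not change the result.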

	define i32 @btc_32_mask_zeros(i32 %x, i32 %n) {			define i32 @btc_32_mask_zeros(i32 %x, i32 %n) {
	; X64-LABEL: btc_32_mask_zeros:			; X64-LABEL: btc_32_mask_zeros:
	; X64: # %bb.0:			; X64: # %bb.0:
	; X64-NEXT: movl %edi, %eax			; X64-NEXT: movl %edi, %eax
	; X64-NEXT: shll $2, %esi			; X64-NEXT: shll $2, %esi
	; X64-NEXT: btcl %esi, %eax			; X64-NEXT: btcl %esi, %eax
	; X64-NEXT: retq			; X64-NEXT: retq
	;			;
	; X86-LABEL: btc_32_mask_zeros:			; X86-LABEL: btc_32_mask_zeros:
	; X86: # %bb.0:			; X86: # %bb.0:
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X86-NEXT: movzbl {{[0-9]+}}(%esp), %ecx			; X86-NEXT: movzbl {{[0-9]+}}(%esp), %ecx
				; X86-NEXT: andb $7, %cl
				pengfei: ditto.
	; X86-NEXT: shlb $2, %cl			; X86-NEXT: shlb $2, %cl
	; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X86-NEXT: btcl %ecx, %eax			; X86-NEXT: btcl %ecx, %eax
	; X86-NEXT: retl			; X86-NEXT: retl
	%1 = shl i32 %n, 2			%1 = shl i32 %n, 2
	%2 = and i32 %1, 31			%2 = and i32 %1, 31
	%3 = shl i32 1, %2			%3 = shl i32 1, %2
	%4 = xor i32 %x, %3			%4 = xor i32 %x, %3
	ret i32 %4			ret i32 %4
	}			}
	[99 lines not shown]

llvm/test/CodeGen/X86/combine-bitreverse.ll

	[229 lines not shown]
	; X86-NEXT: andl $252645135, %eax # imm = 0xF0F0F0F			; X86-NEXT: andl $252645135, %eax # imm = 0xF0F0F0F
	; X86-NEXT: orl %ecx, %eax			; X86-NEXT: orl %ecx, %eax
	; X86-NEXT: movl %eax, %ecx			; X86-NEXT: movl %eax, %ecx
	; X86-NEXT: andl $858993459, %ecx # imm = 0x33333333			; X86-NEXT: andl $858993459, %ecx # imm = 0x33333333
	; X86-NEXT: shrl $2, %eax			; X86-NEXT: shrl $2, %eax
	; X86-NEXT: andl $858993459, %eax # imm = 0x33333333			; X86-NEXT: andl $858993459, %eax # imm = 0x33333333
	; X86-NEXT: leal (%eax,%ecx,4), %ecx			; X86-NEXT: leal (%eax,%ecx,4), %ecx
	; X86-NEXT: movl %ecx, %eax			; X86-NEXT: movl %ecx, %eax
	; X86-NEXT: andl $5592405, %eax # imm = 0x555555			; X86-NEXT: shrl %eax
				pengfei: One more `shrl`?
	; X86-NEXT: shll $6, %ecx			; X86-NEXT: andl $22369621, %eax # imm = 0x1555555
	; X86-NEXT: andl $-1431655808, %ecx # imm = 0xAAAAAA80			; X86-NEXT: andl $5592405, %ecx # imm = 0x555555
	; X86-NEXT: shll $8, %eax			; X86-NEXT: shll $8, %ecx
				; X86-NEXT: shll $7, %eax
	; X86-NEXT: orl %ecx, %eax			; X86-NEXT: orl %ecx, %eax
	; X86-NEXT: bswapl %eax			; X86-NEXT: bswapl %eax
	; X86-NEXT: movl %eax, %ecx			; X86-NEXT: movl %eax, %ecx
	; X86-NEXT: andl $986895, %ecx # imm = 0xF0F0F			; X86-NEXT: andl $986895, %ecx # imm = 0xF0F0F
	; X86-NEXT: shll $4, %ecx			; X86-NEXT: shll $4, %ecx
	; X86-NEXT: shrl $4, %eax			; X86-NEXT: shrl $4, %eax
	; X86-NEXT: andl $135204623, %eax # imm = 0x80F0F0F			; X86-NEXT: andl $135204623, %eax # imm = 0x80F0F0F
	; X86-NEXT: orl %ecx, %eax			; X86-NEXT: orl %ecx, %eax
	[18 lines not shown]
	; X64-NEXT: shll $4, %eax			; X64-NEXT: shll $4, %eax
	; X64-NEXT: shrl $4, %edi			; X64-NEXT: shrl $4, %edi
	; X64-NEXT: andl $252645135, %edi # imm = 0xF0F0F0F			; X64-NEXT: andl $252645135, %edi # imm = 0xF0F0F0F
	; X64-NEXT: orl %eax, %edi			; X64-NEXT: orl %eax, %edi
	; X64-NEXT: movl %edi, %eax			; X64-NEXT: movl %edi, %eax
	; X64-NEXT: andl $858993459, %eax # imm = 0x33333333			; X64-NEXT: andl $858993459, %eax # imm = 0x33333333
	; X64-NEXT: shrl $2, %edi			; X64-NEXT: shrl $2, %edi
	; X64-NEXT: andl $858993459, %edi # imm = 0x33333333			; X64-NEXT: andl $858993459, %edi # imm = 0x33333333
	; X64-NEXT: leal (%rdi,%rax,4), %eax			; X64-NEXT: leal (%rdi,%rax,4), %ecx
	; X64-NEXT: movl %eax, %ecx			; X64-NEXT: movl %ecx, %eax
				; X64-NEXT: shrl %eax
				; X64-NEXT: andl $22369621, %eax # imm = 0x1555555
	; X64-NEXT: andl $5592405, %ecx # imm = 0x555555			; X64-NEXT: andl $5592405, %ecx # imm = 0x555555
	; X64-NEXT: shll $6, %eax
	; X64-NEXT: andl $-1431655808, %eax # imm = 0xAAAAAA80
	; X64-NEXT: shll $8, %ecx			; X64-NEXT: shll $8, %ecx
	; X64-NEXT: orl %eax, %ecx			; X64-NEXT: shll $7, %eax
	; X64-NEXT: bswapl %ecx			; X64-NEXT: orl %ecx, %eax
	; X64-NEXT: movl %ecx, %eax			; X64-NEXT: bswapl %eax
	; X64-NEXT: andl $986895, %eax # imm = 0xF0F0F			; X64-NEXT: movl %eax, %ecx
	; X64-NEXT: shll $4, %eax			; X64-NEXT: andl $986895, %ecx # imm = 0xF0F0F
	; X64-NEXT: shrl $4, %ecx			; X64-NEXT: shll $4, %ecx
	; X64-NEXT: andl $135204623, %ecx # imm = 0x80F0F0F			; X64-NEXT: shrl $4, %eax
				pengfei: ditto.
	; X64-NEXT: orl %eax, %ecx			; X64-NEXT: andl $135204623, %eax # imm = 0x80F0F0F
	; X64-NEXT: movl %ecx, %eax			; X64-NEXT: orl %ecx, %eax
	; X64-NEXT: andl $3355443, %eax # imm = 0x333333			; X64-NEXT: movl %eax, %ecx
	; X64-NEXT: shrl $2, %ecx			; X64-NEXT: andl $3355443, %ecx # imm = 0x333333
	; X64-NEXT: andl $36909875, %ecx # imm = 0x2333333			; X64-NEXT: shrl $2, %eax
	; X64-NEXT: leal (%rcx,%rax,4), %eax			; X64-NEXT: andl $36909875, %eax # imm = 0x2333333
				; X64-NEXT: leal (%rax,%rcx,4), %eax
	; X64-NEXT: movl %eax, %ecx			; X64-NEXT: movl %eax, %ecx
	; X64-NEXT: andl $1431655765, %ecx # imm = 0x55555555			; X64-NEXT: andl $1431655765, %ecx # imm = 0x55555555
	; X64-NEXT: shrl %eax			; X64-NEXT: shrl %eax
	; X64-NEXT: andl $1431655765, %eax # imm = 0x55555555			; X64-NEXT: andl $1431655765, %eax # imm = 0x55555555
	; X64-NEXT: leal (%rax,%rcx,2), %eax			; X64-NEXT: leal (%rax,%rcx,2), %eax
	; X64-NEXT: retq			; X64-NEXT: retq
	%b = call i32 @llvm.bitreverse.i32(i32 %a0)			%b = call i32 @llvm.bitreverse.i32(i32 %a0)
	%c = shl i32 %b, 7			%c = shl i32 %b, 7
	[14 lines not shown]
	; X86-NEXT: orl %ecx, %eax			; X86-NEXT: orl %ecx, %eax
	; X86-NEXT: movl %eax, %ecx			; X86-NEXT: movl %eax, %ecx
	; X86-NEXT: andl $858993459, %ecx # imm = 0x33333333			; X86-NEXT: andl $858993459, %ecx # imm = 0x33333333
	; X86-NEXT: shrl $2, %eax			; X86-NEXT: shrl $2, %eax
	; X86-NEXT: andl $858993459, %eax # imm = 0x33333333			; X86-NEXT: andl $858993459, %eax # imm = 0x33333333
	; X86-NEXT: leal (%eax,%ecx,4), %eax			; X86-NEXT: leal (%eax,%ecx,4), %eax
	; X86-NEXT: movl %eax, %ecx			; X86-NEXT: movl %eax, %ecx
	; X86-NEXT: andl $357913941, %ecx # imm = 0x15555555			; X86-NEXT: andl $357913941, %ecx # imm = 0x15555555
	; X86-NEXT: andl $-1431655766, %eax # imm = 0xAAAAAAAA			; X86-NEXT: shrl %eax
				pengfei: ditto.
				; X86-NEXT: andl $1431655765, %eax # imm = 0x55555555
				; X86-NEXT: addl %eax, %eax
	; X86-NEXT: leal (%eax,%ecx,4), %eax			; X86-NEXT: leal (%eax,%ecx,4), %eax
	; X86-NEXT: bswapl %eax			; X86-NEXT: bswapl %eax
	; X86-NEXT: movl %eax, %ecx			; X86-NEXT: movl %eax, %ecx
	; X86-NEXT: andl $235867919, %ecx # imm = 0xE0F0F0F			; X86-NEXT: andl $235867919, %ecx # imm = 0xE0F0F0F
	; X86-NEXT: shll $4, %ecx			; X86-NEXT: shll $4, %ecx
	; X86-NEXT: shrl $4, %eax			; X86-NEXT: shrl $4, %eax
	; X86-NEXT: andl $252645135, %eax # imm = 0xF0F0F0F			; X86-NEXT: andl $252645135, %eax # imm = 0xF0F0F0F
	; X86-NEXT: orl %ecx, %eax			; X86-NEXT: orl %ecx, %eax
	[113 lines not shown]

llvm/test/CodeGen/X86/combine-rotates.ll

[434 lines not shown]	; CHECK-NEXT: retq
ret i32 %4		ret i32 %4
}		}

; Ensure we normalize the inner rotation before adding the results.		; Ensure we normalize the inner rotation before adding the results.
define i5 @rotl_merge_i5(i5 %x) {		define i5 @rotl_merge_i5(i5 %x) {
; CHECK-LABEL: rotl_merge_i5:		; CHECK-LABEL: rotl_merge_i5:
; CHECK: # %bb.0:		; CHECK: # %bb.0:
; CHECK-NEXT: # kill: def $edi killed $edi def $rdi		; CHECK-NEXT: # kill: def $edi killed $edi def $rdi
; CHECK-NEXT: leal (,%rdi,4), %ecx		; CHECK-NEXT: leal (,%rdi,4), %eax
; CHECK-NEXT: movl %edi, %eax		; CHECK-NEXT: shrb $3, %dil
; CHECK-NEXT: andb $24, %al		; CHECK-NEXT: andb $3, %dil
; CHECK-NEXT: shrb $3, %al		; CHECK-NEXT: orb %dil, %al
; CHECK-NEXT: orb %cl, %al		; CHECK-NEXT: # kill: def $al killed $al killed $eax
; CHECK-NEXT: retq		; CHECK-NEXT: retq
%r1 = call i5 @llvm.fshl.i5(i5 %x, i5 %x, i5 -1)		%r1 = call i5 @llvm.fshl.i5(i5 %x, i5 %x, i5 -1)
%r2 = call i5 @llvm.fshl.i5(i5 %r1, i5 %r1, i5 1)		%r2 = call i5 @llvm.fshl.i5(i5 %r1, i5 %r1, i5 1)
ret i5 %r2		ret i5 %r2
}		}
declare i5 @llvm.fshl.i5(i5, i5, i5)		declare i5 @llvm.fshl.i5(i5, i5, i5)

declare <4 x i32> @llvm.fshl.v4i32(<4 x i32>, <4 x i32>, <4 x i32>)		declare <4 x i32> @llvm.fshl.v4i32(<4 x i32>, <4 x i32>, <4 x i32>)
declare <4 x i32> @llvm.fshr.v4i32(<4 x i32>, <4 x i32>, <4 x i32>)		declare <4 x i32> @llvm.fshr.v4i32(<4 x i32>, <4 x i32>, <4 x i32>)

llvm/test/CodeGen/X86/const-shift-of-constmasked.ll

; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py		; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
; RUN: llc < %s -mtriple=i686-unknown-unknown | FileCheck %s --check-prefix=X86		; RUN: llc < %s -mtriple=i686-unknown-unknown | FileCheck %s --check-prefix=X86
; RUN: llc < %s -mtriple=x86_64-unknown-unknown | FileCheck %s --check-prefix=X64		; RUN: llc < %s -mtriple=x86_64-unknown-unknown | FileCheck %s --check-prefix=X64

; The mask is all-ones, potentially shifted.		; The mask is all-ones, potentially shifted.

;------------------------------------------------------------------------------;		;------------------------------------------------------------------------------;
; 8-bit		; 8-bit
;------------------------------------------------------------------------------;		;------------------------------------------------------------------------------;

; lshr		; lshr

define i8 @test_i8_7_mask_lshr_1(i8 %a0) {		define i8 @test_i8_7_mask_lshr_1(i8 %a0) {
; X86-LABEL: test_i8_7_mask_lshr_1:		; X86-LABEL: test_i8_7_mask_lshr_1:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movzbl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movzbl {{[0-9]+}}(%esp), %eax
; X86-NEXT: andb $6, %al
; X86-NEXT: shrb %al		; X86-NEXT: shrb %al
		; X86-NEXT: andb $3, %al
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: test_i8_7_mask_lshr_1:		; X64-LABEL: test_i8_7_mask_lshr_1:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movl %edi, %eax		; X64-NEXT: movl %edi, %eax
; X64-NEXT: andb $6, %al
; X64-NEXT: shrb %al		; X64-NEXT: shrb %al
		; X64-NEXT: andb $3, %al
; X64-NEXT: # kill: def $al killed $al killed $eax		; X64-NEXT: # kill: def $al killed $al killed $eax
; X64-NEXT: retq		; X64-NEXT: retq
%t0 = and i8 %a0, 7		%t0 = and i8 %a0, 7
%t1 = lshr i8 %t0, 1		%t1 = lshr i8 %t0, 1
ret i8 %t1		ret i8 %t1
}		}
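
Reviewer's note: the reordering in these `lshr` tests moves the `and` after the shift and shifts the mask by the same amount. A small Python sketch (helper names are hypothetical, not from the patch) checks that the two orderings agree for the i8 mask and shift pairs that appear in this file, and that the shifted masks match the immediates in the new CHECK lines.

```python
def and_then_lshr(x, mask, shift, bits=8):
    """Old order: mask first, then logical shift right."""
    x &= (1 << bits) - 1
    return (x & mask) >> shift

def lshr_then_and(x, mask, shift, bits=8):
    """New order: shift first, then apply the mask shifted by the same amount."""
    x &= (1 << bits) - 1
    return (x >> shift) & (mask >> shift)

# (mask, shift, mask seen in the new CHECK lines)
CASES = [(7, 1, 3), (28, 1, 14), (28, 2, 7), (28, 3, 3), (28, 4, 1), (224, 4, 14)]
```

The new form yields the smaller immediates (for example `andb $3` instead of `andb $6`), which is what the summary means by getting better and-masks.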

define i8 @test_i8_28_mask_lshr_1(i8 %a0) {		define i8 @test_i8_28_mask_lshr_1(i8 %a0) {
; X86-LABEL: test_i8_28_mask_lshr_1:		; X86-LABEL: test_i8_28_mask_lshr_1:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movzbl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movzbl {{[0-9]+}}(%esp), %eax
; X86-NEXT: andb $28, %al
; X86-NEXT: shrb %al		; X86-NEXT: shrb %al
		; X86-NEXT: andb $14, %al
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: test_i8_28_mask_lshr_1:		; X64-LABEL: test_i8_28_mask_lshr_1:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movl %edi, %eax		; X64-NEXT: movl %edi, %eax
; X64-NEXT: andb $28, %al
; X64-NEXT: shrb %al		; X64-NEXT: shrb %al
		; X64-NEXT: andb $14, %al
; X64-NEXT: # kill: def $al killed $al killed $eax		; X64-NEXT: # kill: def $al killed $al killed $eax
; X64-NEXT: retq		; X64-NEXT: retq
%t0 = and i8 %a0, 28		%t0 = and i8 %a0, 28
%t1 = lshr i8 %t0, 1		%t1 = lshr i8 %t0, 1
ret i8 %t1		ret i8 %t1
}		}
define i8 @test_i8_28_mask_lshr_2(i8 %a0) {		define i8 @test_i8_28_mask_lshr_2(i8 %a0) {
; X86-LABEL: test_i8_28_mask_lshr_2:		; X86-LABEL: test_i8_28_mask_lshr_2:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movzbl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movzbl {{[0-9]+}}(%esp), %eax
; X86-NEXT: andb $28, %al
; X86-NEXT: shrb $2, %al		; X86-NEXT: shrb $2, %al
		; X86-NEXT: andb $7, %al
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: test_i8_28_mask_lshr_2:		; X64-LABEL: test_i8_28_mask_lshr_2:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movl %edi, %eax		; X64-NEXT: movl %edi, %eax
; X64-NEXT: andb $28, %al
; X64-NEXT: shrb $2, %al		; X64-NEXT: shrb $2, %al
		; X64-NEXT: andb $7, %al
; X64-NEXT: # kill: def $al killed $al killed $eax		; X64-NEXT: # kill: def $al killed $al killed $eax
; X64-NEXT: retq		; X64-NEXT: retq
%t0 = and i8 %a0, 28		%t0 = and i8 %a0, 28
%t1 = lshr i8 %t0, 2		%t1 = lshr i8 %t0, 2
ret i8 %t1		ret i8 %t1
}		}
define i8 @test_i8_28_mask_lshr_3(i8 %a0) {		define i8 @test_i8_28_mask_lshr_3(i8 %a0) {
; X86-LABEL: test_i8_28_mask_lshr_3:		; X86-LABEL: test_i8_28_mask_lshr_3:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movzbl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movzbl {{[0-9]+}}(%esp), %eax
; X86-NEXT: andb $24, %al
; X86-NEXT: shrb $3, %al		; X86-NEXT: shrb $3, %al
		; X86-NEXT: andb $3, %al
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: test_i8_28_mask_lshr_3:		; X64-LABEL: test_i8_28_mask_lshr_3:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movl %edi, %eax		; X64-NEXT: movl %edi, %eax
; X64-NEXT: andb $24, %al
; X64-NEXT: shrb $3, %al		; X64-NEXT: shrb $3, %al
		; X64-NEXT: andb $3, %al
; X64-NEXT: # kill: def $al killed $al killed $eax		; X64-NEXT: # kill: def $al killed $al killed $eax
; X64-NEXT: retq		; X64-NEXT: retq
%t0 = and i8 %a0, 28		%t0 = and i8 %a0, 28
%t1 = lshr i8 %t0, 3		%t1 = lshr i8 %t0, 3
ret i8 %t1		ret i8 %t1
}		}
define i8 @test_i8_28_mask_lshr_4(i8 %a0) {		define i8 @test_i8_28_mask_lshr_4(i8 %a0) {
; X86-LABEL: test_i8_28_mask_lshr_4:		; X86-LABEL: test_i8_28_mask_lshr_4:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movzbl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movzbl {{[0-9]+}}(%esp), %eax
; X86-NEXT: andb $16, %al
; X86-NEXT: shrb $4, %al		; X86-NEXT: shrb $4, %al
		; X86-NEXT: andb $1, %al
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: test_i8_28_mask_lshr_4:		; X64-LABEL: test_i8_28_mask_lshr_4:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movl %edi, %eax		; X64-NEXT: movl %edi, %eax
; X64-NEXT: andb $16, %al
; X64-NEXT: shrb $4, %al		; X64-NEXT: shrb $4, %al
		; X64-NEXT: andb $1, %al
; X64-NEXT: # kill: def $al killed $al killed $eax		; X64-NEXT: # kill: def $al killed $al killed $eax
; X64-NEXT: retq		; X64-NEXT: retq
%t0 = and i8 %a0, 28		%t0 = and i8 %a0, 28
%t1 = lshr i8 %t0, 4		%t1 = lshr i8 %t0, 4
ret i8 %t1		ret i8 %t1
}		}

define i8 @test_i8_224_mask_lshr_1(i8 %a0) {		define i8 @test_i8_224_mask_lshr_1(i8 %a0) {
[14 lines not shown]	; X64-NEXT: retq
%t0 = and i8 %a0, 224		%t0 = and i8 %a0, 224
%t1 = lshr i8 %t0, 1		%t1 = lshr i8 %t0, 1
ret i8 %t1		ret i8 %t1
}		}
define i8 @test_i8_224_mask_lshr_4(i8 %a0) {		define i8 @test_i8_224_mask_lshr_4(i8 %a0) {
; X86-LABEL: test_i8_224_mask_lshr_4:		; X86-LABEL: test_i8_224_mask_lshr_4:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movzbl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movzbl {{[0-9]+}}(%esp), %eax
; X86-NEXT: andb $-32, %al
; X86-NEXT: shrb $4, %al		; X86-NEXT: shrb $4, %al
		; X86-NEXT: andb $14, %al
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: test_i8_224_mask_lshr_4:		; X64-LABEL: test_i8_224_mask_lshr_4:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movl %edi, %eax		; X64-NEXT: movl %edi, %eax
; X64-NEXT: andb $-32, %al
; X64-NEXT: shrb $4, %al		; X64-NEXT: shrb $4, %al
		; X64-NEXT: andb $14, %al
; X64-NEXT: # kill: def $al killed $al killed $eax		; X64-NEXT: # kill: def $al killed $al killed $eax
; X64-NEXT: retq		; X64-NEXT: retq
%t0 = and i8 %a0, 224		%t0 = and i8 %a0, 224
%t1 = lshr i8 %t0, 4		%t1 = lshr i8 %t0, 4
ret i8 %t1		ret i8 %t1
}		}
define i8 @test_i8_224_mask_lshr_5(i8 %a0) {		define i8 @test_i8_224_mask_lshr_5(i8 %a0) {
; X86-LABEL: test_i8_224_mask_lshr_5:		; X86-LABEL: test_i8_224_mask_lshr_5:
[31 lines not shown]
}		}

; ashr		; ashr

define i8 @test_i8_7_mask_ashr_1(i8 %a0) {		define i8 @test_i8_7_mask_ashr_1(i8 %a0) {
; X86-LABEL: test_i8_7_mask_ashr_1:		; X86-LABEL: test_i8_7_mask_ashr_1:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movzbl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movzbl {{[0-9]+}}(%esp), %eax
; X86-NEXT: andb $6, %al
; X86-NEXT: shrb %al		; X86-NEXT: shrb %al
		; X86-NEXT: andb $3, %al
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: test_i8_7_mask_ashr_1:		; X64-LABEL: test_i8_7_mask_ashr_1:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movl %edi, %eax		; X64-NEXT: movl %edi, %eax
; X64-NEXT: andb $6, %al
; X64-NEXT: shrb %al		; X64-NEXT: shrb %al
		; X64-NEXT: andb $3, %al
; X64-NEXT: # kill: def $al killed $al killed $eax		; X64-NEXT: # kill: def $al killed $al killed $eax
; X64-NEXT: retq		; X64-NEXT: retq
%t0 = and i8 %a0, 7		%t0 = and i8 %a0, 7
%t1 = ashr i8 %t0, 1		%t1 = ashr i8 %t0, 1
ret i8 %t1		ret i8 %t1
}		}
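
Reviewer's note: the `ashr` tests with masks 7 and 28 are emitted as logical `shrb` because the mask clears the sign bit, so arithmetic and logical shifts coincide. The Python sketch below (hypothetical helpers `ashr8` and `lshr8`, modelling i8 semantics) illustrates this, and also shows why the same shortcut does not hold in general for a mask such as 224 that keeps the sign bit.

```python
def lshr8(v, shift):
    """Logical shift right on an 8-bit value."""
    return (v & 0xFF) >> shift

def ashr8(v, shift):
    """Arithmetic shift right on an 8-bit value, result returned as unsigned."""
    v &= 0xFF
    signed = v - 256 if v & 0x80 else v   # reinterpret as signed i8
    return (signed >> shift) & 0xFF       # Python's >> on ints is arithmetic

def shifts_agree(mask, shift):
    """True if ashr and lshr give the same result for every masked i8 input."""
    return all(ashr8(x & mask, shift) == lshr8(x & mask, shift) for x in range(256))
```

For example, `shifts_agree(7, 1)` and `shifts_agree(28, 4)` hold, while `shifts_agree(224, 1)` does not, because 0x80 shifts to 0xC0 arithmetically and to 0x40 logically.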

define i8 @test_i8_28_mask_ashr_1(i8 %a0) {		define i8 @test_i8_28_mask_ashr_1(i8 %a0) {
; X86-LABEL: test_i8_28_mask_ashr_1:		; X86-LABEL: test_i8_28_mask_ashr_1:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movzbl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movzbl {{[0-9]+}}(%esp), %eax
; X86-NEXT: andb $28, %al
; X86-NEXT: shrb %al		; X86-NEXT: shrb %al
		; X86-NEXT: andb $14, %al
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: test_i8_28_mask_ashr_1:		; X64-LABEL: test_i8_28_mask_ashr_1:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movl %edi, %eax		; X64-NEXT: movl %edi, %eax
; X64-NEXT: andb $28, %al
; X64-NEXT: shrb %al		; X64-NEXT: shrb %al
		; X64-NEXT: andb $14, %al
; X64-NEXT: # kill: def $al killed $al killed $eax		; X64-NEXT: # kill: def $al killed $al killed $eax
; X64-NEXT: retq		; X64-NEXT: retq
%t0 = and i8 %a0, 28		%t0 = and i8 %a0, 28
%t1 = ashr i8 %t0, 1		%t1 = ashr i8 %t0, 1
ret i8 %t1		ret i8 %t1
}		}
define i8 @test_i8_28_mask_ashr_2(i8 %a0) {		define i8 @test_i8_28_mask_ashr_2(i8 %a0) {
; X86-LABEL: test_i8_28_mask_ashr_2:		; X86-LABEL: test_i8_28_mask_ashr_2:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movzbl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movzbl {{[0-9]+}}(%esp), %eax
; X86-NEXT: andb $28, %al
; X86-NEXT: shrb $2, %al		; X86-NEXT: shrb $2, %al
		; X86-NEXT: andb $7, %al
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: test_i8_28_mask_ashr_2:		; X64-LABEL: test_i8_28_mask_ashr_2:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movl %edi, %eax		; X64-NEXT: movl %edi, %eax
; X64-NEXT: andb $28, %al
; X64-NEXT: shrb $2, %al		; X64-NEXT: shrb $2, %al
		; X64-NEXT: andb $7, %al
; X64-NEXT: # kill: def $al killed $al killed $eax		; X64-NEXT: # kill: def $al killed $al killed $eax
; X64-NEXT: retq		; X64-NEXT: retq
%t0 = and i8 %a0, 28		%t0 = and i8 %a0, 28
%t1 = ashr i8 %t0, 2		%t1 = ashr i8 %t0, 2
ret i8 %t1		ret i8 %t1
}		}
define i8 @test_i8_28_mask_ashr_3(i8 %a0) {		define i8 @test_i8_28_mask_ashr_3(i8 %a0) {
; X86-LABEL: test_i8_28_mask_ashr_3:		; X86-LABEL: test_i8_28_mask_ashr_3:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movzbl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movzbl {{[0-9]+}}(%esp), %eax
; X86-NEXT: andb $24, %al
; X86-NEXT: shrb $3, %al		; X86-NEXT: shrb $3, %al
		; X86-NEXT: andb $3, %al
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: test_i8_28_mask_ashr_3:		; X64-LABEL: test_i8_28_mask_ashr_3:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movl %edi, %eax		; X64-NEXT: movl %edi, %eax
; X64-NEXT: andb $24, %al
; X64-NEXT: shrb $3, %al		; X64-NEXT: shrb $3, %al
		; X64-NEXT: andb $3, %al
; X64-NEXT: # kill: def $al killed $al killed $eax		; X64-NEXT: # kill: def $al killed $al killed $eax
; X64-NEXT: retq		; X64-NEXT: retq
%t0 = and i8 %a0, 28		%t0 = and i8 %a0, 28
%t1 = ashr i8 %t0, 3		%t1 = ashr i8 %t0, 3
ret i8 %t1		ret i8 %t1
}		}
define i8 @test_i8_28_mask_ashr_4(i8 %a0) {		define i8 @test_i8_28_mask_ashr_4(i8 %a0) {
; X86-LABEL: test_i8_28_mask_ashr_4:		; X86-LABEL: test_i8_28_mask_ashr_4:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movzbl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movzbl {{[0-9]+}}(%esp), %eax
; X86-NEXT: andb $16, %al
; X86-NEXT: shrb $4, %al		; X86-NEXT: shrb $4, %al
		; X86-NEXT: andb $1, %al
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: test_i8_28_mask_ashr_4:		; X64-LABEL: test_i8_28_mask_ashr_4:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movl %edi, %eax		; X64-NEXT: movl %edi, %eax
; X64-NEXT: andb $16, %al
; X64-NEXT: shrb $4, %al		; X64-NEXT: shrb $4, %al
		; X64-NEXT: andb $1, %al
; X64-NEXT: # kill: def $al killed $al killed $eax		; X64-NEXT: # kill: def $al killed $al killed $eax
; X64-NEXT: retq		; X64-NEXT: retq
%t0 = and i8 %a0, 28		%t0 = and i8 %a0, 28
%t1 = ashr i8 %t0, 4		%t1 = ashr i8 %t0, 4
ret i8 %t1		ret i8 %t1
}		}

define i8 @test_i8_224_mask_ashr_1(i8 %a0) {		define i8 @test_i8_224_mask_ashr_1(i8 %a0) {
[245 lines not shown]
; 16-bit		; 16-bit
;------------------------------------------------------------------------------;		;------------------------------------------------------------------------------;

; lshr		; lshr

define i16 @test_i16_127_mask_lshr_1(i16 %a0) {		define i16 @test_i16_127_mask_lshr_1(i16 %a0) {
; X86-LABEL: test_i16_127_mask_lshr_1:		; X86-LABEL: test_i16_127_mask_lshr_1:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movzwl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: andl $126, %eax
; X86-NEXT: shrl %eax		; X86-NEXT: shrl %eax
		; X86-NEXT: andl $63, %eax
; X86-NEXT: # kill: def $ax killed $ax killed $eax		; X86-NEXT: # kill: def $ax killed $ax killed $eax
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: test_i16_127_mask_lshr_1:		; X64-LABEL: test_i16_127_mask_lshr_1:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movl %edi, %eax		; X64-NEXT: movl %edi, %eax
; X64-NEXT: andl $126, %eax
; X64-NEXT: shrl %eax		; X64-NEXT: shrl %eax
		; X64-NEXT: andl $63, %eax
; X64-NEXT: # kill: def $ax killed $ax killed $eax		; X64-NEXT: # kill: def $ax killed $ax killed $eax
; X64-NEXT: retq		; X64-NEXT: retq
%t0 = and i16 %a0, 127		%t0 = and i16 %a0, 127
%t1 = lshr i16 %t0, 1		%t1 = lshr i16 %t0, 1
ret i16 %t1		ret i16 %t1
}		}

define i16 @test_i16_2032_mask_lshr_3(i16 %a0) {		define i16 @test_i16_2032_mask_lshr_3(i16 %a0) {
; X86-LABEL: test_i16_2032_mask_lshr_3:		; X86-LABEL: test_i16_2032_mask_lshr_3:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movzwl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: andl $2032, %eax # imm = 0x7F0
; X86-NEXT: shrl $3, %eax		; X86-NEXT: shrl $3, %eax
		; X86-NEXT: andl $254, %eax
; X86-NEXT: # kill: def $ax killed $ax killed $eax		; X86-NEXT: # kill: def $ax killed $ax killed $eax
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: test_i16_2032_mask_lshr_3:		; X64-LABEL: test_i16_2032_mask_lshr_3:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movl %edi, %eax		; X64-NEXT: movl %edi, %eax
; X64-NEXT: andl $2032, %eax # imm = 0x7F0
; X64-NEXT: shrl $3, %eax		; X64-NEXT: shrl $3, %eax
		; X64-NEXT: andl $254, %eax
; X64-NEXT: # kill: def $ax killed $ax killed $eax		; X64-NEXT: # kill: def $ax killed $ax killed $eax
; X64-NEXT: retq		; X64-NEXT: retq
%t0 = and i16 %a0, 2032		%t0 = and i16 %a0, 2032
%t1 = lshr i16 %t0, 3		%t1 = lshr i16 %t0, 3
ret i16 %t1		ret i16 %t1
}		}
define i16 @test_i16_2032_mask_lshr_4(i16 %a0) {		define i16 @test_i16_2032_mask_lshr_4(i16 %a0) {
; X86-LABEL: test_i16_2032_mask_lshr_4:		; X86-LABEL: test_i16_2032_mask_lshr_4:
[55 lines not shown]	; X64-NEXT: retq
%t1 = lshr i16 %t0, 6		%t1 = lshr i16 %t0, 6
ret i16 %t1		ret i16 %t1
}		}

define i16 @test_i16_65024_mask_lshr_1(i16 %a0) {		define i16 @test_i16_65024_mask_lshr_1(i16 %a0) {
; X86-LABEL: test_i16_65024_mask_lshr_1:		; X86-LABEL: test_i16_65024_mask_lshr_1:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movzwl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movzwl {{[0-9]+}}(%esp), %eax
; X86-NEXT: andl $65024, %eax # imm = 0xFE00
; X86-NEXT: shrl %eax		; X86-NEXT: shrl %eax
		; X86-NEXT: andl $32512, %eax # imm = 0x7F00
; X86-NEXT: # kill: def $ax killed $ax killed $eax		; X86-NEXT: # kill: def $ax killed $ax killed $eax
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: test_i16_65024_mask_lshr_1:		; X64-LABEL: test_i16_65024_mask_lshr_1:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movl %edi, %eax		; X64-NEXT: movl %edi, %eax
; X64-NEXT: andl $65024, %eax # imm = 0xFE00
; X64-NEXT: shrl %eax		; X64-NEXT: shrl %eax
		; X64-NEXT: andl $32512, %eax # imm = 0x7F00
; X64-NEXT: # kill: def $ax killed $ax killed $eax		; X64-NEXT: # kill: def $ax killed $ax killed $eax
; X64-NEXT: retq		; X64-NEXT: retq
%t0 = and i16 %a0, 65024		%t0 = and i16 %a0, 65024
%t1 = lshr i16 %t0, 1		%t1 = lshr i16 %t0, 1
ret i16 %t1		ret i16 %t1
}		}

		; Explicit `movzbl 5(%esp), %eax` for X86 because the exact value is
		pengfei: You can add the option `--no_x86_scrub_sp` when updating the test.
		; necessary to optimize out the `shr`.
define i16 @test_i16_65024_mask_lshr_8(i16 %a0) {		define i16 @test_i16_65024_mask_lshr_8(i16 %a0) {
; X86-LABEL: test_i16_65024_mask_lshr_8:		; X86-LABEL: test_i16_65024_mask_lshr_8:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movzwl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movzbl {{[0-9]+}}(%esp), %eax
; X86-NEXT: andl $65024, %eax # imm = 0xFE00		; X86-NEXT: andl $-2, %eax
; X86-NEXT: shrl $8, %eax
; X86-NEXT: # kill: def $ax killed $ax killed $eax		; X86-NEXT: # kill: def $ax killed $ax killed $eax
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: test_i16_65024_mask_lshr_8:		; X64-LABEL: test_i16_65024_mask_lshr_8:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movl %edi, %eax		; X64-NEXT: movl %edi, %eax
; X64-NEXT: andl $65024, %eax # imm = 0xFE00
; X64-NEXT: shrl $8, %eax		; X64-NEXT: shrl $8, %eax
		; X64-NEXT: andl $254, %eax
; X64-NEXT: # kill: def $ax killed $ax killed $eax		; X64-NEXT: # kill: def $ax killed $ax killed $eax
; X64-NEXT: retq		; X64-NEXT: retq
%t0 = and i16 %a0, 65024		%t0 = and i16 %a0, 65024
%t1 = lshr i16 %t0, 8		%t1 = lshr i16 %t0, 8
ret i16 %t1		ret i16 %t1
}		}
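
Reviewer's note on the new X86 output `andl $-2, %eax` in `test_i16_65024_mask_lshr_8`: after the `movzbl`, the upper 24 bits of `%eax` are known to be zero, so masking with 254 and masking with -2 give the same result, and -2 fits a sign-extended 8-bit immediate while 254 does not. This is the "imm32 -> sign-extended imm8" case from the summary. The Python sketch below is an illustration under those assumptions; `fits_simm8` and `to_u32` are hypothetical helper names.

```python
def fits_simm8(imm):
    """True if imm can be encoded as a sign-extended 8-bit immediate."""
    return -128 <= imm <= 127

def to_u32(imm):
    """Two's-complement view of a (possibly negative) 32-bit immediate."""
    return imm & 0xFFFFFFFF

def ir_result(a0):
    # IR: %t0 = and i16 %a0, 65024 ; %t1 = lshr i16 %t0, 8
    return ((a0 & 0xFFFF) & 65024) >> 8

def new_x86_result(a0):
    eax = (a0 >> 8) & 0xFF       # movzbl of the high byte of the i16 argument
    return eax & to_u32(-2)      # andl $-2, %eax
```

Here `fits_simm8(-2)` is true and `fits_simm8(254)` is false, and the two functions agree for every 16-bit input, so the cheaper immediate is safe in this case.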
define i16 @test_i16_65024_mask_lshr_9(i16 %a0) {		define i16 @test_i16_65024_mask_lshr_9(i16 %a0) {
; X86-LABEL: test_i16_65024_mask_lshr_9:		; X86-LABEL: test_i16_65024_mask_lshr_9:
[32 lines not shown]	; X64-NEXT: retq
ret i16 %t1		ret i16 %t1
}		}

; ashr		; ashr

define i16 @test_i16_127_mask_ashr_1(i16 %a0) {		define i16 @test_i16_127_mask_ashr_1(i16 %a0) {
; X86-LABEL: test_i16_127_mask_ashr_1:		; X86-LABEL: test_i16_127_mask_ashr_1:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movzwl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: andl $126, %eax
; X86-NEXT: shrl %eax		; X86-NEXT: shrl %eax
		; X86-NEXT: andl $63, %eax
; X86-NEXT: # kill: def $ax killed $ax killed $eax		; X86-NEXT: # kill: def $ax killed $ax killed $eax
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: test_i16_127_mask_ashr_1:		; X64-LABEL: test_i16_127_mask_ashr_1:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movl %edi, %eax		; X64-NEXT: movl %edi, %eax
; X64-NEXT: andl $126, %eax
; X64-NEXT: shrl %eax		; X64-NEXT: shrl %eax
		; X64-NEXT: andl $63, %eax
; X64-NEXT: # kill: def $ax killed $ax killed $eax		; X64-NEXT: # kill: def $ax killed $ax killed $eax
; X64-NEXT: retq		; X64-NEXT: retq
%t0 = and i16 %a0, 127		%t0 = and i16 %a0, 127
%t1 = ashr i16 %t0, 1		%t1 = ashr i16 %t0, 1
ret i16 %t1		ret i16 %t1
}		}

define i16 @test_i16_2032_mask_ashr_3(i16 %a0) {		define i16 @test_i16_2032_mask_ashr_3(i16 %a0) {
; X86-LABEL: test_i16_2032_mask_ashr_3:		; X86-LABEL: test_i16_2032_mask_ashr_3:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movzwl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: andl $2032, %eax # imm = 0x7F0
; X86-NEXT: shrl $3, %eax		; X86-NEXT: shrl $3, %eax
		; X86-NEXT: andl $254, %eax
; X86-NEXT: # kill: def $ax killed $ax killed $eax		; X86-NEXT: # kill: def $ax killed $ax killed $eax
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: test_i16_2032_mask_ashr_3:		; X64-LABEL: test_i16_2032_mask_ashr_3:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movl %edi, %eax		; X64-NEXT: movl %edi, %eax
; X64-NEXT: andl $2032, %eax # imm = 0x7F0
; X64-NEXT: shrl $3, %eax		; X64-NEXT: shrl $3, %eax
		; X64-NEXT: andl $254, %eax
; X64-NEXT: # kill: def $ax killed $ax killed $eax		; X64-NEXT: # kill: def $ax killed $ax killed $eax
; X64-NEXT: retq		; X64-NEXT: retq
%t0 = and i16 %a0, 2032		%t0 = and i16 %a0, 2032
%t1 = ashr i16 %t0, 3		%t1 = ashr i16 %t0, 3
ret i16 %t1		ret i16 %t1
}		}
define i16 @test_i16_2032_mask_ashr_4(i16 %a0) {		define i16 @test_i16_2032_mask_ashr_4(i16 %a0) {
; X86-LABEL: test_i16_2032_mask_ashr_4:		; X86-LABEL: test_i16_2032_mask_ashr_4:
[135 lines not shown]	; X64-NEXT: retq
ret i16 %t1		ret i16 %t1
}		}

; shl		; shl

define i16 @test_i16_127_mask_shl_1(i16 %a0) {		define i16 @test_i16_127_mask_shl_1(i16 %a0) {
; X86-LABEL: test_i16_127_mask_shl_1:		; X86-LABEL: test_i16_127_mask_shl_1:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movzwl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: andl $127, %eax
; X86-NEXT: addl %eax, %eax		; X86-NEXT: addl %eax, %eax
		; X86-NEXT: movzbl %al, %eax
; X86-NEXT: # kill: def $ax killed $ax killed $eax		; X86-NEXT: # kill: def $ax killed $ax killed $eax
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: test_i16_127_mask_shl_1:		; X64-LABEL: test_i16_127_mask_shl_1:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: # kill: def $edi killed $edi def $rdi		; X64-NEXT: addl %edi, %edi
; X64-NEXT: andl $127, %edi		; X64-NEXT: movzbl %dil, %eax
; X64-NEXT: leal (%rdi,%rdi), %eax
; X64-NEXT: # kill: def $ax killed $ax killed $eax		; X64-NEXT: # kill: def $ax killed $ax killed $eax
; X64-NEXT: retq		; X64-NEXT: retq
%t0 = and i16 %a0, 127		%t0 = and i16 %a0, 127
%t1 = shl i16 %t0, 1		%t1 = shl i16 %t0, 1
ret i16 %t1		ret i16 %t1
}		}
define i16 @test_i16_127_mask_shl_8(i16 %a0) {		define i16 @test_i16_127_mask_shl_8(i16 %a0) {
; X86-LABEL: test_i16_127_mask_shl_8:		; X86-LABEL: test_i16_127_mask_shl_8:
[158 lines not shown]
; 32-bit		; 32-bit
;------------------------------------------------------------------------------;		;------------------------------------------------------------------------------;

; lshr		; lshr

define i32 @test_i32_32767_mask_lshr_1(i32 %a0) {		define i32 @test_i32_32767_mask_lshr_1(i32 %a0) {
; X86-LABEL: test_i32_32767_mask_lshr_1:		; X86-LABEL: test_i32_32767_mask_lshr_1:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movl $32766, %eax # imm = 0x7FFE		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: andl {{[0-9]+}}(%esp), %eax
; X86-NEXT: shrl %eax		; X86-NEXT: shrl %eax
		; X86-NEXT: andl $16383, %eax # imm = 0x3FFF
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: test_i32_32767_mask_lshr_1:		; X64-LABEL: test_i32_32767_mask_lshr_1:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movl %edi, %eax		; X64-NEXT: movl %edi, %eax
; X64-NEXT: andl $32766, %eax # imm = 0x7FFE
; X64-NEXT: shrl %eax		; X64-NEXT: shrl %eax
		; X64-NEXT: andl $16383, %eax # imm = 0x3FFF
; X64-NEXT: retq		; X64-NEXT: retq
%t0 = and i32 %a0, 32767		%t0 = and i32 %a0, 32767
%t1 = lshr i32 %t0, 1		%t1 = lshr i32 %t0, 1
ret i32 %t1		ret i32 %t1
}		}

define i32 @test_i32_8388352_mask_lshr_7(i32 %a0) {		define i32 @test_i32_8388352_mask_lshr_7(i32 %a0) {
; X86-LABEL: test_i32_8388352_mask_lshr_7:		; X86-LABEL: test_i32_8388352_mask_lshr_7:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movl $8388352, %eax # imm = 0x7FFF00		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: andl {{[0-9]+}}(%esp), %eax
; X86-NEXT: shrl $7, %eax		; X86-NEXT: shrl $7, %eax
		; X86-NEXT: andl $65534, %eax # imm = 0xFFFE
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: test_i32_8388352_mask_lshr_7:		; X64-LABEL: test_i32_8388352_mask_lshr_7:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movl %edi, %eax		; X64-NEXT: movl %edi, %eax
; X64-NEXT: andl $8388352, %eax # imm = 0x7FFF00
; X64-NEXT: shrl $7, %eax		; X64-NEXT: shrl $7, %eax
		; X64-NEXT: andl $65534, %eax # imm = 0xFFFE
; X64-NEXT: retq		; X64-NEXT: retq
%t0 = and i32 %a0, 8388352		%t0 = and i32 %a0, 8388352
%t1 = lshr i32 %t0, 7		%t1 = lshr i32 %t0, 7
ret i32 %t1		ret i32 %t1
}		}
define i32 @test_i32_8388352_mask_lshr_8(i32 %a0) {		define i32 @test_i32_8388352_mask_lshr_8(i32 %a0) {
; X86-LABEL: test_i32_8388352_mask_lshr_8:		; X86-LABEL: test_i32_8388352_mask_lshr_8:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movl $8388352, %eax # imm = 0x7FFF00		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: andl {{[0-9]+}}(%esp), %eax
; X86-NEXT: shrl $8, %eax		; X86-NEXT: shrl $8, %eax
		; X86-NEXT: andl $32767, %eax # imm = 0x7FFF
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: test_i32_8388352_mask_lshr_8:		; X64-LABEL: test_i32_8388352_mask_lshr_8:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movl %edi, %eax		; X64-NEXT: movl %edi, %eax
; X64-NEXT: andl $8388352, %eax # imm = 0x7FFF00
; X64-NEXT: shrl $8, %eax		; X64-NEXT: shrl $8, %eax
		; X64-NEXT: andl $32767, %eax # imm = 0x7FFF
; X64-NEXT: retq		; X64-NEXT: retq
%t0 = and i32 %a0, 8388352		%t0 = and i32 %a0, 8388352
%t1 = lshr i32 %t0, 8		%t1 = lshr i32 %t0, 8
ret i32 %t1		ret i32 %t1
}		}
define i32 @test_i32_8388352_mask_lshr_9(i32 %a0) {		define i32 @test_i32_8388352_mask_lshr_9(i32 %a0) {
; X86-LABEL: test_i32_8388352_mask_lshr_9:		; X86-LABEL: test_i32_8388352_mask_lshr_9:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movl $8388096, %eax # imm = 0x7FFE00		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: andl {{[0-9]+}}(%esp), %eax
; X86-NEXT: shrl $9, %eax		; X86-NEXT: shrl $9, %eax
		; X86-NEXT: andl $16383, %eax # imm = 0x3FFF
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: test_i32_8388352_mask_lshr_9:		; X64-LABEL: test_i32_8388352_mask_lshr_9:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movl %edi, %eax		; X64-NEXT: movl %edi, %eax
; X64-NEXT: andl $8388096, %eax # imm = 0x7FFE00
; X64-NEXT: shrl $9, %eax		; X64-NEXT: shrl $9, %eax
		; X64-NEXT: andl $16383, %eax # imm = 0x3FFF
; X64-NEXT: retq		; X64-NEXT: retq
%t0 = and i32 %a0, 8388352		%t0 = and i32 %a0, 8388352
%t1 = lshr i32 %t0, 9		%t1 = lshr i32 %t0, 9
ret i32 %t1		ret i32 %t1
}		}
define i32 @test_i32_8388352_mask_lshr_10(i32 %a0) {		define i32 @test_i32_8388352_mask_lshr_10(i32 %a0) {
; X86-LABEL: test_i32_8388352_mask_lshr_10:		; X86-LABEL: test_i32_8388352_mask_lshr_10:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movl $8387584, %eax # imm = 0x7FFC00		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: andl {{[0-9]+}}(%esp), %eax
; X86-NEXT: shrl $10, %eax		; X86-NEXT: shrl $10, %eax
		; X86-NEXT: andl $8191, %eax # imm = 0x1FFF
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: test_i32_8388352_mask_lshr_10:		; X64-LABEL: test_i32_8388352_mask_lshr_10:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movl %edi, %eax		; X64-NEXT: movl %edi, %eax
; X64-NEXT: andl $8387584, %eax # imm = 0x7FFC00
; X64-NEXT: shrl $10, %eax		; X64-NEXT: shrl $10, %eax
		; X64-NEXT: andl $8191, %eax # imm = 0x1FFF
; X64-NEXT: retq		; X64-NEXT: retq
%t0 = and i32 %a0, 8388352		%t0 = and i32 %a0, 8388352
%t1 = lshr i32 %t0, 10		%t1 = lshr i32 %t0, 10
ret i32 %t1		ret i32 %t1
}		}

define i32 @test_i32_4294836224_mask_lshr_1(i32 %a0) {		define i32 @test_i32_4294836224_mask_lshr_1(i32 %a0) {
; X86-LABEL: test_i32_4294836224_mask_lshr_1:		; X86-LABEL: test_i32_4294836224_mask_lshr_1:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movl $-131072, %eax # imm = 0xFFFE0000		; X86-NEXT: movl $-131072, %eax # imm = 0xFFFE0000
; X86-NEXT: andl {{[0-9]+}}(%esp), %eax		; X86-NEXT: andl {{[0-9]+}}(%esp), %eax
; X86-NEXT: shrl %eax		; X86-NEXT: shrl %eax
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: test_i32_4294836224_mask_lshr_1:		; X64-LABEL: test_i32_4294836224_mask_lshr_1:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movl %edi, %eax		; X64-NEXT: movl %edi, %eax
; X64-NEXT: andl $-131072, %eax # imm = 0xFFFE0000		; X64-NEXT: andl $-131072, %eax # imm = 0xFFFE0000
; X64-NEXT: shrl %eax		; X64-NEXT: shrl %eax
; X64-NEXT: retq		; X64-NEXT: retq
%t0 = and i32 %a0, 4294836224		%t0 = and i32 %a0, 4294836224
%t1 = lshr i32 %t0, 1		%t1 = lshr i32 %t0, 1
ret i32 %t1		ret i32 %t1
}		}


		; Explicit `movzwl 6(%esp), %eax` for X86 because the exact value is
		pengfei (inline comment): ditto.
		; necessary to optimize out the `shr`.
define i32 @test_i32_4294836224_mask_lshr_16(i32 %a0) {		define i32 @test_i32_4294836224_mask_lshr_16(i32 %a0) {
; X86-LABEL: test_i32_4294836224_mask_lshr_16:		; X86-LABEL: test_i32_4294836224_mask_lshr_16:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movl $-131072, %eax # imm = 0xFFFE0000		; X86-NEXT: movzwl {{[0-9]+}}(%esp), %eax
; X86-NEXT: andl {{[0-9]+}}(%esp), %eax		; X86-NEXT: andl $-2, %eax
; X86-NEXT: shrl $16, %eax
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: test_i32_4294836224_mask_lshr_16:		; X64-LABEL: test_i32_4294836224_mask_lshr_16:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movl %edi, %eax		; X64-NEXT: movl %edi, %eax
; X64-NEXT: andl $-131072, %eax # imm = 0xFFFE0000
; X64-NEXT: shrl $16, %eax		; X64-NEXT: shrl $16, %eax
		; X64-NEXT: andl $-2, %eax
; X64-NEXT: retq		; X64-NEXT: retq
%t0 = and i32 %a0, 4294836224		%t0 = and i32 %a0, 4294836224
%t1 = lshr i32 %t0, 16		%t1 = lshr i32 %t0, 16
ret i32 %t1		ret i32 %t1
}		}
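After `shrl $16` the upper 16 bits of the value are known to be zero, so mask bits in those positions are don't-cares; setting them to one turns `0xFFFE` into `0xFFFFFFFE`, i.e. `-2`, which x86 can encode as a sign-extended 8-bit immediate. The Python sketch below illustrates that reasoning under stated assumptions: `significant_bits` mimics what `APInt::getSignificantBits` reports, while `widen_mask_after_lshr` is a hypothetical helper and not the patch's actual code.

```python
def to_signed(value, bits):
    """Reinterpret an unsigned `bits`-wide value as two's complement."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def significant_bits(value, bits):
    """Minimum two's-complement width that can hold `value`."""
    s = to_signed(value, bits)
    return (s if s >= 0 else ~s).bit_length() + 1

def widen_mask_after_lshr(mask, shift, bits):
    """Shift the mask and set the known-zero top `shift` bits to one."""
    full = (1 << bits) - 1
    high_ones = full & ~(full >> shift)
    return (mask >> shift) | high_ones

old_mask = 0xFFFE0000                                # andl $-131072
new_mask = widen_mask_after_lshr(old_mask, 16, 32)   # andl $-2
```

Under this model the old mask needs 18 significant bits (a 32-bit immediate), while the widened mask needs only 2 and therefore fits an 8-bit immediate.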
define i32 @test_i32_4294836224_mask_lshr_17(i32 %a0) {		define i32 @test_i32_4294836224_mask_lshr_17(i32 %a0) {
; X86-LABEL: test_i32_4294836224_mask_lshr_17:		; X86-LABEL: test_i32_4294836224_mask_lshr_17:
; X86: # %bb.0:		; X86: # %bb.0:
[27 lines not shown]	; X64-NEXT: retq
ret i32 %t1		ret i32 %t1
}		}

; ashr		; ashr

define i32 @test_i32_32767_mask_ashr_1(i32 %a0) {		define i32 @test_i32_32767_mask_ashr_1(i32 %a0) {
; X86-LABEL: test_i32_32767_mask_ashr_1:		; X86-LABEL: test_i32_32767_mask_ashr_1:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movl $32766, %eax # imm = 0x7FFE		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: andl {{[0-9]+}}(%esp), %eax
; X86-NEXT: shrl %eax		; X86-NEXT: shrl %eax
		; X86-NEXT: andl $16383, %eax # imm = 0x3FFF
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: test_i32_32767_mask_ashr_1:		; X64-LABEL: test_i32_32767_mask_ashr_1:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movl %edi, %eax		; X64-NEXT: movl %edi, %eax
; X64-NEXT: andl $32766, %eax # imm = 0x7FFE
; X64-NEXT: shrl %eax		; X64-NEXT: shrl %eax
		; X64-NEXT: andl $16383, %eax # imm = 0x3FFF
; X64-NEXT: retq		; X64-NEXT: retq
%t0 = and i32 %a0, 32767		%t0 = and i32 %a0, 32767
%t1 = ashr i32 %t0, 1		%t1 = ashr i32 %t0, 1
ret i32 %t1		ret i32 %t1
}		}

define i32 @test_i32_8388352_mask_ashr_7(i32 %a0) {		define i32 @test_i32_8388352_mask_ashr_7(i32 %a0) {
; X86-LABEL: test_i32_8388352_mask_ashr_7:		; X86-LABEL: test_i32_8388352_mask_ashr_7:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movl $8388352, %eax # imm = 0x7FFF00		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: andl {{[0-9]+}}(%esp), %eax
; X86-NEXT: shrl $7, %eax		; X86-NEXT: shrl $7, %eax
		; X86-NEXT: andl $65534, %eax # imm = 0xFFFE
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: test_i32_8388352_mask_ashr_7:		; X64-LABEL: test_i32_8388352_mask_ashr_7:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movl %edi, %eax		; X64-NEXT: movl %edi, %eax
; X64-NEXT: andl $8388352, %eax # imm = 0x7FFF00
; X64-NEXT: shrl $7, %eax		; X64-NEXT: shrl $7, %eax
		; X64-NEXT: andl $65534, %eax # imm = 0xFFFE
; X64-NEXT: retq		; X64-NEXT: retq
%t0 = and i32 %a0, 8388352		%t0 = and i32 %a0, 8388352
%t1 = ashr i32 %t0, 7		%t1 = ashr i32 %t0, 7
ret i32 %t1		ret i32 %t1
}		}
define i32 @test_i32_8388352_mask_ashr_8(i32 %a0) {		define i32 @test_i32_8388352_mask_ashr_8(i32 %a0) {
; X86-LABEL: test_i32_8388352_mask_ashr_8:		; X86-LABEL: test_i32_8388352_mask_ashr_8:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movl $8388352, %eax # imm = 0x7FFF00		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: andl {{[0-9]+}}(%esp), %eax
; X86-NEXT: shrl $8, %eax		; X86-NEXT: shrl $8, %eax
		; X86-NEXT: andl $32767, %eax # imm = 0x7FFF
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: test_i32_8388352_mask_ashr_8:		; X64-LABEL: test_i32_8388352_mask_ashr_8:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movl %edi, %eax		; X64-NEXT: movl %edi, %eax
; X64-NEXT: andl $8388352, %eax # imm = 0x7FFF00
; X64-NEXT: shrl $8, %eax		; X64-NEXT: shrl $8, %eax
		; X64-NEXT: andl $32767, %eax # imm = 0x7FFF
; X64-NEXT: retq		; X64-NEXT: retq
%t0 = and i32 %a0, 8388352		%t0 = and i32 %a0, 8388352
%t1 = ashr i32 %t0, 8		%t1 = ashr i32 %t0, 8
ret i32 %t1		ret i32 %t1
}		}
define i32 @test_i32_8388352_mask_ashr_9(i32 %a0) {		define i32 @test_i32_8388352_mask_ashr_9(i32 %a0) {
; X86-LABEL: test_i32_8388352_mask_ashr_9:		; X86-LABEL: test_i32_8388352_mask_ashr_9:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movl $8388096, %eax # imm = 0x7FFE00		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: andl {{[0-9]+}}(%esp), %eax
; X86-NEXT: shrl $9, %eax		; X86-NEXT: shrl $9, %eax
		; X86-NEXT: andl $16383, %eax # imm = 0x3FFF
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: test_i32_8388352_mask_ashr_9:		; X64-LABEL: test_i32_8388352_mask_ashr_9:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movl %edi, %eax		; X64-NEXT: movl %edi, %eax
; X64-NEXT: andl $8388096, %eax # imm = 0x7FFE00
; X64-NEXT: shrl $9, %eax		; X64-NEXT: shrl $9, %eax
		; X64-NEXT: andl $16383, %eax # imm = 0x3FFF
; X64-NEXT: retq		; X64-NEXT: retq
%t0 = and i32 %a0, 8388352		%t0 = and i32 %a0, 8388352
%t1 = ashr i32 %t0, 9		%t1 = ashr i32 %t0, 9
ret i32 %t1		ret i32 %t1
}		}
define i32 @test_i32_8388352_mask_ashr_10(i32 %a0) {		define i32 @test_i32_8388352_mask_ashr_10(i32 %a0) {
; X86-LABEL: test_i32_8388352_mask_ashr_10:		; X86-LABEL: test_i32_8388352_mask_ashr_10:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movl $8387584, %eax # imm = 0x7FFC00		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: andl {{[0-9]+}}(%esp), %eax
; X86-NEXT: shrl $10, %eax		; X86-NEXT: shrl $10, %eax
		; X86-NEXT: andl $8191, %eax # imm = 0x1FFF
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: test_i32_8388352_mask_ashr_10:		; X64-LABEL: test_i32_8388352_mask_ashr_10:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movl %edi, %eax		; X64-NEXT: movl %edi, %eax
; X64-NEXT: andl $8387584, %eax # imm = 0x7FFC00
; X64-NEXT: shrl $10, %eax		; X64-NEXT: shrl $10, %eax
		; X64-NEXT: andl $8191, %eax # imm = 0x1FFF
; X64-NEXT: retq		; X64-NEXT: retq
%t0 = and i32 %a0, 8388352		%t0 = and i32 %a0, 8388352
%t1 = ashr i32 %t0, 10		%t1 = ashr i32 %t0, 10
ret i32 %t1		ret i32 %t1
}		}

define i32 @test_i32_4294836224_mask_ashr_1(i32 %a0) {		define i32 @test_i32_4294836224_mask_ashr_1(i32 %a0) {
; X86-LABEL: test_i32_4294836224_mask_ashr_1:		; X86-LABEL: test_i32_4294836224_mask_ashr_1:
[64 lines not shown]	; X64-NEXT: retq
ret i32 %t1		ret i32 %t1
}		}

; shl		; shl

define i32 @test_i32_32767_mask_shl_1(i32 %a0) {		define i32 @test_i32_32767_mask_shl_1(i32 %a0) {
; X86-LABEL: test_i32_32767_mask_shl_1:		; X86-LABEL: test_i32_32767_mask_shl_1:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movl $32767, %eax # imm = 0x7FFF		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: andl {{[0-9]+}}(%esp), %eax
; X86-NEXT: addl %eax, %eax		; X86-NEXT: addl %eax, %eax
		; X86-NEXT: movzwl %ax, %eax
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: test_i32_32767_mask_shl_1:		; X64-LABEL: test_i32_32767_mask_shl_1:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: # kill: def $edi killed $edi def $rdi		; X64-NEXT: addl %edi, %edi
; X64-NEXT: andl $32767, %edi # imm = 0x7FFF		; X64-NEXT: movzwl %di, %eax
; X64-NEXT: leal (%rdi,%rdi), %eax
; X64-NEXT: retq		; X64-NEXT: retq
%t0 = and i32 %a0, 32767		%t0 = and i32 %a0, 32767
%t1 = shl i32 %t0, 1		%t1 = shl i32 %t0, 1
ret i32 %t1		ret i32 %t1
}		}
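In the `shl` cases, the shifted mask `0x7FFF << 1 = 0xFFFE` differs from `0xFFFF` only in bit 0, which the shift has already cleared, so a zero-extension of the low 16 bits (`movzwl`) can stand in for the `and`. A small Python sketch of that reasoning follows; the helper names are hypothetical and 32-bit wraparound is modeled explicitly.

```python
M32 = 0xFFFFFFFF  # model 32-bit register arithmetic

def and_then_shl(x, mask, shift):
    """Old order: mask, then shift left (as with andl + leal/addl)."""
    return ((x & mask) << shift) & M32

def shl_then_zext16(x, shift):
    """New order: shift left, then keep the low 16 bits (as with movzwl)."""
    return ((x << shift) & M32) & 0xFFFF

samples = (0, 1, 0x7FFF, 0x8000, 0xFFFF, 0x12345678, 0xFFFFFFFF)
assert all(and_then_shl(x, 0x7FFF, 1) == shl_then_zext16(x, 1) for x in samples)
```

The same check passes for `and_shl_to_mask_i64` in const-shift-with-and.ll (mask `511`, shift `7`), where the new output also ends in `movzwl`.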
define i32 @test_i32_32767_mask_shl_16(i32 %a0) {		define i32 @test_i32_32767_mask_shl_16(i32 %a0) {
; X86-LABEL: test_i32_32767_mask_shl_16:		; X86-LABEL: test_i32_32767_mask_shl_16:
; X86: # %bb.0:		; X86: # %bb.0:
[141 lines not shown]
; 64-bit		; 64-bit
;------------------------------------------------------------------------------;		;------------------------------------------------------------------------------;

; lshr		; lshr

define i64 @test_i64_2147483647_mask_lshr_1(i64 %a0) {		define i64 @test_i64_2147483647_mask_lshr_1(i64 %a0) {
; X86-LABEL: test_i64_2147483647_mask_lshr_1:		; X86-LABEL: test_i64_2147483647_mask_lshr_1:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movl $2147483646, %eax # imm = 0x7FFFFFFE		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: andl {{[0-9]+}}(%esp), %eax
; X86-NEXT: shrl %eax		; X86-NEXT: shrl %eax
		; X86-NEXT: andl $1073741823, %eax # imm = 0x3FFFFFFF
; X86-NEXT: xorl %edx, %edx		; X86-NEXT: xorl %edx, %edx
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: test_i64_2147483647_mask_lshr_1:		; X64-LABEL: test_i64_2147483647_mask_lshr_1:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movq %rdi, %rax		; X64-NEXT: movq %rdi, %rax
; X64-NEXT: andl $2147483646, %eax # imm = 0x7FFFFFFE		; X64-NEXT: shrl %eax
; X64-NEXT: shrq %rax		; X64-NEXT: andl $1073741823, %eax # imm = 0x3FFFFFFF
; X64-NEXT: retq		; X64-NEXT: retq
%t0 = and i64 %a0, 2147483647		%t0 = and i64 %a0, 2147483647
%t1 = lshr i64 %t0, 1		%t1 = lshr i64 %t0, 1
ret i64 %t1		ret i64 %t1
}		}

define i64 @test_i64_140737488289792_mask_lshr_15(i64 %a0) {		define i64 @test_i64_140737488289792_mask_lshr_15(i64 %a0) {
; X86-LABEL: test_i64_140737488289792_mask_lshr_15:		; X86-LABEL: test_i64_140737488289792_mask_lshr_15:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movzwl {{[0-9]+}}(%esp), %ecx		; X86-NEXT: movzwl {{[0-9]+}}(%esp), %ecx
; X86-NEXT: shll $16, %ecx		; X86-NEXT: shll $16, %ecx
; X86-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: shldl $17, %ecx, %eax		; X86-NEXT: shldl $17, %ecx, %eax
; X86-NEXT: xorl %edx, %edx		; X86-NEXT: xorl %edx, %edx
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: test_i64_140737488289792_mask_lshr_15:		; X64-LABEL: test_i64_140737488289792_mask_lshr_15:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movabsq $140737488289792, %rax # imm = 0x7FFFFFFF0000		; X64-NEXT: movq %rdi, %rax
; X64-NEXT: andq %rdi, %rax
; X64-NEXT: shrq $15, %rax		; X64-NEXT: shrq $15, %rax
		; X64-NEXT: andl $-2, %eax
; X64-NEXT: retq		; X64-NEXT: retq
%t0 = and i64 %a0, 140737488289792		%t0 = and i64 %a0, 140737488289792
%t1 = lshr i64 %t0, 15		%t1 = lshr i64 %t0, 15
ret i64 %t1		ret i64 %t1
}		}
define i64 @test_i64_140737488289792_mask_lshr_16(i64 %a0) {		define i64 @test_i64_140737488289792_mask_lshr_16(i64 %a0) {
; X86-LABEL: test_i64_140737488289792_mask_lshr_16:		; X86-LABEL: test_i64_140737488289792_mask_lshr_16:
; X86: # %bb.0:		; X86: # %bb.0:
[78 lines not shown]
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: andl $-2, %eax		; X86-NEXT: andl $-2, %eax
; X86-NEXT: xorl %edx, %edx		; X86-NEXT: xorl %edx, %edx
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: test_i64_18446744065119617024_mask_lshr_32:		; X64-LABEL: test_i64_18446744065119617024_mask_lshr_32:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movabsq $-8589934592, %rax # imm = 0xFFFFFFFE00000000		; X64-NEXT: movq %rdi, %rax
; X64-NEXT: andq %rdi, %rax
; X64-NEXT: shrq $32, %rax		; X64-NEXT: shrq $32, %rax
		; X64-NEXT: andl $-2, %eax
; X64-NEXT: retq		; X64-NEXT: retq
%t0 = and i64 %a0, 18446744065119617024		%t0 = and i64 %a0, 18446744065119617024
%t1 = lshr i64 %t0, 32		%t1 = lshr i64 %t0, 32
ret i64 %t1		ret i64 %t1
}		}
define i64 @test_i64_18446744065119617024_mask_lshr_33(i64 %a0) {		define i64 @test_i64_18446744065119617024_mask_lshr_33(i64 %a0) {
; X86-LABEL: test_i64_18446744065119617024_mask_lshr_33:		; X86-LABEL: test_i64_18446744065119617024_mask_lshr_33:
; X86: # %bb.0:		; X86: # %bb.0:
[29 lines not shown]	; X64-NEXT: retq
ret i64 %t1		ret i64 %t1
}		}

; ashr		; ashr

define i64 @test_i64_2147483647_mask_ashr_1(i64 %a0) {		define i64 @test_i64_2147483647_mask_ashr_1(i64 %a0) {
; X86-LABEL: test_i64_2147483647_mask_ashr_1:		; X86-LABEL: test_i64_2147483647_mask_ashr_1:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movl $2147483646, %eax # imm = 0x7FFFFFFE		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: andl {{[0-9]+}}(%esp), %eax
; X86-NEXT: shrl %eax		; X86-NEXT: shrl %eax
		; X86-NEXT: andl $1073741823, %eax # imm = 0x3FFFFFFF
; X86-NEXT: xorl %edx, %edx		; X86-NEXT: xorl %edx, %edx
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: test_i64_2147483647_mask_ashr_1:		; X64-LABEL: test_i64_2147483647_mask_ashr_1:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movq %rdi, %rax		; X64-NEXT: movq %rdi, %rax
; X64-NEXT: andl $2147483646, %eax # imm = 0x7FFFFFFE		; X64-NEXT: shrl %eax
; X64-NEXT: shrq %rax		; X64-NEXT: andl $1073741823, %eax # imm = 0x3FFFFFFF
; X64-NEXT: retq		; X64-NEXT: retq
%t0 = and i64 %a0, 2147483647		%t0 = and i64 %a0, 2147483647
%t1 = ashr i64 %t0, 1		%t1 = ashr i64 %t0, 1
ret i64 %t1		ret i64 %t1
}		}

define i64 @test_i64_140737488289792_mask_ashr_15(i64 %a0) {		define i64 @test_i64_140737488289792_mask_ashr_15(i64 %a0) {
; X86-LABEL: test_i64_140737488289792_mask_ashr_15:		; X86-LABEL: test_i64_140737488289792_mask_ashr_15:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movzwl {{[0-9]+}}(%esp), %ecx		; X86-NEXT: movzwl {{[0-9]+}}(%esp), %ecx
; X86-NEXT: shll $16, %ecx		; X86-NEXT: shll $16, %ecx
; X86-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: shldl $17, %ecx, %eax		; X86-NEXT: shldl $17, %ecx, %eax
; X86-NEXT: xorl %edx, %edx		; X86-NEXT: xorl %edx, %edx
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: test_i64_140737488289792_mask_ashr_15:		; X64-LABEL: test_i64_140737488289792_mask_ashr_15:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movabsq $140737488289792, %rax # imm = 0x7FFFFFFF0000		; X64-NEXT: movq %rdi, %rax
; X64-NEXT: andq %rdi, %rax
; X64-NEXT: shrq $15, %rax		; X64-NEXT: shrq $15, %rax
		; X64-NEXT: andl $-2, %eax
; X64-NEXT: retq		; X64-NEXT: retq
%t0 = and i64 %a0, 140737488289792		%t0 = and i64 %a0, 140737488289792
%t1 = ashr i64 %t0, 15		%t1 = ashr i64 %t0, 15
ret i64 %t1		ret i64 %t1
}		}
define i64 @test_i64_140737488289792_mask_ashr_16(i64 %a0) {		define i64 @test_i64_140737488289792_mask_ashr_16(i64 %a0) {
; X86-LABEL: test_i64_140737488289792_mask_ashr_16:		; X86-LABEL: test_i64_140737488289792_mask_ashr_16:
; X86: # %bb.0:		; X86: # %bb.0:
[136 lines not shown]
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: addl %eax, %eax		; X86-NEXT: addl %eax, %eax
; X86-NEXT: xorl %edx, %edx		; X86-NEXT: xorl %edx, %edx
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: test_i64_2147483647_mask_shl_1:		; X64-LABEL: test_i64_2147483647_mask_shl_1:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: andl $2147483647, %edi # imm = 0x7FFFFFFF		; X64-NEXT: leal (%rdi,%rdi), %eax
; X64-NEXT: leaq (%rdi,%rdi), %rax
; X64-NEXT: retq		; X64-NEXT: retq
%t0 = and i64 %a0, 2147483647		%t0 = and i64 %a0, 2147483647
%t1 = shl i64 %t0, 1		%t1 = shl i64 %t0, 1
ret i64 %t1		ret i64 %t1
}		}
define i64 @test_i64_2147483647_mask_shl_32(i64 %a0) {		define i64 @test_i64_2147483647_mask_shl_32(i64 %a0) {
; X86-LABEL: test_i64_2147483647_mask_shl_32:		; X86-LABEL: test_i64_2147483647_mask_shl_32:
; X86: # %bb.0:		; X86: # %bb.0:
[151 lines not shown]

llvm/test/CodeGen/X86/const-shift-with-and.ll

; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py		; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
; RUN: llc < %s -mtriple=i686-unknown-unknown | FileCheck %s --check-prefix=X86		; RUN: llc < %s -mtriple=i686-unknown-unknown | FileCheck %s --check-prefix=X86
; RUN: llc < %s -mtriple=x86_64-unknown-unknown | FileCheck %s --check-prefix=X64		; RUN: llc < %s -mtriple=x86_64-unknown-unknown | FileCheck %s --check-prefix=X64

define i64 @and_shr_from_mask_i64(i64 %x) nounwind {		define i64 @and_shr_from_mask_i64(i64 %x) nounwind {
		pengfei (inline comment): What are these tests used for? They are not changed in this patch? Besides, do you still need a new test case for this patch? Are these covered by the other changes already?
; X86-LABEL: and_shr_from_mask_i64:		; X86-LABEL: and_shr_from_mask_i64:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movzbl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movzbl {{[0-9]+}}(%esp), %eax
; X86-NEXT: shrl $4, %eax		; X86-NEXT: shrl $4, %eax
; X86-NEXT: xorl %edx, %edx		; X86-NEXT: xorl %edx, %edx
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: and_shr_from_mask_i64:		; X64-LABEL: and_shr_from_mask_i64:
[78 lines not shown]	; X64-NEXT: retq
%and = and i64 %x, 240		%and = and i64 %x, 240
%shr = lshr i64 %and, 4		%shr = lshr i64 %and, 4
ret i64 %shr		ret i64 %shr
}		}

define i32 @shr_and_to_mask_i32(i32 %x) nounwind {		define i32 @shr_and_to_mask_i32(i32 %x) nounwind {
; X86-LABEL: shr_and_to_mask_i32:		; X86-LABEL: shr_and_to_mask_i32:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movzwl {{[0-9]+}}(%esp), %eax
; X86-NEXT: shrl $6, %eax		; X86-NEXT: shrl $6, %eax
; X86-NEXT: andl $1023, %eax # imm = 0x3FF
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: shr_and_to_mask_i32:		; X64-LABEL: shr_and_to_mask_i32:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movl %edi, %eax		; X64-NEXT: movzwl %di, %eax
; X64-NEXT: shrl $6, %eax		; X64-NEXT: shrl $6, %eax
; X64-NEXT: andl $1023, %eax # imm = 0x3FF
; X64-NEXT: retq		; X64-NEXT: retq
%shr = lshr i32 %x, 6		%shr = lshr i32 %x, 6
%and = and i32 %shr, 1023		%and = and i32 %shr, 1023
ret i32 %and		ret i32 %and
}		}

define i64 @and_shl_to_mask_i64(i64 %x) nounwind {		define i64 @and_shl_to_mask_i64(i64 %x) nounwind {
; X86-LABEL: and_shl_to_mask_i64:		; X86-LABEL: and_shl_to_mask_i64:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movl $511, %eax # imm = 0x1FF		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: andl {{[0-9]+}}(%esp), %eax
; X86-NEXT: shll $7, %eax		; X86-NEXT: shll $7, %eax
		; X86-NEXT: movzwl %ax, %eax
; X86-NEXT: xorl %edx, %edx		; X86-NEXT: xorl %edx, %edx
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: and_shl_to_mask_i64:		; X64-LABEL: and_shl_to_mask_i64:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movq %rdi, %rax		; X64-NEXT: shll $7, %edi
; X64-NEXT: andl $511, %eax # imm = 0x1FF		; X64-NEXT: movzwl %di, %eax
; X64-NEXT: shlq $7, %rax
; X64-NEXT: retq		; X64-NEXT: retq
%and = and i64 %x, 511		%and = and i64 %x, 511
%shl = shl i64 %and, 7		%shl = shl i64 %and, 7
ret i64 %shl		ret i64 %shl
}		}

define i16 @shl_and_to_mask_i16(i16 %x) nounwind {		define i16 @shl_and_to_mask_i16(i16 %x) nounwind {
; X86-LABEL: shl_and_to_mask_i16:		; X86-LABEL: shl_and_to_mask_i16:
[52 lines not shown]	; X64-NEXT: retq
%and = and i32 %x, 511		%and = and i32 %x, 511
%shl = shl i32 %and, 8		%shl = shl i32 %and, 8
ret i32 %shl		ret i32 %shl
}		}

define i64 @and_shr_to_shrink_i64(i64 %x) nounwind {		define i64 @and_shr_to_shrink_i64(i64 %x) nounwind {
; X86-LABEL: and_shr_to_shrink_i64:		; X86-LABEL: and_shr_to_shrink_i64:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movl $64704, %eax # imm = 0xFCC0		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: andl {{[0-9]+}}(%esp), %eax
; X86-NEXT: shrl $6, %eax		; X86-NEXT: shrl $6, %eax
		; X86-NEXT: andl $1011, %eax # imm = 0x3F3
; X86-NEXT: xorl %edx, %edx		; X86-NEXT: xorl %edx, %edx
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: and_shr_to_shrink_i64:		; X64-LABEL: and_shr_to_shrink_i64:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movq %rdi, %rax		; X64-NEXT: movq %rdi, %rax
; X64-NEXT: andl $64704, %eax # imm = 0xFCC0
; X64-NEXT: shrq $6, %rax		; X64-NEXT: shrq $6, %rax
		; X64-NEXT: andl $1011, %eax # imm = 0x3F3
; X64-NEXT: retq		; X64-NEXT: retq
%and = and i64 %x, 64704		%and = and i64 %x, 64704
%shr = lshr i64 %and, 6		%shr = lshr i64 %and, 6
ret i64 %shr		ret i64 %shr
}		}

define i32 @shl_and_to_shrink_i32(i32 %x) nounwind {		define i32 @shl_and_to_shrink_i32(i32 %x) nounwind {
; X86-LABEL: shl_and_to_shrink_i32:		; X86-LABEL: shl_and_to_shrink_i32:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movl $511, %eax # imm = 0x1FF
		; X86-NEXT: andl {{[0-9]+}}(%esp), %eax
; X86-NEXT: shll $8, %eax		; X86-NEXT: shll $8, %eax
; X86-NEXT: andl $130816, %eax # imm = 0x1FF00
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: shl_and_to_shrink_i32:		; X64-LABEL: shl_and_to_shrink_i32:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movl %edi, %eax		; X64-NEXT: movl %edi, %eax
		; X64-NEXT: andl $511, %eax # imm = 0x1FF
; X64-NEXT: shll $8, %eax		; X64-NEXT: shll $8, %eax
; X64-NEXT: andl $130816, %eax # imm = 0x1FF00
; X64-NEXT: retq		; X64-NEXT: retq
%shl = shl i32 %x, 8		%shl = shl i32 %x, 8
%and = and i32 %shl, 131071		%and = and i32 %shl, 131071
ret i32 %and		ret i32 %and
}		}
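Here the reordering goes the other way: `(x << 8) & 0x1FFFF` keeps only bits 8..16 of the shifted value, which equals `(x & 0x1FF) << 8`, so masking first uses the smaller constant `511`. The Python sketch below checks that equivalence with 32-bit wraparound; the helper names are hypothetical and the code is not taken from the patch.

```python
M32 = 0xFFFFFFFF  # model 32-bit register arithmetic

def shl_then_and(x, shift, mask):
    """Old order: shift left, then mask with the large constant."""
    return ((x << shift) & M32) & mask

def and_then_shl(x, shift, mask):
    """New order: mask with the pre-shifted constant, then shift left."""
    # The low `shift` bits of the mask are irrelevant because the shift zeroes them.
    return ((x & (mask >> shift)) << shift) & M32

samples = (0, 1, 0x1FF, 0x200, 0xFFFF, 0x12345678, 0xFFFFFFFF)
assert all(shl_then_and(x, 8, 0x1FFFF) == and_then_shl(x, 8, 0x1FFFF)
           for x in samples)
```

The same relation holds for `t1` in fold-and-shift.ll, where `(i << 2) & 1020` equals `(i & 255) << 2`.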

define i64 @shr_and_from_shrink8_i64(i64 %x) nounwind {		define i64 @shr_and_from_shrink8_i64(i64 %x) nounwind {
; X86-LABEL: shr_and_from_shrink8_i64:		; X86-LABEL: shr_and_from_shrink8_i64:
[71 lines not shown]	; X64-NEXT: retq
%shl = shl i32 %x, 5		%shl = shl i32 %x, 5
%and = and i32 %shl, 2016		%and = and i32 %shl, 2016
ret i32 %and		ret i32 %and
}		}

define i64 @shr_and_from_shrink32_i64(i64 %x) nounwind {		define i64 @shr_and_from_shrink32_i64(i64 %x) nounwind {
; X86-LABEL: shr_and_from_shrink32_i64:		; X86-LABEL: shr_and_from_shrink32_i64:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movl $8192, %eax # imm = 0x2000
		; X86-NEXT: andl {{[0-9]+}}(%esp), %eax
; X86-NEXT: shll $16, %eax		; X86-NEXT: shll $16, %eax
; X86-NEXT: andl $536870912, %eax # imm = 0x20000000
; X86-NEXT: xorl %edx, %edx		; X86-NEXT: xorl %edx, %edx
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: shr_and_from_shrink32_i64:		; X64-LABEL: shr_and_from_shrink32_i64:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movq %rdi, %rax		; X64-NEXT: movq %rdi, %rax
; X64-NEXT: shrq $16, %rax		; X64-NEXT: shrq $16, %rax
; X64-NEXT: andl $536870912, %eax # imm = 0x20000000		; X64-NEXT: andl $536870912, %eax # imm = 0x20000000
[45 lines not shown]
}		}

define i64 @shl_and_to_shrink32_i64(i64 %x) nounwind {		define i64 @shl_and_to_shrink32_i64(i64 %x) nounwind {
; X86-LABEL: shl_and_to_shrink32_i64:		; X86-LABEL: shl_and_to_shrink32_i64:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: movl %eax, %edx		; X86-NEXT: movl %eax, %edx
; X86-NEXT: shrl $27, %edx		; X86-NEXT: shrl $27, %edx
		; X86-NEXT: andl $89501588, %eax # imm = 0x555AF94
; X86-NEXT: shll $5, %eax		; X86-NEXT: shll $5, %eax
; X86-NEXT: andl $-1430916480, %eax # imm = 0xAAB5F280
; X86-NEXT: andl $-4, %edx		; X86-NEXT: andl $-4, %edx
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: shl_and_to_shrink32_i64:		; X64-LABEL: shl_and_to_shrink32_i64:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movq %rdi, %rax		; X64-NEXT: movq %rdi, %rax
; X64-NEXT: andl $-447369324, %eax # imm = 0xE555AF94		; X64-NEXT: andl $-447369324, %eax # imm = 0xE555AF94
; X64-NEXT: shlq $5, %rax		; X64-NEXT: shlq $5, %rax
[...]
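
The `shl`/`and` check updates above rest on one identity: for a constant shift amount `c` and mask `m`, `(x << c) & m` equals `(x & (m >> c)) << c` on a fixed-width integer, so the mask can be applied before the shift, where it may fit a smaller immediate. The following is a minimal Python sketch of that identity only, not the patch's implementation; the helper names `shl_then_and`, `and_then_shl`, and `check` are hypothetical.

```python
import random

BITS = 32
FULL = (1 << BITS) - 1  # all-ones mask emulating a 32-bit register


def shl_then_and(x, c, m):
    """Old ordering: shift left first, then mask the shifted value."""
    return ((x << c) & FULL) & m


def and_then_shl(x, c, m):
    """New ordering: mask with m >> c first, then shift left."""
    # The low c bits of m can never survive the shift, so dropping them
    # from the mask does not change the result.
    return ((x & (m >> c)) << c) & FULL


def check(c, m, trials=10000, seed=0):
    """Compare both orderings on random 32-bit inputs (a spot check, not a proof)."""
    rng = random.Random(seed)
    return all(
        shl_then_and(x, c, m) == and_then_shl(x, c, m)
        for x in (rng.getrandbits(BITS) for _ in range(trials))
    )
```

With the constants that appear in the checks above, `131071 >> 8` is `511` (the new `andl $511` before `shll $8`), and `0xAAB5F280 >> 5` is `0x555AF94` (the new `andl $89501588` before `shll $5`).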

llvm/test/CodeGen/X86/fold-and-shift.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc < %s -mtriple=i686-- | FileCheck %s			; RUN: llc < %s -mtriple=i686-- | FileCheck %s

	define i32 @t1(ptr %X, i32 %i) {			define i32 @t1(ptr %X, i32 %i) {
	; CHECK-LABEL: t1:			; CHECK-LABEL: t1:
	; CHECK: # %bb.0: # %entry			; CHECK: # %bb.0: # %entry
	; CHECK-NEXT: movl {{[0-9]+}}(%esp), %eax			; CHECK-NEXT: movzbl {{[0-9]+}}(%esp), %eax
	; CHECK-NEXT: movl {{[0-9]+}}(%esp), %ecx			; CHECK-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; CHECK-NEXT: movzbl %cl, %ecx			; CHECK-NEXT: movl (%ecx,%eax,4), %eax
	; CHECK-NEXT: movl (%eax,%ecx,4), %eax
	; CHECK-NEXT: retl			; CHECK-NEXT: retl

	entry:			entry:
	%tmp2 = shl i32 %i, 2			%tmp2 = shl i32 %i, 2
	%tmp4 = and i32 %tmp2, 1020			%tmp4 = and i32 %tmp2, 1020
	%tmp7 = getelementptr i8, ptr %X, i32 %tmp4			%tmp7 = getelementptr i8, ptr %X, i32 %tmp4
	%tmp9 = load i32, ptr %tmp7			%tmp9 = load i32, ptr %tmp7
	ret i32 %tmp9			ret i32 %tmp9
	}			}

	define i32 @t2(ptr %X, i32 %i) {			define i32 @t2(ptr %X, i32 %i) {
	; CHECK-LABEL: t2:			; CHECK-LABEL: t2:
	; CHECK: # %bb.0: # %entry			; CHECK: # %bb.0: # %entry
	; CHECK-NEXT: movl {{[0-9]+}}(%esp), %eax			; CHECK-NEXT: movl {{[0-9]+}}(%esp), %eax
	; CHECK-NEXT: movl {{[0-9]+}}(%esp), %ecx			; CHECK-NEXT: movzwl {{[0-9]+}}(%esp), %ecx
	; CHECK-NEXT: movzwl %cx, %ecx			; CHECK-NEXT: addl %ecx, %ecx
	; CHECK-NEXT: movl (%eax,%ecx,4), %eax			; CHECK-NEXT: movl (%eax,%ecx,2), %eax
	; CHECK-NEXT: retl			; CHECK-NEXT: retl

	entry:			entry:
	%tmp2 = shl i32 %i, 1			%tmp2 = shl i32 %i, 1
	%tmp4 = and i32 %tmp2, 131070			%tmp4 = and i32 %tmp2, 131070
	%tmp7 = getelementptr i16, ptr %X, i32 %tmp4			%tmp7 = getelementptr i16, ptr %X, i32 %tmp4
	%tmp9 = load i32, ptr %tmp7			%tmp9 = load i32, ptr %tmp7
	ret i32 %tmp9			ret i32 %tmp9
	[...]

llvm/test/CodeGen/X86/limited-prec.ll

	[...]
	; precision6-LABEL: f4:			; precision6-LABEL: f4:
	; precision6: # %bb.0: # %entry			; precision6: # %bb.0: # %entry
	; precision6-NEXT: subl $8, %esp			; precision6-NEXT: subl $8, %esp
	; precision6-NEXT: movl {{[0-9]+}}(%esp), %eax			; precision6-NEXT: movl {{[0-9]+}}(%esp), %eax
	; precision6-NEXT: movl %eax, %ecx			; precision6-NEXT: movl %eax, %ecx
	; precision6-NEXT: andl $8388607, %ecx # imm = 0x7FFFFF			; precision6-NEXT: andl $8388607, %ecx # imm = 0x7FFFFF
	; precision6-NEXT: orl $1065353216, %ecx # imm = 0x3F800000			; precision6-NEXT: orl $1065353216, %ecx # imm = 0x3F800000
	; precision6-NEXT: movl %ecx, (%esp)			; precision6-NEXT: movl %ecx, (%esp)
	; precision6-NEXT: andl $2139095040, %eax # imm = 0x7F800000
	; precision6-NEXT: shrl $23, %eax			; precision6-NEXT: shrl $23, %eax
				; precision6-NEXT: movzbl %al, %eax
	; precision6-NEXT: addl $-127, %eax			; precision6-NEXT: addl $-127, %eax
	; precision6-NEXT: movl %eax, {{[0-9]+}}(%esp)			; precision6-NEXT: movl %eax, {{[0-9]+}}(%esp)
	; precision6-NEXT: flds (%esp)			; precision6-NEXT: flds (%esp)
	; precision6-NEXT: fld %st(0)			; precision6-NEXT: fld %st(0)
	; precision6-NEXT: fmuls {{\.?LCPI[0-9]+_[0-9]+}}			; precision6-NEXT: fmuls {{\.?LCPI[0-9]+_[0-9]+}}
	; precision6-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}			; precision6-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}
	; precision6-NEXT: fmulp %st, %st(1)			; precision6-NEXT: fmulp %st, %st(1)
	; precision6-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}			; precision6-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}
	; precision6-NEXT: fildl {{[0-9]+}}(%esp)			; precision6-NEXT: fildl {{[0-9]+}}(%esp)
	; precision6-NEXT: fmuls {{\.?LCPI[0-9]+_[0-9]+}}			; precision6-NEXT: fmuls {{\.?LCPI[0-9]+_[0-9]+}}
	; precision6-NEXT: faddp %st, %st(1)			; precision6-NEXT: faddp %st, %st(1)
	; precision6-NEXT: addl $8, %esp			; precision6-NEXT: addl $8, %esp
	; precision6-NEXT: retl			; precision6-NEXT: retl
	;			;
	; precision12-LABEL: f4:			; precision12-LABEL: f4:
	; precision12: # %bb.0: # %entry			; precision12: # %bb.0: # %entry
	; precision12-NEXT: subl $8, %esp			; precision12-NEXT: subl $8, %esp
	; precision12-NEXT: movl {{[0-9]+}}(%esp), %eax			; precision12-NEXT: movl {{[0-9]+}}(%esp), %eax
	; precision12-NEXT: movl %eax, %ecx			; precision12-NEXT: movl %eax, %ecx
	; precision12-NEXT: andl $8388607, %ecx # imm = 0x7FFFFF			; precision12-NEXT: andl $8388607, %ecx # imm = 0x7FFFFF
	; precision12-NEXT: orl $1065353216, %ecx # imm = 0x3F800000			; precision12-NEXT: orl $1065353216, %ecx # imm = 0x3F800000
	; precision12-NEXT: movl %ecx, (%esp)			; precision12-NEXT: movl %ecx, (%esp)
	; precision12-NEXT: andl $2139095040, %eax # imm = 0x7F800000
	; precision12-NEXT: shrl $23, %eax			; precision12-NEXT: shrl $23, %eax
				; precision12-NEXT: movzbl %al, %eax
	; precision12-NEXT: addl $-127, %eax			; precision12-NEXT: addl $-127, %eax
	; precision12-NEXT: movl %eax, {{[0-9]+}}(%esp)			; precision12-NEXT: movl %eax, {{[0-9]+}}(%esp)
	; precision12-NEXT: flds (%esp)			; precision12-NEXT: flds (%esp)
	; precision12-NEXT: fld %st(0)			; precision12-NEXT: fld %st(0)
	; precision12-NEXT: fmuls {{\.?LCPI[0-9]+_[0-9]+}}			; precision12-NEXT: fmuls {{\.?LCPI[0-9]+_[0-9]+}}
	; precision12-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}			; precision12-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}
	; precision12-NEXT: fmul %st(1), %st			; precision12-NEXT: fmul %st(1), %st
	; precision12-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}			; precision12-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}
	[...]
	; precision18-LABEL: f4:			; precision18-LABEL: f4:
	; precision18: # %bb.0: # %entry			; precision18: # %bb.0: # %entry
	; precision18-NEXT: subl $8, %esp			; precision18-NEXT: subl $8, %esp
	; precision18-NEXT: movl {{[0-9]+}}(%esp), %eax			; precision18-NEXT: movl {{[0-9]+}}(%esp), %eax
	; precision18-NEXT: movl %eax, %ecx			; precision18-NEXT: movl %eax, %ecx
	; precision18-NEXT: andl $8388607, %ecx # imm = 0x7FFFFF			; precision18-NEXT: andl $8388607, %ecx # imm = 0x7FFFFF
	; precision18-NEXT: orl $1065353216, %ecx # imm = 0x3F800000			; precision18-NEXT: orl $1065353216, %ecx # imm = 0x3F800000
	; precision18-NEXT: movl %ecx, (%esp)			; precision18-NEXT: movl %ecx, (%esp)
	; precision18-NEXT: andl $2139095040, %eax # imm = 0x7F800000
	; precision18-NEXT: shrl $23, %eax			; precision18-NEXT: shrl $23, %eax
				; precision18-NEXT: movzbl %al, %eax
	; precision18-NEXT: addl $-127, %eax			; precision18-NEXT: addl $-127, %eax
	; precision18-NEXT: movl %eax, {{[0-9]+}}(%esp)			; precision18-NEXT: movl %eax, {{[0-9]+}}(%esp)
	; precision18-NEXT: flds (%esp)			; precision18-NEXT: flds (%esp)
	; precision18-NEXT: fld %st(0)			; precision18-NEXT: fld %st(0)
	; precision18-NEXT: fmuls {{\.?LCPI[0-9]+_[0-9]+}}			; precision18-NEXT: fmuls {{\.?LCPI[0-9]+_[0-9]+}}
	; precision18-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}			; precision18-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}
	; precision18-NEXT: fmul %st(1), %st			; precision18-NEXT: fmul %st(1), %st
	; precision18-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}			; precision18-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}
	[...]
	; precision6-LABEL: f5:			; precision6-LABEL: f5:
	; precision6: # %bb.0: # %entry			; precision6: # %bb.0: # %entry
	; precision6-NEXT: subl $8, %esp			; precision6-NEXT: subl $8, %esp
	; precision6-NEXT: movl {{[0-9]+}}(%esp), %eax			; precision6-NEXT: movl {{[0-9]+}}(%esp), %eax
	; precision6-NEXT: movl %eax, %ecx			; precision6-NEXT: movl %eax, %ecx
	; precision6-NEXT: andl $8388607, %ecx # imm = 0x7FFFFF			; precision6-NEXT: andl $8388607, %ecx # imm = 0x7FFFFF
	; precision6-NEXT: orl $1065353216, %ecx # imm = 0x3F800000			; precision6-NEXT: orl $1065353216, %ecx # imm = 0x3F800000
	; precision6-NEXT: movl %ecx, (%esp)			; precision6-NEXT: movl %ecx, (%esp)
	; precision6-NEXT: andl $2139095040, %eax # imm = 0x7F800000
	; precision6-NEXT: shrl $23, %eax			; precision6-NEXT: shrl $23, %eax
				; precision6-NEXT: movzbl %al, %eax
	; precision6-NEXT: addl $-127, %eax			; precision6-NEXT: addl $-127, %eax
	; precision6-NEXT: movl %eax, {{[0-9]+}}(%esp)			; precision6-NEXT: movl %eax, {{[0-9]+}}(%esp)
	; precision6-NEXT: flds (%esp)			; precision6-NEXT: flds (%esp)
	; precision6-NEXT: fld %st(0)			; precision6-NEXT: fld %st(0)
	; precision6-NEXT: fmuls {{\.?LCPI[0-9]+_[0-9]+}}			; precision6-NEXT: fmuls {{\.?LCPI[0-9]+_[0-9]+}}
	; precision6-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}			; precision6-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}
	; precision6-NEXT: fmulp %st, %st(1)			; precision6-NEXT: fmulp %st, %st(1)
	; precision6-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}			; precision6-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}
	; precision6-NEXT: fiaddl {{[0-9]+}}(%esp)			; precision6-NEXT: fiaddl {{[0-9]+}}(%esp)
	; precision6-NEXT: addl $8, %esp			; precision6-NEXT: addl $8, %esp
	; precision6-NEXT: retl			; precision6-NEXT: retl
	;			;
	; precision12-LABEL: f5:			; precision12-LABEL: f5:
	; precision12: # %bb.0: # %entry			; precision12: # %bb.0: # %entry
	; precision12-NEXT: subl $8, %esp			; precision12-NEXT: subl $8, %esp
	; precision12-NEXT: movl {{[0-9]+}}(%esp), %eax			; precision12-NEXT: movl {{[0-9]+}}(%esp), %eax
	; precision12-NEXT: movl %eax, %ecx			; precision12-NEXT: movl %eax, %ecx
	; precision12-NEXT: andl $8388607, %ecx # imm = 0x7FFFFF			; precision12-NEXT: andl $8388607, %ecx # imm = 0x7FFFFF
	; precision12-NEXT: orl $1065353216, %ecx # imm = 0x3F800000			; precision12-NEXT: orl $1065353216, %ecx # imm = 0x3F800000
	; precision12-NEXT: movl %ecx, (%esp)			; precision12-NEXT: movl %ecx, (%esp)
	; precision12-NEXT: andl $2139095040, %eax # imm = 0x7F800000
	; precision12-NEXT: shrl $23, %eax			; precision12-NEXT: shrl $23, %eax
				; precision12-NEXT: movzbl %al, %eax
	; precision12-NEXT: addl $-127, %eax			; precision12-NEXT: addl $-127, %eax
	; precision12-NEXT: movl %eax, {{[0-9]+}}(%esp)			; precision12-NEXT: movl %eax, {{[0-9]+}}(%esp)
	; precision12-NEXT: flds (%esp)			; precision12-NEXT: flds (%esp)
	; precision12-NEXT: fld %st(0)			; precision12-NEXT: fld %st(0)
	; precision12-NEXT: fmuls {{\.?LCPI[0-9]+_[0-9]+}}			; precision12-NEXT: fmuls {{\.?LCPI[0-9]+_[0-9]+}}
	; precision12-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}			; precision12-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}
	; precision12-NEXT: fmul %st(1), %st			; precision12-NEXT: fmul %st(1), %st
	; precision12-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}			; precision12-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}
	; precision12-NEXT: fmul %st(1), %st			; precision12-NEXT: fmul %st(1), %st
	; precision12-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}			; precision12-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}
	; precision12-NEXT: fmulp %st, %st(1)			; precision12-NEXT: fmulp %st, %st(1)
	; precision12-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}			; precision12-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}
	; precision12-NEXT: fiaddl {{[0-9]+}}(%esp)			; precision12-NEXT: fiaddl {{[0-9]+}}(%esp)
	; precision12-NEXT: addl $8, %esp			; precision12-NEXT: addl $8, %esp
	; precision12-NEXT: retl			; precision12-NEXT: retl
	;			;
	; precision18-LABEL: f5:			; precision18-LABEL: f5:
	; precision18: # %bb.0: # %entry			; precision18: # %bb.0: # %entry
	; precision18-NEXT: subl $8, %esp			; precision18-NEXT: subl $8, %esp
	; precision18-NEXT: movl {{[0-9]+}}(%esp), %eax			; precision18-NEXT: movl {{[0-9]+}}(%esp), %eax
	; precision18-NEXT: movl %eax, %ecx			; precision18-NEXT: movl %eax, %ecx
	; precision18-NEXT: andl $8388607, %ecx # imm = 0x7FFFFF			; precision18-NEXT: andl $8388607, %ecx # imm = 0x7FFFFF
	; precision18-NEXT: orl $1065353216, %ecx # imm = 0x3F800000			; precision18-NEXT: orl $1065353216, %ecx # imm = 0x3F800000
	; precision18-NEXT: movl %ecx, (%esp)			; precision18-NEXT: movl %ecx, (%esp)
	; precision18-NEXT: andl $2139095040, %eax # imm = 0x7F800000
	; precision18-NEXT: shrl $23, %eax			; precision18-NEXT: shrl $23, %eax
				; precision18-NEXT: movzbl %al, %eax
	; precision18-NEXT: addl $-127, %eax			; precision18-NEXT: addl $-127, %eax
	; precision18-NEXT: movl %eax, {{[0-9]+}}(%esp)			; precision18-NEXT: movl %eax, {{[0-9]+}}(%esp)
	; precision18-NEXT: flds (%esp)			; precision18-NEXT: flds (%esp)
	; precision18-NEXT: fld %st(0)			; precision18-NEXT: fld %st(0)
	; precision18-NEXT: fmuls {{\.?LCPI[0-9]+_[0-9]+}}			; precision18-NEXT: fmuls {{\.?LCPI[0-9]+_[0-9]+}}
	; precision18-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}			; precision18-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}
	; precision18-NEXT: fmul %st(1), %st			; precision18-NEXT: fmul %st(1), %st
	; precision18-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}			; precision18-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}
	[...]
	; precision6-LABEL: f6:			; precision6-LABEL: f6:
	; precision6: # %bb.0: # %entry			; precision6: # %bb.0: # %entry
	; precision6-NEXT: subl $8, %esp			; precision6-NEXT: subl $8, %esp
	; precision6-NEXT: movl {{[0-9]+}}(%esp), %eax			; precision6-NEXT: movl {{[0-9]+}}(%esp), %eax
	; precision6-NEXT: movl %eax, %ecx			; precision6-NEXT: movl %eax, %ecx
	; precision6-NEXT: andl $8388607, %ecx # imm = 0x7FFFFF			; precision6-NEXT: andl $8388607, %ecx # imm = 0x7FFFFF
	; precision6-NEXT: orl $1065353216, %ecx # imm = 0x3F800000			; precision6-NEXT: orl $1065353216, %ecx # imm = 0x3F800000
	; precision6-NEXT: movl %ecx, (%esp)			; precision6-NEXT: movl %ecx, (%esp)
	; precision6-NEXT: andl $2139095040, %eax # imm = 0x7F800000
	; precision6-NEXT: shrl $23, %eax			; precision6-NEXT: shrl $23, %eax
				; precision6-NEXT: movzbl %al, %eax
	; precision6-NEXT: addl $-127, %eax			; precision6-NEXT: addl $-127, %eax
	; precision6-NEXT: movl %eax, {{[0-9]+}}(%esp)			; precision6-NEXT: movl %eax, {{[0-9]+}}(%esp)
	; precision6-NEXT: flds (%esp)			; precision6-NEXT: flds (%esp)
	; precision6-NEXT: fld %st(0)			; precision6-NEXT: fld %st(0)
	; precision6-NEXT: fmuls {{\.?LCPI[0-9]+_[0-9]+}}			; precision6-NEXT: fmuls {{\.?LCPI[0-9]+_[0-9]+}}
	; precision6-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}			; precision6-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}
	; precision6-NEXT: fmulp %st, %st(1)			; precision6-NEXT: fmulp %st, %st(1)
	; precision6-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}			; precision6-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}
	; precision6-NEXT: fildl {{[0-9]+}}(%esp)			; precision6-NEXT: fildl {{[0-9]+}}(%esp)
	; precision6-NEXT: fmuls {{\.?LCPI[0-9]+_[0-9]+}}			; precision6-NEXT: fmuls {{\.?LCPI[0-9]+_[0-9]+}}
	; precision6-NEXT: faddp %st, %st(1)			; precision6-NEXT: faddp %st, %st(1)
	; precision6-NEXT: addl $8, %esp			; precision6-NEXT: addl $8, %esp
	; precision6-NEXT: retl			; precision6-NEXT: retl
	;			;
	; precision12-LABEL: f6:			; precision12-LABEL: f6:
	; precision12: # %bb.0: # %entry			; precision12: # %bb.0: # %entry
	; precision12-NEXT: subl $8, %esp			; precision12-NEXT: subl $8, %esp
	; precision12-NEXT: movl {{[0-9]+}}(%esp), %eax			; precision12-NEXT: movl {{[0-9]+}}(%esp), %eax
	; precision12-NEXT: movl %eax, %ecx			; precision12-NEXT: movl %eax, %ecx
	; precision12-NEXT: andl $8388607, %ecx # imm = 0x7FFFFF			; precision12-NEXT: andl $8388607, %ecx # imm = 0x7FFFFF
	; precision12-NEXT: orl $1065353216, %ecx # imm = 0x3F800000			; precision12-NEXT: orl $1065353216, %ecx # imm = 0x3F800000
	; precision12-NEXT: movl %ecx, (%esp)			; precision12-NEXT: movl %ecx, (%esp)
	; precision12-NEXT: andl $2139095040, %eax # imm = 0x7F800000
	; precision12-NEXT: shrl $23, %eax			; precision12-NEXT: shrl $23, %eax
				; precision12-NEXT: movzbl %al, %eax
	; precision12-NEXT: addl $-127, %eax			; precision12-NEXT: addl $-127, %eax
	; precision12-NEXT: movl %eax, {{[0-9]+}}(%esp)			; precision12-NEXT: movl %eax, {{[0-9]+}}(%esp)
	; precision12-NEXT: flds (%esp)			; precision12-NEXT: flds (%esp)
	; precision12-NEXT: fld %st(0)			; precision12-NEXT: fld %st(0)
	; precision12-NEXT: fmuls {{\.?LCPI[0-9]+_[0-9]+}}			; precision12-NEXT: fmuls {{\.?LCPI[0-9]+_[0-9]+}}
	; precision12-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}			; precision12-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}
	; precision12-NEXT: fmul %st(1), %st			; precision12-NEXT: fmul %st(1), %st
	; precision12-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}			; precision12-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}
	; precision12-NEXT: fmulp %st, %st(1)			; precision12-NEXT: fmulp %st, %st(1)
	; precision12-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}			; precision12-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}
	; precision12-NEXT: fildl {{[0-9]+}}(%esp)			; precision12-NEXT: fildl {{[0-9]+}}(%esp)
	; precision12-NEXT: fmuls {{\.?LCPI[0-9]+_[0-9]+}}			; precision12-NEXT: fmuls {{\.?LCPI[0-9]+_[0-9]+}}
	; precision12-NEXT: faddp %st, %st(1)			; precision12-NEXT: faddp %st, %st(1)
	; precision12-NEXT: addl $8, %esp			; precision12-NEXT: addl $8, %esp
	; precision12-NEXT: retl			; precision12-NEXT: retl
	;			;
	; precision18-LABEL: f6:			; precision18-LABEL: f6:
	; precision18: # %bb.0: # %entry			; precision18: # %bb.0: # %entry
	; precision18-NEXT: subl $8, %esp			; precision18-NEXT: subl $8, %esp
	; precision18-NEXT: movl {{[0-9]+}}(%esp), %eax			; precision18-NEXT: movl {{[0-9]+}}(%esp), %eax
	; precision18-NEXT: movl %eax, %ecx			; precision18-NEXT: movl %eax, %ecx
	; precision18-NEXT: andl $8388607, %ecx # imm = 0x7FFFFF			; precision18-NEXT: andl $8388607, %ecx # imm = 0x7FFFFF
	; precision18-NEXT: orl $1065353216, %ecx # imm = 0x3F800000			; precision18-NEXT: orl $1065353216, %ecx # imm = 0x3F800000
	; precision18-NEXT: movl %ecx, (%esp)			; precision18-NEXT: movl %ecx, (%esp)
	; precision18-NEXT: andl $2139095040, %eax # imm = 0x7F800000
	; precision18-NEXT: shrl $23, %eax			; precision18-NEXT: shrl $23, %eax
				; precision18-NEXT: movzbl %al, %eax
	; precision18-NEXT: addl $-127, %eax			; precision18-NEXT: addl $-127, %eax
	; precision18-NEXT: movl %eax, {{[0-9]+}}(%esp)			; precision18-NEXT: movl %eax, {{[0-9]+}}(%esp)
	; precision18-NEXT: flds (%esp)			; precision18-NEXT: flds (%esp)
	; precision18-NEXT: fld %st(0)			; precision18-NEXT: fld %st(0)
	; precision18-NEXT: fmuls {{\.?LCPI[0-9]+_[0-9]+}}			; precision18-NEXT: fmuls {{\.?LCPI[0-9]+_[0-9]+}}
	; precision18-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}			; precision18-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}
	; precision18-NEXT: fmul %st(1), %st			; precision18-NEXT: fmul %st(1), %st
	; precision18-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}			; precision18-NEXT: fadds {{\.?LCPI[0-9]+_[0-9]+}}
	[...]
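
The `limited-prec.ll` changes above go the other way for a logical right shift: `andl $2139095040` followed by `shrl $23` becomes `shrl $23` followed by `movzbl %al`, which is an AND with `0xFF`. A minimal Python sketch of the underlying identity, `(x & m) >> c == (x >> c) & (m >> c)`, is given below; it illustrates the arithmetic only, and the helper names are hypothetical rather than anything taken from the patch.

```python
import random

BITS = 32


def and_then_lshr(x, c, m):
    """Old ordering: mask first, then logical shift right."""
    return (x & m) >> c


def lshr_then_and(x, c, m):
    """New ordering: shift first, then mask with the shifted-down mask."""
    # Shifting the mask by the same amount keeps exactly the same bits of x.
    return (x >> c) & (m >> c)


def check(c, m, trials=10000, seed=0):
    """Compare both orderings on random unsigned 32-bit inputs (a spot check)."""
    rng = random.Random(seed)
    return all(
        and_then_lshr(x, c, m) == lshr_then_and(x, c, m)
        for x in (rng.getrandbits(BITS) for _ in range(trials))
    )
```

Here `0x7F800000 >> 23` is `0xFF`, so the post-shift mask fits a zero-extending byte move instead of a 32-bit immediate.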

llvm/test/CodeGen/X86/movmsk-cmp.ll

	[...]
	; SSE-LABEL: movmsk_v16i8:			; SSE-LABEL: movmsk_v16i8:
	; SSE: # %bb.0:			; SSE: # %bb.0:
	; SSE-NEXT: pcmpeqb %xmm1, %xmm0			; SSE-NEXT: pcmpeqb %xmm1, %xmm0
	; SSE-NEXT: pmovmskb %xmm0, %eax			; SSE-NEXT: pmovmskb %xmm0, %eax
	; SSE-NEXT: movl %eax, %ecx			; SSE-NEXT: movl %eax, %ecx
	; SSE-NEXT: shrl $15, %ecx			; SSE-NEXT: shrl $15, %ecx
	; SSE-NEXT: movl %eax, %edx			; SSE-NEXT: movl %eax, %edx
	; SSE-NEXT: shrl $8, %edx			; SSE-NEXT: shrl $8, %edx
	; SSE-NEXT: andl $1, %edx
	; SSE-NEXT: andl $8, %eax
	; SSE-NEXT: shrl $3, %eax			; SSE-NEXT: shrl $3, %eax
	; SSE-NEXT: xorl %edx, %eax			; SSE-NEXT: xorl %edx, %eax
	; SSE-NEXT: andl %ecx, %eax			; SSE-NEXT: andl %ecx, %eax
	; SSE-NEXT: # kill: def $al killed $al killed $eax			; SSE-NEXT: # kill: def $al killed $al killed $eax
	; SSE-NEXT: retq			; SSE-NEXT: retq
	;			;
	; AVX1OR2-LABEL: movmsk_v16i8:			; AVX1OR2-LABEL: movmsk_v16i8:
	; AVX1OR2: # %bb.0:			; AVX1OR2: # %bb.0:
	; AVX1OR2-NEXT: vpcmpeqb %xmm1, %xmm0, %xmm0			; AVX1OR2-NEXT: vpcmpeqb %xmm1, %xmm0, %xmm0
	; AVX1OR2-NEXT: vpmovmskb %xmm0, %eax			; AVX1OR2-NEXT: vpmovmskb %xmm0, %eax
	; AVX1OR2-NEXT: movl %eax, %ecx			; AVX1OR2-NEXT: movl %eax, %ecx
	; AVX1OR2-NEXT: shrl $15, %ecx			; AVX1OR2-NEXT: shrl $15, %ecx
	; AVX1OR2-NEXT: movl %eax, %edx			; AVX1OR2-NEXT: movl %eax, %edx
	; AVX1OR2-NEXT: shrl $8, %edx			; AVX1OR2-NEXT: shrl $8, %edx
	; AVX1OR2-NEXT: andl $1, %edx
	; AVX1OR2-NEXT: andl $8, %eax
	; AVX1OR2-NEXT: shrl $3, %eax			; AVX1OR2-NEXT: shrl $3, %eax
	; AVX1OR2-NEXT: xorl %edx, %eax			; AVX1OR2-NEXT: xorl %edx, %eax
	; AVX1OR2-NEXT: andl %ecx, %eax			; AVX1OR2-NEXT: andl %ecx, %eax
	; AVX1OR2-NEXT: # kill: def $al killed $al killed $eax			; AVX1OR2-NEXT: # kill: def $al killed $al killed $eax
	; AVX1OR2-NEXT: retq			; AVX1OR2-NEXT: retq
	;			;
	; KNL-LABEL: movmsk_v16i8:			; KNL-LABEL: movmsk_v16i8:
	; KNL: # %bb.0:			; KNL: # %bb.0:
	[...]
	; TODO: Replace shift+mask chain with AND+CMP.			; TODO: Replace shift+mask chain with AND+CMP.
	define i1 @movmsk_v4i32(<4 x i32> %x, <4 x i32> %y) {			define i1 @movmsk_v4i32(<4 x i32> %x, <4 x i32> %y) {
	; SSE-LABEL: movmsk_v4i32:			; SSE-LABEL: movmsk_v4i32:
	; SSE: # %bb.0:			; SSE: # %bb.0:
	; SSE-NEXT: pcmpgtd %xmm0, %xmm1			; SSE-NEXT: pcmpgtd %xmm0, %xmm1
	; SSE-NEXT: movmskps %xmm1, %eax			; SSE-NEXT: movmskps %xmm1, %eax
	; SSE-NEXT: movl %eax, %ecx			; SSE-NEXT: movl %eax, %ecx
	; SSE-NEXT: shrb $3, %cl			; SSE-NEXT: shrb $3, %cl
	; SSE-NEXT: andb $4, %al
	; SSE-NEXT: shrb $2, %al			; SSE-NEXT: shrb $2, %al
				; SSE-NEXT: andb $1, %al
	; SSE-NEXT: xorb %cl, %al			; SSE-NEXT: xorb %cl, %al
	; SSE-NEXT: # kill: def $al killed $al killed $eax			; SSE-NEXT: # kill: def $al killed $al killed $eax
	; SSE-NEXT: retq			; SSE-NEXT: retq
	;			;
	; AVX1OR2-LABEL: movmsk_v4i32:			; AVX1OR2-LABEL: movmsk_v4i32:
	; AVX1OR2: # %bb.0:			; AVX1OR2: # %bb.0:
	; AVX1OR2-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0			; AVX1OR2-NEXT: vpcmpgtd %xmm0, %xmm1, %xmm0
	; AVX1OR2-NEXT: vmovmskps %xmm0, %eax			; AVX1OR2-NEXT: vmovmskps %xmm0, %eax
	; AVX1OR2-NEXT: movl %eax, %ecx			; AVX1OR2-NEXT: movl %eax, %ecx
	; AVX1OR2-NEXT: shrb $3, %cl			; AVX1OR2-NEXT: shrb $3, %cl
	; AVX1OR2-NEXT: andb $4, %al
	; AVX1OR2-NEXT: shrb $2, %al			; AVX1OR2-NEXT: shrb $2, %al
				; AVX1OR2-NEXT: andb $1, %al
	; AVX1OR2-NEXT: xorb %cl, %al			; AVX1OR2-NEXT: xorb %cl, %al
	; AVX1OR2-NEXT: # kill: def $al killed $al killed $eax			; AVX1OR2-NEXT: # kill: def $al killed $al killed $eax
	; AVX1OR2-NEXT: retq			; AVX1OR2-NEXT: retq
	;			;
	; KNL-LABEL: movmsk_v4i32:			; KNL-LABEL: movmsk_v4i32:
	; KNL: # %bb.0:			; KNL: # %bb.0:
	; KNL-NEXT: # kill: def $xmm1 killed $xmm1 def $zmm1			; KNL-NEXT: # kill: def $xmm1 killed $xmm1 def $zmm1
	; KNL-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0			; KNL-NEXT: # kill: def $xmm0 killed $xmm0 def $zmm0
	[...]

llvm/test/CodeGen/X86/pr15267.ll

[...]
; CHECK-NEXT: retq		; CHECK-NEXT: retq
ret <4 x i64> %sext		ret <4 x i64> %sext
}		}

define <16 x i4> @test4(ptr %in) nounwind {		define <16 x i4> @test4(ptr %in) nounwind {
; CHECK-LABEL: test4:		; CHECK-LABEL: test4:
; CHECK: # %bb.0:		; CHECK: # %bb.0:
; CHECK-NEXT: movq (%rdi), %rax		; CHECK-NEXT: movq (%rdi), %rax
; CHECK-NEXT: movl %eax, %ecx		; CHECK-NEXT: movl %eax, %ecx
; CHECK-NEXT: shrl $4, %ecx
; CHECK-NEXT: andl $15, %ecx		; CHECK-NEXT: andl $15, %ecx
; CHECK-NEXT: movl %eax, %edx		; CHECK-NEXT: vmovd %ecx, %xmm0
; CHECK-NEXT: andl $15, %edx		; CHECK-NEXT: movzbl %al, %ecx
; CHECK-NEXT: vmovd %edx, %xmm0		; CHECK-NEXT: shrl $4, %ecx
; CHECK-NEXT: vpinsrb $1, %ecx, %xmm0, %xmm0		; CHECK-NEXT: vpinsrb $1, %ecx, %xmm0, %xmm0
; CHECK-NEXT: movl %eax, %ecx		; CHECK-NEXT: movl %eax, %ecx
; CHECK-NEXT: shrl $8, %ecx		; CHECK-NEXT: shrl $8, %ecx
; CHECK-NEXT: andl $15, %ecx		; CHECK-NEXT: andl $15, %ecx
; CHECK-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0		; CHECK-NEXT: vpinsrb $2, %ecx, %xmm0, %xmm0
; CHECK-NEXT: movl %eax, %ecx		; CHECK-NEXT: movzwl %ax, %ecx
; CHECK-NEXT: shrl $12, %ecx		; CHECK-NEXT: shrl $12, %ecx
; CHECK-NEXT: andl $15, %ecx
; CHECK-NEXT: vpinsrb $3, %ecx, %xmm0, %xmm0		; CHECK-NEXT: vpinsrb $3, %ecx, %xmm0, %xmm0
; CHECK-NEXT: movl %eax, %ecx		; CHECK-NEXT: movl %eax, %ecx
; CHECK-NEXT: shrl $16, %ecx		; CHECK-NEXT: shrl $16, %ecx
; CHECK-NEXT: andl $15, %ecx		; CHECK-NEXT: andl $15, %ecx
; CHECK-NEXT: vpinsrb $4, %ecx, %xmm0, %xmm0		; CHECK-NEXT: vpinsrb $4, %ecx, %xmm0, %xmm0
; CHECK-NEXT: movl %eax, %ecx		; CHECK-NEXT: movl %eax, %ecx
; CHECK-NEXT: shrl $20, %ecx		; CHECK-NEXT: shrl $20, %ecx
; CHECK-NEXT: andl $15, %ecx		; CHECK-NEXT: andl $15, %ecx
[...]

llvm/test/CodeGen/X86/pr26350.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc -disable-constant-hoisting < %s | FileCheck %s			; RUN: llc -disable-constant-hoisting < %s | FileCheck %s
	target datalayout = "e-m:e-p:32:32-f64:32:64-f80:32-n8:16:32-S128"			target datalayout = "e-m:e-p:32:32-f64:32:64-f80:32-n8:16:32-S128"
	target triple = "i386-unknown-linux-gnu"			target triple = "i386-unknown-linux-gnu"

	@d = global i32 8, align 4			@d = global i32 8, align 4

	define i32 @main() {			define i32 @main() {
	; CHECK-LABEL: main:			; CHECK-LABEL: main:
	; CHECK: # %bb.0: # %entry			; CHECK: # %bb.0: # %entry
	; CHECK-NEXT: movl d, %eax			; CHECK-NEXT: movl d, %eax
	; CHECK-NEXT: movl %eax, %ecx			; CHECK-NEXT: movl %eax, %ecx
	; CHECK-NEXT: shrl $31, %ecx			; CHECK-NEXT: shrl $31, %ecx
				; CHECK-NEXT: andl $8, %eax
	; CHECK-NEXT: addl %eax, %eax			; CHECK-NEXT: addl %eax, %eax
	; CHECK-NEXT: andl $16, %eax
	; CHECK-NEXT: cmpl $-1, %eax			; CHECK-NEXT: cmpl $-1, %eax
	; CHECK-NEXT: sbbl $0, %ecx			; CHECK-NEXT: sbbl $0, %ecx
	; CHECK-NEXT: setb %al			; CHECK-NEXT: setb %al
	; CHECK-NEXT: movzbl %al, %eax			; CHECK-NEXT: movzbl %al, %eax
	; CHECK-NEXT: retl			; CHECK-NEXT: retl
	entry:			entry:
	%load = load i32, ptr @d, align 4			%load = load i32, ptr @d, align 4
	%conv1 = zext i32 %load to i64			%conv1 = zext i32 %load to i64
	%shl = shl i64 %conv1, 1			%shl = shl i64 %conv1, 1
	%mul = and i64 %shl, 4294967312			%mul = and i64 %shl, 4294967312
	%cmp = icmp ugt i64 4294967295, %mul			%cmp = icmp ugt i64 4294967295, %mul
	%zext = zext i1 %cmp to i32			%zext = zext i1 %cmp to i32
	ret i32 %zext			ret i32 %zext
	}			}

llvm/test/CodeGen/X86/pr32282.ll

	[...]
	; X86-NEXT: orl %eax, %edx			; X86-NEXT: orl %eax, %edx
	; X86-NEXT: setne {{[0-9]+}}(%esp)			; X86-NEXT: setne {{[0-9]+}}(%esp)
	; X86-NEXT: popl %eax			; X86-NEXT: popl %eax
	; X86-NEXT: retl			; X86-NEXT: retl
	;			;
	; X64-LABEL: foo:			; X64-LABEL: foo:
	; X64: # %bb.0:			; X64: # %bb.0:
	; X64-NEXT: movq %rdi, %rax			; X64-NEXT: movq %rdi, %rax
	; X64-NEXT: movq d(%rip), %rcx			; X64-NEXT: movq d(%rip), %rdx
	; X64-NEXT: movabsq $3013716102212485120, %rdx # imm = 0x29D2DED3DE400000			; X64-NEXT: notq %rdx
	; X64-NEXT: andnq %rdx, %rcx, %rcx			; X64-NEXT: shrq $21, %rdx
	; X64-NEXT: shrq $21, %rcx			; X64-NEXT: movabsq $1437051821810, %rcx # imm = 0x14E96F69EF2
				; X64-NEXT: andq %rdx, %rcx
	; X64-NEXT: addq $7, %rcx			; X64-NEXT: addq $7, %rcx
	; X64-NEXT: movq %rdi, %rdx			; X64-NEXT: movq %rdi, %rdx
	; X64-NEXT: orq %rcx, %rdx			; X64-NEXT: orq %rcx, %rdx
	; X64-NEXT: shrq $32, %rdx			; X64-NEXT: shrq $32, %rdx
	; X64-NEXT: je .LBB0_1			; X64-NEXT: je .LBB0_1
	; X64-NEXT: # %bb.2:			; X64-NEXT: # %bb.2:
	; X64-NEXT: cqto			; X64-NEXT: cqto
	; X64-NEXT: idivq %rcx			; X64-NEXT: idivq %rcx
	[...]

llvm/test/CodeGen/X86/pr45995.ll

	[...]
	; CHECK-NEXT: .cfi_offset rbx, -32			; CHECK-NEXT: .cfi_offset rbx, -32
	; CHECK-NEXT: .cfi_offset r14, -24			; CHECK-NEXT: .cfi_offset r14, -24
	; CHECK-NEXT: .cfi_offset rbp, -16			; CHECK-NEXT: .cfi_offset rbp, -16
	; CHECK-NEXT: vpslld xmm0, xmm0, 31			; CHECK-NEXT: vpslld xmm0, xmm0, 31
	; CHECK-NEXT: vmovmskps edi, xmm0			; CHECK-NEXT: vmovmskps edi, xmm0
	; CHECK-NEXT: mov ebx, edi			; CHECK-NEXT: mov ebx, edi
	; CHECK-NEXT: shr bl, 3			; CHECK-NEXT: shr bl, 3
	; CHECK-NEXT: mov ebp, edi			; CHECK-NEXT: mov ebp, edi
	; CHECK-NEXT: and bpl, 4
	; CHECK-NEXT: shr bpl, 2			; CHECK-NEXT: shr bpl, 2
				; CHECK-NEXT: and bpl, 1
	; CHECK-NEXT: mov r14d, edi			; CHECK-NEXT: mov r14d, edi
	; CHECK-NEXT: and r14b, 2
	; CHECK-NEXT: shr r14b			; CHECK-NEXT: shr r14b
				; CHECK-NEXT: and r14b, 1
	; CHECK-NEXT: call print_i1@PLT			; CHECK-NEXT: call print_i1@PLT
	; CHECK-NEXT: movzx edi, r14b			; CHECK-NEXT: movzx edi, r14b
	; CHECK-NEXT: call print_i1@PLT			; CHECK-NEXT: call print_i1@PLT
	; CHECK-NEXT: movzx edi, bpl			; CHECK-NEXT: movzx edi, bpl
	; CHECK-NEXT: call print_i1@PLT			; CHECK-NEXT: call print_i1@PLT
	; CHECK-NEXT: movzx edi, bl			; CHECK-NEXT: movzx edi, bl
	; CHECK-NEXT: call print_i1@PLT			; CHECK-NEXT: call print_i1@PLT
	; CHECK-NEXT: pop rbx			; CHECK-NEXT: pop rbx
	[...]
	; CHECK-NEXT: .cfi_offset r15, -24			; CHECK-NEXT: .cfi_offset r15, -24
	; CHECK-NEXT: .cfi_offset rbp, -16			; CHECK-NEXT: .cfi_offset rbp, -16
	; CHECK-NEXT: vpslld xmm1, xmm1, 31			; CHECK-NEXT: vpslld xmm1, xmm1, 31
	; CHECK-NEXT: vmovmskps ebx, xmm1			; CHECK-NEXT: vmovmskps ebx, xmm1
	; CHECK-NEXT: mov eax, ebx			; CHECK-NEXT: mov eax, ebx
	; CHECK-NEXT: shr al, 3			; CHECK-NEXT: shr al, 3
	; CHECK-NEXT: mov byte ptr [rsp + 7], al # 1-byte Spill			; CHECK-NEXT: mov byte ptr [rsp + 7], al # 1-byte Spill
	; CHECK-NEXT: mov r14d, ebx			; CHECK-NEXT: mov r14d, ebx
	; CHECK-NEXT: and r14b, 4
	; CHECK-NEXT: shr r14b, 2			; CHECK-NEXT: shr r14b, 2
				; CHECK-NEXT: and r14b, 1
	; CHECK-NEXT: mov r15d, ebx			; CHECK-NEXT: mov r15d, ebx
	; CHECK-NEXT: and r15b, 2
	; CHECK-NEXT: shr r15b			; CHECK-NEXT: shr r15b
				; CHECK-NEXT: and r15b, 1
	; CHECK-NEXT: vpslld xmm0, xmm0, 31			; CHECK-NEXT: vpslld xmm0, xmm0, 31
	; CHECK-NEXT: vmovmskps edi, xmm0			; CHECK-NEXT: vmovmskps edi, xmm0
	; CHECK-NEXT: mov r12d, edi			; CHECK-NEXT: mov r12d, edi
	; CHECK-NEXT: shr r12b, 3			; CHECK-NEXT: shr r12b, 3
	; CHECK-NEXT: mov r13d, edi			; CHECK-NEXT: mov r13d, edi
	; CHECK-NEXT: and r13b, 4
	; CHECK-NEXT: shr r13b, 2			; CHECK-NEXT: shr r13b, 2
				; CHECK-NEXT: and r13b, 1
	; CHECK-NEXT: mov ebp, edi			; CHECK-NEXT: mov ebp, edi
	; CHECK-NEXT: and bpl, 2
	; CHECK-NEXT: shr bpl			; CHECK-NEXT: shr bpl
				; CHECK-NEXT: and bpl, 1
	; CHECK-NEXT: call print_i1@PLT			; CHECK-NEXT: call print_i1@PLT
	; CHECK-NEXT: movzx edi, bpl			; CHECK-NEXT: movzx edi, bpl
	; CHECK-NEXT: call print_i1@PLT			; CHECK-NEXT: call print_i1@PLT
	; CHECK-NEXT: movzx edi, r13b			; CHECK-NEXT: movzx edi, r13b
	; CHECK-NEXT: call print_i1@PLT			; CHECK-NEXT: call print_i1@PLT
	; CHECK-NEXT: movzx edi, r12b			; CHECK-NEXT: movzx edi, r12b
	; CHECK-NEXT: call print_i1@PLT			; CHECK-NEXT: call print_i1@PLT
	; CHECK-NEXT: mov edi, ebx			; CHECK-NEXT: mov edi, ebx
	[...]

llvm/test/CodeGen/X86/pull-binop-through-shift.ll

	[...]
	; X64-NEXT: shrl $8, %eax			; X64-NEXT: shrl $8, %eax
	; X64-NEXT: andl $8388352, %eax # imm = 0x7FFF00			; X64-NEXT: andl $8388352, %eax # imm = 0x7FFF00
	; X64-NEXT: movl %eax, (%rsi)			; X64-NEXT: movl %eax, (%rsi)
	; X64-NEXT: retq			; X64-NEXT: retq
	;			;
	; X86-LABEL: and_nosignbit_lshr:			; X86-LABEL: and_nosignbit_lshr:
	; X86: # %bb.0:			; X86: # %bb.0:
	; X86-NEXT: movl 8(%esp), %ecx			; X86-NEXT: movl 8(%esp), %ecx
	; X86-NEXT: movl $2147418112, %eax # imm = 0x7FFF0000			; X86-NEXT: movl 4(%esp), %eax
	; X86-NEXT: andl 4(%esp), %eax
	; X86-NEXT: shrl $8, %eax			; X86-NEXT: shrl $8, %eax
				; X86-NEXT: andl $8388352, %eax # imm = 0x7FFF00
	; X86-NEXT: movl %eax, (%ecx)			; X86-NEXT: movl %eax, (%ecx)
	; X86-NEXT: retl			; X86-NEXT: retl
	%t0 = and i32 %x, 2147418112 ; 0x7FFF0000			%t0 = and i32 %x, 2147418112 ; 0x7FFF0000
	%r = lshr i32 %t0, 8			%r = lshr i32 %t0, 8
	store i32 %r, ptr %dst			store i32 %r, ptr %dst
	ret i32 %r			ret i32 %r
	}			}
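
The X86 change in `and_nosignbit_lshr` relies on the fact that masking before a logical right shift is the same as shifting first and masking with the shifted constant. A minimal sketch of that identity for 32-bit values follows; the helper names and the sample list are illustrative assumptions, not part of the patch.

```python
MASK32 = 0xFFFFFFFF  # model i32 arithmetic with Python's unbounded ints

def and_then_lshr(x):
    # IR form: %t0 = and i32 %x, 0x7FFF0000 ; %r = lshr i32 %t0, 8
    return ((x & MASK32) & 0x7FFF0000) >> 8

def lshr_then_and(x):
    # New X86 output: shrl $8 ; andl $0x7FFF00
    return ((x & MASK32) >> 8) & 0x7FFF00

# 0x7FFF00 is simply 0x7FFF0000 >> 8
assert 0x7FFF0000 >> 8 == 0x7FFF00 == 8388352

samples = [0, 1, 0x7FFF0000, 0x80000000, 0xFFFFFFFF, 0x12345678, 0xDEADBEEF]
assert all(and_then_lshr(x) == lshr_then_and(x) for x in samples)
```

Because the mask clears the sign bit, the `ashr` variant of this test reduces to the same computation.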

[162 lines not shown]
	; X64-NEXT: shrl $8, %eax			; X64-NEXT: shrl $8, %eax
	; X64-NEXT: andl $8388352, %eax # imm = 0x7FFF00			; X64-NEXT: andl $8388352, %eax # imm = 0x7FFF00
	; X64-NEXT: movl %eax, (%rsi)			; X64-NEXT: movl %eax, (%rsi)
	; X64-NEXT: retq			; X64-NEXT: retq
	;			;
	; X86-LABEL: and_nosignbit_ashr:			; X86-LABEL: and_nosignbit_ashr:
	; X86: # %bb.0:			; X86: # %bb.0:
	; X86-NEXT: movl 8(%esp), %ecx			; X86-NEXT: movl 8(%esp), %ecx
	; X86-NEXT: movl $2147418112, %eax # imm = 0x7FFF0000			; X86-NEXT: movl 4(%esp), %eax
	; X86-NEXT: andl 4(%esp), %eax
	; X86-NEXT: shrl $8, %eax			; X86-NEXT: shrl $8, %eax
				; X86-NEXT: andl $8388352, %eax # imm = 0x7FFF00
	; X86-NEXT: movl %eax, (%ecx)			; X86-NEXT: movl %eax, (%ecx)
	; X86-NEXT: retl			; X86-NEXT: retl
	%t0 = and i32 %x, 2147418112 ; 0x7FFF0000			%t0 = and i32 %x, 2147418112 ; 0x7FFF0000
	%r = ashr i32 %t0, 8			%r = ashr i32 %t0, 8
	store i32 %r, ptr %dst			store i32 %r, ptr %dst
	ret i32 %r			ret i32 %r
	}			}

[134 lines not shown]

llvm/test/CodeGen/X86/rev16.ll

[23 lines not shown]
; X64-NEXT: retq
%mask_r8 = and i32 %r8, 16711935		%mask_r8 = and i32 %r8, 16711935
%tmp = or i32 %mask_l8, %mask_r8		%tmp = or i32 %mask_l8, %mask_r8
ret i32 %tmp		ret i32 %tmp
}		}

define i32 @not_rev16(i32 %a) {		define i32 @not_rev16(i32 %a) {
; X86-LABEL: not_rev16:		; X86-LABEL: not_rev16:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: movl %ecx, %eax		; X86-NEXT: movl %eax, %ecx
; X86-NEXT: shll $8, %eax
; X86-NEXT: shrl $8, %ecx		; X86-NEXT: shrl $8, %ecx
; X86-NEXT: andl $65280, %ecx # imm = 0xFF00		; X86-NEXT: andl $65280, %ecx # imm = 0xFF00
; X86-NEXT: andl $16711680, %eax # imm = 0xFF0000		; X86-NEXT: andl $65280, %eax # imm = 0xFF00
		; X86-NEXT: shll $8, %eax
; X86-NEXT: orl %ecx, %eax		; X86-NEXT: orl %ecx, %eax
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: not_rev16:		; X64-LABEL: not_rev16:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movl %edi, %eax		; X64-NEXT: movl %edi, %eax
; X64-NEXT: shll $8, %eax		; X64-NEXT: shrl $8, %eax
; X64-NEXT: shrl $8, %edi		; X64-NEXT: andl $65280, %eax # imm = 0xFF00
; X64-NEXT: andl $65280, %edi # imm = 0xFF00		; X64-NEXT: andl $65280, %edi # imm = 0xFF00
; X64-NEXT: andl $16711680, %eax # imm = 0xFF0000		; X64-NEXT: shll $8, %edi
; X64-NEXT: orl %edi, %eax		; X64-NEXT: orl %edi, %eax
; X64-NEXT: retq		; X64-NEXT: retq
%l8 = shl i32 %a, 8		%l8 = shl i32 %a, 8
%r8 = lshr i32 %a, 8		%r8 = lshr i32 %a, 8
%mask_r8 = and i32 %r8, 4278255360		%mask_r8 = and i32 %r8, 4278255360
%mask_l8 = and i32 %l8, 16711935		%mask_l8 = and i32 %l8, 16711935
%tmp = or i32 %mask_r8, %mask_l8		%tmp = or i32 %mask_r8, %mask_l8
ret i32 %tmp		ret i32 %tmp
[55 lines not shown]
; X64-NEXT: retq
%mask_r8 = and i32 %r8, 16711935		%mask_r8 = and i32 %r8, 16711935
%tmp = or i32 %mask_r8, %mask_l8		%tmp = or i32 %mask_r8, %mask_l8
ret i32 %tmp		ret i32 %tmp
}		}

define i32 @different_shift_amount(i32 %a) {		define i32 @different_shift_amount(i32 %a) {
; X86-LABEL: different_shift_amount:		; X86-LABEL: different_shift_amount:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NEXT: movl %eax, %ecx		; X86-NEXT: movl %ecx, %eax
; X86-NEXT: shll $9, %ecx
; X86-NEXT: shrl $8, %eax		; X86-NEXT: shrl $8, %eax
; X86-NEXT: andl $-16712192, %ecx # imm = 0xFF00FE00		; X86-NEXT: andl $8355967, %ecx # imm = 0x7F807F
		; X86-NEXT: shll $9, %ecx
; X86-NEXT: andl $16711935, %eax # imm = 0xFF00FF		; X86-NEXT: andl $16711935, %eax # imm = 0xFF00FF
; X86-NEXT: orl %ecx, %eax		; X86-NEXT: orl %ecx, %eax
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: different_shift_amount:		; X64-LABEL: different_shift_amount:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movl %edi, %eax		; X64-NEXT: movl %edi, %eax
; X64-NEXT: shll $9, %eax		; X64-NEXT: shrl $8, %eax
; X64-NEXT: shrl $8, %edi		; X64-NEXT: andl $8355967, %edi # imm = 0x7F807F
; X64-NEXT: andl $-16712192, %eax # imm = 0xFF00FE00		; X64-NEXT: shll $9, %edi
; X64-NEXT: andl $16711935, %edi # imm = 0xFF00FF		; X64-NEXT: andl $16711935, %eax # imm = 0xFF00FF
; X64-NEXT: orl %edi, %eax		; X64-NEXT: orl %edi, %eax
; X64-NEXT: retq		; X64-NEXT: retq
%l8 = shl i32 %a, 9		%l8 = shl i32 %a, 9
%r8 = lshr i32 %a, 8		%r8 = lshr i32 %a, 8
%mask_l8 = and i32 %l8, 4278255360		%mask_l8 = and i32 %l8, 4278255360
%mask_r8 = and i32 %r8, 16711935		%mask_r8 = and i32 %r8, 16711935
%tmp = or i32 %mask_l8, %mask_r8		%tmp = or i32 %mask_l8, %mask_r8
ret i32 %tmp		ret i32 %tmp
[112 lines not shown]

llvm/test/CodeGen/X86/rotate-extract.ll

[160 lines not shown]

	; Result would overshift			; Result would overshift
	define i32 @no_extract_shrl(i32 %i) nounwind {			define i32 @no_extract_shrl(i32 %i) nounwind {
	; X86-LABEL: no_extract_shrl:			; X86-LABEL: no_extract_shrl:
	; X86: # %bb.0:			; X86: # %bb.0:
	; X86-NEXT: movl {{[0-9]+}}(%esp), %eax			; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X86-NEXT: movl %eax, %ecx			; X86-NEXT: movl %eax, %ecx
	; X86-NEXT: shrl $9, %ecx			; X86-NEXT: shrl $9, %ecx
	; X86-NEXT: andl $-8, %eax			; X86-NEXT: andl $120, %eax
	; X86-NEXT: shll $25, %eax			; X86-NEXT: shll $25, %eax
	; X86-NEXT: orl %ecx, %eax			; X86-NEXT: orl %ecx, %eax
	; X86-NEXT: retl			; X86-NEXT: retl
	;			;
	; X64-LABEL: no_extract_shrl:			; X64-LABEL: no_extract_shrl:
	; X64: # %bb.0:			; X64: # %bb.0:
	; X64-NEXT: movl %edi, %eax			; X64-NEXT: movl %edi, %eax
	; X64-NEXT: shrl $9, %eax			; X64-NEXT: shrl $9, %eax
	; X64-NEXT: andl $-8, %edi			; X64-NEXT: andl $120, %edi
	; X64-NEXT: shll $25, %edi			; X64-NEXT: shll $25, %edi
	; X64-NEXT: orl %edi, %eax			; X64-NEXT: orl %edi, %eax
	; X64-NEXT: retq			; X64-NEXT: retq
	%lhs_div = lshr i32 %i, 3			%lhs_div = lshr i32 %i, 3
	%rhs_div = lshr i32 %i, 9			%rhs_div = lshr i32 %i, 9
	%lhs_shift = shl i32 %lhs_div, 28			%lhs_shift = shl i32 %lhs_div, 28
	%out = or i32 %lhs_shift, %rhs_div			%out = or i32 %lhs_shift, %rhs_div
	ret i32 %out			ret i32 %out
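
In `no_extract_shrl` the and-mask shrinks from `-8` to `120`. Both forms compute the same value, because the following `shll $25` discards every bit above bit 6 anyway. A small sketch checking this against the IR, with hypothetical helper names and 32-bit wraparound modelled explicitly:

```python
M32 = 0xFFFFFFFF  # model i32 wraparound

def ir_form(i):
    # %lhs_div = lshr %i, 3 ; %rhs_div = lshr %i, 9 ; %lhs_shift = shl %lhs_div, 28
    i &= M32
    return (((i >> 3) << 28) & M32) | (i >> 9)

def old_codegen(i):
    # andl $-8 ; shll $25 ; orl   (-8 as i32 is 0xFFFFFFF8)
    i &= M32
    return (((i & 0xFFFFFFF8) << 25) & M32) | (i >> 9)

def new_codegen(i):
    # andl $120 ; shll $25 ; orl  (120 = 0x78 keeps only bits 6:3)
    i &= M32
    return (((i & 120) << 25) & M32) | (i >> 9)

samples = [0, 0x78, 0x7F, 0xFFFFFFFF, 0x12345678, 0x80000200]
assert all(ir_form(i) == old_codegen(i) == new_codegen(i) for i in samples)
```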
[133 lines not shown]

llvm/test/CodeGen/X86/selectcc-to-shiftand.ll

[163 lines not shown]
	; ANY-NEXT: retq			; ANY-NEXT: retq
	%shl = select i1 %t, i8 128, i8 0			%shl = select i1 %t, i8 128, i8 0
	ret i8 %shl			ret i8 %shl
	}			}

	define i16 @sel_shift_bool_i16(i1 %t) {			define i16 @sel_shift_bool_i16(i1 %t) {
	; ANY-LABEL: sel_shift_bool_i16:			; ANY-LABEL: sel_shift_bool_i16:
	; ANY: # %bb.0:			; ANY: # %bb.0:
	; ANY-NEXT: movl %edi, %eax			; ANY-NEXT: shll $7, %edi
	; ANY-NEXT: andl $1, %eax			; ANY-NEXT: movzbl %dil, %eax
	; ANY-NEXT: shll $7, %eax
	; ANY-NEXT: # kill: def $ax killed $ax killed $eax			; ANY-NEXT: # kill: def $ax killed $ax killed $eax
	; ANY-NEXT: retq			; ANY-NEXT: retq
	%shl = select i1 %t, i16 128, i16 0			%shl = select i1 %t, i16 128, i16 0
	ret i16 %shl			ret i16 %shl
	}			}

	define i32 @sel_shift_bool_i32(i1 %t) {			define i32 @sel_shift_bool_i32(i1 %t) {
	; ANY-LABEL: sel_shift_bool_i32:			; ANY-LABEL: sel_shift_bool_i32:
[59 lines not shown]

llvm/test/CodeGen/X86/setcc.ll

[275 lines not shown]
; X64-NEXT: retq
%6 = xor i32 %5, 1		%6 = xor i32 %5, 1
ret i32 %6		ret i32 %6
}		}

define i16 @shift_and(i16 %a) {		define i16 @shift_and(i16 %a) {
; X86-LABEL: shift_and:		; X86-LABEL: shift_and:
; X86: ## %bb.0:		; X86: ## %bb.0:
; X86-NEXT: movzbl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movzbl {{[0-9]+}}(%esp), %eax
; X86-NEXT: andb $4, %al		; X86-NEXT: shrl $2, %eax
; X86-NEXT: shrb $2, %al		; X86-NEXT: andl $1, %eax
; X86-NEXT: movzbl %al, %eax
; X86-NEXT: ## kill: def $ax killed $ax killed $eax		; X86-NEXT: ## kill: def $ax killed $ax killed $eax
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-NOTBM-LABEL: shift_and:		; X64-NOTBM-LABEL: shift_and:
; X64-NOTBM: ## %bb.0:		; X64-NOTBM: ## %bb.0:
; X64-NOTBM-NEXT: movl %edi, %eax		; X64-NOTBM-NEXT: movl %edi, %eax
; X64-NOTBM-NEXT: shrl $10, %eax		; X64-NOTBM-NEXT: shrl $10, %eax
; X64-NOTBM-NEXT: andl $1, %eax		; X64-NOTBM-NEXT: andl $1, %eax
[46 lines not shown]

llvm/test/CodeGen/X86/shift-amount-mod.ll

[1,550 lines not shown]
; X64-NEXT: retq
%negaaddbitwidthaddb = add i64 %negaaddbitwidth, %b		%negaaddbitwidthaddb = add i64 %negaaddbitwidth, %b
%shifted = lshr i64 %val, %negaaddbitwidthaddb		%shifted = lshr i64 %val, %negaaddbitwidthaddb
ret i64 %shifted		ret i64 %shifted
}		}

define i16 @sh_trunc_sh(i64 %x) {		define i16 @sh_trunc_sh(i64 %x) {
; X32-LABEL: sh_trunc_sh:		; X32-LABEL: sh_trunc_sh:
; X32: # %bb.0:		; X32: # %bb.0:
; X32-NEXT: movl {{[0-9]+}}(%esp), %eax		; X32-NEXT: movzbl {{[0-9]+}}(%esp), %eax
; X32-NEXT: shrl $4, %eax		; X32-NEXT: shrl $4, %eax
; X32-NEXT: andl $15, %eax
; X32-NEXT: # kill: def $ax killed $ax killed $eax		; X32-NEXT: # kill: def $ax killed $ax killed $eax
; X32-NEXT: retl		; X32-NEXT: retl
;		;
; X64-LABEL: sh_trunc_sh:		; X64-LABEL: sh_trunc_sh:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movq %rdi, %rax		; X64-NEXT: movq %rdi, %rax
; X64-NEXT: shrq $36, %rax		; X64-NEXT: shrq $36, %rax
; X64-NEXT: andl $15, %eax		; X64-NEXT: andl $15, %eax
; X64-NEXT: # kill: def $ax killed $ax killed $rax		; X64-NEXT: # kill: def $ax killed $ax killed $rax
; X64-NEXT: retq		; X64-NEXT: retq
%s = lshr i64 %x, 24		%s = lshr i64 %x, 24
%t = trunc i64 %s to i16		%t = trunc i64 %s to i16
%r = lshr i16 %t, 12		%r = lshr i16 %t, 12
ret i16 %r		ret i16 %r
}		}
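
For `sh_trunc_sh`, the shift, truncate, shift sequence extracts bits 39:36 of the input, which is why the X64 output is `shrq $36` followed by `andl $15`. The sketch below checks that reading. The byte-load model for the new X32 output assumes the `movzbl` reads the low byte of the upper 32-bit half of the argument; the stack offset is hidden behind a regex in the test, so this is an assumption.

```python
M64 = 0xFFFFFFFFFFFFFFFF  # model i64

def ir_form(x):
    s = (x & M64) >> 24   # %s = lshr i64 %x, 24
    t = s & 0xFFFF        # %t = trunc i64 %s to i16
    return t >> 12        # %r = lshr i16 %t, 12

def x64_codegen(x):
    # shrq $36 ; andl $15
    return ((x & M64) >> 36) & 15

def x32_new_codegen(x):
    # movzbl <byte holding bits 39:32> ; shrl $4   (load offset is assumed)
    byte = ((x & M64) >> 32) & 0xFF
    return byte >> 4

samples = [0, 0xC500000000, M64, 0x123456789ABCDEF0, 1 << 36, 1 << 40]
assert all(ir_form(x) == x64_codegen(x) == x32_new_codegen(x) for x in samples)
```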

llvm/test/CodeGen/X86/shift-mask.ll

[107 lines not shown]
; X64-NEXT: retq
%1 = lshr i16 %a0, 3		%1 = lshr i16 %a0, 3
%2 = shl i16 %1, 3		%2 = shl i16 %1, 3
ret i16 %2		ret i16 %2
}		}

define i16 @test_i16_shl_lshr_1(i16 %a0) {		define i16 @test_i16_shl_lshr_1(i16 %a0) {
; X86-LABEL: test_i16_shl_lshr_1:		; X86-LABEL: test_i16_shl_lshr_1:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movzwl {{[0-9]+}}(%esp), %eax
		; X86-NEXT: andl $16376, %eax # imm = 0x3FF8
; X86-NEXT: shll $2, %eax		; X86-NEXT: shll $2, %eax
; X86-NEXT: andl $65504, %eax # imm = 0xFFE0
; X86-NEXT: # kill: def $ax killed $ax killed $eax		; X86-NEXT: # kill: def $ax killed $ax killed $eax
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-MASK-LABEL: test_i16_shl_lshr_1:		; X64-MASK-LABEL: test_i16_shl_lshr_1:
; X64-MASK: # %bb.0:		; X64-MASK: # %bb.0:
; X64-MASK-NEXT: # kill: def $edi killed $edi def $rdi		; X64-MASK-NEXT: # kill: def $edi killed $edi def $rdi
		; X64-MASK-NEXT: andl $16376, %edi # imm = 0x3FF8
; X64-MASK-NEXT: leal (,%rdi,4), %eax		; X64-MASK-NEXT: leal (,%rdi,4), %eax
; X64-MASK-NEXT: andl $65504, %eax # imm = 0xFFE0
; X64-MASK-NEXT: # kill: def $ax killed $ax killed $eax		; X64-MASK-NEXT: # kill: def $ax killed $ax killed $eax
; X64-MASK-NEXT: retq		; X64-MASK-NEXT: retq
;		;
; X64-SHIFT-LABEL: test_i16_shl_lshr_1:		; X64-SHIFT-LABEL: test_i16_shl_lshr_1:
; X64-SHIFT: # %bb.0:		; X64-SHIFT: # %bb.0:
; X64-SHIFT-NEXT: movzwl %di, %eax		; X64-SHIFT-NEXT: movzwl %di, %eax
; X64-SHIFT-NEXT: shrl $3, %eax		; X64-SHIFT-NEXT: shrl $3, %eax
; X64-SHIFT-NEXT: shll $5, %eax		; X64-SHIFT-NEXT: shll $5, %eax
[225 lines not shown]
; X64-SHIFT-NEXT: retq
%2 = lshr i8 %1, 5		%2 = lshr i8 %1, 5
ret i8 %2		ret i8 %2
}		}

define i8 @test_i8_lshr_lshr_2(i8 %a0) {		define i8 @test_i8_lshr_lshr_2(i8 %a0) {
; X86-LABEL: test_i8_lshr_lshr_2:		; X86-LABEL: test_i8_lshr_lshr_2:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movzbl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movzbl {{[0-9]+}}(%esp), %eax
		; X86-NEXT: andb $7, %al
; X86-NEXT: shlb $2, %al		; X86-NEXT: shlb $2, %al
; X86-NEXT: andb $28, %al
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-MASK-LABEL: test_i8_lshr_lshr_2:		; X64-MASK-LABEL: test_i8_lshr_lshr_2:
; X64-MASK: # %bb.0:		; X64-MASK: # %bb.0:
; X64-MASK-NEXT: # kill: def $edi killed $edi def $rdi		; X64-MASK-NEXT: # kill: def $edi killed $edi def $rdi
		; X64-MASK-NEXT: andb $7, %dil
; X64-MASK-NEXT: leal (,%rdi,4), %eax		; X64-MASK-NEXT: leal (,%rdi,4), %eax
; X64-MASK-NEXT: andb $28, %al
; X64-MASK-NEXT: # kill: def $al killed $al killed $eax		; X64-MASK-NEXT: # kill: def $al killed $al killed $eax
; X64-MASK-NEXT: retq		; X64-MASK-NEXT: retq
;		;
; X64-SHIFT-LABEL: test_i8_lshr_lshr_2:		; X64-SHIFT-LABEL: test_i8_lshr_lshr_2:
; X64-SHIFT: # %bb.0:		; X64-SHIFT: # %bb.0:
; X64-SHIFT-NEXT: movl %edi, %eax		; X64-SHIFT-NEXT: movl %edi, %eax
; X64-SHIFT-NEXT: shlb $5, %al		; X64-SHIFT-NEXT: shlb $5, %al
; X64-SHIFT-NEXT: shrb $3, %al		; X64-SHIFT-NEXT: shrb $3, %al
[35 lines not shown]
; X64-MASK-LABEL: test_i16_lshr_lshr_1:		; X64-MASK-LABEL: test_i16_lshr_lshr_1:
; X64-MASK: # %bb.0:		; X64-MASK: # %bb.0:
; X64-MASK-NEXT: movl %edi, %eax		; X64-MASK-NEXT: movl %edi, %eax
; X64-MASK-NEXT: shrl $2, %eax		; X64-MASK-NEXT: shrl $2, %eax
; X64-MASK-NEXT: andl $2047, %eax # imm = 0x7FF		; X64-MASK-NEXT: andl $2047, %eax # imm = 0x7FF
; X64-MASK-NEXT: # kill: def $ax killed $ax killed $eax		; X64-MASK-NEXT: # kill: def $ax killed $ax killed $eax
; X64-MASK-NEXT: retq		; X64-MASK-NEXT: retq
;		;
; X64-SHIFT2-LABEL: test_i16_lshr_lshr_1:		; X64-SHIFT-LABEL: test_i16_lshr_lshr_1:
; X64-SHIFT2: # %bb.0:		; X64-SHIFT: # %bb.0:
; X64-SHIFT2-NEXT: movl %edi, %eax		; X64-SHIFT-NEXT: shll $3, %edi
; X64-SHIFT2-NEXT: shrl $2, %eax		; X64-SHIFT-NEXT: movzwl %di, %eax
; X64-SHIFT2-NEXT: andl $2047, %eax # imm = 0x7FF		; X64-SHIFT-NEXT: shrl $5, %eax
; X64-SHIFT2-NEXT: # kill: def $ax killed $ax killed $eax		; X64-SHIFT-NEXT: # kill: def $ax killed $ax killed $eax
; X64-SHIFT2-NEXT: retq		; X64-SHIFT-NEXT: retq
;
; X64-TBM-LABEL: test_i16_lshr_lshr_1:
; X64-TBM: # %bb.0:
; X64-TBM-NEXT: bextrl $2818, %edi, %eax # imm = 0xB02
; X64-TBM-NEXT: # kill: def $ax killed $ax killed $eax
; X64-TBM-NEXT: retq
;
; X64-BMI-LABEL: test_i16_lshr_lshr_1:
; X64-BMI: # %bb.0:
; X64-BMI-NEXT: movl $2818, %eax # imm = 0xB02
; X64-BMI-NEXT: bextrl %eax, %edi, %eax
; X64-BMI-NEXT: # kill: def $ax killed $ax killed $eax
; X64-BMI-NEXT: retq
%1 = shl i16 %a0, 3		%1 = shl i16 %a0, 3
%2 = lshr i16 %1, 5		%2 = lshr i16 %1, 5
ret i16 %2		ret i16 %2
}		}

define i16 @test_i16_lshr_lshr_2(i16 %a0) {		define i16 @test_i16_lshr_lshr_2(i16 %a0) {
; X86-LABEL: test_i16_lshr_lshr_2:		; X86-LABEL: test_i16_lshr_lshr_2:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movzwl {{[0-9]+}}(%esp), %eax
		; X86-NEXT: andl $2047, %eax # imm = 0x7FF
; X86-NEXT: shll $2, %eax		; X86-NEXT: shll $2, %eax
; X86-NEXT: andl $8188, %eax # imm = 0x1FFC
; X86-NEXT: # kill: def $ax killed $ax killed $eax		; X86-NEXT: # kill: def $ax killed $ax killed $eax
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: test_i16_lshr_lshr_2:		; X64-MASK-LABEL: test_i16_lshr_lshr_2:
; X64: # %bb.0:		; X64-MASK: # %bb.0:
; X64-NEXT: # kill: def $edi killed $edi def $rdi		; X64-MASK-NEXT: # kill: def $edi killed $edi def $rdi
; X64-NEXT: leal (,%rdi,4), %eax		; X64-MASK-NEXT: andl $2047, %edi # imm = 0x7FF
; X64-NEXT: andl $8188, %eax # imm = 0x1FFC		; X64-MASK-NEXT: leal (,%rdi,4), %eax
; X64-NEXT: # kill: def $ax killed $ax killed $eax		; X64-MASK-NEXT: # kill: def $ax killed $ax killed $eax
; X64-NEXT: retq		; X64-MASK-NEXT: retq
		;
		; X64-SHIFT-LABEL: test_i16_lshr_lshr_2:
		; X64-SHIFT: # %bb.0:
		; X64-SHIFT-NEXT: shll $5, %edi
		; X64-SHIFT-NEXT: movzwl %di, %eax
		; X64-SHIFT-NEXT: shrl $3, %eax
		; X64-SHIFT-NEXT: # kill: def $ax killed $ax killed $eax
		; X64-SHIFT-NEXT: retq
%1 = shl i16 %a0, 5		%1 = shl i16 %a0, 5
%2 = lshr i16 %1, 3		%2 = lshr i16 %1, 3
ret i16 %2		ret i16 %2
}		}

define i32 @test_i32_lshr_lshr_0(i32 %a0) {		define i32 @test_i32_lshr_lshr_0(i32 %a0) {
; X86-LABEL: test_i32_lshr_lshr_0:		; X86-LABEL: test_i32_lshr_lshr_0:
; X86: # %bb.0:		; X86: # %bb.0:
[35 lines not shown]
; X64-SHIFT-NEXT: retq
%1 = shl i32 %a0, 3		%1 = shl i32 %a0, 3
%2 = lshr i32 %1, 5		%2 = lshr i32 %1, 5
ret i32 %2		ret i32 %2
}		}

define i32 @test_i32_lshr_lshr_2(i32 %a0) {		define i32 @test_i32_lshr_lshr_2(i32 %a0) {
; X86-LABEL: test_i32_lshr_lshr_2:		; X86-LABEL: test_i32_lshr_lshr_2:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movl $134217727, %eax # imm = 0x7FFFFFF
		; X86-NEXT: andl {{[0-9]+}}(%esp), %eax
; X86-NEXT: shll $2, %eax		; X86-NEXT: shll $2, %eax
; X86-NEXT: andl $536870908, %eax # imm = 0x1FFFFFFC
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-MASK-LABEL: test_i32_lshr_lshr_2:		; X64-MASK-LABEL: test_i32_lshr_lshr_2:
; X64-MASK: # %bb.0:		; X64-MASK: # %bb.0:
; X64-MASK-NEXT: # kill: def $edi killed $edi def $rdi		; X64-MASK-NEXT: # kill: def $edi killed $edi def $rdi
		; X64-MASK-NEXT: andl $134217727, %edi # imm = 0x7FFFFFF
; X64-MASK-NEXT: leal (,%rdi,4), %eax		; X64-MASK-NEXT: leal (,%rdi,4), %eax
; X64-MASK-NEXT: andl $536870908, %eax # imm = 0x1FFFFFFC
; X64-MASK-NEXT: retq		; X64-MASK-NEXT: retq
;		;
; X64-SHIFT-LABEL: test_i32_lshr_lshr_2:		; X64-SHIFT-LABEL: test_i32_lshr_lshr_2:
; X64-SHIFT: # %bb.0:		; X64-SHIFT: # %bb.0:
; X64-SHIFT-NEXT: movl %edi, %eax		; X64-SHIFT-NEXT: movl %edi, %eax
; X64-SHIFT-NEXT: shll $5, %eax		; X64-SHIFT-NEXT: shll $5, %eax
; X64-SHIFT-NEXT: shrl $3, %eax		; X64-SHIFT-NEXT: shrl $3, %eax
; X64-SHIFT-NEXT: retq		; X64-SHIFT-NEXT: retq
[77 lines not shown]
; X86-NEXT: movl {{[0-9]+}}(%esp), %edx		; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-NEXT: shldl $2, %eax, %edx		; X86-NEXT: shldl $2, %eax, %edx
; X86-NEXT: shll $2, %eax		; X86-NEXT: shll $2, %eax
; X86-NEXT: andl $536870911, %edx # imm = 0x1FFFFFFF		; X86-NEXT: andl $536870911, %edx # imm = 0x1FFFFFFF
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-MASK-LABEL: test_i64_lshr_lshr_2:		; X64-MASK-LABEL: test_i64_lshr_lshr_2:
; X64-MASK: # %bb.0:		; X64-MASK: # %bb.0:
; X64-MASK-NEXT: leaq (,%rdi,4), %rcx		; X64-MASK-NEXT: movabsq $576460752303423487, %rax # imm = 0x7FFFFFFFFFFFFFF
; X64-MASK-NEXT: movabsq $2305843009213693948, %rax # imm = 0x1FFFFFFFFFFFFFFC		; X64-MASK-NEXT: andq %rdi, %rax
; X64-MASK-NEXT: andq %rcx, %rax		; X64-MASK-NEXT: shlq $2, %rax
; X64-MASK-NEXT: retq		; X64-MASK-NEXT: retq
;		;
; X64-SHIFT-LABEL: test_i64_lshr_lshr_2:		; X64-SHIFT-LABEL: test_i64_lshr_lshr_2:
; X64-SHIFT: # %bb.0:		; X64-SHIFT: # %bb.0:
; X64-SHIFT-NEXT: movq %rdi, %rax		; X64-SHIFT-NEXT: movq %rdi, %rax
; X64-SHIFT-NEXT: shlq $5, %rax		; X64-SHIFT-NEXT: shlq $5, %rax
; X64-SHIFT-NEXT: shrq $3, %rax		; X64-SHIFT-NEXT: shrq $3, %rax
; X64-SHIFT-NEXT: retq		; X64-SHIFT-NEXT: retq
%1 = shl i64 %a0, 5		%1 = shl i64 %a0, 5
%2 = lshr i64 %1, 3		%2 = lshr i64 %1, 3
ret i64 %2		ret i64 %2
}		}
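
The X64-MASK change in `test_i64_lshr_lshr_2` swaps "shift, then mask with `0x1FFFFFFFFFFFFFFC`" for "mask with `0x7FFFFFFFFFFFFFF`, then shift". Both match the IR's `shl 5` followed by `lshr 3`. A short check of the three forms, using illustrative helper names and an explicit 64-bit wraparound:

```python
M64 = 0xFFFFFFFFFFFFFFFF  # model i64 wraparound

def ir_form(a):
    # %1 = shl i64 %a0, 5 ; %2 = lshr i64 %1, 3
    return (((a & M64) << 5) & M64) >> 3

def old_codegen(a):
    # leaq (,%rdi,4) ; movabsq $0x1FFFFFFFFFFFFFFC ; andq
    return (((a & M64) << 2) & M64) & 0x1FFFFFFFFFFFFFFC

def new_codegen(a):
    # movabsq $0x7FFFFFFFFFFFFFF ; andq ; shlq $2
    return (((a & M64) & 0x07FFFFFFFFFFFFFF) << 2) & M64

# The two immediates from the CHECK lines
assert 576460752303423487 == 0x07FFFFFFFFFFFFFF == (1 << 59) - 1
assert 2305843009213693948 == 0x1FFFFFFFFFFFFFFC == (1 << 61) - 4

samples = [0, 1, M64, 0x8000000000000000, 0x123456789ABCDEF0]
assert all(ir_form(a) == old_codegen(a) == new_codegen(a) for a in samples)
```

Neither 64-bit mask fits a sign-extended imm32, so both versions still need the `movabsq`.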
		;; NOTE: These prefixes are unused and the list is autogenerated. Do not add tests below this line:
		; X64-BMI: {{.*}}

llvm/test/CodeGen/X86/sttni.ll

[309 lines not shown]
	; X86-NEXT: pcmpestri $24, %xmm1, %xmm0			; X86-NEXT: pcmpestri $24, %xmm1, %xmm0
	; X86-NEXT: cmpl $16, %ecx			; X86-NEXT: cmpl $16, %ecx
	; X86-NEXT: jne .LBB8_2			; X86-NEXT: jne .LBB8_2
	; X86-NEXT: # %bb.1:			; X86-NEXT: # %bb.1:
	; X86-NEXT: xorl %eax, %eax			; X86-NEXT: xorl %eax, %eax
	; X86-NEXT: jmp .LBB8_3			; X86-NEXT: jmp .LBB8_3
	; X86-NEXT: .LBB8_2: # %compare			; X86-NEXT: .LBB8_2: # %compare
	; X86-NEXT: movdqa %xmm0, (%esp)			; X86-NEXT: movdqa %xmm0, (%esp)
	; X86-NEXT: addl %ecx, %ecx			; X86-NEXT: andl $7, %ecx
	; X86-NEXT: andl $14, %ecx			; X86-NEXT: movzwl (%esp,%ecx,2), %eax
	; X86-NEXT: movzwl (%esp,%ecx), %eax
	; X86-NEXT: movdqa %xmm1, {{[0-9]+}}(%esp)			; X86-NEXT: movdqa %xmm1, {{[0-9]+}}(%esp)
	; X86-NEXT: subw 16(%esp,%ecx), %ax			; X86-NEXT: subw 16(%esp,%ecx,2), %ax
	; X86-NEXT: .LBB8_3: # %exit			; X86-NEXT: .LBB8_3: # %exit
	; X86-NEXT: movzwl %ax, %eax			; X86-NEXT: movzwl %ax, %eax
	; X86-NEXT: movl %ebp, %esp			; X86-NEXT: movl %ebp, %esp
	; X86-NEXT: popl %ebp			; X86-NEXT: popl %ebp
	; X86-NEXT: retl			; X86-NEXT: retl
	;			;
	; X64-LABEL: pcmpestri_reg_diff_i16:			; X64-LABEL: pcmpestri_reg_diff_i16:
	; X64: # %bb.0: # %entry			; X64: # %bb.0: # %entry
[116 lines not shown]
	; X86-NEXT: pcmpestri $25, %xmm0, %xmm1			; X86-NEXT: pcmpestri $25, %xmm0, %xmm1
	; X86-NEXT: cmpl $8, %ecx			; X86-NEXT: cmpl $8, %ecx
	; X86-NEXT: jne .LBB11_2			; X86-NEXT: jne .LBB11_2
	; X86-NEXT: # %bb.1:			; X86-NEXT: # %bb.1:
	; X86-NEXT: xorl %eax, %eax			; X86-NEXT: xorl %eax, %eax
	; X86-NEXT: jmp .LBB11_3			; X86-NEXT: jmp .LBB11_3
	; X86-NEXT: .LBB11_2: # %compare			; X86-NEXT: .LBB11_2: # %compare
	; X86-NEXT: movdqa %xmm1, (%esp)			; X86-NEXT: movdqa %xmm1, (%esp)
	; X86-NEXT: addl %ecx, %ecx			; X86-NEXT: andl $7, %ecx
	; X86-NEXT: andl $14, %ecx			; X86-NEXT: movzwl (%esp,%ecx,2), %eax
	; X86-NEXT: movzwl (%esp,%ecx), %eax
	; X86-NEXT: movdqa %xmm0, {{[0-9]+}}(%esp)			; X86-NEXT: movdqa %xmm0, {{[0-9]+}}(%esp)
	; X86-NEXT: subw 16(%esp,%ecx), %ax			; X86-NEXT: subw 16(%esp,%ecx,2), %ax
	; X86-NEXT: .LBB11_3: # %exit			; X86-NEXT: .LBB11_3: # %exit
	; X86-NEXT: movzwl %ax, %eax			; X86-NEXT: movzwl %ax, %eax
	; X86-NEXT: leal -4(%ebp), %esp			; X86-NEXT: leal -4(%ebp), %esp
	; X86-NEXT: popl %esi			; X86-NEXT: popl %esi
	; X86-NEXT: popl %ebp			; X86-NEXT: popl %ebp
	; X86-NEXT: retl			; X86-NEXT: retl
	;			;
	; X64-LABEL: pcmpestri_mem_diff_i16:			; X64-LABEL: pcmpestri_mem_diff_i16:
[299 lines not shown]
	; X86-NEXT: movzwl %ax, %eax			; X86-NEXT: movzwl %ax, %eax
	; X86-NEXT: retl			; X86-NEXT: retl
	; X86-NEXT: .LBB20_2: # %compare			; X86-NEXT: .LBB20_2: # %compare
	; X86-NEXT: pushl %ebp			; X86-NEXT: pushl %ebp
	; X86-NEXT: movl %esp, %ebp			; X86-NEXT: movl %esp, %ebp
	; X86-NEXT: andl $-16, %esp			; X86-NEXT: andl $-16, %esp
	; X86-NEXT: subl $48, %esp			; X86-NEXT: subl $48, %esp
	; X86-NEXT: movdqa %xmm0, (%esp)			; X86-NEXT: movdqa %xmm0, (%esp)
	; X86-NEXT: addl %ecx, %ecx			; X86-NEXT: andl $7, %ecx
	; X86-NEXT: andl $14, %ecx			; X86-NEXT: movzwl (%esp,%ecx,2), %eax
	; X86-NEXT: movzwl (%esp,%ecx), %eax
	; X86-NEXT: movdqa %xmm1, {{[0-9]+}}(%esp)			; X86-NEXT: movdqa %xmm1, {{[0-9]+}}(%esp)
	; X86-NEXT: subw 16(%esp,%ecx), %ax			; X86-NEXT: subw 16(%esp,%ecx,2), %ax
	; X86-NEXT: movl %ebp, %esp			; X86-NEXT: movl %ebp, %esp
	; X86-NEXT: popl %ebp			; X86-NEXT: popl %ebp
	; X86-NEXT: movzwl %ax, %eax			; X86-NEXT: movzwl %ax, %eax
	; X86-NEXT: retl			; X86-NEXT: retl
	;			;
	; X64-LABEL: pcmpistri_reg_diff_i16:			; X64-LABEL: pcmpistri_reg_diff_i16:
	; X64: # %bb.0: # %entry			; X64: # %bb.0: # %entry
	; X64-NEXT: pcmpistri $24, %xmm1, %xmm0			; X64-NEXT: pcmpistri $24, %xmm1, %xmm0
[96 lines not shown]
	; X86-NEXT: pcmpistri $25, %xmm0, %xmm1			; X86-NEXT: pcmpistri $25, %xmm0, %xmm1
	; X86-NEXT: cmpl $8, %ecx			; X86-NEXT: cmpl $8, %ecx
	; X86-NEXT: jne .LBB23_2			; X86-NEXT: jne .LBB23_2
	; X86-NEXT: # %bb.1:			; X86-NEXT: # %bb.1:
	; X86-NEXT: xorl %eax, %eax			; X86-NEXT: xorl %eax, %eax
	; X86-NEXT: jmp .LBB23_3			; X86-NEXT: jmp .LBB23_3
	; X86-NEXT: .LBB23_2: # %compare			; X86-NEXT: .LBB23_2: # %compare
	; X86-NEXT: movdqa %xmm1, (%esp)			; X86-NEXT: movdqa %xmm1, (%esp)
	; X86-NEXT: addl %ecx, %ecx			; X86-NEXT: andl $7, %ecx
	; X86-NEXT: andl $14, %ecx			; X86-NEXT: movzwl (%esp,%ecx,2), %eax
	; X86-NEXT: movzwl (%esp,%ecx), %eax
	; X86-NEXT: movdqa %xmm0, {{[0-9]+}}(%esp)			; X86-NEXT: movdqa %xmm0, {{[0-9]+}}(%esp)
	; X86-NEXT: subw 16(%esp,%ecx), %ax			; X86-NEXT: subw 16(%esp,%ecx,2), %ax
	; X86-NEXT: .LBB23_3: # %exit			; X86-NEXT: .LBB23_3: # %exit
	; X86-NEXT: movzwl %ax, %eax			; X86-NEXT: movzwl %ax, %eax
	; X86-NEXT: movl %ebp, %esp			; X86-NEXT: movl %ebp, %esp
	; X86-NEXT: popl %ebp			; X86-NEXT: popl %ebp
	; X86-NEXT: retl			; X86-NEXT: retl
	;			;
	; X64-LABEL: pcmpistri_mem_diff_i16:			; X64-LABEL: pcmpistri_mem_diff_i16:
	; X64: # %bb.0: # %entry			; X64: # %bb.0: # %entry
[409 lines not shown]

llvm/test/CodeGen/X86/tbm_patterns.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc -mtriple=x86_64-unknown-unknown -mattr=+tbm < %s | FileCheck %s			; RUN: llc -mtriple=x86_64-unknown-unknown -mattr=+tbm < %s | FileCheck %s

	define i32 @test_x86_tbm_bextri_u32(i32 %a) nounwind {			define i32 @test_x86_tbm_bextri_u32(i32 %a) nounwind {
	; CHECK-LABEL: test_x86_tbm_bextri_u32:			; CHECK-LABEL: test_x86_tbm_bextri_u32:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-NEXT: bextrl $3076, %edi, %eax # imm = 0xC04			; CHECK-NEXT: movzwl %di, %eax
				; CHECK-NEXT: shrl $4, %eax
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	%t0 = lshr i32 %a, 4			%t0 = lshr i32 %a, 4
	%t1 = and i32 %t0, 4095			%t1 = and i32 %t0, 4095
	ret i32 %t1			ret i32 %t1
	}			}
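
The `bextrl $3076` immediate encodes a start bit in its low byte and a field length in the next byte, so `0xC04` means "12 bits starting at bit 4". The sketch below uses a simplified model of BEXTR (it assumes start < 32 and length <= 32 and ignores the instruction's edge cases) to show that the new `movzwl` + `shrl $4` sequence extracts the same field. All function names are illustrative.

```python
M32 = 0xFFFFFFFF  # model i32

def bextr(src, control):
    # Simplified BEXTR model: control[7:0] = start bit, control[15:8] = length
    start = control & 0xFF
    length = (control >> 8) & 0xFF
    return ((src & M32) >> start) & ((1 << length) - 1)

def ir_form(a):
    # %t0 = lshr i32 %a, 4 ; %t1 = and i32 %t0, 4095
    return ((a & M32) >> 4) & 4095

def new_codegen(a):
    # movzwl %di, %eax ; shrl $4, %eax
    return (a & 0xFFFF) >> 4

assert 3076 == 0xC04  # start = 4, length = 12

samples = [0, 0xFFF0, 0xFFFFFFFF, 0x12345678, 0x0000000F, 0x00010000]
assert all(bextr(a, 3076) == ir_form(a) == new_codegen(a) for a in samples)
```

The same field being zero is what `testl $65520, %edi` (`0xFFF0`) checks in the `_z2` tests.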

	; Make sure we still use AH subreg trick for extracting bits 15:8			; Make sure we still use AH subreg trick for extracting bits 15:8
	define i32 @test_x86_tbm_bextri_u32_subreg(i32 %a) nounwind {			define i32 @test_x86_tbm_bextri_u32_subreg(i32 %a) nounwind {
	; CHECK-LABEL: test_x86_tbm_bextri_u32_subreg:			; CHECK-LABEL: test_x86_tbm_bextri_u32_subreg:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-NEXT: movl %edi, %eax			; CHECK-NEXT: movl %edi, %eax
	; CHECK-NEXT: movzbl %ah, %eax			; CHECK-NEXT: movzbl %ah, %eax
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	%t0 = lshr i32 %a, 8			%t0 = lshr i32 %a, 8
	%t1 = and i32 %t0, 255			%t1 = and i32 %t0, 255
	ret i32 %t1			ret i32 %t1
	}			}

	define i32 @test_x86_tbm_bextri_u32_m(ptr nocapture %a) nounwind {			define i32 @test_x86_tbm_bextri_u32_m(ptr nocapture %a) nounwind {
	; CHECK-LABEL: test_x86_tbm_bextri_u32_m:			; CHECK-LABEL: test_x86_tbm_bextri_u32_m:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-NEXT: bextrl $3076, (%rdi), %eax # imm = 0xC04			; CHECK-NEXT: movzwl (%rdi), %eax
				; CHECK-NEXT: shrl $4, %eax
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	%t0 = load i32, ptr %a			%t0 = load i32, ptr %a
	%t1 = lshr i32 %t0, 4			%t1 = lshr i32 %t0, 4
	%t2 = and i32 %t1, 4095			%t2 = and i32 %t1, 4095
	ret i32 %t2			ret i32 %t2
	}			}

	define i32 @test_x86_tbm_bextri_u32_z(i32 %a, i32 %b) nounwind {			define i32 @test_x86_tbm_bextri_u32_z(i32 %a, i32 %b) nounwind {
	; CHECK-LABEL: test_x86_tbm_bextri_u32_z:			; CHECK-LABEL: test_x86_tbm_bextri_u32_z:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-NEXT: bextrl $3076, %edi, %eax # imm = 0xC04			; CHECK-NEXT: movzwl %di, %eax
				; CHECK-NEXT: shrl $4, %eax
	; CHECK-NEXT: cmovel %esi, %eax			; CHECK-NEXT: cmovel %esi, %eax
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	%t0 = lshr i32 %a, 4			%t0 = lshr i32 %a, 4
	%t1 = and i32 %t0, 4095			%t1 = and i32 %t0, 4095
	%t2 = icmp eq i32 %t1, 0			%t2 = icmp eq i32 %t1, 0
	%t3 = select i1 %t2, i32 %b, i32 %t1			%t3 = select i1 %t2, i32 %b, i32 %t1
	ret i32 %t3			ret i32 %t3
	}			}

	define i32 @test_x86_tbm_bextri_u32_z2(i32 %a, i32 %b, i32 %c) nounwind {			define i32 @test_x86_tbm_bextri_u32_z2(i32 %a, i32 %b, i32 %c) nounwind {
	; CHECK-LABEL: test_x86_tbm_bextri_u32_z2:			; CHECK-LABEL: test_x86_tbm_bextri_u32_z2:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-NEXT: movl %esi, %eax			; CHECK-NEXT: movl %esi, %eax
	; CHECK-NEXT: bextrl $3076, %edi, %ecx # imm = 0xC04			; CHECK-NEXT: testl $65520, %edi # imm = 0xFFF0
	; CHECK-NEXT: cmovnel %edx, %eax			; CHECK-NEXT: cmovnel %edx, %eax
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	%t0 = lshr i32 %a, 4			%t0 = lshr i32 %a, 4
	%t1 = and i32 %t0, 4095			%t1 = and i32 %t0, 4095
	%t2 = icmp eq i32 %t1, 0			%t2 = icmp eq i32 %t1, 0
	%t3 = select i1 %t2, i32 %b, i32 %c			%t3 = select i1 %t2, i32 %b, i32 %c
	ret i32 %t3			ret i32 %t3
	}			}

	define i32 @test_x86_tbm_bextri_u32_sle(i32 %a, i32 %b, i32 %c) nounwind {			define i32 @test_x86_tbm_bextri_u32_sle(i32 %a, i32 %b, i32 %c) nounwind {
	; CHECK-LABEL: test_x86_tbm_bextri_u32_sle:			; CHECK-LABEL: test_x86_tbm_bextri_u32_sle:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-NEXT: movl %esi, %eax			; CHECK-NEXT: movl %esi, %eax
	; CHECK-NEXT: bextrl $3076, %edi, %ecx # imm = 0xC04			; CHECK-NEXT: movzwl %di, %ecx
				; CHECK-NEXT: shrl $4, %ecx
	; CHECK-NEXT: testl %ecx, %ecx			; CHECK-NEXT: testl %ecx, %ecx
	; CHECK-NEXT: cmovgl %edx, %eax			; CHECK-NEXT: cmovgl %edx, %eax
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	%t0 = lshr i32 %a, 4			%t0 = lshr i32 %a, 4
	%t1 = and i32 %t0, 4095			%t1 = and i32 %t0, 4095
	%t2 = icmp sle i32 %t1, 0			%t2 = icmp sle i32 %t1, 0
	%t3 = select i1 %t2, i32 %b, i32 %c			%t3 = select i1 %t2, i32 %b, i32 %c
	ret i32 %t3			ret i32 %t3
	}			}

	define i64 @test_x86_tbm_bextri_u64(i64 %a) nounwind {			define i64 @test_x86_tbm_bextri_u64(i64 %a) nounwind {
	; CHECK-LABEL: test_x86_tbm_bextri_u64:			; CHECK-LABEL: test_x86_tbm_bextri_u64:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-NEXT: bextrl $3076, %edi, %eax # imm = 0xC04			; CHECK-NEXT: movzwl %di, %eax
				; CHECK-NEXT: shrl $4, %eax
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	%t0 = lshr i64 %a, 4			%t0 = lshr i64 %a, 4
	%t1 = and i64 %t0, 4095			%t1 = and i64 %t0, 4095
	ret i64 %t1			ret i64 %t1
	}			}

	; Make sure we still use AH subreg trick for extracting bits 15:8			; Make sure we still use AH subreg trick for extracting bits 15:8
	define i64 @test_x86_tbm_bextri_u64_subreg(i64 %a) nounwind {			define i64 @test_x86_tbm_bextri_u64_subreg(i64 %a) nounwind {
	; CHECK-LABEL: test_x86_tbm_bextri_u64_subreg:			; CHECK-LABEL: test_x86_tbm_bextri_u64_subreg:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-NEXT: movq %rdi, %rax			; CHECK-NEXT: movq %rdi, %rax
	; CHECK-NEXT: movzbl %ah, %eax			; CHECK-NEXT: movzbl %ah, %eax
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	%t0 = lshr i64 %a, 8			%t0 = lshr i64 %a, 8
	%t1 = and i64 %t0, 255			%t1 = and i64 %t0, 255
	ret i64 %t1			ret i64 %t1
	}			}

	define i64 @test_x86_tbm_bextri_u64_m(ptr nocapture %a) nounwind {			define i64 @test_x86_tbm_bextri_u64_m(ptr nocapture %a) nounwind {
	; CHECK-LABEL: test_x86_tbm_bextri_u64_m:			; CHECK-LABEL: test_x86_tbm_bextri_u64_m:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-NEXT: bextrl $3076, (%rdi), %eax # imm = 0xC04			; CHECK-NEXT: movzwl (%rdi), %eax
				; CHECK-NEXT: shrl $4, %eax
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	%t0 = load i64, ptr %a			%t0 = load i64, ptr %a
	%t1 = lshr i64 %t0, 4			%t1 = lshr i64 %t0, 4
	%t2 = and i64 %t1, 4095			%t2 = and i64 %t1, 4095
	ret i64 %t2			ret i64 %t2
	}			}

	define i64 @test_x86_tbm_bextri_u64_z(i64 %a, i64 %b) nounwind {			define i64 @test_x86_tbm_bextri_u64_z(i64 %a, i64 %b) nounwind {
	; CHECK-LABEL: test_x86_tbm_bextri_u64_z:			; CHECK-LABEL: test_x86_tbm_bextri_u64_z:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-NEXT: bextrl $3076, %edi, %eax # imm = 0xC04			; CHECK-NEXT: movzwl %di, %eax
				; CHECK-NEXT: shrl $4, %eax
	; CHECK-NEXT: cmoveq %rsi, %rax			; CHECK-NEXT: cmoveq %rsi, %rax
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	%t0 = lshr i64 %a, 4			%t0 = lshr i64 %a, 4
	%t1 = and i64 %t0, 4095			%t1 = and i64 %t0, 4095
	%t2 = icmp eq i64 %t1, 0			%t2 = icmp eq i64 %t1, 0
	%t3 = select i1 %t2, i64 %b, i64 %t1			%t3 = select i1 %t2, i64 %b, i64 %t1
	ret i64 %t3			ret i64 %t3
	}			}

	define i64 @test_x86_tbm_bextri_u64_z2(i64 %a, i64 %b, i64 %c) nounwind {			define i64 @test_x86_tbm_bextri_u64_z2(i64 %a, i64 %b, i64 %c) nounwind {
	; CHECK-LABEL: test_x86_tbm_bextri_u64_z2:			; CHECK-LABEL: test_x86_tbm_bextri_u64_z2:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-NEXT: movq %rsi, %rax			; CHECK-NEXT: movq %rsi, %rax
	; CHECK-NEXT: bextrl $3076, %edi, %ecx # imm = 0xC04			; CHECK-NEXT: testl $65520, %edi # imm = 0xFFF0
	; CHECK-NEXT: cmovneq %rdx, %rax			; CHECK-NEXT: cmovneq %rdx, %rax
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	%t0 = lshr i64 %a, 4			%t0 = lshr i64 %a, 4
	%t1 = and i64 %t0, 4095			%t1 = and i64 %t0, 4095
	%t2 = icmp eq i64 %t1, 0			%t2 = icmp eq i64 %t1, 0
	%t3 = select i1 %t2, i64 %b, i64 %c			%t3 = select i1 %t2, i64 %b, i64 %c
	ret i64 %t3			ret i64 %t3
	}			}

	define i64 @test_x86_tbm_bextri_u64_sle(i64 %a, i64 %b, i64 %c) nounwind {			define i64 @test_x86_tbm_bextri_u64_sle(i64 %a, i64 %b, i64 %c) nounwind {
	; CHECK-LABEL: test_x86_tbm_bextri_u64_sle:			; CHECK-LABEL: test_x86_tbm_bextri_u64_sle:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-NEXT: movq %rsi, %rax			; CHECK-NEXT: movq %rsi, %rax
	; CHECK-NEXT: bextrl $3076, %edi, %ecx # imm = 0xC04			; CHECK-NEXT: movzwl %di, %ecx
				; CHECK-NEXT: shrl $4, %ecx
	; CHECK-NEXT: testq %rcx, %rcx			; CHECK-NEXT: testq %rcx, %rcx
	; CHECK-NEXT: cmovgq %rdx, %rax			; CHECK-NEXT: cmovgq %rdx, %rax
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	%t0 = lshr i64 %a, 4			%t0 = lshr i64 %a, 4
	%t1 = and i64 %t0, 4095			%t1 = and i64 %t0, 4095
	%t2 = icmp sle i64 %t1, 0			%t2 = icmp sle i64 %t1, 0
	%t3 = select i1 %t2, i64 %b, i64 %c			%t3 = select i1 %t2, i64 %b, i64 %c
	ret i64 %t3			ret i64 %t3
	▲ Show 20 Lines • Show All 1,152 Lines • Show Last 20 Lines

llvm/test/CodeGen/X86/udiv_fix.ll

	Show First 20 Lines • Show All 84 Lines • ▼ Show 20 Lines
	; X64-NEXT: cwtl			; X64-NEXT: cwtl
	; X64-NEXT: shrl %eax			; X64-NEXT: shrl %eax
	; X64-NEXT: # kill: def $ax killed $ax killed $eax			; X64-NEXT: # kill: def $ax killed $ax killed $eax
	; X64-NEXT: retq			; X64-NEXT: retq
	;			;
	; X86-LABEL: func3:			; X86-LABEL: func3:
	; X86: # %bb.0:			; X86: # %bb.0:
	; X86-NEXT: movl {{[0-9]+}}(%esp), %eax			; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; X86-NEXT: addl %eax, %eax			; X86-NEXT: addl %eax, %eax
	; X86-NEXT: movzbl %cl, %ecx			; X86-NEXT: movzbl {{[0-9]+}}(%esp), %ecx
	; X86-NEXT: shll $4, %ecx			; X86-NEXT: shll $4, %ecx
	; X86-NEXT: # kill: def $ax killed $ax killed $eax			; X86-NEXT: # kill: def $ax killed $ax killed $eax
	; X86-NEXT: xorl %edx, %edx			; X86-NEXT: xorl %edx, %edx
	; X86-NEXT: divw %cx			; X86-NEXT: divw %cx
	; X86-NEXT: # kill: def $ax killed $ax def $eax			; X86-NEXT: # kill: def $ax killed $ax def $eax
	; X86-NEXT: addl %eax, %eax			; X86-NEXT: addl %eax, %eax
	; X86-NEXT: cwtl			; X86-NEXT: cwtl
	; X86-NEXT: shrl %eax			; X86-NEXT: shrl %eax
	▲ Show 20 Lines • Show All 241 Lines • Show Last 20 Lines

llvm/test/CodeGen/X86/udiv_fix_sat.ll

	Show All 21 Lines
	; X64-NEXT: cmovael %ecx, %eax			; X64-NEXT: cmovael %ecx, %eax
	; X64-NEXT: shrl %eax			; X64-NEXT: shrl %eax
	; X64-NEXT: # kill: def $ax killed $ax killed $eax			; X64-NEXT: # kill: def $ax killed $ax killed $eax
	; X64-NEXT: retq			; X64-NEXT: retq
	;			;
	; X86-LABEL: func:			; X86-LABEL: func:
	; X86: # %bb.0:			; X86: # %bb.0:
	; X86-NEXT: movzwl {{[0-9]+}}(%esp), %ecx			; X86-NEXT: movzwl {{[0-9]+}}(%esp), %ecx
	; X86-NEXT: movl {{[0-9]+}}(%esp), %eax			; X86-NEXT: movzwl {{[0-9]+}}(%esp), %eax
	; X86-NEXT: movzwl %ax, %eax
	; X86-NEXT: shll $8, %eax			; X86-NEXT: shll $8, %eax
	; X86-NEXT: xorl %edx, %edx			; X86-NEXT: xorl %edx, %edx
	; X86-NEXT: divl %ecx			; X86-NEXT: divl %ecx
	; X86-NEXT: cmpl $131071, %eax # imm = 0x1FFFF			; X86-NEXT: cmpl $131071, %eax # imm = 0x1FFFF
	; X86-NEXT: movl $131071, %ecx # imm = 0x1FFFF			; X86-NEXT: movl $131071, %ecx # imm = 0x1FFFF
	; X86-NEXT: cmovael %ecx, %eax			; X86-NEXT: cmovael %ecx, %eax
	; X86-NEXT: shrl %eax			; X86-NEXT: shrl %eax
	; X86-NEXT: # kill: def $ax killed $ax killed $eax			; X86-NEXT: # kill: def $ax killed $ax killed $eax
	▲ Show 20 Lines • Show All 64 Lines • ▼ Show 20 Lines
	; X64-NEXT: movswl %cx, %eax			; X64-NEXT: movswl %cx, %eax
	; X64-NEXT: shrl %eax			; X64-NEXT: shrl %eax
	; X64-NEXT: # kill: def $ax killed $ax killed $eax			; X64-NEXT: # kill: def $ax killed $ax killed $eax
	; X64-NEXT: retq			; X64-NEXT: retq
	;			;
	; X86-LABEL: func3:			; X86-LABEL: func3:
	; X86: # %bb.0:			; X86: # %bb.0:
	; X86-NEXT: movl {{[0-9]+}}(%esp), %eax			; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; X86-NEXT: addl %eax, %eax			; X86-NEXT: addl %eax, %eax
	; X86-NEXT: movzbl %cl, %ecx			; X86-NEXT: movzbl {{[0-9]+}}(%esp), %ecx
	; X86-NEXT: shll $4, %ecx			; X86-NEXT: shll $4, %ecx
	; X86-NEXT: # kill: def $ax killed $ax killed $eax			; X86-NEXT: # kill: def $ax killed $ax killed $eax
	; X86-NEXT: xorl %edx, %edx			; X86-NEXT: xorl %edx, %edx
	; X86-NEXT: divw %cx			; X86-NEXT: divw %cx
	; X86-NEXT: # kill: def $ax killed $ax def $eax			; X86-NEXT: # kill: def $ax killed $ax def $eax
	; X86-NEXT: movzwl %ax, %ecx			; X86-NEXT: movzwl %ax, %ecx
	; X86-NEXT: cmpl $32767, %ecx # imm = 0x7FFF			; X86-NEXT: cmpl $32767, %ecx # imm = 0x7FFF
	; X86-NEXT: movl $32767, %ecx # imm = 0x7FFF			; X86-NEXT: movl $32767, %ecx # imm = 0x7FFF
	▲ Show 20 Lines • Show All 156 Lines • ▼ Show 20 Lines
	; X64-NEXT: cmovaeq %rcx, %rax			; X64-NEXT: cmovaeq %rcx, %rax
	; X64-NEXT: shrl %eax			; X64-NEXT: shrl %eax
	; X64-NEXT: # kill: def $ax killed $ax killed $rax			; X64-NEXT: # kill: def $ax killed $ax killed $rax
	; X64-NEXT: retq			; X64-NEXT: retq
	;			;
	; X86-LABEL: func7:			; X86-LABEL: func7:
	; X86: # %bb.0:			; X86: # %bb.0:
	; X86-NEXT: movzwl {{[0-9]+}}(%esp), %eax			; X86-NEXT: movzwl {{[0-9]+}}(%esp), %eax
	; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X86-NEXT: movzwl {{[0-9]+}}(%esp), %ecx
	; X86-NEXT: movl %ecx, %edx			; X86-NEXT: movl %ecx, %edx
	; X86-NEXT: shll $17, %edx			; X86-NEXT: shrl $15, %edx
	; X86-NEXT: shrl $15, %ecx			; X86-NEXT: shll $17, %ecx
	; X86-NEXT: andl $1, %ecx
	; X86-NEXT: pushl $0			; X86-NEXT: pushl $0
	; X86-NEXT: pushl %eax			; X86-NEXT: pushl %eax
	; X86-NEXT: pushl %ecx
	; X86-NEXT: pushl %edx			; X86-NEXT: pushl %edx
				; X86-NEXT: pushl %ecx
	; X86-NEXT: calll __udivdi3			; X86-NEXT: calll __udivdi3
	; X86-NEXT: addl $16, %esp			; X86-NEXT: addl $16, %esp
	; X86-NEXT: cmpl $131071, %eax # imm = 0x1FFFF			; X86-NEXT: cmpl $131071, %eax # imm = 0x1FFFF
	; X86-NEXT: movl $131071, %ecx # imm = 0x1FFFF			; X86-NEXT: movl $131071, %ecx # imm = 0x1FFFF
	; X86-NEXT: cmovael %ecx, %eax			; X86-NEXT: cmovael %ecx, %eax
	; X86-NEXT: testl %edx, %edx			; X86-NEXT: testl %edx, %edx
	; X86-NEXT: cmovnel %ecx, %eax			; X86-NEXT: cmovnel %ecx, %eax
	; X86-NEXT: shrl %eax			; X86-NEXT: shrl %eax
	▲ Show 20 Lines • Show All 168 Lines • Show Last 20 Lines

llvm/test/CodeGen/X86/urem-seteq-illegal-types.ll

	Show All 28 Lines
	}			}

	define i1 @test_urem_even(i27 %X) nounwind {			define i1 @test_urem_even(i27 %X) nounwind {
	; X86-LABEL: test_urem_even:			; X86-LABEL: test_urem_even:
	; X86: # %bb.0:			; X86: # %bb.0:
	; X86-NEXT: imull $115043767, {{[0-9]+}}(%esp), %eax # imm = 0x6DB6DB7			; X86-NEXT: imull $115043767, {{[0-9]+}}(%esp), %eax # imm = 0x6DB6DB7
	; X86-NEXT: movl %eax, %ecx			; X86-NEXT: movl %eax, %ecx
	; X86-NEXT: shll $26, %ecx			; X86-NEXT: shll $26, %ecx
	; X86-NEXT: andl $134217726, %eax # imm = 0x7FFFFFE
	; X86-NEXT: shrl %eax			; X86-NEXT: shrl %eax
				; X86-NEXT: andl $67108863, %eax # imm = 0x3FFFFFF
	; X86-NEXT: orl %ecx, %eax			; X86-NEXT: orl %ecx, %eax
	; X86-NEXT: andl $134217727, %eax # imm = 0x7FFFFFF			; X86-NEXT: andl $134217727, %eax # imm = 0x7FFFFFF
	; X86-NEXT: cmpl $9586981, %eax # imm = 0x924925			; X86-NEXT: cmpl $9586981, %eax # imm = 0x924925
	; X86-NEXT: setb %al			; X86-NEXT: setb %al
	; X86-NEXT: retl			; X86-NEXT: retl
	;			;
	; X64-LABEL: test_urem_even:			; X64-LABEL: test_urem_even:
	; X64: # %bb.0:			; X64: # %bb.0:
	; X64-NEXT: imull $115043767, %edi, %eax # imm = 0x6DB6DB7			; X64-NEXT: imull $115043767, %edi, %eax # imm = 0x6DB6DB7
	; X64-NEXT: movl %eax, %ecx			; X64-NEXT: movl %eax, %ecx
	; X64-NEXT: shll $26, %ecx			; X64-NEXT: shll $26, %ecx
	; X64-NEXT: andl $134217726, %eax # imm = 0x7FFFFFE
	; X64-NEXT: shrl %eax			; X64-NEXT: shrl %eax
				; X64-NEXT: andl $67108863, %eax # imm = 0x3FFFFFF
	; X64-NEXT: orl %ecx, %eax			; X64-NEXT: orl %ecx, %eax
	; X64-NEXT: andl $134217727, %eax # imm = 0x7FFFFFF			; X64-NEXT: andl $134217727, %eax # imm = 0x7FFFFFF
	; X64-NEXT: cmpl $9586981, %eax # imm = 0x924925			; X64-NEXT: cmpl $9586981, %eax # imm = 0x924925
	; X64-NEXT: setb %al			; X64-NEXT: setb %al
	; X64-NEXT: retq			; X64-NEXT: retq
	%urem = urem i27 %X, 14			%urem = urem i27 %X, 14
	%cmp = icmp eq i27 %urem, 0			%cmp = icmp eq i27 %urem, 0
	ret i1 %cmp			ret i1 %cmp
	▲ Show 20 Lines • Show All 46 Lines • ▼ Show 20 Lines
	}			}

	define <3 x i1> @test_urem_vec(<3 x i11> %X) nounwind {			define <3 x i1> @test_urem_vec(<3 x i11> %X) nounwind {
	; X86-LABEL: test_urem_vec:			; X86-LABEL: test_urem_vec:
	; X86: # %bb.0:			; X86: # %bb.0:
	; X86-NEXT: imull $683, {{[0-9]+}}(%esp), %eax # imm = 0x2AB			; X86-NEXT: imull $683, {{[0-9]+}}(%esp), %eax # imm = 0x2AB
	; X86-NEXT: movl %eax, %ecx			; X86-NEXT: movl %eax, %ecx
	; X86-NEXT: shll $10, %ecx			; X86-NEXT: shll $10, %ecx
	; X86-NEXT: andl $2046, %eax # imm = 0x7FE
	; X86-NEXT: shrl %eax			; X86-NEXT: shrl %eax
				; X86-NEXT: andl $1023, %eax # imm = 0x3FF
	; X86-NEXT: orl %ecx, %eax			; X86-NEXT: orl %ecx, %eax
	; X86-NEXT: andl $2047, %eax # imm = 0x7FF			; X86-NEXT: andl $2047, %eax # imm = 0x7FF
	; X86-NEXT: cmpl $342, %eax # imm = 0x156			; X86-NEXT: cmpl $342, %eax # imm = 0x156
	; X86-NEXT: setae %al			; X86-NEXT: setae %al
	; X86-NEXT: imull $1463, {{[0-9]+}}(%esp), %ecx # imm = 0x5B7			; X86-NEXT: imull $1463, {{[0-9]+}}(%esp), %ecx # imm = 0x5B7
	; X86-NEXT: addl $-1463, %ecx # imm = 0xFA49			; X86-NEXT: addl $-1463, %ecx # imm = 0xFA49
	; X86-NEXT: andl $2047, %ecx # imm = 0x7FF			; X86-NEXT: andl $2047, %ecx # imm = 0x7FF
	; X86-NEXT: cmpl $293, %ecx # imm = 0x125			; X86-NEXT: cmpl $293, %ecx # imm = 0x125
	▲ Show 20 Lines • Show All 141 Lines • Show Last 20 Lines

llvm/test/CodeGen/X86/vselect.ll

	Show First 20 Lines • Show All 646 Lines • ▼ Show 20 Lines
	; This test case previously crashed after r363802, r363850, and r363856 due			; This test case previously crashed after r363802, r363850, and r363856 due
	; any_extend_vector_inreg not being handled by the X86 backend.			; any_extend_vector_inreg not being handled by the X86 backend.
	define i64 @vselect_any_extend_vector_inreg_crash(ptr %x) {			define i64 @vselect_any_extend_vector_inreg_crash(ptr %x) {
	; SSE-LABEL: vselect_any_extend_vector_inreg_crash:			; SSE-LABEL: vselect_any_extend_vector_inreg_crash:
	; SSE: # %bb.0:			; SSE: # %bb.0:
	; SSE-NEXT: movq {{.*#+}} xmm0 = mem[0],zero			; SSE-NEXT: movq {{.*#+}} xmm0 = mem[0],zero
	; SSE-NEXT: pcmpeqb {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0			; SSE-NEXT: pcmpeqb {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0
	; SSE-NEXT: movq %xmm0, %rax			; SSE-NEXT: movq %xmm0, %rax
	; SSE-NEXT: andl $1, %eax			; SSE-NEXT: shll $15, %eax
	; SSE-NEXT: shlq $15, %rax			; SSE-NEXT: movzwl %ax, %eax
	; SSE-NEXT: retq			; SSE-NEXT: retq
	;			;
	; AVX-LABEL: vselect_any_extend_vector_inreg_crash:			; AVX-LABEL: vselect_any_extend_vector_inreg_crash:
	; AVX: # %bb.0:			; AVX: # %bb.0:
	; AVX-NEXT: vmovq {{.*#+}} xmm0 = mem[0],zero			; AVX-NEXT: vmovq {{.*#+}} xmm0 = mem[0],zero
	; AVX-NEXT: vpcmpeqb {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0, %xmm0			; AVX-NEXT: vpcmpeqb {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0, %xmm0
	; AVX-NEXT: vmovq %xmm0, %rax			; AVX-NEXT: vmovq %xmm0, %rax
	; AVX-NEXT: andl $1, %eax			; AVX-NEXT: shll $15, %eax
	; AVX-NEXT: shlq $15, %rax			; AVX-NEXT: movzwl %ax, %eax
	; AVX-NEXT: retq			; AVX-NEXT: retq
	0:			0:
	%1 = load <8 x i8>, ptr %x			%1 = load <8 x i8>, ptr %x
	%2 = icmp eq <8 x i8> %1, <i8 49, i8 49, i8 49, i8 49, i8 49, i8 49, i8 49, i8 49>			%2 = icmp eq <8 x i8> %1, <i8 49, i8 49, i8 49, i8 49, i8 49, i8 49, i8 49, i8 49>
	%3 = select <8 x i1> %2, <8 x i64> <i64 32768, i64 16384, i64 8192, i64 4096, i64 2048, i64 1024, i64 512, i64 256>, <8 x i64> zeroinitializer			%3 = select <8 x i1> %2, <8 x i64> <i64 32768, i64 16384, i64 8192, i64 4096, i64 2048, i64 1024, i64 512, i64 256>, <8 x i64> zeroinitializer
	%4 = extractelement <8 x i64> %3, i32 0			%4 = extractelement <8 x i64> %3, i32 0
	ret i64 %4			ret i64 %4
	}			}
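
Note on the changed checks above: several of the updated sequences (for example `bextrl $3076` becoming `movzwl` + `shrl $4` in tbm_patterns.ll, or `andl $1` + `shlq $15` becoming `shll $15` + `movzwl` in vselect.ll) rely on the fact that a constant shift and a constant `and` can be swapped if the mask is shifted along with the value. The following is a minimal sketch of that identity in Python, not code from the patch; the helper names are hypothetical and 64-bit unsigned arithmetic is assumed.

```python
import random

MASK64 = (1 << 64) - 1  # model values as unsigned 64-bit integers


def srl_then_and(x, shift, mask):
    """Logical shift right first, then apply the mask."""
    return ((x & MASK64) >> shift) & mask


def and_then_srl(x, shift, mask):
    """Apply the mask pre-shifted left by `shift`, then shift right."""
    shifted_mask = (mask << shift) & MASK64
    return ((x & MASK64) & shifted_mask) >> shift


def and_then_shl(x, shift, mask):
    """Apply the mask first, then shift left (truncated to 64 bits)."""
    return (((x & MASK64) & mask) << shift) & MASK64


def shl_then_and(x, shift, mask):
    """Shift left first, then apply the mask shifted left by `shift`."""
    shifted_mask = (mask << shift) & MASK64
    return (((x & MASK64) << shift) & MASK64) & shifted_mask


def check_equivalence(trials=1000, seed=0):
    """Compare both orderings on random inputs; True if they always agree."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.getrandbits(64)
        shift = rng.randrange(64)
        mask = rng.getrandbits(64)
        if srl_then_and(x, shift, mask) != and_then_srl(x, shift, mask):
            return False
        if and_then_shl(x, shift, mask) != shl_then_and(x, shift, mask):
            return False
    return True
```

With `shift = 4` and `mask = 4095`, the pre-shifted mask is `0xFFF0`, which matches the `testl $65520` immediate in `test_x86_tbm_bextri_u64_z2`. Which of the two orderings is cheaper depends on how the resulting mask encodes (for instance as a `movzwl` or a smaller immediate), which is the choice the patch summary describes.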

llvm/test/CodeGen/X86/zext-logicop-shift-load.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc < %s -mtriple=i686-unknown-unknown | FileCheck %s --check-prefix=X86			; RUN: llc < %s -mtriple=i686-unknown-unknown | FileCheck %s --check-prefix=X86
	; RUN: llc < %s -mtriple=x86_64-unknown-unknown | FileCheck %s --check-prefix=X64			; RUN: llc < %s -mtriple=x86_64-unknown-unknown | FileCheck %s --check-prefix=X64

	define i64 @test1(ptr %data) {			define i64 @test1(ptr %data) {
	; X86-LABEL: test1:			; X86-LABEL: test1:
	; X86: # %bb.0: # %entry			; X86: # %bb.0: # %entry
	; X86-NEXT: movl {{[0-9]+}}(%esp), %eax			; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X86-NEXT: movzbl (%eax), %eax			; X86-NEXT: movzbl (%eax), %eax
				; X86-NEXT: andl $15, %eax
	; X86-NEXT: shll $2, %eax			; X86-NEXT: shll $2, %eax
	; X86-NEXT: andl $60, %eax
	; X86-NEXT: xorl %edx, %edx			; X86-NEXT: xorl %edx, %edx
	; X86-NEXT: retl			; X86-NEXT: retl
	;			;
	; X64-LABEL: test1:			; X64-LABEL: test1:
	; X64: # %bb.0: # %entry			; X64: # %bb.0: # %entry
	; X64-NEXT: movl (%rdi), %eax			; X64-NEXT: movl (%rdi), %eax
	; X64-NEXT: shll $2, %eax			; X64-NEXT: shll $2, %eax
	; X64-NEXT: andl $60, %eax			; X64-NEXT: andl $60, %eax
	▲ Show 20 Lines • Show All 160 Lines • Show Last 20 Lines

Revision Contents

Diff 488861
