This is an archive of the discontinued LLVM Phabricator instance.

[X86] Replace (most) X86ISD::SHLD/SHRD usage with ISD::FSHL/FSHR generic opcodes (PR39467)
ClosedPublic

Authored by RKSimon on Mar 6 2020, 7:50 AM.

Details

Reviewers

craig.topper
spatel
lebedev.ri

Commits

rGb3b4727a3e7e: [X86] Replace (most) X86ISD::SHLD/SHRD usage with ISD::FSHL/FSHR generic…

Summary

For the i32 and i64 cases, X86ISD::SHLD/SHRD are close enough to ISD::FSHL/FSHR that we can use the generic opcodes directly; we just need to account for the operand commutation for SHRD.
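The correspondence described above can be sketched in plain C++. This is only an illustrative model, not code from the patch: the helper names (`fshl32`, `fshr32`, `shld32`, `shrd32`, `checkEquivalence`) are hypothetical, and the instruction models assume the usual 32-bit behaviour where the destination is shifted, the source supplies the incoming bits, and the count is masked to 5 bits.

```cpp
#include <cassert>
#include <cstdint>

// Generic funnel shifts on i32: the amount is taken modulo the bit width.
inline uint32_t fshl32(uint32_t hi, uint32_t lo, unsigned amt) {
  amt &= 31;
  return amt == 0 ? hi : (hi << amt) | (lo >> (32 - amt));
}
inline uint32_t fshr32(uint32_t hi, uint32_t lo, unsigned amt) {
  amt &= 31;
  return amt == 0 ? lo : (hi << (32 - amt)) | (lo >> amt);
}

// Illustrative models of 32-bit SHLD/SHRD (assumption: dst is shifted, src
// supplies the bits shifted in, and the count is masked to 5 bits).
inline uint32_t shld32(uint32_t dst, uint32_t src, unsigned cnt) {
  cnt &= 31;
  return cnt == 0 ? dst : (dst << cnt) | (src >> (32 - cnt));
}
inline uint32_t shrd32(uint32_t dst, uint32_t src, unsigned cnt) {
  cnt &= 31;
  return cnt == 0 ? dst : (dst >> cnt) | (src << (32 - cnt));
}

// Checks both identities for every shift amount in 0..63.
inline void checkEquivalence(uint32_t a, uint32_t b) {
  for (unsigned amt = 0; amt < 64; ++amt) {
    assert(shld32(a, b, amt) == fshl32(a, b, amt)); // same operand order
    assert(shrd32(a, b, amt) == fshr32(b, a, amt)); // operands swapped
  }
}
```

In this model SHLD maps onto the left funnel shift with its operands in the same order, while SHRD only matches the right funnel shift once the two value operands are swapped, which is the commutation the summary refers to.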

The i16 SHLD/SHRD case is annoying as the shift amount is modulo-32 (vs funnel shift modulo-16), so I've added X86ISD::FSHL/FSHR equivalents, which match the generic implementation in all other respects.
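To make the modulo mismatch concrete, here is a small C++ model. It is a sketch under assumptions rather than the patch's lowering: `fshl16`, `shldw16`, `shldw16Masked` and `checkI16Modulo` are hypothetical names, the instruction model only treats counts 0..16 meaningfully (larger counts give an undefined result on hardware), and pre-masking the amount with 15 is shown as one possible way to reconcile the two behaviours.

```cpp
#include <cassert>
#include <cstdint>

// Generic i16 funnel shift left: the amount is taken modulo 16.
inline uint16_t fshl16(uint16_t hi, uint16_t lo, unsigned amt) {
  amt %= 16;
  uint32_t concat = (uint32_t(hi) << 16) | lo;
  return uint16_t((concat << amt) >> 16);
}

// Illustrative model of 16-bit SHLD: the count is masked to 5 bits (modulo
// 32), not modulo 16. Counts above 16 are not modelled faithfully because
// the hardware result is undefined there.
inline uint16_t shldw16(uint16_t dst, uint16_t src, unsigned cnt) {
  cnt &= 31;
  uint64_t concat = (uint64_t(dst) << 16) | src;
  return uint16_t((concat << cnt) >> 16);
}

// Hypothetical wrapper: pre-mask the amount so that the instruction model
// agrees with the generic funnel shift for every amount.
inline uint16_t shldw16Masked(uint16_t hi, uint16_t lo, unsigned amt) {
  return shldw16(hi, lo, amt & 15);
}

inline void checkI16Modulo() {
  const uint16_t hi = 0x1234, lo = 0xABCD;
  assert(fshl16(hi, lo, 16) == hi);  // generic: 16 % 16 == 0, so a no-op
  assert(shldw16(hi, lo, 16) == lo); // x86 model: 16 & 31 == 16
  for (unsigned amt = 0; amt < 64; ++amt)
    assert(shldw16Masked(hi, lo, amt) == fshl16(hi, lo, amt));
}
```

An amount of 16 is the smallest case where the two disagree: the generic node treats it as a shift by zero, whereas the instruction model shifts the whole destination out.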

Something I'm slightly concerned with is that ISD::FSHL/FSHR legality is controlled by the Subtarget.isSHLDSlow() feature flag - we don't normally use non-ISA features for this but it allows the DAG combines to continue to operate after legalization in a lot more cases.

The X86 clear_highbits.ll changes are all affected by the same issue - we now have a "FSHR(-1,-1,amt) -> ROTR(-1,amt) -> (-1)" simplification that reduces the dependencies enough for the branch fall through code to mess up. I'm not sure how much of a patch-specific problem this is - tbh if it wasn't for the extra stack usage I wouldn't care much at all. Thoughts?
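The "FSHR(-1,-1,amt) -> ROTR(-1,amt) -> (-1)" chain can be checked with a short C++ model. This is a sketch with hypothetical helper names (`fshr32`, `rotr32`, `checkAllOnesFold`); it only demonstrates the arithmetic identity on 32-bit values, not how the DAG combiner implements the fold.

```cpp
#include <cassert>
#include <cstdint>

// Generic i32 funnel shift right: the amount is taken modulo 32.
inline uint32_t fshr32(uint32_t hi, uint32_t lo, unsigned amt) {
  amt &= 31;
  return amt == 0 ? lo : (hi << (32 - amt)) | (lo >> amt);
}

// Rotate right on i32.
inline uint32_t rotr32(uint32_t x, unsigned amt) {
  amt &= 31;
  return amt == 0 ? x : (x >> amt) | (x << (32 - amt));
}

// A funnel shift with identical operands is a rotate, and rotating an
// all-ones value leaves it unchanged, so the result no longer depends on amt.
inline void checkAllOnesFold() {
  const uint32_t allOnes = ~0u;
  for (unsigned amt = 0; amt < 64; ++amt) {
    assert(fshr32(allOnes, allOnes, amt) == rotr32(allOnes, amt));
    assert(rotr32(allOnes, amt) == allOnes);
  }
}
```

Because the folded value is a constant, the shift amount drops out of that computation entirely, which is the reduction in dependencies mentioned above.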

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

RKSimon created this revision. Mar 6 2020, 7:50 AM

Herald added a project: Restricted Project. Mar 6 2020, 7:50 AM

Herald added a subscriber: hiraditya.

RKSimon edited the summary of this revision. Mar 6 2020, 7:54 AM

RKSimon marked an inline comment as done. Mar 6 2020, 8:03 AM

RKSimon added inline comments.

llvm/test/CodeGen/X86/clear-highbits.ll
6	@lebedev.ri Apart from the X86-FALLBACK0 case, can't we enable +cmov on these x86 targets? I can't think of a target that would support any BMI/TBM level without CMOV support.

lebedev.ri added inline comments. Mar 6 2020, 8:20 AM

llvm/test/CodeGen/X86/clear-highbits.ll
6	I don't see why not.

Harbormaster failed remote builds in B48338: Diff 248730! Mar 6 2020, 8:46 AM

RKSimon mentioned this in D58475: [X86] Improve detection of unneeded shift amount masking to also handle the case that the LHS has known zeroes in it. Mar 6 2020, 9:42 AM

rebase

RKSimon mentioned this in rGfb8149cac8b0: [X86] Add CMOV to i686 BMI/TBM tests. Mar 6 2020, 9:55 AM

(i think this is the wrong patch)

Harbormaster failed remote builds in B48367: Diff 248768! Mar 6 2020, 10:29 AM

Are you sure this is the right patch?

This revision now requires changes to proceed. Mar 9 2020, 6:29 AM

Sorry, git was being git - will fix in a sec

regenerate diff

Harbormaster failed remote builds in B48550: Diff 249099! Mar 9 2020, 8:35 AM

Any more comments? I don't want to rush but I'm keen to get this in as it blocks a number of other patches.

craig.topper added inline comments. Mar 10 2020, 10:58 AM

llvm/lib/Target/X86/X86ISelLowering.h
37	What is the 'x' after the W?

This doesn't look unreasonable to me, but it would be best for someone else to look this over, too.

This revision is now accepted and ready to land. Mar 10 2020, 11:04 AM

RKSimon marked an inline comment as done. Mar 10 2020, 3:23 PM

RKSimon added inline comments.

llvm/lib/Target/X86/X86ISelLowering.h
37	The instruction suffix for the rr/rm/mr cases - I'll remove it.

RKSimon marked an inline comment as done and an inline comment as not done. Mar 10 2020, 3:49 PM

RKSimon added inline comments.

llvm/lib/Target/X86/X86ISelLowering.h
37	@craig.topper Other than this are you ok with the patch?

LGTM

llvm/lib/Target/X86/X86ISelLowering.h
37	Yeah

Closed by commit rGb3b4727a3e7e: [X86] Replace (most) X86ISD::SHLD/SHRD usage with ISD::FSHL/FSHR generic… (authored by RKSimon). Mar 11 2020, 4:33 AM

This revision was automatically updated to reflect the committed changes.

RKSimon marked an inline comment as not done.

RKSimon mentioned this in D75114: [DAG] MatchRotate - Add funnel shift by immediate support. Mar 11 2020, 5:07 AM

Revision Contents

Path

Size

llvm/

lib/

Target/

X86/

9 lines

22 lines

33 lines

4 lines

X86InstrShiftRotate.td

70 lines

test/

CodeGen/

X86/

234 lines

158 lines

464 lines

489 lines

9 lines

13 lines

1 line

11 lines

Diff 248730

llvm/lib/Target/X86/X86ISelLowering.h

	Show All 18 Lines
	#include "llvm/CodeGen/TargetLowering.h"

	namespace llvm {
	class X86Subtarget;
	class X86TargetMachine;

	namespace X86ISD {
	// X86 Specific DAG Nodes
	enum NodeType : unsigned {
				Lint: Pre-merge checks Inline Actions clang-format: please reformat the code - enum NodeType : unsigned { - // Start the numbering where the builtin ops leave off. - FIRST_NUMBER = ISD::BUILTIN_OP_END, - - /// Bit scan forward. - BSF, - /// Bit scan reverse. - BSR, - - /// X86 funnel/double shift i16 instructions. These correspond to - /// X86::SHLDWx and X86::SHRDWx instructions which have different amt - /// modulo rules to generic funnel shifts. - FSHL, - FSHR, - - /// Bitwise logical AND of floating point values. This corresponds - /// to X86::ANDPS or X86::ANDPD. - FAND, - - /// Bitwise logical OR of floating point values. This corresponds - /// to X86::ORPS or X86::ORPD. - FOR, - - /// Bitwise logical XOR of floating point values. This corresponds - /// to X86::XORPS or X86::XORPD. - FXOR, - - /// Bitwise logical ANDNOT of floating point values. This - /// corresponds to X86::ANDNPS or X86::ANDNPD. - FANDN, - - /// These operations represent an abstract X86 call - /// instruction, which includes a bunch of information. In particular the - /// operands of these node are: - /// - /// #0 - The incoming token chain - /// #1 - The callee - /// #2 - The number of arg bytes the caller pushes on the stack. - /// #3 - The number of arg bytes the callee pops off the stack. - /// #4 - The value to pass in AL/AX/EAX (optional) - /// #5 - The value to pass in DL/DX/EDX (optional) - /// - /// The result values of these nodes are: - /// - /// #0 - The outgoing token chain - /// #1 - The first register result value (optional) - /// #2 - The second register result value (optional) - /// - CALL, - - /// Same as call except it adds the NoTrack prefix. - NT_CALL, - - /// X86 compare and logical compare instructions. - CMP, FCMP, COMI, UCOMI, - - /// X86 bit-test instructions. - BT, - - /// X86 SetCC. Operand 0 is condition code, and operand 1 is the EFLAGS - /// operand, usually produced by a CMP instruction. 
- SETCC, - - /// X86 Select - SELECTS, - - // Same as SETCC except it's materialized with a sbb and the value is all - // one's or all zero's. - SETCC_CARRY, // R = carry_bit ? ~0 : 0 - - /// X86 FP SETCC, implemented with CMP{cc}SS/CMP{cc}SD. - /// Operands are two FP values to compare; result is a mask of - /// 0s or 1s. Generally DTRT for C/C++ with NaNs. - FSETCC, - - /// X86 FP SETCC, similar to above, but with output as an i1 mask and - /// and a version with SAE. - FSETCCM, FSETCCM_SAE, - - /// X86 conditional moves. Operand 0 and operand 1 are the two values - /// to select from. Operand 2 is the condition code, and operand 3 is the - /// flag operand produced by a CMP or TEST instruction. - CMOV, - - /// X86 conditional branches. Operand 0 is the chain operand, operand 1 - /// is the block to branch if condition is true, operand 2 is the - /// condition code, and operand 3 is the flag operand produced by a CMP - /// or TEST instruction. - BRCOND, - - /// BRIND node with NoTrack prefix. Operand 0 is the chain operand and - /// operand 1 is the target address. - NT_BRIND, - - /// Return with a flag operand. Operand 0 is the chain operand, operand - /// 1 is the number of bytes of stack to pop. - RET_FLAG, - - /// Return from interrupt. Operand 0 is the number of bytes to pop. - IRET, - - /// Repeat fill, corresponds to X86::REP_STOSx. - REP_STOS, - - /// Repeat move, corresponds to X86::REP_MOVSx. - REP_MOVS, - - /// On Darwin, this node represents the result of the popl - /// at function entry, used for PIC code. - GlobalBaseReg, - - /// A wrapper node for TargetConstantPool, TargetJumpTable, - /// TargetExternalSymbol, TargetGlobalAddress, TargetGlobalTLSAddress, - /// MCSymbol and TargetBlockAddress. - Wrapper, - - /// Special wrapper used under X86-64 PIC mode for RIP - /// relative displacements. - WrapperRIP, - - /// Copies a 64-bit value from an MMX vector to the low word - /// of an XMM vector, with the high word zero filled. 
- MOVQ2DQ, - - /// Copies a 64-bit value from the low word of an XMM vector - /// to an MMX vector. - MOVDQ2Q, - - /// Copies a 32-bit value from the low word of a MMX - /// vector to a GPR. - MMX_MOVD2W, - - /// Copies a GPR into the low 32-bit word of a MMX vector - /// and zero out the high word. - MMX_MOVW2D, - - /// Extract an 8-bit value from a vector and zero extend it to - /// i32, corresponds to X86::PEXTRB. - PEXTRB, - - /// Extract a 16-bit value from a vector and zero extend it to - /// i32, corresponds to X86::PEXTRW. - PEXTRW, - - /// Insert any element of a 4 x float vector into any element - /// of a destination 4 x floatvector. - INSERTPS, - - /// Insert the lower 8-bits of a 32-bit value to a vector, - /// corresponds to X86::PINSRB. - PINSRB, - - /// Insert the lower 16-bits of a 32-bit value to a vector, - /// corresponds to X86::PINSRW. - PINSRW, - - /// Shuffle 16 8-bit values within a vector. - PSHUFB, - - /// Compute Sum of Absolute Differences. - PSADBW, - /// Compute Double Block Packed Sum-Absolute-Differences - DBPSADBW, - - /// Bitwise Logical AND NOT of Packed FP values. - ANDNP, - - /// Blend where the selector is an immediate. - BLENDI, - - /// Dynamic (non-constant condition) vector blend where only the sign bits - /// of the condition elements are used. This is used to enforce that the - /// condition mask is not valid for generic VSELECT optimizations. This - /// is also used to implement the intrinsics. - /// Operands are in VSELECT order: MASK, TRUE, FALSE - BLENDV, - - /// Combined add and sub on an FP vector. - ADDSUB, - - // FP vector ops with rounding mode. - FADD_RND, FADDS, FADDS_RND, - FSUB_RND, FSUBS, FSUBS_RND, - FMUL_RND, FMULS, FMULS_RND, - FDIV_RND, FDIVS, FDIVS_RND, - FMAX_SAE, FMAXS_SAE, - FMIN_SAE, FMINS_SAE, - FSQRT_RND, FSQRTS, FSQRTS_RND, - - // FP vector get exponent. - FGETEXP, FGETEXP_SAE, FGETEXPS, FGETEXPS_SAE, - // Extract Normalized Mantissas. 
- VGETMANT, VGETMANT_SAE, VGETMANTS, VGETMANTS_SAE, - // FP Scale. - SCALEF, SCALEF_RND, - SCALEFS, SCALEFS_RND, - - // Unsigned Integer average. - AVG, - - /// Integer horizontal add/sub. - HADD, - HSUB, - - /// Floating point horizontal add/sub. - FHADD, - FHSUB, - - // Detect Conflicts Within a Vector - CONFLICT, - - /// Floating point max and min. - FMAX, FMIN, - - /// Commutative FMIN and FMAX. - FMAXC, FMINC, - - /// Scalar intrinsic floating point max and min. - FMAXS, FMINS, - - /// Floating point reciprocal-sqrt and reciprocal approximation. - /// Note that these typically require refinement - /// in order to obtain suitable precision. - FRSQRT, FRCP, - - // AVX-512 reciprocal approximations with a little more precision. - RSQRT14, RSQRT14S, RCP14, RCP14S, - - // Thread Local Storage. - TLSADDR, - - // Thread Local Storage. A call to get the start address - // of the TLS block for the current module. - TLSBASEADDR, - - // Thread Local Storage. When calling to an OS provided - // thunk at the address from an earlier relocation. - TLSCALL, - - // Exception Handling helpers. - EH_RETURN, - - // SjLj exception handling setjmp. - EH_SJLJ_SETJMP, - - // SjLj exception handling longjmp. - EH_SJLJ_LONGJMP, - - // SjLj exception handling dispatch. - EH_SJLJ_SETUP_DISPATCH, - - /// Tail call return. See X86TargetLowering::LowerCall for - /// the list of operands. - TC_RETURN, - - // Vector move to low scalar and zero higher vector elements. - VZEXT_MOVL, - - // Vector integer truncate. - VTRUNC, - // Vector integer truncate with unsigned/signed saturation. - VTRUNCUS, VTRUNCS, - - // Masked version of the above. Used when less than a 128-bit result is - // produced since the mask only applies to the lower elements and can't - // be represented by a select. - // SRC, PASSTHRU, MASK - VMTRUNC, VMTRUNCUS, VMTRUNCS, - - // Vector FP extend. - VFPEXT, VFPEXT_SAE, VFPEXTS, VFPEXTS_SAE, - - // Vector FP round. 
- VFPROUND, VFPROUND_RND, VFPROUNDS, VFPROUNDS_RND, - - // Masked version of above. Used for v2f64->v4f32. - // SRC, PASSTHRU, MASK - VMFPROUND, - - // 128-bit vector logical left / right shift - VSHLDQ, VSRLDQ, - - // Vector shift elements - VSHL, VSRL, VSRA, - - // Vector variable shift - VSHLV, VSRLV, VSRAV, - - // Vector shift elements by immediate - VSHLI, VSRLI, VSRAI, - - // Shifts of mask registers. - KSHIFTL, KSHIFTR, - - // Bit rotate by immediate - VROTLI, VROTRI, - - // Vector packed double/float comparison. - CMPP, - - // Vector integer comparisons. - PCMPEQ, PCMPGT, - - // v8i16 Horizontal minimum and position. - PHMINPOS, - - MULTISHIFT, - - /// Vector comparison generating mask bits for fp and - /// integer signed and unsigned data types. - CMPM, - // Vector comparison with SAE for FP values - CMPM_SAE, - - // Arithmetic operations with FLAGS results. - ADD, SUB, ADC, SBB, SMUL, UMUL, - OR, XOR, AND, - - // Bit field extract. - BEXTR, - - // Zero High Bits Starting with Specified Bit Position. - BZHI, - - // X86-specific multiply by immediate. - MUL_IMM, - - // Vector sign bit extraction. - MOVMSK, - - // Vector bitwise comparisons. - PTEST, - - // Vector packed fp sign bitwise comparisons. - TESTP, - - // OR/AND test for masks. - KORTEST, - KTEST, - - // ADD for masks. - KADD, - - // Several flavors of instructions with vector shuffle behaviors. - // Saturated signed/unnsigned packing. - PACKSS, - PACKUS, - // Intra-lane alignr. - PALIGNR, - // AVX512 inter-lane alignr. - VALIGN, - PSHUFD, - PSHUFHW, - PSHUFLW, - SHUFP, - // VBMI2 Concat & Shift. - VSHLD, - VSHRD, - VSHLDV, - VSHRDV, - //Shuffle Packed Values at 128-bit granularity. - SHUF128, - MOVDDUP, - MOVSHDUP, - MOVSLDUP, - MOVLHPS, - MOVHLPS, - MOVSD, - MOVSS, - UNPCKL, - UNPCKH, - VPERMILPV, - VPERMILPI, - VPERMI, - VPERM2X128, - - // Variable Permute (VPERM). - // Res = VPERMV MaskV, V0 - VPERMV, - - // 3-op Variable Permute (VPERMT2). 
- // Res = VPERMV3 V0, MaskV, V1 - VPERMV3, - - // Bitwise ternary logic. - VPTERNLOG, - // Fix Up Special Packed Float32/64 values. - VFIXUPIMM, VFIXUPIMM_SAE, - VFIXUPIMMS, VFIXUPIMMS_SAE, - // Range Restriction Calculation For Packed Pairs of Float32/64 values. - VRANGE, VRANGE_SAE, VRANGES, VRANGES_SAE, - // Reduce - Perform Reduction Transformation on scalar\packed FP. - VREDUCE, VREDUCE_SAE, VREDUCES, VREDUCES_SAE, - // RndScale - Round FP Values To Include A Given Number Of Fraction Bits. - // Also used by the legacy (V)ROUND intrinsics where we mask out the - // scaling part of the immediate. - VRNDSCALE, VRNDSCALE_SAE, VRNDSCALES, VRNDSCALES_SAE, - // Tests Types Of a FP Values for packed types. - VFPCLASS, - // Tests Types Of a FP Values for scalar types. - VFPCLASSS, - - // Broadcast (splat) scalar or element 0 of a vector. If the operand is - // a vector, this node may change the vector length as part of the splat. - VBROADCAST, - // Broadcast mask to vector. - VBROADCASTM, - // Broadcast subvector to vector. - SUBV_BROADCAST, - - /// SSE4A Extraction and Insertion. - EXTRQI, INSERTQI, - - // XOP arithmetic/logical shifts. - VPSHA, VPSHL, - // XOP signed/unsigned integer comparisons. - VPCOM, VPCOMU, - // XOP packed permute bytes. - VPPERM, - // XOP two source permutation. - VPERMIL2, - - // Vector multiply packed unsigned doubleword integers. - PMULUDQ, - // Vector multiply packed signed doubleword integers. - PMULDQ, - // Vector Multiply Packed UnsignedIntegers with Round and Scale. - MULHRS, - - // Multiply and Add Packed Integers. - VPMADDUBSW, VPMADDWD, - - // AVX512IFMA multiply and add. - // NOTE: These are different than the instruction and perform - // op0 x op1 + op2. - VPMADD52L, VPMADD52H, - - // VNNI - VPDPBUSD, - VPDPBUSDS, - VPDPWSSD, - VPDPWSSDS, - - // FMA nodes. - // We use the target independent ISD::FMA for the non-inverted case. - FNMADD, - FMSUB, - FNMSUB, - FMADDSUB, - FMSUBADD, - - // FMA with rounding mode. 
- FMADD_RND, - FNMADD_RND, - FMSUB_RND, - FNMSUB_RND, - FMADDSUB_RND, - FMSUBADD_RND, - - // Compress and expand. - COMPRESS, - EXPAND, - - // Bits shuffle - VPSHUFBITQMB, - - // Convert Unsigned/Integer to Floating-Point Value with rounding mode. - SINT_TO_FP_RND, UINT_TO_FP_RND, - SCALAR_SINT_TO_FP, SCALAR_UINT_TO_FP, - SCALAR_SINT_TO_FP_RND, SCALAR_UINT_TO_FP_RND, - - // Vector float/double to signed/unsigned integer. - CVTP2SI, CVTP2UI, CVTP2SI_RND, CVTP2UI_RND, - // Scalar float/double to signed/unsigned integer. - CVTS2SI, CVTS2UI, CVTS2SI_RND, CVTS2UI_RND, - - // Vector float/double to signed/unsigned integer with truncation. - CVTTP2SI, CVTTP2UI, CVTTP2SI_SAE, CVTTP2UI_SAE, - // Scalar float/double to signed/unsigned integer with truncation. - CVTTS2SI, CVTTS2UI, CVTTS2SI_SAE, CVTTS2UI_SAE, - - // Vector signed/unsigned integer to float/double. - CVTSI2P, CVTUI2P, - - // Masked versions of above. Used for v2f64->v4f32. - // SRC, PASSTHRU, MASK - MCVTP2SI, MCVTP2UI, MCVTTP2SI, MCVTTP2UI, - MCVTSI2P, MCVTUI2P, - - // Vector float to bfloat16. - // Convert TWO packed single data to one packed BF16 data - CVTNE2PS2BF16, - // Convert packed single data to packed BF16 data - CVTNEPS2BF16, - // Masked version of above. - // SRC, PASSTHRU, MASK - MCVTNEPS2BF16, - - // Dot product of BF16 pairs to accumulated into - // packed single precision. - DPBF16PS, - - // Save xmm argument registers to the stack, according to %al. An operator - // is needed so that this can be expanded with control flow. - VASTART_SAVE_XMM_REGS, - - // Windows's _chkstk call to do stack probing. - WIN_ALLOCA, - - // For allocating variable amounts of stack space when using - // segmented stacks. Check if the current stacklet has enough space, and - // falls back to heap allocation if not. - SEG_ALLOCA, - - // For allocating stack space when using stack clash protector. - // Allocation is performed by block, and each block is probed. - PROBED_ALLOCA, - - // Memory barriers. 
- MEMBARRIER, - MFENCE, - - // Get a random integer and indicate whether it is valid in CF. - RDRAND, - - // Get a NIST SP800-90B & C compliant random integer and - // indicate whether it is valid in CF. - RDSEED, - - // Protection keys - // RDPKRU - Operand 0 is chain. Operand 1 is value for ECX. - // WRPKRU - Operand 0 is chain. Operand 1 is value for EDX. Operand 2 is - // value for ECX. - RDPKRU, WRPKRU, - - // SSE42 string comparisons. - // These nodes produce 3 results, index, mask, and flags. X86ISelDAGToDAG - // will emit one or two instructions based on which results are used. If - // flags and index/mask this allows us to use a single instruction since - // we won't have to pick and opcode for flags. Instead we can rely on the - // DAG to CSE everything and decide at isel. - PCMPISTR, - PCMPESTR, - - // Test if in transactional execution. - XTEST, - - // ERI instructions. - RSQRT28, RSQRT28_SAE, RSQRT28S, RSQRT28S_SAE, - RCP28, RCP28_SAE, RCP28S, RCP28S_SAE, EXP2, EXP2_SAE, - - // Conversions between float and half-float. - CVTPS2PH, CVTPH2PS, CVTPH2PS_SAE, - - // Masked version of above. - // SRC, RND, PASSTHRU, MASK - MCVTPS2PH, - - // Galois Field Arithmetic Instructions - GF2P8AFFINEINVQB, GF2P8AFFINEQB, GF2P8MULB, - - // LWP insert record. - LWPINS, - - // User level wait - UMWAIT, TPAUSE, - - // Enqueue Stores Instructions - ENQCMD, ENQCMDS, - - // For avx512-vp2intersect - VP2INTERSECT, - - /// X86 strict FP compare instructions. - STRICT_FCMP = ISD::FIRST_TARGET_STRICTFP_OPCODE, - STRICT_FCMPS, - - // Vector packed double/float comparison. - STRICT_CMPP, - - /// Vector comparison generating mask bits for fp and - /// integer signed and unsigned data types. - STRICT_CMPM, - - // Vector float/double to signed/unsigned integer with truncation. - STRICT_CVTTP2SI, STRICT_CVTTP2UI, - - // Vector FP extend. - STRICT_VFPEXT, - - // Vector FP round. - STRICT_VFPROUND, - - // RndScale - Round FP Values To Include A Given Number Of Fraction Bits. 
- // Also used by the legacy (V)ROUND intrinsics where we mask out the - // scaling part of the immediate. - STRICT_VRNDSCALE, - - // Vector signed/unsigned integer to float/double. - STRICT_CVTSI2P, STRICT_CVTUI2P, - - // Strict FMA nodes. - STRICT_FNMADD, STRICT_FMSUB, STRICT_FNMSUB, - - // Conversions between float and half-float. - STRICT_CVTPS2PH, STRICT_CVTPH2PS, - - // Compare and swap. - LCMPXCHG_DAG = ISD::FIRST_TARGET_MEMORY_OPCODE, - LCMPXCHG8_DAG, - LCMPXCHG16_DAG, - LCMPXCHG8_SAVE_EBX_DAG, - LCMPXCHG16_SAVE_RBX_DAG, - - /// LOCK-prefixed arithmetic read-modify-write instructions. - /// EFLAGS, OUTCHAIN = LADD(INCHAIN, PTR, RHS) - LADD, LSUB, LOR, LXOR, LAND, - - // Load, scalar_to_vector, and zero extend. - VZEXT_LOAD, - - // extract_vector_elt, store. - VEXTRACT_STORE, - - // scalar broadcast from memory - VBROADCAST_LOAD, - - // Store FP control world into i16 memory. - FNSTCW16m, - - /// This instruction implements FP_TO_SINT with the - /// integer destination in memory and a FP reg source. This corresponds - /// to the X86::FISTm instructions and the rounding mode change stuff. It - /// has two inputs (token chain and address) and two outputs (int value - /// and token chain). Memory VT specifies the type to store to. - FP_TO_INT_IN_MEM, - - /// This instruction implements SINT_TO_FP with the - /// integer source in memory and FP reg result. This corresponds to the - /// X86::FILDm instructions. It has two inputs (token chain and address) - /// and two outputs (FP value and token chain). The integer source type is - /// specified by the memory VT. - FILD, - - /// This instruction implements a fp->int store from FP stack - /// slots. This corresponds to the fist instruction. It takes a - /// chain operand, value to store, address, and glue. The memory VT - /// specifies the type to store as. - FIST, - - /// This instruction implements an extending load to FP stack slots. - /// This corresponds to the X86::FLD32m / X86::FLD64m. 
It takes a chain - /// operand, and ptr to load from. The memory VT specifies the type to - /// load from. - FLD, - - /// This instruction implements a truncating store from FP stack - /// slots. This corresponds to the X86::FST32m / X86::FST64m. It takes a - /// chain operand, value to store, address, and glue. The memory VT - /// specifies the type to store as. - FST, - - /// This instruction grabs the address of the next argument - /// from a va_list. (reads and modifies the va_list in memory) - VAARG_64, - - // Vector truncating store with unsigned/signed saturation - VTRUNCSTOREUS, VTRUNCSTORES, - // Vector truncating masked store with unsigned/signed saturation - VMTRUNCSTOREUS, VMTRUNCSTORES, - - // X86 specific gather and scatter - MGATHER, MSCATTER, - - // WARNING: Do not add anything in the end unless you want the node to - // have memop! In fact, starting from FIRST_TARGET_MEMORY_OPCODE all - // opcodes will be thought as target memory ops! - }; + enum NodeType : unsigned { + // Start the numbering where the builtin ops leave off. + FIRST_NUMBER = ISD::BUILTIN_OP_END, + + /// Bit scan forward. + BSF, + /// Bit scan reverse. + BSR, + + /// X86 funnel/double shift i16 instructions. These correspond to + /// X86::SHLDWx and X86::SHRDWx instructions which have different amt + /// modulo rules to generic funnel shifts. + FSHL, + FSHR, + + /// Bitwise logical AND of floating point values. This corresponds + /// to X86::ANDPS or X86::ANDPD. + FAND, + + /// Bitwise logical OR of floating point values. This corresponds + /// to X86::ORPS or X86::ORPD. + FOR, + + /// Bitwise logical XOR of floating point values. This corresponds + /// to X86::XORPS or X86::XORPD. + FXOR, + + /// Bitwise logical ANDNOT of floating point values. This + /// corresponds to X86::ANDNPS or X86::ANDNPD. + FANDN, + + /// These operations represent an abstract X86 call + /// instruction, which includes a bunch of information. 
In particular the + /// operands of these node are: + /// + /// #0 - The incoming token chain + /// #1 - The callee + /// #2 - The number of arg bytes the caller pushes on the stack. + /// #3 - The number of arg bytes the callee pops off the stack. + /// #4 - The value to pass in AL/AX/EAX (optional) + /// #5 - The value to pass in DL/DX/EDX (optional) + /// + /// The result values of these nodes are: + /// + /// #0 - The outgoing token chain + /// #1 - The first register result value (optional) + /// #2 - The second register result value (optional) + /// + CALL, + + /// Same as call except it adds the NoTrack prefix. + NT_CALL, + + /// X86 compare and logical compare instructions. + CMP, + FCMP, + COMI, + UCOMI, + + /// X86 bit-test instructions. + BT, + + /// X86 SetCC. Operand 0 is condition code, and operand 1 is the EFLAGS + /// operand, usually produced by a CMP instruction. + SETCC, + + /// X86 Select + SELECTS, + + // Same as SETCC except it's materialized with a sbb and the value is all + // one's or all zero's. + SETCC_CARRY, // R = carry_bit ? ~0 : 0 + + /// X86 FP SETCC, implemented with CMP{cc}SS/CMP{cc}SD. + /// Operands are two FP values to compare; result is a mask of + /// 0s or 1s. Generally DTRT for C/C++ with NaNs. + FSETCC, + + /// X86 FP SETCC, similar to above, but with output as an i1 mask and + /// and a version with SAE. + FSETCCM, + FSETCCM_SAE, + + /// X86 conditional moves. Operand 0 and operand 1 are the two values + /// to select from. Operand 2 is the condition code, and operand 3 is the + /// flag operand produced by a CMP or TEST instruction. + CMOV, + + /// X86 conditional branches. Operand 0 is the chain operand, operand 1 + /// is the block to branch if condition is true, operand 2 is the + /// condition code, and operand 3 is the flag operand produced by a CMP + /// or TEST instruction. + BRCOND, + + /// BRIND node with NoTrack prefix. Operand 0 is the chain operand and + /// operand 1 is the target address. 
+ NT_BRIND, + + /// Return with a flag operand. Operand 0 is the chain operand, operand + /// 1 is the number of bytes of stack to pop. + RET_FLAG, + + /// Return from interrupt. Operand 0 is the number of bytes to pop. + IRET, + + /// Repeat fill, corresponds to X86::REP_STOSx. + REP_STOS, + + /// Repeat move, corresponds to X86::REP_MOVSx. + REP_MOVS, + + /// On Darwin, this node represents the result of the popl + /// at function entry, used for PIC code. + GlobalBaseReg, + + /// A wrapper node for TargetConstantPool, TargetJumpTable, + /// TargetExternalSymbol, TargetGlobalAddress, TargetGlobalTLSAddress, + /// MCSymbol and TargetBlockAddress. + Wrapper, + + /// Special wrapper used under X86-64 PIC mode for RIP + /// relative displacements. + WrapperRIP, + + /// Copies a 64-bit value from an MMX vector to the low word + /// of an XMM vector, with the high word zero filled. + MOVQ2DQ, + + /// Copies a 64-bit value from the low word of an XMM vector + /// to an MMX vector. + MOVDQ2Q, + + /// Copies a 32-bit value from the low word of a MMX + /// vector to a GPR. + MMX_MOVD2W, + + /// Copies a GPR into the low 32-bit word of a MMX vector + /// and zero out the high word. + MMX_MOVW2D, + + /// Extract an 8-bit value from a vector and zero extend it to + /// i32, corresponds to X86::PEXTRB. + PEXTRB, + + /// Extract a 16-bit value from a vector and zero extend it to + /// i32, corresponds to X86::PEXTRW. + PEXTRW, + + /// Insert any element of a 4 x float vector into any element + /// of a destination 4 x floatvector. + INSERTPS, + + /// Insert the lower 8-bits of a 32-bit value to a vector, + /// corresponds to X86::PINSRB. + PINSRB, + + /// Insert the lower 16-bits of a 32-bit value to a vector, + /// corresponds to X86::PINSRW. + PINSRW, + + /// Shuffle 16 8-bit values within a vector. + PSHUFB, + + /// Compute Sum of Absolute Differences. 
+ PSADBW, + /// Compute Double Block Packed Sum-Absolute-Differences + DBPSADBW, + + /// Bitwise Logical AND NOT of Packed FP values. + ANDNP, + + /// Blend where the selector is an immediate. + BLENDI, + + /// Dynamic (non-constant condition) vector blend where only the sign bits + /// of the condition elements are used. This is used to enforce that the + /// condition mask is not valid for generic VSELECT optimizations. This + /// is also used to implement the intrinsics. + /// Operands are in VSELECT order: MASK, TRUE, FALSE + BLENDV, + + /// Combined add and sub on an FP vector. + ADDSUB, + + // FP vector ops with rounding mode. + FADD_RND, + FADDS, + FADDS_RND, + FSUB_RND, + FSUBS, + FSUBS_RND, + FMUL_RND, + FMULS, + FMULS_RND, + FDIV_RND, + FDIVS, + FDIVS_RND, + FMAX_SAE, + FMAXS_SAE, + FMIN_SAE, + FMINS_SAE, + FSQRT_RND, + FSQRTS, + FSQRTS_RND, + + // FP vector get exponent. + FGETEXP, + FGETEXP_SAE, + FGETEXPS, + FGETEXPS_SAE, + // Extract Normalized Mantissas. + VGETMANT, + VGETMANT_SAE, + VGETMANTS, + VGETMANTS_SAE, + // FP Scale. + SCALEF, + SCALEF_RND, + SCALEFS, + SCALEFS_RND, + + // Unsigned Integer average. + AVG, + + /// Integer horizontal add/sub. + HADD, + HSUB, + + /// Floating point horizontal add/sub. + FHADD, + FHSUB, + + // Detect Conflicts Within a Vector + CONFLICT, + + /// Floating point max and min. + FMAX, + FMIN, + + /// Commutative FMIN and FMAX. + FMAXC, + FMINC, + + /// Scalar intrinsic floating point max and min. + FMAXS, + FMINS, + + /// Floating point reciprocal-sqrt and reciprocal approximation. + /// Note that these typically require refinement + /// in order to obtain suitable precision. + FRSQRT, + FRCP, + + // AVX-512 reciprocal approximations with a little more precision. + RSQRT14, + RSQRT14S, + RCP14, + RCP14S, + + // Thread Local Storage. + TLSADDR, + + // Thread Local Storage. A call to get the start address + // of the TLS block for the current module. + TLSBASEADDR, + + // Thread Local Storage. 
When calling to an OS provided + // thunk at the address from an earlier relocation. + TLSCALL, + + // Exception Handling helpers. + EH_RETURN, + + // SjLj exception handling setjmp. + EH_SJLJ_SETJMP, + + // SjLj exception handling longjmp. + EH_SJLJ_LONGJMP, + + // SjLj exception handling dispatch. + EH_SJLJ_SETUP_DISPATCH, + + /// Tail call return. See X86TargetLowering::LowerCall for + /// the list of operands. + TC_RETURN, + + // Vector move to low scalar and zero higher vector elements. + VZEXT_MOVL, + + // Vector integer truncate. + VTRUNC, + // Vector integer truncate with unsigned/signed saturation. + VTRUNCUS, + VTRUNCS, + + // Masked version of the above. Used when less than a 128-bit result is + // produced since the mask only applies to the lower elements and can't + // be represented by a select. + // SRC, PASSTHRU, MASK + VMTRUNC, + VMTRUNCUS, + VMTRUNCS, + + // Vector FP extend. + VFPEXT, + VFPEXT_SAE, + VFPEXTS, + VFPEXTS_SAE, + + // Vector FP round. + VFPROUND, + VFPROUND_RND, + VFPROUNDS, + VFPROUNDS_RND, + + // Masked version of above. Used for v2f64->v4f32. + // SRC, PASSTHRU, MASK + VMFPROUND, + + // 128-bit vector logical left / right shift + VSHLDQ, + VSRLDQ, + + // Vector shift elements + VSHL, + VSRL, + VSRA, + + // Vector variable shift + VSHLV, + VSRLV, + VSRAV, + + // Vector shift elements by immediate + VSHLI, + VSRLI, + VSRAI, + + // Shifts of mask registers. + KSHIFTL, + KSHIFTR, + + // Bit rotate by immediate + VROTLI, + VROTRI, + + // Vector packed double/float comparison. + CMPP, + + // Vector integer comparisons. + PCMPEQ, + PCMPGT, + + // v8i16 Horizontal minimum and position. + PHMINPOS, + + MULTISHIFT, + + /// Vector comparison generating mask bits for fp and + /// integer signed and unsigned data types. + CMPM, + // Vector comparison with SAE for FP values + CMPM_SAE, + + // Arithmetic operations with FLAGS results. + ADD, + SUB, + ADC, + SBB, + SMUL, + UMUL, + OR, + XOR, + AND, + + // Bit field extract. 
+ BEXTR, + + // Zero High Bits Starting with Specified Bit Position. + BZHI, + + // X86-specific multiply by immediate. + MUL_IMM, + + // Vector sign bit extraction. + MOVMSK, + + // Vector bitwise comparisons. + PTEST, + + // Vector packed fp sign bitwise comparisons. + TESTP, + + // OR/AND test for masks. + KORTEST, + KTEST, + + // ADD for masks. + KADD, + + // Several flavors of instructions with vector shuffle behaviors. + // Saturated signed/unnsigned packing. + PACKSS, + PACKUS, + // Intra-lane alignr. + PALIGNR, + // AVX512 inter-lane alignr. + VALIGN, + PSHUFD, + PSHUFHW, + PSHUFLW, + SHUFP, + // VBMI2 Concat & Shift. + VSHLD, + VSHRD, + VSHLDV, + VSHRDV, + // Shuffle Packed Values at 128-bit granularity. + SHUF128, + MOVDDUP, + MOVSHDUP, + MOVSLDUP, + MOVLHPS, + MOVHLPS, + MOVSD, + MOVSS, + UNPCKL, + UNPCKH, + VPERMILPV, + VPERMILPI, + VPERMI, + VPERM2X128, + + // Variable Permute (VPERM). + // Res = VPERMV MaskV, V0 + VPERMV, + + // 3-op Variable Permute (VPERMT2). + // Res = VPERMV3 V0, MaskV, V1 + VPERMV3, + + // Bitwise ternary logic. + VPTERNLOG, + // Fix Up Special Packed Float32/64 values. + VFIXUPIMM, + VFIXUPIMM_SAE, + VFIXUPIMMS, + VFIXUPIMMS_SAE, + // Range Restriction Calculation For Packed Pairs of Float32/64 values. + VRANGE, + VRANGE_SAE, + VRANGES, + VRANGES_SAE, + // Reduce - Perform Reduction Transformation on scalar\packed FP. + VREDUCE, + VREDUCE_SAE, + VREDUCES, + VREDUCES_SAE, + // RndScale - Round FP Values To Include A Given Number Of Fraction Bits. + // Also used by the legacy (V)ROUND intrinsics where we mask out the + // scaling part of the immediate. + VRNDSCALE, + VRNDSCALE_SAE, + VRNDSCALES, + VRNDSCALES_SAE, + // Tests Types Of a FP Values for packed types. + VFPCLASS, + // Tests Types Of a FP Values for scalar types. + VFPCLASSS, + + // Broadcast (splat) scalar or element 0 of a vector. If the operand is + // a vector, this node may change the vector length as part of the splat. + VBROADCAST, + // Broadcast mask to vector. 
+ VBROADCASTM, + // Broadcast subvector to vector. + SUBV_BROADCAST, + + /// SSE4A Extraction and Insertion. + EXTRQI, + INSERTQI, + + // XOP arithmetic/logical shifts. + VPSHA, + VPSHL, + // XOP signed/unsigned integer comparisons. + VPCOM, + VPCOMU, + // XOP packed permute bytes. + VPPERM, + // XOP two source permutation. + VPERMIL2, + + // Vector multiply packed unsigned doubleword integers. + PMULUDQ, + // Vector multiply packed signed doubleword integers. + PMULDQ, + // Vector Multiply Packed UnsignedIntegers with Round and Scale. + MULHRS, + + // Multiply and Add Packed Integers. + VPMADDUBSW, + VPMADDWD, + + // AVX512IFMA multiply and add. + // NOTE: These are different than the instruction and perform + // op0 x op1 + op2. + VPMADD52L, + VPMADD52H, + + // VNNI + VPDPBUSD, + VPDPBUSDS, + VPDPWSSD, + VPDPWSSDS, + + // FMA nodes. + // We use the target independent ISD::FMA for the non-inverted case. + FNMADD, + FMSUB, + FNMSUB, + FMADDSUB, + FMSUBADD, + + // FMA with rounding mode. + FMADD_RND, + FNMADD_RND, + FMSUB_RND, + FNMSUB_RND, + FMADDSUB_RND, + FMSUBADD_RND, + + // Compress and expand. + COMPRESS, + EXPAND, + + // Bits shuffle + VPSHUFBITQMB, + + // Convert Unsigned/Integer to Floating-Point Value with rounding mode. + SINT_TO_FP_RND, + UINT_TO_FP_RND, + SCALAR_SINT_TO_FP, + SCALAR_UINT_TO_FP, + SCALAR_SINT_TO_FP_RND, + SCALAR_UINT_TO_FP_RND, + + // Vector float/double to signed/unsigned integer. + CVTP2SI, + CVTP2UI, + CVTP2SI_RND, + CVTP2UI_RND, + // Scalar float/double to signed/unsigned integer. + CVTS2SI, + CVTS2UI, + CVTS2SI_RND, + CVTS2UI_RND, + + // Vector float/double to signed/unsigned integer with truncation. + CVTTP2SI, + CVTTP2UI, + CVTTP2SI_SAE, + CVTTP2UI_SAE, + // Scalar float/double to signed/unsigned integer with truncation. + CVTTS2SI, + CVTTS2UI, + CVTTS2SI_SAE, + CVTTS2UI_SAE, + + // Vector signed/unsigned integer to float/double. + CVTSI2P, + CVTUI2P, + + // Masked versions of above. Used for v2f64->v4f32. 
+ // SRC, PASSTHRU, MASK + MCVTP2SI, + MCVTP2UI, + MCVTTP2SI, + MCVTTP2UI, + MCVTSI2P, + MCVTUI2P, + + // Vector float to bfloat16. + // Convert TWO packed single data to one packed BF16 data + CVTNE2PS2BF16, + // Convert packed single data to packed BF16 data + CVTNEPS2BF16, + // Masked version of above. + // SRC, PASSTHRU, MASK + MCVTNEPS2BF16, + + // Dot product of BF16 pairs to accumulated into + // packed single precision. + DPBF16PS, + + // Save xmm argument registers to the stack, according to %al. An operator + // is needed so that this can be expanded with control flow. + VASTART_SAVE_XMM_REGS, + + // Windows's _chkstk call to do stack probing. + WIN_ALLOCA, + + // For allocating variable amounts of stack space when using + // segmented stacks. Check if the current stacklet has enough space, and + // falls back to heap allocation if not. + SEG_ALLOCA, + + // For allocating stack space when using stack clash protector. + // Allocation is performed by block, and each block is probed. + PROBED_ALLOCA, + + // Memory barriers. + MEMBARRIER, + MFENCE, + + // Get a random integer and indicate whether it is valid in CF. + RDRAND, + + // Get a NIST SP800-90B & C compliant random integer and + // indicate whether it is valid in CF. + RDSEED, + + // Protection keys + // RDPKRU - Operand 0 is chain. Operand 1 is value for ECX. + // WRPKRU - Operand 0 is chain. Operand 1 is value for EDX. Operand 2 is + // value for ECX. + RDPKRU, + WRPKRU, + + // SSE42 string comparisons. + // These nodes produce 3 results, index, mask, and flags. X86ISelDAGToDAG + // will emit one or two instructions based on which results are used. If + // flags and index/mask this allows us to use a single instruction since + // we won't have to pick and opcode for flags. Instead we can rely on the + // DAG to CSE everything and decide at isel. + PCMPISTR, + PCMPESTR, + + // Test if in transactional execution. + XTEST, + + // ERI instructions. 
+ RSQRT28, + RSQRT28_SAE, + RSQRT28S, + RSQRT28S_SAE, + RCP28, + RCP28_SAE, + RCP28S, + RCP28S_SAE, + EXP2, + EXP2_SAE, + + // Conversions between float and half-float. + CVTPS2PH, + CVTPH2PS, + CVTPH2PS_SAE, + + // Masked version of above. + // SRC, RND, PASSTHRU, MASK + MCVTPS2PH, + + // Galois Field Arithmetic Instructions + GF2P8AFFINEINVQB, + GF2P8AFFINEQB, + GF2P8MULB, + + // LWP insert record. + LWPINS, + + // User level wait + UMWAIT, + TPAUSE, + + // Enqueue Stores Instructions + ENQCMD, + ENQCMDS, + + // For avx512-vp2intersect + VP2INTERSECT, + + /// X86 strict FP compare instructions. + STRICT_FCMP = ISD::FIRST_TARGET_STRICTFP_OPCODE, + STRICT_FCMPS, + + // Vector packed double/float comparison. + STRICT_CMPP, + + /// Vector comparison generating mask bits for fp and + /// integer signed and unsigned data types. + STRICT_CMPM, + + // Vector float/double to signed/unsigned integer with truncation. + STRICT_CVTTP2SI, + STRICT_CVTTP2UI, + + // Vector FP extend. + STRICT_VFPEXT, + + // Vector FP round. + STRICT_VFPROUND, + + // RndScale - Round FP Values To Include A Given Number Of Fraction Bits. + // Also used by the legacy (V)ROUND intrinsics where we mask out the + // scaling part of the immediate. + STRICT_VRNDSCALE, + + // Vector signed/unsigned integer to float/double. + STRICT_CVTSI2P, + STRICT_CVTUI2P, + + // Strict FMA nodes. + STRICT_FNMADD, + STRICT_FMSUB, + STRICT_FNMSUB, + + // Conversions between float and half-float. + STRICT_CVTPS2PH, + STRICT_CVTPH2PS, + + // Compare and swap. + LCMPXCHG_DAG = ISD::FIRST_TARGET_MEMORY_OPCODE, + LCMPXCHG8_DAG, + LCMPXCHG16_DAG, + LCMPXCHG8_SAVE_EBX_DAG, + LCMPXCHG16_SAVE_RBX_DAG, + + /// LOCK-prefixed arithmetic read-modify-write instructions. + /// EFLAGS, OUTCHAIN = LADD(INCHAIN, PTR, RHS) + LADD, + LSUB, + LOR, + LXOR, + LAND, + + // Load, scalar_to_vector, and zero extend. + VZEXT_LOAD, + + // extract_vector_elt, store. 
+ VEXTRACT_STORE, + + // scalar broadcast from memory + VBROADCAST_LOAD, + + // Store FP control world into i16 memory. + FNSTCW16m, + + /// This instruction implements FP_TO_SINT with the + /// integer destination in memory and a FP reg source. This corresponds + /// to the X86::FISTm instructions and the rounding mode change stuff. It + /// has two inputs (token chain and address) and two outputs (int value + /// and token chain). Memory VT specifies the type to store to. + FP_TO_INT_IN_MEM, + + /// This instruction implements SINT_TO_FP with the + /// integer source in memory and FP reg result. This corresponds to the + /// X86::FILDm instructions. It has two inputs (token chain and address) + /// and two outputs (FP value and token chain). The integer source type is + /// specified by the memory VT. + FILD, + + /// This instruction implements a fp->int store from FP stack + /// slots. This corresponds to the fist instruction. It takes a + /// chain operand, value to store, address, and glue. The memory VT + /// specifies the type to store as. + FIST, + + /// This instruction implements an extending load to FP stack slots. + /// This corresponds to the X86::FLD32m / X86::FLD64m. It takes a chain + /// operand, and ptr to load from. The memory VT specifies the type to + /// load from. + FLD, + + /// This instruction implements a truncating store from FP stack + /// slots. This corresponds to the X86::FST32m / X86::FST64m. It takes a + /// chain operand, value to store, address, and glue. The memory VT + /// specifies the type to store as. + FST, + + /// This instruction grabs the address of the next argument + /// from a va_list. 
(reads and modifies the va_list in memory) + VAARG_64, + + // Vector truncating store with unsigned/signed saturation + VTRUNCSTOREUS, + VTRUNCSTORES, + // Vector truncating masked store with unsigned/signed saturation + VMTRUNCSTOREUS, + VMTRUNCSTORES, + + // X86 specific gather and scatter + MGATHER, + MSCATTER, + + // WARNING: Do not add anything in the end unless you want the node to + // have memop! In fact, starting from FIRST_TARGET_MEMORY_OPCODE all + // opcodes will be thought as target memory ops! + };
	// Start the numbering where the builtin ops leave off.			// Start the numbering where the builtin ops leave off.
	FIRST_NUMBER = ISD::BUILTIN_OP_END,			FIRST_NUMBER = ISD::BUILTIN_OP_END,

	/// Bit scan forward.			/// Bit scan forward.
	BSF,			BSF,
	/// Bit scan reverse.			/// Bit scan reverse.
	BSR,			BSR,

	/// Double shift instructions. These correspond to			/// X86 funnel/double shift i16 instructions. These correspond to
	/// X86::SHLDxx and X86::SHRDxx instructions.			/// X86::SHLDWx and X86::SHRDWx instructions which have different amt
				craig.topper: What is the 'x' after the W?
				RKSimon: The instruction suffix for the rr/rm/mr cases - I'll remove it.
				RKSimon: @craig.topper Other than this are you ok with the patch?
				craig.topper: Yeah
	SHLD,			/// modulo rules to generic funnel shifts.
	SHRD,			FSHL,
				FSHR,

	/// Bitwise logical AND of floating point values. This corresponds			/// Bitwise logical AND of floating point values. This corresponds
	/// to X86::ANDPS or X86::ANDPD.			/// to X86::ANDPS or X86::ANDPD.
	FAND,			FAND,

	/// Bitwise logical OR of floating point values. This corresponds			/// Bitwise logical OR of floating point values. This corresponds
	/// to X86::ORPS or X86::ORPD.			/// to X86::ORPS or X86::ORPD.
	FOR,			FOR,
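
The X86ISD::FSHL/FSHR comment above refers to the i16 instructions having different shift-amount modulo rules from generic funnel shifts. A minimal C++ sketch of that difference follows. `fshl16` follows the generic funnel shift definition (amount modulo 16), while `shld16Model` is a simplified, hypothetical model that only captures the count being reduced modulo 32; it is an illustration, not the instruction's full architectural behaviour, and neither function is part of the patch.

```cpp
#include <cstdint>

// Generic funnel shift left on i16: concatenate Hi:Lo, shift left by
// Amt modulo 16, and keep the high 16 bits.
uint16_t fshl16(uint16_t Hi, uint16_t Lo, unsigned Amt) {
  unsigned S = Amt % 16;
  if (S == 0)
    return Hi;
  return static_cast<uint16_t>((Hi << S) | (Lo >> (16 - S)));
}

// Simplified, hypothetical model of 16-bit SHLD: the count is reduced
// modulo 32 rather than modulo 16. Counts above 16 are modelled here by
// plain shifting of the 32-bit concatenation (an assumption of this
// sketch); results for such counts should not be relied upon on hardware.
uint16_t shld16Model(uint16_t Dst, uint16_t Src, unsigned Cnt) {
  unsigned C = Cnt & 31;
  uint32_t Wide = (static_cast<uint32_t>(Dst) << 16) | Src;
  return static_cast<uint16_t>((Wide << C) >> 16);
}
```

With an amount of 20, `fshl16` behaves like a shift by 4, whereas the model shifts by 20. Masking the amount with 15 first, as the i16 path of LowerFunnelShift does, makes the two agree.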

llvm/lib/Target/X86/X86ISelLowering.cpp


X86TargetLowering::X86TargetLowering(const X86TargetMachine &TM,
if (Subtarget.hasCMov()) {		if (Subtarget.hasCMov()) {
setOperationAction(ISD::ABS , MVT::i16 , Custom);		setOperationAction(ISD::ABS , MVT::i16 , Custom);
setOperationAction(ISD::ABS , MVT::i32 , Custom);		setOperationAction(ISD::ABS , MVT::i32 , Custom);
}		}
setOperationAction(ISD::ABS , MVT::i64 , Custom);		setOperationAction(ISD::ABS , MVT::i64 , Custom);

// Funnel shifts.		// Funnel shifts.
for (auto ShiftOp : {ISD::FSHL, ISD::FSHR}) {		for (auto ShiftOp : {ISD::FSHL, ISD::FSHR}) {
		// For slow shld targets we only lower for code size.
		LegalizeAction ShiftDoubleAction = Subtarget.isSHLDSlow() ? Custom : Legal;

setOperationAction(ShiftOp , MVT::i16 , Custom);		setOperationAction(ShiftOp , MVT::i16 , Custom);
setOperationAction(ShiftOp , MVT::i32 , Custom);		setOperationAction(ShiftOp , MVT::i32 , ShiftDoubleAction);
		Lint: Pre-merge checks: clang-format: please reformat the code
		- setOperationAction(ShiftOp , MVT::i32 , ShiftDoubleAction);
		+ setOperationAction(ShiftOp, MVT::i32, ShiftDoubleAction);
if (Subtarget.is64Bit())		if (Subtarget.is64Bit())
setOperationAction(ShiftOp , MVT::i64 , Custom);		setOperationAction(ShiftOp , MVT::i64 , ShiftDoubleAction);
		Lint: Pre-merge checks: clang-format: please reformat the code
		- setOperationAction(ShiftOp , MVT::i64 , ShiftDoubleAction);
		+ setOperationAction(ShiftOp, MVT::i64, ShiftDoubleAction);
}		}

if (!Subtarget.useSoftFloat()) {		if (!Subtarget.useSoftFloat()) {
// Promote all UINT_TO_FP to larger SINT_TO_FP's, as X86 doesn't have this		// Promote all UINT_TO_FP to larger SINT_TO_FP's, as X86 doesn't have this
// operation.		// operation.
setOperationAction(ISD::UINT_TO_FP, MVT::i8, Promote);		setOperationAction(ISD::UINT_TO_FP, MVT::i8, Promote);
setOperationAction(ISD::STRICT_UINT_TO_FP, MVT::i8, Promote);		setOperationAction(ISD::STRICT_UINT_TO_FP, MVT::i8, Promote);
setOperationAction(ISD::UINT_TO_FP, MVT::i16, Promote);		setOperationAction(ISD::UINT_TO_FP, MVT::i16, Promote);
...
static SDValue LowerFunnelShift(SDValue Op, const X86Subtarget &Subtarget,
assert((VT == MVT::i16 || VT == MVT::i32 || VT == MVT::i64) &&		assert((VT == MVT::i16 || VT == MVT::i32 || VT == MVT::i64) &&
"Unexpected funnel shift type!");		"Unexpected funnel shift type!");

// Expand slow SHLD/SHRD cases if we are not optimizing for size.		// Expand slow SHLD/SHRD cases if we are not optimizing for size.
bool OptForSize = DAG.shouldOptForSize();		bool OptForSize = DAG.shouldOptForSize();
if (!OptForSize && Subtarget.isSHLDSlow())		if (!OptForSize && Subtarget.isSHLDSlow())
return SDValue();		return SDValue();

if (IsFSHR)
std::swap(Op0, Op1);

// i16 needs to modulo the shift amount, but i32/i64 have implicit modulo.		// i16 needs to modulo the shift amount, but i32/i64 have implicit modulo.
if (VT == MVT::i16)		if (VT == MVT::i16) {
Amt = DAG.getNode(ISD::AND, DL, Amt.getValueType(), Amt,		Amt = DAG.getNode(ISD::AND, DL, Amt.getValueType(), Amt,
DAG.getConstant(15, DL, Amt.getValueType()));		DAG.getConstant(15, DL, Amt.getValueType()));
		unsigned FSHOp = (IsFSHR ? X86ISD::FSHR : X86ISD::FSHL);
		return DAG.getNode(FSHOp, DL, VT, Op0, Op1, Amt);
		}

unsigned SHDOp = (IsFSHR ? X86ISD::SHRD : X86ISD::SHLD);		return Op;
return DAG.getNode(SHDOp, DL, VT, Op0, Op1, Amt);
}		}
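
The legalization and lowering decisions in the two hunks above can be summarised in a small standalone C++ sketch. The `Action` and `Lowering` enums and both function names are hypothetical stand-ins introduced for illustration; they are not LLVM types, and the real code operates on SelectionDAG nodes rather than plain integers.

```cpp
// Hypothetical stand-in for the LegalizeAction values used in the diff.
enum class Action { Legal, Custom };

// Hypothetical summary of what the custom lowering hook ends up doing.
enum class Lowering { Expand, EmitX86Node, KeepGeneric };

// Mirrors the setOperationAction calls: i16 is always Custom, while
// i32/i64 are Custom only on targets where SHLD is slow.
Action funnelShiftAction(unsigned Bits, bool IsSHLDSlow) {
  if (Bits == 16)
    return Action::Custom;
  return IsSHLDSlow ? Action::Custom : Action::Legal;
}

// Mirrors the control flow of the new LowerFunnelShift body.
Lowering lowerFunnelShift(unsigned Bits, bool IsSHLDSlow, bool OptForSize,
                          unsigned &Amt) {
  // Slow SHLD/SHRD targets expand unless optimizing for size
  // (the diff returns an empty SDValue here).
  if (!OptForSize && IsSHLDSlow)
    return Lowering::Expand;
  // i16 needs an explicit modulo-16 amount before using the X86 node.
  if (Bits == 16) {
    Amt &= 15;
    return Lowering::EmitX86Node; // X86ISD::FSHL or X86ISD::FSHR
  }
  // i32/i64 keep the generic ISD::FSHL/FSHR node (the diff returns Op).
  return Lowering::KeepGeneric;
}
```

Under these assumptions, an i32 funnel shift on a fast-SHLD target never reaches the custom hook at all, which is what lets DAG combines keep working on the generic node after legalization.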

// Try to use a packed vector operation to handle i64 on 32-bit targets when		// Try to use a packed vector operation to handle i64 on 32-bit targets when
// AVX512DQ is enabled.		// AVX512DQ is enabled.
static SDValue LowerI64IntToFP_AVX512DQ(SDValue Op, SelectionDAG &DAG,		static SDValue LowerI64IntToFP_AVX512DQ(SDValue Op, SelectionDAG &DAG,
const X86Subtarget &Subtarget) {		const X86Subtarget &Subtarget) {
assert((Op.getOpcode() == ISD::SINT_TO_FP ||		assert((Op.getOpcode() == ISD::SINT_TO_FP ||
Op.getOpcode() == ISD::STRICT_SINT_TO_FP ||		Op.getOpcode() == ISD::STRICT_SINT_TO_FP ||
...
}		}

const char *X86TargetLowering::getTargetNodeName(unsigned Opcode) const {		const char *X86TargetLowering::getTargetNodeName(unsigned Opcode) const {
switch ((X86ISD::NodeType)Opcode) {		switch ((X86ISD::NodeType)Opcode) {
case X86ISD::FIRST_NUMBER: break;		case X86ISD::FIRST_NUMBER: break;
#define NODE_NAME_CASE(NODE) case X86ISD::NODE: return "X86ISD::" #NODE;		#define NODE_NAME_CASE(NODE) case X86ISD::NODE: return "X86ISD::" #NODE;
NODE_NAME_CASE(BSF)		NODE_NAME_CASE(BSF)
NODE_NAME_CASE(BSR)		NODE_NAME_CASE(BSR)
NODE_NAME_CASE(SHLD)		NODE_NAME_CASE(FSHL)
NODE_NAME_CASE(SHRD)		NODE_NAME_CASE(FSHR)
NODE_NAME_CASE(FAND)		NODE_NAME_CASE(FAND)
NODE_NAME_CASE(FANDN)		NODE_NAME_CASE(FANDN)
NODE_NAME_CASE(FOR)		NODE_NAME_CASE(FOR)
NODE_NAME_CASE(FXOR)		NODE_NAME_CASE(FXOR)
NODE_NAME_CASE(FILD)		NODE_NAME_CASE(FILD)
NODE_NAME_CASE(FIST)		NODE_NAME_CASE(FIST)
NODE_NAME_CASE(FP_TO_INT_IN_MEM)		NODE_NAME_CASE(FP_TO_INT_IN_MEM)
NODE_NAME_CASE(FLD)		NODE_NAME_CASE(FLD)

llvm/lib/Target/X86/X86InstrCompiler.td

multiclass MaskedRotateAmountPats<SDNode frag, string name> {
def : Pat<(store (frag (loadi64 addr:$dst), (shiftMask64 CL)), addr:$dst),		def : Pat<(store (frag (loadi64 addr:$dst), (shiftMask64 CL)), addr:$dst),
(!cast<Instruction>(name # "64mCL") addr:$dst)>;		(!cast<Instruction>(name # "64mCL") addr:$dst)>;
}		}


defm : MaskedRotateAmountPats<rotl, "ROL">;		defm : MaskedRotateAmountPats<rotl, "ROL">;
defm : MaskedRotateAmountPats<rotr, "ROR">;		defm : MaskedRotateAmountPats<rotr, "ROR">;

// Double shift amount is implicitly masked.		// Double "funnel" shift amount is implicitly masked.
multiclass MaskedDoubleShiftAmountPats<SDNode frag, string name> {		// (fshl/fshr x (and y, 31)) ==> (fshl/fshr x, y) (NOTE: modulo32)
// (shift x (and y, 31)) ==> (shift x, y)		def : Pat<(X86fshl GR16:$src1, GR16:$src2, (shiftMask32 CL)),
def : Pat<(frag GR16:$src1, GR16:$src2, (shiftMask32 CL)),		(SHLD16rrCL GR16:$src1, GR16:$src2)>;
(!cast<Instruction>(name # "16rrCL") GR16:$src1, GR16:$src2)>;		def : Pat<(X86fshr GR16:$src2, GR16:$src1, (shiftMask32 CL)),
def : Pat<(frag GR32:$src1, GR32:$src2, (shiftMask32 CL)),		(SHRD16rrCL GR16:$src1, GR16:$src2)>;
(!cast<Instruction>(name # "32rrCL") GR32:$src1, GR32:$src2)>;
		// (fshl/fshr x (and y, 31)) ==> (fshl/fshr x, y)
// (shift x (and y, 63)) ==> (shift x, y)		def : Pat<(fshl GR32:$src1, GR32:$src2, (shiftMask32 CL)),
def : Pat<(frag GR64:$src1, GR64:$src2, (shiftMask32 CL)),		(SHLD32rrCL GR32:$src1, GR32:$src2)>;
(!cast<Instruction>(name # "64rrCL") GR64:$src1, GR64:$src2)>;		def : Pat<(fshr GR32:$src2, GR32:$src1, (shiftMask32 CL)),
}		(SHRD32rrCL GR32:$src1, GR32:$src2)>;

defm : MaskedDoubleShiftAmountPats<X86shld, "SHLD">;		// (fshl/fshr x (and y, 63)) ==> (fshl/fshr x, y)
defm : MaskedDoubleShiftAmountPats<X86shrd, "SHRD">;		def : Pat<(fshl GR64:$src1, GR64:$src2, (shiftMask64 CL)),
		(SHLD64rrCL GR64:$src1, GR64:$src2)>;
		def : Pat<(fshr GR64:$src2, GR64:$src1, (shiftMask64 CL)),
		(SHRD64rrCL GR64:$src1, GR64:$src2)>;
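
The new patterns rely on two properties: the shift amount of an i32 funnel shift is already taken modulo 32, so an explicit `and` with 31 is redundant, and SHRD takes its operands in the opposite order to `fshr`. A minimal C++ sketch of both follows. `shld32Model` and `shrd32Model` are simplified, hypothetical models of the register forms (destination first, count masked to 5 bits) written for illustration only.

```cpp
#include <cstdint>

// Generic funnel shifts on i32 (amount modulo 32).
uint32_t fshl32(uint32_t Hi, uint32_t Lo, unsigned Amt) {
  unsigned S = Amt % 32;
  return S == 0 ? Hi : (Hi << S) | (Lo >> (32 - S));
}

uint32_t fshr32(uint32_t Hi, uint32_t Lo, unsigned Amt) {
  unsigned S = Amt % 32;
  return S == 0 ? Lo : (Hi << (32 - S)) | (Lo >> S);
}

// Hypothetical model of SHLD32: Dst shifts left, low bits come from Src.
uint32_t shld32Model(uint32_t Dst, uint32_t Src, unsigned Cnt) {
  unsigned C = Cnt & 31;
  return C == 0 ? Dst : (Dst << C) | (Src >> (32 - C));
}

// Hypothetical model of SHRD32: Dst shifts right, high bits come from Src.
uint32_t shrd32Model(uint32_t Dst, uint32_t Src, unsigned Cnt) {
  unsigned C = Cnt & 31;
  return C == 0 ? Dst : (Dst >> C) | (Src << (32 - C));
}
```

In this model `shld32Model(a, b, n)` equals `fshl32(a, b, n)`, while `shrd32Model(a, b, n)` equals `fshr32(b, a, n)`, which is the operand swap visible in the `fshr GR32:$src2, GR32:$src1` patterns.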

let Predicates = [HasBMI2] in {		let Predicates = [HasBMI2] in {
let AddedComplexity = 1 in {		let AddedComplexity = 1 in {
def : Pat<(sra GR32:$src1, (shiftMask32 GR8:$src2)),		def : Pat<(sra GR32:$src1, (shiftMask32 GR8:$src2)),
(SARX32rr GR32:$src1,		(SARX32rr GR32:$src1,
(INSERT_SUBREG		(INSERT_SUBREG
(i32 (IMPLICIT_DEF)), GR8:$src2, sub_8bit))>;		(i32 (IMPLICIT_DEF)), GR8:$src2, sub_8bit))>;
def : Pat<(sra GR64:$src1, (shiftMask64 GR8:$src2)),		def : Pat<(sra GR64:$src1, (shiftMask64 GR8:$src2)),

llvm/lib/Target/X86/X86InstrInfo.td

	def X86MemBarrier : SDNode<"X86ISD::MEMBARRIER", SDT_X86MEMBARRIER,			def X86MemBarrier : SDNode<"X86ISD::MEMBARRIER", SDT_X86MEMBARRIER,
	[SDNPHasChain,SDNPSideEffect]>;			[SDNPHasChain,SDNPSideEffect]>;
	def X86MFence : SDNode<"X86ISD::MFENCE", SDT_X86MEMBARRIER,			def X86MFence : SDNode<"X86ISD::MFENCE", SDT_X86MEMBARRIER,
	[SDNPHasChain]>;			[SDNPHasChain]>;


	def X86bsf : SDNode<"X86ISD::BSF", SDTUnaryArithWithFlags>;			def X86bsf : SDNode<"X86ISD::BSF", SDTUnaryArithWithFlags>;
	def X86bsr : SDNode<"X86ISD::BSR", SDTUnaryArithWithFlags>;			def X86bsr : SDNode<"X86ISD::BSR", SDTUnaryArithWithFlags>;
	def X86shld : SDNode<"X86ISD::SHLD", SDTIntShiftDOp>;			def X86fshl : SDNode<"X86ISD::FSHL", SDTIntShiftDOp>;
	def X86shrd : SDNode<"X86ISD::SHRD", SDTIntShiftDOp>;			def X86fshr : SDNode<"X86ISD::FSHR", SDTIntShiftDOp>;

	def X86cmp : SDNode<"X86ISD::CMP" , SDTX86CmpTest>;			def X86cmp : SDNode<"X86ISD::CMP" , SDTX86CmpTest>;
	def X86fcmp : SDNode<"X86ISD::FCMP", SDTX86FCmp>;			def X86fcmp : SDNode<"X86ISD::FCMP", SDTX86FCmp>;
	def X86strict_fcmp : SDNode<"X86ISD::STRICT_FCMP", SDTX86FCmp, [SDNPHasChain]>;			def X86strict_fcmp : SDNode<"X86ISD::STRICT_FCMP", SDTX86FCmp, [SDNPHasChain]>;
	def X86strict_fcmps : SDNode<"X86ISD::STRICT_FCMPS", SDTX86FCmp, [SDNPHasChain]>;			def X86strict_fcmps : SDNode<"X86ISD::STRICT_FCMPS", SDTX86FCmp, [SDNPHasChain]>;
	def X86bt : SDNode<"X86ISD::BT", SDTX86CmpTest>;			def X86bt : SDNode<"X86ISD::BT", SDTX86CmpTest>;

	def X86cmov : SDNode<"X86ISD::CMOV", SDTX86Cmov>;			def X86cmov : SDNode<"X86ISD::CMOV", SDTX86Cmov>;

llvm/lib/Target/X86/X86InstrShiftRotate.td

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	let Constraints = "$src1 = $dst" in {			let Constraints = "$src1 = $dst" in {

	let Uses = [CL], SchedRW = [WriteSHDrrcl] in {			let Uses = [CL], SchedRW = [WriteSHDrrcl] in {
	def SHLD16rrCL : I<0xA5, MRMDestReg, (outs GR16:$dst),			def SHLD16rrCL : I<0xA5, MRMDestReg, (outs GR16:$dst),
	(ins GR16:$src1, GR16:$src2),			(ins GR16:$src1, GR16:$src2),
	"shld{w}\t{%cl, $src2, $dst|$dst, $src2, cl}",			"shld{w}\t{%cl, $src2, $dst|$dst, $src2, cl}",
	[(set GR16:$dst, (X86shld GR16:$src1, GR16:$src2, CL))]>,			[(set GR16:$dst, (X86fshl GR16:$src1, GR16:$src2, CL))]>,
	TB, OpSize16;			TB, OpSize16;
	def SHRD16rrCL : I<0xAD, MRMDestReg, (outs GR16:$dst),			def SHRD16rrCL : I<0xAD, MRMDestReg, (outs GR16:$dst),
	(ins GR16:$src1, GR16:$src2),			(ins GR16:$src1, GR16:$src2),
	"shrd{w}\t{%cl, $src2, $dst|$dst, $src2, cl}",			"shrd{w}\t{%cl, $src2, $dst|$dst, $src2, cl}",
	[(set GR16:$dst, (X86shrd GR16:$src1, GR16:$src2, CL))]>,			[(set GR16:$dst, (X86fshr GR16:$src2, GR16:$src1, CL))]>,
	TB, OpSize16;			TB, OpSize16;
	def SHLD32rrCL : I<0xA5, MRMDestReg, (outs GR32:$dst),			def SHLD32rrCL : I<0xA5, MRMDestReg, (outs GR32:$dst),
	(ins GR32:$src1, GR32:$src2),			(ins GR32:$src1, GR32:$src2),
	"shld{l}\t{%cl, $src2, $dst|$dst, $src2, cl}",			"shld{l}\t{%cl, $src2, $dst|$dst, $src2, cl}",
	[(set GR32:$dst, (X86shld GR32:$src1, GR32:$src2, CL))]>,			[(set GR32:$dst, (fshl GR32:$src1, GR32:$src2, CL))]>,
	TB, OpSize32;			TB, OpSize32;
	def SHRD32rrCL : I<0xAD, MRMDestReg, (outs GR32:$dst),			def SHRD32rrCL : I<0xAD, MRMDestReg, (outs GR32:$dst),
	(ins GR32:$src1, GR32:$src2),			(ins GR32:$src1, GR32:$src2),
	"shrd{l}\t{%cl, $src2, $dst|$dst, $src2, cl}",			"shrd{l}\t{%cl, $src2, $dst|$dst, $src2, cl}",
	[(set GR32:$dst, (X86shrd GR32:$src1, GR32:$src2, CL))]>,			[(set GR32:$dst, (fshr GR32:$src2, GR32:$src1, CL))]>,
	TB, OpSize32;			TB, OpSize32;
	def SHLD64rrCL : RI<0xA5, MRMDestReg, (outs GR64:$dst),			def SHLD64rrCL : RI<0xA5, MRMDestReg, (outs GR64:$dst),
	(ins GR64:$src1, GR64:$src2),			(ins GR64:$src1, GR64:$src2),
	"shld{q}\t{%cl, $src2, $dst|$dst, $src2, cl}",			"shld{q}\t{%cl, $src2, $dst|$dst, $src2, cl}",
	[(set GR64:$dst, (X86shld GR64:$src1, GR64:$src2, CL))]>,			[(set GR64:$dst, (fshl GR64:$src1, GR64:$src2, CL))]>,
	TB;			TB;
	def SHRD64rrCL : RI<0xAD, MRMDestReg, (outs GR64:$dst),			def SHRD64rrCL : RI<0xAD, MRMDestReg, (outs GR64:$dst),
	(ins GR64:$src1, GR64:$src2),			(ins GR64:$src1, GR64:$src2),
	"shrd{q}\t{%cl, $src2, $dst|$dst, $src2, cl}",			"shrd{q}\t{%cl, $src2, $dst|$dst, $src2, cl}",
	[(set GR64:$dst, (X86shrd GR64:$src1, GR64:$src2, CL))]>,			[(set GR64:$dst, (fshr GR64:$src2, GR64:$src1, CL))]>,
	TB;			TB;
	} // SchedRW			} // SchedRW

	let isCommutable = 1, SchedRW = [WriteSHDrri] in { // These instructions commute to each other.			let isCommutable = 1, SchedRW = [WriteSHDrri] in { // These instructions commute to each other.
	def SHLD16rri8 : Ii8<0xA4, MRMDestReg,			def SHLD16rri8 : Ii8<0xA4, MRMDestReg,
	(outs GR16:$dst),			(outs GR16:$dst),
	(ins GR16:$src1, GR16:$src2, u8imm:$src3),			(ins GR16:$src1, GR16:$src2, u8imm:$src3),
	"shld{w}\t{$src3, $src2, $dst|$dst, $src2, $src3}",			"shld{w}\t{$src3, $src2, $dst|$dst, $src2, $src3}",
	[(set GR16:$dst, (X86shld GR16:$src1, GR16:$src2,			[(set GR16:$dst, (X86fshl GR16:$src1, GR16:$src2,
	(i8 imm:$src3)))]>,			(i8 imm:$src3)))]>,
	TB, OpSize16;			TB, OpSize16;
	def SHRD16rri8 : Ii8<0xAC, MRMDestReg,			def SHRD16rri8 : Ii8<0xAC, MRMDestReg,
	(outs GR16:$dst),			(outs GR16:$dst),
	(ins GR16:$src1, GR16:$src2, u8imm:$src3),			(ins GR16:$src1, GR16:$src2, u8imm:$src3),
	"shrd{w}\t{$src3, $src2, $dst|$dst, $src2, $src3}",			"shrd{w}\t{$src3, $src2, $dst|$dst, $src2, $src3}",
	[(set GR16:$dst, (X86shrd GR16:$src1, GR16:$src2,			[(set GR16:$dst, (X86fshr GR16:$src2, GR16:$src1,
	(i8 imm:$src3)))]>,			(i8 imm:$src3)))]>,
	TB, OpSize16;			TB, OpSize16;
	def SHLD32rri8 : Ii8<0xA4, MRMDestReg,			def SHLD32rri8 : Ii8<0xA4, MRMDestReg,
	(outs GR32:$dst),			(outs GR32:$dst),
	(ins GR32:$src1, GR32:$src2, u8imm:$src3),			(ins GR32:$src1, GR32:$src2, u8imm:$src3),
	"shld{l}\t{$src3, $src2, $dst|$dst, $src2, $src3}",			"shld{l}\t{$src3, $src2, $dst|$dst, $src2, $src3}",
	[(set GR32:$dst, (X86shld GR32:$src1, GR32:$src2,			[(set GR32:$dst, (fshl GR32:$src1, GR32:$src2,
	(i8 imm:$src3)))]>,			(i8 imm:$src3)))]>,
	TB, OpSize32;			TB, OpSize32;
	def SHRD32rri8 : Ii8<0xAC, MRMDestReg,			def SHRD32rri8 : Ii8<0xAC, MRMDestReg,
	(outs GR32:$dst),			(outs GR32:$dst),
	(ins GR32:$src1, GR32:$src2, u8imm:$src3),			(ins GR32:$src1, GR32:$src2, u8imm:$src3),
	"shrd{l}\t{$src3, $src2, $dst|$dst, $src2, $src3}",			"shrd{l}\t{$src3, $src2, $dst|$dst, $src2, $src3}",
	[(set GR32:$dst, (X86shrd GR32:$src1, GR32:$src2,			[(set GR32:$dst, (fshr GR32:$src2, GR32:$src1,
	(i8 imm:$src3)))]>,			(i8 imm:$src3)))]>,
	TB, OpSize32;			TB, OpSize32;
	def SHLD64rri8 : RIi8<0xA4, MRMDestReg,			def SHLD64rri8 : RIi8<0xA4, MRMDestReg,
	(outs GR64:$dst),			(outs GR64:$dst),
	(ins GR64:$src1, GR64:$src2, u8imm:$src3),			(ins GR64:$src1, GR64:$src2, u8imm:$src3),
	"shld{q}\t{$src3, $src2, $dst|$dst, $src2, $src3}",			"shld{q}\t{$src3, $src2, $dst|$dst, $src2, $src3}",
	[(set GR64:$dst, (X86shld GR64:$src1, GR64:$src2,			[(set GR64:$dst, (fshl GR64:$src1, GR64:$src2,
	(i8 imm:$src3)))]>,			(i8 imm:$src3)))]>,
	TB;			TB;
	def SHRD64rri8 : RIi8<0xAC, MRMDestReg,			def SHRD64rri8 : RIi8<0xAC, MRMDestReg,
	(outs GR64:$dst),			(outs GR64:$dst),
	(ins GR64:$src1, GR64:$src2, u8imm:$src3),			(ins GR64:$src1, GR64:$src2, u8imm:$src3),
	"shrd{q}\t{$src3, $src2, $dst|$dst, $src2, $src3}",			"shrd{q}\t{$src3, $src2, $dst|$dst, $src2, $src3}",
	[(set GR64:$dst, (X86shrd GR64:$src1, GR64:$src2,			[(set GR64:$dst, (fshr GR64:$src2, GR64:$src1,
	(i8 imm:$src3)))]>,			(i8 imm:$src3)))]>,
	TB;			TB;
	} // SchedRW			} // SchedRW
	} // Constraints = "$src = $dst"			} // Constraints = "$src = $dst"

	let Uses = [CL], SchedRW = [WriteSHDmrcl] in {			let Uses = [CL], SchedRW = [WriteSHDmrcl] in {
	def SHLD16mrCL : I<0xA5, MRMDestMem, (outs), (ins i16mem:$dst, GR16:$src2),			def SHLD16mrCL : I<0xA5, MRMDestMem, (outs), (ins i16mem:$dst, GR16:$src2),
	"shld{w}\t{%cl, $src2, $dst|$dst, $src2, cl}",			"shld{w}\t{%cl, $src2, $dst|$dst, $src2, cl}",
	[(store (X86shld (loadi16 addr:$dst), GR16:$src2, CL),			[(store (X86fshl (loadi16 addr:$dst), GR16:$src2, CL),
	addr:$dst)]>, TB, OpSize16;			addr:$dst)]>, TB, OpSize16;
	def SHRD16mrCL : I<0xAD, MRMDestMem, (outs), (ins i16mem:$dst, GR16:$src2),			def SHRD16mrCL : I<0xAD, MRMDestMem, (outs), (ins i16mem:$dst, GR16:$src2),
	"shrd{w}\t{%cl, $src2, $dst|$dst, $src2, cl}",			"shrd{w}\t{%cl, $src2, $dst|$dst, $src2, cl}",
	[(store (X86shrd (loadi16 addr:$dst), GR16:$src2, CL),			[(store (X86fshr GR16:$src2, (loadi16 addr:$dst), CL),
	addr:$dst)]>, TB, OpSize16;			addr:$dst)]>, TB, OpSize16;

	def SHLD32mrCL : I<0xA5, MRMDestMem, (outs), (ins i32mem:$dst, GR32:$src2),			def SHLD32mrCL : I<0xA5, MRMDestMem, (outs), (ins i32mem:$dst, GR32:$src2),
	"shld{l}\t{%cl, $src2, $dst|$dst, $src2, cl}",			"shld{l}\t{%cl, $src2, $dst|$dst, $src2, cl}",
	[(store (X86shld (loadi32 addr:$dst), GR32:$src2, CL),			[(store (fshl (loadi32 addr:$dst), GR32:$src2, CL),
	addr:$dst)]>, TB, OpSize32;			addr:$dst)]>, TB, OpSize32;
	def SHRD32mrCL : I<0xAD, MRMDestMem, (outs), (ins i32mem:$dst, GR32:$src2),			def SHRD32mrCL : I<0xAD, MRMDestMem, (outs), (ins i32mem:$dst, GR32:$src2),
	"shrd{l}\t{%cl, $src2, $dst|$dst, $src2, cl}",			"shrd{l}\t{%cl, $src2, $dst|$dst, $src2, cl}",
	[(store (X86shrd (loadi32 addr:$dst), GR32:$src2, CL),			[(store (fshr GR32:$src2, (loadi32 addr:$dst), CL),
	addr:$dst)]>, TB, OpSize32;			addr:$dst)]>, TB, OpSize32;

	def SHLD64mrCL : RI<0xA5, MRMDestMem, (outs), (ins i64mem:$dst, GR64:$src2),			def SHLD64mrCL : RI<0xA5, MRMDestMem, (outs), (ins i64mem:$dst, GR64:$src2),
	"shld{q}\t{%cl, $src2, $dst|$dst, $src2, cl}",			"shld{q}\t{%cl, $src2, $dst|$dst, $src2, cl}",
	[(store (X86shld (loadi64 addr:$dst), GR64:$src2, CL),			[(store (fshl (loadi64 addr:$dst), GR64:$src2, CL),
	addr:$dst)]>, TB;			addr:$dst)]>, TB;
	def SHRD64mrCL : RI<0xAD, MRMDestMem, (outs), (ins i64mem:$dst, GR64:$src2),			def SHRD64mrCL : RI<0xAD, MRMDestMem, (outs), (ins i64mem:$dst, GR64:$src2),
	"shrd{q}\t{%cl, $src2, $dst|$dst, $src2, cl}",			"shrd{q}\t{%cl, $src2, $dst|$dst, $src2, cl}",
	[(store (X86shrd (loadi64 addr:$dst), GR64:$src2, CL),			[(store (fshr GR64:$src2, (loadi64 addr:$dst), CL),
	addr:$dst)]>, TB;			addr:$dst)]>, TB;
	} // SchedRW			} // SchedRW

	let SchedRW = [WriteSHDmri] in {			let SchedRW = [WriteSHDmri] in {
	def SHLD16mri8 : Ii8<0xA4, MRMDestMem,			def SHLD16mri8 : Ii8<0xA4, MRMDestMem,
	(outs), (ins i16mem:$dst, GR16:$src2, u8imm:$src3),			(outs), (ins i16mem:$dst, GR16:$src2, u8imm:$src3),
	"shld{w}\t{$src3, $src2, $dst|$dst, $src2, $src3}",			"shld{w}\t{$src3, $src2, $dst|$dst, $src2, $src3}",
	[(store (X86shld (loadi16 addr:$dst), GR16:$src2,			[(store (X86fshl (loadi16 addr:$dst), GR16:$src2,
	(i8 imm:$src3)), addr:$dst)]>,			(i8 imm:$src3)), addr:$dst)]>,
	TB, OpSize16;			TB, OpSize16;
	def SHRD16mri8 : Ii8<0xAC, MRMDestMem,			def SHRD16mri8 : Ii8<0xAC, MRMDestMem,
	(outs), (ins i16mem:$dst, GR16:$src2, u8imm:$src3),			(outs), (ins i16mem:$dst, GR16:$src2, u8imm:$src3),
	"shrd{w}\t{$src3, $src2, $dst|$dst, $src2, $src3}",			"shrd{w}\t{$src3, $src2, $dst|$dst, $src2, $src3}",
	[(store (X86shrd (loadi16 addr:$dst), GR16:$src2,			[(store (X86fshr GR16:$src2, (loadi16 addr:$dst),
	(i8 imm:$src3)), addr:$dst)]>,			(i8 imm:$src3)), addr:$dst)]>,
	TB, OpSize16;			TB, OpSize16;

	def SHLD32mri8 : Ii8<0xA4, MRMDestMem,			def SHLD32mri8 : Ii8<0xA4, MRMDestMem,
	(outs), (ins i32mem:$dst, GR32:$src2, u8imm:$src3),			(outs), (ins i32mem:$dst, GR32:$src2, u8imm:$src3),
	"shld{l}\t{$src3, $src2, $dst|$dst, $src2, $src3}",			"shld{l}\t{$src3, $src2, $dst|$dst, $src2, $src3}",
	[(store (X86shld (loadi32 addr:$dst), GR32:$src2,			[(store (fshl (loadi32 addr:$dst), GR32:$src2,
	(i8 imm:$src3)), addr:$dst)]>,			(i8 imm:$src3)), addr:$dst)]>,
	TB, OpSize32;			TB, OpSize32;
	def SHRD32mri8 : Ii8<0xAC, MRMDestMem,			def SHRD32mri8 : Ii8<0xAC, MRMDestMem,
	(outs), (ins i32mem:$dst, GR32:$src2, u8imm:$src3),			(outs), (ins i32mem:$dst, GR32:$src2, u8imm:$src3),
	"shrd{l}\t{$src3, $src2, $dst\|$dst, $src2, $src3}",			"shrd{l}\t{$src3, $src2, $dst\|$dst, $src2, $src3}",
	[(store (X86shrd (loadi32 addr:$dst), GR32:$src2,			[(store (fshr GR32:$src2, (loadi32 addr:$dst),
	(i8 imm:$src3)), addr:$dst)]>,			(i8 imm:$src3)), addr:$dst)]>,
	TB, OpSize32;			TB, OpSize32;

	def SHLD64mri8 : RIi8<0xA4, MRMDestMem,			def SHLD64mri8 : RIi8<0xA4, MRMDestMem,
	(outs), (ins i64mem:$dst, GR64:$src2, u8imm:$src3),			(outs), (ins i64mem:$dst, GR64:$src2, u8imm:$src3),
	"shld{q}\t{$src3, $src2, $dst\|$dst, $src2, $src3}",			"shld{q}\t{$src3, $src2, $dst\|$dst, $src2, $src3}",
	[(store (X86shld (loadi64 addr:$dst), GR64:$src2,			[(store (fshl (loadi64 addr:$dst), GR64:$src2,
	(i8 imm:$src3)), addr:$dst)]>,			(i8 imm:$src3)), addr:$dst)]>,
	TB;			TB;
	def SHRD64mri8 : RIi8<0xAC, MRMDestMem,			def SHRD64mri8 : RIi8<0xAC, MRMDestMem,
	(outs), (ins i64mem:$dst, GR64:$src2, u8imm:$src3),			(outs), (ins i64mem:$dst, GR64:$src2, u8imm:$src3),
	"shrd{q}\t{$src3, $src2, $dst\|$dst, $src2, $src3}",			"shrd{q}\t{$src3, $src2, $dst\|$dst, $src2, $src3}",
	[(store (X86shrd (loadi64 addr:$dst), GR64:$src2,			[(store (fshr GR64:$src2, (loadi64 addr:$dst),
	(i8 imm:$src3)), addr:$dst)]>,			(i8 imm:$src3)), addr:$dst)]>,
	TB;			TB;
	} // SchedRW			} // SchedRW

	} // Defs = [EFLAGS]			} // Defs = [EFLAGS]

	// Use the opposite rotate if allows us to use the rotate by 1 instruction.			// Use the opposite rotate if allows us to use the rotate by 1 instruction.
	def : Pat<(rotl GR8:$src1, (i8 7)), (ROR8r1 GR8:$src1)>;			def : Pat<(rotl GR8:$src1, (i8 7)), (ROR8r1 GR8:$src1)>;
	def : Pat<(rotl GR16:$src1, (i8 15)), (ROR16r1 GR16:$src1)>;			def : Pat<(rotl GR16:$src1, (i8 15)), (ROR16r1 GR16:$src1)>;
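
Annotation on the SHLD/SHRD pattern changes above: a minimal Python sketch, not LLVM code, of why the SHLD patterns keep their operand order under `fshl` while the SHRD patterns swap the two value operands under `fshr`. The names `fshl32`, `fshr32`, `shld32` and `shrd32` are illustrative only, and the model assumes 32-bit operands with the count masked to 5 bits. The i16 instructions are not modelled here; per the summary they need the separate X86fshl/X86fshr nodes because the hardware count is modulo 32 rather than modulo 16.

```python
MASK32 = 0xFFFFFFFF

def fshl32(a, b, amt):
    """Generic funnel shift left: high 32 bits of (a:b) << (amt mod 32)."""
    amt %= 32
    wide = (a << 32) | b
    return ((wide << amt) >> 32) & MASK32

def fshr32(a, b, amt):
    """Generic funnel shift right: low 32 bits of (a:b) >> (amt mod 32)."""
    amt %= 32
    wide = (a << 32) | b
    return (wide >> amt) & MASK32

def shld32(dst, src, amt):
    """Model of 32-bit SHLD: dst shifts left, vacated bits come from the top of src."""
    amt &= 31
    if amt == 0:
        return dst
    return ((dst << amt) | (src >> (32 - amt))) & MASK32

def shrd32(dst, src, amt):
    """Model of 32-bit SHRD: dst shifts right, vacated bits come from the bottom of src."""
    amt &= 31
    if amt == 0:
        return dst
    return ((dst >> amt) | (src << (32 - amt))) & MASK32
```

Under this model `shld32(dst, src, amt)` agrees with `fshl32(dst, src, amt)`, whereas `shrd32(dst, src, amt)` agrees with `fshr32(src, dst, amt)`, which is the operand order used by the new `fshr` patterns.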
	[...]

llvm/test/CodeGen/X86/clear-highbits.ll

; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py		; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
; RUN: llc -mtriple=i686-unknown-linux-gnu -mattr=-bmi,-tbm,-bmi2 < %s \| FileCheck %s --check-prefixes=CHECK,X86,NOBMI2,X86-NOBMI2,FALLBACK0,X86-FALLBACK0		; RUN: llc -mtriple=i686-unknown-linux-gnu -mattr=-bmi,-tbm,-bmi2 < %s \| FileCheck %s --check-prefixes=CHECK,X86,NOBMI2,X86-NOBMI2,FALLBACK0,X86-FALLBACK0
; RUN: llc -mtriple=i686-unknown-linux-gnu -mattr=+bmi,-tbm,-bmi2 < %s \| FileCheck %s --check-prefixes=CHECK,X86,NOBMI2,X86-NOBMI2,FALLBACK1,X86-FALLBACK1		; RUN: llc -mtriple=i686-unknown-linux-gnu -mattr=+bmi,-tbm,-bmi2 < %s \| FileCheck %s --check-prefixes=CHECK,X86,NOBMI2,X86-NOBMI2,FALLBACK1,X86-FALLBACK1
; RUN: llc -mtriple=i686-unknown-linux-gnu -mattr=+bmi,+tbm,-bmi2 < %s \| FileCheck %s --check-prefixes=CHECK,X86,NOBMI2,X86-NOBMI2,FALLBACK2,X86-FALLBACK2		; RUN: llc -mtriple=i686-unknown-linux-gnu -mattr=+bmi,+tbm,-bmi2 < %s \| FileCheck %s --check-prefixes=CHECK,X86,NOBMI2,X86-NOBMI2,FALLBACK2,X86-FALLBACK2
; RUN: llc -mtriple=i686-unknown-linux-gnu -mattr=+bmi,+tbm,+bmi2 < %s \| FileCheck %s --check-prefixes=CHECK,X86,BMI2,X86-BMI2,FALLBACK3,X86-FALLBACK3		; RUN: llc -mtriple=i686-unknown-linux-gnu -mattr=+bmi,+tbm,+bmi2 < %s \| FileCheck %s --check-prefixes=CHECK,X86,BMI2,X86-BMI2,FALLBACK3,X86-FALLBACK3
; RUN: llc -mtriple=i686-unknown-linux-gnu -mattr=+bmi,-tbm,+bmi2 < %s \| FileCheck %s --check-prefixes=CHECK,X86,BMI2,X86-BMI2,FALLBACK4,X86-FALLBACK4		; RUN: llc -mtriple=i686-unknown-linux-gnu -mattr=+bmi,-tbm,+bmi2 < %s \| FileCheck %s --check-prefixes=CHECK,X86,BMI2,X86-BMI2,FALLBACK4,X86-FALLBACK4
		RKSimon (author): @lebedev.ri Apart from the X86-FALLBACK0 case, can't we enable +cmov on these x86 targets? I can't think of a target that would support any BMI/TBM level without CMOV support.
		lebedev.ri: I don't see why not.
; RUN: llc -mtriple=x86_64-unknown-linux-gnu -mattr=-bmi,-tbm,-bmi2 < %s \| FileCheck %s --check-prefixes=CHECK,X64,NOBMI2,X64-NOBMI2,FALLBACK0,X64-FALLBACK0		; RUN: llc -mtriple=x86_64-unknown-linux-gnu -mattr=-bmi,-tbm,-bmi2 < %s \| FileCheck %s --check-prefixes=CHECK,X64,NOBMI2,X64-NOBMI2,FALLBACK0,X64-FALLBACK0
; RUN: llc -mtriple=x86_64-unknown-linux-gnu -mattr=+bmi,-tbm,-bmi2 < %s \| FileCheck %s --check-prefixes=CHECK,X64,NOBMI2,X64-NOBMI2,FALLBACK1,X64-FALLBACK1		; RUN: llc -mtriple=x86_64-unknown-linux-gnu -mattr=+bmi,-tbm,-bmi2 < %s \| FileCheck %s --check-prefixes=CHECK,X64,NOBMI2,X64-NOBMI2,FALLBACK1,X64-FALLBACK1
; RUN: llc -mtriple=x86_64-unknown-linux-gnu -mattr=+bmi,+tbm,-bmi2 < %s \| FileCheck %s --check-prefixes=CHECK,X64,NOBMI2,X64-NOBMI2,FALLBACK2,X64-FALLBACK2		; RUN: llc -mtriple=x86_64-unknown-linux-gnu -mattr=+bmi,+tbm,-bmi2 < %s \| FileCheck %s --check-prefixes=CHECK,X64,NOBMI2,X64-NOBMI2,FALLBACK2,X64-FALLBACK2
; RUN: llc -mtriple=x86_64-unknown-linux-gnu -mattr=+bmi,+tbm,+bmi2 < %s \| FileCheck %s --check-prefixes=CHECK,X64,BMI2,X64-BMI2,FALLBACK3,X64-FALLBACK3		; RUN: llc -mtriple=x86_64-unknown-linux-gnu -mattr=+bmi,+tbm,+bmi2 < %s \| FileCheck %s --check-prefixes=CHECK,X64,BMI2,X64-BMI2,FALLBACK3,X64-FALLBACK3
; RUN: llc -mtriple=x86_64-unknown-linux-gnu -mattr=+bmi,-tbm,+bmi2 < %s \| FileCheck %s --check-prefixes=CHECK,X64,BMI2,X64-BMI2,FALLBACK4,X64-FALLBACK4		; RUN: llc -mtriple=x86_64-unknown-linux-gnu -mattr=+bmi,-tbm,+bmi2 < %s \| FileCheck %s --check-prefixes=CHECK,X64,BMI2,X64-BMI2,FALLBACK4,X64-FALLBACK4

; Patterns:		; Patterns:
; c) x & (-1 >> y)		; c) x & (-1 >> y)
[...]

; ---------------------------------------------------------------------------- ;		; ---------------------------------------------------------------------------- ;
; 64-bit		; 64-bit
; ---------------------------------------------------------------------------- ;		; ---------------------------------------------------------------------------- ;

define i64 @clear_highbits64_c0(i64 %val, i64 %numhighbits) nounwind {		define i64 @clear_highbits64_c0(i64 %val, i64 %numhighbits) nounwind {
; X86-NOBMI2-LABEL: clear_highbits64_c0:		; X86-NOBMI2-LABEL: clear_highbits64_c0:
; X86-NOBMI2: # %bb.0:		; X86-NOBMI2: # %bb.0:
		; X86-NOBMI2-NEXT: pushl %esi
; X86-NOBMI2-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-NOBMI2-NEXT: movb {{[0-9]+}}(%esp), %cl
; X86-NOBMI2-NEXT: movl $-1, %eax		; X86-NOBMI2-NEXT: movl $-1, %eax
; X86-NOBMI2-NEXT: movl $-1, %edx		; X86-NOBMI2-NEXT: movl $-1, %esi
; X86-NOBMI2-NEXT: shrl %cl, %edx		; X86-NOBMI2-NEXT: shrl %cl, %esi
; X86-NOBMI2-NEXT: shrdl %cl, %eax, %eax
; X86-NOBMI2-NEXT: testb $32, %cl
; X86-NOBMI2-NEXT: je .LBB13_2
; X86-NOBMI2-NEXT: # %bb.1:
; X86-NOBMI2-NEXT: movl %edx, %eax
; X86-NOBMI2-NEXT: xorl %edx, %edx		; X86-NOBMI2-NEXT: xorl %edx, %edx
; X86-NOBMI2-NEXT: .LBB13_2:		; X86-NOBMI2-NEXT: testb $32, %cl
		; X86-NOBMI2-NEXT: jne .LBB13_1
		; X86-NOBMI2-NEXT: # %bb.2:
		; X86-NOBMI2-NEXT: movl %esi, %edx
		; X86-NOBMI2-NEXT: jmp .LBB13_3
		; X86-NOBMI2-NEXT: .LBB13_1:
		; X86-NOBMI2-NEXT: movl %esi, %eax
		; X86-NOBMI2-NEXT: .LBB13_3:
; X86-NOBMI2-NEXT: andl {{[0-9]+}}(%esp), %eax		; X86-NOBMI2-NEXT: andl {{[0-9]+}}(%esp), %eax
; X86-NOBMI2-NEXT: andl {{[0-9]+}}(%esp), %edx		; X86-NOBMI2-NEXT: andl {{[0-9]+}}(%esp), %edx
		; X86-NOBMI2-NEXT: popl %esi
; X86-NOBMI2-NEXT: retl		; X86-NOBMI2-NEXT: retl
;		;
; X86-BMI2-LABEL: clear_highbits64_c0:		; X86-BMI2-LABEL: clear_highbits64_c0:
; X86-BMI2: # %bb.0:		; X86-BMI2: # %bb.0:
; X86-BMI2-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-BMI2-NEXT: pushl %ebx
		; X86-BMI2-NEXT: movb {{[0-9]+}}(%esp), %bl
; X86-BMI2-NEXT: movl $-1, %eax		; X86-BMI2-NEXT: movl $-1, %eax
; X86-BMI2-NEXT: shrxl %ecx, %eax, %edx		; X86-BMI2-NEXT: shrxl %ebx, %eax, %ecx
; X86-BMI2-NEXT: shrdl %cl, %eax, %eax
; X86-BMI2-NEXT: testb $32, %cl
; X86-BMI2-NEXT: je .LBB13_2
; X86-BMI2-NEXT: # %bb.1:
; X86-BMI2-NEXT: movl %edx, %eax
; X86-BMI2-NEXT: xorl %edx, %edx		; X86-BMI2-NEXT: xorl %edx, %edx
; X86-BMI2-NEXT: .LBB13_2:		; X86-BMI2-NEXT: testb $32, %bl
		; X86-BMI2-NEXT: jne .LBB13_1
		; X86-BMI2-NEXT: # %bb.2:
		; X86-BMI2-NEXT: movl %ecx, %edx
		; X86-BMI2-NEXT: jmp .LBB13_3
		; X86-BMI2-NEXT: .LBB13_1:
		; X86-BMI2-NEXT: movl %ecx, %eax
		; X86-BMI2-NEXT: .LBB13_3:
; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %eax		; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %eax
; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %edx		; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %edx
		; X86-BMI2-NEXT: popl %ebx
; X86-BMI2-NEXT: retl		; X86-BMI2-NEXT: retl
;		;
; X64-NOBMI2-LABEL: clear_highbits64_c0:		; X64-NOBMI2-LABEL: clear_highbits64_c0:
; X64-NOBMI2: # %bb.0:		; X64-NOBMI2: # %bb.0:
; X64-NOBMI2-NEXT: movq %rsi, %rcx		; X64-NOBMI2-NEXT: movq %rsi, %rcx
; X64-NOBMI2-NEXT: movq %rdi, %rax		; X64-NOBMI2-NEXT: movq %rdi, %rax
; X64-NOBMI2-NEXT: shlq %cl, %rax		; X64-NOBMI2-NEXT: shlq %cl, %rax
; X64-NOBMI2-NEXT: # kill: def $cl killed $cl killed $rcx		; X64-NOBMI2-NEXT: # kill: def $cl killed $cl killed $rcx
; X64-NOBMI2-NEXT: shrq %cl, %rax		; X64-NOBMI2-NEXT: shrq %cl, %rax
; X64-NOBMI2-NEXT: retq		; X64-NOBMI2-NEXT: retq
;		;
; X64-BMI2-LABEL: clear_highbits64_c0:		; X64-BMI2-LABEL: clear_highbits64_c0:
; X64-BMI2: # %bb.0:		; X64-BMI2: # %bb.0:
; X64-BMI2-NEXT: shlxq %rsi, %rdi, %rax		; X64-BMI2-NEXT: shlxq %rsi, %rdi, %rax
; X64-BMI2-NEXT: shrxq %rsi, %rax, %rax		; X64-BMI2-NEXT: shrxq %rsi, %rax, %rax
; X64-BMI2-NEXT: retq		; X64-BMI2-NEXT: retq
%mask = lshr i64 -1, %numhighbits		%mask = lshr i64 -1, %numhighbits
%masked = and i64 %mask, %val		%masked = and i64 %mask, %val
ret i64 %masked		ret i64 %masked
}		}
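
Annotation on the i686 check-line changes in this function: a rough Python model, illustrative only, of how a 64-bit `lshr` is split into two 32-bit halves on a 32-bit target. The names `fshr32` and `lshr64_via_halves` are hypothetical and are not LLVM APIs. The low half is a funnel shift right of (hi:lo), the high half is a plain shift, and bit 5 of the amount selects whether the halves are moved down. When both halves are all-ones, as in `lshr i64 -1, %numhighbits`, the funnel shift is a rotate of all-ones and therefore stays all-ones, which is consistent with `shrdl %cl, %eax, %eax` disappearing from the right-hand column.

```python
MASK32 = 0xFFFFFFFF

def fshr32(a, b, amt):
    """Generic funnel shift right: low 32 bits of (a:b) >> (amt mod 32)."""
    amt %= 32
    return (((a << 32) | b) >> amt) & MASK32

def lshr64_via_halves(lo, hi, amt):
    """Return (lo, hi) of the 64-bit value hi:lo shifted right by amt (0..63)."""
    new_lo = fshr32(hi, lo, amt)   # the SHRD on the low half
    new_hi = hi >> (amt & 31)      # the SHR on the high half
    if amt & 32:                   # corresponds to `testb $32, %cl`
        new_lo, new_hi = new_hi, 0
    return new_lo, new_hi
```

With `lo = hi = 0xFFFFFFFF`, `fshr32(hi, lo, amt)` returns `0xFFFFFFFF` for every amount, so only the plain shift and the select on bit 5 remain.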

define i64 @clear_highbits64_c1_indexzext(i64 %val, i8 %numhighbits) nounwind {		define i64 @clear_highbits64_c1_indexzext(i64 %val, i8 %numhighbits) nounwind {
; X86-NOBMI2-LABEL: clear_highbits64_c1_indexzext:		; X86-NOBMI2-LABEL: clear_highbits64_c1_indexzext:
; X86-NOBMI2: # %bb.0:		; X86-NOBMI2: # %bb.0:
		; X86-NOBMI2-NEXT: pushl %esi
; X86-NOBMI2-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-NOBMI2-NEXT: movb {{[0-9]+}}(%esp), %cl
; X86-NOBMI2-NEXT: movl $-1, %eax		; X86-NOBMI2-NEXT: movl $-1, %eax
; X86-NOBMI2-NEXT: movl $-1, %edx		; X86-NOBMI2-NEXT: movl $-1, %esi
; X86-NOBMI2-NEXT: shrl %cl, %edx		; X86-NOBMI2-NEXT: shrl %cl, %esi
; X86-NOBMI2-NEXT: shrdl %cl, %eax, %eax
; X86-NOBMI2-NEXT: testb $32, %cl
; X86-NOBMI2-NEXT: je .LBB14_2
; X86-NOBMI2-NEXT: # %bb.1:
; X86-NOBMI2-NEXT: movl %edx, %eax
; X86-NOBMI2-NEXT: xorl %edx, %edx		; X86-NOBMI2-NEXT: xorl %edx, %edx
; X86-NOBMI2-NEXT: .LBB14_2:		; X86-NOBMI2-NEXT: testb $32, %cl
		; X86-NOBMI2-NEXT: jne .LBB14_1
		; X86-NOBMI2-NEXT: # %bb.2:
		; X86-NOBMI2-NEXT: movl %esi, %edx
		; X86-NOBMI2-NEXT: jmp .LBB14_3
		; X86-NOBMI2-NEXT: .LBB14_1:
		; X86-NOBMI2-NEXT: movl %esi, %eax
		; X86-NOBMI2-NEXT: .LBB14_3:
; X86-NOBMI2-NEXT: andl {{[0-9]+}}(%esp), %eax		; X86-NOBMI2-NEXT: andl {{[0-9]+}}(%esp), %eax
; X86-NOBMI2-NEXT: andl {{[0-9]+}}(%esp), %edx		; X86-NOBMI2-NEXT: andl {{[0-9]+}}(%esp), %edx
		; X86-NOBMI2-NEXT: popl %esi
; X86-NOBMI2-NEXT: retl		; X86-NOBMI2-NEXT: retl
;		;
; X86-BMI2-LABEL: clear_highbits64_c1_indexzext:		; X86-BMI2-LABEL: clear_highbits64_c1_indexzext:
; X86-BMI2: # %bb.0:		; X86-BMI2: # %bb.0:
; X86-BMI2-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-BMI2-NEXT: pushl %ebx
		; X86-BMI2-NEXT: movb {{[0-9]+}}(%esp), %bl
; X86-BMI2-NEXT: movl $-1, %eax		; X86-BMI2-NEXT: movl $-1, %eax
; X86-BMI2-NEXT: shrxl %ecx, %eax, %edx		; X86-BMI2-NEXT: shrxl %ebx, %eax, %ecx
; X86-BMI2-NEXT: shrdl %cl, %eax, %eax
; X86-BMI2-NEXT: testb $32, %cl
; X86-BMI2-NEXT: je .LBB14_2
; X86-BMI2-NEXT: # %bb.1:
; X86-BMI2-NEXT: movl %edx, %eax
; X86-BMI2-NEXT: xorl %edx, %edx		; X86-BMI2-NEXT: xorl %edx, %edx
; X86-BMI2-NEXT: .LBB14_2:		; X86-BMI2-NEXT: testb $32, %bl
		; X86-BMI2-NEXT: jne .LBB14_1
		; X86-BMI2-NEXT: # %bb.2:
		; X86-BMI2-NEXT: movl %ecx, %edx
		; X86-BMI2-NEXT: jmp .LBB14_3
		; X86-BMI2-NEXT: .LBB14_1:
		; X86-BMI2-NEXT: movl %ecx, %eax
		; X86-BMI2-NEXT: .LBB14_3:
; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %eax		; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %eax
; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %edx		; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %edx
		; X86-BMI2-NEXT: popl %ebx
; X86-BMI2-NEXT: retl		; X86-BMI2-NEXT: retl
;		;
; X64-NOBMI2-LABEL: clear_highbits64_c1_indexzext:		; X64-NOBMI2-LABEL: clear_highbits64_c1_indexzext:
; X64-NOBMI2: # %bb.0:		; X64-NOBMI2: # %bb.0:
; X64-NOBMI2-NEXT: movl %esi, %ecx		; X64-NOBMI2-NEXT: movl %esi, %ecx
; X64-NOBMI2-NEXT: movq %rdi, %rax		; X64-NOBMI2-NEXT: movq %rdi, %rax
; X64-NOBMI2-NEXT: shlq %cl, %rax		; X64-NOBMI2-NEXT: shlq %cl, %rax
; X64-NOBMI2-NEXT: # kill: def $cl killed $cl killed $ecx		; X64-NOBMI2-NEXT: # kill: def $cl killed $cl killed $ecx
[...]	; X64-BMI2-NEXT: retq
%mask = lshr i64 -1, %sh_prom		%mask = lshr i64 -1, %sh_prom
%masked = and i64 %mask, %val		%masked = and i64 %mask, %val
ret i64 %masked		ret i64 %masked
}		}

define i64 @clear_highbits64_c2_load(i64* %w, i64 %numhighbits) nounwind {		define i64 @clear_highbits64_c2_load(i64* %w, i64 %numhighbits) nounwind {
; X86-NOBMI2-LABEL: clear_highbits64_c2_load:		; X86-NOBMI2-LABEL: clear_highbits64_c2_load:
; X86-NOBMI2: # %bb.0:		; X86-NOBMI2: # %bb.0:
		; X86-NOBMI2-NEXT: pushl %edi
; X86-NOBMI2-NEXT: pushl %esi		; X86-NOBMI2-NEXT: pushl %esi
; X86-NOBMI2-NEXT: movl {{[0-9]+}}(%esp), %esi		; X86-NOBMI2-NEXT: movl {{[0-9]+}}(%esp), %esi
; X86-NOBMI2-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-NOBMI2-NEXT: movb {{[0-9]+}}(%esp), %cl
; X86-NOBMI2-NEXT: movl $-1, %eax		; X86-NOBMI2-NEXT: movl $-1, %eax
; X86-NOBMI2-NEXT: movl $-1, %edx		; X86-NOBMI2-NEXT: movl $-1, %edi
; X86-NOBMI2-NEXT: shrl %cl, %edx		; X86-NOBMI2-NEXT: shrl %cl, %edi
; X86-NOBMI2-NEXT: shrdl %cl, %eax, %eax
; X86-NOBMI2-NEXT: testb $32, %cl
; X86-NOBMI2-NEXT: je .LBB15_2
; X86-NOBMI2-NEXT: # %bb.1:
; X86-NOBMI2-NEXT: movl %edx, %eax
; X86-NOBMI2-NEXT: xorl %edx, %edx		; X86-NOBMI2-NEXT: xorl %edx, %edx
; X86-NOBMI2-NEXT: .LBB15_2:		; X86-NOBMI2-NEXT: testb $32, %cl
		; X86-NOBMI2-NEXT: jne .LBB15_1
		; X86-NOBMI2-NEXT: # %bb.2:
		; X86-NOBMI2-NEXT: movl %edi, %edx
		; X86-NOBMI2-NEXT: jmp .LBB15_3
		; X86-NOBMI2-NEXT: .LBB15_1:
		; X86-NOBMI2-NEXT: movl %edi, %eax
		; X86-NOBMI2-NEXT: .LBB15_3:
; X86-NOBMI2-NEXT: andl (%esi), %eax		; X86-NOBMI2-NEXT: andl (%esi), %eax
; X86-NOBMI2-NEXT: andl 4(%esi), %edx		; X86-NOBMI2-NEXT: andl 4(%esi), %edx
; X86-NOBMI2-NEXT: popl %esi		; X86-NOBMI2-NEXT: popl %esi
		; X86-NOBMI2-NEXT: popl %edi
; X86-NOBMI2-NEXT: retl		; X86-NOBMI2-NEXT: retl
;		;
; X86-BMI2-LABEL: clear_highbits64_c2_load:		; X86-BMI2-LABEL: clear_highbits64_c2_load:
; X86-BMI2: # %bb.0:		; X86-BMI2: # %bb.0:
		; X86-BMI2-NEXT: pushl %ebx
; X86-BMI2-NEXT: pushl %esi		; X86-BMI2-NEXT: pushl %esi
; X86-BMI2-NEXT: movl {{[0-9]+}}(%esp), %esi		; X86-BMI2-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-BMI2-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-BMI2-NEXT: movb {{[0-9]+}}(%esp), %bl
; X86-BMI2-NEXT: movl $-1, %eax		; X86-BMI2-NEXT: movl $-1, %eax
; X86-BMI2-NEXT: shrxl %ecx, %eax, %edx		; X86-BMI2-NEXT: shrxl %ebx, %eax, %esi
; X86-BMI2-NEXT: shrdl %cl, %eax, %eax
; X86-BMI2-NEXT: testb $32, %cl
; X86-BMI2-NEXT: je .LBB15_2
; X86-BMI2-NEXT: # %bb.1:
; X86-BMI2-NEXT: movl %edx, %eax
; X86-BMI2-NEXT: xorl %edx, %edx		; X86-BMI2-NEXT: xorl %edx, %edx
; X86-BMI2-NEXT: .LBB15_2:		; X86-BMI2-NEXT: testb $32, %bl
; X86-BMI2-NEXT: andl (%esi), %eax		; X86-BMI2-NEXT: jne .LBB15_1
; X86-BMI2-NEXT: andl 4(%esi), %edx		; X86-BMI2-NEXT: # %bb.2:
		; X86-BMI2-NEXT: movl %esi, %edx
		; X86-BMI2-NEXT: jmp .LBB15_3
		; X86-BMI2-NEXT: .LBB15_1:
		; X86-BMI2-NEXT: movl %esi, %eax
		; X86-BMI2-NEXT: .LBB15_3:
		; X86-BMI2-NEXT: andl (%ecx), %eax
		; X86-BMI2-NEXT: andl 4(%ecx), %edx
; X86-BMI2-NEXT: popl %esi		; X86-BMI2-NEXT: popl %esi
		; X86-BMI2-NEXT: popl %ebx
; X86-BMI2-NEXT: retl		; X86-BMI2-NEXT: retl
;		;
; X64-NOBMI2-LABEL: clear_highbits64_c2_load:		; X64-NOBMI2-LABEL: clear_highbits64_c2_load:
; X64-NOBMI2: # %bb.0:		; X64-NOBMI2: # %bb.0:
; X64-NOBMI2-NEXT: movq %rsi, %rcx		; X64-NOBMI2-NEXT: movq %rsi, %rcx
; X64-NOBMI2-NEXT: movq (%rdi), %rax		; X64-NOBMI2-NEXT: movq (%rdi), %rax
; X64-NOBMI2-NEXT: shlq %cl, %rax		; X64-NOBMI2-NEXT: shlq %cl, %rax
; X64-NOBMI2-NEXT: # kill: def $cl killed $cl killed $rcx		; X64-NOBMI2-NEXT: # kill: def $cl killed $cl killed $rcx
[...]	; X64-BMI2-NEXT: retq
%mask = lshr i64 -1, %numhighbits		%mask = lshr i64 -1, %numhighbits
%masked = and i64 %mask, %val		%masked = and i64 %mask, %val
ret i64 %masked		ret i64 %masked
}		}

define i64 @clear_highbits64_c3_load_indexzext(i64* %w, i8 %numhighbits) nounwind {		define i64 @clear_highbits64_c3_load_indexzext(i64* %w, i8 %numhighbits) nounwind {
; X86-NOBMI2-LABEL: clear_highbits64_c3_load_indexzext:		; X86-NOBMI2-LABEL: clear_highbits64_c3_load_indexzext:
; X86-NOBMI2: # %bb.0:		; X86-NOBMI2: # %bb.0:
		; X86-NOBMI2-NEXT: pushl %edi
; X86-NOBMI2-NEXT: pushl %esi		; X86-NOBMI2-NEXT: pushl %esi
; X86-NOBMI2-NEXT: movl {{[0-9]+}}(%esp), %esi		; X86-NOBMI2-NEXT: movl {{[0-9]+}}(%esp), %esi
; X86-NOBMI2-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-NOBMI2-NEXT: movb {{[0-9]+}}(%esp), %cl
; X86-NOBMI2-NEXT: movl $-1, %eax		; X86-NOBMI2-NEXT: movl $-1, %eax
; X86-NOBMI2-NEXT: movl $-1, %edx		; X86-NOBMI2-NEXT: movl $-1, %edi
; X86-NOBMI2-NEXT: shrl %cl, %edx		; X86-NOBMI2-NEXT: shrl %cl, %edi
; X86-NOBMI2-NEXT: shrdl %cl, %eax, %eax
; X86-NOBMI2-NEXT: testb $32, %cl
; X86-NOBMI2-NEXT: je .LBB16_2
; X86-NOBMI2-NEXT: # %bb.1:
; X86-NOBMI2-NEXT: movl %edx, %eax
; X86-NOBMI2-NEXT: xorl %edx, %edx		; X86-NOBMI2-NEXT: xorl %edx, %edx
; X86-NOBMI2-NEXT: .LBB16_2:		; X86-NOBMI2-NEXT: testb $32, %cl
		; X86-NOBMI2-NEXT: jne .LBB16_1
		; X86-NOBMI2-NEXT: # %bb.2:
		; X86-NOBMI2-NEXT: movl %edi, %edx
		; X86-NOBMI2-NEXT: jmp .LBB16_3
		; X86-NOBMI2-NEXT: .LBB16_1:
		; X86-NOBMI2-NEXT: movl %edi, %eax
		; X86-NOBMI2-NEXT: .LBB16_3:
; X86-NOBMI2-NEXT: andl (%esi), %eax		; X86-NOBMI2-NEXT: andl (%esi), %eax
; X86-NOBMI2-NEXT: andl 4(%esi), %edx		; X86-NOBMI2-NEXT: andl 4(%esi), %edx
; X86-NOBMI2-NEXT: popl %esi		; X86-NOBMI2-NEXT: popl %esi
		; X86-NOBMI2-NEXT: popl %edi
; X86-NOBMI2-NEXT: retl		; X86-NOBMI2-NEXT: retl
;		;
; X86-BMI2-LABEL: clear_highbits64_c3_load_indexzext:		; X86-BMI2-LABEL: clear_highbits64_c3_load_indexzext:
; X86-BMI2: # %bb.0:		; X86-BMI2: # %bb.0:
		; X86-BMI2-NEXT: pushl %ebx
; X86-BMI2-NEXT: pushl %esi		; X86-BMI2-NEXT: pushl %esi
; X86-BMI2-NEXT: movl {{[0-9]+}}(%esp), %esi		; X86-BMI2-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-BMI2-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-BMI2-NEXT: movb {{[0-9]+}}(%esp), %bl
; X86-BMI2-NEXT: movl $-1, %eax		; X86-BMI2-NEXT: movl $-1, %eax
; X86-BMI2-NEXT: shrxl %ecx, %eax, %edx		; X86-BMI2-NEXT: shrxl %ebx, %eax, %esi
; X86-BMI2-NEXT: shrdl %cl, %eax, %eax
; X86-BMI2-NEXT: testb $32, %cl
; X86-BMI2-NEXT: je .LBB16_2
; X86-BMI2-NEXT: # %bb.1:
; X86-BMI2-NEXT: movl %edx, %eax
; X86-BMI2-NEXT: xorl %edx, %edx		; X86-BMI2-NEXT: xorl %edx, %edx
; X86-BMI2-NEXT: .LBB16_2:		; X86-BMI2-NEXT: testb $32, %bl
; X86-BMI2-NEXT: andl (%esi), %eax		; X86-BMI2-NEXT: jne .LBB16_1
; X86-BMI2-NEXT: andl 4(%esi), %edx		; X86-BMI2-NEXT: # %bb.2:
		; X86-BMI2-NEXT: movl %esi, %edx
		; X86-BMI2-NEXT: jmp .LBB16_3
		; X86-BMI2-NEXT: .LBB16_1:
		; X86-BMI2-NEXT: movl %esi, %eax
		; X86-BMI2-NEXT: .LBB16_3:
		; X86-BMI2-NEXT: andl (%ecx), %eax
		; X86-BMI2-NEXT: andl 4(%ecx), %edx
; X86-BMI2-NEXT: popl %esi		; X86-BMI2-NEXT: popl %esi
		; X86-BMI2-NEXT: popl %ebx
; X86-BMI2-NEXT: retl		; X86-BMI2-NEXT: retl
;		;
; X64-NOBMI2-LABEL: clear_highbits64_c3_load_indexzext:		; X64-NOBMI2-LABEL: clear_highbits64_c3_load_indexzext:
; X64-NOBMI2: # %bb.0:		; X64-NOBMI2: # %bb.0:
; X64-NOBMI2-NEXT: movl %esi, %ecx		; X64-NOBMI2-NEXT: movl %esi, %ecx
; X64-NOBMI2-NEXT: movq (%rdi), %rax		; X64-NOBMI2-NEXT: movq (%rdi), %rax
; X64-NOBMI2-NEXT: shlq %cl, %rax		; X64-NOBMI2-NEXT: shlq %cl, %rax
; X64-NOBMI2-NEXT: # kill: def $cl killed $cl killed $ecx		; X64-NOBMI2-NEXT: # kill: def $cl killed $cl killed $ecx
[...]	; X64-BMI2-NEXT: retq
%mask = lshr i64 -1, %sh_prom		%mask = lshr i64 -1, %sh_prom
%masked = and i64 %mask, %val		%masked = and i64 %mask, %val
ret i64 %masked		ret i64 %masked
}		}

define i64 @clear_highbits64_c4_commutative(i64 %val, i64 %numhighbits) nounwind {		define i64 @clear_highbits64_c4_commutative(i64 %val, i64 %numhighbits) nounwind {
; X86-NOBMI2-LABEL: clear_highbits64_c4_commutative:		; X86-NOBMI2-LABEL: clear_highbits64_c4_commutative:
; X86-NOBMI2: # %bb.0:		; X86-NOBMI2: # %bb.0:
		; X86-NOBMI2-NEXT: pushl %esi
; X86-NOBMI2-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-NOBMI2-NEXT: movb {{[0-9]+}}(%esp), %cl
; X86-NOBMI2-NEXT: movl $-1, %eax		; X86-NOBMI2-NEXT: movl $-1, %eax
; X86-NOBMI2-NEXT: movl $-1, %edx		; X86-NOBMI2-NEXT: movl $-1, %esi
; X86-NOBMI2-NEXT: shrl %cl, %edx		; X86-NOBMI2-NEXT: shrl %cl, %esi
; X86-NOBMI2-NEXT: shrdl %cl, %eax, %eax
; X86-NOBMI2-NEXT: testb $32, %cl
; X86-NOBMI2-NEXT: je .LBB17_2
; X86-NOBMI2-NEXT: # %bb.1:
; X86-NOBMI2-NEXT: movl %edx, %eax
; X86-NOBMI2-NEXT: xorl %edx, %edx		; X86-NOBMI2-NEXT: xorl %edx, %edx
; X86-NOBMI2-NEXT: .LBB17_2:		; X86-NOBMI2-NEXT: testb $32, %cl
		; X86-NOBMI2-NEXT: jne .LBB17_1
		; X86-NOBMI2-NEXT: # %bb.2:
		; X86-NOBMI2-NEXT: movl %esi, %edx
		; X86-NOBMI2-NEXT: jmp .LBB17_3
		; X86-NOBMI2-NEXT: .LBB17_1:
		; X86-NOBMI2-NEXT: movl %esi, %eax
		; X86-NOBMI2-NEXT: .LBB17_3:
; X86-NOBMI2-NEXT: andl {{[0-9]+}}(%esp), %eax		; X86-NOBMI2-NEXT: andl {{[0-9]+}}(%esp), %eax
; X86-NOBMI2-NEXT: andl {{[0-9]+}}(%esp), %edx		; X86-NOBMI2-NEXT: andl {{[0-9]+}}(%esp), %edx
		; X86-NOBMI2-NEXT: popl %esi
; X86-NOBMI2-NEXT: retl		; X86-NOBMI2-NEXT: retl
;		;
; X86-BMI2-LABEL: clear_highbits64_c4_commutative:		; X86-BMI2-LABEL: clear_highbits64_c4_commutative:
; X86-BMI2: # %bb.0:		; X86-BMI2: # %bb.0:
; X86-BMI2-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-BMI2-NEXT: pushl %ebx
		; X86-BMI2-NEXT: movb {{[0-9]+}}(%esp), %bl
; X86-BMI2-NEXT: movl $-1, %eax		; X86-BMI2-NEXT: movl $-1, %eax
; X86-BMI2-NEXT: shrxl %ecx, %eax, %edx		; X86-BMI2-NEXT: shrxl %ebx, %eax, %ecx
; X86-BMI2-NEXT: shrdl %cl, %eax, %eax
; X86-BMI2-NEXT: testb $32, %cl
; X86-BMI2-NEXT: je .LBB17_2
; X86-BMI2-NEXT: # %bb.1:
; X86-BMI2-NEXT: movl %edx, %eax
; X86-BMI2-NEXT: xorl %edx, %edx		; X86-BMI2-NEXT: xorl %edx, %edx
; X86-BMI2-NEXT: .LBB17_2:		; X86-BMI2-NEXT: testb $32, %bl
		; X86-BMI2-NEXT: jne .LBB17_1
		; X86-BMI2-NEXT: # %bb.2:
		; X86-BMI2-NEXT: movl %ecx, %edx
		; X86-BMI2-NEXT: jmp .LBB17_3
		; X86-BMI2-NEXT: .LBB17_1:
		; X86-BMI2-NEXT: movl %ecx, %eax
		; X86-BMI2-NEXT: .LBB17_3:
; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %eax		; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %eax
; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %edx		; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %edx
		; X86-BMI2-NEXT: popl %ebx
; X86-BMI2-NEXT: retl		; X86-BMI2-NEXT: retl
;		;
; X64-NOBMI2-LABEL: clear_highbits64_c4_commutative:		; X64-NOBMI2-LABEL: clear_highbits64_c4_commutative:
; X64-NOBMI2: # %bb.0:		; X64-NOBMI2: # %bb.0:
; X64-NOBMI2-NEXT: movq %rsi, %rcx		; X64-NOBMI2-NEXT: movq %rsi, %rcx
; X64-NOBMI2-NEXT: movq %rdi, %rax		; X64-NOBMI2-NEXT: movq %rdi, %rax
; X64-NOBMI2-NEXT: shlq %cl, %rax		; X64-NOBMI2-NEXT: shlq %cl, %rax
; X64-NOBMI2-NEXT: # kill: def $cl killed $cl killed $rcx		; X64-NOBMI2-NEXT: # kill: def $cl killed $cl killed $rcx
[...]
; X86-NOBMI2: # %bb.0:		; X86-NOBMI2: # %bb.0:
; X86-NOBMI2-NEXT: pushl %edi		; X86-NOBMI2-NEXT: pushl %edi
; X86-NOBMI2-NEXT: pushl %esi		; X86-NOBMI2-NEXT: pushl %esi
; X86-NOBMI2-NEXT: pushl %eax		; X86-NOBMI2-NEXT: pushl %eax
; X86-NOBMI2-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-NOBMI2-NEXT: movb {{[0-9]+}}(%esp), %cl
; X86-NOBMI2-NEXT: movl $-1, %esi		; X86-NOBMI2-NEXT: movl $-1, %esi
; X86-NOBMI2-NEXT: movl $-1, %edi		; X86-NOBMI2-NEXT: movl $-1, %edi
; X86-NOBMI2-NEXT: shrl %cl, %edi		; X86-NOBMI2-NEXT: shrl %cl, %edi
; X86-NOBMI2-NEXT: shrdl %cl, %esi, %esi
; X86-NOBMI2-NEXT: testb $32, %cl		; X86-NOBMI2-NEXT: testb $32, %cl
; X86-NOBMI2-NEXT: je .LBB19_2		; X86-NOBMI2-NEXT: je .LBB19_2
; X86-NOBMI2-NEXT: # %bb.1:		; X86-NOBMI2-NEXT: # %bb.1:
; X86-NOBMI2-NEXT: movl %edi, %esi		; X86-NOBMI2-NEXT: movl %edi, %esi
; X86-NOBMI2-NEXT: xorl %edi, %edi		; X86-NOBMI2-NEXT: xorl %edi, %edi
; X86-NOBMI2-NEXT: .LBB19_2:		; X86-NOBMI2-NEXT: .LBB19_2:
; X86-NOBMI2-NEXT: subl $8, %esp		; X86-NOBMI2-NEXT: subl $8, %esp
; X86-NOBMI2-NEXT: pushl %edi		; X86-NOBMI2-NEXT: pushl %edi
[...]
; X86-NOBMI2-NEXT: popl %edi		; X86-NOBMI2-NEXT: popl %edi
; X86-NOBMI2-NEXT: retl		; X86-NOBMI2-NEXT: retl
;		;
; X86-BMI2-LABEL: oneuse64:		; X86-BMI2-LABEL: oneuse64:
; X86-BMI2: # %bb.0:		; X86-BMI2: # %bb.0:
; X86-BMI2-NEXT: pushl %edi		; X86-BMI2-NEXT: pushl %edi
; X86-BMI2-NEXT: pushl %esi		; X86-BMI2-NEXT: pushl %esi
; X86-BMI2-NEXT: pushl %eax		; X86-BMI2-NEXT: pushl %eax
; X86-BMI2-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-BMI2-NEXT: movb {{[0-9]+}}(%esp), %al
; X86-BMI2-NEXT: movl $-1, %esi		; X86-BMI2-NEXT: movl $-1, %edi
; X86-BMI2-NEXT: shrxl %ecx, %esi, %edi		; X86-BMI2-NEXT: shrxl %eax, %edi, %esi
; X86-BMI2-NEXT: shrdl %cl, %esi, %esi		; X86-BMI2-NEXT: testb $32, %al
; X86-BMI2-NEXT: testb $32, %cl
; X86-BMI2-NEXT: je .LBB19_2		; X86-BMI2-NEXT: je .LBB19_2
; X86-BMI2-NEXT: # %bb.1:		; X86-BMI2-NEXT: # %bb.1:
; X86-BMI2-NEXT: movl %edi, %esi		; X86-BMI2-NEXT: movl %esi, %edi
; X86-BMI2-NEXT: xorl %edi, %edi		; X86-BMI2-NEXT: xorl %esi, %esi
; X86-BMI2-NEXT: .LBB19_2:		; X86-BMI2-NEXT: .LBB19_2:
; X86-BMI2-NEXT: subl $8, %esp		; X86-BMI2-NEXT: subl $8, %esp
; X86-BMI2-NEXT: pushl %edi
; X86-BMI2-NEXT: pushl %esi		; X86-BMI2-NEXT: pushl %esi
		; X86-BMI2-NEXT: pushl %edi
; X86-BMI2-NEXT: calll use64		; X86-BMI2-NEXT: calll use64
; X86-BMI2-NEXT: addl $16, %esp		; X86-BMI2-NEXT: addl $16, %esp
; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %esi
; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %edi		; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %edi
; X86-BMI2-NEXT: movl %esi, %eax		; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %esi
; X86-BMI2-NEXT: movl %edi, %edx		; X86-BMI2-NEXT: movl %edi, %eax
		; X86-BMI2-NEXT: movl %esi, %edx
; X86-BMI2-NEXT: addl $4, %esp		; X86-BMI2-NEXT: addl $4, %esp
; X86-BMI2-NEXT: popl %esi		; X86-BMI2-NEXT: popl %esi
; X86-BMI2-NEXT: popl %edi		; X86-BMI2-NEXT: popl %edi
; X86-BMI2-NEXT: retl		; X86-BMI2-NEXT: retl
;		;
; X64-NOBMI2-LABEL: oneuse64:		; X64-NOBMI2-LABEL: oneuse64:
; X64-NOBMI2: # %bb.0:		; X64-NOBMI2: # %bb.0:
; X64-NOBMI2-NEXT: pushq %r14		; X64-NOBMI2-NEXT: pushq %r14
[...]

llvm/test/CodeGen/X86/clear-lowbits.ll

	[...]

	define i64 @clear_lowbits64_c0(i64 %val, i64 %numlowbits) nounwind {			define i64 @clear_lowbits64_c0(i64 %val, i64 %numlowbits) nounwind {
	; X86-NOBMI2-LABEL: clear_lowbits64_c0:			; X86-NOBMI2-LABEL: clear_lowbits64_c0:
	; X86-NOBMI2: # %bb.0:			; X86-NOBMI2: # %bb.0:
	; X86-NOBMI2-NEXT: movb {{[0-9]+}}(%esp), %cl			; X86-NOBMI2-NEXT: movb {{[0-9]+}}(%esp), %cl
	; X86-NOBMI2-NEXT: movl $-1, %edx			; X86-NOBMI2-NEXT: movl $-1, %edx
	; X86-NOBMI2-NEXT: movl $-1, %eax			; X86-NOBMI2-NEXT: movl $-1, %eax
	; X86-NOBMI2-NEXT: shll %cl, %eax			; X86-NOBMI2-NEXT: shll %cl, %eax
	; X86-NOBMI2-NEXT: shldl %cl, %edx, %edx
	; X86-NOBMI2-NEXT: testb $32, %cl			; X86-NOBMI2-NEXT: testb $32, %cl
	; X86-NOBMI2-NEXT: je .LBB13_2			; X86-NOBMI2-NEXT: je .LBB13_2
	; X86-NOBMI2-NEXT: # %bb.1:			; X86-NOBMI2-NEXT: # %bb.1:
	; X86-NOBMI2-NEXT: movl %eax, %edx			; X86-NOBMI2-NEXT: movl %eax, %edx
	; X86-NOBMI2-NEXT: xorl %eax, %eax			; X86-NOBMI2-NEXT: xorl %eax, %eax
	; X86-NOBMI2-NEXT: .LBB13_2:			; X86-NOBMI2-NEXT: .LBB13_2:
	; X86-NOBMI2-NEXT: andl {{[0-9]+}}(%esp), %edx
	; X86-NOBMI2-NEXT: andl {{[0-9]+}}(%esp), %eax			; X86-NOBMI2-NEXT: andl {{[0-9]+}}(%esp), %eax
				; X86-NOBMI2-NEXT: andl {{[0-9]+}}(%esp), %edx
	; X86-NOBMI2-NEXT: retl			; X86-NOBMI2-NEXT: retl
	;			;
	; X86-BMI2-LABEL: clear_lowbits64_c0:			; X86-BMI2-LABEL: clear_lowbits64_c0:
	; X86-BMI2: # %bb.0:			; X86-BMI2: # %bb.0:
	; X86-BMI2-NEXT: movb {{[0-9]+}}(%esp), %cl			; X86-BMI2-NEXT: movb {{[0-9]+}}(%esp), %cl
	; X86-BMI2-NEXT: movl $-1, %edx			; X86-BMI2-NEXT: movl $-1, %edx
	; X86-BMI2-NEXT: shlxl %ecx, %edx, %eax			; X86-BMI2-NEXT: shlxl %ecx, %edx, %eax
	; X86-BMI2-NEXT: shldl %cl, %edx, %edx
	; X86-BMI2-NEXT: testb $32, %cl			; X86-BMI2-NEXT: testb $32, %cl
	; X86-BMI2-NEXT: je .LBB13_2			; X86-BMI2-NEXT: je .LBB13_2
	; X86-BMI2-NEXT: # %bb.1:			; X86-BMI2-NEXT: # %bb.1:
	; X86-BMI2-NEXT: movl %eax, %edx			; X86-BMI2-NEXT: movl %eax, %edx
	; X86-BMI2-NEXT: xorl %eax, %eax			; X86-BMI2-NEXT: xorl %eax, %eax
	; X86-BMI2-NEXT: .LBB13_2:			; X86-BMI2-NEXT: .LBB13_2:
	; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %edx
	; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %eax			; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %eax
				; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %edx
	; X86-BMI2-NEXT: retl			; X86-BMI2-NEXT: retl
	;			;
	; X64-NOBMI2-LABEL: clear_lowbits64_c0:			; X64-NOBMI2-LABEL: clear_lowbits64_c0:
	; X64-NOBMI2: # %bb.0:			; X64-NOBMI2: # %bb.0:
	; X64-NOBMI2-NEXT: movq %rsi, %rcx			; X64-NOBMI2-NEXT: movq %rsi, %rcx
	; X64-NOBMI2-NEXT: movq %rdi, %rax			; X64-NOBMI2-NEXT: movq %rdi, %rax
	; X64-NOBMI2-NEXT: shrq %cl, %rax			; X64-NOBMI2-NEXT: shrq %cl, %rax
	; X64-NOBMI2-NEXT: # kill: def $cl killed $cl killed $rcx			; X64-NOBMI2-NEXT: # kill: def $cl killed $cl killed $rcx
	[...]

	define i64 @clear_lowbits64_c1_indexzext(i64 %val, i8 %numlowbits) nounwind {			define i64 @clear_lowbits64_c1_indexzext(i64 %val, i8 %numlowbits) nounwind {
	; X86-NOBMI2-LABEL: clear_lowbits64_c1_indexzext:			; X86-NOBMI2-LABEL: clear_lowbits64_c1_indexzext:
	; X86-NOBMI2: # %bb.0:			; X86-NOBMI2: # %bb.0:
	; X86-NOBMI2-NEXT: movb {{[0-9]+}}(%esp), %cl			; X86-NOBMI2-NEXT: movb {{[0-9]+}}(%esp), %cl
	; X86-NOBMI2-NEXT: movl $-1, %edx			; X86-NOBMI2-NEXT: movl $-1, %edx
	; X86-NOBMI2-NEXT: movl $-1, %eax			; X86-NOBMI2-NEXT: movl $-1, %eax
	; X86-NOBMI2-NEXT: shll %cl, %eax			; X86-NOBMI2-NEXT: shll %cl, %eax
	; X86-NOBMI2-NEXT: shldl %cl, %edx, %edx
	; X86-NOBMI2-NEXT: testb $32, %cl			; X86-NOBMI2-NEXT: testb $32, %cl
	; X86-NOBMI2-NEXT: je .LBB14_2			; X86-NOBMI2-NEXT: je .LBB14_2
	; X86-NOBMI2-NEXT: # %bb.1:			; X86-NOBMI2-NEXT: # %bb.1:
	; X86-NOBMI2-NEXT: movl %eax, %edx			; X86-NOBMI2-NEXT: movl %eax, %edx
	; X86-NOBMI2-NEXT: xorl %eax, %eax			; X86-NOBMI2-NEXT: xorl %eax, %eax
	; X86-NOBMI2-NEXT: .LBB14_2:			; X86-NOBMI2-NEXT: .LBB14_2:
	; X86-NOBMI2-NEXT: andl {{[0-9]+}}(%esp), %edx
	; X86-NOBMI2-NEXT: andl {{[0-9]+}}(%esp), %eax			; X86-NOBMI2-NEXT: andl {{[0-9]+}}(%esp), %eax
				; X86-NOBMI2-NEXT: andl {{[0-9]+}}(%esp), %edx
	; X86-NOBMI2-NEXT: retl			; X86-NOBMI2-NEXT: retl
	;			;
	; X86-BMI2-LABEL: clear_lowbits64_c1_indexzext:			; X86-BMI2-LABEL: clear_lowbits64_c1_indexzext:
	; X86-BMI2: # %bb.0:			; X86-BMI2: # %bb.0:
	; X86-BMI2-NEXT: movb {{[0-9]+}}(%esp), %cl			; X86-BMI2-NEXT: movb {{[0-9]+}}(%esp), %cl
	; X86-BMI2-NEXT: movl $-1, %edx			; X86-BMI2-NEXT: movl $-1, %edx
	; X86-BMI2-NEXT: shlxl %ecx, %edx, %eax			; X86-BMI2-NEXT: shlxl %ecx, %edx, %eax
	; X86-BMI2-NEXT: shldl %cl, %edx, %edx
	; X86-BMI2-NEXT: testb $32, %cl			; X86-BMI2-NEXT: testb $32, %cl
	; X86-BMI2-NEXT: je .LBB14_2			; X86-BMI2-NEXT: je .LBB14_2
	; X86-BMI2-NEXT: # %bb.1:			; X86-BMI2-NEXT: # %bb.1:
	; X86-BMI2-NEXT: movl %eax, %edx			; X86-BMI2-NEXT: movl %eax, %edx
	; X86-BMI2-NEXT: xorl %eax, %eax			; X86-BMI2-NEXT: xorl %eax, %eax
	; X86-BMI2-NEXT: .LBB14_2:			; X86-BMI2-NEXT: .LBB14_2:
	; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %edx
	; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %eax			; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %eax
				; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %edx
	; X86-BMI2-NEXT: retl			; X86-BMI2-NEXT: retl
	;			;
	; X64-NOBMI2-LABEL: clear_lowbits64_c1_indexzext:			; X64-NOBMI2-LABEL: clear_lowbits64_c1_indexzext:
	; X64-NOBMI2: # %bb.0:			; X64-NOBMI2: # %bb.0:
	; X64-NOBMI2-NEXT: movl %esi, %ecx			; X64-NOBMI2-NEXT: movl %esi, %ecx
	; X64-NOBMI2-NEXT: movq %rdi, %rax			; X64-NOBMI2-NEXT: movq %rdi, %rax
	; X64-NOBMI2-NEXT: shrq %cl, %rax			; X64-NOBMI2-NEXT: shrq %cl, %rax
	; X64-NOBMI2-NEXT: # kill: def $cl killed $cl killed $ecx			; X64-NOBMI2-NEXT: # kill: def $cl killed $cl killed $ecx
	[16 lines not shown]
	; X86-NOBMI2-LABEL: clear_lowbits64_c2_load:			; X86-NOBMI2-LABEL: clear_lowbits64_c2_load:
	; X86-NOBMI2: # %bb.0:			; X86-NOBMI2: # %bb.0:
	; X86-NOBMI2-NEXT: pushl %esi			; X86-NOBMI2-NEXT: pushl %esi
	; X86-NOBMI2-NEXT: movl {{[0-9]+}}(%esp), %esi			; X86-NOBMI2-NEXT: movl {{[0-9]+}}(%esp), %esi
	; X86-NOBMI2-NEXT: movb {{[0-9]+}}(%esp), %cl			; X86-NOBMI2-NEXT: movb {{[0-9]+}}(%esp), %cl
	; X86-NOBMI2-NEXT: movl $-1, %edx			; X86-NOBMI2-NEXT: movl $-1, %edx
	; X86-NOBMI2-NEXT: movl $-1, %eax			; X86-NOBMI2-NEXT: movl $-1, %eax
	; X86-NOBMI2-NEXT: shll %cl, %eax			; X86-NOBMI2-NEXT: shll %cl, %eax
	; X86-NOBMI2-NEXT: shldl %cl, %edx, %edx
	; X86-NOBMI2-NEXT: testb $32, %cl			; X86-NOBMI2-NEXT: testb $32, %cl
	; X86-NOBMI2-NEXT: je .LBB15_2			; X86-NOBMI2-NEXT: je .LBB15_2
	; X86-NOBMI2-NEXT: # %bb.1:			; X86-NOBMI2-NEXT: # %bb.1:
	; X86-NOBMI2-NEXT: movl %eax, %edx			; X86-NOBMI2-NEXT: movl %eax, %edx
	; X86-NOBMI2-NEXT: xorl %eax, %eax			; X86-NOBMI2-NEXT: xorl %eax, %eax
	; X86-NOBMI2-NEXT: .LBB15_2:			; X86-NOBMI2-NEXT: .LBB15_2:
	; X86-NOBMI2-NEXT: andl 4(%esi), %edx
	; X86-NOBMI2-NEXT: andl (%esi), %eax			; X86-NOBMI2-NEXT: andl (%esi), %eax
				; X86-NOBMI2-NEXT: andl 4(%esi), %edx
	; X86-NOBMI2-NEXT: popl %esi			; X86-NOBMI2-NEXT: popl %esi
	; X86-NOBMI2-NEXT: retl			; X86-NOBMI2-NEXT: retl
	;			;
	; X86-BMI2-LABEL: clear_lowbits64_c2_load:			; X86-BMI2-LABEL: clear_lowbits64_c2_load:
	; X86-BMI2: # %bb.0:			; X86-BMI2: # %bb.0:
	; X86-BMI2-NEXT: pushl %esi			; X86-BMI2-NEXT: pushl %ebx
	; X86-BMI2-NEXT: movl {{[0-9]+}}(%esp), %esi			; X86-BMI2-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; X86-BMI2-NEXT: movb {{[0-9]+}}(%esp), %cl			; X86-BMI2-NEXT: movb {{[0-9]+}}(%esp), %bl
	; X86-BMI2-NEXT: movl $-1, %edx			; X86-BMI2-NEXT: movl $-1, %edx
	; X86-BMI2-NEXT: shlxl %ecx, %edx, %eax			; X86-BMI2-NEXT: shlxl %ebx, %edx, %eax
	; X86-BMI2-NEXT: shldl %cl, %edx, %edx			; X86-BMI2-NEXT: testb $32, %bl
	; X86-BMI2-NEXT: testb $32, %cl
	; X86-BMI2-NEXT: je .LBB15_2			; X86-BMI2-NEXT: je .LBB15_2
	; X86-BMI2-NEXT: # %bb.1:			; X86-BMI2-NEXT: # %bb.1:
	; X86-BMI2-NEXT: movl %eax, %edx			; X86-BMI2-NEXT: movl %eax, %edx
	; X86-BMI2-NEXT: xorl %eax, %eax			; X86-BMI2-NEXT: xorl %eax, %eax
	; X86-BMI2-NEXT: .LBB15_2:			; X86-BMI2-NEXT: .LBB15_2:
	; X86-BMI2-NEXT: andl 4(%esi), %edx			; X86-BMI2-NEXT: andl (%ecx), %eax
	; X86-BMI2-NEXT: andl (%esi), %eax			; X86-BMI2-NEXT: andl 4(%ecx), %edx
	; X86-BMI2-NEXT: popl %esi			; X86-BMI2-NEXT: popl %ebx
	; X86-BMI2-NEXT: retl			; X86-BMI2-NEXT: retl
	;			;
	; X64-NOBMI2-LABEL: clear_lowbits64_c2_load:			; X64-NOBMI2-LABEL: clear_lowbits64_c2_load:
	; X64-NOBMI2: # %bb.0:			; X64-NOBMI2: # %bb.0:
	; X64-NOBMI2-NEXT: movq %rsi, %rcx			; X64-NOBMI2-NEXT: movq %rsi, %rcx
	; X64-NOBMI2-NEXT: movq (%rdi), %rax			; X64-NOBMI2-NEXT: movq (%rdi), %rax
	; X64-NOBMI2-NEXT: shrq %cl, %rax			; X64-NOBMI2-NEXT: shrq %cl, %rax
	; X64-NOBMI2-NEXT: # kill: def $cl killed $cl killed $rcx			; X64-NOBMI2-NEXT: # kill: def $cl killed $cl killed $rcx
	[15 lines not shown]
	; X86-NOBMI2-LABEL: clear_lowbits64_c3_load_indexzext:			; X86-NOBMI2-LABEL: clear_lowbits64_c3_load_indexzext:
	; X86-NOBMI2: # %bb.0:			; X86-NOBMI2: # %bb.0:
	; X86-NOBMI2-NEXT: pushl %esi			; X86-NOBMI2-NEXT: pushl %esi
	; X86-NOBMI2-NEXT: movl {{[0-9]+}}(%esp), %esi			; X86-NOBMI2-NEXT: movl {{[0-9]+}}(%esp), %esi
	; X86-NOBMI2-NEXT: movb {{[0-9]+}}(%esp), %cl			; X86-NOBMI2-NEXT: movb {{[0-9]+}}(%esp), %cl
	; X86-NOBMI2-NEXT: movl $-1, %edx			; X86-NOBMI2-NEXT: movl $-1, %edx
	; X86-NOBMI2-NEXT: movl $-1, %eax			; X86-NOBMI2-NEXT: movl $-1, %eax
	; X86-NOBMI2-NEXT: shll %cl, %eax			; X86-NOBMI2-NEXT: shll %cl, %eax
	; X86-NOBMI2-NEXT: shldl %cl, %edx, %edx
	; X86-NOBMI2-NEXT: testb $32, %cl			; X86-NOBMI2-NEXT: testb $32, %cl
	; X86-NOBMI2-NEXT: je .LBB16_2			; X86-NOBMI2-NEXT: je .LBB16_2
	; X86-NOBMI2-NEXT: # %bb.1:			; X86-NOBMI2-NEXT: # %bb.1:
	; X86-NOBMI2-NEXT: movl %eax, %edx			; X86-NOBMI2-NEXT: movl %eax, %edx
	; X86-NOBMI2-NEXT: xorl %eax, %eax			; X86-NOBMI2-NEXT: xorl %eax, %eax
	; X86-NOBMI2-NEXT: .LBB16_2:			; X86-NOBMI2-NEXT: .LBB16_2:
	; X86-NOBMI2-NEXT: andl 4(%esi), %edx
	; X86-NOBMI2-NEXT: andl (%esi), %eax			; X86-NOBMI2-NEXT: andl (%esi), %eax
				; X86-NOBMI2-NEXT: andl 4(%esi), %edx
	; X86-NOBMI2-NEXT: popl %esi			; X86-NOBMI2-NEXT: popl %esi
	; X86-NOBMI2-NEXT: retl			; X86-NOBMI2-NEXT: retl
	;			;
	; X86-BMI2-LABEL: clear_lowbits64_c3_load_indexzext:			; X86-BMI2-LABEL: clear_lowbits64_c3_load_indexzext:
	; X86-BMI2: # %bb.0:			; X86-BMI2: # %bb.0:
	; X86-BMI2-NEXT: pushl %esi			; X86-BMI2-NEXT: pushl %ebx
	; X86-BMI2-NEXT: movl {{[0-9]+}}(%esp), %esi			; X86-BMI2-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; X86-BMI2-NEXT: movb {{[0-9]+}}(%esp), %cl			; X86-BMI2-NEXT: movb {{[0-9]+}}(%esp), %bl
	; X86-BMI2-NEXT: movl $-1, %edx			; X86-BMI2-NEXT: movl $-1, %edx
	; X86-BMI2-NEXT: shlxl %ecx, %edx, %eax			; X86-BMI2-NEXT: shlxl %ebx, %edx, %eax
	; X86-BMI2-NEXT: shldl %cl, %edx, %edx			; X86-BMI2-NEXT: testb $32, %bl
	; X86-BMI2-NEXT: testb $32, %cl
	; X86-BMI2-NEXT: je .LBB16_2			; X86-BMI2-NEXT: je .LBB16_2
	; X86-BMI2-NEXT: # %bb.1:			; X86-BMI2-NEXT: # %bb.1:
	; X86-BMI2-NEXT: movl %eax, %edx			; X86-BMI2-NEXT: movl %eax, %edx
	; X86-BMI2-NEXT: xorl %eax, %eax			; X86-BMI2-NEXT: xorl %eax, %eax
	; X86-BMI2-NEXT: .LBB16_2:			; X86-BMI2-NEXT: .LBB16_2:
	; X86-BMI2-NEXT: andl 4(%esi), %edx			; X86-BMI2-NEXT: andl (%ecx), %eax
	; X86-BMI2-NEXT: andl (%esi), %eax			; X86-BMI2-NEXT: andl 4(%ecx), %edx
	; X86-BMI2-NEXT: popl %esi			; X86-BMI2-NEXT: popl %ebx
	; X86-BMI2-NEXT: retl			; X86-BMI2-NEXT: retl
	;			;
	; X64-NOBMI2-LABEL: clear_lowbits64_c3_load_indexzext:			; X64-NOBMI2-LABEL: clear_lowbits64_c3_load_indexzext:
	; X64-NOBMI2: # %bb.0:			; X64-NOBMI2: # %bb.0:
	; X64-NOBMI2-NEXT: movl %esi, %ecx			; X64-NOBMI2-NEXT: movl %esi, %ecx
	; X64-NOBMI2-NEXT: movq (%rdi), %rax			; X64-NOBMI2-NEXT: movq (%rdi), %rax
	; X64-NOBMI2-NEXT: shrq %cl, %rax			; X64-NOBMI2-NEXT: shrq %cl, %rax
	; X64-NOBMI2-NEXT: # kill: def $cl killed $cl killed $ecx			; X64-NOBMI2-NEXT: # kill: def $cl killed $cl killed $ecx
	[15 lines not shown]

	define i64 @clear_lowbits64_c4_commutative(i64 %val, i64 %numlowbits) nounwind {			define i64 @clear_lowbits64_c4_commutative(i64 %val, i64 %numlowbits) nounwind {
	; X86-NOBMI2-LABEL: clear_lowbits64_c4_commutative:			; X86-NOBMI2-LABEL: clear_lowbits64_c4_commutative:
	; X86-NOBMI2: # %bb.0:			; X86-NOBMI2: # %bb.0:
	; X86-NOBMI2-NEXT: movb {{[0-9]+}}(%esp), %cl			; X86-NOBMI2-NEXT: movb {{[0-9]+}}(%esp), %cl
	; X86-NOBMI2-NEXT: movl $-1, %edx			; X86-NOBMI2-NEXT: movl $-1, %edx
	; X86-NOBMI2-NEXT: movl $-1, %eax			; X86-NOBMI2-NEXT: movl $-1, %eax
	; X86-NOBMI2-NEXT: shll %cl, %eax			; X86-NOBMI2-NEXT: shll %cl, %eax
	; X86-NOBMI2-NEXT: shldl %cl, %edx, %edx
	; X86-NOBMI2-NEXT: testb $32, %cl			; X86-NOBMI2-NEXT: testb $32, %cl
	; X86-NOBMI2-NEXT: je .LBB17_2			; X86-NOBMI2-NEXT: je .LBB17_2
	; X86-NOBMI2-NEXT: # %bb.1:			; X86-NOBMI2-NEXT: # %bb.1:
	; X86-NOBMI2-NEXT: movl %eax, %edx			; X86-NOBMI2-NEXT: movl %eax, %edx
	; X86-NOBMI2-NEXT: xorl %eax, %eax			; X86-NOBMI2-NEXT: xorl %eax, %eax
	; X86-NOBMI2-NEXT: .LBB17_2:			; X86-NOBMI2-NEXT: .LBB17_2:
	; X86-NOBMI2-NEXT: andl {{[0-9]+}}(%esp), %edx
	; X86-NOBMI2-NEXT: andl {{[0-9]+}}(%esp), %eax			; X86-NOBMI2-NEXT: andl {{[0-9]+}}(%esp), %eax
				; X86-NOBMI2-NEXT: andl {{[0-9]+}}(%esp), %edx
	; X86-NOBMI2-NEXT: retl			; X86-NOBMI2-NEXT: retl
	;			;
	; X86-BMI2-LABEL: clear_lowbits64_c4_commutative:			; X86-BMI2-LABEL: clear_lowbits64_c4_commutative:
	; X86-BMI2: # %bb.0:			; X86-BMI2: # %bb.0:
	; X86-BMI2-NEXT: movb {{[0-9]+}}(%esp), %cl			; X86-BMI2-NEXT: movb {{[0-9]+}}(%esp), %cl
	; X86-BMI2-NEXT: movl $-1, %edx			; X86-BMI2-NEXT: movl $-1, %edx
	; X86-BMI2-NEXT: shlxl %ecx, %edx, %eax			; X86-BMI2-NEXT: shlxl %ecx, %edx, %eax
	; X86-BMI2-NEXT: shldl %cl, %edx, %edx
	; X86-BMI2-NEXT: testb $32, %cl			; X86-BMI2-NEXT: testb $32, %cl
	; X86-BMI2-NEXT: je .LBB17_2			; X86-BMI2-NEXT: je .LBB17_2
	; X86-BMI2-NEXT: # %bb.1:			; X86-BMI2-NEXT: # %bb.1:
	; X86-BMI2-NEXT: movl %eax, %edx			; X86-BMI2-NEXT: movl %eax, %edx
	; X86-BMI2-NEXT: xorl %eax, %eax			; X86-BMI2-NEXT: xorl %eax, %eax
	; X86-BMI2-NEXT: .LBB17_2:			; X86-BMI2-NEXT: .LBB17_2:
	; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %edx
	; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %eax			; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %eax
				; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %edx
	; X86-BMI2-NEXT: retl			; X86-BMI2-NEXT: retl
	;			;
	; X64-NOBMI2-LABEL: clear_lowbits64_c4_commutative:			; X64-NOBMI2-LABEL: clear_lowbits64_c4_commutative:
	; X64-NOBMI2: # %bb.0:			; X64-NOBMI2: # %bb.0:
	; X64-NOBMI2-NEXT: movq %rsi, %rcx			; X64-NOBMI2-NEXT: movq %rsi, %rcx
	; X64-NOBMI2-NEXT: movq %rdi, %rax			; X64-NOBMI2-NEXT: movq %rdi, %rax
	; X64-NOBMI2-NEXT: shrq %cl, %rax			; X64-NOBMI2-NEXT: shrq %cl, %rax
	; X64-NOBMI2-NEXT: # kill: def $cl killed $cl killed $rcx			; X64-NOBMI2-NEXT: # kill: def $cl killed $cl killed $rcx
	[552 lines not shown]
	define i64 @clear_lowbits64_ic0(i64 %val, i64 %numlowbits) nounwind {			define i64 @clear_lowbits64_ic0(i64 %val, i64 %numlowbits) nounwind {
	; X86-NOBMI2-LABEL: clear_lowbits64_ic0:			; X86-NOBMI2-LABEL: clear_lowbits64_ic0:
	; X86-NOBMI2: # %bb.0:			; X86-NOBMI2: # %bb.0:
	; X86-NOBMI2-NEXT: movb $64, %cl			; X86-NOBMI2-NEXT: movb $64, %cl
	; X86-NOBMI2-NEXT: subb {{[0-9]+}}(%esp), %cl			; X86-NOBMI2-NEXT: subb {{[0-9]+}}(%esp), %cl
	; X86-NOBMI2-NEXT: movl $-1, %edx			; X86-NOBMI2-NEXT: movl $-1, %edx
	; X86-NOBMI2-NEXT: movl $-1, %eax			; X86-NOBMI2-NEXT: movl $-1, %eax
	; X86-NOBMI2-NEXT: shll %cl, %eax			; X86-NOBMI2-NEXT: shll %cl, %eax
	; X86-NOBMI2-NEXT: shldl %cl, %edx, %edx
	; X86-NOBMI2-NEXT: testb $32, %cl			; X86-NOBMI2-NEXT: testb $32, %cl
	; X86-NOBMI2-NEXT: je .LBB31_2			; X86-NOBMI2-NEXT: je .LBB31_2
	; X86-NOBMI2-NEXT: # %bb.1:			; X86-NOBMI2-NEXT: # %bb.1:
	; X86-NOBMI2-NEXT: movl %eax, %edx			; X86-NOBMI2-NEXT: movl %eax, %edx
	; X86-NOBMI2-NEXT: xorl %eax, %eax			; X86-NOBMI2-NEXT: xorl %eax, %eax
	; X86-NOBMI2-NEXT: .LBB31_2:			; X86-NOBMI2-NEXT: .LBB31_2:
	; X86-NOBMI2-NEXT: andl {{[0-9]+}}(%esp), %edx
	; X86-NOBMI2-NEXT: andl {{[0-9]+}}(%esp), %eax			; X86-NOBMI2-NEXT: andl {{[0-9]+}}(%esp), %eax
				; X86-NOBMI2-NEXT: andl {{[0-9]+}}(%esp), %edx
	; X86-NOBMI2-NEXT: retl			; X86-NOBMI2-NEXT: retl
	;			;
	; X86-BMI2-LABEL: clear_lowbits64_ic0:			; X86-BMI2-LABEL: clear_lowbits64_ic0:
	; X86-BMI2: # %bb.0:			; X86-BMI2: # %bb.0:
	; X86-BMI2-NEXT: movb $64, %cl			; X86-BMI2-NEXT: movb $64, %cl
	; X86-BMI2-NEXT: subb {{[0-9]+}}(%esp), %cl			; X86-BMI2-NEXT: subb {{[0-9]+}}(%esp), %cl
	; X86-BMI2-NEXT: movl $-1, %edx			; X86-BMI2-NEXT: movl $-1, %edx
	; X86-BMI2-NEXT: shlxl %ecx, %edx, %eax			; X86-BMI2-NEXT: shlxl %ecx, %edx, %eax
	; X86-BMI2-NEXT: shldl %cl, %edx, %edx
	; X86-BMI2-NEXT: testb $32, %cl			; X86-BMI2-NEXT: testb $32, %cl
	; X86-BMI2-NEXT: je .LBB31_2			; X86-BMI2-NEXT: je .LBB31_2
	; X86-BMI2-NEXT: # %bb.1:			; X86-BMI2-NEXT: # %bb.1:
	; X86-BMI2-NEXT: movl %eax, %edx			; X86-BMI2-NEXT: movl %eax, %edx
	; X86-BMI2-NEXT: xorl %eax, %eax			; X86-BMI2-NEXT: xorl %eax, %eax
	; X86-BMI2-NEXT: .LBB31_2:			; X86-BMI2-NEXT: .LBB31_2:
	; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %edx
	; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %eax			; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %eax
				; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %edx
	; X86-BMI2-NEXT: retl			; X86-BMI2-NEXT: retl
	;			;
	; X64-NOBMI2-LABEL: clear_lowbits64_ic0:			; X64-NOBMI2-LABEL: clear_lowbits64_ic0:
	; X64-NOBMI2: # %bb.0:			; X64-NOBMI2: # %bb.0:
	; X64-NOBMI2-NEXT: movq %rsi, %rcx			; X64-NOBMI2-NEXT: movq %rsi, %rcx
	; X64-NOBMI2-NEXT: movq %rdi, %rax			; X64-NOBMI2-NEXT: movq %rdi, %rax
	; X64-NOBMI2-NEXT: negb %cl			; X64-NOBMI2-NEXT: negb %cl
	; X64-NOBMI2-NEXT: shrq %cl, %rax			; X64-NOBMI2-NEXT: shrq %cl, %rax
	[16 lines not shown]
	define i64 @clear_lowbits64_ic1_indexzext(i64 %val, i8 %numlowbits) nounwind {			define i64 @clear_lowbits64_ic1_indexzext(i64 %val, i8 %numlowbits) nounwind {
	; X86-NOBMI2-LABEL: clear_lowbits64_ic1_indexzext:			; X86-NOBMI2-LABEL: clear_lowbits64_ic1_indexzext:
	; X86-NOBMI2: # %bb.0:			; X86-NOBMI2: # %bb.0:
	; X86-NOBMI2-NEXT: movb $64, %cl			; X86-NOBMI2-NEXT: movb $64, %cl
	; X86-NOBMI2-NEXT: subb {{[0-9]+}}(%esp), %cl			; X86-NOBMI2-NEXT: subb {{[0-9]+}}(%esp), %cl
	; X86-NOBMI2-NEXT: movl $-1, %edx			; X86-NOBMI2-NEXT: movl $-1, %edx
	; X86-NOBMI2-NEXT: movl $-1, %eax			; X86-NOBMI2-NEXT: movl $-1, %eax
	; X86-NOBMI2-NEXT: shll %cl, %eax			; X86-NOBMI2-NEXT: shll %cl, %eax
	; X86-NOBMI2-NEXT: shldl %cl, %edx, %edx
	; X86-NOBMI2-NEXT: testb $32, %cl			; X86-NOBMI2-NEXT: testb $32, %cl
	; X86-NOBMI2-NEXT: je .LBB32_2			; X86-NOBMI2-NEXT: je .LBB32_2
	; X86-NOBMI2-NEXT: # %bb.1:			; X86-NOBMI2-NEXT: # %bb.1:
	; X86-NOBMI2-NEXT: movl %eax, %edx			; X86-NOBMI2-NEXT: movl %eax, %edx
	; X86-NOBMI2-NEXT: xorl %eax, %eax			; X86-NOBMI2-NEXT: xorl %eax, %eax
	; X86-NOBMI2-NEXT: .LBB32_2:			; X86-NOBMI2-NEXT: .LBB32_2:
	; X86-NOBMI2-NEXT: andl {{[0-9]+}}(%esp), %edx
	; X86-NOBMI2-NEXT: andl {{[0-9]+}}(%esp), %eax			; X86-NOBMI2-NEXT: andl {{[0-9]+}}(%esp), %eax
				; X86-NOBMI2-NEXT: andl {{[0-9]+}}(%esp), %edx
	; X86-NOBMI2-NEXT: retl			; X86-NOBMI2-NEXT: retl
	;			;
	; X86-BMI2-LABEL: clear_lowbits64_ic1_indexzext:			; X86-BMI2-LABEL: clear_lowbits64_ic1_indexzext:
	; X86-BMI2: # %bb.0:			; X86-BMI2: # %bb.0:
	; X86-BMI2-NEXT: movb $64, %cl			; X86-BMI2-NEXT: movb $64, %cl
	; X86-BMI2-NEXT: subb {{[0-9]+}}(%esp), %cl			; X86-BMI2-NEXT: subb {{[0-9]+}}(%esp), %cl
	; X86-BMI2-NEXT: movl $-1, %edx			; X86-BMI2-NEXT: movl $-1, %edx
	; X86-BMI2-NEXT: shlxl %ecx, %edx, %eax			; X86-BMI2-NEXT: shlxl %ecx, %edx, %eax
	; X86-BMI2-NEXT: shldl %cl, %edx, %edx
	; X86-BMI2-NEXT: testb $32, %cl			; X86-BMI2-NEXT: testb $32, %cl
	; X86-BMI2-NEXT: je .LBB32_2			; X86-BMI2-NEXT: je .LBB32_2
	; X86-BMI2-NEXT: # %bb.1:			; X86-BMI2-NEXT: # %bb.1:
	; X86-BMI2-NEXT: movl %eax, %edx			; X86-BMI2-NEXT: movl %eax, %edx
	; X86-BMI2-NEXT: xorl %eax, %eax			; X86-BMI2-NEXT: xorl %eax, %eax
	; X86-BMI2-NEXT: .LBB32_2:			; X86-BMI2-NEXT: .LBB32_2:
	; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %edx
	; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %eax			; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %eax
				; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %edx
	; X86-BMI2-NEXT: retl			; X86-BMI2-NEXT: retl
	;			;
	; X64-NOBMI2-LABEL: clear_lowbits64_ic1_indexzext:			; X64-NOBMI2-LABEL: clear_lowbits64_ic1_indexzext:
	; X64-NOBMI2: # %bb.0:			; X64-NOBMI2: # %bb.0:
	; X64-NOBMI2-NEXT: movl %esi, %ecx			; X64-NOBMI2-NEXT: movl %esi, %ecx
	; X64-NOBMI2-NEXT: movq %rdi, %rax			; X64-NOBMI2-NEXT: movq %rdi, %rax
	; X64-NOBMI2-NEXT: negb %cl			; X64-NOBMI2-NEXT: negb %cl
	; X64-NOBMI2-NEXT: shrq %cl, %rax			; X64-NOBMI2-NEXT: shrq %cl, %rax
	[20 lines not shown]
	; X86-NOBMI2: # %bb.0:			; X86-NOBMI2: # %bb.0:
	; X86-NOBMI2-NEXT: pushl %esi			; X86-NOBMI2-NEXT: pushl %esi
	; X86-NOBMI2-NEXT: movl {{[0-9]+}}(%esp), %esi			; X86-NOBMI2-NEXT: movl {{[0-9]+}}(%esp), %esi
	; X86-NOBMI2-NEXT: movb $64, %cl			; X86-NOBMI2-NEXT: movb $64, %cl
	; X86-NOBMI2-NEXT: subb {{[0-9]+}}(%esp), %cl			; X86-NOBMI2-NEXT: subb {{[0-9]+}}(%esp), %cl
	; X86-NOBMI2-NEXT: movl $-1, %edx			; X86-NOBMI2-NEXT: movl $-1, %edx
	; X86-NOBMI2-NEXT: movl $-1, %eax			; X86-NOBMI2-NEXT: movl $-1, %eax
	; X86-NOBMI2-NEXT: shll %cl, %eax			; X86-NOBMI2-NEXT: shll %cl, %eax
	; X86-NOBMI2-NEXT: shldl %cl, %edx, %edx
	; X86-NOBMI2-NEXT: testb $32, %cl			; X86-NOBMI2-NEXT: testb $32, %cl
	; X86-NOBMI2-NEXT: je .LBB33_2			; X86-NOBMI2-NEXT: je .LBB33_2
	; X86-NOBMI2-NEXT: # %bb.1:			; X86-NOBMI2-NEXT: # %bb.1:
	; X86-NOBMI2-NEXT: movl %eax, %edx			; X86-NOBMI2-NEXT: movl %eax, %edx
	; X86-NOBMI2-NEXT: xorl %eax, %eax			; X86-NOBMI2-NEXT: xorl %eax, %eax
	; X86-NOBMI2-NEXT: .LBB33_2:			; X86-NOBMI2-NEXT: .LBB33_2:
	; X86-NOBMI2-NEXT: andl 4(%esi), %edx
	; X86-NOBMI2-NEXT: andl (%esi), %eax			; X86-NOBMI2-NEXT: andl (%esi), %eax
				; X86-NOBMI2-NEXT: andl 4(%esi), %edx
	; X86-NOBMI2-NEXT: popl %esi			; X86-NOBMI2-NEXT: popl %esi
	; X86-NOBMI2-NEXT: retl			; X86-NOBMI2-NEXT: retl
	;			;
	; X86-BMI2-LABEL: clear_lowbits64_ic2_load:			; X86-BMI2-LABEL: clear_lowbits64_ic2_load:
	; X86-BMI2: # %bb.0:			; X86-BMI2: # %bb.0:
	; X86-BMI2-NEXT: pushl %esi			; X86-BMI2-NEXT: pushl %ebx
	; X86-BMI2-NEXT: movl {{[0-9]+}}(%esp), %esi			; X86-BMI2-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; X86-BMI2-NEXT: movb $64, %cl			; X86-BMI2-NEXT: movb $64, %bl
	; X86-BMI2-NEXT: subb {{[0-9]+}}(%esp), %cl			; X86-BMI2-NEXT: subb {{[0-9]+}}(%esp), %bl
	; X86-BMI2-NEXT: movl $-1, %edx			; X86-BMI2-NEXT: movl $-1, %edx
	; X86-BMI2-NEXT: shlxl %ecx, %edx, %eax			; X86-BMI2-NEXT: shlxl %ebx, %edx, %eax
	; X86-BMI2-NEXT: shldl %cl, %edx, %edx			; X86-BMI2-NEXT: testb $32, %bl
	; X86-BMI2-NEXT: testb $32, %cl
	; X86-BMI2-NEXT: je .LBB33_2			; X86-BMI2-NEXT: je .LBB33_2
	; X86-BMI2-NEXT: # %bb.1:			; X86-BMI2-NEXT: # %bb.1:
	; X86-BMI2-NEXT: movl %eax, %edx			; X86-BMI2-NEXT: movl %eax, %edx
	; X86-BMI2-NEXT: xorl %eax, %eax			; X86-BMI2-NEXT: xorl %eax, %eax
	; X86-BMI2-NEXT: .LBB33_2:			; X86-BMI2-NEXT: .LBB33_2:
	; X86-BMI2-NEXT: andl 4(%esi), %edx			; X86-BMI2-NEXT: andl (%ecx), %eax
	; X86-BMI2-NEXT: andl (%esi), %eax			; X86-BMI2-NEXT: andl 4(%ecx), %edx
	; X86-BMI2-NEXT: popl %esi			; X86-BMI2-NEXT: popl %ebx
	; X86-BMI2-NEXT: retl			; X86-BMI2-NEXT: retl
	;			;
	; X64-NOBMI2-LABEL: clear_lowbits64_ic2_load:			; X64-NOBMI2-LABEL: clear_lowbits64_ic2_load:
	; X64-NOBMI2: # %bb.0:			; X64-NOBMI2: # %bb.0:
	; X64-NOBMI2-NEXT: movq %rsi, %rcx			; X64-NOBMI2-NEXT: movq %rsi, %rcx
	; X64-NOBMI2-NEXT: movq (%rdi), %rax			; X64-NOBMI2-NEXT: movq (%rdi), %rax
	; X64-NOBMI2-NEXT: negb %cl			; X64-NOBMI2-NEXT: negb %cl
	; X64-NOBMI2-NEXT: shrq %cl, %rax			; X64-NOBMI2-NEXT: shrq %cl, %rax
	[19 lines not shown]
	; X86-NOBMI2: # %bb.0:			; X86-NOBMI2: # %bb.0:
	; X86-NOBMI2-NEXT: pushl %esi			; X86-NOBMI2-NEXT: pushl %esi
	; X86-NOBMI2-NEXT: movl {{[0-9]+}}(%esp), %esi			; X86-NOBMI2-NEXT: movl {{[0-9]+}}(%esp), %esi
	; X86-NOBMI2-NEXT: movb $64, %cl			; X86-NOBMI2-NEXT: movb $64, %cl
	; X86-NOBMI2-NEXT: subb {{[0-9]+}}(%esp), %cl			; X86-NOBMI2-NEXT: subb {{[0-9]+}}(%esp), %cl
	; X86-NOBMI2-NEXT: movl $-1, %edx			; X86-NOBMI2-NEXT: movl $-1, %edx
	; X86-NOBMI2-NEXT: movl $-1, %eax			; X86-NOBMI2-NEXT: movl $-1, %eax
	; X86-NOBMI2-NEXT: shll %cl, %eax			; X86-NOBMI2-NEXT: shll %cl, %eax
	; X86-NOBMI2-NEXT: shldl %cl, %edx, %edx
	; X86-NOBMI2-NEXT: testb $32, %cl			; X86-NOBMI2-NEXT: testb $32, %cl
	; X86-NOBMI2-NEXT: je .LBB34_2			; X86-NOBMI2-NEXT: je .LBB34_2
	; X86-NOBMI2-NEXT: # %bb.1:			; X86-NOBMI2-NEXT: # %bb.1:
	; X86-NOBMI2-NEXT: movl %eax, %edx			; X86-NOBMI2-NEXT: movl %eax, %edx
	; X86-NOBMI2-NEXT: xorl %eax, %eax			; X86-NOBMI2-NEXT: xorl %eax, %eax
	; X86-NOBMI2-NEXT: .LBB34_2:			; X86-NOBMI2-NEXT: .LBB34_2:
	; X86-NOBMI2-NEXT: andl 4(%esi), %edx
	; X86-NOBMI2-NEXT: andl (%esi), %eax			; X86-NOBMI2-NEXT: andl (%esi), %eax
				; X86-NOBMI2-NEXT: andl 4(%esi), %edx
	; X86-NOBMI2-NEXT: popl %esi			; X86-NOBMI2-NEXT: popl %esi
	; X86-NOBMI2-NEXT: retl			; X86-NOBMI2-NEXT: retl
	;			;
	; X86-BMI2-LABEL: clear_lowbits64_ic3_load_indexzext:			; X86-BMI2-LABEL: clear_lowbits64_ic3_load_indexzext:
	; X86-BMI2: # %bb.0:			; X86-BMI2: # %bb.0:
	; X86-BMI2-NEXT: pushl %esi			; X86-BMI2-NEXT: pushl %ebx
	; X86-BMI2-NEXT: movl {{[0-9]+}}(%esp), %esi			; X86-BMI2-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; X86-BMI2-NEXT: movb $64, %cl			; X86-BMI2-NEXT: movb $64, %bl
	; X86-BMI2-NEXT: subb {{[0-9]+}}(%esp), %cl			; X86-BMI2-NEXT: subb {{[0-9]+}}(%esp), %bl
	; X86-BMI2-NEXT: movl $-1, %edx			; X86-BMI2-NEXT: movl $-1, %edx
	; X86-BMI2-NEXT: shlxl %ecx, %edx, %eax			; X86-BMI2-NEXT: shlxl %ebx, %edx, %eax
	; X86-BMI2-NEXT: shldl %cl, %edx, %edx			; X86-BMI2-NEXT: testb $32, %bl
	; X86-BMI2-NEXT: testb $32, %cl
	; X86-BMI2-NEXT: je .LBB34_2			; X86-BMI2-NEXT: je .LBB34_2
	; X86-BMI2-NEXT: # %bb.1:			; X86-BMI2-NEXT: # %bb.1:
	; X86-BMI2-NEXT: movl %eax, %edx			; X86-BMI2-NEXT: movl %eax, %edx
	; X86-BMI2-NEXT: xorl %eax, %eax			; X86-BMI2-NEXT: xorl %eax, %eax
	; X86-BMI2-NEXT: .LBB34_2:			; X86-BMI2-NEXT: .LBB34_2:
	; X86-BMI2-NEXT: andl 4(%esi), %edx			; X86-BMI2-NEXT: andl (%ecx), %eax
	; X86-BMI2-NEXT: andl (%esi), %eax			; X86-BMI2-NEXT: andl 4(%ecx), %edx
	; X86-BMI2-NEXT: popl %esi			; X86-BMI2-NEXT: popl %ebx
	; X86-BMI2-NEXT: retl			; X86-BMI2-NEXT: retl
	;			;
	; X64-NOBMI2-LABEL: clear_lowbits64_ic3_load_indexzext:			; X64-NOBMI2-LABEL: clear_lowbits64_ic3_load_indexzext:
	; X64-NOBMI2: # %bb.0:			; X64-NOBMI2: # %bb.0:
	; X64-NOBMI2-NEXT: movl %esi, %ecx			; X64-NOBMI2-NEXT: movl %esi, %ecx
	; X64-NOBMI2-NEXT: movq (%rdi), %rax			; X64-NOBMI2-NEXT: movq (%rdi), %rax
	; X64-NOBMI2-NEXT: negb %cl			; X64-NOBMI2-NEXT: negb %cl
	; X64-NOBMI2-NEXT: shrq %cl, %rax			; X64-NOBMI2-NEXT: shrq %cl, %rax
	[19 lines not shown]
	define i64 @clear_lowbits64_ic4_commutative(i64 %val, i64 %numlowbits) nounwind {			define i64 @clear_lowbits64_ic4_commutative(i64 %val, i64 %numlowbits) nounwind {
	; X86-NOBMI2-LABEL: clear_lowbits64_ic4_commutative:			; X86-NOBMI2-LABEL: clear_lowbits64_ic4_commutative:
	; X86-NOBMI2: # %bb.0:			; X86-NOBMI2: # %bb.0:
	; X86-NOBMI2-NEXT: movb $64, %cl			; X86-NOBMI2-NEXT: movb $64, %cl
	; X86-NOBMI2-NEXT: subb {{[0-9]+}}(%esp), %cl			; X86-NOBMI2-NEXT: subb {{[0-9]+}}(%esp), %cl
	; X86-NOBMI2-NEXT: movl $-1, %edx			; X86-NOBMI2-NEXT: movl $-1, %edx
	; X86-NOBMI2-NEXT: movl $-1, %eax			; X86-NOBMI2-NEXT: movl $-1, %eax
	; X86-NOBMI2-NEXT: shll %cl, %eax			; X86-NOBMI2-NEXT: shll %cl, %eax
	; X86-NOBMI2-NEXT: shldl %cl, %edx, %edx
	; X86-NOBMI2-NEXT: testb $32, %cl			; X86-NOBMI2-NEXT: testb $32, %cl
	; X86-NOBMI2-NEXT: je .LBB35_2			; X86-NOBMI2-NEXT: je .LBB35_2
	; X86-NOBMI2-NEXT: # %bb.1:			; X86-NOBMI2-NEXT: # %bb.1:
	; X86-NOBMI2-NEXT: movl %eax, %edx			; X86-NOBMI2-NEXT: movl %eax, %edx
	; X86-NOBMI2-NEXT: xorl %eax, %eax			; X86-NOBMI2-NEXT: xorl %eax, %eax
	; X86-NOBMI2-NEXT: .LBB35_2:			; X86-NOBMI2-NEXT: .LBB35_2:
	; X86-NOBMI2-NEXT: andl {{[0-9]+}}(%esp), %edx
	; X86-NOBMI2-NEXT: andl {{[0-9]+}}(%esp), %eax			; X86-NOBMI2-NEXT: andl {{[0-9]+}}(%esp), %eax
				; X86-NOBMI2-NEXT: andl {{[0-9]+}}(%esp), %edx
	; X86-NOBMI2-NEXT: retl			; X86-NOBMI2-NEXT: retl
	;			;
	; X86-BMI2-LABEL: clear_lowbits64_ic4_commutative:			; X86-BMI2-LABEL: clear_lowbits64_ic4_commutative:
	; X86-BMI2: # %bb.0:			; X86-BMI2: # %bb.0:
	; X86-BMI2-NEXT: movb $64, %cl			; X86-BMI2-NEXT: movb $64, %cl
	; X86-BMI2-NEXT: subb {{[0-9]+}}(%esp), %cl			; X86-BMI2-NEXT: subb {{[0-9]+}}(%esp), %cl
	; X86-BMI2-NEXT: movl $-1, %edx			; X86-BMI2-NEXT: movl $-1, %edx
	; X86-BMI2-NEXT: shlxl %ecx, %edx, %eax			; X86-BMI2-NEXT: shlxl %ecx, %edx, %eax
	; X86-BMI2-NEXT: shldl %cl, %edx, %edx
	; X86-BMI2-NEXT: testb $32, %cl			; X86-BMI2-NEXT: testb $32, %cl
	; X86-BMI2-NEXT: je .LBB35_2			; X86-BMI2-NEXT: je .LBB35_2
	; X86-BMI2-NEXT: # %bb.1:			; X86-BMI2-NEXT: # %bb.1:
	; X86-BMI2-NEXT: movl %eax, %edx			; X86-BMI2-NEXT: movl %eax, %edx
	; X86-BMI2-NEXT: xorl %eax, %eax			; X86-BMI2-NEXT: xorl %eax, %eax
	; X86-BMI2-NEXT: .LBB35_2:			; X86-BMI2-NEXT: .LBB35_2:
	; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %edx
	; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %eax			; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %eax
				; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %edx
	; X86-BMI2-NEXT: retl			; X86-BMI2-NEXT: retl
	;			;
	; X64-NOBMI2-LABEL: clear_lowbits64_ic4_commutative:			; X64-NOBMI2-LABEL: clear_lowbits64_ic4_commutative:
	; X64-NOBMI2: # %bb.0:			; X64-NOBMI2: # %bb.0:
	; X64-NOBMI2-NEXT: movq %rsi, %rcx			; X64-NOBMI2-NEXT: movq %rsi, %rcx
	; X64-NOBMI2-NEXT: movq %rdi, %rax			; X64-NOBMI2-NEXT: movq %rdi, %rax
	; X64-NOBMI2-NEXT: negb %cl			; X64-NOBMI2-NEXT: negb %cl
	; X64-NOBMI2-NEXT: shrq %cl, %rax			; X64-NOBMI2-NEXT: shrq %cl, %rax
	[95 lines not shown]
	define i64 @oneuse64(i64 %val, i64 %numlowbits) nounwind {			define i64 @oneuse64(i64 %val, i64 %numlowbits) nounwind {
	; X86-NOBMI2-LABEL: oneuse64:			; X86-NOBMI2-LABEL: oneuse64:
	; X86-NOBMI2: # %bb.0:			; X86-NOBMI2: # %bb.0:
	; X86-NOBMI2-NEXT: pushl %edi			; X86-NOBMI2-NEXT: pushl %edi
	; X86-NOBMI2-NEXT: pushl %esi			; X86-NOBMI2-NEXT: pushl %esi
	; X86-NOBMI2-NEXT: pushl %eax			; X86-NOBMI2-NEXT: pushl %eax
	; X86-NOBMI2-NEXT: movb {{[0-9]+}}(%esp), %cl			; X86-NOBMI2-NEXT: movb {{[0-9]+}}(%esp), %cl
	; X86-NOBMI2-NEXT: movl $-1, %esi			; X86-NOBMI2-NEXT: movl $-1, %esi
	; X86-NOBMI2-NEXT: movl $-1, %edi			; X86-NOBMI2-NEXT: movl $-1, %eax
	; X86-NOBMI2-NEXT: shll %cl, %edi			; X86-NOBMI2-NEXT: shll %cl, %eax
	; X86-NOBMI2-NEXT: shldl %cl, %esi, %esi
	; X86-NOBMI2-NEXT: testb $32, %cl
	; X86-NOBMI2-NEXT: je .LBB37_2
	; X86-NOBMI2-NEXT: # %bb.1:
	; X86-NOBMI2-NEXT: movl %edi, %esi
	; X86-NOBMI2-NEXT: xorl %edi, %edi			; X86-NOBMI2-NEXT: xorl %edi, %edi
	; X86-NOBMI2-NEXT: .LBB37_2:			; X86-NOBMI2-NEXT: testb $32, %cl
				; X86-NOBMI2-NEXT: jne .LBB37_1
				; X86-NOBMI2-NEXT: # %bb.2:
				; X86-NOBMI2-NEXT: movl %eax, %edi
				; X86-NOBMI2-NEXT: jmp .LBB37_3
				; X86-NOBMI2-NEXT: .LBB37_1:
				; X86-NOBMI2-NEXT: movl %eax, %esi
				; X86-NOBMI2-NEXT: .LBB37_3:
	; X86-NOBMI2-NEXT: subl $8, %esp			; X86-NOBMI2-NEXT: subl $8, %esp
	; X86-NOBMI2-NEXT: pushl %esi			; X86-NOBMI2-NEXT: pushl %esi
	; X86-NOBMI2-NEXT: pushl %edi			; X86-NOBMI2-NEXT: pushl %edi
	; X86-NOBMI2-NEXT: calll use64			; X86-NOBMI2-NEXT: calll use64
	; X86-NOBMI2-NEXT: addl $16, %esp			; X86-NOBMI2-NEXT: addl $16, %esp
	; X86-NOBMI2-NEXT: andl {{[0-9]+}}(%esp), %esi
	; X86-NOBMI2-NEXT: andl {{[0-9]+}}(%esp), %edi			; X86-NOBMI2-NEXT: andl {{[0-9]+}}(%esp), %edi
				; X86-NOBMI2-NEXT: andl {{[0-9]+}}(%esp), %esi
	; X86-NOBMI2-NEXT: movl %edi, %eax			; X86-NOBMI2-NEXT: movl %edi, %eax
	; X86-NOBMI2-NEXT: movl %esi, %edx			; X86-NOBMI2-NEXT: movl %esi, %edx
	; X86-NOBMI2-NEXT: addl $4, %esp			; X86-NOBMI2-NEXT: addl $4, %esp
	; X86-NOBMI2-NEXT: popl %esi			; X86-NOBMI2-NEXT: popl %esi
	; X86-NOBMI2-NEXT: popl %edi			; X86-NOBMI2-NEXT: popl %edi
	; X86-NOBMI2-NEXT: retl			; X86-NOBMI2-NEXT: retl
	;			;
	; X86-BMI2-LABEL: oneuse64:			; X86-BMI2-LABEL: oneuse64:
	; X86-BMI2: # %bb.0:			; X86-BMI2: # %bb.0:
	; X86-BMI2-NEXT: pushl %edi			; X86-BMI2-NEXT: pushl %edi
	; X86-BMI2-NEXT: pushl %esi			; X86-BMI2-NEXT: pushl %esi
	; X86-BMI2-NEXT: pushl %eax			; X86-BMI2-NEXT: pushl %eax
	; X86-BMI2-NEXT: movb {{[0-9]+}}(%esp), %cl			; X86-BMI2-NEXT: movb {{[0-9]+}}(%esp), %cl
	; X86-BMI2-NEXT: movl $-1, %esi			; X86-BMI2-NEXT: movl $-1, %esi
	; X86-BMI2-NEXT: shlxl %ecx, %esi, %edi			; X86-BMI2-NEXT: shlxl %ecx, %esi, %eax
	; X86-BMI2-NEXT: shldl %cl, %esi, %esi
	; X86-BMI2-NEXT: testb $32, %cl
	; X86-BMI2-NEXT: je .LBB37_2
	; X86-BMI2-NEXT: # %bb.1:
	; X86-BMI2-NEXT: movl %edi, %esi
	; X86-BMI2-NEXT: xorl %edi, %edi			; X86-BMI2-NEXT: xorl %edi, %edi
	; X86-BMI2-NEXT: .LBB37_2:			; X86-BMI2-NEXT: testb $32, %cl
				; X86-BMI2-NEXT: jne .LBB37_1
				; X86-BMI2-NEXT: # %bb.2:
				; X86-BMI2-NEXT: movl %eax, %edi
				; X86-BMI2-NEXT: jmp .LBB37_3
				; X86-BMI2-NEXT: .LBB37_1:
				; X86-BMI2-NEXT: movl %eax, %esi
				; X86-BMI2-NEXT: .LBB37_3:
	; X86-BMI2-NEXT: subl $8, %esp			; X86-BMI2-NEXT: subl $8, %esp
	; X86-BMI2-NEXT: pushl %esi			; X86-BMI2-NEXT: pushl %esi
	; X86-BMI2-NEXT: pushl %edi			; X86-BMI2-NEXT: pushl %edi
	; X86-BMI2-NEXT: calll use64			; X86-BMI2-NEXT: calll use64
	; X86-BMI2-NEXT: addl $16, %esp			; X86-BMI2-NEXT: addl $16, %esp
	; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %esi
	; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %edi			; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %edi
				; X86-BMI2-NEXT: andl {{[0-9]+}}(%esp), %esi
	; X86-BMI2-NEXT: movl %edi, %eax			; X86-BMI2-NEXT: movl %edi, %eax
	; X86-BMI2-NEXT: movl %esi, %edx			; X86-BMI2-NEXT: movl %esi, %edx
	; X86-BMI2-NEXT: addl $4, %esp			; X86-BMI2-NEXT: addl $4, %esp
	; X86-BMI2-NEXT: popl %esi			; X86-BMI2-NEXT: popl %esi
	; X86-BMI2-NEXT: popl %edi			; X86-BMI2-NEXT: popl %edi
	; X86-BMI2-NEXT: retl			; X86-BMI2-NEXT: retl
	;			;
	; X64-NOBMI2-LABEL: oneuse64:			; X64-NOBMI2-LABEL: oneuse64:
	[39 lines not shown]
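
Note on the hunks above: in each changed function the left column contains a `shldl %cl, %reg, %reg` whose register was just loaded with -1, and the right column drops it. That is consistent with the fold described in the revision summary, where a funnel shift of two all-ones inputs reduces to a rotate of all-ones and then to the constant -1. A minimal Python sketch of that arithmetic follows. The helper names `fshl32` and `rotl32` are hypothetical, and the model assumes the generic funnel-shift semantics (shift amount taken modulo the bit width); it is an illustration, not the DAG combine itself.

```python
MASK32 = 0xFFFFFFFF  # all-ones 32-bit value, i.e. -1 as an unsigned i32


def fshl32(hi, lo, amt):
    """Model a 32-bit funnel shift left (hypothetical helper).

    Concatenates hi:lo into 64 bits, shifts left by amt modulo 32,
    and returns the upper 32 bits of the 64-bit result.
    """
    amt %= 32
    concat = ((hi & MASK32) << 32) | (lo & MASK32)
    return ((concat << amt) >> 32) & MASK32


def rotl32(x, amt):
    """Model a 32-bit rotate left (hypothetical helper)."""
    amt %= 32
    x &= MASK32
    return ((x << amt) | (x >> (32 - amt))) & MASK32


# A funnel shift with identical inputs behaves as a rotate, and a
# rotate of all-ones is all-ones regardless of the amount.
for amt in range(64):
    assert fshl32(MASK32, MASK32, amt) == rotl32(MASK32, amt) == MASK32
```

Because the result no longer depends on the shift amount, only the `shll`/`shlxl` of the low half and the `testb $32` select remain in the right-hand column.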

llvm/test/CodeGen/X86/extract-bits.ll


[2,655 lines not shown]
; X64-BMI1BMI2-NEXT: retq
ret i32 %masked		ret i32 %masked
}		}

; 64-bit		; 64-bit

define i64 @bextr64_b0(i64 %val, i64 %numskipbits, i64 %numlowbits) nounwind {		define i64 @bextr64_b0(i64 %val, i64 %numskipbits, i64 %numlowbits) nounwind {
; X86-NOBMI-LABEL: bextr64_b0:		; X86-NOBMI-LABEL: bextr64_b0:
; X86-NOBMI: # %bb.0:		; X86-NOBMI: # %bb.0:
		; X86-NOBMI-NEXT: pushl %ebx
; X86-NOBMI-NEXT: pushl %edi		; X86-NOBMI-NEXT: pushl %edi
; X86-NOBMI-NEXT: pushl %esi		; X86-NOBMI-NEXT: pushl %esi
; X86-NOBMI-NEXT: movb {{[0-9]+}}(%esp), %ch		; X86-NOBMI-NEXT: movb {{[0-9]+}}(%esp), %ch
; X86-NOBMI-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-NOBMI-NEXT: movb {{[0-9]+}}(%esp), %cl
; X86-NOBMI-NEXT: movl {{[0-9]+}}(%esp), %esi		; X86-NOBMI-NEXT: movl {{[0-9]+}}(%esp), %esi
; X86-NOBMI-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-NOBMI-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NOBMI-NEXT: movl %eax, %edi		; X86-NOBMI-NEXT: movl %eax, %edi
; X86-NOBMI-NEXT: shrl %cl, %edi		; X86-NOBMI-NEXT: shrl %cl, %edi
; X86-NOBMI-NEXT: shrdl %cl, %eax, %esi		; X86-NOBMI-NEXT: shrdl %cl, %eax, %esi
		; X86-NOBMI-NEXT: xorl %eax, %eax
; X86-NOBMI-NEXT: testb $32, %cl		; X86-NOBMI-NEXT: testb $32, %cl
; X86-NOBMI-NEXT: je .LBB25_2		; X86-NOBMI-NEXT: je .LBB25_2
; X86-NOBMI-NEXT: # %bb.1:		; X86-NOBMI-NEXT: # %bb.1:
; X86-NOBMI-NEXT: movl %edi, %esi		; X86-NOBMI-NEXT: movl %edi, %esi
; X86-NOBMI-NEXT: xorl %edi, %edi		; X86-NOBMI-NEXT: xorl %edi, %edi
; X86-NOBMI-NEXT: .LBB25_2:		; X86-NOBMI-NEXT: .LBB25_2:
; X86-NOBMI-NEXT: movl $-1, %edx		; X86-NOBMI-NEXT: movl $-1, %edx
; X86-NOBMI-NEXT: movl $-1, %eax		; X86-NOBMI-NEXT: movl $-1, %ebx
; X86-NOBMI-NEXT: movb %ch, %cl		; X86-NOBMI-NEXT: movb %ch, %cl
; X86-NOBMI-NEXT: shll %cl, %eax		; X86-NOBMI-NEXT: shll %cl, %ebx
; X86-NOBMI-NEXT: shldl %cl, %edx, %edx
; X86-NOBMI-NEXT: testb $32, %ch		; X86-NOBMI-NEXT: testb $32, %ch
; X86-NOBMI-NEXT: je .LBB25_4		; X86-NOBMI-NEXT: jne .LBB25_3
; X86-NOBMI-NEXT: # %bb.3:		; X86-NOBMI-NEXT: # %bb.4:
; X86-NOBMI-NEXT: movl %eax, %edx		; X86-NOBMI-NEXT: movl %ebx, %eax
; X86-NOBMI-NEXT: xorl %eax, %eax		; X86-NOBMI-NEXT: jmp .LBB25_5
; X86-NOBMI-NEXT: .LBB25_4:		; X86-NOBMI-NEXT: .LBB25_3:
		; X86-NOBMI-NEXT: movl %ebx, %edx
		; X86-NOBMI-NEXT: .LBB25_5:
; X86-NOBMI-NEXT: notl %edx		; X86-NOBMI-NEXT: notl %edx
; X86-NOBMI-NEXT: andl %edi, %edx		; X86-NOBMI-NEXT: andl %edi, %edx
; X86-NOBMI-NEXT: notl %eax		; X86-NOBMI-NEXT: notl %eax
; X86-NOBMI-NEXT: andl %esi, %eax		; X86-NOBMI-NEXT: andl %esi, %eax
; X86-NOBMI-NEXT: popl %esi		; X86-NOBMI-NEXT: popl %esi
; X86-NOBMI-NEXT: popl %edi		; X86-NOBMI-NEXT: popl %edi
		; X86-NOBMI-NEXT: popl %ebx
; X86-NOBMI-NEXT: retl		; X86-NOBMI-NEXT: retl
;		;
; X86-BMI1NOTBM-LABEL: bextr64_b0:		; X86-BMI1NOTBM-LABEL: bextr64_b0:
; X86-BMI1NOTBM: # %bb.0:		; X86-BMI1NOTBM: # %bb.0:
; X86-BMI1NOTBM-NEXT: pushl %ebx		; X86-BMI1NOTBM-NEXT: pushl %ebx
; X86-BMI1NOTBM-NEXT: pushl %edi		; X86-BMI1NOTBM-NEXT: pushl %edi
; X86-BMI1NOTBM-NEXT: pushl %esi		; X86-BMI1NOTBM-NEXT: pushl %esi
; X86-BMI1NOTBM-NEXT: movb {{[0-9]+}}(%esp), %al		; X86-BMI1NOTBM-NEXT: movb {{[0-9]+}}(%esp), %al
; X86-BMI1NOTBM-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-BMI1NOTBM-NEXT: movb {{[0-9]+}}(%esp), %cl
; X86-BMI1NOTBM-NEXT: movl {{[0-9]+}}(%esp), %esi		; X86-BMI1NOTBM-NEXT: movl {{[0-9]+}}(%esp), %esi
; X86-BMI1NOTBM-NEXT: movl {{[0-9]+}}(%esp), %edi		; X86-BMI1NOTBM-NEXT: movl {{[0-9]+}}(%esp), %edi
; X86-BMI1NOTBM-NEXT: movl %edi, %edx		; X86-BMI1NOTBM-NEXT: movl %edi, %edx
; X86-BMI1NOTBM-NEXT: shrl %cl, %edx		; X86-BMI1NOTBM-NEXT: shrl %cl, %edx
; X86-BMI1NOTBM-NEXT: shrdl %cl, %edi, %esi		; X86-BMI1NOTBM-NEXT: shrdl %cl, %edi, %esi
; X86-BMI1NOTBM-NEXT: testb $32, %cl		; X86-BMI1NOTBM-NEXT: testb $32, %cl
; X86-BMI1NOTBM-NEXT: je .LBB25_2		; X86-BMI1NOTBM-NEXT: je .LBB25_2
; X86-BMI1NOTBM-NEXT: # %bb.1:		; X86-BMI1NOTBM-NEXT: # %bb.1:
; X86-BMI1NOTBM-NEXT: movl %edx, %esi		; X86-BMI1NOTBM-NEXT: movl %edx, %esi
; X86-BMI1NOTBM-NEXT: xorl %edx, %edx		; X86-BMI1NOTBM-NEXT: xorl %edx, %edx
; X86-BMI1NOTBM-NEXT: .LBB25_2:		; X86-BMI1NOTBM-NEXT: .LBB25_2:
; X86-BMI1NOTBM-NEXT: movl $-1, %edi		; X86-BMI1NOTBM-NEXT: movl $-1, %edi
; X86-BMI1NOTBM-NEXT: movl $-1, %ebx		; X86-BMI1NOTBM-NEXT: movl $-1, %ebx
; X86-BMI1NOTBM-NEXT: movl %eax, %ecx		; X86-BMI1NOTBM-NEXT: movl %eax, %ecx
; X86-BMI1NOTBM-NEXT: shll %cl, %ebx		; X86-BMI1NOTBM-NEXT: shll %cl, %ebx
; X86-BMI1NOTBM-NEXT: shldl %cl, %edi, %edi
; X86-BMI1NOTBM-NEXT: testb $32, %al		; X86-BMI1NOTBM-NEXT: testb $32, %al
; X86-BMI1NOTBM-NEXT: je .LBB25_4		; X86-BMI1NOTBM-NEXT: je .LBB25_4
; X86-BMI1NOTBM-NEXT: # %bb.3:		; X86-BMI1NOTBM-NEXT: # %bb.3:
; X86-BMI1NOTBM-NEXT: movl %ebx, %edi		; X86-BMI1NOTBM-NEXT: movl %ebx, %edi
; X86-BMI1NOTBM-NEXT: xorl %ebx, %ebx		; X86-BMI1NOTBM-NEXT: xorl %ebx, %ebx
; X86-BMI1NOTBM-NEXT: .LBB25_4:		; X86-BMI1NOTBM-NEXT: .LBB25_4:
; X86-BMI1NOTBM-NEXT: andnl %edx, %edi, %edx		; X86-BMI1NOTBM-NEXT: andnl %edx, %edi, %edx
; X86-BMI1NOTBM-NEXT: andnl %esi, %ebx, %eax		; X86-BMI1NOTBM-NEXT: andnl %esi, %ebx, %eax
; X86-BMI1NOTBM-NEXT: popl %esi		; X86-BMI1NOTBM-NEXT: popl %esi
; X86-BMI1NOTBM-NEXT: popl %edi		; X86-BMI1NOTBM-NEXT: popl %edi
; X86-BMI1NOTBM-NEXT: popl %ebx		; X86-BMI1NOTBM-NEXT: popl %ebx
; X86-BMI1NOTBM-NEXT: retl		; X86-BMI1NOTBM-NEXT: retl
;		;
; X86-BMI1BMI2-LABEL: bextr64_b0:		; X86-BMI1BMI2-LABEL: bextr64_b0:
; X86-BMI1BMI2: # %bb.0:		; X86-BMI1BMI2: # %bb.0:
; X86-BMI1BMI2-NEXT: pushl %ebx		; X86-BMI1BMI2-NEXT: pushl %ebx
; X86-BMI1BMI2-NEXT: pushl %edi
; X86-BMI1BMI2-NEXT: pushl %esi		; X86-BMI1BMI2-NEXT: pushl %esi
; X86-BMI1BMI2-NEXT: movb {{[0-9]+}}(%esp), %al		; X86-BMI1BMI2-NEXT: movb {{[0-9]+}}(%esp), %bl
; X86-BMI1BMI2-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-BMI1BMI2-NEXT: movb {{[0-9]+}}(%esp), %cl
; X86-BMI1BMI2-NEXT: movl {{[0-9]+}}(%esp), %esi		; X86-BMI1BMI2-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-BMI1BMI2-NEXT: movl {{[0-9]+}}(%esp), %edx		; X86-BMI1BMI2-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-BMI1BMI2-NEXT: shrdl %cl, %edx, %esi		; X86-BMI1BMI2-NEXT: shrdl %cl, %edx, %eax
; X86-BMI1BMI2-NEXT: shrxl %ecx, %edx, %edx		; X86-BMI1BMI2-NEXT: shrxl %ecx, %edx, %edx
; X86-BMI1BMI2-NEXT: testb $32, %cl		; X86-BMI1BMI2-NEXT: testb $32, %cl
; X86-BMI1BMI2-NEXT: je .LBB25_2		; X86-BMI1BMI2-NEXT: je .LBB25_2
; X86-BMI1BMI2-NEXT: # %bb.1:		; X86-BMI1BMI2-NEXT: # %bb.1:
; X86-BMI1BMI2-NEXT: movl %edx, %esi		; X86-BMI1BMI2-NEXT: movl %edx, %eax
; X86-BMI1BMI2-NEXT: xorl %edx, %edx		; X86-BMI1BMI2-NEXT: xorl %edx, %edx
; X86-BMI1BMI2-NEXT: .LBB25_2:		; X86-BMI1BMI2-NEXT: .LBB25_2:
; X86-BMI1BMI2-NEXT: movl $-1, %edi		; X86-BMI1BMI2-NEXT: movl $-1, %esi
; X86-BMI1BMI2-NEXT: shlxl %eax, %edi, %ebx		; X86-BMI1BMI2-NEXT: shlxl %ebx, %esi, %ecx
; X86-BMI1BMI2-NEXT: movl %eax, %ecx		; X86-BMI1BMI2-NEXT: testb $32, %bl
; X86-BMI1BMI2-NEXT: shldl %cl, %edi, %edi
; X86-BMI1BMI2-NEXT: testb $32, %al
; X86-BMI1BMI2-NEXT: je .LBB25_4		; X86-BMI1BMI2-NEXT: je .LBB25_4
; X86-BMI1BMI2-NEXT: # %bb.3:		; X86-BMI1BMI2-NEXT: # %bb.3:
; X86-BMI1BMI2-NEXT: movl %ebx, %edi		; X86-BMI1BMI2-NEXT: movl %ecx, %esi
; X86-BMI1BMI2-NEXT: xorl %ebx, %ebx		; X86-BMI1BMI2-NEXT: xorl %ecx, %ecx
; X86-BMI1BMI2-NEXT: .LBB25_4:		; X86-BMI1BMI2-NEXT: .LBB25_4:
; X86-BMI1BMI2-NEXT: andnl %edx, %edi, %edx		; X86-BMI1BMI2-NEXT: andnl %edx, %esi, %edx
; X86-BMI1BMI2-NEXT: andnl %esi, %ebx, %eax		; X86-BMI1BMI2-NEXT: andnl %eax, %ecx, %eax
; X86-BMI1BMI2-NEXT: popl %esi		; X86-BMI1BMI2-NEXT: popl %esi
; X86-BMI1BMI2-NEXT: popl %edi
; X86-BMI1BMI2-NEXT: popl %ebx		; X86-BMI1BMI2-NEXT: popl %ebx
; X86-BMI1BMI2-NEXT: retl		; X86-BMI1BMI2-NEXT: retl
;		;
; X64-NOBMI-LABEL: bextr64_b0:		; X64-NOBMI-LABEL: bextr64_b0:
; X64-NOBMI: # %bb.0:		; X64-NOBMI: # %bb.0:
; X64-NOBMI-NEXT: movq %rsi, %rcx		; X64-NOBMI-NEXT: movq %rsi, %rcx
; X64-NOBMI-NEXT: # kill: def $cl killed $cl killed $rcx		; X64-NOBMI-NEXT: # kill: def $cl killed $cl killed $rcx
; X64-NOBMI-NEXT: shrq %cl, %rdi		; X64-NOBMI-NEXT: shrq %cl, %rdi
[22 unchanged lines not shown]	; X64-BMI1BMI2-NEXT: retq
%mask = xor i64 %notmask, -1		%mask = xor i64 %notmask, -1
%masked = and i64 %mask, %shifted		%masked = and i64 %mask, %shifted
ret i64 %masked		ret i64 %masked
}		}

define i64 @bextr64_b1_indexzext(i64 %val, i8 zeroext %numskipbits, i8 zeroext %numlowbits) nounwind {		define i64 @bextr64_b1_indexzext(i64 %val, i8 zeroext %numskipbits, i8 zeroext %numlowbits) nounwind {
; X86-NOBMI-LABEL: bextr64_b1_indexzext:		; X86-NOBMI-LABEL: bextr64_b1_indexzext:
; X86-NOBMI: # %bb.0:		; X86-NOBMI: # %bb.0:
		; X86-NOBMI-NEXT: pushl %ebx
; X86-NOBMI-NEXT: pushl %edi		; X86-NOBMI-NEXT: pushl %edi
; X86-NOBMI-NEXT: pushl %esi		; X86-NOBMI-NEXT: pushl %esi
; X86-NOBMI-NEXT: movb {{[0-9]+}}(%esp), %ch		; X86-NOBMI-NEXT: movb {{[0-9]+}}(%esp), %ch
; X86-NOBMI-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-NOBMI-NEXT: movb {{[0-9]+}}(%esp), %cl
; X86-NOBMI-NEXT: movl {{[0-9]+}}(%esp), %esi		; X86-NOBMI-NEXT: movl {{[0-9]+}}(%esp), %esi
; X86-NOBMI-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-NOBMI-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NOBMI-NEXT: movl %eax, %edi		; X86-NOBMI-NEXT: movl %eax, %edi
; X86-NOBMI-NEXT: shrl %cl, %edi		; X86-NOBMI-NEXT: shrl %cl, %edi
; X86-NOBMI-NEXT: shrdl %cl, %eax, %esi		; X86-NOBMI-NEXT: shrdl %cl, %eax, %esi
		; X86-NOBMI-NEXT: xorl %eax, %eax
; X86-NOBMI-NEXT: testb $32, %cl		; X86-NOBMI-NEXT: testb $32, %cl
; X86-NOBMI-NEXT: je .LBB26_2		; X86-NOBMI-NEXT: je .LBB26_2
; X86-NOBMI-NEXT: # %bb.1:		; X86-NOBMI-NEXT: # %bb.1:
; X86-NOBMI-NEXT: movl %edi, %esi		; X86-NOBMI-NEXT: movl %edi, %esi
; X86-NOBMI-NEXT: xorl %edi, %edi		; X86-NOBMI-NEXT: xorl %edi, %edi
; X86-NOBMI-NEXT: .LBB26_2:		; X86-NOBMI-NEXT: .LBB26_2:
; X86-NOBMI-NEXT: movl $-1, %edx		; X86-NOBMI-NEXT: movl $-1, %edx
; X86-NOBMI-NEXT: movl $-1, %eax		; X86-NOBMI-NEXT: movl $-1, %ebx
; X86-NOBMI-NEXT: movb %ch, %cl		; X86-NOBMI-NEXT: movb %ch, %cl
; X86-NOBMI-NEXT: shll %cl, %eax		; X86-NOBMI-NEXT: shll %cl, %ebx
; X86-NOBMI-NEXT: shldl %cl, %edx, %edx
; X86-NOBMI-NEXT: testb $32, %ch		; X86-NOBMI-NEXT: testb $32, %ch
; X86-NOBMI-NEXT: je .LBB26_4		; X86-NOBMI-NEXT: jne .LBB26_3
; X86-NOBMI-NEXT: # %bb.3:		; X86-NOBMI-NEXT: # %bb.4:
; X86-NOBMI-NEXT: movl %eax, %edx		; X86-NOBMI-NEXT: movl %ebx, %eax
; X86-NOBMI-NEXT: xorl %eax, %eax		; X86-NOBMI-NEXT: jmp .LBB26_5
; X86-NOBMI-NEXT: .LBB26_4:		; X86-NOBMI-NEXT: .LBB26_3:
		; X86-NOBMI-NEXT: movl %ebx, %edx
		; X86-NOBMI-NEXT: .LBB26_5:
; X86-NOBMI-NEXT: notl %edx		; X86-NOBMI-NEXT: notl %edx
; X86-NOBMI-NEXT: andl %edi, %edx		; X86-NOBMI-NEXT: andl %edi, %edx
; X86-NOBMI-NEXT: notl %eax		; X86-NOBMI-NEXT: notl %eax
; X86-NOBMI-NEXT: andl %esi, %eax		; X86-NOBMI-NEXT: andl %esi, %eax
; X86-NOBMI-NEXT: popl %esi		; X86-NOBMI-NEXT: popl %esi
; X86-NOBMI-NEXT: popl %edi		; X86-NOBMI-NEXT: popl %edi
		; X86-NOBMI-NEXT: popl %ebx
; X86-NOBMI-NEXT: retl		; X86-NOBMI-NEXT: retl
;		;
; X86-BMI1NOTBM-LABEL: bextr64_b1_indexzext:		; X86-BMI1NOTBM-LABEL: bextr64_b1_indexzext:
; X86-BMI1NOTBM: # %bb.0:		; X86-BMI1NOTBM: # %bb.0:
; X86-BMI1NOTBM-NEXT: pushl %ebx		; X86-BMI1NOTBM-NEXT: pushl %ebx
; X86-BMI1NOTBM-NEXT: pushl %edi		; X86-BMI1NOTBM-NEXT: pushl %edi
; X86-BMI1NOTBM-NEXT: pushl %esi		; X86-BMI1NOTBM-NEXT: pushl %esi
; X86-BMI1NOTBM-NEXT: movb {{[0-9]+}}(%esp), %al		; X86-BMI1NOTBM-NEXT: movb {{[0-9]+}}(%esp), %al
; X86-BMI1NOTBM-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-BMI1NOTBM-NEXT: movb {{[0-9]+}}(%esp), %cl
; X86-BMI1NOTBM-NEXT: movl {{[0-9]+}}(%esp), %esi		; X86-BMI1NOTBM-NEXT: movl {{[0-9]+}}(%esp), %esi
; X86-BMI1NOTBM-NEXT: movl {{[0-9]+}}(%esp), %edi		; X86-BMI1NOTBM-NEXT: movl {{[0-9]+}}(%esp), %edi
; X86-BMI1NOTBM-NEXT: movl %edi, %edx		; X86-BMI1NOTBM-NEXT: movl %edi, %edx
; X86-BMI1NOTBM-NEXT: shrl %cl, %edx		; X86-BMI1NOTBM-NEXT: shrl %cl, %edx
; X86-BMI1NOTBM-NEXT: shrdl %cl, %edi, %esi		; X86-BMI1NOTBM-NEXT: shrdl %cl, %edi, %esi
; X86-BMI1NOTBM-NEXT: testb $32, %cl		; X86-BMI1NOTBM-NEXT: testb $32, %cl
; X86-BMI1NOTBM-NEXT: je .LBB26_2		; X86-BMI1NOTBM-NEXT: je .LBB26_2
; X86-BMI1NOTBM-NEXT: # %bb.1:		; X86-BMI1NOTBM-NEXT: # %bb.1:
; X86-BMI1NOTBM-NEXT: movl %edx, %esi		; X86-BMI1NOTBM-NEXT: movl %edx, %esi
; X86-BMI1NOTBM-NEXT: xorl %edx, %edx		; X86-BMI1NOTBM-NEXT: xorl %edx, %edx
; X86-BMI1NOTBM-NEXT: .LBB26_2:		; X86-BMI1NOTBM-NEXT: .LBB26_2:
; X86-BMI1NOTBM-NEXT: movl $-1, %edi		; X86-BMI1NOTBM-NEXT: movl $-1, %edi
; X86-BMI1NOTBM-NEXT: movl $-1, %ebx		; X86-BMI1NOTBM-NEXT: movl $-1, %ebx
; X86-BMI1NOTBM-NEXT: movl %eax, %ecx		; X86-BMI1NOTBM-NEXT: movl %eax, %ecx
; X86-BMI1NOTBM-NEXT: shll %cl, %ebx		; X86-BMI1NOTBM-NEXT: shll %cl, %ebx
; X86-BMI1NOTBM-NEXT: shldl %cl, %edi, %edi
; X86-BMI1NOTBM-NEXT: testb $32, %al		; X86-BMI1NOTBM-NEXT: testb $32, %al
; X86-BMI1NOTBM-NEXT: je .LBB26_4		; X86-BMI1NOTBM-NEXT: je .LBB26_4
; X86-BMI1NOTBM-NEXT: # %bb.3:		; X86-BMI1NOTBM-NEXT: # %bb.3:
; X86-BMI1NOTBM-NEXT: movl %ebx, %edi		; X86-BMI1NOTBM-NEXT: movl %ebx, %edi
; X86-BMI1NOTBM-NEXT: xorl %ebx, %ebx		; X86-BMI1NOTBM-NEXT: xorl %ebx, %ebx
; X86-BMI1NOTBM-NEXT: .LBB26_4:		; X86-BMI1NOTBM-NEXT: .LBB26_4:
; X86-BMI1NOTBM-NEXT: andnl %edx, %edi, %edx		; X86-BMI1NOTBM-NEXT: andnl %edx, %edi, %edx
; X86-BMI1NOTBM-NEXT: andnl %esi, %ebx, %eax		; X86-BMI1NOTBM-NEXT: andnl %esi, %ebx, %eax
; X86-BMI1NOTBM-NEXT: popl %esi		; X86-BMI1NOTBM-NEXT: popl %esi
; X86-BMI1NOTBM-NEXT: popl %edi		; X86-BMI1NOTBM-NEXT: popl %edi
; X86-BMI1NOTBM-NEXT: popl %ebx		; X86-BMI1NOTBM-NEXT: popl %ebx
; X86-BMI1NOTBM-NEXT: retl		; X86-BMI1NOTBM-NEXT: retl
;		;
; X86-BMI1BMI2-LABEL: bextr64_b1_indexzext:		; X86-BMI1BMI2-LABEL: bextr64_b1_indexzext:
; X86-BMI1BMI2: # %bb.0:		; X86-BMI1BMI2: # %bb.0:
; X86-BMI1BMI2-NEXT: pushl %ebx		; X86-BMI1BMI2-NEXT: pushl %ebx
; X86-BMI1BMI2-NEXT: pushl %edi
; X86-BMI1BMI2-NEXT: pushl %esi		; X86-BMI1BMI2-NEXT: pushl %esi
; X86-BMI1BMI2-NEXT: movb {{[0-9]+}}(%esp), %al		; X86-BMI1BMI2-NEXT: movb {{[0-9]+}}(%esp), %bl
; X86-BMI1BMI2-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-BMI1BMI2-NEXT: movb {{[0-9]+}}(%esp), %cl
; X86-BMI1BMI2-NEXT: movl {{[0-9]+}}(%esp), %esi		; X86-BMI1BMI2-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-BMI1BMI2-NEXT: movl {{[0-9]+}}(%esp), %edx		; X86-BMI1BMI2-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-BMI1BMI2-NEXT: shrdl %cl, %edx, %esi		; X86-BMI1BMI2-NEXT: shrdl %cl, %edx, %eax
; X86-BMI1BMI2-NEXT: shrxl %ecx, %edx, %edx		; X86-BMI1BMI2-NEXT: shrxl %ecx, %edx, %edx
; X86-BMI1BMI2-NEXT: testb $32, %cl		; X86-BMI1BMI2-NEXT: testb $32, %cl
; X86-BMI1BMI2-NEXT: je .LBB26_2		; X86-BMI1BMI2-NEXT: je .LBB26_2
; X86-BMI1BMI2-NEXT: # %bb.1:		; X86-BMI1BMI2-NEXT: # %bb.1:
; X86-BMI1BMI2-NEXT: movl %edx, %esi		; X86-BMI1BMI2-NEXT: movl %edx, %eax
; X86-BMI1BMI2-NEXT: xorl %edx, %edx		; X86-BMI1BMI2-NEXT: xorl %edx, %edx
; X86-BMI1BMI2-NEXT: .LBB26_2:		; X86-BMI1BMI2-NEXT: .LBB26_2:
; X86-BMI1BMI2-NEXT: movl $-1, %edi		; X86-BMI1BMI2-NEXT: movl $-1, %esi
; X86-BMI1BMI2-NEXT: shlxl %eax, %edi, %ebx		; X86-BMI1BMI2-NEXT: shlxl %ebx, %esi, %ecx
; X86-BMI1BMI2-NEXT: movl %eax, %ecx		; X86-BMI1BMI2-NEXT: testb $32, %bl
; X86-BMI1BMI2-NEXT: shldl %cl, %edi, %edi
; X86-BMI1BMI2-NEXT: testb $32, %al
; X86-BMI1BMI2-NEXT: je .LBB26_4		; X86-BMI1BMI2-NEXT: je .LBB26_4
; X86-BMI1BMI2-NEXT: # %bb.3:		; X86-BMI1BMI2-NEXT: # %bb.3:
; X86-BMI1BMI2-NEXT: movl %ebx, %edi		; X86-BMI1BMI2-NEXT: movl %ecx, %esi
; X86-BMI1BMI2-NEXT: xorl %ebx, %ebx		; X86-BMI1BMI2-NEXT: xorl %ecx, %ecx
; X86-BMI1BMI2-NEXT: .LBB26_4:		; X86-BMI1BMI2-NEXT: .LBB26_4:
; X86-BMI1BMI2-NEXT: andnl %edx, %edi, %edx		; X86-BMI1BMI2-NEXT: andnl %edx, %esi, %edx
; X86-BMI1BMI2-NEXT: andnl %esi, %ebx, %eax		; X86-BMI1BMI2-NEXT: andnl %eax, %ecx, %eax
; X86-BMI1BMI2-NEXT: popl %esi		; X86-BMI1BMI2-NEXT: popl %esi
; X86-BMI1BMI2-NEXT: popl %edi
; X86-BMI1BMI2-NEXT: popl %ebx		; X86-BMI1BMI2-NEXT: popl %ebx
; X86-BMI1BMI2-NEXT: retl		; X86-BMI1BMI2-NEXT: retl
;		;
; X64-NOBMI-LABEL: bextr64_b1_indexzext:		; X64-NOBMI-LABEL: bextr64_b1_indexzext:
; X64-NOBMI: # %bb.0:		; X64-NOBMI: # %bb.0:
; X64-NOBMI-NEXT: movl %esi, %ecx		; X64-NOBMI-NEXT: movl %esi, %ecx
; X64-NOBMI-NEXT: # kill: def $cl killed $cl killed $ecx		; X64-NOBMI-NEXT: # kill: def $cl killed $cl killed $ecx
; X64-NOBMI-NEXT: shrq %cl, %rdi		; X64-NOBMI-NEXT: shrq %cl, %rdi
[26 unchanged lines not shown]	; X64-BMI1BMI2-NEXT: retq
%mask = xor i64 %notmask, -1		%mask = xor i64 %notmask, -1
%masked = and i64 %mask, %shifted		%masked = and i64 %mask, %shifted
ret i64 %masked		ret i64 %masked
}		}

define i64 @bextr64_b2_load(i64* %w, i64 %numskipbits, i64 %numlowbits) nounwind {		define i64 @bextr64_b2_load(i64* %w, i64 %numskipbits, i64 %numlowbits) nounwind {
; X86-NOBMI-LABEL: bextr64_b2_load:		; X86-NOBMI-LABEL: bextr64_b2_load:
; X86-NOBMI: # %bb.0:		; X86-NOBMI: # %bb.0:
		; X86-NOBMI-NEXT: pushl %ebx
; X86-NOBMI-NEXT: pushl %edi		; X86-NOBMI-NEXT: pushl %edi
; X86-NOBMI-NEXT: pushl %esi		; X86-NOBMI-NEXT: pushl %esi
; X86-NOBMI-NEXT: movb {{[0-9]+}}(%esp), %ch		; X86-NOBMI-NEXT: movb {{[0-9]+}}(%esp), %ch
; X86-NOBMI-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-NOBMI-NEXT: movb {{[0-9]+}}(%esp), %cl
; X86-NOBMI-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-NOBMI-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NOBMI-NEXT: movl (%eax), %esi		; X86-NOBMI-NEXT: movl (%eax), %esi
; X86-NOBMI-NEXT: movl 4(%eax), %eax		; X86-NOBMI-NEXT: movl 4(%eax), %eax
; X86-NOBMI-NEXT: movl %eax, %edi		; X86-NOBMI-NEXT: movl %eax, %edi
; X86-NOBMI-NEXT: shrl %cl, %edi		; X86-NOBMI-NEXT: shrl %cl, %edi
; X86-NOBMI-NEXT: shrdl %cl, %eax, %esi		; X86-NOBMI-NEXT: shrdl %cl, %eax, %esi
		; X86-NOBMI-NEXT: xorl %eax, %eax
; X86-NOBMI-NEXT: testb $32, %cl		; X86-NOBMI-NEXT: testb $32, %cl
; X86-NOBMI-NEXT: je .LBB27_2		; X86-NOBMI-NEXT: je .LBB27_2
; X86-NOBMI-NEXT: # %bb.1:		; X86-NOBMI-NEXT: # %bb.1:
; X86-NOBMI-NEXT: movl %edi, %esi		; X86-NOBMI-NEXT: movl %edi, %esi
; X86-NOBMI-NEXT: xorl %edi, %edi		; X86-NOBMI-NEXT: xorl %edi, %edi
; X86-NOBMI-NEXT: .LBB27_2:		; X86-NOBMI-NEXT: .LBB27_2:
; X86-NOBMI-NEXT: movl $-1, %edx		; X86-NOBMI-NEXT: movl $-1, %edx
; X86-NOBMI-NEXT: movl $-1, %eax		; X86-NOBMI-NEXT: movl $-1, %ebx
; X86-NOBMI-NEXT: movb %ch, %cl		; X86-NOBMI-NEXT: movb %ch, %cl
; X86-NOBMI-NEXT: shll %cl, %eax		; X86-NOBMI-NEXT: shll %cl, %ebx
; X86-NOBMI-NEXT: shldl %cl, %edx, %edx
; X86-NOBMI-NEXT: testb $32, %ch		; X86-NOBMI-NEXT: testb $32, %ch
; X86-NOBMI-NEXT: je .LBB27_4		; X86-NOBMI-NEXT: jne .LBB27_3
; X86-NOBMI-NEXT: # %bb.3:		; X86-NOBMI-NEXT: # %bb.4:
; X86-NOBMI-NEXT: movl %eax, %edx		; X86-NOBMI-NEXT: movl %ebx, %eax
; X86-NOBMI-NEXT: xorl %eax, %eax		; X86-NOBMI-NEXT: jmp .LBB27_5
; X86-NOBMI-NEXT: .LBB27_4:		; X86-NOBMI-NEXT: .LBB27_3:
		; X86-NOBMI-NEXT: movl %ebx, %edx
		; X86-NOBMI-NEXT: .LBB27_5:
; X86-NOBMI-NEXT: notl %edx		; X86-NOBMI-NEXT: notl %edx
; X86-NOBMI-NEXT: andl %edi, %edx		; X86-NOBMI-NEXT: andl %edi, %edx
; X86-NOBMI-NEXT: notl %eax		; X86-NOBMI-NEXT: notl %eax
; X86-NOBMI-NEXT: andl %esi, %eax		; X86-NOBMI-NEXT: andl %esi, %eax
; X86-NOBMI-NEXT: popl %esi		; X86-NOBMI-NEXT: popl %esi
; X86-NOBMI-NEXT: popl %edi		; X86-NOBMI-NEXT: popl %edi
		; X86-NOBMI-NEXT: popl %ebx
; X86-NOBMI-NEXT: retl		; X86-NOBMI-NEXT: retl
;		;
; X86-BMI1NOTBM-LABEL: bextr64_b2_load:		; X86-BMI1NOTBM-LABEL: bextr64_b2_load:
; X86-BMI1NOTBM: # %bb.0:		; X86-BMI1NOTBM: # %bb.0:
; X86-BMI1NOTBM-NEXT: pushl %ebx		; X86-BMI1NOTBM-NEXT: pushl %ebx
; X86-BMI1NOTBM-NEXT: pushl %edi		; X86-BMI1NOTBM-NEXT: pushl %edi
; X86-BMI1NOTBM-NEXT: pushl %esi		; X86-BMI1NOTBM-NEXT: pushl %esi
; X86-BMI1NOTBM-NEXT: movb {{[0-9]+}}(%esp), %al		; X86-BMI1NOTBM-NEXT: movb {{[0-9]+}}(%esp), %al
[9 unchanged lines not shown]
; X86-BMI1NOTBM-NEXT: # %bb.1:		; X86-BMI1NOTBM-NEXT: # %bb.1:
; X86-BMI1NOTBM-NEXT: movl %edx, %esi		; X86-BMI1NOTBM-NEXT: movl %edx, %esi
; X86-BMI1NOTBM-NEXT: xorl %edx, %edx		; X86-BMI1NOTBM-NEXT: xorl %edx, %edx
; X86-BMI1NOTBM-NEXT: .LBB27_2:		; X86-BMI1NOTBM-NEXT: .LBB27_2:
; X86-BMI1NOTBM-NEXT: movl $-1, %edi		; X86-BMI1NOTBM-NEXT: movl $-1, %edi
; X86-BMI1NOTBM-NEXT: movl $-1, %ebx		; X86-BMI1NOTBM-NEXT: movl $-1, %ebx
; X86-BMI1NOTBM-NEXT: movl %eax, %ecx		; X86-BMI1NOTBM-NEXT: movl %eax, %ecx
; X86-BMI1NOTBM-NEXT: shll %cl, %ebx		; X86-BMI1NOTBM-NEXT: shll %cl, %ebx
; X86-BMI1NOTBM-NEXT: shldl %cl, %edi, %edi
; X86-BMI1NOTBM-NEXT: testb $32, %al		; X86-BMI1NOTBM-NEXT: testb $32, %al
; X86-BMI1NOTBM-NEXT: je .LBB27_4		; X86-BMI1NOTBM-NEXT: je .LBB27_4
; X86-BMI1NOTBM-NEXT: # %bb.3:		; X86-BMI1NOTBM-NEXT: # %bb.3:
; X86-BMI1NOTBM-NEXT: movl %ebx, %edi		; X86-BMI1NOTBM-NEXT: movl %ebx, %edi
; X86-BMI1NOTBM-NEXT: xorl %ebx, %ebx		; X86-BMI1NOTBM-NEXT: xorl %ebx, %ebx
; X86-BMI1NOTBM-NEXT: .LBB27_4:		; X86-BMI1NOTBM-NEXT: .LBB27_4:
; X86-BMI1NOTBM-NEXT: andnl %edx, %edi, %edx		; X86-BMI1NOTBM-NEXT: andnl %edx, %edi, %edx
; X86-BMI1NOTBM-NEXT: andnl %esi, %ebx, %eax		; X86-BMI1NOTBM-NEXT: andnl %esi, %ebx, %eax
; X86-BMI1NOTBM-NEXT: popl %esi		; X86-BMI1NOTBM-NEXT: popl %esi
; X86-BMI1NOTBM-NEXT: popl %edi		; X86-BMI1NOTBM-NEXT: popl %edi
; X86-BMI1NOTBM-NEXT: popl %ebx		; X86-BMI1NOTBM-NEXT: popl %ebx
; X86-BMI1NOTBM-NEXT: retl		; X86-BMI1NOTBM-NEXT: retl
;		;
; X86-BMI1BMI2-LABEL: bextr64_b2_load:		; X86-BMI1BMI2-LABEL: bextr64_b2_load:
; X86-BMI1BMI2: # %bb.0:		; X86-BMI1BMI2: # %bb.0:
; X86-BMI1BMI2-NEXT: pushl %ebx		; X86-BMI1BMI2-NEXT: pushl %ebx
; X86-BMI1BMI2-NEXT: pushl %edi
; X86-BMI1BMI2-NEXT: pushl %esi		; X86-BMI1BMI2-NEXT: pushl %esi
; X86-BMI1BMI2-NEXT: movb {{[0-9]+}}(%esp), %al		; X86-BMI1BMI2-NEXT: movb {{[0-9]+}}(%esp), %bl
; X86-BMI1BMI2-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-BMI1BMI2-NEXT: movb {{[0-9]+}}(%esp), %cl
; X86-BMI1BMI2-NEXT: movl {{[0-9]+}}(%esp), %edx		; X86-BMI1BMI2-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-BMI1BMI2-NEXT: movl (%edx), %esi		; X86-BMI1BMI2-NEXT: movl (%edx), %eax
; X86-BMI1BMI2-NEXT: movl 4(%edx), %edi		; X86-BMI1BMI2-NEXT: movl 4(%edx), %esi
; X86-BMI1BMI2-NEXT: shrxl %ecx, %edi, %edx		; X86-BMI1BMI2-NEXT: shrxl %ecx, %esi, %edx
; X86-BMI1BMI2-NEXT: shrdl %cl, %edi, %esi		; X86-BMI1BMI2-NEXT: shrdl %cl, %esi, %eax
; X86-BMI1BMI2-NEXT: testb $32, %cl		; X86-BMI1BMI2-NEXT: testb $32, %cl
; X86-BMI1BMI2-NEXT: je .LBB27_2		; X86-BMI1BMI2-NEXT: je .LBB27_2
; X86-BMI1BMI2-NEXT: # %bb.1:		; X86-BMI1BMI2-NEXT: # %bb.1:
; X86-BMI1BMI2-NEXT: movl %edx, %esi		; X86-BMI1BMI2-NEXT: movl %edx, %eax
; X86-BMI1BMI2-NEXT: xorl %edx, %edx		; X86-BMI1BMI2-NEXT: xorl %edx, %edx
; X86-BMI1BMI2-NEXT: .LBB27_2:		; X86-BMI1BMI2-NEXT: .LBB27_2:
; X86-BMI1BMI2-NEXT: movl $-1, %edi		; X86-BMI1BMI2-NEXT: movl $-1, %esi
; X86-BMI1BMI2-NEXT: shlxl %eax, %edi, %ebx		; X86-BMI1BMI2-NEXT: shlxl %ebx, %esi, %ecx
; X86-BMI1BMI2-NEXT: movl %eax, %ecx		; X86-BMI1BMI2-NEXT: testb $32, %bl
; X86-BMI1BMI2-NEXT: shldl %cl, %edi, %edi
; X86-BMI1BMI2-NEXT: testb $32, %al
; X86-BMI1BMI2-NEXT: je .LBB27_4		; X86-BMI1BMI2-NEXT: je .LBB27_4
; X86-BMI1BMI2-NEXT: # %bb.3:		; X86-BMI1BMI2-NEXT: # %bb.3:
; X86-BMI1BMI2-NEXT: movl %ebx, %edi		; X86-BMI1BMI2-NEXT: movl %ecx, %esi
; X86-BMI1BMI2-NEXT: xorl %ebx, %ebx		; X86-BMI1BMI2-NEXT: xorl %ecx, %ecx
; X86-BMI1BMI2-NEXT: .LBB27_4:		; X86-BMI1BMI2-NEXT: .LBB27_4:
; X86-BMI1BMI2-NEXT: andnl %edx, %edi, %edx		; X86-BMI1BMI2-NEXT: andnl %edx, %esi, %edx
; X86-BMI1BMI2-NEXT: andnl %esi, %ebx, %eax		; X86-BMI1BMI2-NEXT: andnl %eax, %ecx, %eax
; X86-BMI1BMI2-NEXT: popl %esi		; X86-BMI1BMI2-NEXT: popl %esi
; X86-BMI1BMI2-NEXT: popl %edi
; X86-BMI1BMI2-NEXT: popl %ebx		; X86-BMI1BMI2-NEXT: popl %ebx
; X86-BMI1BMI2-NEXT: retl		; X86-BMI1BMI2-NEXT: retl
;		;
; X64-NOBMI-LABEL: bextr64_b2_load:		; X64-NOBMI-LABEL: bextr64_b2_load:
; X64-NOBMI: # %bb.0:		; X64-NOBMI: # %bb.0:
; X64-NOBMI-NEXT: movq %rsi, %rcx		; X64-NOBMI-NEXT: movq %rsi, %rcx
; X64-NOBMI-NEXT: movq (%rdi), %rsi		; X64-NOBMI-NEXT: movq (%rdi), %rsi
; X64-NOBMI-NEXT: # kill: def $cl killed $cl killed $rcx		; X64-NOBMI-NEXT: # kill: def $cl killed $cl killed $rcx
[24 unchanged lines not shown]	; X64-BMI1BMI2-NEXT: retq
%mask = xor i64 %notmask, -1		%mask = xor i64 %notmask, -1
%masked = and i64 %mask, %shifted		%masked = and i64 %mask, %shifted
ret i64 %masked		ret i64 %masked
}		}

define i64 @bextr64_b3_load_indexzext(i64* %w, i8 zeroext %numskipbits, i8 zeroext %numlowbits) nounwind {		define i64 @bextr64_b3_load_indexzext(i64* %w, i8 zeroext %numskipbits, i8 zeroext %numlowbits) nounwind {
; X86-NOBMI-LABEL: bextr64_b3_load_indexzext:		; X86-NOBMI-LABEL: bextr64_b3_load_indexzext:
; X86-NOBMI: # %bb.0:		; X86-NOBMI: # %bb.0:
		; X86-NOBMI-NEXT: pushl %ebx
; X86-NOBMI-NEXT: pushl %edi		; X86-NOBMI-NEXT: pushl %edi
; X86-NOBMI-NEXT: pushl %esi		; X86-NOBMI-NEXT: pushl %esi
; X86-NOBMI-NEXT: movb {{[0-9]+}}(%esp), %ch		; X86-NOBMI-NEXT: movb {{[0-9]+}}(%esp), %ch
; X86-NOBMI-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-NOBMI-NEXT: movb {{[0-9]+}}(%esp), %cl
; X86-NOBMI-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-NOBMI-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NOBMI-NEXT: movl (%eax), %esi		; X86-NOBMI-NEXT: movl (%eax), %esi
; X86-NOBMI-NEXT: movl 4(%eax), %eax		; X86-NOBMI-NEXT: movl 4(%eax), %eax
; X86-NOBMI-NEXT: movl %eax, %edi		; X86-NOBMI-NEXT: movl %eax, %edi
; X86-NOBMI-NEXT: shrl %cl, %edi		; X86-NOBMI-NEXT: shrl %cl, %edi
; X86-NOBMI-NEXT: shrdl %cl, %eax, %esi		; X86-NOBMI-NEXT: shrdl %cl, %eax, %esi
		; X86-NOBMI-NEXT: xorl %eax, %eax
; X86-NOBMI-NEXT: testb $32, %cl		; X86-NOBMI-NEXT: testb $32, %cl
; X86-NOBMI-NEXT: je .LBB28_2		; X86-NOBMI-NEXT: je .LBB28_2
; X86-NOBMI-NEXT: # %bb.1:		; X86-NOBMI-NEXT: # %bb.1:
; X86-NOBMI-NEXT: movl %edi, %esi		; X86-NOBMI-NEXT: movl %edi, %esi
; X86-NOBMI-NEXT: xorl %edi, %edi		; X86-NOBMI-NEXT: xorl %edi, %edi
; X86-NOBMI-NEXT: .LBB28_2:		; X86-NOBMI-NEXT: .LBB28_2:
; X86-NOBMI-NEXT: movl $-1, %edx		; X86-NOBMI-NEXT: movl $-1, %edx
; X86-NOBMI-NEXT: movl $-1, %eax		; X86-NOBMI-NEXT: movl $-1, %ebx
; X86-NOBMI-NEXT: movb %ch, %cl		; X86-NOBMI-NEXT: movb %ch, %cl
; X86-NOBMI-NEXT: shll %cl, %eax		; X86-NOBMI-NEXT: shll %cl, %ebx
; X86-NOBMI-NEXT: shldl %cl, %edx, %edx
; X86-NOBMI-NEXT: testb $32, %ch		; X86-NOBMI-NEXT: testb $32, %ch
; X86-NOBMI-NEXT: je .LBB28_4		; X86-NOBMI-NEXT: jne .LBB28_3
; X86-NOBMI-NEXT: # %bb.3:		; X86-NOBMI-NEXT: # %bb.4:
; X86-NOBMI-NEXT: movl %eax, %edx		; X86-NOBMI-NEXT: movl %ebx, %eax
; X86-NOBMI-NEXT: xorl %eax, %eax		; X86-NOBMI-NEXT: jmp .LBB28_5
; X86-NOBMI-NEXT: .LBB28_4:		; X86-NOBMI-NEXT: .LBB28_3:
		; X86-NOBMI-NEXT: movl %ebx, %edx
		; X86-NOBMI-NEXT: .LBB28_5:
; X86-NOBMI-NEXT: notl %edx		; X86-NOBMI-NEXT: notl %edx
; X86-NOBMI-NEXT: andl %edi, %edx		; X86-NOBMI-NEXT: andl %edi, %edx
; X86-NOBMI-NEXT: notl %eax		; X86-NOBMI-NEXT: notl %eax
; X86-NOBMI-NEXT: andl %esi, %eax		; X86-NOBMI-NEXT: andl %esi, %eax
; X86-NOBMI-NEXT: popl %esi		; X86-NOBMI-NEXT: popl %esi
; X86-NOBMI-NEXT: popl %edi		; X86-NOBMI-NEXT: popl %edi
		; X86-NOBMI-NEXT: popl %ebx
; X86-NOBMI-NEXT: retl		; X86-NOBMI-NEXT: retl
;		;
; X86-BMI1NOTBM-LABEL: bextr64_b3_load_indexzext:		; X86-BMI1NOTBM-LABEL: bextr64_b3_load_indexzext:
; X86-BMI1NOTBM: # %bb.0:		; X86-BMI1NOTBM: # %bb.0:
; X86-BMI1NOTBM-NEXT: pushl %ebx		; X86-BMI1NOTBM-NEXT: pushl %ebx
; X86-BMI1NOTBM-NEXT: pushl %edi		; X86-BMI1NOTBM-NEXT: pushl %edi
; X86-BMI1NOTBM-NEXT: pushl %esi		; X86-BMI1NOTBM-NEXT: pushl %esi
; X86-BMI1NOTBM-NEXT: movb {{[0-9]+}}(%esp), %al		; X86-BMI1NOTBM-NEXT: movb {{[0-9]+}}(%esp), %al
[9 unchanged lines not shown]
; X86-BMI1NOTBM-NEXT: # %bb.1:		; X86-BMI1NOTBM-NEXT: # %bb.1:
; X86-BMI1NOTBM-NEXT: movl %edx, %esi		; X86-BMI1NOTBM-NEXT: movl %edx, %esi
; X86-BMI1NOTBM-NEXT: xorl %edx, %edx		; X86-BMI1NOTBM-NEXT: xorl %edx, %edx
; X86-BMI1NOTBM-NEXT: .LBB28_2:		; X86-BMI1NOTBM-NEXT: .LBB28_2:
; X86-BMI1NOTBM-NEXT: movl $-1, %edi		; X86-BMI1NOTBM-NEXT: movl $-1, %edi
; X86-BMI1NOTBM-NEXT: movl $-1, %ebx		; X86-BMI1NOTBM-NEXT: movl $-1, %ebx
; X86-BMI1NOTBM-NEXT: movl %eax, %ecx		; X86-BMI1NOTBM-NEXT: movl %eax, %ecx
; X86-BMI1NOTBM-NEXT: shll %cl, %ebx		; X86-BMI1NOTBM-NEXT: shll %cl, %ebx
; X86-BMI1NOTBM-NEXT: shldl %cl, %edi, %edi
; X86-BMI1NOTBM-NEXT: testb $32, %al		; X86-BMI1NOTBM-NEXT: testb $32, %al
; X86-BMI1NOTBM-NEXT: je .LBB28_4		; X86-BMI1NOTBM-NEXT: je .LBB28_4
; X86-BMI1NOTBM-NEXT: # %bb.3:		; X86-BMI1NOTBM-NEXT: # %bb.3:
; X86-BMI1NOTBM-NEXT: movl %ebx, %edi		; X86-BMI1NOTBM-NEXT: movl %ebx, %edi
; X86-BMI1NOTBM-NEXT: xorl %ebx, %ebx		; X86-BMI1NOTBM-NEXT: xorl %ebx, %ebx
; X86-BMI1NOTBM-NEXT: .LBB28_4:		; X86-BMI1NOTBM-NEXT: .LBB28_4:
; X86-BMI1NOTBM-NEXT: andnl %edx, %edi, %edx		; X86-BMI1NOTBM-NEXT: andnl %edx, %edi, %edx
; X86-BMI1NOTBM-NEXT: andnl %esi, %ebx, %eax		; X86-BMI1NOTBM-NEXT: andnl %esi, %ebx, %eax
; X86-BMI1NOTBM-NEXT: popl %esi		; X86-BMI1NOTBM-NEXT: popl %esi
; X86-BMI1NOTBM-NEXT: popl %edi		; X86-BMI1NOTBM-NEXT: popl %edi
; X86-BMI1NOTBM-NEXT: popl %ebx		; X86-BMI1NOTBM-NEXT: popl %ebx
; X86-BMI1NOTBM-NEXT: retl		; X86-BMI1NOTBM-NEXT: retl
;		;
; X86-BMI1BMI2-LABEL: bextr64_b3_load_indexzext:		; X86-BMI1BMI2-LABEL: bextr64_b3_load_indexzext:
; X86-BMI1BMI2: # %bb.0:		; X86-BMI1BMI2: # %bb.0:
; X86-BMI1BMI2-NEXT: pushl %ebx		; X86-BMI1BMI2-NEXT: pushl %ebx
; X86-BMI1BMI2-NEXT: pushl %edi
; X86-BMI1BMI2-NEXT: pushl %esi		; X86-BMI1BMI2-NEXT: pushl %esi
; X86-BMI1BMI2-NEXT: movb {{[0-9]+}}(%esp), %al		; X86-BMI1BMI2-NEXT: movb {{[0-9]+}}(%esp), %bl
; X86-BMI1BMI2-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-BMI1BMI2-NEXT: movb {{[0-9]+}}(%esp), %cl
; X86-BMI1BMI2-NEXT: movl {{[0-9]+}}(%esp), %edx		; X86-BMI1BMI2-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-BMI1BMI2-NEXT: movl (%edx), %esi		; X86-BMI1BMI2-NEXT: movl (%edx), %eax
; X86-BMI1BMI2-NEXT: movl 4(%edx), %edi		; X86-BMI1BMI2-NEXT: movl 4(%edx), %esi
; X86-BMI1BMI2-NEXT: shrxl %ecx, %edi, %edx		; X86-BMI1BMI2-NEXT: shrxl %ecx, %esi, %edx
; X86-BMI1BMI2-NEXT: shrdl %cl, %edi, %esi		; X86-BMI1BMI2-NEXT: shrdl %cl, %esi, %eax
; X86-BMI1BMI2-NEXT: testb $32, %cl		; X86-BMI1BMI2-NEXT: testb $32, %cl
; X86-BMI1BMI2-NEXT: je .LBB28_2		; X86-BMI1BMI2-NEXT: je .LBB28_2
; X86-BMI1BMI2-NEXT: # %bb.1:		; X86-BMI1BMI2-NEXT: # %bb.1:
; X86-BMI1BMI2-NEXT: movl %edx, %esi		; X86-BMI1BMI2-NEXT: movl %edx, %eax
; X86-BMI1BMI2-NEXT: xorl %edx, %edx		; X86-BMI1BMI2-NEXT: xorl %edx, %edx
; X86-BMI1BMI2-NEXT: .LBB28_2:		; X86-BMI1BMI2-NEXT: .LBB28_2:
; X86-BMI1BMI2-NEXT: movl $-1, %edi		; X86-BMI1BMI2-NEXT: movl $-1, %esi
; X86-BMI1BMI2-NEXT: shlxl %eax, %edi, %ebx		; X86-BMI1BMI2-NEXT: shlxl %ebx, %esi, %ecx
; X86-BMI1BMI2-NEXT: movl %eax, %ecx		; X86-BMI1BMI2-NEXT: testb $32, %bl
; X86-BMI1BMI2-NEXT: shldl %cl, %edi, %edi
; X86-BMI1BMI2-NEXT: testb $32, %al
; X86-BMI1BMI2-NEXT: je .LBB28_4		; X86-BMI1BMI2-NEXT: je .LBB28_4
; X86-BMI1BMI2-NEXT: # %bb.3:		; X86-BMI1BMI2-NEXT: # %bb.3:
; X86-BMI1BMI2-NEXT: movl %ebx, %edi		; X86-BMI1BMI2-NEXT: movl %ecx, %esi
; X86-BMI1BMI2-NEXT: xorl %ebx, %ebx		; X86-BMI1BMI2-NEXT: xorl %ecx, %ecx
; X86-BMI1BMI2-NEXT: .LBB28_4:		; X86-BMI1BMI2-NEXT: .LBB28_4:
; X86-BMI1BMI2-NEXT: andnl %edx, %edi, %edx		; X86-BMI1BMI2-NEXT: andnl %edx, %esi, %edx
; X86-BMI1BMI2-NEXT: andnl %esi, %ebx, %eax		; X86-BMI1BMI2-NEXT: andnl %eax, %ecx, %eax
; X86-BMI1BMI2-NEXT: popl %esi		; X86-BMI1BMI2-NEXT: popl %esi
; X86-BMI1BMI2-NEXT: popl %edi
; X86-BMI1BMI2-NEXT: popl %ebx		; X86-BMI1BMI2-NEXT: popl %ebx
; X86-BMI1BMI2-NEXT: retl		; X86-BMI1BMI2-NEXT: retl
;		;
; X64-NOBMI-LABEL: bextr64_b3_load_indexzext:		; X64-NOBMI-LABEL: bextr64_b3_load_indexzext:
; X64-NOBMI: # %bb.0:		; X64-NOBMI: # %bb.0:
; X64-NOBMI-NEXT: movl %esi, %ecx		; X64-NOBMI-NEXT: movl %esi, %ecx
; X64-NOBMI-NEXT: movq (%rdi), %rsi		; X64-NOBMI-NEXT: movq (%rdi), %rsi
; X64-NOBMI-NEXT: # kill: def $cl killed $cl killed $ecx		; X64-NOBMI-NEXT: # kill: def $cl killed $cl killed $ecx
[28 unchanged lines not shown]	; X64-BMI1BMI2-NEXT: retq
%mask = xor i64 %notmask, -1		%mask = xor i64 %notmask, -1
%masked = and i64 %mask, %shifted		%masked = and i64 %mask, %shifted
ret i64 %masked		ret i64 %masked
}		}

define i64 @bextr64_b4_commutative(i64 %val, i64 %numskipbits, i64 %numlowbits) nounwind {		define i64 @bextr64_b4_commutative(i64 %val, i64 %numskipbits, i64 %numlowbits) nounwind {
; X86-NOBMI-LABEL: bextr64_b4_commutative:		; X86-NOBMI-LABEL: bextr64_b4_commutative:
; X86-NOBMI: # %bb.0:		; X86-NOBMI: # %bb.0:
		; X86-NOBMI-NEXT: pushl %ebx
; X86-NOBMI-NEXT: pushl %edi		; X86-NOBMI-NEXT: pushl %edi
; X86-NOBMI-NEXT: pushl %esi		; X86-NOBMI-NEXT: pushl %esi
; X86-NOBMI-NEXT: movb {{[0-9]+}}(%esp), %ch		; X86-NOBMI-NEXT: movb {{[0-9]+}}(%esp), %ch
; X86-NOBMI-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-NOBMI-NEXT: movb {{[0-9]+}}(%esp), %cl
; X86-NOBMI-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-NOBMI-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NOBMI-NEXT: movl {{[0-9]+}}(%esp), %esi		; X86-NOBMI-NEXT: movl {{[0-9]+}}(%esp), %esi
; X86-NOBMI-NEXT: movl %esi, %edx		; X86-NOBMI-NEXT: movl %esi, %edx
; X86-NOBMI-NEXT: shrl %cl, %edx		; X86-NOBMI-NEXT: shrl %cl, %edx
; X86-NOBMI-NEXT: shrdl %cl, %esi, %eax		; X86-NOBMI-NEXT: shrdl %cl, %esi, %eax
		; X86-NOBMI-NEXT: xorl %esi, %esi
; X86-NOBMI-NEXT: testb $32, %cl		; X86-NOBMI-NEXT: testb $32, %cl
; X86-NOBMI-NEXT: je .LBB29_2		; X86-NOBMI-NEXT: je .LBB29_2
; X86-NOBMI-NEXT: # %bb.1:		; X86-NOBMI-NEXT: # %bb.1:
; X86-NOBMI-NEXT: movl %edx, %eax		; X86-NOBMI-NEXT: movl %edx, %eax
; X86-NOBMI-NEXT: xorl %edx, %edx		; X86-NOBMI-NEXT: xorl %edx, %edx
; X86-NOBMI-NEXT: .LBB29_2:		; X86-NOBMI-NEXT: .LBB29_2:
; X86-NOBMI-NEXT: movl $-1, %edi		; X86-NOBMI-NEXT: movl $-1, %edi
; X86-NOBMI-NEXT: movl $-1, %esi		; X86-NOBMI-NEXT: movl $-1, %ebx
; X86-NOBMI-NEXT: movb %ch, %cl		; X86-NOBMI-NEXT: movb %ch, %cl
; X86-NOBMI-NEXT: shll %cl, %esi		; X86-NOBMI-NEXT: shll %cl, %ebx
; X86-NOBMI-NEXT: shldl %cl, %edi, %edi
; X86-NOBMI-NEXT: testb $32, %ch		; X86-NOBMI-NEXT: testb $32, %ch
; X86-NOBMI-NEXT: je .LBB29_4		; X86-NOBMI-NEXT: jne .LBB29_3
; X86-NOBMI-NEXT: # %bb.3:		; X86-NOBMI-NEXT: # %bb.4:
; X86-NOBMI-NEXT: movl %esi, %edi		; X86-NOBMI-NEXT: movl %ebx, %esi
; X86-NOBMI-NEXT: xorl %esi, %esi		; X86-NOBMI-NEXT: jmp .LBB29_5
; X86-NOBMI-NEXT: .LBB29_4:		; X86-NOBMI-NEXT: .LBB29_3:
		; X86-NOBMI-NEXT: movl %ebx, %edi
		; X86-NOBMI-NEXT: .LBB29_5:
; X86-NOBMI-NEXT: notl %edi		; X86-NOBMI-NEXT: notl %edi
; X86-NOBMI-NEXT: andl %edi, %edx		; X86-NOBMI-NEXT: andl %edi, %edx
; X86-NOBMI-NEXT: notl %esi		; X86-NOBMI-NEXT: notl %esi
; X86-NOBMI-NEXT: andl %esi, %eax		; X86-NOBMI-NEXT: andl %esi, %eax
; X86-NOBMI-NEXT: popl %esi		; X86-NOBMI-NEXT: popl %esi
; X86-NOBMI-NEXT: popl %edi		; X86-NOBMI-NEXT: popl %edi
		; X86-NOBMI-NEXT: popl %ebx
; X86-NOBMI-NEXT: retl		; X86-NOBMI-NEXT: retl
;		;
; X86-BMI1NOTBM-LABEL: bextr64_b4_commutative:		; X86-BMI1NOTBM-LABEL: bextr64_b4_commutative:
; X86-BMI1NOTBM: # %bb.0:		; X86-BMI1NOTBM: # %bb.0:
; X86-BMI1NOTBM-NEXT: pushl %ebx		; X86-BMI1NOTBM-NEXT: pushl %ebx
; X86-BMI1NOTBM-NEXT: pushl %edi		; X86-BMI1NOTBM-NEXT: pushl %edi
; X86-BMI1NOTBM-NEXT: pushl %esi		; X86-BMI1NOTBM-NEXT: pushl %esi
; X86-BMI1NOTBM-NEXT: movb {{[0-9]+}}(%esp), %al		; X86-BMI1NOTBM-NEXT: movb {{[0-9]+}}(%esp), %al
; X86-BMI1NOTBM-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-BMI1NOTBM-NEXT: movb {{[0-9]+}}(%esp), %cl
; X86-BMI1NOTBM-NEXT: movl {{[0-9]+}}(%esp), %esi		; X86-BMI1NOTBM-NEXT: movl {{[0-9]+}}(%esp), %esi
; X86-BMI1NOTBM-NEXT: movl {{[0-9]+}}(%esp), %edi		; X86-BMI1NOTBM-NEXT: movl {{[0-9]+}}(%esp), %edi
; X86-BMI1NOTBM-NEXT: movl %edi, %edx		; X86-BMI1NOTBM-NEXT: movl %edi, %edx
; X86-BMI1NOTBM-NEXT: shrl %cl, %edx		; X86-BMI1NOTBM-NEXT: shrl %cl, %edx
; X86-BMI1NOTBM-NEXT: shrdl %cl, %edi, %esi		; X86-BMI1NOTBM-NEXT: shrdl %cl, %edi, %esi
; X86-BMI1NOTBM-NEXT: testb $32, %cl		; X86-BMI1NOTBM-NEXT: testb $32, %cl
; X86-BMI1NOTBM-NEXT: je .LBB29_2		; X86-BMI1NOTBM-NEXT: je .LBB29_2
; X86-BMI1NOTBM-NEXT: # %bb.1:		; X86-BMI1NOTBM-NEXT: # %bb.1:
; X86-BMI1NOTBM-NEXT: movl %edx, %esi		; X86-BMI1NOTBM-NEXT: movl %edx, %esi
; X86-BMI1NOTBM-NEXT: xorl %edx, %edx		; X86-BMI1NOTBM-NEXT: xorl %edx, %edx
; X86-BMI1NOTBM-NEXT: .LBB29_2:		; X86-BMI1NOTBM-NEXT: .LBB29_2:
; X86-BMI1NOTBM-NEXT: movl $-1, %edi		; X86-BMI1NOTBM-NEXT: movl $-1, %edi
; X86-BMI1NOTBM-NEXT: movl $-1, %ebx		; X86-BMI1NOTBM-NEXT: movl $-1, %ebx
; X86-BMI1NOTBM-NEXT: movl %eax, %ecx		; X86-BMI1NOTBM-NEXT: movl %eax, %ecx
; X86-BMI1NOTBM-NEXT: shll %cl, %ebx		; X86-BMI1NOTBM-NEXT: shll %cl, %ebx
; X86-BMI1NOTBM-NEXT: shldl %cl, %edi, %edi
; X86-BMI1NOTBM-NEXT: testb $32, %al		; X86-BMI1NOTBM-NEXT: testb $32, %al
; X86-BMI1NOTBM-NEXT: je .LBB29_4		; X86-BMI1NOTBM-NEXT: je .LBB29_4
; X86-BMI1NOTBM-NEXT: # %bb.3:		; X86-BMI1NOTBM-NEXT: # %bb.3:
; X86-BMI1NOTBM-NEXT: movl %ebx, %edi		; X86-BMI1NOTBM-NEXT: movl %ebx, %edi
; X86-BMI1NOTBM-NEXT: xorl %ebx, %ebx		; X86-BMI1NOTBM-NEXT: xorl %ebx, %ebx
; X86-BMI1NOTBM-NEXT: .LBB29_4:		; X86-BMI1NOTBM-NEXT: .LBB29_4:
; X86-BMI1NOTBM-NEXT: andnl %edx, %edi, %edx		; X86-BMI1NOTBM-NEXT: andnl %edx, %edi, %edx
; X86-BMI1NOTBM-NEXT: andnl %esi, %ebx, %eax		; X86-BMI1NOTBM-NEXT: andnl %esi, %ebx, %eax
; X86-BMI1NOTBM-NEXT: popl %esi		; X86-BMI1NOTBM-NEXT: popl %esi
; X86-BMI1NOTBM-NEXT: popl %edi		; X86-BMI1NOTBM-NEXT: popl %edi
; X86-BMI1NOTBM-NEXT: popl %ebx		; X86-BMI1NOTBM-NEXT: popl %ebx
; X86-BMI1NOTBM-NEXT: retl		; X86-BMI1NOTBM-NEXT: retl
;		;
; X86-BMI1BMI2-LABEL: bextr64_b4_commutative:		; X86-BMI1BMI2-LABEL: bextr64_b4_commutative:
; X86-BMI1BMI2: # %bb.0:		; X86-BMI1BMI2: # %bb.0:
; X86-BMI1BMI2-NEXT: pushl %ebx		; X86-BMI1BMI2-NEXT: pushl %ebx
; X86-BMI1BMI2-NEXT: pushl %edi
; X86-BMI1BMI2-NEXT: pushl %esi		; X86-BMI1BMI2-NEXT: pushl %esi
; X86-BMI1BMI2-NEXT: movb {{[0-9]+}}(%esp), %al		; X86-BMI1BMI2-NEXT: movb {{[0-9]+}}(%esp), %bl
; X86-BMI1BMI2-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-BMI1BMI2-NEXT: movb {{[0-9]+}}(%esp), %cl
; X86-BMI1BMI2-NEXT: movl {{[0-9]+}}(%esp), %esi		; X86-BMI1BMI2-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-BMI1BMI2-NEXT: movl {{[0-9]+}}(%esp), %edx		; X86-BMI1BMI2-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-BMI1BMI2-NEXT: shrdl %cl, %edx, %esi		; X86-BMI1BMI2-NEXT: shrdl %cl, %edx, %eax
; X86-BMI1BMI2-NEXT: shrxl %ecx, %edx, %edx		; X86-BMI1BMI2-NEXT: shrxl %ecx, %edx, %edx
; X86-BMI1BMI2-NEXT: testb $32, %cl		; X86-BMI1BMI2-NEXT: testb $32, %cl
; X86-BMI1BMI2-NEXT: je .LBB29_2		; X86-BMI1BMI2-NEXT: je .LBB29_2
; X86-BMI1BMI2-NEXT: # %bb.1:		; X86-BMI1BMI2-NEXT: # %bb.1:
; X86-BMI1BMI2-NEXT: movl %edx, %esi		; X86-BMI1BMI2-NEXT: movl %edx, %eax
; X86-BMI1BMI2-NEXT: xorl %edx, %edx		; X86-BMI1BMI2-NEXT: xorl %edx, %edx
; X86-BMI1BMI2-NEXT: .LBB29_2:		; X86-BMI1BMI2-NEXT: .LBB29_2:
; X86-BMI1BMI2-NEXT: movl $-1, %edi		; X86-BMI1BMI2-NEXT: movl $-1, %esi
; X86-BMI1BMI2-NEXT: shlxl %eax, %edi, %ebx		; X86-BMI1BMI2-NEXT: shlxl %ebx, %esi, %ecx
; X86-BMI1BMI2-NEXT: movl %eax, %ecx		; X86-BMI1BMI2-NEXT: testb $32, %bl
; X86-BMI1BMI2-NEXT: shldl %cl, %edi, %edi
; X86-BMI1BMI2-NEXT: testb $32, %al
; X86-BMI1BMI2-NEXT: je .LBB29_4		; X86-BMI1BMI2-NEXT: je .LBB29_4
; X86-BMI1BMI2-NEXT: # %bb.3:		; X86-BMI1BMI2-NEXT: # %bb.3:
; X86-BMI1BMI2-NEXT: movl %ebx, %edi		; X86-BMI1BMI2-NEXT: movl %ecx, %esi
; X86-BMI1BMI2-NEXT: xorl %ebx, %ebx		; X86-BMI1BMI2-NEXT: xorl %ecx, %ecx
; X86-BMI1BMI2-NEXT: .LBB29_4:		; X86-BMI1BMI2-NEXT: .LBB29_4:
; X86-BMI1BMI2-NEXT: andnl %edx, %edi, %edx		; X86-BMI1BMI2-NEXT: andnl %edx, %esi, %edx
; X86-BMI1BMI2-NEXT: andnl %esi, %ebx, %eax		; X86-BMI1BMI2-NEXT: andnl %eax, %ecx, %eax
; X86-BMI1BMI2-NEXT: popl %esi		; X86-BMI1BMI2-NEXT: popl %esi
; X86-BMI1BMI2-NEXT: popl %edi
; X86-BMI1BMI2-NEXT: popl %ebx		; X86-BMI1BMI2-NEXT: popl %ebx
; X86-BMI1BMI2-NEXT: retl		; X86-BMI1BMI2-NEXT: retl
;		;
; X64-NOBMI-LABEL: bextr64_b4_commutative:		; X64-NOBMI-LABEL: bextr64_b4_commutative:
; X64-NOBMI: # %bb.0:		; X64-NOBMI: # %bb.0:
; X64-NOBMI-NEXT: movq %rsi, %rcx		; X64-NOBMI-NEXT: movq %rsi, %rcx
; X64-NOBMI-NEXT: # kill: def $cl killed $cl killed $rcx		; X64-NOBMI-NEXT: # kill: def $cl killed $cl killed $rcx
; X64-NOBMI-NEXT: shrq %cl, %rdi		; X64-NOBMI-NEXT: shrq %cl, %rdi
[27 lines not shown]
define i64 @bextr64_b5_skipextrauses(i64 %val, i64 %numskipbits, i64 %numlowbits) nounwind {		define i64 @bextr64_b5_skipextrauses(i64 %val, i64 %numskipbits, i64 %numlowbits) nounwind {
; X86-NOBMI-LABEL: bextr64_b5_skipextrauses:		; X86-NOBMI-LABEL: bextr64_b5_skipextrauses:
; X86-NOBMI: # %bb.0:		; X86-NOBMI: # %bb.0:
; X86-NOBMI-NEXT: pushl %ebp		; X86-NOBMI-NEXT: pushl %ebp
; X86-NOBMI-NEXT: pushl %ebx		; X86-NOBMI-NEXT: pushl %ebx
; X86-NOBMI-NEXT: pushl %edi		; X86-NOBMI-NEXT: pushl %edi
; X86-NOBMI-NEXT: pushl %esi		; X86-NOBMI-NEXT: pushl %esi
; X86-NOBMI-NEXT: subl $12, %esp		; X86-NOBMI-NEXT: subl $12, %esp
; X86-NOBMI-NEXT: movb {{[0-9]+}}(%esp), %dl		; X86-NOBMI-NEXT: movb {{[0-9]+}}(%esp), %ch
; X86-NOBMI-NEXT: movl {{[0-9]+}}(%esp), %ebx		; X86-NOBMI-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-NOBMI-NEXT: movl {{[0-9]+}}(%esp), %esi		; X86-NOBMI-NEXT: movl {{[0-9]+}}(%esp), %esi
; X86-NOBMI-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-NOBMI-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NOBMI-NEXT: movl %esi, %ebp		; X86-NOBMI-NEXT: movl %esi, %ebp
; X86-NOBMI-NEXT: movl %eax, %ecx		; X86-NOBMI-NEXT: movb %al, %cl
; X86-NOBMI-NEXT: shrl %cl, %ebp		; X86-NOBMI-NEXT: shrl %cl, %ebp
; X86-NOBMI-NEXT: shrdl %cl, %esi, %ebx		; X86-NOBMI-NEXT: shrdl %cl, %esi, %edx
		; X86-NOBMI-NEXT: xorl %ebx, %ebx
; X86-NOBMI-NEXT: testb $32, %al		; X86-NOBMI-NEXT: testb $32, %al
; X86-NOBMI-NEXT: je .LBB30_2		; X86-NOBMI-NEXT: je .LBB30_2
; X86-NOBMI-NEXT: # %bb.1:		; X86-NOBMI-NEXT: # %bb.1:
; X86-NOBMI-NEXT: movl %ebp, %ebx		; X86-NOBMI-NEXT: movl %ebp, %edx
; X86-NOBMI-NEXT: xorl %ebp, %ebp		; X86-NOBMI-NEXT: xorl %ebp, %ebp
; X86-NOBMI-NEXT: .LBB30_2:		; X86-NOBMI-NEXT: .LBB30_2:
; X86-NOBMI-NEXT: movl $-1, %esi
; X86-NOBMI-NEXT: movl $-1, %edi		; X86-NOBMI-NEXT: movl $-1, %edi
; X86-NOBMI-NEXT: movl %edx, %ecx		; X86-NOBMI-NEXT: movl $-1, %esi
; X86-NOBMI-NEXT: shll %cl, %edi		; X86-NOBMI-NEXT: movb %ch, %cl
; X86-NOBMI-NEXT: shldl %cl, %esi, %esi		; X86-NOBMI-NEXT: shll %cl, %esi
; X86-NOBMI-NEXT: testb $32, %dl		; X86-NOBMI-NEXT: testb $32, %ch
; X86-NOBMI-NEXT: je .LBB30_4		; X86-NOBMI-NEXT: jne .LBB30_3
; X86-NOBMI-NEXT: # %bb.3:		; X86-NOBMI-NEXT: # %bb.4:
; X86-NOBMI-NEXT: movl %edi, %esi		; X86-NOBMI-NEXT: movl %esi, %ebx
; X86-NOBMI-NEXT: xorl %edi, %edi		; X86-NOBMI-NEXT: jmp .LBB30_5
; X86-NOBMI-NEXT: .LBB30_4:		; X86-NOBMI-NEXT: .LBB30_3:
; X86-NOBMI-NEXT: notl %esi		; X86-NOBMI-NEXT: movl %esi, %edi
; X86-NOBMI-NEXT: andl %ebp, %esi		; X86-NOBMI-NEXT: .LBB30_5:
; X86-NOBMI-NEXT: notl %edi		; X86-NOBMI-NEXT: notl %edi
; X86-NOBMI-NEXT: andl %ebx, %edi		; X86-NOBMI-NEXT: andl %ebp, %edi
		; X86-NOBMI-NEXT: notl %ebx
		; X86-NOBMI-NEXT: andl %edx, %ebx
; X86-NOBMI-NEXT: subl $8, %esp		; X86-NOBMI-NEXT: subl $8, %esp
; X86-NOBMI-NEXT: pushl {{[0-9]+}}(%esp)		; X86-NOBMI-NEXT: pushl {{[0-9]+}}(%esp)
; X86-NOBMI-NEXT: pushl %eax		; X86-NOBMI-NEXT: pushl %eax
; X86-NOBMI-NEXT: calll use64		; X86-NOBMI-NEXT: calll use64
; X86-NOBMI-NEXT: addl $16, %esp		; X86-NOBMI-NEXT: addl $16, %esp
; X86-NOBMI-NEXT: movl %edi, %eax		; X86-NOBMI-NEXT: movl %ebx, %eax
; X86-NOBMI-NEXT: movl %esi, %edx		; X86-NOBMI-NEXT: movl %edi, %edx
; X86-NOBMI-NEXT: addl $12, %esp		; X86-NOBMI-NEXT: addl $12, %esp
; X86-NOBMI-NEXT: popl %esi		; X86-NOBMI-NEXT: popl %esi
; X86-NOBMI-NEXT: popl %edi		; X86-NOBMI-NEXT: popl %edi
; X86-NOBMI-NEXT: popl %ebx		; X86-NOBMI-NEXT: popl %ebx
; X86-NOBMI-NEXT: popl %ebp		; X86-NOBMI-NEXT: popl %ebp
; X86-NOBMI-NEXT: retl		; X86-NOBMI-NEXT: retl
;		;
; X86-BMI1NOTBM-LABEL: bextr64_b5_skipextrauses:		; X86-BMI1NOTBM-LABEL: bextr64_b5_skipextrauses:
[16 lines not shown]
; X86-BMI1NOTBM-NEXT: # %bb.1:		; X86-BMI1NOTBM-NEXT: # %bb.1:
; X86-BMI1NOTBM-NEXT: movl %esi, %edi		; X86-BMI1NOTBM-NEXT: movl %esi, %edi
; X86-BMI1NOTBM-NEXT: xorl %esi, %esi		; X86-BMI1NOTBM-NEXT: xorl %esi, %esi
; X86-BMI1NOTBM-NEXT: .LBB30_2:		; X86-BMI1NOTBM-NEXT: .LBB30_2:
; X86-BMI1NOTBM-NEXT: movl $-1, %ebx		; X86-BMI1NOTBM-NEXT: movl $-1, %ebx
; X86-BMI1NOTBM-NEXT: movl $-1, %ebp		; X86-BMI1NOTBM-NEXT: movl $-1, %ebp
; X86-BMI1NOTBM-NEXT: movl %edx, %ecx		; X86-BMI1NOTBM-NEXT: movl %edx, %ecx
; X86-BMI1NOTBM-NEXT: shll %cl, %ebp		; X86-BMI1NOTBM-NEXT: shll %cl, %ebp
; X86-BMI1NOTBM-NEXT: shldl %cl, %ebx, %ebx
; X86-BMI1NOTBM-NEXT: testb $32, %dl		; X86-BMI1NOTBM-NEXT: testb $32, %dl
; X86-BMI1NOTBM-NEXT: je .LBB30_4		; X86-BMI1NOTBM-NEXT: je .LBB30_4
; X86-BMI1NOTBM-NEXT: # %bb.3:		; X86-BMI1NOTBM-NEXT: # %bb.3:
; X86-BMI1NOTBM-NEXT: movl %ebp, %ebx		; X86-BMI1NOTBM-NEXT: movl %ebp, %ebx
; X86-BMI1NOTBM-NEXT: xorl %ebp, %ebp		; X86-BMI1NOTBM-NEXT: xorl %ebp, %ebp
; X86-BMI1NOTBM-NEXT: .LBB30_4:		; X86-BMI1NOTBM-NEXT: .LBB30_4:
; X86-BMI1NOTBM-NEXT: andnl %esi, %ebx, %esi		; X86-BMI1NOTBM-NEXT: andnl %esi, %ebx, %esi
; X86-BMI1NOTBM-NEXT: andnl %edi, %ebp, %edi		; X86-BMI1NOTBM-NEXT: andnl %edi, %ebp, %edi
[13 lines not shown]
;		;
; X86-BMI1BMI2-LABEL: bextr64_b5_skipextrauses:		; X86-BMI1BMI2-LABEL: bextr64_b5_skipextrauses:
; X86-BMI1BMI2: # %bb.0:		; X86-BMI1BMI2: # %bb.0:
; X86-BMI1BMI2-NEXT: pushl %ebp		; X86-BMI1BMI2-NEXT: pushl %ebp
; X86-BMI1BMI2-NEXT: pushl %ebx		; X86-BMI1BMI2-NEXT: pushl %ebx
; X86-BMI1BMI2-NEXT: pushl %edi		; X86-BMI1BMI2-NEXT: pushl %edi
; X86-BMI1BMI2-NEXT: pushl %esi		; X86-BMI1BMI2-NEXT: pushl %esi
; X86-BMI1BMI2-NEXT: subl $12, %esp		; X86-BMI1BMI2-NEXT: subl $12, %esp
; X86-BMI1BMI2-NEXT: movb {{[0-9]+}}(%esp), %dl		; X86-BMI1BMI2-NEXT: movb {{[0-9]+}}(%esp), %bl
; X86-BMI1BMI2-NEXT: movl {{[0-9]+}}(%esp), %edi
; X86-BMI1BMI2-NEXT: movl {{[0-9]+}}(%esp), %esi
; X86-BMI1BMI2-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-BMI1BMI2-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-BMI1BMI2-NEXT: movl %eax, %ecx		; X86-BMI1BMI2-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-BMI1BMI2-NEXT: shrdl %cl, %esi, %edi		; X86-BMI1BMI2-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-BMI1BMI2-NEXT: shrxl %eax, %esi, %esi		; X86-BMI1BMI2-NEXT: shrdl %cl, %edx, %eax
; X86-BMI1BMI2-NEXT: testb $32, %al		; X86-BMI1BMI2-NEXT: shrxl %ecx, %edx, %edx
		; X86-BMI1BMI2-NEXT: testb $32, %cl
; X86-BMI1BMI2-NEXT: je .LBB30_2		; X86-BMI1BMI2-NEXT: je .LBB30_2
; X86-BMI1BMI2-NEXT: # %bb.1:		; X86-BMI1BMI2-NEXT: # %bb.1:
; X86-BMI1BMI2-NEXT: movl %esi, %edi		; X86-BMI1BMI2-NEXT: movl %edx, %eax
; X86-BMI1BMI2-NEXT: xorl %esi, %esi		; X86-BMI1BMI2-NEXT: xorl %edx, %edx
; X86-BMI1BMI2-NEXT: .LBB30_2:		; X86-BMI1BMI2-NEXT: .LBB30_2:
; X86-BMI1BMI2-NEXT: movl $-1, %ebp		; X86-BMI1BMI2-NEXT: movl {{[0-9]+}}(%esp), %ebp
; X86-BMI1BMI2-NEXT: shlxl %edx, %ebp, %ebx		; X86-BMI1BMI2-NEXT: movl $-1, %esi
; X86-BMI1BMI2-NEXT: movl %edx, %ecx		; X86-BMI1BMI2-NEXT: shlxl %ebx, %esi, %edi
; X86-BMI1BMI2-NEXT: shldl %cl, %ebp, %ebp		; X86-BMI1BMI2-NEXT: testb $32, %bl
; X86-BMI1BMI2-NEXT: testb $32, %dl
; X86-BMI1BMI2-NEXT: je .LBB30_4		; X86-BMI1BMI2-NEXT: je .LBB30_4
; X86-BMI1BMI2-NEXT: # %bb.3:		; X86-BMI1BMI2-NEXT: # %bb.3:
; X86-BMI1BMI2-NEXT: movl %ebx, %ebp		; X86-BMI1BMI2-NEXT: movl %edi, %esi
; X86-BMI1BMI2-NEXT: xorl %ebx, %ebx		; X86-BMI1BMI2-NEXT: xorl %edi, %edi
; X86-BMI1BMI2-NEXT: .LBB30_4:		; X86-BMI1BMI2-NEXT: .LBB30_4:
; X86-BMI1BMI2-NEXT: andnl %esi, %ebp, %esi		; X86-BMI1BMI2-NEXT: andnl %edx, %esi, %esi
; X86-BMI1BMI2-NEXT: andnl %edi, %ebx, %edi		; X86-BMI1BMI2-NEXT: andnl %eax, %edi, %edi
; X86-BMI1BMI2-NEXT: subl $8, %esp		; X86-BMI1BMI2-NEXT: subl $8, %esp
; X86-BMI1BMI2-NEXT: pushl {{[0-9]+}}(%esp)		; X86-BMI1BMI2-NEXT: pushl %ebp
; X86-BMI1BMI2-NEXT: pushl %eax		; X86-BMI1BMI2-NEXT: pushl %ecx
; X86-BMI1BMI2-NEXT: calll use64		; X86-BMI1BMI2-NEXT: calll use64
; X86-BMI1BMI2-NEXT: addl $16, %esp		; X86-BMI1BMI2-NEXT: addl $16, %esp
; X86-BMI1BMI2-NEXT: movl %edi, %eax		; X86-BMI1BMI2-NEXT: movl %edi, %eax
; X86-BMI1BMI2-NEXT: movl %esi, %edx		; X86-BMI1BMI2-NEXT: movl %esi, %edx
; X86-BMI1BMI2-NEXT: addl $12, %esp		; X86-BMI1BMI2-NEXT: addl $12, %esp
; X86-BMI1BMI2-NEXT: popl %esi		; X86-BMI1BMI2-NEXT: popl %esi
; X86-BMI1BMI2-NEXT: popl %edi		; X86-BMI1BMI2-NEXT: popl %edi
; X86-BMI1BMI2-NEXT: popl %ebx		; X86-BMI1BMI2-NEXT: popl %ebx
[1,367 lines not shown]
; X86-NOBMI-NEXT: movl %edi, %esi		; X86-NOBMI-NEXT: movl %edi, %esi
; X86-NOBMI-NEXT: xorl %edi, %edi		; X86-NOBMI-NEXT: xorl %edi, %edi
; X86-NOBMI-NEXT: .LBB41_2:		; X86-NOBMI-NEXT: .LBB41_2:
; X86-NOBMI-NEXT: movb $64, %cl		; X86-NOBMI-NEXT: movb $64, %cl
; X86-NOBMI-NEXT: subb {{[0-9]+}}(%esp), %cl		; X86-NOBMI-NEXT: subb {{[0-9]+}}(%esp), %cl
; X86-NOBMI-NEXT: movl $-1, %ebp		; X86-NOBMI-NEXT: movl $-1, %ebp
; X86-NOBMI-NEXT: movl $-1, %ebx		; X86-NOBMI-NEXT: movl $-1, %ebx
; X86-NOBMI-NEXT: shrl %cl, %ebx		; X86-NOBMI-NEXT: shrl %cl, %ebx
; X86-NOBMI-NEXT: shrdl %cl, %ebp, %ebp
; X86-NOBMI-NEXT: testb $32, %cl		; X86-NOBMI-NEXT: testb $32, %cl
; X86-NOBMI-NEXT: je .LBB41_4		; X86-NOBMI-NEXT: je .LBB41_4
; X86-NOBMI-NEXT: # %bb.3:		; X86-NOBMI-NEXT: # %bb.3:
; X86-NOBMI-NEXT: movl %ebx, %ebp		; X86-NOBMI-NEXT: movl %ebx, %ebp
; X86-NOBMI-NEXT: xorl %ebx, %ebx		; X86-NOBMI-NEXT: xorl %ebx, %ebx
; X86-NOBMI-NEXT: .LBB41_4:		; X86-NOBMI-NEXT: .LBB41_4:
; X86-NOBMI-NEXT: subl $8, %esp		; X86-NOBMI-NEXT: subl $8, %esp
; X86-NOBMI-NEXT: pushl %ebx		; X86-NOBMI-NEXT: pushl %ebx
[30 lines not shown]
; X86-BMI1NOTBM-NEXT: movl %edi, %esi		; X86-BMI1NOTBM-NEXT: movl %edi, %esi
; X86-BMI1NOTBM-NEXT: xorl %edi, %edi		; X86-BMI1NOTBM-NEXT: xorl %edi, %edi
; X86-BMI1NOTBM-NEXT: .LBB41_2:		; X86-BMI1NOTBM-NEXT: .LBB41_2:
; X86-BMI1NOTBM-NEXT: movb $64, %cl		; X86-BMI1NOTBM-NEXT: movb $64, %cl
; X86-BMI1NOTBM-NEXT: subb {{[0-9]+}}(%esp), %cl		; X86-BMI1NOTBM-NEXT: subb {{[0-9]+}}(%esp), %cl
; X86-BMI1NOTBM-NEXT: movl $-1, %ebp		; X86-BMI1NOTBM-NEXT: movl $-1, %ebp
; X86-BMI1NOTBM-NEXT: movl $-1, %ebx		; X86-BMI1NOTBM-NEXT: movl $-1, %ebx
; X86-BMI1NOTBM-NEXT: shrl %cl, %ebx		; X86-BMI1NOTBM-NEXT: shrl %cl, %ebx
; X86-BMI1NOTBM-NEXT: shrdl %cl, %ebp, %ebp
; X86-BMI1NOTBM-NEXT: testb $32, %cl		; X86-BMI1NOTBM-NEXT: testb $32, %cl
; X86-BMI1NOTBM-NEXT: je .LBB41_4		; X86-BMI1NOTBM-NEXT: je .LBB41_4
; X86-BMI1NOTBM-NEXT: # %bb.3:		; X86-BMI1NOTBM-NEXT: # %bb.3:
; X86-BMI1NOTBM-NEXT: movl %ebx, %ebp		; X86-BMI1NOTBM-NEXT: movl %ebx, %ebp
; X86-BMI1NOTBM-NEXT: xorl %ebx, %ebx		; X86-BMI1NOTBM-NEXT: xorl %ebx, %ebx
; X86-BMI1NOTBM-NEXT: .LBB41_4:		; X86-BMI1NOTBM-NEXT: .LBB41_4:
; X86-BMI1NOTBM-NEXT: subl $8, %esp		; X86-BMI1NOTBM-NEXT: subl $8, %esp
; X86-BMI1NOTBM-NEXT: pushl %ebx		; X86-BMI1NOTBM-NEXT: pushl %ebx
[24 lines not shown]
; X86-BMI1BMI2-NEXT: shrdl %cl, %eax, %esi		; X86-BMI1BMI2-NEXT: shrdl %cl, %eax, %esi
; X86-BMI1BMI2-NEXT: shrxl %ecx, %eax, %edi		; X86-BMI1BMI2-NEXT: shrxl %ecx, %eax, %edi
; X86-BMI1BMI2-NEXT: testb $32, %cl		; X86-BMI1BMI2-NEXT: testb $32, %cl
; X86-BMI1BMI2-NEXT: je .LBB41_2		; X86-BMI1BMI2-NEXT: je .LBB41_2
; X86-BMI1BMI2-NEXT: # %bb.1:		; X86-BMI1BMI2-NEXT: # %bb.1:
; X86-BMI1BMI2-NEXT: movl %edi, %esi		; X86-BMI1BMI2-NEXT: movl %edi, %esi
; X86-BMI1BMI2-NEXT: xorl %edi, %edi		; X86-BMI1BMI2-NEXT: xorl %edi, %edi
; X86-BMI1BMI2-NEXT: .LBB41_2:		; X86-BMI1BMI2-NEXT: .LBB41_2:
; X86-BMI1BMI2-NEXT: movb $64, %cl		; X86-BMI1BMI2-NEXT: movb $64, %al
; X86-BMI1BMI2-NEXT: subb {{[0-9]+}}(%esp), %cl		; X86-BMI1BMI2-NEXT: subb {{[0-9]+}}(%esp), %al
; X86-BMI1BMI2-NEXT: movl $-1, %ebx		; X86-BMI1BMI2-NEXT: movl $-1, %ebp
; X86-BMI1BMI2-NEXT: shrxl %ecx, %ebx, %ebp		; X86-BMI1BMI2-NEXT: shrxl %eax, %ebp, %ebx
; X86-BMI1BMI2-NEXT: shrdl %cl, %ebx, %ebx		; X86-BMI1BMI2-NEXT: testb $32, %al
; X86-BMI1BMI2-NEXT: testb $32, %cl
; X86-BMI1BMI2-NEXT: je .LBB41_4		; X86-BMI1BMI2-NEXT: je .LBB41_4
; X86-BMI1BMI2-NEXT: # %bb.3:		; X86-BMI1BMI2-NEXT: # %bb.3:
; X86-BMI1BMI2-NEXT: movl %ebp, %ebx		; X86-BMI1BMI2-NEXT: movl %ebx, %ebp
; X86-BMI1BMI2-NEXT: xorl %ebp, %ebp		; X86-BMI1BMI2-NEXT: xorl %ebx, %ebx
; X86-BMI1BMI2-NEXT: .LBB41_4:		; X86-BMI1BMI2-NEXT: .LBB41_4:
; X86-BMI1BMI2-NEXT: subl $8, %esp		; X86-BMI1BMI2-NEXT: subl $8, %esp
; X86-BMI1BMI2-NEXT: pushl %ebp
; X86-BMI1BMI2-NEXT: pushl %ebx		; X86-BMI1BMI2-NEXT: pushl %ebx
		; X86-BMI1BMI2-NEXT: pushl %ebp
; X86-BMI1BMI2-NEXT: calll use64		; X86-BMI1BMI2-NEXT: calll use64
; X86-BMI1BMI2-NEXT: addl $16, %esp		; X86-BMI1BMI2-NEXT: addl $16, %esp
; X86-BMI1BMI2-NEXT: andl %ebx, %esi		; X86-BMI1BMI2-NEXT: andl %ebp, %esi
; X86-BMI1BMI2-NEXT: andl %ebp, %edi		; X86-BMI1BMI2-NEXT: andl %ebx, %edi
; X86-BMI1BMI2-NEXT: movl %esi, %eax		; X86-BMI1BMI2-NEXT: movl %esi, %eax
; X86-BMI1BMI2-NEXT: movl %edi, %edx		; X86-BMI1BMI2-NEXT: movl %edi, %edx
; X86-BMI1BMI2-NEXT: addl $12, %esp		; X86-BMI1BMI2-NEXT: addl $12, %esp
; X86-BMI1BMI2-NEXT: popl %esi		; X86-BMI1BMI2-NEXT: popl %esi
; X86-BMI1BMI2-NEXT: popl %edi		; X86-BMI1BMI2-NEXT: popl %edi
; X86-BMI1BMI2-NEXT: popl %ebx		; X86-BMI1BMI2-NEXT: popl %ebx
; X86-BMI1BMI2-NEXT: popl %ebp		; X86-BMI1BMI2-NEXT: popl %ebp
; X86-BMI1BMI2-NEXT: retl		; X86-BMI1BMI2-NEXT: retl
[87 lines not shown]
; X86-NOBMI-NEXT: movl %edi, %esi		; X86-NOBMI-NEXT: movl %edi, %esi
; X86-NOBMI-NEXT: xorl %edi, %edi		; X86-NOBMI-NEXT: xorl %edi, %edi
; X86-NOBMI-NEXT: .LBB42_2:		; X86-NOBMI-NEXT: .LBB42_2:
; X86-NOBMI-NEXT: movb $64, %cl		; X86-NOBMI-NEXT: movb $64, %cl
; X86-NOBMI-NEXT: subb {{[0-9]+}}(%esp), %cl		; X86-NOBMI-NEXT: subb {{[0-9]+}}(%esp), %cl
; X86-NOBMI-NEXT: movl $-1, %ebp		; X86-NOBMI-NEXT: movl $-1, %ebp
; X86-NOBMI-NEXT: movl $-1, %ebx		; X86-NOBMI-NEXT: movl $-1, %ebx
; X86-NOBMI-NEXT: shrl %cl, %ebx		; X86-NOBMI-NEXT: shrl %cl, %ebx
; X86-NOBMI-NEXT: shrdl %cl, %ebp, %ebp
; X86-NOBMI-NEXT: testb $32, %cl		; X86-NOBMI-NEXT: testb $32, %cl
; X86-NOBMI-NEXT: je .LBB42_4		; X86-NOBMI-NEXT: je .LBB42_4
; X86-NOBMI-NEXT: # %bb.3:		; X86-NOBMI-NEXT: # %bb.3:
; X86-NOBMI-NEXT: movl %ebx, %ebp		; X86-NOBMI-NEXT: movl %ebx, %ebp
; X86-NOBMI-NEXT: xorl %ebx, %ebx		; X86-NOBMI-NEXT: xorl %ebx, %ebx
; X86-NOBMI-NEXT: .LBB42_4:		; X86-NOBMI-NEXT: .LBB42_4:
; X86-NOBMI-NEXT: subl $8, %esp		; X86-NOBMI-NEXT: subl $8, %esp
; X86-NOBMI-NEXT: pushl %ebx		; X86-NOBMI-NEXT: pushl %ebx
[30 lines not shown]
; X86-BMI1NOTBM-NEXT: movl %edi, %esi		; X86-BMI1NOTBM-NEXT: movl %edi, %esi
; X86-BMI1NOTBM-NEXT: xorl %edi, %edi		; X86-BMI1NOTBM-NEXT: xorl %edi, %edi
; X86-BMI1NOTBM-NEXT: .LBB42_2:		; X86-BMI1NOTBM-NEXT: .LBB42_2:
; X86-BMI1NOTBM-NEXT: movb $64, %cl		; X86-BMI1NOTBM-NEXT: movb $64, %cl
; X86-BMI1NOTBM-NEXT: subb {{[0-9]+}}(%esp), %cl		; X86-BMI1NOTBM-NEXT: subb {{[0-9]+}}(%esp), %cl
; X86-BMI1NOTBM-NEXT: movl $-1, %ebp		; X86-BMI1NOTBM-NEXT: movl $-1, %ebp
; X86-BMI1NOTBM-NEXT: movl $-1, %ebx		; X86-BMI1NOTBM-NEXT: movl $-1, %ebx
; X86-BMI1NOTBM-NEXT: shrl %cl, %ebx		; X86-BMI1NOTBM-NEXT: shrl %cl, %ebx
; X86-BMI1NOTBM-NEXT: shrdl %cl, %ebp, %ebp
; X86-BMI1NOTBM-NEXT: testb $32, %cl		; X86-BMI1NOTBM-NEXT: testb $32, %cl
; X86-BMI1NOTBM-NEXT: je .LBB42_4		; X86-BMI1NOTBM-NEXT: je .LBB42_4
; X86-BMI1NOTBM-NEXT: # %bb.3:		; X86-BMI1NOTBM-NEXT: # %bb.3:
; X86-BMI1NOTBM-NEXT: movl %ebx, %ebp		; X86-BMI1NOTBM-NEXT: movl %ebx, %ebp
; X86-BMI1NOTBM-NEXT: xorl %ebx, %ebx		; X86-BMI1NOTBM-NEXT: xorl %ebx, %ebx
; X86-BMI1NOTBM-NEXT: .LBB42_4:		; X86-BMI1NOTBM-NEXT: .LBB42_4:
; X86-BMI1NOTBM-NEXT: subl $8, %esp		; X86-BMI1NOTBM-NEXT: subl $8, %esp
; X86-BMI1NOTBM-NEXT: pushl %ebx		; X86-BMI1NOTBM-NEXT: pushl %ebx
[24 lines not shown]
; X86-BMI1BMI2-NEXT: shrdl %cl, %eax, %esi		; X86-BMI1BMI2-NEXT: shrdl %cl, %eax, %esi
; X86-BMI1BMI2-NEXT: shrxl %ecx, %eax, %edi		; X86-BMI1BMI2-NEXT: shrxl %ecx, %eax, %edi
; X86-BMI1BMI2-NEXT: testb $32, %cl		; X86-BMI1BMI2-NEXT: testb $32, %cl
; X86-BMI1BMI2-NEXT: je .LBB42_2		; X86-BMI1BMI2-NEXT: je .LBB42_2
; X86-BMI1BMI2-NEXT: # %bb.1:		; X86-BMI1BMI2-NEXT: # %bb.1:
; X86-BMI1BMI2-NEXT: movl %edi, %esi		; X86-BMI1BMI2-NEXT: movl %edi, %esi
; X86-BMI1BMI2-NEXT: xorl %edi, %edi		; X86-BMI1BMI2-NEXT: xorl %edi, %edi
; X86-BMI1BMI2-NEXT: .LBB42_2:		; X86-BMI1BMI2-NEXT: .LBB42_2:
; X86-BMI1BMI2-NEXT: movb $64, %cl		; X86-BMI1BMI2-NEXT: movb $64, %al
; X86-BMI1BMI2-NEXT: subb {{[0-9]+}}(%esp), %cl		; X86-BMI1BMI2-NEXT: subb {{[0-9]+}}(%esp), %al
; X86-BMI1BMI2-NEXT: movl $-1, %ebx		; X86-BMI1BMI2-NEXT: movl $-1, %ebp
; X86-BMI1BMI2-NEXT: shrxl %ecx, %ebx, %ebp		; X86-BMI1BMI2-NEXT: shrxl %eax, %ebp, %ebx
; X86-BMI1BMI2-NEXT: shrdl %cl, %ebx, %ebx		; X86-BMI1BMI2-NEXT: testb $32, %al
; X86-BMI1BMI2-NEXT: testb $32, %cl
; X86-BMI1BMI2-NEXT: je .LBB42_4		; X86-BMI1BMI2-NEXT: je .LBB42_4
; X86-BMI1BMI2-NEXT: # %bb.3:		; X86-BMI1BMI2-NEXT: # %bb.3:
; X86-BMI1BMI2-NEXT: movl %ebp, %ebx		; X86-BMI1BMI2-NEXT: movl %ebx, %ebp
; X86-BMI1BMI2-NEXT: xorl %ebp, %ebp		; X86-BMI1BMI2-NEXT: xorl %ebx, %ebx
; X86-BMI1BMI2-NEXT: .LBB42_4:		; X86-BMI1BMI2-NEXT: .LBB42_4:
; X86-BMI1BMI2-NEXT: subl $8, %esp		; X86-BMI1BMI2-NEXT: subl $8, %esp
; X86-BMI1BMI2-NEXT: pushl %ebp
; X86-BMI1BMI2-NEXT: pushl %ebx		; X86-BMI1BMI2-NEXT: pushl %ebx
		; X86-BMI1BMI2-NEXT: pushl %ebp
; X86-BMI1BMI2-NEXT: calll use64		; X86-BMI1BMI2-NEXT: calll use64
; X86-BMI1BMI2-NEXT: addl $16, %esp		; X86-BMI1BMI2-NEXT: addl $16, %esp
; X86-BMI1BMI2-NEXT: andl %ebx, %esi		; X86-BMI1BMI2-NEXT: andl %ebp, %esi
; X86-BMI1BMI2-NEXT: andl %ebp, %edi		; X86-BMI1BMI2-NEXT: andl %ebx, %edi
; X86-BMI1BMI2-NEXT: movl %esi, %eax		; X86-BMI1BMI2-NEXT: movl %esi, %eax
; X86-BMI1BMI2-NEXT: movl %edi, %edx		; X86-BMI1BMI2-NEXT: movl %edi, %edx
; X86-BMI1BMI2-NEXT: addl $12, %esp		; X86-BMI1BMI2-NEXT: addl $12, %esp
; X86-BMI1BMI2-NEXT: popl %esi		; X86-BMI1BMI2-NEXT: popl %esi
; X86-BMI1BMI2-NEXT: popl %edi		; X86-BMI1BMI2-NEXT: popl %edi
; X86-BMI1BMI2-NEXT: popl %ebx		; X86-BMI1BMI2-NEXT: popl %ebx
; X86-BMI1BMI2-NEXT: popl %ebp		; X86-BMI1BMI2-NEXT: popl %ebp
; X86-BMI1BMI2-NEXT: retl		; X86-BMI1BMI2-NEXT: retl
[91 lines not shown]
; X86-NOBMI-NEXT: movl %edi, %esi		; X86-NOBMI-NEXT: movl %edi, %esi
; X86-NOBMI-NEXT: xorl %edi, %edi		; X86-NOBMI-NEXT: xorl %edi, %edi
; X86-NOBMI-NEXT: .LBB43_2:		; X86-NOBMI-NEXT: .LBB43_2:
; X86-NOBMI-NEXT: movb $64, %cl		; X86-NOBMI-NEXT: movb $64, %cl
; X86-NOBMI-NEXT: subb {{[0-9]+}}(%esp), %cl		; X86-NOBMI-NEXT: subb {{[0-9]+}}(%esp), %cl
; X86-NOBMI-NEXT: movl $-1, %ebp		; X86-NOBMI-NEXT: movl $-1, %ebp
; X86-NOBMI-NEXT: movl $-1, %ebx		; X86-NOBMI-NEXT: movl $-1, %ebx
; X86-NOBMI-NEXT: shrl %cl, %ebx		; X86-NOBMI-NEXT: shrl %cl, %ebx
; X86-NOBMI-NEXT: shrdl %cl, %ebp, %ebp
; X86-NOBMI-NEXT: testb $32, %cl		; X86-NOBMI-NEXT: testb $32, %cl
; X86-NOBMI-NEXT: je .LBB43_4		; X86-NOBMI-NEXT: je .LBB43_4
; X86-NOBMI-NEXT: # %bb.3:		; X86-NOBMI-NEXT: # %bb.3:
; X86-NOBMI-NEXT: movl %ebx, %ebp		; X86-NOBMI-NEXT: movl %ebx, %ebp
; X86-NOBMI-NEXT: xorl %ebx, %ebx		; X86-NOBMI-NEXT: xorl %ebx, %ebx
; X86-NOBMI-NEXT: .LBB43_4:		; X86-NOBMI-NEXT: .LBB43_4:
; X86-NOBMI-NEXT: subl $8, %esp		; X86-NOBMI-NEXT: subl $8, %esp
; X86-NOBMI-NEXT: pushl %ebx		; X86-NOBMI-NEXT: pushl %ebx
[31 lines not shown]
; X86-BMI1NOTBM-NEXT: movl %edi, %esi		; X86-BMI1NOTBM-NEXT: movl %edi, %esi
; X86-BMI1NOTBM-NEXT: xorl %edi, %edi		; X86-BMI1NOTBM-NEXT: xorl %edi, %edi
; X86-BMI1NOTBM-NEXT: .LBB43_2:		; X86-BMI1NOTBM-NEXT: .LBB43_2:
; X86-BMI1NOTBM-NEXT: movb $64, %cl		; X86-BMI1NOTBM-NEXT: movb $64, %cl
; X86-BMI1NOTBM-NEXT: subb {{[0-9]+}}(%esp), %cl		; X86-BMI1NOTBM-NEXT: subb {{[0-9]+}}(%esp), %cl
; X86-BMI1NOTBM-NEXT: movl $-1, %ebp		; X86-BMI1NOTBM-NEXT: movl $-1, %ebp
; X86-BMI1NOTBM-NEXT: movl $-1, %ebx		; X86-BMI1NOTBM-NEXT: movl $-1, %ebx
; X86-BMI1NOTBM-NEXT: shrl %cl, %ebx		; X86-BMI1NOTBM-NEXT: shrl %cl, %ebx
; X86-BMI1NOTBM-NEXT: shrdl %cl, %ebp, %ebp
; X86-BMI1NOTBM-NEXT: testb $32, %cl		; X86-BMI1NOTBM-NEXT: testb $32, %cl
; X86-BMI1NOTBM-NEXT: je .LBB43_4		; X86-BMI1NOTBM-NEXT: je .LBB43_4
; X86-BMI1NOTBM-NEXT: # %bb.3:		; X86-BMI1NOTBM-NEXT: # %bb.3:
; X86-BMI1NOTBM-NEXT: movl %ebx, %ebp		; X86-BMI1NOTBM-NEXT: movl %ebx, %ebp
; X86-BMI1NOTBM-NEXT: xorl %ebx, %ebx		; X86-BMI1NOTBM-NEXT: xorl %ebx, %ebx
; X86-BMI1NOTBM-NEXT: .LBB43_4:		; X86-BMI1NOTBM-NEXT: .LBB43_4:
; X86-BMI1NOTBM-NEXT: subl $8, %esp		; X86-BMI1NOTBM-NEXT: subl $8, %esp
; X86-BMI1NOTBM-NEXT: pushl %ebx		; X86-BMI1NOTBM-NEXT: pushl %ebx
[25 lines not shown]
; X86-BMI1BMI2-NEXT: shrxl %ecx, %eax, %edi		; X86-BMI1BMI2-NEXT: shrxl %ecx, %eax, %edi
; X86-BMI1BMI2-NEXT: shrdl %cl, %eax, %esi		; X86-BMI1BMI2-NEXT: shrdl %cl, %eax, %esi
; X86-BMI1BMI2-NEXT: testb $32, %cl		; X86-BMI1BMI2-NEXT: testb $32, %cl
; X86-BMI1BMI2-NEXT: je .LBB43_2		; X86-BMI1BMI2-NEXT: je .LBB43_2
; X86-BMI1BMI2-NEXT: # %bb.1:		; X86-BMI1BMI2-NEXT: # %bb.1:
; X86-BMI1BMI2-NEXT: movl %edi, %esi		; X86-BMI1BMI2-NEXT: movl %edi, %esi
; X86-BMI1BMI2-NEXT: xorl %edi, %edi		; X86-BMI1BMI2-NEXT: xorl %edi, %edi
; X86-BMI1BMI2-NEXT: .LBB43_2:		; X86-BMI1BMI2-NEXT: .LBB43_2:
; X86-BMI1BMI2-NEXT: movb $64, %cl		; X86-BMI1BMI2-NEXT: movb $64, %al
; X86-BMI1BMI2-NEXT: subb {{[0-9]+}}(%esp), %cl		; X86-BMI1BMI2-NEXT: subb {{[0-9]+}}(%esp), %al
; X86-BMI1BMI2-NEXT: movl $-1, %ebx		; X86-BMI1BMI2-NEXT: movl $-1, %ebp
; X86-BMI1BMI2-NEXT: shrxl %ecx, %ebx, %ebp		; X86-BMI1BMI2-NEXT: shrxl %eax, %ebp, %ebx
; X86-BMI1BMI2-NEXT: shrdl %cl, %ebx, %ebx		; X86-BMI1BMI2-NEXT: testb $32, %al
; X86-BMI1BMI2-NEXT: testb $32, %cl
; X86-BMI1BMI2-NEXT: je .LBB43_4		; X86-BMI1BMI2-NEXT: je .LBB43_4
; X86-BMI1BMI2-NEXT: # %bb.3:		; X86-BMI1BMI2-NEXT: # %bb.3:
; X86-BMI1BMI2-NEXT: movl %ebp, %ebx		; X86-BMI1BMI2-NEXT: movl %ebx, %ebp
; X86-BMI1BMI2-NEXT: xorl %ebp, %ebp		; X86-BMI1BMI2-NEXT: xorl %ebx, %ebx
; X86-BMI1BMI2-NEXT: .LBB43_4:		; X86-BMI1BMI2-NEXT: .LBB43_4:
; X86-BMI1BMI2-NEXT: subl $8, %esp		; X86-BMI1BMI2-NEXT: subl $8, %esp
; X86-BMI1BMI2-NEXT: pushl %ebp
; X86-BMI1BMI2-NEXT: pushl %ebx		; X86-BMI1BMI2-NEXT: pushl %ebx
		; X86-BMI1BMI2-NEXT: pushl %ebp
; X86-BMI1BMI2-NEXT: calll use64		; X86-BMI1BMI2-NEXT: calll use64
; X86-BMI1BMI2-NEXT: addl $16, %esp		; X86-BMI1BMI2-NEXT: addl $16, %esp
; X86-BMI1BMI2-NEXT: andl %ebx, %esi		; X86-BMI1BMI2-NEXT: andl %ebp, %esi
; X86-BMI1BMI2-NEXT: andl %ebp, %edi		; X86-BMI1BMI2-NEXT: andl %ebx, %edi
; X86-BMI1BMI2-NEXT: movl %esi, %eax		; X86-BMI1BMI2-NEXT: movl %esi, %eax
; X86-BMI1BMI2-NEXT: movl %edi, %edx		; X86-BMI1BMI2-NEXT: movl %edi, %edx
; X86-BMI1BMI2-NEXT: addl $12, %esp		; X86-BMI1BMI2-NEXT: addl $12, %esp
; X86-BMI1BMI2-NEXT: popl %esi		; X86-BMI1BMI2-NEXT: popl %esi
; X86-BMI1BMI2-NEXT: popl %edi		; X86-BMI1BMI2-NEXT: popl %edi
; X86-BMI1BMI2-NEXT: popl %ebx		; X86-BMI1BMI2-NEXT: popl %ebx
; X86-BMI1BMI2-NEXT: popl %ebp		; X86-BMI1BMI2-NEXT: popl %ebp
; X86-BMI1BMI2-NEXT: retl		; X86-BMI1BMI2-NEXT: retl
[89 lines not shown]
; X86-NOBMI-NEXT: movl %edi, %esi		; X86-NOBMI-NEXT: movl %edi, %esi
; X86-NOBMI-NEXT: xorl %edi, %edi		; X86-NOBMI-NEXT: xorl %edi, %edi
; X86-NOBMI-NEXT: .LBB44_2:		; X86-NOBMI-NEXT: .LBB44_2:
; X86-NOBMI-NEXT: movb $64, %cl		; X86-NOBMI-NEXT: movb $64, %cl
; X86-NOBMI-NEXT: subb {{[0-9]+}}(%esp), %cl		; X86-NOBMI-NEXT: subb {{[0-9]+}}(%esp), %cl
; X86-NOBMI-NEXT: movl $-1, %ebp		; X86-NOBMI-NEXT: movl $-1, %ebp
; X86-NOBMI-NEXT: movl $-1, %ebx		; X86-NOBMI-NEXT: movl $-1, %ebx
; X86-NOBMI-NEXT: shrl %cl, %ebx		; X86-NOBMI-NEXT: shrl %cl, %ebx
; X86-NOBMI-NEXT: shrdl %cl, %ebp, %ebp
; X86-NOBMI-NEXT: testb $32, %cl		; X86-NOBMI-NEXT: testb $32, %cl
; X86-NOBMI-NEXT: je .LBB44_4		; X86-NOBMI-NEXT: je .LBB44_4
; X86-NOBMI-NEXT: # %bb.3:		; X86-NOBMI-NEXT: # %bb.3:
; X86-NOBMI-NEXT: movl %ebx, %ebp		; X86-NOBMI-NEXT: movl %ebx, %ebp
; X86-NOBMI-NEXT: xorl %ebx, %ebx		; X86-NOBMI-NEXT: xorl %ebx, %ebx
; X86-NOBMI-NEXT: .LBB44_4:		; X86-NOBMI-NEXT: .LBB44_4:
; X86-NOBMI-NEXT: subl $8, %esp		; X86-NOBMI-NEXT: subl $8, %esp
; X86-NOBMI-NEXT: pushl %ebx		; X86-NOBMI-NEXT: pushl %ebx
[31 lines not shown]
; X86-BMI1NOTBM-NEXT: movl %edi, %esi		; X86-BMI1NOTBM-NEXT: movl %edi, %esi
; X86-BMI1NOTBM-NEXT: xorl %edi, %edi		; X86-BMI1NOTBM-NEXT: xorl %edi, %edi
; X86-BMI1NOTBM-NEXT: .LBB44_2:		; X86-BMI1NOTBM-NEXT: .LBB44_2:
; X86-BMI1NOTBM-NEXT: movb $64, %cl		; X86-BMI1NOTBM-NEXT: movb $64, %cl
; X86-BMI1NOTBM-NEXT: subb {{[0-9]+}}(%esp), %cl		; X86-BMI1NOTBM-NEXT: subb {{[0-9]+}}(%esp), %cl
; X86-BMI1NOTBM-NEXT: movl $-1, %ebp		; X86-BMI1NOTBM-NEXT: movl $-1, %ebp
; X86-BMI1NOTBM-NEXT: movl $-1, %ebx		; X86-BMI1NOTBM-NEXT: movl $-1, %ebx
; X86-BMI1NOTBM-NEXT: shrl %cl, %ebx		; X86-BMI1NOTBM-NEXT: shrl %cl, %ebx
; X86-BMI1NOTBM-NEXT: shrdl %cl, %ebp, %ebp
; X86-BMI1NOTBM-NEXT: testb $32, %cl		; X86-BMI1NOTBM-NEXT: testb $32, %cl
; X86-BMI1NOTBM-NEXT: je .LBB44_4		; X86-BMI1NOTBM-NEXT: je .LBB44_4
; X86-BMI1NOTBM-NEXT: # %bb.3:		; X86-BMI1NOTBM-NEXT: # %bb.3:
; X86-BMI1NOTBM-NEXT: movl %ebx, %ebp		; X86-BMI1NOTBM-NEXT: movl %ebx, %ebp
; X86-BMI1NOTBM-NEXT: xorl %ebx, %ebx		; X86-BMI1NOTBM-NEXT: xorl %ebx, %ebx
; X86-BMI1NOTBM-NEXT: .LBB44_4:		; X86-BMI1NOTBM-NEXT: .LBB44_4:
; X86-BMI1NOTBM-NEXT: subl $8, %esp		; X86-BMI1NOTBM-NEXT: subl $8, %esp
; X86-BMI1NOTBM-NEXT: pushl %ebx		; X86-BMI1NOTBM-NEXT: pushl %ebx
[25 lines not shown]
; X86-BMI1BMI2-NEXT: shrxl %ecx, %eax, %edi		; X86-BMI1BMI2-NEXT: shrxl %ecx, %eax, %edi
; X86-BMI1BMI2-NEXT: shrdl %cl, %eax, %esi		; X86-BMI1BMI2-NEXT: shrdl %cl, %eax, %esi
; X86-BMI1BMI2-NEXT: testb $32, %cl		; X86-BMI1BMI2-NEXT: testb $32, %cl
; X86-BMI1BMI2-NEXT: je .LBB44_2		; X86-BMI1BMI2-NEXT: je .LBB44_2
; X86-BMI1BMI2-NEXT: # %bb.1:		; X86-BMI1BMI2-NEXT: # %bb.1:
; X86-BMI1BMI2-NEXT: movl %edi, %esi		; X86-BMI1BMI2-NEXT: movl %edi, %esi
; X86-BMI1BMI2-NEXT: xorl %edi, %edi		; X86-BMI1BMI2-NEXT: xorl %edi, %edi
; X86-BMI1BMI2-NEXT: .LBB44_2:		; X86-BMI1BMI2-NEXT: .LBB44_2:
; X86-BMI1BMI2-NEXT: movb $64, %cl		; X86-BMI1BMI2-NEXT: movb $64, %al
; X86-BMI1BMI2-NEXT: subb {{[0-9]+}}(%esp), %cl		; X86-BMI1BMI2-NEXT: subb {{[0-9]+}}(%esp), %al
; X86-BMI1BMI2-NEXT: movl $-1, %ebx		; X86-BMI1BMI2-NEXT: movl $-1, %ebp
; X86-BMI1BMI2-NEXT: shrxl %ecx, %ebx, %ebp		; X86-BMI1BMI2-NEXT: shrxl %eax, %ebp, %ebx
; X86-BMI1BMI2-NEXT: shrdl %cl, %ebx, %ebx		; X86-BMI1BMI2-NEXT: testb $32, %al
; X86-BMI1BMI2-NEXT: testb $32, %cl
; X86-BMI1BMI2-NEXT: je .LBB44_4		; X86-BMI1BMI2-NEXT: je .LBB44_4
; X86-BMI1BMI2-NEXT: # %bb.3:		; X86-BMI1BMI2-NEXT: # %bb.3:
; X86-BMI1BMI2-NEXT: movl %ebp, %ebx		; X86-BMI1BMI2-NEXT: movl %ebx, %ebp
; X86-BMI1BMI2-NEXT: xorl %ebp, %ebp		; X86-BMI1BMI2-NEXT: xorl %ebx, %ebx
; X86-BMI1BMI2-NEXT: .LBB44_4:		; X86-BMI1BMI2-NEXT: .LBB44_4:
; X86-BMI1BMI2-NEXT: subl $8, %esp		; X86-BMI1BMI2-NEXT: subl $8, %esp
; X86-BMI1BMI2-NEXT: pushl %ebp
; X86-BMI1BMI2-NEXT: pushl %ebx		; X86-BMI1BMI2-NEXT: pushl %ebx
		; X86-BMI1BMI2-NEXT: pushl %ebp
; X86-BMI1BMI2-NEXT: calll use64		; X86-BMI1BMI2-NEXT: calll use64
; X86-BMI1BMI2-NEXT: addl $16, %esp		; X86-BMI1BMI2-NEXT: addl $16, %esp
; X86-BMI1BMI2-NEXT: andl %ebx, %esi		; X86-BMI1BMI2-NEXT: andl %ebp, %esi
; X86-BMI1BMI2-NEXT: andl %ebp, %edi		; X86-BMI1BMI2-NEXT: andl %ebx, %edi
; X86-BMI1BMI2-NEXT: movl %esi, %eax		; X86-BMI1BMI2-NEXT: movl %esi, %eax
; X86-BMI1BMI2-NEXT: movl %edi, %edx		; X86-BMI1BMI2-NEXT: movl %edi, %edx
; X86-BMI1BMI2-NEXT: addl $12, %esp		; X86-BMI1BMI2-NEXT: addl $12, %esp
; X86-BMI1BMI2-NEXT: popl %esi		; X86-BMI1BMI2-NEXT: popl %esi
; X86-BMI1BMI2-NEXT: popl %edi		; X86-BMI1BMI2-NEXT: popl %edi
; X86-BMI1BMI2-NEXT: popl %ebx		; X86-BMI1BMI2-NEXT: popl %ebx
; X86-BMI1BMI2-NEXT: popl %ebp		; X86-BMI1BMI2-NEXT: popl %ebp
; X86-BMI1BMI2-NEXT: retl		; X86-BMI1BMI2-NEXT: retl
[91 lines not shown]
; X86-NOBMI-NEXT: movl %edi, %esi		; X86-NOBMI-NEXT: movl %edi, %esi
; X86-NOBMI-NEXT: xorl %edi, %edi		; X86-NOBMI-NEXT: xorl %edi, %edi
; X86-NOBMI-NEXT: .LBB45_2:		; X86-NOBMI-NEXT: .LBB45_2:
; X86-NOBMI-NEXT: movb $64, %cl		; X86-NOBMI-NEXT: movb $64, %cl
; X86-NOBMI-NEXT: subb {{[0-9]+}}(%esp), %cl		; X86-NOBMI-NEXT: subb {{[0-9]+}}(%esp), %cl
; X86-NOBMI-NEXT: movl $-1, %ebp		; X86-NOBMI-NEXT: movl $-1, %ebp
; X86-NOBMI-NEXT: movl $-1, %ebx		; X86-NOBMI-NEXT: movl $-1, %ebx
; X86-NOBMI-NEXT: shrl %cl, %ebx		; X86-NOBMI-NEXT: shrl %cl, %ebx
; X86-NOBMI-NEXT: shrdl %cl, %ebp, %ebp
; X86-NOBMI-NEXT: testb $32, %cl		; X86-NOBMI-NEXT: testb $32, %cl
; X86-NOBMI-NEXT: je .LBB45_4		; X86-NOBMI-NEXT: je .LBB45_4
; X86-NOBMI-NEXT: # %bb.3:		; X86-NOBMI-NEXT: # %bb.3:
; X86-NOBMI-NEXT: movl %ebx, %ebp		; X86-NOBMI-NEXT: movl %ebx, %ebp
; X86-NOBMI-NEXT: xorl %ebx, %ebx		; X86-NOBMI-NEXT: xorl %ebx, %ebx
; X86-NOBMI-NEXT: .LBB45_4:		; X86-NOBMI-NEXT: .LBB45_4:
; X86-NOBMI-NEXT: subl $8, %esp		; X86-NOBMI-NEXT: subl $8, %esp
; X86-NOBMI-NEXT: pushl %ebx		; X86-NOBMI-NEXT: pushl %ebx
[30 lines not shown]
; X86-BMI1NOTBM-NEXT: movl %edi, %esi		; X86-BMI1NOTBM-NEXT: movl %edi, %esi
; X86-BMI1NOTBM-NEXT: xorl %edi, %edi		; X86-BMI1NOTBM-NEXT: xorl %edi, %edi
; X86-BMI1NOTBM-NEXT: .LBB45_2:		; X86-BMI1NOTBM-NEXT: .LBB45_2:
; X86-BMI1NOTBM-NEXT: movb $64, %cl		; X86-BMI1NOTBM-NEXT: movb $64, %cl
; X86-BMI1NOTBM-NEXT: subb {{[0-9]+}}(%esp), %cl		; X86-BMI1NOTBM-NEXT: subb {{[0-9]+}}(%esp), %cl
; X86-BMI1NOTBM-NEXT: movl $-1, %ebp		; X86-BMI1NOTBM-NEXT: movl $-1, %ebp
; X86-BMI1NOTBM-NEXT: movl $-1, %ebx		; X86-BMI1NOTBM-NEXT: movl $-1, %ebx
; X86-BMI1NOTBM-NEXT: shrl %cl, %ebx		; X86-BMI1NOTBM-NEXT: shrl %cl, %ebx
; X86-BMI1NOTBM-NEXT: shrdl %cl, %ebp, %ebp
; X86-BMI1NOTBM-NEXT: testb $32, %cl		; X86-BMI1NOTBM-NEXT: testb $32, %cl
; X86-BMI1NOTBM-NEXT: je .LBB45_4		; X86-BMI1NOTBM-NEXT: je .LBB45_4
; X86-BMI1NOTBM-NEXT: # %bb.3:		; X86-BMI1NOTBM-NEXT: # %bb.3:
; X86-BMI1NOTBM-NEXT: movl %ebx, %ebp		; X86-BMI1NOTBM-NEXT: movl %ebx, %ebp
; X86-BMI1NOTBM-NEXT: xorl %ebx, %ebx		; X86-BMI1NOTBM-NEXT: xorl %ebx, %ebx
; X86-BMI1NOTBM-NEXT: .LBB45_4:		; X86-BMI1NOTBM-NEXT: .LBB45_4:
; X86-BMI1NOTBM-NEXT: subl $8, %esp		; X86-BMI1NOTBM-NEXT: subl $8, %esp
; X86-BMI1NOTBM-NEXT: pushl %ebx		; X86-BMI1NOTBM-NEXT: pushl %ebx
[24 lines not shown]
; X86-BMI1BMI2-NEXT: shrdl %cl, %eax, %esi		; X86-BMI1BMI2-NEXT: shrdl %cl, %eax, %esi
; X86-BMI1BMI2-NEXT: shrxl %ecx, %eax, %edi		; X86-BMI1BMI2-NEXT: shrxl %ecx, %eax, %edi
; X86-BMI1BMI2-NEXT: testb $32, %cl		; X86-BMI1BMI2-NEXT: testb $32, %cl
; X86-BMI1BMI2-NEXT: je .LBB45_2		; X86-BMI1BMI2-NEXT: je .LBB45_2
; X86-BMI1BMI2-NEXT: # %bb.1:		; X86-BMI1BMI2-NEXT: # %bb.1:
; X86-BMI1BMI2-NEXT: movl %edi, %esi		; X86-BMI1BMI2-NEXT: movl %edi, %esi
; X86-BMI1BMI2-NEXT: xorl %edi, %edi		; X86-BMI1BMI2-NEXT: xorl %edi, %edi
; X86-BMI1BMI2-NEXT: .LBB45_2:		; X86-BMI1BMI2-NEXT: .LBB45_2:
; X86-BMI1BMI2-NEXT: movb $64, %cl		; X86-BMI1BMI2-NEXT: movb $64, %al
; X86-BMI1BMI2-NEXT: subb {{[0-9]+}}(%esp), %cl		; X86-BMI1BMI2-NEXT: subb {{[0-9]+}}(%esp), %al
; X86-BMI1BMI2-NEXT: movl $-1, %ebx		; X86-BMI1BMI2-NEXT: movl $-1, %ebp
; X86-BMI1BMI2-NEXT: shrxl %ecx, %ebx, %ebp		; X86-BMI1BMI2-NEXT: shrxl %eax, %ebp, %ebx
; X86-BMI1BMI2-NEXT: shrdl %cl, %ebx, %ebx		; X86-BMI1BMI2-NEXT: testb $32, %al
; X86-BMI1BMI2-NEXT: testb $32, %cl
; X86-BMI1BMI2-NEXT: je .LBB45_4		; X86-BMI1BMI2-NEXT: je .LBB45_4
; X86-BMI1BMI2-NEXT: # %bb.3:		; X86-BMI1BMI2-NEXT: # %bb.3:
; X86-BMI1BMI2-NEXT: movl %ebp, %ebx		; X86-BMI1BMI2-NEXT: movl %ebx, %ebp
; X86-BMI1BMI2-NEXT: xorl %ebp, %ebp		; X86-BMI1BMI2-NEXT: xorl %ebx, %ebx
; X86-BMI1BMI2-NEXT: .LBB45_4:		; X86-BMI1BMI2-NEXT: .LBB45_4:
; X86-BMI1BMI2-NEXT: subl $8, %esp		; X86-BMI1BMI2-NEXT: subl $8, %esp
; X86-BMI1BMI2-NEXT: pushl %ebp
; X86-BMI1BMI2-NEXT: pushl %ebx		; X86-BMI1BMI2-NEXT: pushl %ebx
		; X86-BMI1BMI2-NEXT: pushl %ebp
; X86-BMI1BMI2-NEXT: calll use64		; X86-BMI1BMI2-NEXT: calll use64
; X86-BMI1BMI2-NEXT: addl $16, %esp		; X86-BMI1BMI2-NEXT: addl $16, %esp
; X86-BMI1BMI2-NEXT: andl %ebx, %esi		; X86-BMI1BMI2-NEXT: andl %ebp, %esi
; X86-BMI1BMI2-NEXT: andl %ebp, %edi		; X86-BMI1BMI2-NEXT: andl %ebx, %edi
; X86-BMI1BMI2-NEXT: movl %esi, %eax		; X86-BMI1BMI2-NEXT: movl %esi, %eax
; X86-BMI1BMI2-NEXT: movl %edi, %edx		; X86-BMI1BMI2-NEXT: movl %edi, %edx
; X86-BMI1BMI2-NEXT: addl $12, %esp		; X86-BMI1BMI2-NEXT: addl $12, %esp
; X86-BMI1BMI2-NEXT: popl %esi		; X86-BMI1BMI2-NEXT: popl %esi
; X86-BMI1BMI2-NEXT: popl %edi		; X86-BMI1BMI2-NEXT: popl %edi
; X86-BMI1BMI2-NEXT: popl %ebx		; X86-BMI1BMI2-NEXT: popl %ebx
; X86-BMI1BMI2-NEXT: popl %ebp		; X86-BMI1BMI2-NEXT: popl %ebp
; X86-BMI1BMI2-NEXT: retl		; X86-BMI1BMI2-NEXT: retl
[87 lines not shown]
; X86-NOBMI-NEXT: movl %edi, %esi		; X86-NOBMI-NEXT: movl %edi, %esi
; X86-NOBMI-NEXT: xorl %edi, %edi		; X86-NOBMI-NEXT: xorl %edi, %edi
; X86-NOBMI-NEXT: .LBB46_2:		; X86-NOBMI-NEXT: .LBB46_2:
; X86-NOBMI-NEXT: movb $64, %cl		; X86-NOBMI-NEXT: movb $64, %cl
; X86-NOBMI-NEXT: subb {{[0-9]+}}(%esp), %cl		; X86-NOBMI-NEXT: subb {{[0-9]+}}(%esp), %cl
; X86-NOBMI-NEXT: movl $-1, %ebx		; X86-NOBMI-NEXT: movl $-1, %ebx
; X86-NOBMI-NEXT: movl $-1, %ebp		; X86-NOBMI-NEXT: movl $-1, %ebp
; X86-NOBMI-NEXT: shrl %cl, %ebp		; X86-NOBMI-NEXT: shrl %cl, %ebp
; X86-NOBMI-NEXT: shrdl %cl, %ebx, %ebx
; X86-NOBMI-NEXT: testb $32, %cl		; X86-NOBMI-NEXT: testb $32, %cl
; X86-NOBMI-NEXT: je .LBB46_4		; X86-NOBMI-NEXT: je .LBB46_4
; X86-NOBMI-NEXT: # %bb.3:		; X86-NOBMI-NEXT: # %bb.3:
; X86-NOBMI-NEXT: movl %ebp, %ebx		; X86-NOBMI-NEXT: movl %ebp, %ebx
; X86-NOBMI-NEXT: xorl %ebp, %ebp		; X86-NOBMI-NEXT: xorl %ebp, %ebp
; X86-NOBMI-NEXT: .LBB46_4:		; X86-NOBMI-NEXT: .LBB46_4:
; X86-NOBMI-NEXT: subl $8, %esp		; X86-NOBMI-NEXT: subl $8, %esp
; X86-NOBMI-NEXT: pushl %ebp		; X86-NOBMI-NEXT: pushl %ebp
[35 lines not shown]
; X86-BMI1NOTBM-NEXT: movl %edi, %esi		; X86-BMI1NOTBM-NEXT: movl %edi, %esi
; X86-BMI1NOTBM-NEXT: xorl %edi, %edi		; X86-BMI1NOTBM-NEXT: xorl %edi, %edi
; X86-BMI1NOTBM-NEXT: .LBB46_2:		; X86-BMI1NOTBM-NEXT: .LBB46_2:
; X86-BMI1NOTBM-NEXT: movb $64, %cl		; X86-BMI1NOTBM-NEXT: movb $64, %cl
; X86-BMI1NOTBM-NEXT: subb {{[0-9]+}}(%esp), %cl		; X86-BMI1NOTBM-NEXT: subb {{[0-9]+}}(%esp), %cl
; X86-BMI1NOTBM-NEXT: movl $-1, %ebx		; X86-BMI1NOTBM-NEXT: movl $-1, %ebx
; X86-BMI1NOTBM-NEXT: movl $-1, %ebp		; X86-BMI1NOTBM-NEXT: movl $-1, %ebp
; X86-BMI1NOTBM-NEXT: shrl %cl, %ebp		; X86-BMI1NOTBM-NEXT: shrl %cl, %ebp
; X86-BMI1NOTBM-NEXT: shrdl %cl, %ebx, %ebx
; X86-BMI1NOTBM-NEXT: testb $32, %cl		; X86-BMI1NOTBM-NEXT: testb $32, %cl
; X86-BMI1NOTBM-NEXT: je .LBB46_4		; X86-BMI1NOTBM-NEXT: je .LBB46_4
; X86-BMI1NOTBM-NEXT: # %bb.3:		; X86-BMI1NOTBM-NEXT: # %bb.3:
; X86-BMI1NOTBM-NEXT: movl %ebp, %ebx		; X86-BMI1NOTBM-NEXT: movl %ebp, %ebx
; X86-BMI1NOTBM-NEXT: xorl %ebp, %ebp		; X86-BMI1NOTBM-NEXT: xorl %ebp, %ebp
; X86-BMI1NOTBM-NEXT: .LBB46_4:		; X86-BMI1NOTBM-NEXT: .LBB46_4:
; X86-BMI1NOTBM-NEXT: subl $8, %esp		; X86-BMI1NOTBM-NEXT: subl $8, %esp
; X86-BMI1NOTBM-NEXT: pushl %ebp		; X86-BMI1NOTBM-NEXT: pushl %ebp
[29 lines not shown]
; X86-BMI1BMI2-NEXT: shrdl %cl, %eax, %esi		; X86-BMI1BMI2-NEXT: shrdl %cl, %eax, %esi
; X86-BMI1BMI2-NEXT: shrxl %ecx, %eax, %edi		; X86-BMI1BMI2-NEXT: shrxl %ecx, %eax, %edi
; X86-BMI1BMI2-NEXT: testb $32, %cl		; X86-BMI1BMI2-NEXT: testb $32, %cl
; X86-BMI1BMI2-NEXT: je .LBB46_2		; X86-BMI1BMI2-NEXT: je .LBB46_2
; X86-BMI1BMI2-NEXT: # %bb.1:		; X86-BMI1BMI2-NEXT: # %bb.1:
; X86-BMI1BMI2-NEXT: movl %edi, %esi		; X86-BMI1BMI2-NEXT: movl %edi, %esi
; X86-BMI1BMI2-NEXT: xorl %edi, %edi		; X86-BMI1BMI2-NEXT: xorl %edi, %edi
; X86-BMI1BMI2-NEXT: .LBB46_2:		; X86-BMI1BMI2-NEXT: .LBB46_2:
; X86-BMI1BMI2-NEXT: movb $64, %cl		; X86-BMI1BMI2-NEXT: movb $64, %al
; X86-BMI1BMI2-NEXT: subb {{[0-9]+}}(%esp), %cl		; X86-BMI1BMI2-NEXT: subb {{[0-9]+}}(%esp), %al
; X86-BMI1BMI2-NEXT: movl $-1, %ebp		; X86-BMI1BMI2-NEXT: movl $-1, %ebp
; X86-BMI1BMI2-NEXT: shrxl %ecx, %ebp, %ebx		; X86-BMI1BMI2-NEXT: shrxl %eax, %ebp, %ebx
; X86-BMI1BMI2-NEXT: shrdl %cl, %ebp, %ebp		; X86-BMI1BMI2-NEXT: testb $32, %al
; X86-BMI1BMI2-NEXT: testb $32, %cl
; X86-BMI1BMI2-NEXT: je .LBB46_4		; X86-BMI1BMI2-NEXT: je .LBB46_4
; X86-BMI1BMI2-NEXT: # %bb.3:		; X86-BMI1BMI2-NEXT: # %bb.3:
; X86-BMI1BMI2-NEXT: movl %ebx, %ebp		; X86-BMI1BMI2-NEXT: movl %ebx, %ebp
; X86-BMI1BMI2-NEXT: xorl %ebx, %ebx		; X86-BMI1BMI2-NEXT: xorl %ebx, %ebx
; X86-BMI1BMI2-NEXT: .LBB46_4:		; X86-BMI1BMI2-NEXT: .LBB46_4:
; X86-BMI1BMI2-NEXT: subl $8, %esp		; X86-BMI1BMI2-NEXT: subl $8, %esp
; X86-BMI1BMI2-NEXT: pushl %ebx		; X86-BMI1BMI2-NEXT: pushl %ebx
; X86-BMI1BMI2-NEXT: pushl %ebp		; X86-BMI1BMI2-NEXT: pushl %ebp
[108 lines not shown]
; X86-NOBMI-NEXT: shrdl %cl, %esi, %eax		; X86-NOBMI-NEXT: shrdl %cl, %esi, %eax
; X86-NOBMI-NEXT: testb $32, %cl		; X86-NOBMI-NEXT: testb $32, %cl
; X86-NOBMI-NEXT: jne .LBB47_2		; X86-NOBMI-NEXT: jne .LBB47_2
; X86-NOBMI-NEXT: # %bb.1:		; X86-NOBMI-NEXT: # %bb.1:
; X86-NOBMI-NEXT: movl %eax, %edx		; X86-NOBMI-NEXT: movl %eax, %edx
; X86-NOBMI-NEXT: .LBB47_2:		; X86-NOBMI-NEXT: .LBB47_2:
; X86-NOBMI-NEXT: movb $64, %cl		; X86-NOBMI-NEXT: movb $64, %cl
; X86-NOBMI-NEXT: subb {{[0-9]+}}(%esp), %cl		; X86-NOBMI-NEXT: subb {{[0-9]+}}(%esp), %cl
; X86-NOBMI-NEXT: movl $-1, %esi
; X86-NOBMI-NEXT: movl $-1, %eax		; X86-NOBMI-NEXT: movl $-1, %eax
; X86-NOBMI-NEXT: shrl %cl, %eax		; X86-NOBMI-NEXT: shrl %cl, %eax
; X86-NOBMI-NEXT: shrdl %cl, %esi, %esi
; X86-NOBMI-NEXT: testb $32, %cl		; X86-NOBMI-NEXT: testb $32, %cl
; X86-NOBMI-NEXT: jne .LBB47_4		; X86-NOBMI-NEXT: jne .LBB47_4
; X86-NOBMI-NEXT: # %bb.3:		; X86-NOBMI-NEXT: # %bb.3:
; X86-NOBMI-NEXT: movl %esi, %eax		; X86-NOBMI-NEXT: movl $-1, %eax
; X86-NOBMI-NEXT: .LBB47_4:		; X86-NOBMI-NEXT: .LBB47_4:
; X86-NOBMI-NEXT: andl %edx, %eax		; X86-NOBMI-NEXT: andl %edx, %eax
; X86-NOBMI-NEXT: popl %esi		; X86-NOBMI-NEXT: popl %esi
; X86-NOBMI-NEXT: retl		; X86-NOBMI-NEXT: retl
;		;
; X86-BMI1NOTBM-LABEL: bextr64_32_c0:		; X86-BMI1NOTBM-LABEL: bextr64_32_c0:
; X86-BMI1NOTBM: # %bb.0:		; X86-BMI1NOTBM: # %bb.0:
; X86-BMI1NOTBM-NEXT: pushl %esi		; X86-BMI1NOTBM-NEXT: pushl %esi
; X86-BMI1NOTBM-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-BMI1NOTBM-NEXT: movb {{[0-9]+}}(%esp), %cl
; X86-BMI1NOTBM-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-BMI1NOTBM-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-BMI1NOTBM-NEXT: movl {{[0-9]+}}(%esp), %esi		; X86-BMI1NOTBM-NEXT: movl {{[0-9]+}}(%esp), %esi
; X86-BMI1NOTBM-NEXT: movl %esi, %edx		; X86-BMI1NOTBM-NEXT: movl %esi, %edx
; X86-BMI1NOTBM-NEXT: shrl %cl, %edx		; X86-BMI1NOTBM-NEXT: shrl %cl, %edx
; X86-BMI1NOTBM-NEXT: shrdl %cl, %esi, %eax		; X86-BMI1NOTBM-NEXT: shrdl %cl, %esi, %eax
; X86-BMI1NOTBM-NEXT: testb $32, %cl		; X86-BMI1NOTBM-NEXT: testb $32, %cl
; X86-BMI1NOTBM-NEXT: jne .LBB47_2		; X86-BMI1NOTBM-NEXT: jne .LBB47_2
; X86-BMI1NOTBM-NEXT: # %bb.1:		; X86-BMI1NOTBM-NEXT: # %bb.1:
; X86-BMI1NOTBM-NEXT: movl %eax, %edx		; X86-BMI1NOTBM-NEXT: movl %eax, %edx
; X86-BMI1NOTBM-NEXT: .LBB47_2:		; X86-BMI1NOTBM-NEXT: .LBB47_2:
; X86-BMI1NOTBM-NEXT: movb $64, %cl		; X86-BMI1NOTBM-NEXT: movb $64, %cl
; X86-BMI1NOTBM-NEXT: subb {{[0-9]+}}(%esp), %cl		; X86-BMI1NOTBM-NEXT: subb {{[0-9]+}}(%esp), %cl
; X86-BMI1NOTBM-NEXT: movl $-1, %esi
; X86-BMI1NOTBM-NEXT: movl $-1, %eax		; X86-BMI1NOTBM-NEXT: movl $-1, %eax
; X86-BMI1NOTBM-NEXT: shrl %cl, %eax		; X86-BMI1NOTBM-NEXT: shrl %cl, %eax
; X86-BMI1NOTBM-NEXT: shrdl %cl, %esi, %esi
; X86-BMI1NOTBM-NEXT: testb $32, %cl		; X86-BMI1NOTBM-NEXT: testb $32, %cl
; X86-BMI1NOTBM-NEXT: jne .LBB47_4		; X86-BMI1NOTBM-NEXT: jne .LBB47_4
; X86-BMI1NOTBM-NEXT: # %bb.3:		; X86-BMI1NOTBM-NEXT: # %bb.3:
; X86-BMI1NOTBM-NEXT: movl %esi, %eax		; X86-BMI1NOTBM-NEXT: movl $-1, %eax
; X86-BMI1NOTBM-NEXT: .LBB47_4:		; X86-BMI1NOTBM-NEXT: .LBB47_4:
; X86-BMI1NOTBM-NEXT: andl %edx, %eax		; X86-BMI1NOTBM-NEXT: andl %edx, %eax
; X86-BMI1NOTBM-NEXT: popl %esi		; X86-BMI1NOTBM-NEXT: popl %esi
; X86-BMI1NOTBM-NEXT: retl		; X86-BMI1NOTBM-NEXT: retl
;		;
; X86-BMI1BMI2-LABEL: bextr64_32_c0:		; X86-BMI1BMI2-LABEL: bextr64_32_c0:
; X86-BMI1BMI2: # %bb.0:		; X86-BMI1BMI2: # %bb.0:
; X86-BMI1BMI2-NEXT: pushl %esi
; X86-BMI1BMI2-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-BMI1BMI2-NEXT: movb {{[0-9]+}}(%esp), %cl
; X86-BMI1BMI2-NEXT: movl {{[0-9]+}}(%esp), %edx		; X86-BMI1BMI2-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-BMI1BMI2-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-BMI1BMI2-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-BMI1BMI2-NEXT: shrdl %cl, %eax, %edx		; X86-BMI1BMI2-NEXT: shrdl %cl, %eax, %edx
; X86-BMI1BMI2-NEXT: testb $32, %cl		; X86-BMI1BMI2-NEXT: testb $32, %cl
; X86-BMI1BMI2-NEXT: je .LBB47_2		; X86-BMI1BMI2-NEXT: je .LBB47_2
; X86-BMI1BMI2-NEXT: # %bb.1:		; X86-BMI1BMI2-NEXT: # %bb.1:
; X86-BMI1BMI2-NEXT: shrxl %ecx, %eax, %edx		; X86-BMI1BMI2-NEXT: shrxl %ecx, %eax, %edx
; X86-BMI1BMI2-NEXT: .LBB47_2:		; X86-BMI1BMI2-NEXT: .LBB47_2:
; X86-BMI1BMI2-NEXT: movb $64, %cl		; X86-BMI1BMI2-NEXT: movb $64, %cl
; X86-BMI1BMI2-NEXT: subb {{[0-9]+}}(%esp), %cl		; X86-BMI1BMI2-NEXT: subb {{[0-9]+}}(%esp), %cl
; X86-BMI1BMI2-NEXT: movl $-1, %esi
; X86-BMI1BMI2-NEXT: movl $-1, %eax		; X86-BMI1BMI2-NEXT: movl $-1, %eax
; X86-BMI1BMI2-NEXT: shrdl %cl, %eax, %eax
; X86-BMI1BMI2-NEXT: testb $32, %cl		; X86-BMI1BMI2-NEXT: testb $32, %cl
; X86-BMI1BMI2-NEXT: je .LBB47_4		; X86-BMI1BMI2-NEXT: je .LBB47_4
; X86-BMI1BMI2-NEXT: # %bb.3:		; X86-BMI1BMI2-NEXT: # %bb.3:
; X86-BMI1BMI2-NEXT: shrxl %ecx, %esi, %eax		; X86-BMI1BMI2-NEXT: shrxl %ecx, %eax, %eax
; X86-BMI1BMI2-NEXT: .LBB47_4:		; X86-BMI1BMI2-NEXT: .LBB47_4:
; X86-BMI1BMI2-NEXT: andl %edx, %eax		; X86-BMI1BMI2-NEXT: andl %edx, %eax
; X86-BMI1BMI2-NEXT: popl %esi
; X86-BMI1BMI2-NEXT: retl		; X86-BMI1BMI2-NEXT: retl
;		;
; X64-NOBMI-LABEL: bextr64_32_c0:		; X64-NOBMI-LABEL: bextr64_32_c0:
; X64-NOBMI: # %bb.0:		; X64-NOBMI: # %bb.0:
; X64-NOBMI-NEXT: movq %rsi, %rcx		; X64-NOBMI-NEXT: movq %rsi, %rcx
; X64-NOBMI-NEXT: # kill: def $cl killed $cl killed $rcx		; X64-NOBMI-NEXT: # kill: def $cl killed $cl killed $rcx
; X64-NOBMI-NEXT: shrq %cl, %rdi		; X64-NOBMI-NEXT: shrq %cl, %rdi
; X64-NOBMI-NEXT: negb %dl		; X64-NOBMI-NEXT: negb %dl
[2,524 lines not shown]

llvm/test/CodeGen/X86/extract-lowbits.ll

[1,350 lines not shown]	; X64-BMI1BMI2-NEXT: retq
ret i32 %masked		ret i32 %masked
}		}

; 64-bit		; 64-bit

define i64 @bzhi64_b0(i64 %val, i64 %numlowbits) nounwind {		define i64 @bzhi64_b0(i64 %val, i64 %numlowbits) nounwind {
; X86-NOBMI-LABEL: bzhi64_b0:		; X86-NOBMI-LABEL: bzhi64_b0:
; X86-NOBMI: # %bb.0:		; X86-NOBMI: # %bb.0:
		; X86-NOBMI-NEXT: pushl %esi
; X86-NOBMI-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-NOBMI-NEXT: movb {{[0-9]+}}(%esp), %cl
; X86-NOBMI-NEXT: movl $-1, %edx		; X86-NOBMI-NEXT: movl $-1, %edx
; X86-NOBMI-NEXT: movl $-1, %eax		; X86-NOBMI-NEXT: movl $-1, %esi
; X86-NOBMI-NEXT: shll %cl, %eax		; X86-NOBMI-NEXT: shll %cl, %esi
; X86-NOBMI-NEXT: shldl %cl, %edx, %edx
; X86-NOBMI-NEXT: testb $32, %cl
; X86-NOBMI-NEXT: je .LBB20_2
; X86-NOBMI-NEXT: # %bb.1:
; X86-NOBMI-NEXT: movl %eax, %edx
; X86-NOBMI-NEXT: xorl %eax, %eax		; X86-NOBMI-NEXT: xorl %eax, %eax
; X86-NOBMI-NEXT: .LBB20_2:		; X86-NOBMI-NEXT: testb $32, %cl
		; X86-NOBMI-NEXT: jne .LBB20_1
		; X86-NOBMI-NEXT: # %bb.2:
		; X86-NOBMI-NEXT: movl %esi, %eax
		; X86-NOBMI-NEXT: jmp .LBB20_3
		; X86-NOBMI-NEXT: .LBB20_1:
		; X86-NOBMI-NEXT: movl %esi, %edx
		; X86-NOBMI-NEXT: .LBB20_3:
; X86-NOBMI-NEXT: notl %edx		; X86-NOBMI-NEXT: notl %edx
; X86-NOBMI-NEXT: notl %eax		; X86-NOBMI-NEXT: notl %eax
; X86-NOBMI-NEXT: andl {{[0-9]+}}(%esp), %edx
; X86-NOBMI-NEXT: andl {{[0-9]+}}(%esp), %eax		; X86-NOBMI-NEXT: andl {{[0-9]+}}(%esp), %eax
		; X86-NOBMI-NEXT: andl {{[0-9]+}}(%esp), %edx
		; X86-NOBMI-NEXT: popl %esi
; X86-NOBMI-NEXT: retl		; X86-NOBMI-NEXT: retl
;		;
; X86-BMI1NOTBM-LABEL: bzhi64_b0:		; X86-BMI1NOTBM-LABEL: bzhi64_b0:
; X86-BMI1NOTBM: # %bb.0:		; X86-BMI1NOTBM: # %bb.0:
; X86-BMI1NOTBM-NEXT: pushl %esi
; X86-BMI1NOTBM-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-BMI1NOTBM-NEXT: movb {{[0-9]+}}(%esp), %cl
		; X86-BMI1NOTBM-NEXT: movl $-1, %edx
; X86-BMI1NOTBM-NEXT: movl $-1, %eax		; X86-BMI1NOTBM-NEXT: movl $-1, %eax
; X86-BMI1NOTBM-NEXT: movl $-1, %esi		; X86-BMI1NOTBM-NEXT: shll %cl, %eax
; X86-BMI1NOTBM-NEXT: shll %cl, %esi
; X86-BMI1NOTBM-NEXT: shldl %cl, %eax, %eax
; X86-BMI1NOTBM-NEXT: testb $32, %cl		; X86-BMI1NOTBM-NEXT: testb $32, %cl
; X86-BMI1NOTBM-NEXT: je .LBB20_2		; X86-BMI1NOTBM-NEXT: je .LBB20_2
; X86-BMI1NOTBM-NEXT: # %bb.1:		; X86-BMI1NOTBM-NEXT: # %bb.1:
; X86-BMI1NOTBM-NEXT: movl %esi, %eax		; X86-BMI1NOTBM-NEXT: movl %eax, %edx
; X86-BMI1NOTBM-NEXT: xorl %esi, %esi		; X86-BMI1NOTBM-NEXT: xorl %eax, %eax
; X86-BMI1NOTBM-NEXT: .LBB20_2:		; X86-BMI1NOTBM-NEXT: .LBB20_2:
; X86-BMI1NOTBM-NEXT: andnl {{[0-9]+}}(%esp), %eax, %edx		; X86-BMI1NOTBM-NEXT: andnl {{[0-9]+}}(%esp), %eax, %eax
; X86-BMI1NOTBM-NEXT: andnl {{[0-9]+}}(%esp), %esi, %eax		; X86-BMI1NOTBM-NEXT: andnl {{[0-9]+}}(%esp), %edx, %edx
; X86-BMI1NOTBM-NEXT: popl %esi
; X86-BMI1NOTBM-NEXT: retl		; X86-BMI1NOTBM-NEXT: retl
;		;
; X86-BMI1BMI2-LABEL: bzhi64_b0:		; X86-BMI1BMI2-LABEL: bzhi64_b0:
; X86-BMI1BMI2: # %bb.0:		; X86-BMI1BMI2: # %bb.0:
; X86-BMI1BMI2-NEXT: pushl %esi		; X86-BMI1BMI2-NEXT: movb {{[0-9]+}}(%esp), %dl
; X86-BMI1BMI2-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-BMI1BMI2-NEXT: movl $-1, %ecx
; X86-BMI1BMI2-NEXT: movl $-1, %eax		; X86-BMI1BMI2-NEXT: shlxl %edx, %ecx, %eax
; X86-BMI1BMI2-NEXT: shlxl %ecx, %eax, %esi		; X86-BMI1BMI2-NEXT: testb $32, %dl
; X86-BMI1BMI2-NEXT: shldl %cl, %eax, %eax
; X86-BMI1BMI2-NEXT: testb $32, %cl
; X86-BMI1BMI2-NEXT: je .LBB20_2		; X86-BMI1BMI2-NEXT: je .LBB20_2
; X86-BMI1BMI2-NEXT: # %bb.1:		; X86-BMI1BMI2-NEXT: # %bb.1:
; X86-BMI1BMI2-NEXT: movl %esi, %eax		; X86-BMI1BMI2-NEXT: movl %eax, %ecx
; X86-BMI1BMI2-NEXT: xorl %esi, %esi		; X86-BMI1BMI2-NEXT: xorl %eax, %eax
; X86-BMI1BMI2-NEXT: .LBB20_2:		; X86-BMI1BMI2-NEXT: .LBB20_2:
; X86-BMI1BMI2-NEXT: andnl {{[0-9]+}}(%esp), %eax, %edx		; X86-BMI1BMI2-NEXT: andnl {{[0-9]+}}(%esp), %eax, %eax
; X86-BMI1BMI2-NEXT: andnl {{[0-9]+}}(%esp), %esi, %eax		; X86-BMI1BMI2-NEXT: andnl {{[0-9]+}}(%esp), %ecx, %edx
; X86-BMI1BMI2-NEXT: popl %esi
; X86-BMI1BMI2-NEXT: retl		; X86-BMI1BMI2-NEXT: retl
;		;
; X64-NOBMI-LABEL: bzhi64_b0:		; X64-NOBMI-LABEL: bzhi64_b0:
; X64-NOBMI: # %bb.0:		; X64-NOBMI: # %bb.0:
; X64-NOBMI-NEXT: movq %rsi, %rcx		; X64-NOBMI-NEXT: movq %rsi, %rcx
; X64-NOBMI-NEXT: movq $-1, %rax		; X64-NOBMI-NEXT: movq $-1, %rax
; X64-NOBMI-NEXT: # kill: def $cl killed $cl killed $rcx		; X64-NOBMI-NEXT: # kill: def $cl killed $cl killed $rcx
; X64-NOBMI-NEXT: shlq %cl, %rax		; X64-NOBMI-NEXT: shlq %cl, %rax
[15 lines not shown]	; X64-BMI1BMI2-NEXT: retq
%mask = xor i64 %notmask, -1		%mask = xor i64 %notmask, -1
%masked = and i64 %mask, %val		%masked = and i64 %mask, %val
ret i64 %masked		ret i64 %masked
}		}

define i64 @bzhi64_b1_indexzext(i64 %val, i8 zeroext %numlowbits) nounwind {		define i64 @bzhi64_b1_indexzext(i64 %val, i8 zeroext %numlowbits) nounwind {
; X86-NOBMI-LABEL: bzhi64_b1_indexzext:		; X86-NOBMI-LABEL: bzhi64_b1_indexzext:
; X86-NOBMI: # %bb.0:		; X86-NOBMI: # %bb.0:
		; X86-NOBMI-NEXT: pushl %esi
; X86-NOBMI-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-NOBMI-NEXT: movb {{[0-9]+}}(%esp), %cl
; X86-NOBMI-NEXT: movl $-1, %edx		; X86-NOBMI-NEXT: movl $-1, %edx
; X86-NOBMI-NEXT: movl $-1, %eax		; X86-NOBMI-NEXT: movl $-1, %esi
; X86-NOBMI-NEXT: shll %cl, %eax		; X86-NOBMI-NEXT: shll %cl, %esi
; X86-NOBMI-NEXT: shldl %cl, %edx, %edx
; X86-NOBMI-NEXT: testb $32, %cl
; X86-NOBMI-NEXT: je .LBB21_2
; X86-NOBMI-NEXT: # %bb.1:
; X86-NOBMI-NEXT: movl %eax, %edx
; X86-NOBMI-NEXT: xorl %eax, %eax		; X86-NOBMI-NEXT: xorl %eax, %eax
; X86-NOBMI-NEXT: .LBB21_2:		; X86-NOBMI-NEXT: testb $32, %cl
		; X86-NOBMI-NEXT: jne .LBB21_1
		; X86-NOBMI-NEXT: # %bb.2:
		; X86-NOBMI-NEXT: movl %esi, %eax
		; X86-NOBMI-NEXT: jmp .LBB21_3
		; X86-NOBMI-NEXT: .LBB21_1:
		; X86-NOBMI-NEXT: movl %esi, %edx
		; X86-NOBMI-NEXT: .LBB21_3:
; X86-NOBMI-NEXT: notl %edx		; X86-NOBMI-NEXT: notl %edx
; X86-NOBMI-NEXT: notl %eax		; X86-NOBMI-NEXT: notl %eax
; X86-NOBMI-NEXT: andl {{[0-9]+}}(%esp), %edx
; X86-NOBMI-NEXT: andl {{[0-9]+}}(%esp), %eax		; X86-NOBMI-NEXT: andl {{[0-9]+}}(%esp), %eax
		; X86-NOBMI-NEXT: andl {{[0-9]+}}(%esp), %edx
		; X86-NOBMI-NEXT: popl %esi
; X86-NOBMI-NEXT: retl		; X86-NOBMI-NEXT: retl
;		;
; X86-BMI1NOTBM-LABEL: bzhi64_b1_indexzext:		; X86-BMI1NOTBM-LABEL: bzhi64_b1_indexzext:
; X86-BMI1NOTBM: # %bb.0:		; X86-BMI1NOTBM: # %bb.0:
; X86-BMI1NOTBM-NEXT: pushl %esi
; X86-BMI1NOTBM-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-BMI1NOTBM-NEXT: movb {{[0-9]+}}(%esp), %cl
		; X86-BMI1NOTBM-NEXT: movl $-1, %edx
; X86-BMI1NOTBM-NEXT: movl $-1, %eax		; X86-BMI1NOTBM-NEXT: movl $-1, %eax
; X86-BMI1NOTBM-NEXT: movl $-1, %esi		; X86-BMI1NOTBM-NEXT: shll %cl, %eax
; X86-BMI1NOTBM-NEXT: shll %cl, %esi
; X86-BMI1NOTBM-NEXT: shldl %cl, %eax, %eax
; X86-BMI1NOTBM-NEXT: testb $32, %cl		; X86-BMI1NOTBM-NEXT: testb $32, %cl
; X86-BMI1NOTBM-NEXT: je .LBB21_2		; X86-BMI1NOTBM-NEXT: je .LBB21_2
; X86-BMI1NOTBM-NEXT: # %bb.1:		; X86-BMI1NOTBM-NEXT: # %bb.1:
; X86-BMI1NOTBM-NEXT: movl %esi, %eax		; X86-BMI1NOTBM-NEXT: movl %eax, %edx
; X86-BMI1NOTBM-NEXT: xorl %esi, %esi		; X86-BMI1NOTBM-NEXT: xorl %eax, %eax
; X86-BMI1NOTBM-NEXT: .LBB21_2:		; X86-BMI1NOTBM-NEXT: .LBB21_2:
; X86-BMI1NOTBM-NEXT: andnl {{[0-9]+}}(%esp), %eax, %edx		; X86-BMI1NOTBM-NEXT: andnl {{[0-9]+}}(%esp), %eax, %eax
; X86-BMI1NOTBM-NEXT: andnl {{[0-9]+}}(%esp), %esi, %eax		; X86-BMI1NOTBM-NEXT: andnl {{[0-9]+}}(%esp), %edx, %edx
; X86-BMI1NOTBM-NEXT: popl %esi
; X86-BMI1NOTBM-NEXT: retl		; X86-BMI1NOTBM-NEXT: retl
;		;
; X86-BMI1BMI2-LABEL: bzhi64_b1_indexzext:		; X86-BMI1BMI2-LABEL: bzhi64_b1_indexzext:
; X86-BMI1BMI2: # %bb.0:		; X86-BMI1BMI2: # %bb.0:
; X86-BMI1BMI2-NEXT: pushl %esi		; X86-BMI1BMI2-NEXT: movb {{[0-9]+}}(%esp), %dl
; X86-BMI1BMI2-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-BMI1BMI2-NEXT: movl $-1, %ecx
; X86-BMI1BMI2-NEXT: movl $-1, %eax		; X86-BMI1BMI2-NEXT: shlxl %edx, %ecx, %eax
; X86-BMI1BMI2-NEXT: shlxl %ecx, %eax, %esi		; X86-BMI1BMI2-NEXT: testb $32, %dl
; X86-BMI1BMI2-NEXT: shldl %cl, %eax, %eax
; X86-BMI1BMI2-NEXT: testb $32, %cl
; X86-BMI1BMI2-NEXT: je .LBB21_2		; X86-BMI1BMI2-NEXT: je .LBB21_2
; X86-BMI1BMI2-NEXT: # %bb.1:		; X86-BMI1BMI2-NEXT: # %bb.1:
; X86-BMI1BMI2-NEXT: movl %esi, %eax		; X86-BMI1BMI2-NEXT: movl %eax, %ecx
; X86-BMI1BMI2-NEXT: xorl %esi, %esi		; X86-BMI1BMI2-NEXT: xorl %eax, %eax
; X86-BMI1BMI2-NEXT: .LBB21_2:		; X86-BMI1BMI2-NEXT: .LBB21_2:
; X86-BMI1BMI2-NEXT: andnl {{[0-9]+}}(%esp), %eax, %edx		; X86-BMI1BMI2-NEXT: andnl {{[0-9]+}}(%esp), %eax, %eax
; X86-BMI1BMI2-NEXT: andnl {{[0-9]+}}(%esp), %esi, %eax		; X86-BMI1BMI2-NEXT: andnl {{[0-9]+}}(%esp), %ecx, %edx
; X86-BMI1BMI2-NEXT: popl %esi
; X86-BMI1BMI2-NEXT: retl		; X86-BMI1BMI2-NEXT: retl
;		;
; X64-NOBMI-LABEL: bzhi64_b1_indexzext:		; X64-NOBMI-LABEL: bzhi64_b1_indexzext:
; X64-NOBMI: # %bb.0:		; X64-NOBMI: # %bb.0:
; X64-NOBMI-NEXT: movl %esi, %ecx		; X64-NOBMI-NEXT: movl %esi, %ecx
; X64-NOBMI-NEXT: movq $-1, %rax		; X64-NOBMI-NEXT: movq $-1, %rax
; X64-NOBMI-NEXT: # kill: def $cl killed $cl killed $ecx		; X64-NOBMI-NEXT: # kill: def $cl killed $cl killed $ecx
; X64-NOBMI-NEXT: shlq %cl, %rax		; X64-NOBMI-NEXT: shlq %cl, %rax
[18 lines not shown]	; X64-BMI1BMI2-NEXT: retq
%mask = xor i64 %notmask, -1		%mask = xor i64 %notmask, -1
%masked = and i64 %mask, %val		%masked = and i64 %mask, %val
ret i64 %masked		ret i64 %masked
}		}

define i64 @bzhi64_b2_load(i64* %w, i64 %numlowbits) nounwind {		define i64 @bzhi64_b2_load(i64* %w, i64 %numlowbits) nounwind {
; X86-NOBMI-LABEL: bzhi64_b2_load:		; X86-NOBMI-LABEL: bzhi64_b2_load:
; X86-NOBMI: # %bb.0:		; X86-NOBMI: # %bb.0:
		; X86-NOBMI-NEXT: pushl %edi
; X86-NOBMI-NEXT: pushl %esi		; X86-NOBMI-NEXT: pushl %esi
; X86-NOBMI-NEXT: movl {{[0-9]+}}(%esp), %esi		; X86-NOBMI-NEXT: movl {{[0-9]+}}(%esp), %esi
; X86-NOBMI-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-NOBMI-NEXT: movb {{[0-9]+}}(%esp), %cl
; X86-NOBMI-NEXT: movl $-1, %edx		; X86-NOBMI-NEXT: movl $-1, %edx
; X86-NOBMI-NEXT: movl $-1, %eax		; X86-NOBMI-NEXT: movl $-1, %edi
; X86-NOBMI-NEXT: shll %cl, %eax		; X86-NOBMI-NEXT: shll %cl, %edi
; X86-NOBMI-NEXT: shldl %cl, %edx, %edx
; X86-NOBMI-NEXT: testb $32, %cl
; X86-NOBMI-NEXT: je .LBB22_2
; X86-NOBMI-NEXT: # %bb.1:
; X86-NOBMI-NEXT: movl %eax, %edx
; X86-NOBMI-NEXT: xorl %eax, %eax		; X86-NOBMI-NEXT: xorl %eax, %eax
; X86-NOBMI-NEXT: .LBB22_2:		; X86-NOBMI-NEXT: testb $32, %cl
		; X86-NOBMI-NEXT: jne .LBB22_1
		; X86-NOBMI-NEXT: # %bb.2:
		; X86-NOBMI-NEXT: movl %edi, %eax
		; X86-NOBMI-NEXT: jmp .LBB22_3
		; X86-NOBMI-NEXT: .LBB22_1:
		; X86-NOBMI-NEXT: movl %edi, %edx
		; X86-NOBMI-NEXT: .LBB22_3:
; X86-NOBMI-NEXT: notl %edx		; X86-NOBMI-NEXT: notl %edx
; X86-NOBMI-NEXT: notl %eax		; X86-NOBMI-NEXT: notl %eax
; X86-NOBMI-NEXT: andl 4(%esi), %edx
; X86-NOBMI-NEXT: andl (%esi), %eax		; X86-NOBMI-NEXT: andl (%esi), %eax
		; X86-NOBMI-NEXT: andl 4(%esi), %edx
; X86-NOBMI-NEXT: popl %esi		; X86-NOBMI-NEXT: popl %esi
		; X86-NOBMI-NEXT: popl %edi
; X86-NOBMI-NEXT: retl		; X86-NOBMI-NEXT: retl
;		;
; X86-BMI1NOTBM-LABEL: bzhi64_b2_load:		; X86-BMI1NOTBM-LABEL: bzhi64_b2_load:
; X86-BMI1NOTBM: # %bb.0:		; X86-BMI1NOTBM: # %bb.0:
; X86-BMI1NOTBM-NEXT: pushl %esi		; X86-BMI1NOTBM-NEXT: pushl %esi
; X86-BMI1NOTBM-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-BMI1NOTBM-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-BMI1NOTBM-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-BMI1NOTBM-NEXT: movb {{[0-9]+}}(%esp), %cl
; X86-BMI1NOTBM-NEXT: movl $-1, %edx
; X86-BMI1NOTBM-NEXT: movl $-1, %esi		; X86-BMI1NOTBM-NEXT: movl $-1, %esi
; X86-BMI1NOTBM-NEXT: shll %cl, %esi		; X86-BMI1NOTBM-NEXT: movl $-1, %eax
; X86-BMI1NOTBM-NEXT: shldl %cl, %edx, %edx		; X86-BMI1NOTBM-NEXT: shll %cl, %eax
; X86-BMI1NOTBM-NEXT: testb $32, %cl		; X86-BMI1NOTBM-NEXT: testb $32, %cl
; X86-BMI1NOTBM-NEXT: je .LBB22_2		; X86-BMI1NOTBM-NEXT: je .LBB22_2
; X86-BMI1NOTBM-NEXT: # %bb.1:		; X86-BMI1NOTBM-NEXT: # %bb.1:
; X86-BMI1NOTBM-NEXT: movl %esi, %edx		; X86-BMI1NOTBM-NEXT: movl %eax, %esi
; X86-BMI1NOTBM-NEXT: xorl %esi, %esi		; X86-BMI1NOTBM-NEXT: xorl %eax, %eax
; X86-BMI1NOTBM-NEXT: .LBB22_2:		; X86-BMI1NOTBM-NEXT: .LBB22_2:
; X86-BMI1NOTBM-NEXT: andnl 4(%eax), %edx, %edx		; X86-BMI1NOTBM-NEXT: andnl (%edx), %eax, %eax
; X86-BMI1NOTBM-NEXT: andnl (%eax), %esi, %eax		; X86-BMI1NOTBM-NEXT: andnl 4(%edx), %esi, %edx
; X86-BMI1NOTBM-NEXT: popl %esi		; X86-BMI1NOTBM-NEXT: popl %esi
; X86-BMI1NOTBM-NEXT: retl		; X86-BMI1NOTBM-NEXT: retl
;		;
; X86-BMI1BMI2-LABEL: bzhi64_b2_load:		; X86-BMI1BMI2-LABEL: bzhi64_b2_load:
; X86-BMI1BMI2: # %bb.0:		; X86-BMI1BMI2: # %bb.0:
; X86-BMI1BMI2-NEXT: pushl %esi		; X86-BMI1BMI2-NEXT: pushl %ebx
; X86-BMI1BMI2-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-BMI1BMI2-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-BMI1BMI2-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-BMI1BMI2-NEXT: movb {{[0-9]+}}(%esp), %bl
; X86-BMI1BMI2-NEXT: movl $-1, %edx		; X86-BMI1BMI2-NEXT: movl $-1, %edx
; X86-BMI1BMI2-NEXT: shlxl %ecx, %edx, %esi		; X86-BMI1BMI2-NEXT: shlxl %ebx, %edx, %eax
; X86-BMI1BMI2-NEXT: shldl %cl, %edx, %edx		; X86-BMI1BMI2-NEXT: testb $32, %bl
; X86-BMI1BMI2-NEXT: testb $32, %cl
; X86-BMI1BMI2-NEXT: je .LBB22_2		; X86-BMI1BMI2-NEXT: je .LBB22_2
; X86-BMI1BMI2-NEXT: # %bb.1:		; X86-BMI1BMI2-NEXT: # %bb.1:
; X86-BMI1BMI2-NEXT: movl %esi, %edx		; X86-BMI1BMI2-NEXT: movl %eax, %edx
; X86-BMI1BMI2-NEXT: xorl %esi, %esi		; X86-BMI1BMI2-NEXT: xorl %eax, %eax
; X86-BMI1BMI2-NEXT: .LBB22_2:		; X86-BMI1BMI2-NEXT: .LBB22_2:
; X86-BMI1BMI2-NEXT: andnl 4(%eax), %edx, %edx		; X86-BMI1BMI2-NEXT: andnl (%ecx), %eax, %eax
; X86-BMI1BMI2-NEXT: andnl (%eax), %esi, %eax		; X86-BMI1BMI2-NEXT: andnl 4(%ecx), %edx, %edx
; X86-BMI1BMI2-NEXT: popl %esi		; X86-BMI1BMI2-NEXT: popl %ebx
; X86-BMI1BMI2-NEXT: retl		; X86-BMI1BMI2-NEXT: retl
;		;
; X64-NOBMI-LABEL: bzhi64_b2_load:		; X64-NOBMI-LABEL: bzhi64_b2_load:
; X64-NOBMI: # %bb.0:		; X64-NOBMI: # %bb.0:
; X64-NOBMI-NEXT: movq %rsi, %rcx		; X64-NOBMI-NEXT: movq %rsi, %rcx
; X64-NOBMI-NEXT: movq $-1, %rax		; X64-NOBMI-NEXT: movq $-1, %rax
; X64-NOBMI-NEXT: # kill: def $cl killed $cl killed $rcx		; X64-NOBMI-NEXT: # kill: def $cl killed $cl killed $rcx
; X64-NOBMI-NEXT: shlq %cl, %rax		; X64-NOBMI-NEXT: shlq %cl, %rax
[16 lines not shown]	; X64-BMI1BMI2-NEXT: retq
%mask = xor i64 %notmask, -1		%mask = xor i64 %notmask, -1
%masked = and i64 %mask, %val		%masked = and i64 %mask, %val
ret i64 %masked		ret i64 %masked
}		}

define i64 @bzhi64_b3_load_indexzext(i64* %w, i8 zeroext %numlowbits) nounwind {		define i64 @bzhi64_b3_load_indexzext(i64* %w, i8 zeroext %numlowbits) nounwind {
; X86-NOBMI-LABEL: bzhi64_b3_load_indexzext:		; X86-NOBMI-LABEL: bzhi64_b3_load_indexzext:
; X86-NOBMI: # %bb.0:		; X86-NOBMI: # %bb.0:
		; X86-NOBMI-NEXT: pushl %edi
; X86-NOBMI-NEXT: pushl %esi		; X86-NOBMI-NEXT: pushl %esi
; X86-NOBMI-NEXT: movl {{[0-9]+}}(%esp), %esi		; X86-NOBMI-NEXT: movl {{[0-9]+}}(%esp), %esi
; X86-NOBMI-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-NOBMI-NEXT: movb {{[0-9]+}}(%esp), %cl
; X86-NOBMI-NEXT: movl $-1, %edx		; X86-NOBMI-NEXT: movl $-1, %edx
; X86-NOBMI-NEXT: movl $-1, %eax		; X86-NOBMI-NEXT: movl $-1, %edi
; X86-NOBMI-NEXT: shll %cl, %eax		; X86-NOBMI-NEXT: shll %cl, %edi
; X86-NOBMI-NEXT: shldl %cl, %edx, %edx
; X86-NOBMI-NEXT: testb $32, %cl
; X86-NOBMI-NEXT: je .LBB23_2
; X86-NOBMI-NEXT: # %bb.1:
; X86-NOBMI-NEXT: movl %eax, %edx
; X86-NOBMI-NEXT: xorl %eax, %eax		; X86-NOBMI-NEXT: xorl %eax, %eax
; X86-NOBMI-NEXT: .LBB23_2:		; X86-NOBMI-NEXT: testb $32, %cl
		; X86-NOBMI-NEXT: jne .LBB23_1
		; X86-NOBMI-NEXT: # %bb.2:
		; X86-NOBMI-NEXT: movl %edi, %eax
		; X86-NOBMI-NEXT: jmp .LBB23_3
		; X86-NOBMI-NEXT: .LBB23_1:
		; X86-NOBMI-NEXT: movl %edi, %edx
		; X86-NOBMI-NEXT: .LBB23_3:
; X86-NOBMI-NEXT: notl %edx		; X86-NOBMI-NEXT: notl %edx
; X86-NOBMI-NEXT: notl %eax		; X86-NOBMI-NEXT: notl %eax
; X86-NOBMI-NEXT: andl 4(%esi), %edx
; X86-NOBMI-NEXT: andl (%esi), %eax		; X86-NOBMI-NEXT: andl (%esi), %eax
		; X86-NOBMI-NEXT: andl 4(%esi), %edx
; X86-NOBMI-NEXT: popl %esi		; X86-NOBMI-NEXT: popl %esi
		; X86-NOBMI-NEXT: popl %edi
; X86-NOBMI-NEXT: retl		; X86-NOBMI-NEXT: retl
;		;
; X86-BMI1NOTBM-LABEL: bzhi64_b3_load_indexzext:		; X86-BMI1NOTBM-LABEL: bzhi64_b3_load_indexzext:
; X86-BMI1NOTBM: # %bb.0:		; X86-BMI1NOTBM: # %bb.0:
; X86-BMI1NOTBM-NEXT: pushl %esi		; X86-BMI1NOTBM-NEXT: pushl %esi
; X86-BMI1NOTBM-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-BMI1NOTBM-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-BMI1NOTBM-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-BMI1NOTBM-NEXT: movb {{[0-9]+}}(%esp), %cl
; X86-BMI1NOTBM-NEXT: movl $-1, %edx
; X86-BMI1NOTBM-NEXT: movl $-1, %esi		; X86-BMI1NOTBM-NEXT: movl $-1, %esi
; X86-BMI1NOTBM-NEXT: shll %cl, %esi		; X86-BMI1NOTBM-NEXT: movl $-1, %eax
; X86-BMI1NOTBM-NEXT: shldl %cl, %edx, %edx		; X86-BMI1NOTBM-NEXT: shll %cl, %eax
; X86-BMI1NOTBM-NEXT: testb $32, %cl		; X86-BMI1NOTBM-NEXT: testb $32, %cl
; X86-BMI1NOTBM-NEXT: je .LBB23_2		; X86-BMI1NOTBM-NEXT: je .LBB23_2
; X86-BMI1NOTBM-NEXT: # %bb.1:		; X86-BMI1NOTBM-NEXT: # %bb.1:
; X86-BMI1NOTBM-NEXT: movl %esi, %edx		; X86-BMI1NOTBM-NEXT: movl %eax, %esi
; X86-BMI1NOTBM-NEXT: xorl %esi, %esi		; X86-BMI1NOTBM-NEXT: xorl %eax, %eax
; X86-BMI1NOTBM-NEXT: .LBB23_2:		; X86-BMI1NOTBM-NEXT: .LBB23_2:
; X86-BMI1NOTBM-NEXT: andnl 4(%eax), %edx, %edx		; X86-BMI1NOTBM-NEXT: andnl (%edx), %eax, %eax
; X86-BMI1NOTBM-NEXT: andnl (%eax), %esi, %eax		; X86-BMI1NOTBM-NEXT: andnl 4(%edx), %esi, %edx
; X86-BMI1NOTBM-NEXT: popl %esi		; X86-BMI1NOTBM-NEXT: popl %esi
; X86-BMI1NOTBM-NEXT: retl		; X86-BMI1NOTBM-NEXT: retl
;		;
; X86-BMI1BMI2-LABEL: bzhi64_b3_load_indexzext:		; X86-BMI1BMI2-LABEL: bzhi64_b3_load_indexzext:
; X86-BMI1BMI2: # %bb.0:		; X86-BMI1BMI2: # %bb.0:
; X86-BMI1BMI2-NEXT: pushl %esi		; X86-BMI1BMI2-NEXT: pushl %ebx
; X86-BMI1BMI2-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-BMI1BMI2-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-BMI1BMI2-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-BMI1BMI2-NEXT: movb {{[0-9]+}}(%esp), %bl
; X86-BMI1BMI2-NEXT: movl $-1, %edx		; X86-BMI1BMI2-NEXT: movl $-1, %edx
; X86-BMI1BMI2-NEXT: shlxl %ecx, %edx, %esi		; X86-BMI1BMI2-NEXT: shlxl %ebx, %edx, %eax
; X86-BMI1BMI2-NEXT: shldl %cl, %edx, %edx		; X86-BMI1BMI2-NEXT: testb $32, %bl
; X86-BMI1BMI2-NEXT: testb $32, %cl
; X86-BMI1BMI2-NEXT: je .LBB23_2		; X86-BMI1BMI2-NEXT: je .LBB23_2
; X86-BMI1BMI2-NEXT: # %bb.1:		; X86-BMI1BMI2-NEXT: # %bb.1:
; X86-BMI1BMI2-NEXT: movl %esi, %edx		; X86-BMI1BMI2-NEXT: movl %eax, %edx
; X86-BMI1BMI2-NEXT: xorl %esi, %esi		; X86-BMI1BMI2-NEXT: xorl %eax, %eax
; X86-BMI1BMI2-NEXT: .LBB23_2:		; X86-BMI1BMI2-NEXT: .LBB23_2:
; X86-BMI1BMI2-NEXT: andnl 4(%eax), %edx, %edx		; X86-BMI1BMI2-NEXT: andnl (%ecx), %eax, %eax
; X86-BMI1BMI2-NEXT: andnl (%eax), %esi, %eax		; X86-BMI1BMI2-NEXT: andnl 4(%ecx), %edx, %edx
; X86-BMI1BMI2-NEXT: popl %esi		; X86-BMI1BMI2-NEXT: popl %ebx
; X86-BMI1BMI2-NEXT: retl		; X86-BMI1BMI2-NEXT: retl
;		;
; X64-NOBMI-LABEL: bzhi64_b3_load_indexzext:		; X64-NOBMI-LABEL: bzhi64_b3_load_indexzext:
; X64-NOBMI: # %bb.0:		; X64-NOBMI: # %bb.0:
; X64-NOBMI-NEXT: movl %esi, %ecx		; X64-NOBMI-NEXT: movl %esi, %ecx
; X64-NOBMI-NEXT: movq $-1, %rax		; X64-NOBMI-NEXT: movq $-1, %rax
; X64-NOBMI-NEXT: # kill: def $cl killed $cl killed $ecx		; X64-NOBMI-NEXT: # kill: def $cl killed $cl killed $ecx
; X64-NOBMI-NEXT: shlq %cl, %rax		; X64-NOBMI-NEXT: shlq %cl, %rax
		; X64-BMI1BMI2-NEXT: retq
%mask = xor i64 %notmask, -1		%mask = xor i64 %notmask, -1
%masked = and i64 %mask, %val		%masked = and i64 %mask, %val
ret i64 %masked		ret i64 %masked
}		}

define i64 @bzhi64_b4_commutative(i64 %val, i64 %numlowbits) nounwind {		define i64 @bzhi64_b4_commutative(i64 %val, i64 %numlowbits) nounwind {
; X86-NOBMI-LABEL: bzhi64_b4_commutative:		; X86-NOBMI-LABEL: bzhi64_b4_commutative:
; X86-NOBMI: # %bb.0:		; X86-NOBMI: # %bb.0:
		; X86-NOBMI-NEXT: pushl %esi
; X86-NOBMI-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-NOBMI-NEXT: movb {{[0-9]+}}(%esp), %cl
; X86-NOBMI-NEXT: movl $-1, %edx		; X86-NOBMI-NEXT: movl $-1, %edx
; X86-NOBMI-NEXT: movl $-1, %eax		; X86-NOBMI-NEXT: movl $-1, %esi
; X86-NOBMI-NEXT: shll %cl, %eax		; X86-NOBMI-NEXT: shll %cl, %esi
; X86-NOBMI-NEXT: shldl %cl, %edx, %edx
; X86-NOBMI-NEXT: testb $32, %cl
; X86-NOBMI-NEXT: je .LBB24_2
; X86-NOBMI-NEXT: # %bb.1:
; X86-NOBMI-NEXT: movl %eax, %edx
; X86-NOBMI-NEXT: xorl %eax, %eax		; X86-NOBMI-NEXT: xorl %eax, %eax
; X86-NOBMI-NEXT: .LBB24_2:		; X86-NOBMI-NEXT: testb $32, %cl
		; X86-NOBMI-NEXT: jne .LBB24_1
		; X86-NOBMI-NEXT: # %bb.2:
		; X86-NOBMI-NEXT: movl %esi, %eax
		; X86-NOBMI-NEXT: jmp .LBB24_3
		; X86-NOBMI-NEXT: .LBB24_1:
		; X86-NOBMI-NEXT: movl %esi, %edx
		; X86-NOBMI-NEXT: .LBB24_3:
; X86-NOBMI-NEXT: notl %edx		; X86-NOBMI-NEXT: notl %edx
; X86-NOBMI-NEXT: notl %eax		; X86-NOBMI-NEXT: notl %eax
; X86-NOBMI-NEXT: andl {{[0-9]+}}(%esp), %edx
; X86-NOBMI-NEXT: andl {{[0-9]+}}(%esp), %eax		; X86-NOBMI-NEXT: andl {{[0-9]+}}(%esp), %eax
		; X86-NOBMI-NEXT: andl {{[0-9]+}}(%esp), %edx
		; X86-NOBMI-NEXT: popl %esi
; X86-NOBMI-NEXT: retl		; X86-NOBMI-NEXT: retl
;		;
; X86-BMI1NOTBM-LABEL: bzhi64_b4_commutative:		; X86-BMI1NOTBM-LABEL: bzhi64_b4_commutative:
; X86-BMI1NOTBM: # %bb.0:		; X86-BMI1NOTBM: # %bb.0:
; X86-BMI1NOTBM-NEXT: pushl %esi
; X86-BMI1NOTBM-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-BMI1NOTBM-NEXT: movb {{[0-9]+}}(%esp), %cl
		; X86-BMI1NOTBM-NEXT: movl $-1, %edx
; X86-BMI1NOTBM-NEXT: movl $-1, %eax		; X86-BMI1NOTBM-NEXT: movl $-1, %eax
; X86-BMI1NOTBM-NEXT: movl $-1, %esi		; X86-BMI1NOTBM-NEXT: shll %cl, %eax
; X86-BMI1NOTBM-NEXT: shll %cl, %esi
; X86-BMI1NOTBM-NEXT: shldl %cl, %eax, %eax
; X86-BMI1NOTBM-NEXT: testb $32, %cl		; X86-BMI1NOTBM-NEXT: testb $32, %cl
; X86-BMI1NOTBM-NEXT: je .LBB24_2		; X86-BMI1NOTBM-NEXT: je .LBB24_2
; X86-BMI1NOTBM-NEXT: # %bb.1:		; X86-BMI1NOTBM-NEXT: # %bb.1:
; X86-BMI1NOTBM-NEXT: movl %esi, %eax		; X86-BMI1NOTBM-NEXT: movl %eax, %edx
; X86-BMI1NOTBM-NEXT: xorl %esi, %esi		; X86-BMI1NOTBM-NEXT: xorl %eax, %eax
; X86-BMI1NOTBM-NEXT: .LBB24_2:		; X86-BMI1NOTBM-NEXT: .LBB24_2:
; X86-BMI1NOTBM-NEXT: andnl {{[0-9]+}}(%esp), %eax, %edx		; X86-BMI1NOTBM-NEXT: andnl {{[0-9]+}}(%esp), %eax, %eax
; X86-BMI1NOTBM-NEXT: andnl {{[0-9]+}}(%esp), %esi, %eax		; X86-BMI1NOTBM-NEXT: andnl {{[0-9]+}}(%esp), %edx, %edx
; X86-BMI1NOTBM-NEXT: popl %esi
; X86-BMI1NOTBM-NEXT: retl		; X86-BMI1NOTBM-NEXT: retl
;		;
; X86-BMI1BMI2-LABEL: bzhi64_b4_commutative:		; X86-BMI1BMI2-LABEL: bzhi64_b4_commutative:
; X86-BMI1BMI2: # %bb.0:		; X86-BMI1BMI2: # %bb.0:
; X86-BMI1BMI2-NEXT: pushl %esi		; X86-BMI1BMI2-NEXT: movb {{[0-9]+}}(%esp), %dl
; X86-BMI1BMI2-NEXT: movb {{[0-9]+}}(%esp), %cl		; X86-BMI1BMI2-NEXT: movl $-1, %ecx
; X86-BMI1BMI2-NEXT: movl $-1, %eax		; X86-BMI1BMI2-NEXT: shlxl %edx, %ecx, %eax
; X86-BMI1BMI2-NEXT: shlxl %ecx, %eax, %esi		; X86-BMI1BMI2-NEXT: testb $32, %dl
; X86-BMI1BMI2-NEXT: shldl %cl, %eax, %eax
; X86-BMI1BMI2-NEXT: testb $32, %cl
; X86-BMI1BMI2-NEXT: je .LBB24_2		; X86-BMI1BMI2-NEXT: je .LBB24_2
; X86-BMI1BMI2-NEXT: # %bb.1:		; X86-BMI1BMI2-NEXT: # %bb.1:
; X86-BMI1BMI2-NEXT: movl %esi, %eax		; X86-BMI1BMI2-NEXT: movl %eax, %ecx
; X86-BMI1BMI2-NEXT: xorl %esi, %esi		; X86-BMI1BMI2-NEXT: xorl %eax, %eax
; X86-BMI1BMI2-NEXT: .LBB24_2:		; X86-BMI1BMI2-NEXT: .LBB24_2:
; X86-BMI1BMI2-NEXT: andnl {{[0-9]+}}(%esp), %eax, %edx		; X86-BMI1BMI2-NEXT: andnl {{[0-9]+}}(%esp), %eax, %eax
; X86-BMI1BMI2-NEXT: andnl {{[0-9]+}}(%esp), %esi, %eax		; X86-BMI1BMI2-NEXT: andnl {{[0-9]+}}(%esp), %ecx, %edx
; X86-BMI1BMI2-NEXT: popl %esi
; X86-BMI1BMI2-NEXT: retl		; X86-BMI1BMI2-NEXT: retl
;		;
; X64-NOBMI-LABEL: bzhi64_b4_commutative:		; X64-NOBMI-LABEL: bzhi64_b4_commutative:
; X64-NOBMI: # %bb.0:		; X64-NOBMI: # %bb.0:
; X64-NOBMI-NEXT: movq %rsi, %rcx		; X64-NOBMI-NEXT: movq %rsi, %rcx
; X64-NOBMI-NEXT: movq $-1, %rax		; X64-NOBMI-NEXT: movq $-1, %rax
; X64-NOBMI-NEXT: # kill: def $cl killed $cl killed $rcx		; X64-NOBMI-NEXT: # kill: def $cl killed $cl killed $rcx
; X64-NOBMI-NEXT: shlq %cl, %rax		; X64-NOBMI-NEXT: shlq %cl, %rax
; X86-NOBMI-NEXT: pushl %edi		; X86-NOBMI-NEXT: pushl %edi
; X86-NOBMI-NEXT: pushl %esi		; X86-NOBMI-NEXT: pushl %esi
; X86-NOBMI-NEXT: pushl %eax		; X86-NOBMI-NEXT: pushl %eax
; X86-NOBMI-NEXT: movb $64, %cl		; X86-NOBMI-NEXT: movb $64, %cl
; X86-NOBMI-NEXT: subb {{[0-9]+}}(%esp), %cl		; X86-NOBMI-NEXT: subb {{[0-9]+}}(%esp), %cl
; X86-NOBMI-NEXT: movl $-1, %esi		; X86-NOBMI-NEXT: movl $-1, %esi
; X86-NOBMI-NEXT: movl $-1, %edi		; X86-NOBMI-NEXT: movl $-1, %edi
; X86-NOBMI-NEXT: shrl %cl, %edi		; X86-NOBMI-NEXT: shrl %cl, %edi
; X86-NOBMI-NEXT: shrdl %cl, %esi, %esi
; X86-NOBMI-NEXT: testb $32, %cl		; X86-NOBMI-NEXT: testb $32, %cl
; X86-NOBMI-NEXT: je .LBB34_2		; X86-NOBMI-NEXT: je .LBB34_2
; X86-NOBMI-NEXT: # %bb.1:		; X86-NOBMI-NEXT: # %bb.1:
; X86-NOBMI-NEXT: movl %edi, %esi		; X86-NOBMI-NEXT: movl %edi, %esi
; X86-NOBMI-NEXT: xorl %edi, %edi		; X86-NOBMI-NEXT: xorl %edi, %edi
; X86-NOBMI-NEXT: .LBB34_2:		; X86-NOBMI-NEXT: .LBB34_2:
; X86-NOBMI-NEXT: subl $8, %esp		; X86-NOBMI-NEXT: subl $8, %esp
; X86-NOBMI-NEXT: pushl %edi		; X86-NOBMI-NEXT: pushl %edi
; X86-BMI1NOTBM-NEXT: pushl %edi		; X86-BMI1NOTBM-NEXT: pushl %edi
; X86-BMI1NOTBM-NEXT: pushl %esi		; X86-BMI1NOTBM-NEXT: pushl %esi
; X86-BMI1NOTBM-NEXT: pushl %eax		; X86-BMI1NOTBM-NEXT: pushl %eax
; X86-BMI1NOTBM-NEXT: movb $64, %cl		; X86-BMI1NOTBM-NEXT: movb $64, %cl
; X86-BMI1NOTBM-NEXT: subb {{[0-9]+}}(%esp), %cl		; X86-BMI1NOTBM-NEXT: subb {{[0-9]+}}(%esp), %cl
; X86-BMI1NOTBM-NEXT: movl $-1, %esi		; X86-BMI1NOTBM-NEXT: movl $-1, %esi
; X86-BMI1NOTBM-NEXT: movl $-1, %edi		; X86-BMI1NOTBM-NEXT: movl $-1, %edi
; X86-BMI1NOTBM-NEXT: shrl %cl, %edi		; X86-BMI1NOTBM-NEXT: shrl %cl, %edi
; X86-BMI1NOTBM-NEXT: shrdl %cl, %esi, %esi
; X86-BMI1NOTBM-NEXT: testb $32, %cl		; X86-BMI1NOTBM-NEXT: testb $32, %cl
; X86-BMI1NOTBM-NEXT: je .LBB34_2		; X86-BMI1NOTBM-NEXT: je .LBB34_2
; X86-BMI1NOTBM-NEXT: # %bb.1:		; X86-BMI1NOTBM-NEXT: # %bb.1:
; X86-BMI1NOTBM-NEXT: movl %edi, %esi		; X86-BMI1NOTBM-NEXT: movl %edi, %esi
; X86-BMI1NOTBM-NEXT: xorl %edi, %edi		; X86-BMI1NOTBM-NEXT: xorl %edi, %edi
; X86-BMI1NOTBM-NEXT: .LBB34_2:		; X86-BMI1NOTBM-NEXT: .LBB34_2:
; X86-BMI1NOTBM-NEXT: subl $8, %esp		; X86-BMI1NOTBM-NEXT: subl $8, %esp
; X86-BMI1NOTBM-NEXT: pushl %edi		; X86-BMI1NOTBM-NEXT: pushl %edi
; X86-BMI1NOTBM-NEXT: popl %edi		; X86-BMI1NOTBM-NEXT: popl %edi
; X86-BMI1NOTBM-NEXT: retl		; X86-BMI1NOTBM-NEXT: retl
;		;
; X86-BMI1BMI2-LABEL: bzhi64_c0:		; X86-BMI1BMI2-LABEL: bzhi64_c0:
; X86-BMI1BMI2: # %bb.0:		; X86-BMI1BMI2: # %bb.0:
; X86-BMI1BMI2-NEXT: pushl %edi		; X86-BMI1BMI2-NEXT: pushl %edi
; X86-BMI1BMI2-NEXT: pushl %esi		; X86-BMI1BMI2-NEXT: pushl %esi
; X86-BMI1BMI2-NEXT: pushl %eax		; X86-BMI1BMI2-NEXT: pushl %eax
; X86-BMI1BMI2-NEXT: movb $64, %cl		; X86-BMI1BMI2-NEXT: movb $64, %al
; X86-BMI1BMI2-NEXT: subb {{[0-9]+}}(%esp), %cl		; X86-BMI1BMI2-NEXT: subb {{[0-9]+}}(%esp), %al
; X86-BMI1BMI2-NEXT: movl $-1, %esi		; X86-BMI1BMI2-NEXT: movl $-1, %edi
; X86-BMI1BMI2-NEXT: shrxl %ecx, %esi, %edi		; X86-BMI1BMI2-NEXT: shrxl %eax, %edi, %esi
; X86-BMI1BMI2-NEXT: shrdl %cl, %esi, %esi		; X86-BMI1BMI2-NEXT: testb $32, %al
; X86-BMI1BMI2-NEXT: testb $32, %cl
; X86-BMI1BMI2-NEXT: je .LBB34_2		; X86-BMI1BMI2-NEXT: je .LBB34_2
; X86-BMI1BMI2-NEXT: # %bb.1:		; X86-BMI1BMI2-NEXT: # %bb.1:
; X86-BMI1BMI2-NEXT: movl %edi, %esi		; X86-BMI1BMI2-NEXT: movl %esi, %edi
; X86-BMI1BMI2-NEXT: xorl %edi, %edi		; X86-BMI1BMI2-NEXT: xorl %esi, %esi
; X86-BMI1BMI2-NEXT: .LBB34_2:		; X86-BMI1BMI2-NEXT: .LBB34_2:
; X86-BMI1BMI2-NEXT: subl $8, %esp		; X86-BMI1BMI2-NEXT: subl $8, %esp
; X86-BMI1BMI2-NEXT: pushl %edi
; X86-BMI1BMI2-NEXT: pushl %esi		; X86-BMI1BMI2-NEXT: pushl %esi
		; X86-BMI1BMI2-NEXT: pushl %edi
; X86-BMI1BMI2-NEXT: calll use64		; X86-BMI1BMI2-NEXT: calll use64
; X86-BMI1BMI2-NEXT: addl $16, %esp		; X86-BMI1BMI2-NEXT: addl $16, %esp
; X86-BMI1BMI2-NEXT: andl {{[0-9]+}}(%esp), %esi
; X86-BMI1BMI2-NEXT: andl {{[0-9]+}}(%esp), %edi		; X86-BMI1BMI2-NEXT: andl {{[0-9]+}}(%esp), %edi
; X86-BMI1BMI2-NEXT: movl %esi, %eax		; X86-BMI1BMI2-NEXT: andl {{[0-9]+}}(%esp), %esi
; X86-BMI1BMI2-NEXT: movl %edi, %edx		; X86-BMI1BMI2-NEXT: movl %edi, %eax
		; X86-BMI1BMI2-NEXT: movl %esi, %edx
; X86-BMI1BMI2-NEXT: addl $4, %esp		; X86-BMI1BMI2-NEXT: addl $4, %esp
; X86-BMI1BMI2-NEXT: popl %esi		; X86-BMI1BMI2-NEXT: popl %esi
; X86-BMI1BMI2-NEXT: popl %edi		; X86-BMI1BMI2-NEXT: popl %edi
; X86-BMI1BMI2-NEXT: retl		; X86-BMI1BMI2-NEXT: retl
;		;
; X64-NOBMI-LABEL: bzhi64_c0:		; X64-NOBMI-LABEL: bzhi64_c0:
; X64-NOBMI: # %bb.0:		; X64-NOBMI: # %bb.0:
; X64-NOBMI-NEXT: pushq %r14		; X64-NOBMI-NEXT: pushq %r14
; X86-NOBMI-NEXT: pushl %edi		; X86-NOBMI-NEXT: pushl %edi
; X86-NOBMI-NEXT: pushl %esi		; X86-NOBMI-NEXT: pushl %esi
; X86-NOBMI-NEXT: pushl %eax		; X86-NOBMI-NEXT: pushl %eax
; X86-NOBMI-NEXT: movb $64, %cl		; X86-NOBMI-NEXT: movb $64, %cl
; X86-NOBMI-NEXT: subb {{[0-9]+}}(%esp), %cl		; X86-NOBMI-NEXT: subb {{[0-9]+}}(%esp), %cl
; X86-NOBMI-NEXT: movl $-1, %esi		; X86-NOBMI-NEXT: movl $-1, %esi
; X86-NOBMI-NEXT: movl $-1, %edi		; X86-NOBMI-NEXT: movl $-1, %edi
; X86-NOBMI-NEXT: shrl %cl, %edi		; X86-NOBMI-NEXT: shrl %cl, %edi
; X86-NOBMI-NEXT: shrdl %cl, %esi, %esi
; X86-NOBMI-NEXT: testb $32, %cl		; X86-NOBMI-NEXT: testb $32, %cl
; X86-NOBMI-NEXT: je .LBB35_2		; X86-NOBMI-NEXT: je .LBB35_2
; X86-NOBMI-NEXT: # %bb.1:		; X86-NOBMI-NEXT: # %bb.1:
; X86-NOBMI-NEXT: movl %edi, %esi		; X86-NOBMI-NEXT: movl %edi, %esi
; X86-NOBMI-NEXT: xorl %edi, %edi		; X86-NOBMI-NEXT: xorl %edi, %edi
; X86-NOBMI-NEXT: .LBB35_2:		; X86-NOBMI-NEXT: .LBB35_2:
; X86-NOBMI-NEXT: subl $8, %esp		; X86-NOBMI-NEXT: subl $8, %esp
; X86-NOBMI-NEXT: pushl %edi		; X86-NOBMI-NEXT: pushl %edi
; X86-BMI1NOTBM-NEXT: pushl %edi		; X86-BMI1NOTBM-NEXT: pushl %edi
; X86-BMI1NOTBM-NEXT: pushl %esi		; X86-BMI1NOTBM-NEXT: pushl %esi
; X86-BMI1NOTBM-NEXT: pushl %eax		; X86-BMI1NOTBM-NEXT: pushl %eax
; X86-BMI1NOTBM-NEXT: movb $64, %cl		; X86-BMI1NOTBM-NEXT: movb $64, %cl
; X86-BMI1NOTBM-NEXT: subb {{[0-9]+}}(%esp), %cl		; X86-BMI1NOTBM-NEXT: subb {{[0-9]+}}(%esp), %cl
; X86-BMI1NOTBM-NEXT: movl $-1, %esi		; X86-BMI1NOTBM-NEXT: movl $-1, %esi
; X86-BMI1NOTBM-NEXT: movl $-1, %edi		; X86-BMI1NOTBM-NEXT: movl $-1, %edi
; X86-BMI1NOTBM-NEXT: shrl %cl, %edi		; X86-BMI1NOTBM-NEXT: shrl %cl, %edi
; X86-BMI1NOTBM-NEXT: shrdl %cl, %esi, %esi
; X86-BMI1NOTBM-NEXT: testb $32, %cl		; X86-BMI1NOTBM-NEXT: testb $32, %cl
; X86-BMI1NOTBM-NEXT: je .LBB35_2		; X86-BMI1NOTBM-NEXT: je .LBB35_2
; X86-BMI1NOTBM-NEXT: # %bb.1:		; X86-BMI1NOTBM-NEXT: # %bb.1:
; X86-BMI1NOTBM-NEXT: movl %edi, %esi		; X86-BMI1NOTBM-NEXT: movl %edi, %esi
; X86-BMI1NOTBM-NEXT: xorl %edi, %edi		; X86-BMI1NOTBM-NEXT: xorl %edi, %edi
; X86-BMI1NOTBM-NEXT: .LBB35_2:		; X86-BMI1NOTBM-NEXT: .LBB35_2:
; X86-BMI1NOTBM-NEXT: subl $8, %esp		; X86-BMI1NOTBM-NEXT: subl $8, %esp
; X86-BMI1NOTBM-NEXT: pushl %edi		; X86-BMI1NOTBM-NEXT: pushl %edi
; X86-BMI1NOTBM-NEXT: popl %edi		; X86-BMI1NOTBM-NEXT: popl %edi
; X86-BMI1NOTBM-NEXT: retl		; X86-BMI1NOTBM-NEXT: retl
;		;
; X86-BMI1BMI2-LABEL: bzhi64_c1_indexzext:		; X86-BMI1BMI2-LABEL: bzhi64_c1_indexzext:
; X86-BMI1BMI2: # %bb.0:		; X86-BMI1BMI2: # %bb.0:
; X86-BMI1BMI2-NEXT: pushl %edi		; X86-BMI1BMI2-NEXT: pushl %edi
; X86-BMI1BMI2-NEXT: pushl %esi		; X86-BMI1BMI2-NEXT: pushl %esi
; X86-BMI1BMI2-NEXT: pushl %eax		; X86-BMI1BMI2-NEXT: pushl %eax
; X86-BMI1BMI2-NEXT: movb $64, %cl		; X86-BMI1BMI2-NEXT: movb $64, %al
; X86-BMI1BMI2-NEXT: subb {{[0-9]+}}(%esp), %cl		; X86-BMI1BMI2-NEXT: subb {{[0-9]+}}(%esp), %al
; X86-BMI1BMI2-NEXT: movl $-1, %esi		; X86-BMI1BMI2-NEXT: movl $-1, %edi
; X86-BMI1BMI2-NEXT: shrxl %ecx, %esi, %edi		; X86-BMI1BMI2-NEXT: shrxl %eax, %edi, %esi
; X86-BMI1BMI2-NEXT: shrdl %cl, %esi, %esi		; X86-BMI1BMI2-NEXT: testb $32, %al
; X86-BMI1BMI2-NEXT: testb $32, %cl
; X86-BMI1BMI2-NEXT: je .LBB35_2		; X86-BMI1BMI2-NEXT: je .LBB35_2
; X86-BMI1BMI2-NEXT: # %bb.1:		; X86-BMI1BMI2-NEXT: # %bb.1:
; X86-BMI1BMI2-NEXT: movl %edi, %esi		; X86-BMI1BMI2-NEXT: movl %esi, %edi
; X86-BMI1BMI2-NEXT: xorl %edi, %edi		; X86-BMI1BMI2-NEXT: xorl %esi, %esi
; X86-BMI1BMI2-NEXT: .LBB35_2:		; X86-BMI1BMI2-NEXT: .LBB35_2:
; X86-BMI1BMI2-NEXT: subl $8, %esp		; X86-BMI1BMI2-NEXT: subl $8, %esp
; X86-BMI1BMI2-NEXT: pushl %edi
; X86-BMI1BMI2-NEXT: pushl %esi		; X86-BMI1BMI2-NEXT: pushl %esi
		; X86-BMI1BMI2-NEXT: pushl %edi
; X86-BMI1BMI2-NEXT: calll use64		; X86-BMI1BMI2-NEXT: calll use64
; X86-BMI1BMI2-NEXT: addl $16, %esp		; X86-BMI1BMI2-NEXT: addl $16, %esp
; X86-BMI1BMI2-NEXT: andl {{[0-9]+}}(%esp), %esi
; X86-BMI1BMI2-NEXT: andl {{[0-9]+}}(%esp), %edi		; X86-BMI1BMI2-NEXT: andl {{[0-9]+}}(%esp), %edi
; X86-BMI1BMI2-NEXT: movl %esi, %eax		; X86-BMI1BMI2-NEXT: andl {{[0-9]+}}(%esp), %esi
; X86-BMI1BMI2-NEXT: movl %edi, %edx		; X86-BMI1BMI2-NEXT: movl %edi, %eax
		; X86-BMI1BMI2-NEXT: movl %esi, %edx
; X86-BMI1BMI2-NEXT: addl $4, %esp		; X86-BMI1BMI2-NEXT: addl $4, %esp
; X86-BMI1BMI2-NEXT: popl %esi		; X86-BMI1BMI2-NEXT: popl %esi
; X86-BMI1BMI2-NEXT: popl %edi		; X86-BMI1BMI2-NEXT: popl %edi
; X86-BMI1BMI2-NEXT: retl		; X86-BMI1BMI2-NEXT: retl
;		;
; X64-NOBMI-LABEL: bzhi64_c1_indexzext:		; X64-NOBMI-LABEL: bzhi64_c1_indexzext:
; X64-NOBMI: # %bb.0:		; X64-NOBMI: # %bb.0:
; X64-NOBMI-NEXT: pushq %r14		; X64-NOBMI-NEXT: pushq %r14
; X86-NOBMI-NEXT: pushl %edi		; X86-NOBMI-NEXT: pushl %edi
; X86-NOBMI-NEXT: pushl %esi		; X86-NOBMI-NEXT: pushl %esi
; X86-NOBMI-NEXT: movl {{[0-9]+}}(%esp), %edx		; X86-NOBMI-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-NOBMI-NEXT: movb $64, %cl		; X86-NOBMI-NEXT: movb $64, %cl
; X86-NOBMI-NEXT: subb {{[0-9]+}}(%esp), %cl		; X86-NOBMI-NEXT: subb {{[0-9]+}}(%esp), %cl
; X86-NOBMI-NEXT: movl $-1, %eax		; X86-NOBMI-NEXT: movl $-1, %eax
; X86-NOBMI-NEXT: movl $-1, %ebx		; X86-NOBMI-NEXT: movl $-1, %ebx
; X86-NOBMI-NEXT: shrl %cl, %ebx		; X86-NOBMI-NEXT: shrl %cl, %ebx
; X86-NOBMI-NEXT: shrdl %cl, %eax, %eax
; X86-NOBMI-NEXT: testb $32, %cl		; X86-NOBMI-NEXT: testb $32, %cl
; X86-NOBMI-NEXT: je .LBB36_2		; X86-NOBMI-NEXT: je .LBB36_2
; X86-NOBMI-NEXT: # %bb.1:		; X86-NOBMI-NEXT: # %bb.1:
; X86-NOBMI-NEXT: movl %ebx, %eax		; X86-NOBMI-NEXT: movl %ebx, %eax
; X86-NOBMI-NEXT: xorl %ebx, %ebx		; X86-NOBMI-NEXT: xorl %ebx, %ebx
; X86-NOBMI-NEXT: .LBB36_2:		; X86-NOBMI-NEXT: .LBB36_2:
; X86-NOBMI-NEXT: movl (%edx), %esi		; X86-NOBMI-NEXT: movl 4(%edx), %esi
; X86-NOBMI-NEXT: andl %eax, %esi		; X86-NOBMI-NEXT: andl %ebx, %esi
; X86-NOBMI-NEXT: movl 4(%edx), %edi		; X86-NOBMI-NEXT: movl (%edx), %edi
; X86-NOBMI-NEXT: andl %ebx, %edi		; X86-NOBMI-NEXT: andl %eax, %edi
; X86-NOBMI-NEXT: subl $8, %esp		; X86-NOBMI-NEXT: subl $8, %esp
; X86-NOBMI-NEXT: pushl %ebx		; X86-NOBMI-NEXT: pushl %ebx
; X86-NOBMI-NEXT: pushl %eax		; X86-NOBMI-NEXT: pushl %eax
; X86-NOBMI-NEXT: calll use64		; X86-NOBMI-NEXT: calll use64
; X86-NOBMI-NEXT: addl $16, %esp		; X86-NOBMI-NEXT: addl $16, %esp
; X86-NOBMI-NEXT: movl %esi, %eax		; X86-NOBMI-NEXT: movl %edi, %eax
; X86-NOBMI-NEXT: movl %edi, %edx		; X86-NOBMI-NEXT: movl %esi, %edx
; X86-NOBMI-NEXT: popl %esi		; X86-NOBMI-NEXT: popl %esi
; X86-NOBMI-NEXT: popl %edi		; X86-NOBMI-NEXT: popl %edi
; X86-NOBMI-NEXT: popl %ebx		; X86-NOBMI-NEXT: popl %ebx
; X86-NOBMI-NEXT: retl		; X86-NOBMI-NEXT: retl
;		;
; X86-BMI1NOTBM-LABEL: bzhi64_c2_load:		; X86-BMI1NOTBM-LABEL: bzhi64_c2_load:
; X86-BMI1NOTBM: # %bb.0:		; X86-BMI1NOTBM: # %bb.0:
; X86-BMI1NOTBM-NEXT: pushl %ebx		; X86-BMI1NOTBM-NEXT: pushl %ebx
; X86-BMI1NOTBM-NEXT: pushl %edi		; X86-BMI1NOTBM-NEXT: pushl %edi
; X86-BMI1NOTBM-NEXT: pushl %esi		; X86-BMI1NOTBM-NEXT: pushl %esi
; X86-BMI1NOTBM-NEXT: movl {{[0-9]+}}(%esp), %edx		; X86-BMI1NOTBM-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-BMI1NOTBM-NEXT: movb $64, %cl		; X86-BMI1NOTBM-NEXT: movb $64, %cl
; X86-BMI1NOTBM-NEXT: subb {{[0-9]+}}(%esp), %cl		; X86-BMI1NOTBM-NEXT: subb {{[0-9]+}}(%esp), %cl
; X86-BMI1NOTBM-NEXT: movl $-1, %eax		; X86-BMI1NOTBM-NEXT: movl $-1, %eax
; X86-BMI1NOTBM-NEXT: movl $-1, %ebx		; X86-BMI1NOTBM-NEXT: movl $-1, %ebx
; X86-BMI1NOTBM-NEXT: shrl %cl, %ebx		; X86-BMI1NOTBM-NEXT: shrl %cl, %ebx
; X86-BMI1NOTBM-NEXT: shrdl %cl, %eax, %eax
; X86-BMI1NOTBM-NEXT: testb $32, %cl		; X86-BMI1NOTBM-NEXT: testb $32, %cl
; X86-BMI1NOTBM-NEXT: je .LBB36_2		; X86-BMI1NOTBM-NEXT: je .LBB36_2
; X86-BMI1NOTBM-NEXT: # %bb.1:		; X86-BMI1NOTBM-NEXT: # %bb.1:
; X86-BMI1NOTBM-NEXT: movl %ebx, %eax		; X86-BMI1NOTBM-NEXT: movl %ebx, %eax
; X86-BMI1NOTBM-NEXT: xorl %ebx, %ebx		; X86-BMI1NOTBM-NEXT: xorl %ebx, %ebx
; X86-BMI1NOTBM-NEXT: .LBB36_2:		; X86-BMI1NOTBM-NEXT: .LBB36_2:
; X86-BMI1NOTBM-NEXT: movl (%edx), %esi		; X86-BMI1NOTBM-NEXT: movl 4(%edx), %esi
; X86-BMI1NOTBM-NEXT: andl %eax, %esi		; X86-BMI1NOTBM-NEXT: andl %ebx, %esi
; X86-BMI1NOTBM-NEXT: movl 4(%edx), %edi		; X86-BMI1NOTBM-NEXT: movl (%edx), %edi
; X86-BMI1NOTBM-NEXT: andl %ebx, %edi		; X86-BMI1NOTBM-NEXT: andl %eax, %edi
; X86-BMI1NOTBM-NEXT: subl $8, %esp		; X86-BMI1NOTBM-NEXT: subl $8, %esp
; X86-BMI1NOTBM-NEXT: pushl %ebx		; X86-BMI1NOTBM-NEXT: pushl %ebx
; X86-BMI1NOTBM-NEXT: pushl %eax		; X86-BMI1NOTBM-NEXT: pushl %eax
; X86-BMI1NOTBM-NEXT: calll use64		; X86-BMI1NOTBM-NEXT: calll use64
; X86-BMI1NOTBM-NEXT: addl $16, %esp		; X86-BMI1NOTBM-NEXT: addl $16, %esp
; X86-BMI1NOTBM-NEXT: movl %esi, %eax		; X86-BMI1NOTBM-NEXT: movl %edi, %eax
; X86-BMI1NOTBM-NEXT: movl %edi, %edx		; X86-BMI1NOTBM-NEXT: movl %esi, %edx
; X86-BMI1NOTBM-NEXT: popl %esi		; X86-BMI1NOTBM-NEXT: popl %esi
; X86-BMI1NOTBM-NEXT: popl %edi		; X86-BMI1NOTBM-NEXT: popl %edi
; X86-BMI1NOTBM-NEXT: popl %ebx		; X86-BMI1NOTBM-NEXT: popl %ebx
; X86-BMI1NOTBM-NEXT: retl		; X86-BMI1NOTBM-NEXT: retl
;		;
; X86-BMI1BMI2-LABEL: bzhi64_c2_load:		; X86-BMI1BMI2-LABEL: bzhi64_c2_load:
; X86-BMI1BMI2: # %bb.0:		; X86-BMI1BMI2: # %bb.0:
; X86-BMI1BMI2-NEXT: pushl %ebx		; X86-BMI1BMI2-NEXT: pushl %ebx
; X86-BMI1BMI2-NEXT: pushl %edi		; X86-BMI1BMI2-NEXT: pushl %edi
; X86-BMI1BMI2-NEXT: pushl %esi		; X86-BMI1BMI2-NEXT: pushl %esi
; X86-BMI1BMI2-NEXT: movl {{[0-9]+}}(%esp), %edx		; X86-BMI1BMI2-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-BMI1BMI2-NEXT: movb $64, %cl		; X86-BMI1BMI2-NEXT: movb $64, %bl
; X86-BMI1BMI2-NEXT: subb {{[0-9]+}}(%esp), %cl		; X86-BMI1BMI2-NEXT: subb {{[0-9]+}}(%esp), %bl
; X86-BMI1BMI2-NEXT: movl $-1, %eax		; X86-BMI1BMI2-NEXT: movl $-1, %ecx
; X86-BMI1BMI2-NEXT: shrxl %ecx, %eax, %ebx		; X86-BMI1BMI2-NEXT: shrxl %ebx, %ecx, %edx
; X86-BMI1BMI2-NEXT: shrdl %cl, %eax, %eax		; X86-BMI1BMI2-NEXT: testb $32, %bl
; X86-BMI1BMI2-NEXT: testb $32, %cl
; X86-BMI1BMI2-NEXT: je .LBB36_2		; X86-BMI1BMI2-NEXT: je .LBB36_2
; X86-BMI1BMI2-NEXT: # %bb.1:		; X86-BMI1BMI2-NEXT: # %bb.1:
; X86-BMI1BMI2-NEXT: movl %ebx, %eax		; X86-BMI1BMI2-NEXT: movl %edx, %ecx
; X86-BMI1BMI2-NEXT: xorl %ebx, %ebx		; X86-BMI1BMI2-NEXT: xorl %edx, %edx
; X86-BMI1BMI2-NEXT: .LBB36_2:		; X86-BMI1BMI2-NEXT: .LBB36_2:
; X86-BMI1BMI2-NEXT: movl (%edx), %esi		; X86-BMI1BMI2-NEXT: movl 4(%eax), %esi
; X86-BMI1BMI2-NEXT: andl %eax, %esi		; X86-BMI1BMI2-NEXT: andl %edx, %esi
; X86-BMI1BMI2-NEXT: movl 4(%edx), %edi		; X86-BMI1BMI2-NEXT: movl (%eax), %edi
; X86-BMI1BMI2-NEXT: andl %ebx, %edi		; X86-BMI1BMI2-NEXT: andl %ecx, %edi
; X86-BMI1BMI2-NEXT: subl $8, %esp		; X86-BMI1BMI2-NEXT: subl $8, %esp
; X86-BMI1BMI2-NEXT: pushl %ebx		; X86-BMI1BMI2-NEXT: pushl %edx
; X86-BMI1BMI2-NEXT: pushl %eax		; X86-BMI1BMI2-NEXT: pushl %ecx
; X86-BMI1BMI2-NEXT: calll use64		; X86-BMI1BMI2-NEXT: calll use64
; X86-BMI1BMI2-NEXT: addl $16, %esp		; X86-BMI1BMI2-NEXT: addl $16, %esp
; X86-BMI1BMI2-NEXT: movl %esi, %eax		; X86-BMI1BMI2-NEXT: movl %edi, %eax
; X86-BMI1BMI2-NEXT: movl %edi, %edx		; X86-BMI1BMI2-NEXT: movl %esi, %edx
; X86-BMI1BMI2-NEXT: popl %esi		; X86-BMI1BMI2-NEXT: popl %esi
; X86-BMI1BMI2-NEXT: popl %edi		; X86-BMI1BMI2-NEXT: popl %edi
; X86-BMI1BMI2-NEXT: popl %ebx		; X86-BMI1BMI2-NEXT: popl %ebx
; X86-BMI1BMI2-NEXT: retl		; X86-BMI1BMI2-NEXT: retl
;		;
; X64-NOBMI-LABEL: bzhi64_c2_load:		; X64-NOBMI-LABEL: bzhi64_c2_load:
; X64-NOBMI: # %bb.0:		; X64-NOBMI: # %bb.0:
; X64-NOBMI-NEXT: pushq %rbx		; X64-NOBMI-NEXT: pushq %rbx
; X86-NOBMI-NEXT: pushl %edi		; X86-NOBMI-NEXT: pushl %edi
; X86-NOBMI-NEXT: pushl %esi		; X86-NOBMI-NEXT: pushl %esi
; X86-NOBMI-NEXT: movl {{[0-9]+}}(%esp), %edx		; X86-NOBMI-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-NOBMI-NEXT: movb $64, %cl		; X86-NOBMI-NEXT: movb $64, %cl
; X86-NOBMI-NEXT: subb {{[0-9]+}}(%esp), %cl		; X86-NOBMI-NEXT: subb {{[0-9]+}}(%esp), %cl
; X86-NOBMI-NEXT: movl $-1, %eax		; X86-NOBMI-NEXT: movl $-1, %eax
; X86-NOBMI-NEXT: movl $-1, %ebx		; X86-NOBMI-NEXT: movl $-1, %ebx
; X86-NOBMI-NEXT: shrl %cl, %ebx		; X86-NOBMI-NEXT: shrl %cl, %ebx
; X86-NOBMI-NEXT: shrdl %cl, %eax, %eax
; X86-NOBMI-NEXT: testb $32, %cl		; X86-NOBMI-NEXT: testb $32, %cl
; X86-NOBMI-NEXT: je .LBB37_2		; X86-NOBMI-NEXT: je .LBB37_2
; X86-NOBMI-NEXT: # %bb.1:		; X86-NOBMI-NEXT: # %bb.1:
; X86-NOBMI-NEXT: movl %ebx, %eax		; X86-NOBMI-NEXT: movl %ebx, %eax
; X86-NOBMI-NEXT: xorl %ebx, %ebx		; X86-NOBMI-NEXT: xorl %ebx, %ebx
; X86-NOBMI-NEXT: .LBB37_2:		; X86-NOBMI-NEXT: .LBB37_2:
; X86-NOBMI-NEXT: movl (%edx), %esi		; X86-NOBMI-NEXT: movl 4(%edx), %esi
; X86-NOBMI-NEXT: andl %eax, %esi		; X86-NOBMI-NEXT: andl %ebx, %esi
; X86-NOBMI-NEXT: movl 4(%edx), %edi		; X86-NOBMI-NEXT: movl (%edx), %edi
; X86-NOBMI-NEXT: andl %ebx, %edi		; X86-NOBMI-NEXT: andl %eax, %edi
; X86-NOBMI-NEXT: subl $8, %esp		; X86-NOBMI-NEXT: subl $8, %esp
; X86-NOBMI-NEXT: pushl %ebx		; X86-NOBMI-NEXT: pushl %ebx
; X86-NOBMI-NEXT: pushl %eax		; X86-NOBMI-NEXT: pushl %eax
; X86-NOBMI-NEXT: calll use64		; X86-NOBMI-NEXT: calll use64
; X86-NOBMI-NEXT: addl $16, %esp		; X86-NOBMI-NEXT: addl $16, %esp
; X86-NOBMI-NEXT: movl %esi, %eax		; X86-NOBMI-NEXT: movl %edi, %eax
; X86-NOBMI-NEXT: movl %edi, %edx		; X86-NOBMI-NEXT: movl %esi, %edx
; X86-NOBMI-NEXT: popl %esi		; X86-NOBMI-NEXT: popl %esi
; X86-NOBMI-NEXT: popl %edi		; X86-NOBMI-NEXT: popl %edi
; X86-NOBMI-NEXT: popl %ebx		; X86-NOBMI-NEXT: popl %ebx
; X86-NOBMI-NEXT: retl		; X86-NOBMI-NEXT: retl
;		;
; X86-BMI1NOTBM-LABEL: bzhi64_c3_load_indexzext:		; X86-BMI1NOTBM-LABEL: bzhi64_c3_load_indexzext:
; X86-BMI1NOTBM: # %bb.0:		; X86-BMI1NOTBM: # %bb.0:
; X86-BMI1NOTBM-NEXT: pushl %ebx		; X86-BMI1NOTBM-NEXT: pushl %ebx
; X86-BMI1NOTBM-NEXT: pushl %edi		; X86-BMI1NOTBM-NEXT: pushl %edi
; X86-BMI1NOTBM-NEXT: pushl %esi		; X86-BMI1NOTBM-NEXT: pushl %esi
; X86-BMI1NOTBM-NEXT: movl {{[0-9]+}}(%esp), %edx		; X86-BMI1NOTBM-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-BMI1NOTBM-NEXT: movb $64, %cl		; X86-BMI1NOTBM-NEXT: movb $64, %cl
; X86-BMI1NOTBM-NEXT: subb {{[0-9]+}}(%esp), %cl		; X86-BMI1NOTBM-NEXT: subb {{[0-9]+}}(%esp), %cl
; X86-BMI1NOTBM-NEXT: movl $-1, %eax		; X86-BMI1NOTBM-NEXT: movl $-1, %eax
; X86-BMI1NOTBM-NEXT: movl $-1, %ebx		; X86-BMI1NOTBM-NEXT: movl $-1, %ebx
; X86-BMI1NOTBM-NEXT: shrl %cl, %ebx		; X86-BMI1NOTBM-NEXT: shrl %cl, %ebx
; X86-BMI1NOTBM-NEXT: shrdl %cl, %eax, %eax
; X86-BMI1NOTBM-NEXT: testb $32, %cl		; X86-BMI1NOTBM-NEXT: testb $32, %cl
; X86-BMI1NOTBM-NEXT: je .LBB37_2		; X86-BMI1NOTBM-NEXT: je .LBB37_2
; X86-BMI1NOTBM-NEXT: # %bb.1:		; X86-BMI1NOTBM-NEXT: # %bb.1:
; X86-BMI1NOTBM-NEXT: movl %ebx, %eax		; X86-BMI1NOTBM-NEXT: movl %ebx, %eax
; X86-BMI1NOTBM-NEXT: xorl %ebx, %ebx		; X86-BMI1NOTBM-NEXT: xorl %ebx, %ebx
; X86-BMI1NOTBM-NEXT: .LBB37_2:		; X86-BMI1NOTBM-NEXT: .LBB37_2:
; X86-BMI1NOTBM-NEXT: movl (%edx), %esi		; X86-BMI1NOTBM-NEXT: movl 4(%edx), %esi
; X86-BMI1NOTBM-NEXT: andl %eax, %esi		; X86-BMI1NOTBM-NEXT: andl %ebx, %esi
; X86-BMI1NOTBM-NEXT: movl 4(%edx), %edi		; X86-BMI1NOTBM-NEXT: movl (%edx), %edi
; X86-BMI1NOTBM-NEXT: andl %ebx, %edi		; X86-BMI1NOTBM-NEXT: andl %eax, %edi
; X86-BMI1NOTBM-NEXT: subl $8, %esp		; X86-BMI1NOTBM-NEXT: subl $8, %esp
; X86-BMI1NOTBM-NEXT: pushl %ebx		; X86-BMI1NOTBM-NEXT: pushl %ebx
; X86-BMI1NOTBM-NEXT: pushl %eax		; X86-BMI1NOTBM-NEXT: pushl %eax
; X86-BMI1NOTBM-NEXT: calll use64		; X86-BMI1NOTBM-NEXT: calll use64
; X86-BMI1NOTBM-NEXT: addl $16, %esp		; X86-BMI1NOTBM-NEXT: addl $16, %esp
; X86-BMI1NOTBM-NEXT: movl %esi, %eax		; X86-BMI1NOTBM-NEXT: movl %edi, %eax
; X86-BMI1NOTBM-NEXT: movl %edi, %edx		; X86-BMI1NOTBM-NEXT: movl %esi, %edx
; X86-BMI1NOTBM-NEXT: popl %esi		; X86-BMI1NOTBM-NEXT: popl %esi
; X86-BMI1NOTBM-NEXT: popl %edi		; X86-BMI1NOTBM-NEXT: popl %edi
; X86-BMI1NOTBM-NEXT: popl %ebx		; X86-BMI1NOTBM-NEXT: popl %ebx
; X86-BMI1NOTBM-NEXT: retl		; X86-BMI1NOTBM-NEXT: retl
;		;
; X86-BMI1BMI2-LABEL: bzhi64_c3_load_indexzext:		; X86-BMI1BMI2-LABEL: bzhi64_c3_load_indexzext:
; X86-BMI1BMI2: # %bb.0:		; X86-BMI1BMI2: # %bb.0:
; X86-BMI1BMI2-NEXT: pushl %ebx		; X86-BMI1BMI2-NEXT: pushl %ebx
; X86-BMI1BMI2-NEXT: pushl %edi		; X86-BMI1BMI2-NEXT: pushl %edi
; X86-BMI1BMI2-NEXT: pushl %esi		; X86-BMI1BMI2-NEXT: pushl %esi
; X86-BMI1BMI2-NEXT: movl {{[0-9]+}}(%esp), %edx		; X86-BMI1BMI2-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-BMI1BMI2-NEXT: movb $64, %cl		; X86-BMI1BMI2-NEXT: movb $64, %bl
; X86-BMI1BMI2-NEXT: subb {{[0-9]+}}(%esp), %cl		; X86-BMI1BMI2-NEXT: subb {{[0-9]+}}(%esp), %bl
; X86-BMI1BMI2-NEXT: movl $-1, %eax		; X86-BMI1BMI2-NEXT: movl $-1, %ecx
; X86-BMI1BMI2-NEXT: shrxl %ecx, %eax, %ebx		; X86-BMI1BMI2-NEXT: shrxl %ebx, %ecx, %edx
; X86-BMI1BMI2-NEXT: shrdl %cl, %eax, %eax		; X86-BMI1BMI2-NEXT: testb $32, %bl
; X86-BMI1BMI2-NEXT: testb $32, %cl
; X86-BMI1BMI2-NEXT: je .LBB37_2		; X86-BMI1BMI2-NEXT: je .LBB37_2
; X86-BMI1BMI2-NEXT: # %bb.1:		; X86-BMI1BMI2-NEXT: # %bb.1:
; X86-BMI1BMI2-NEXT: movl %ebx, %eax		; X86-BMI1BMI2-NEXT: movl %edx, %ecx
; X86-BMI1BMI2-NEXT: xorl %ebx, %ebx		; X86-BMI1BMI2-NEXT: xorl %edx, %edx
; X86-BMI1BMI2-NEXT: .LBB37_2:		; X86-BMI1BMI2-NEXT: .LBB37_2:
; X86-BMI1BMI2-NEXT: movl (%edx), %esi		; X86-BMI1BMI2-NEXT: movl 4(%eax), %esi
; X86-BMI1BMI2-NEXT: andl %eax, %esi		; X86-BMI1BMI2-NEXT: andl %edx, %esi
; X86-BMI1BMI2-NEXT: movl 4(%edx), %edi		; X86-BMI1BMI2-NEXT: movl (%eax), %edi
; X86-BMI1BMI2-NEXT: andl %ebx, %edi		; X86-BMI1BMI2-NEXT: andl %ecx, %edi
; X86-BMI1BMI2-NEXT: subl $8, %esp		; X86-BMI1BMI2-NEXT: subl $8, %esp
; X86-BMI1BMI2-NEXT: pushl %ebx		; X86-BMI1BMI2-NEXT: pushl %edx
; X86-BMI1BMI2-NEXT: pushl %eax		; X86-BMI1BMI2-NEXT: pushl %ecx
; X86-BMI1BMI2-NEXT: calll use64		; X86-BMI1BMI2-NEXT: calll use64
; X86-BMI1BMI2-NEXT: addl $16, %esp		; X86-BMI1BMI2-NEXT: addl $16, %esp
; X86-BMI1BMI2-NEXT: movl %esi, %eax		; X86-BMI1BMI2-NEXT: movl %edi, %eax
; X86-BMI1BMI2-NEXT: movl %edi, %edx		; X86-BMI1BMI2-NEXT: movl %esi, %edx
; X86-BMI1BMI2-NEXT: popl %esi		; X86-BMI1BMI2-NEXT: popl %esi
; X86-BMI1BMI2-NEXT: popl %edi		; X86-BMI1BMI2-NEXT: popl %edi
; X86-BMI1BMI2-NEXT: popl %ebx		; X86-BMI1BMI2-NEXT: popl %ebx
; X86-BMI1BMI2-NEXT: retl		; X86-BMI1BMI2-NEXT: retl
;		;
; X64-NOBMI-LABEL: bzhi64_c3_load_indexzext:		; X64-NOBMI-LABEL: bzhi64_c3_load_indexzext:
; X64-NOBMI: # %bb.0:		; X64-NOBMI: # %bb.0:
; X64-NOBMI-NEXT: pushq %rbx		; X64-NOBMI-NEXT: pushq %rbx
; X86-NOBMI-NEXT: pushl %edi		; X86-NOBMI-NEXT: pushl %edi
; X86-NOBMI-NEXT: pushl %esi		; X86-NOBMI-NEXT: pushl %esi
; X86-NOBMI-NEXT: pushl %eax		; X86-NOBMI-NEXT: pushl %eax
; X86-NOBMI-NEXT: movb $64, %cl		; X86-NOBMI-NEXT: movb $64, %cl
; X86-NOBMI-NEXT: subb {{[0-9]+}}(%esp), %cl		; X86-NOBMI-NEXT: subb {{[0-9]+}}(%esp), %cl
; X86-NOBMI-NEXT: movl $-1, %esi		; X86-NOBMI-NEXT: movl $-1, %esi
; X86-NOBMI-NEXT: movl $-1, %edi		; X86-NOBMI-NEXT: movl $-1, %edi
; X86-NOBMI-NEXT: shrl %cl, %edi		; X86-NOBMI-NEXT: shrl %cl, %edi
; X86-NOBMI-NEXT: shrdl %cl, %esi, %esi
; X86-NOBMI-NEXT: testb $32, %cl		; X86-NOBMI-NEXT: testb $32, %cl
; X86-NOBMI-NEXT: je .LBB38_2		; X86-NOBMI-NEXT: je .LBB38_2
; X86-NOBMI-NEXT: # %bb.1:		; X86-NOBMI-NEXT: # %bb.1:
; X86-NOBMI-NEXT: movl %edi, %esi		; X86-NOBMI-NEXT: movl %edi, %esi
; X86-NOBMI-NEXT: xorl %edi, %edi		; X86-NOBMI-NEXT: xorl %edi, %edi
; X86-NOBMI-NEXT: .LBB38_2:		; X86-NOBMI-NEXT: .LBB38_2:
; X86-NOBMI-NEXT: subl $8, %esp		; X86-NOBMI-NEXT: subl $8, %esp
; X86-NOBMI-NEXT: pushl %edi		; X86-NOBMI-NEXT: pushl %edi
; X86-BMI1NOTBM-NEXT: pushl %edi		; X86-BMI1NOTBM-NEXT: pushl %edi
; X86-BMI1NOTBM-NEXT: pushl %esi		; X86-BMI1NOTBM-NEXT: pushl %esi
; X86-BMI1NOTBM-NEXT: pushl %eax		; X86-BMI1NOTBM-NEXT: pushl %eax
; X86-BMI1NOTBM-NEXT: movb $64, %cl		; X86-BMI1NOTBM-NEXT: movb $64, %cl
; X86-BMI1NOTBM-NEXT: subb {{[0-9]+}}(%esp), %cl		; X86-BMI1NOTBM-NEXT: subb {{[0-9]+}}(%esp), %cl
; X86-BMI1NOTBM-NEXT: movl $-1, %esi		; X86-BMI1NOTBM-NEXT: movl $-1, %esi
; X86-BMI1NOTBM-NEXT: movl $-1, %edi		; X86-BMI1NOTBM-NEXT: movl $-1, %edi
; X86-BMI1NOTBM-NEXT: shrl %cl, %edi		; X86-BMI1NOTBM-NEXT: shrl %cl, %edi
; X86-BMI1NOTBM-NEXT: shrdl %cl, %esi, %esi
; X86-BMI1NOTBM-NEXT: testb $32, %cl		; X86-BMI1NOTBM-NEXT: testb $32, %cl
; X86-BMI1NOTBM-NEXT: je .LBB38_2		; X86-BMI1NOTBM-NEXT: je .LBB38_2
; X86-BMI1NOTBM-NEXT: # %bb.1:		; X86-BMI1NOTBM-NEXT: # %bb.1:
; X86-BMI1NOTBM-NEXT: movl %edi, %esi		; X86-BMI1NOTBM-NEXT: movl %edi, %esi
; X86-BMI1NOTBM-NEXT: xorl %edi, %edi		; X86-BMI1NOTBM-NEXT: xorl %edi, %edi
; X86-BMI1NOTBM-NEXT: .LBB38_2:		; X86-BMI1NOTBM-NEXT: .LBB38_2:
; X86-BMI1NOTBM-NEXT: subl $8, %esp		; X86-BMI1NOTBM-NEXT: subl $8, %esp
; X86-BMI1NOTBM-NEXT: pushl %edi		; X86-BMI1NOTBM-NEXT: pushl %edi
; X86-BMI1NOTBM-NEXT: popl %edi		; X86-BMI1NOTBM-NEXT: popl %edi
; X86-BMI1NOTBM-NEXT: retl		; X86-BMI1NOTBM-NEXT: retl
;		;
; X86-BMI1BMI2-LABEL: bzhi64_c4_commutative:		; X86-BMI1BMI2-LABEL: bzhi64_c4_commutative:
; X86-BMI1BMI2: # %bb.0:		; X86-BMI1BMI2: # %bb.0:
; X86-BMI1BMI2-NEXT: pushl %edi		; X86-BMI1BMI2-NEXT: pushl %edi
; X86-BMI1BMI2-NEXT: pushl %esi		; X86-BMI1BMI2-NEXT: pushl %esi
; X86-BMI1BMI2-NEXT: pushl %eax		; X86-BMI1BMI2-NEXT: pushl %eax
; X86-BMI1BMI2-NEXT: movb $64, %cl		; X86-BMI1BMI2-NEXT: movb $64, %al
; X86-BMI1BMI2-NEXT: subb {{[0-9]+}}(%esp), %cl		; X86-BMI1BMI2-NEXT: subb {{[0-9]+}}(%esp), %al
; X86-BMI1BMI2-NEXT: movl $-1, %esi		; X86-BMI1BMI2-NEXT: movl $-1, %edi
; X86-BMI1BMI2-NEXT: shrxl %ecx, %esi, %edi		; X86-BMI1BMI2-NEXT: shrxl %eax, %edi, %esi
; X86-BMI1BMI2-NEXT: shrdl %cl, %esi, %esi		; X86-BMI1BMI2-NEXT: testb $32, %al
; X86-BMI1BMI2-NEXT: testb $32, %cl
; X86-BMI1BMI2-NEXT: je .LBB38_2		; X86-BMI1BMI2-NEXT: je .LBB38_2
; X86-BMI1BMI2-NEXT: # %bb.1:		; X86-BMI1BMI2-NEXT: # %bb.1:
; X86-BMI1BMI2-NEXT: movl %edi, %esi		; X86-BMI1BMI2-NEXT: movl %esi, %edi
; X86-BMI1BMI2-NEXT: xorl %edi, %edi		; X86-BMI1BMI2-NEXT: xorl %esi, %esi
; X86-BMI1BMI2-NEXT: .LBB38_2:		; X86-BMI1BMI2-NEXT: .LBB38_2:
; X86-BMI1BMI2-NEXT: subl $8, %esp		; X86-BMI1BMI2-NEXT: subl $8, %esp
; X86-BMI1BMI2-NEXT: pushl %edi
; X86-BMI1BMI2-NEXT: pushl %esi		; X86-BMI1BMI2-NEXT: pushl %esi
		; X86-BMI1BMI2-NEXT: pushl %edi
; X86-BMI1BMI2-NEXT: calll use64		; X86-BMI1BMI2-NEXT: calll use64
; X86-BMI1BMI2-NEXT: addl $16, %esp		; X86-BMI1BMI2-NEXT: addl $16, %esp
; X86-BMI1BMI2-NEXT: andl {{[0-9]+}}(%esp), %esi
; X86-BMI1BMI2-NEXT: andl {{[0-9]+}}(%esp), %edi		; X86-BMI1BMI2-NEXT: andl {{[0-9]+}}(%esp), %edi
; X86-BMI1BMI2-NEXT: movl %esi, %eax		; X86-BMI1BMI2-NEXT: andl {{[0-9]+}}(%esp), %esi
; X86-BMI1BMI2-NEXT: movl %edi, %edx		; X86-BMI1BMI2-NEXT: movl %edi, %eax
		; X86-BMI1BMI2-NEXT: movl %esi, %edx
; X86-BMI1BMI2-NEXT: addl $4, %esp		; X86-BMI1BMI2-NEXT: addl $4, %esp
; X86-BMI1BMI2-NEXT: popl %esi		; X86-BMI1BMI2-NEXT: popl %esi
; X86-BMI1BMI2-NEXT: popl %edi		; X86-BMI1BMI2-NEXT: popl %edi
; X86-BMI1BMI2-NEXT: retl		; X86-BMI1BMI2-NEXT: retl
;		;
; X64-NOBMI-LABEL: bzhi64_c4_commutative:		; X64-NOBMI-LABEL: bzhi64_c4_commutative:
; X64-NOBMI: # %bb.0:		; X64-NOBMI: # %bb.0:
; X64-NOBMI-NEXT: pushq %r14		; X64-NOBMI-NEXT: pushq %r14
[61 lines not shown]
; 64-bit, but with 32-bit output		; 64-bit, but with 32-bit output

; Everything done in 64-bit, truncation happens last.		; Everything done in 64-bit, truncation happens last.
define i32 @bzhi64_32_c0(i64 %val, i64 %numlowbits) nounwind {		define i32 @bzhi64_32_c0(i64 %val, i64 %numlowbits) nounwind {
; X86-NOBMI-LABEL: bzhi64_32_c0:		; X86-NOBMI-LABEL: bzhi64_32_c0:
; X86-NOBMI: # %bb.0:		; X86-NOBMI: # %bb.0:
; X86-NOBMI-NEXT: movb $64, %cl		; X86-NOBMI-NEXT: movb $64, %cl
; X86-NOBMI-NEXT: subb {{[0-9]+}}(%esp), %cl		; X86-NOBMI-NEXT: subb {{[0-9]+}}(%esp), %cl
; X86-NOBMI-NEXT: movl $-1, %edx
; X86-NOBMI-NEXT: movl $-1, %eax		; X86-NOBMI-NEXT: movl $-1, %eax
; X86-NOBMI-NEXT: shrl %cl, %eax		; X86-NOBMI-NEXT: shrl %cl, %eax
; X86-NOBMI-NEXT: shrdl %cl, %edx, %edx
; X86-NOBMI-NEXT: testb $32, %cl		; X86-NOBMI-NEXT: testb $32, %cl
; X86-NOBMI-NEXT: jne .LBB39_2		; X86-NOBMI-NEXT: jne .LBB39_2
; X86-NOBMI-NEXT: # %bb.1:		; X86-NOBMI-NEXT: # %bb.1:
; X86-NOBMI-NEXT: movl %edx, %eax		; X86-NOBMI-NEXT: movl $-1, %eax
; X86-NOBMI-NEXT: .LBB39_2:		; X86-NOBMI-NEXT: .LBB39_2:
; X86-NOBMI-NEXT: andl {{[0-9]+}}(%esp), %eax		; X86-NOBMI-NEXT: andl {{[0-9]+}}(%esp), %eax
; X86-NOBMI-NEXT: retl		; X86-NOBMI-NEXT: retl
;		;
; X86-BMI1NOTBM-LABEL: bzhi64_32_c0:		; X86-BMI1NOTBM-LABEL: bzhi64_32_c0:
; X86-BMI1NOTBM: # %bb.0:		; X86-BMI1NOTBM: # %bb.0:
; X86-BMI1NOTBM-NEXT: movb $64, %cl		; X86-BMI1NOTBM-NEXT: movb $64, %cl
; X86-BMI1NOTBM-NEXT: subb {{[0-9]+}}(%esp), %cl		; X86-BMI1NOTBM-NEXT: subb {{[0-9]+}}(%esp), %cl
; X86-BMI1NOTBM-NEXT: movl $-1, %edx
; X86-BMI1NOTBM-NEXT: movl $-1, %eax		; X86-BMI1NOTBM-NEXT: movl $-1, %eax
; X86-BMI1NOTBM-NEXT: shrl %cl, %eax		; X86-BMI1NOTBM-NEXT: shrl %cl, %eax
; X86-BMI1NOTBM-NEXT: shrdl %cl, %edx, %edx
; X86-BMI1NOTBM-NEXT: testb $32, %cl		; X86-BMI1NOTBM-NEXT: testb $32, %cl
; X86-BMI1NOTBM-NEXT: jne .LBB39_2		; X86-BMI1NOTBM-NEXT: jne .LBB39_2
; X86-BMI1NOTBM-NEXT: # %bb.1:		; X86-BMI1NOTBM-NEXT: # %bb.1:
; X86-BMI1NOTBM-NEXT: movl %edx, %eax		; X86-BMI1NOTBM-NEXT: movl $-1, %eax
; X86-BMI1NOTBM-NEXT: .LBB39_2:		; X86-BMI1NOTBM-NEXT: .LBB39_2:
; X86-BMI1NOTBM-NEXT: andl {{[0-9]+}}(%esp), %eax		; X86-BMI1NOTBM-NEXT: andl {{[0-9]+}}(%esp), %eax
; X86-BMI1NOTBM-NEXT: retl		; X86-BMI1NOTBM-NEXT: retl
;		;
; X86-BMI1BMI2-LABEL: bzhi64_32_c0:		; X86-BMI1BMI2-LABEL: bzhi64_32_c0:
; X86-BMI1BMI2: # %bb.0:		; X86-BMI1BMI2: # %bb.0:
; X86-BMI1BMI2-NEXT: movb $64, %cl		; X86-BMI1BMI2-NEXT: movb $64, %cl
; X86-BMI1BMI2-NEXT: subb {{[0-9]+}}(%esp), %cl		; X86-BMI1BMI2-NEXT: subb {{[0-9]+}}(%esp), %cl
; X86-BMI1BMI2-NEXT: movl $-1, %edx
; X86-BMI1BMI2-NEXT: movl $-1, %eax		; X86-BMI1BMI2-NEXT: movl $-1, %eax
; X86-BMI1BMI2-NEXT: shrdl %cl, %eax, %eax
; X86-BMI1BMI2-NEXT: testb $32, %cl		; X86-BMI1BMI2-NEXT: testb $32, %cl
; X86-BMI1BMI2-NEXT: je .LBB39_2		; X86-BMI1BMI2-NEXT: je .LBB39_2
; X86-BMI1BMI2-NEXT: # %bb.1:		; X86-BMI1BMI2-NEXT: # %bb.1:
; X86-BMI1BMI2-NEXT: shrxl %ecx, %edx, %eax		; X86-BMI1BMI2-NEXT: shrxl %ecx, %eax, %eax
; X86-BMI1BMI2-NEXT: .LBB39_2:		; X86-BMI1BMI2-NEXT: .LBB39_2:
; X86-BMI1BMI2-NEXT: andl {{[0-9]+}}(%esp), %eax		; X86-BMI1BMI2-NEXT: andl {{[0-9]+}}(%esp), %eax
; X86-BMI1BMI2-NEXT: retl		; X86-BMI1BMI2-NEXT: retl
;		;
; X64-NOBMI-LABEL: bzhi64_32_c0:		; X64-NOBMI-LABEL: bzhi64_32_c0:
; X64-NOBMI: # %bb.0:		; X64-NOBMI: # %bb.0:
; X64-NOBMI-NEXT: movq %rsi, %rcx		; X64-NOBMI-NEXT: movq %rsi, %rcx
; X64-NOBMI-NEXT: negb %cl		; X64-NOBMI-NEXT: negb %cl
[1,413 lines not shown]

llvm/test/CodeGen/X86/fshl.ll

[581 lines not shown]
; X64-NEXT: retq
%ld1 = load i32, i32 *%p1		%ld1 = load i32, i32 *%p1
%res = call i32 @llvm.fshl.i32(i32 %ld1, i32 %ld0, i32 8)		%res = call i32 @llvm.fshl.i32(i32 %ld1, i32 %ld0, i32 8)
ret i32 %res		ret i32 %res
}		}

define i64 @combine_fshl_load_i64(i64* %p) nounwind {		define i64 @combine_fshl_load_i64(i64* %p) nounwind {
; X86-FAST-LABEL: combine_fshl_load_i64:		; X86-FAST-LABEL: combine_fshl_load_i64:
; X86-FAST: # %bb.0:		; X86-FAST: # %bb.0:
; X86-FAST-NEXT: pushl %esi
; X86-FAST-NEXT: movl {{[0-9]+}}(%esp), %ecx		; X86-FAST-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-FAST-NEXT: movl 12(%ecx), %eax		; X86-FAST-NEXT: movl 13(%ecx), %eax
; X86-FAST-NEXT: movl 16(%ecx), %esi		; X86-FAST-NEXT: movl 17(%ecx), %edx
; X86-FAST-NEXT: movl 20(%ecx), %edx
; X86-FAST-NEXT: shldl $24, %esi, %edx
; X86-FAST-NEXT: shrdl $8, %esi, %eax
; X86-FAST-NEXT: popl %esi
; X86-FAST-NEXT: retl		; X86-FAST-NEXT: retl
;		;
; X86-SLOW-LABEL: combine_fshl_load_i64:		; X86-SLOW-LABEL: combine_fshl_load_i64:
; X86-SLOW: # %bb.0:		; X86-SLOW: # %bb.0:
; X86-SLOW-NEXT: pushl %esi		; X86-SLOW-NEXT: pushl %esi
; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-SLOW-NEXT: movl 20(%eax), %edx		; X86-SLOW-NEXT: movl 20(%eax), %edx
; X86-SLOW-NEXT: movl 12(%eax), %ecx		; X86-SLOW-NEXT: movl 12(%eax), %ecx
[39 lines not shown]

llvm/test/CodeGen/X86/fshr.ll

[576 lines not shown]
; X64-NEXT: retq
%ld1 = load i32, i32 *%p1		%ld1 = load i32, i32 *%p1
%res = call i32 @llvm.fshr.i32(i32 %ld1, i32 %ld0, i32 8)		%res = call i32 @llvm.fshr.i32(i32 %ld1, i32 %ld0, i32 8)
ret i32 %res		ret i32 %res
}		}

define i64 @combine_fshr_load_i64(i64* %p) nounwind {		define i64 @combine_fshr_load_i64(i64* %p) nounwind {
; X86-FAST-LABEL: combine_fshr_load_i64:		; X86-FAST-LABEL: combine_fshr_load_i64:
; X86-FAST: # %bb.0:		; X86-FAST: # %bb.0:
; X86-FAST-NEXT: pushl %esi		; X86-FAST-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-FAST-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-FAST-NEXT: movl 11(%ecx), %eax
; X86-FAST-NEXT: movzbl 11(%eax), %ecx		; X86-FAST-NEXT: movl 15(%ecx), %edx
; X86-FAST-NEXT: movl 12(%eax), %esi
; X86-FAST-NEXT: movl 16(%eax), %edx
; X86-FAST-NEXT: shldl $8, %esi, %edx
; X86-FAST-NEXT: movl %esi, %eax
; X86-FAST-NEXT: shll $8, %eax
; X86-FAST-NEXT: orl %ecx, %eax
; X86-FAST-NEXT: popl %esi
; X86-FAST-NEXT: retl		; X86-FAST-NEXT: retl
;		;
; X86-SLOW-LABEL: combine_fshr_load_i64:		; X86-SLOW-LABEL: combine_fshr_load_i64:
; X86-SLOW: # %bb.0:		; X86-SLOW: # %bb.0:
; X86-SLOW-NEXT: pushl %esi		; X86-SLOW-NEXT: pushl %esi
; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-SLOW-NEXT: movzbl 11(%eax), %ecx		; X86-SLOW-NEXT: movzbl 11(%eax), %ecx
; X86-SLOW-NEXT: movl 12(%eax), %esi		; X86-SLOW-NEXT: movl 12(%eax), %esi
[38 lines not shown]

llvm/test/CodeGen/X86/shift-combine.ll

[284 lines not shown]
; X64-NEXT: retq
ret i64 %conv1		ret i64 %conv1
}		}

define i64 @ashr_add_shl_mismatch_shifts2(i64 %r) nounwind {		define i64 @ashr_add_shl_mismatch_shifts2(i64 %r) nounwind {
; X32-LABEL: ashr_add_shl_mismatch_shifts2:		; X32-LABEL: ashr_add_shl_mismatch_shifts2:
; X32: # %bb.0:		; X32: # %bb.0:
; X32-NEXT: movl {{[0-9]+}}(%esp), %eax		; X32-NEXT: movl {{[0-9]+}}(%esp), %eax
; X32-NEXT: movl {{[0-9]+}}(%esp), %edx		; X32-NEXT: movl {{[0-9]+}}(%esp), %edx
; X32-NEXT: shrdl $8, %edx, %eax
; X32-NEXT: shrl $8, %edx		; X32-NEXT: shrl $8, %edx
; X32-NEXT: incl %edx		; X32-NEXT: incl %edx
; X32-NEXT: shrdl $8, %edx, %eax		; X32-NEXT: shrdl $8, %edx, %eax
; X32-NEXT: shrl $8, %edx		; X32-NEXT: shrl $8, %edx
; X32-NEXT: retl		; X32-NEXT: retl
;		;
; X64-LABEL: ashr_add_shl_mismatch_shifts2:		; X64-LABEL: ashr_add_shl_mismatch_shifts2:
; X64: # %bb.0:		; X64: # %bb.0:
[125 lines not shown]

llvm/test/CodeGen/X86/shift-parts.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc -mtriple=x86_64-- < %s | FileCheck %s			; RUN: llc -mtriple=x86_64-- < %s | FileCheck %s
	; PR4736			; PR4736

	%0 = type { i32, i8, [35 x i8] }			%0 = type { i32, i8, [35 x i8] }

	@g_144 = external global %0, align 8 ; <%0*> [#uses=1]			@g_144 = external global %0, align 8 ; <%0*> [#uses=1]

	define i32 @int87(i32 %uint64p_8, i1 %cond) nounwind {			define i32 @int87(i32 %uint64p_8, i1 %cond) nounwind {
	; CHECK-LABEL: int87:			; CHECK-LABEL: int87:
	; CHECK: # %bb.0: # %entry			; CHECK: # %bb.0: # %entry
	; CHECK-NEXT: movq g_144+{{.*}}(%rip), %rax			; CHECK-NEXT: movq g_144+{{.*}}(%rip), %rax
	; CHECK-NEXT: movq g_144+{{.*}}(%rip), %rdx			; CHECK-NEXT: movq g_144+{{.*}}(%rip), %rcx
	; CHECK-NEXT: movzbl %sil, %ecx			; CHECK-NEXT: movzbl %sil, %edx
	; CHECK-NEXT: shll $6, %ecx			; CHECK-NEXT: shll $6, %edx
	; CHECK-NEXT: .p2align 4, 0x90			; CHECK-NEXT: .p2align 4, 0x90
	; CHECK-NEXT: .LBB0_1: # %for.cond			; CHECK-NEXT: .LBB0_1: # %for.cond
	; CHECK-NEXT: # =>This Inner Loop Header: Depth=1			; CHECK-NEXT: # =>This Inner Loop Header: Depth=1
	; CHECK-NEXT: movq %rdx, %rsi			; CHECK-NEXT: testb $64, %dl
	; CHECK-NEXT: shrdq %cl, %rax, %rsi			; CHECK-NEXT: movq %rcx, %rsi
	; CHECK-NEXT: testb $64, %cl
	; CHECK-NEXT: cmovneq %rax, %rsi			; CHECK-NEXT: cmovneq %rax, %rsi
	; CHECK-NEXT: orl $0, %esi			; CHECK-NEXT: orl $0, %esi
	; CHECK-NEXT: je .LBB0_1			; CHECK-NEXT: je .LBB0_1
	; CHECK-NEXT: # %bb.2: # %if.then			; CHECK-NEXT: # %bb.2: # %if.then
	; CHECK-NEXT: movl $1, %eax			; CHECK-NEXT: movl $1, %eax
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	entry:			entry:
	%srcval4 = load i320, i320* bitcast (%0* @g_144 to i320*), align 8 ; <i320> [#uses=1]			%srcval4 = load i320, i320* bitcast (%0* @g_144 to i320*), align 8 ; <i320> [#uses=1]
	[12 lines not shown]

Revision Contents

Diff 248730

llvm/lib/Target/X86/X86ISelLowering.h

llvm/lib/Target/X86/X86ISelLowering.cpp

llvm/lib/Target/X86/X86InstrCompiler.td

llvm/lib/Target/X86/X86InstrInfo.td

llvm/lib/Target/X86/X86InstrShiftRotate.td

llvm/test/CodeGen/X86/clear-highbits.ll

llvm/test/CodeGen/X86/clear-lowbits.ll

llvm/test/CodeGen/X86/extract-bits.ll

llvm/test/CodeGen/X86/extract-lowbits.ll

llvm/test/CodeGen/X86/fshl.ll

llvm/test/CodeGen/X86/fshr.ll

llvm/test/CodeGen/X86/shift-combine.ll

llvm/test/CodeGen/X86/shift-parts.ll
