This is an archive of the discontinued LLVM Phabricator instance.

Improve support for i386 and i486 CPUs.
Accepted, Public

Authored by jyknight on Apr 5 2016, 12:18 PM.

Details

Reviewers

Summary

As far as "normal" instructions go, only a few are present in the 586
but not in the earlier processors: the 486 doesn't have CMPXCHG8B, and
386 is additionally missing BSWAP, CMPXCHG, and XADD.

Previously, llvm would emit these instructions even if you asked to
target a 386.

XADD and CMPXCHG are used only for atomics; we now ask AtomicExpandPass
to deal with expanding to libcalls on these CPUs, so it's trivial to
avoid emitting those.
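As a rough illustration of the atomics decision described above, the sketch below picks the widest inline atomic per CPU level and reports when a library call would be needed. The names `X86Level`, `maxInlineAtomicBits`, and `needsLibcall` are hypothetical, and the value 0 for a plain 386 is an assumption made for the example; it is not taken from the diff.

```cpp
#include <cassert>

// Hypothetical ISA levels; these names are illustrative, not LLVM's.
enum class X86Level { I386, I486, I586 };

// Widest atomic operation (in bits) that can be emitted inline.
constexpr unsigned maxInlineAtomicBits(X86Level L) {
  return L == X86Level::I586 ? 64u   // CMPXCHG8B is available
       : L == X86Level::I486 ? 32u   // CMPXCHG and XADD are available
       : 0u;                         // assumption: nothing inline on a 386
}

// True when an atomic of the given width would have to become a libcall.
constexpr bool needsLibcall(X86Level L, unsigned Bits) {
  return Bits > maxInlineAtomicBits(L);
}

// Small self-check; call it from main() to exercise the sketch.
inline void checkAtomicWidths() {
  assert(!needsLibcall(X86Level::I586, 64));
  assert(needsLibcall(X86Level::I486, 64));
  assert(needsLibcall(X86Level::I386, 32));
}
```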

BSWAP, then, is all that remains to be dealt with. It gets a custom
expansion, because a sequence of three ROR instructions is better than
the default expansion for bswap (which can't take advantage of the
partial register updating on x86).
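The three-rotate sequence can be modelled in plain C++ to see why it yields a byte swap. This is only a sketch of the arithmetic, not the code in the patch; the helper names are made up for the example.

```cpp
#include <cassert>
#include <cstdint>

// Rotate the low 16 bits right by 8 and leave the upper 16 bits untouched.
// This models a 16-bit ROR on the low sub-register of a 32-bit register.
constexpr std::uint32_t rorLow16By8(std::uint32_t X) {
  return (X & 0xFFFF0000u) | ((X & 0x0000FF00u) >> 8) | ((X & 0x000000FFu) << 8);
}

// Rotate all 32 bits right by 16, which swaps the two 16-bit halves.
constexpr std::uint32_t ror32By16(std::uint32_t X) {
  return (X >> 16) | (X << 16);
}

// ROR16 by 8, ROR32 by 16, ROR16 by 8: the combined effect is a 32-bit bswap.
constexpr std::uint32_t bswap32ViaRotates(std::uint32_t X) {
  return rorLow16By8(ror32By16(rorLow16By8(X)));
}

static_assert(bswap32ViaRotates(0x11223344u) == 0x44332211u,
              "three rotates reverse the byte order");

// Small runtime self-check; call it from main() to exercise the sketch.
inline void checkBswap() {
  assert(bswap32ViaRotates(0x000000FFu) == 0xFF000000u);
}
```

Tracing 0x11223344 through the steps gives 0x11224433, then 0x44331122, then 0x44332211.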

Depends on D18201

Diff Detail

Event Timeline

jyknight updated this revision to Diff 52723. Apr 5 2016, 12:18 PM

jyknight retitled this revision to "Improve support for i386 and i486 CPUs."

jyknight updated this object.

jyknight added a parent revision: D18201: Switch over targets to use AtomicExpandPass, and clean up target atomics code..

jyknight added a subscriber: llvm-commits.

For BSWAP: would it work to do the custom lowering at ISel time instead of using a pseudo? I think it's possible to even write a pattern.

lib/Target/X86/X86.td
279	Should these features be on the remaining models? It'd probably be time to do ProcessorFeatures lists like SNB.
lib/Target/X86/X86InstrCompiler.td
1964	Make it obvious that it's a pseudo in the asmstring?
lib/Target/X86/X86InstrInfo.td
771	: alignment
test/CodeGen/X86/bswap.ll
25	Newline between the checks? And/or shorten CHECK386 -> I386?

RKSimon added a subscriber: RKSimon. Apr 5 2016, 2:27 PM

RKSimon added inline comments.

lib/Target/X86/X86ISelLowering.cpp
96	Subtarget.has486Insns() and Subtarget.has586Insns() are quite vague about actual features. Would it not be clearer to instead add Subtarget.hasCMPXCHG8() and Subtarget.hasCMPXCHG() that then use Has586/Has486 internally? Same for BSWAP (Subtarget.hasBSWAP()) below.
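A minimal sketch of what such instruction-named accessors could look like, layered over the two coarse flags. `SubtargetSketch` is a hypothetical stand-in written for this example and is not the real X86Subtarget class; only the accessor names come from the comment above.

```cpp
#include <cassert>

// Hypothetical stand-in for the subtarget class; not the real X86Subtarget.
class SubtargetSketch {
  bool Has486Insns;
  bool Has586Insns;

public:
  // The 586 level includes everything the 486 added.
  constexpr SubtargetSketch(bool Has486, bool Has586)
      : Has486Insns(Has486 || Has586), Has586Insns(Has586) {}

  // Accessors named after the instruction they guard.
  constexpr bool hasBSWAP() const { return Has486Insns; }
  constexpr bool hasCMPXCHG() const { return Has486Insns; }
  constexpr bool hasCMPXCHG8() const { return Has586Insns; }
};

// Small self-check; call it from main() to exercise the sketch.
inline void checkAccessors() {
  assert(!SubtargetSketch(false, false).hasBSWAP());    // plain 386
  assert(SubtargetSketch(true, false).hasCMPXCHG());    // 486
  assert(!SubtargetSketch(true, false).hasCMPXCHG8());  // 486 lacks CMPXCHG8B
  assert(SubtargetSketch(false, true).hasBSWAP());      // 586 implies 486
}
```

Call sites then read as a question about one instruction rather than about a CPU generation.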

> For BSWAP: would it work to do the custom lowering at ISel time instead of using a pseudo? I think it's possible to even write a pattern.

I don't know, is it? You'd need to be able to express the operation "rotate bottom 16 bits of this register by 8 bits, in place, leaving the upper 16 bits as they were". It's not a normal 16-bit operation, which would let you allocate 16-bit inputs/output virtual registers, because the contents of the upper bits are important.
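To make the distinction concrete, the sketch below contrasts an ordinary 16-bit rotate, whose result has no upper half at all, with the in-place operation described above, where the upper 16 bits of the 32-bit value must pass through unchanged. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// A "normal" 16-bit rotate: input and output are 16-bit values only.
constexpr std::uint16_t ror16By8(std::uint16_t V) {
  return static_cast<std::uint16_t>((V >> 8) | (V << 8));
}

// The operation the expansion needs: rotate the low half in place and
// keep the upper 16 bits of the same 32-bit register.
constexpr std::uint32_t rorLow16InPlace(std::uint32_t Reg) {
  return (Reg & 0xFFFF0000u) |
         ror16By8(static_cast<std::uint16_t>(Reg & 0xFFFFu));
}

// Small self-check; call it from main() to exercise the sketch.
inline void checkInPlaceRotate() {
  assert(ror16By8(0xCCDD) == 0xDDCC);                  // upper half is gone
  assert(rorLow16InPlace(0xAABBCCDDu) == 0xAABBDDCCu); // upper half preserved
}
```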

lib/Target/X86/X86.td
279	I think it doesn't need to be, because I added it as implied by cmov, which everything else supports?
lib/Target/X86/X86ISelLowering.cpp
96	Sure, sounds fine.
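The "implied by cmov" reply relies on feature implication being transitive: cmov implies the i586 instructions, which in turn imply the i486 ones. A small sketch of that closure, using made-up bit constants rather than the actual TableGen definitions:

```cpp
#include <cassert>

// Hypothetical feature bits; the real definitions live in X86.td.
constexpr unsigned F486Insns = 1u << 0;
constexpr unsigned F586Insns = 1u << 1;
constexpr unsigned FCMov     = 1u << 2;

// Expand a feature set with everything it implies:
// cmov -> i586insns -> i486insns.
constexpr unsigned withImplied(unsigned Bits) {
  return Bits
       | ((Bits & FCMov) ? (F586Insns | F486Insns) : 0u)
       | ((Bits & F586Insns) ? F486Insns : 0u);
}

// Small self-check; call it from main() to exercise the sketch.
inline void checkImplication() {
  assert(withImplied(FCMov) == (FCMov | F586Insns | F486Insns));
  assert(withImplied(F586Insns) == (F586Insns | F486Insns));
  assert(withImplied(0u) == 0u);
}
```

Under this model, any processor entry that lists cmov picks up both instruction-set features without naming them.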

In D18802#393757, @jyknight wrote:

>> For BSWAP: would it work to do the custom lowering at ISel time instead of using a pseudo? I think it's possible to even write a pattern.

> I don't know, is it? You'd need to be able to express the operation "rotate bottom 16 bits of this register by 8 bits, in place, leaving the upper 16 bits as they were". It's not a normal 16-bit operation, which would let you allocate 16-bit inputs/output virtual registers, because the contents of the upper bits are important.

For the pattern: in my mind it was simple enough that you could duplicate the intermediate GR32 and let MachineCSE clean it up; but I tried writing it and it's actually pretty damn unwieldy:

def : Pat<(bswap GR32:$src),
    (INSERT_SUBREG
     (ROR32ri
      (INSERT_SUBREG (i32 GR32:$src),
       (ROR16ri (EXTRACT_SUBREG (i32 GR32:$src), sub_16bit), 8),
       sub_16bit),
      16),
     (ROR16ri
      (EXTRACT_SUBREG
       (ROR32ri
        (INSERT_SUBREG (i32 GR32:$src),
         (ROR16ri (EXTRACT_SUBREG (i32 GR32:$src), sub_16bit), 8),
         sub_16bit),
        16),
       sub_16bit),
      8),
     sub_16bit)
    >;

However, I think anything you can do in pseudo expansion, you could still do in ISelDAGToDAG, no?

Ideally you'd do it in SDAG, but there you're right, I don't think you can shoehorn it into the type system (not unless having machine nodes or something).

lib/Target/X86/X86.td
279	You're right, missed that, sorry!

echristo added a subscriber: echristo. Apr 8 2016, 1:11 PM

echristo added inline comments.

lib/Target/X86/X86ISelLowering.cpp
96	Couple of comments here: a) I definitely like the choice of better function naming for whether or not we have a particular cpu feature. b) I'm not sure whether or not we want to conditionalize the 486/586 insns on sets of subtarget features or a broad ISA level feature. Unlike most of the later instructions added there's very little "subsetting" that we can do for this, that said, it forms a contrast with the rest of the port where we do have subtarget features for everything not baseline pentium. c) That leads us to "hey, why not just set pentium as the baseline set of features and stop letting people select 386/486 as part of their compiles". Any thoughts here?

jyknight added inline comments. Apr 8 2016, 1:34 PM

lib/Target/X86/X86ISelLowering.cpp
96	My thoughts: I have no personal use for < 586 processors. I have no idea if anyone in the world actually needs it. Quite possibly they don't. (I really have no idea). However, it seems unfortunate that llvm/clang sort of pretend to support them, but don't actually. It would probably be better to return an error for -march=i[34]86, if we aren't going to support it. Although, doing that might break people's build setups, who decide to compile with -march=i386 for whatever reason, even though they don't actually mean to. The code to support the processors is almost trivial, so there's really little reason not to just do it. (That's the only reason I even sent this patch -- because it touches upon the atomics stuff I'm working on, that there were FIXMEs about this support, and it looked fairly trivial to fix)

kbsmith1 added a subscriber: kbsmith1. Apr 8 2016, 1:37 PM

Added inline comment.

lib/Target/X86/X86ISelLowering.cpp
96	> That leads us to "hey, why not just set pentium as the baseline set of features and stop letting people select 386/486 as part of their compiles". Any thoughts here?

Seems OK to me, with the exception that we definitely need to be able to specify Pentium ISA without x87 floating point, and that seems covered under FeatureX87. If this code is going to keep Feature586Insns and Feature486Insns, then their definitions in X86.td ought to have a comment explaining exactly which instructions each feature covers compared to the "baseline" 386.

Sure, let's go ahead and just do this then (do fix up anything AB mentioned before committing).

Thanks!

-eric

This revision is now accepted and ready to land. Apr 19 2016, 5:25 PM

Revision Contents

Path and size, in page order (several file names were not preserved in this archive):

lib/Target/X86/: 22 lines, 15 lines, 32 lines, 16 lines, 48 lines, 8 lines
test/CodeGen/X86/: 2010-10-08-cmpxchg8b.ll, 2 lines, 115 lines, 2 lines, 2 lines, 2 lines, 77 lines, cmpxchg-clobber-flags.ll, 4 lines, nocx16.ll, peephole-na-phys-copy-folding.ll, 2 lines
test/Transforms/AtomicExpand/X86/: expand-atomic-rmw-initial-load.ll, 2 lines

Diff 52723

lib/Target/X86/X86.td

	[28 lines not shown]

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// X86 Subtarget features			// X86 Subtarget features
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	def FeatureX87 : SubtargetFeature<"x87","HasX87", "true",			def FeatureX87 : SubtargetFeature<"x87","HasX87", "true",
	"Enable X87 float instructions">;			"Enable X87 float instructions">;

				def Feature486Insns : SubtargetFeature<"i486insns","Has486Insns", "true",
				"Enable i486 instructions">;

				def Feature586Insns : SubtargetFeature<"i586insns","Has586Insns", "true",
				"Enable i586 instructions",
				[Feature486Insns]>;

	def FeatureCMOV : SubtargetFeature<"cmov","HasCMov", "true",			def FeatureCMOV : SubtargetFeature<"cmov","HasCMov", "true",
	"Enable conditional move instructions">;			"Enable conditional move instructions",
				[Feature586Insns]>;

	def FeaturePOPCNT : SubtargetFeature<"popcnt", "HasPOPCNT", "true",			def FeaturePOPCNT : SubtargetFeature<"popcnt", "HasPOPCNT", "true",
	"Support POPCNT instruction">;			"Support POPCNT instruction">;

	def FeatureFXSR : SubtargetFeature<"fxsr", "HasFXSR", "true",			def FeatureFXSR : SubtargetFeature<"fxsr", "HasFXSR", "true",
	"Support fxsave/fxrestore instructions">;			"Support fxsave/fxrestore instructions">;

	def FeatureXSAVE : SubtargetFeature<"xsave", "HasXSAVE", "true",			def FeatureXSAVE : SubtargetFeature<"xsave", "HasXSAVE", "true",
	[212 lines not shown]
	def ProcIntelSLM : SubtargetFeature<"slm", "X86ProcFamily", "IntelSLM",			def ProcIntelSLM : SubtargetFeature<"slm", "X86ProcFamily", "IntelSLM",
	"Intel Silvermont processors">;			"Intel Silvermont processors">;

	class Proc<string Name, list<SubtargetFeature> Features>			class Proc<string Name, list<SubtargetFeature> Features>
	: ProcessorModel<Name, GenericModel, Features>;			: ProcessorModel<Name, GenericModel, Features>;

	def : Proc<"generic", [FeatureX87, FeatureSlowUAMem16]>;			def : Proc<"generic", [FeatureX87, FeatureSlowUAMem16]>;
	def : Proc<"i386", [FeatureX87, FeatureSlowUAMem16]>;			def : Proc<"i386", [FeatureX87, FeatureSlowUAMem16]>;
	def : Proc<"i486", [FeatureX87, FeatureSlowUAMem16]>;			def : Proc<"i486", [FeatureX87, FeatureSlowUAMem16, Feature486Insns]>;
	def : Proc<"i586", [FeatureX87, FeatureSlowUAMem16]>;			def : Proc<"i586", [FeatureX87, FeatureSlowUAMem16, Feature586Insns]>;
	def : Proc<"pentium", [FeatureX87, FeatureSlowUAMem16]>;			def : Proc<"pentium", [FeatureX87, FeatureSlowUAMem16, Feature586Insns]>;
	def : Proc<"pentium-mmx", [FeatureX87, FeatureSlowUAMem16, FeatureMMX]>;			def : Proc<"pentium-mmx", [FeatureX87, FeatureSlowUAMem16, FeatureMMX, Feature586Insns]>;
	def : Proc<"i686", [FeatureX87, FeatureSlowUAMem16]>;			def : Proc<"i686", [FeatureX87, FeatureSlowUAMem16, Feature586Insns]>;
	def : Proc<"pentiumpro", [FeatureX87, FeatureSlowUAMem16, FeatureCMOV]>;			def : Proc<"pentiumpro", [FeatureX87, FeatureSlowUAMem16, FeatureCMOV]>;
	def : Proc<"pentium2", [FeatureX87, FeatureSlowUAMem16, FeatureMMX,			def : Proc<"pentium2", [FeatureX87, FeatureSlowUAMem16, FeatureMMX,
	FeatureCMOV, FeatureFXSR]>;			FeatureCMOV, FeatureFXSR]>;
	def : Proc<"pentium3", [FeatureX87, FeatureSlowUAMem16, FeatureMMX,			def : Proc<"pentium3", [FeatureX87, FeatureSlowUAMem16, FeatureMMX,
	FeatureSSE1, FeatureFXSR]>;			FeatureSSE1, FeatureFXSR]>;
	def : Proc<"pentium3m", [FeatureX87, FeatureSlowUAMem16, FeatureMMX,			def : Proc<"pentium3m", [FeatureX87, FeatureSlowUAMem16, FeatureMMX,
	FeatureSSE1, FeatureFXSR, FeatureSlowBTMem]>;			FeatureSSE1, FeatureFXSR, FeatureSlowBTMem]>;
	def : Proc<"pentium-m", [FeatureX87, FeatureSlowUAMem16, FeatureMMX,			def : Proc<"pentium-m", [FeatureX87, FeatureSlowUAMem16, FeatureMMX,
	FeatureSSE2, FeatureFXSR, FeatureSlowBTMem]>;			FeatureSSE2, FeatureFXSR, FeatureSlowBTMem]>;
	def : Proc<"pentium4", [FeatureX87, FeatureSlowUAMem16, FeatureMMX,			def : Proc<"pentium4", [FeatureX87, FeatureSlowUAMem16, FeatureMMX,
	FeatureSSE2, FeatureFXSR]>;			FeatureSSE2, FeatureFXSR]>;
	def : Proc<"pentium4m", [FeatureX87, FeatureSlowUAMem16, FeatureMMX,			def : Proc<"pentium4m", [FeatureX87, FeatureSlowUAMem16, FeatureMMX,
	FeatureSSE2, FeatureFXSR, FeatureSlowBTMem]>;			FeatureSSE2, FeatureFXSR, FeatureSlowBTMem]>;

	// Intel Quark.			// Intel Quark.
	def : Proc<"lakemont", []>;			def : Proc<"lakemont", [Feature586Insns]>;

	// Intel Core Duo.			// Intel Core Duo.
	def : ProcessorModel<"yonah", SandyBridgeModel,			def : ProcessorModel<"yonah", SandyBridgeModel,
	[FeatureX87, FeatureSlowUAMem16, FeatureMMX, FeatureSSE3,			[FeatureX87, FeatureSlowUAMem16, FeatureMMX, FeatureSSE3,
	FeatureFXSR, FeatureSlowBTMem]>;			FeatureFXSR, FeatureSlowBTMem]>;

	// NetBurst.			// NetBurst.
	def : Proc<"prescott",			def : Proc<"prescott",
	[516 lines not shown]

lib/Target/X86/X86ISelLowering.cpp

[84 lines not shown]	X86TargetLowering::X86TargetLowering(const X86TargetMachine &TM,
setBooleanVectorContents(ZeroOrNegativeOneBooleanContent);		setBooleanVectorContents(ZeroOrNegativeOneBooleanContent);

if (Subtarget.is64Bit()) {		if (Subtarget.is64Bit()) {
if (Subtarget.hasCmpxchg16b())		if (Subtarget.hasCmpxchg16b())
setMaxAtomicSizeInBitsSupported(128);		setMaxAtomicSizeInBitsSupported(128);
else		else
setMaxAtomicSizeInBitsSupported(64);		setMaxAtomicSizeInBitsSupported(64);
} else {		} else {
// FIXME: Check that we actually have cmpxchg (i486 or later)		if (Subtarget.has586Insns())
// FIXME: Check that we actually have cmpxchg8b (i586 or later)		setMaxAtomicSizeInBitsSupported(64); // has cmpxchg8b
setMaxAtomicSizeInBitsSupported(64);		else if (Subtarget.has486Insns())
		setMaxAtomicSizeInBitsSupported(32); // has cmpxchg
}		}

// For 64-bit, since we have so many registers, use the ILP scheduler.		// For 64-bit, since we have so many registers, use the ILP scheduler.
// For 32-bit, use the register pressure specific scheduling.		// For 32-bit, use the register pressure specific scheduling.
// For Atom, always use ILP scheduling.		// For Atom, always use ILP scheduling.
if (Subtarget.isAtom())		if (Subtarget.isAtom())
setSchedulingPreference(Sched::ILP);		setSchedulingPreference(Sched::ILP);
else if (Subtarget.is64Bit())		else if (Subtarget.is64Bit())
[29,742 lines not shown]	if (std::count(AsmPieces.begin(), AsmPieces.end(), "~{cc}") &&
else if (std::count(AsmPieces.begin(), AsmPieces.end(), "~{dirflag}"))		else if (std::count(AsmPieces.begin(), AsmPieces.end(), "~{dirflag}"))
return true;		return true;
}		}
}		}
return false;		return false;
}		}

bool X86TargetLowering::ExpandInlineAsm(CallInst *CI) const {		bool X86TargetLowering::ExpandInlineAsm(CallInst *CI) const {
		// If we don't have bswap available, don't do these transforms.
		if (!Subtarget.has486Insns())
		return false;

InlineAsm *IA = cast<InlineAsm>(CI->getCalledValue());		InlineAsm *IA = cast<InlineAsm>(CI->getCalledValue());

std::string AsmStr = IA->getAsmString();		std::string AsmStr = IA->getAsmString();

IntegerType *Ty = dyn_cast<IntegerType>(CI->getType());		IntegerType *Ty = dyn_cast<IntegerType>(CI->getType());
if (!Ty || Ty->getBitWidth() % 16 != 0)		if (!Ty || Ty->getBitWidth() % 16 != 0)
return false;		return false;

// TODO: should remove alternatives from the asmstring: "foo {a|b}" -> "foo a"		// TODO: should remove alternatives from the asmstring: "foo {a|b}" -> "foo a"
SmallVector<StringRef, 4> AsmPieces;		SmallVector<StringRef, 4> AsmPieces;
SplitString(AsmStr, AsmPieces, ";\n");		SplitString(AsmStr, AsmPieces, ";\n");

switch (AsmPieces.size()) {		switch (AsmPieces.size()) {
default: return false;		default: return false;
case 1:		case 1:
// FIXME: this should verify that we are targeting a 486 or better. If not,
// we will turn this bswap into something that will be lowered to logical
// ops instead of emitting the bswap asm. For now, we don't support 486 or
// lower so don't worry about this.
// bswap $0		// bswap $0
if (matchAsm(AsmPieces[0], {"bswap", "$0"}) ||		if (matchAsm(AsmPieces[0], {"bswap", "$0"}) ||
matchAsm(AsmPieces[0], {"bswapl", "$0"}) ||		matchAsm(AsmPieces[0], {"bswapl", "$0"}) ||
matchAsm(AsmPieces[0], {"bswapq", "$0"}) ||		matchAsm(AsmPieces[0], {"bswapq", "$0"}) ||
matchAsm(AsmPieces[0], {"bswap", "${0:q}"}) ||		matchAsm(AsmPieces[0], {"bswap", "${0:q}"}) ||
matchAsm(AsmPieces[0], {"bswapl", "${0:q}"}) ||		matchAsm(AsmPieces[0], {"bswapl", "${0:q}"}) ||
matchAsm(AsmPieces[0], {"bswapq", "${0:q}"})) {		matchAsm(AsmPieces[0], {"bswapq", "${0:q}"})) {
// No need to check constraints, nothing other than the equivalent of		// No need to check constraints, nothing other than the equivalent of
[678 lines not shown]

lib/Target/X86/X86InstrCompiler.td

[719 lines not shown]

multiclass LCMPXCHG_BinOp<bits<8> Opc8, bits<8> Opc, Format Form,		multiclass LCMPXCHG_BinOp<bits<8> Opc8, bits<8> Opc, Format Form,
string mnemonic, SDPatternOperator frag,		string mnemonic, SDPatternOperator frag,
InstrItinClass itin8, InstrItinClass itin> {		InstrItinClass itin8, InstrItinClass itin> {
let isCodeGenOnly = 1, SchedRW = [WriteALULd, WriteRMW] in {		let isCodeGenOnly = 1, SchedRW = [WriteALULd, WriteRMW] in {
let Defs = [AL, EFLAGS], Uses = [AL] in		let Defs = [AL, EFLAGS], Uses = [AL] in
def NAME#8 : I<Opc8, Form, (outs), (ins i8mem:$ptr, GR8:$swap),		def NAME#8 : I<Opc8, Form, (outs), (ins i8mem:$ptr, GR8:$swap),
!strconcat(mnemonic, "{b}\t{$swap, $ptr|$ptr, $swap}"),		!strconcat(mnemonic, "{b}\t{$swap, $ptr|$ptr, $swap}"),
[(frag addr:$ptr, GR8:$swap, 1)], itin8>, TB, LOCK;		[(frag addr:$ptr, GR8:$swap, 1)], itin8>, TB, LOCK,
		Requires<[Has486Insns]>;
let Defs = [AX, EFLAGS], Uses = [AX] in		let Defs = [AX, EFLAGS], Uses = [AX] in
def NAME#16 : I<Opc, Form, (outs), (ins i16mem:$ptr, GR16:$swap),		def NAME#16 : I<Opc, Form, (outs), (ins i16mem:$ptr, GR16:$swap),
!strconcat(mnemonic, "{w}\t{$swap, $ptr|$ptr, $swap}"),		!strconcat(mnemonic, "{w}\t{$swap, $ptr|$ptr, $swap}"),
[(frag addr:$ptr, GR16:$swap, 2)], itin>, TB, OpSize16, LOCK;		[(frag addr:$ptr, GR16:$swap, 2)], itin>, TB, OpSize16, LOCK,
		Requires<[Has486Insns]>;
let Defs = [EAX, EFLAGS], Uses = [EAX] in		let Defs = [EAX, EFLAGS], Uses = [EAX] in
def NAME#32 : I<Opc, Form, (outs), (ins i32mem:$ptr, GR32:$swap),		def NAME#32 : I<Opc, Form, (outs), (ins i32mem:$ptr, GR32:$swap),
!strconcat(mnemonic, "{l}\t{$swap, $ptr|$ptr, $swap}"),		!strconcat(mnemonic, "{l}\t{$swap, $ptr|$ptr, $swap}"),
[(frag addr:$ptr, GR32:$swap, 4)], itin>, TB, OpSize32, LOCK;		[(frag addr:$ptr, GR32:$swap, 4)], itin>, TB, OpSize32, LOCK,
		Requires<[Has486Insns]>;
let Defs = [RAX, EFLAGS], Uses = [RAX] in		let Defs = [RAX, EFLAGS], Uses = [RAX] in
def NAME#64 : RI<Opc, Form, (outs), (ins i64mem:$ptr, GR64:$swap),		def NAME#64 : RI<Opc, Form, (outs), (ins i64mem:$ptr, GR64:$swap),
!strconcat(mnemonic, "{q}\t{$swap, $ptr|$ptr, $swap}"),		!strconcat(mnemonic, "{q}\t{$swap, $ptr|$ptr, $swap}"),
[(frag addr:$ptr, GR64:$swap, 8)], itin>, TB, LOCK;		[(frag addr:$ptr, GR64:$swap, 8)], itin>, TB, LOCK,
		Requires<[In64BitMode]>;
}		}
}		}

let Defs = [EAX, EDX, EFLAGS], Uses = [EAX, EBX, ECX, EDX],		let Defs = [EAX, EDX, EFLAGS], Uses = [EAX, EBX, ECX, EDX],
SchedRW = [WriteALULd, WriteRMW] in {		Predicates = [Has586Insns], SchedRW = [WriteALULd, WriteRMW] in {
defm LCMPXCHG8B : LCMPXCHG_UnOp<0xC7, MRM1m, "cmpxchg8b",		defm LCMPXCHG8B : LCMPXCHG_UnOp<0xC7, MRM1m, "cmpxchg8b",
X86cas8, i64mem,		X86cas8, i64mem,
IIC_CMPX_LOCK_8B>;		IIC_CMPX_LOCK_8B>;
}		}

// This pseudo must be used when the frame uses RBX as		// This pseudo must be used when the frame uses RBX as
// the base pointer. Indeed, in such situation RBX is a reserved		// the base pointer. Indeed, in such situation RBX is a reserved
// register and the register allocator will ignore any use/def of		// register and the register allocator will ignore any use/def of
[56 lines not shown]	multiclass ATOMIC_LOAD_BINOP<bits<8> opc8, bits<8> opc, string mnemonic,
InstrItinClass itin8, InstrItinClass itin> {		InstrItinClass itin8, InstrItinClass itin> {
let Constraints = "$val = $dst", Defs = [EFLAGS], isCodeGenOnly = 1,		let Constraints = "$val = $dst", Defs = [EFLAGS], isCodeGenOnly = 1,
SchedRW = [WriteALULd, WriteRMW] in {		SchedRW = [WriteALULd, WriteRMW] in {
def NAME#8 : I<opc8, MRMSrcMem, (outs GR8:$dst),		def NAME#8 : I<opc8, MRMSrcMem, (outs GR8:$dst),
(ins GR8:$val, i8mem:$ptr),		(ins GR8:$val, i8mem:$ptr),
!strconcat(mnemonic, "{b}\t{$val, $ptr|$ptr, $val}"),		!strconcat(mnemonic, "{b}\t{$val, $ptr|$ptr, $val}"),
[(set GR8:$dst,		[(set GR8:$dst,
(!cast<PatFrag>(frag # "_8") addr:$ptr, GR8:$val))],		(!cast<PatFrag>(frag # "_8") addr:$ptr, GR8:$val))],
itin8>;		itin8>, Requires<[Has486Insns]>;
def NAME#16 : I<opc, MRMSrcMem, (outs GR16:$dst),		def NAME#16 : I<opc, MRMSrcMem, (outs GR16:$dst),
(ins GR16:$val, i16mem:$ptr),		(ins GR16:$val, i16mem:$ptr),
!strconcat(mnemonic, "{w}\t{$val, $ptr|$ptr, $val}"),		!strconcat(mnemonic, "{w}\t{$val, $ptr|$ptr, $val}"),
[(set		[(set
GR16:$dst,		GR16:$dst,
(!cast<PatFrag>(frag # "_16") addr:$ptr, GR16:$val))],		(!cast<PatFrag>(frag # "_16") addr:$ptr, GR16:$val))],
itin>, OpSize16;		itin>, OpSize16, Requires<[Has486Insns]>;
def NAME#32 : I<opc, MRMSrcMem, (outs GR32:$dst),		def NAME#32 : I<opc, MRMSrcMem, (outs GR32:$dst),
(ins GR32:$val, i32mem:$ptr),		(ins GR32:$val, i32mem:$ptr),
!strconcat(mnemonic, "{l}\t{$val, $ptr|$ptr, $val}"),		!strconcat(mnemonic, "{l}\t{$val, $ptr|$ptr, $val}"),
[(set		[(set
GR32:$dst,		GR32:$dst,
(!cast<PatFrag>(frag # "_32") addr:$ptr, GR32:$val))],		(!cast<PatFrag>(frag # "_32") addr:$ptr, GR32:$val))],
itin>, OpSize32;		itin>, OpSize32, Requires<[Has486Insns]>;
def NAME#64 : RI<opc, MRMSrcMem, (outs GR64:$dst),		def NAME#64 : RI<opc, MRMSrcMem, (outs GR64:$dst),
(ins GR64:$val, i64mem:$ptr),		(ins GR64:$val, i64mem:$ptr),
!strconcat(mnemonic, "{q}\t{$val, $ptr|$ptr, $val}"),		!strconcat(mnemonic, "{q}\t{$val, $ptr|$ptr, $val}"),
[(set		[(set
GR64:$dst,		GR64:$dst,
(!cast<PatFrag>(frag # "_64") addr:$ptr, GR64:$val))],		(!cast<PatFrag>(frag # "_64") addr:$ptr, GR64:$val))],
itin>;		itin>, Requires<[In64BitMode]>;
}		}
}		}

defm LXADD : ATOMIC_LOAD_BINOP<0xc0, 0xc1, "xadd", "atomic_load_add",		defm LXADD : ATOMIC_LOAD_BINOP<0xc0, 0xc1, "xadd", "atomic_load_add",
IIC_XADD_LOCK_MEM8, IIC_XADD_LOCK_MEM>,		IIC_XADD_LOCK_MEM8, IIC_XADD_LOCK_MEM>,
TB, LOCK;		TB, LOCK;

/* The following multiclass tries to make sure that in code like		/* The following multiclass tries to make sure that in code like
[1,097 lines not shown]
def : Pat<(cttz_zero_undef (loadi32 addr:$src)), (BSF32rm addr:$src)>;		def : Pat<(cttz_zero_undef (loadi32 addr:$src)), (BSF32rm addr:$src)>;
def : Pat<(cttz_zero_undef (loadi64 addr:$src)), (BSF64rm addr:$src)>;		def : Pat<(cttz_zero_undef (loadi64 addr:$src)), (BSF64rm addr:$src)>;

// When HasMOVBE is enabled it is possible to get a non-legalized		// When HasMOVBE is enabled it is possible to get a non-legalized
// register-register 16 bit bswap. This maps it to a ROL instruction.		// register-register 16 bit bswap. This maps it to a ROL instruction.
let Predicates = [HasMOVBE] in {		let Predicates = [HasMOVBE] in {
def : Pat<(bswap GR16:$src), (ROL16ri GR16:$src, (i8 8))>;		def : Pat<(bswap GR16:$src), (ROL16ri GR16:$src, (i8 8))>;
}		}

		// On a 386, we expand bswap to 3 rotates after register selection.
		let Predicates = [No486Insns],
		Constraints = "$src = $dst", Defs = [EFLAGS],
		isPseudo = 1 in {
		def PSEUDO_BSWAP32r : I<0, Pseudo,
		(outs GR32:$dst), (ins GR32:$src),
		"bswap\t$dst",
		[(set GR32:$dst, (bswap GR32:$src))]>;
		}

lib/Target/X86/X86InstrInfo.cpp

[5,462 lines not shown]	static void expandLoadStackGuard(MachineInstrBuilder &MIB,
BuildMI(MBB, I, DL, TII.get(X86::MOV64rm), Reg).addReg(X86::RIP).addImm(1)		BuildMI(MBB, I, DL, TII.get(X86::MOV64rm), Reg).addReg(X86::RIP).addImm(1)
.addReg(0).addGlobalAddress(GV, 0, X86II::MO_GOTPCREL).addReg(0)		.addReg(0).addGlobalAddress(GV, 0, X86II::MO_GOTPCREL).addReg(0)
.addMemOperand(MMO);		.addMemOperand(MMO);
MIB->setDebugLoc(DL);		MIB->setDebugLoc(DL);
MIB->setDesc(TII.get(X86::MOV64rm));		MIB->setDesc(TII.get(X86::MOV64rm));
MIB.addReg(Reg, RegState::Kill).addImm(1).addReg(0).addImm(0).addReg(0);		MIB.addReg(Reg, RegState::Kill).addImm(1).addReg(0).addImm(0).addReg(0);
}		}

		static bool ExpandPSEUDO_BSWAP32r(MachineInstr *MI,
		const TargetInstrInfo &TII) {
		MachineBasicBlock *BB = MI->getParent();
		DebugLoc DL = MI->getDebugLoc();
		unsigned Reg = MI->getOperand(0).getReg();
		unsigned Reg16 = getX86SubSuperRegister(Reg, 16);
		BuildMI(*BB, MI, DL, TII.get(X86::ROR16ri), Reg16).addReg(Reg16).addImm(8);
		BuildMI(*BB, MI, DL, TII.get(X86::ROR32ri), Reg).addReg(Reg).addImm(16);
		BuildMI(*BB, MI, DL, TII.get(X86::ROR16ri), Reg16).addReg(Reg16).addImm(8);

		MI->eraseFromParent(); // The pseudo is gone now.
		return true;
		}

bool X86InstrInfo::expandPostRAPseudo(MachineBasicBlock::iterator MI) const {		bool X86InstrInfo::expandPostRAPseudo(MachineBasicBlock::iterator MI) const {
bool HasAVX = Subtarget.hasAVX();		bool HasAVX = Subtarget.hasAVX();
MachineInstrBuilder MIB(*MI->getParent()->getParent(), MI);		MachineInstrBuilder MIB(*MI->getParent()->getParent(), MI);
switch (MI->getOpcode()) {		switch (MI->getOpcode()) {
case X86::MOV32r0:		case X86::MOV32r0:
return Expand2AddrUndef(MIB, get(X86::XOR32rr));		return Expand2AddrUndef(MIB, get(X86::XOR32rr));
case X86::MOV32r1:		case X86::MOV32r1:
return expandMOV32r1(MIB, this, /*MinusOne=*/ false);		return expandMOV32r1(MIB, this, /*MinusOne=*/ false);
[43 lines not shown]	bool X86InstrInfo::expandPostRAPseudo(MachineBasicBlock::iterator MI) const {
case X86::KSET0Q: return Expand2AddrKreg(MIB, get(X86::KXORQrr), X86::K0);		case X86::KSET0Q: return Expand2AddrKreg(MIB, get(X86::KXORQrr), X86::K0);
case X86::KSET1B:		case X86::KSET1B:
case X86::KSET1W: return Expand2AddrKreg(MIB, get(X86::KXNORWrr), X86::K0);		case X86::KSET1W: return Expand2AddrKreg(MIB, get(X86::KXNORWrr), X86::K0);
case X86::KSET1D: return Expand2AddrKreg(MIB, get(X86::KXNORDrr), X86::K0);		case X86::KSET1D: return Expand2AddrKreg(MIB, get(X86::KXNORDrr), X86::K0);
case X86::KSET1Q: return Expand2AddrKreg(MIB, get(X86::KXNORQrr), X86::K0);		case X86::KSET1Q: return Expand2AddrKreg(MIB, get(X86::KXNORQrr), X86::K0);
case TargetOpcode::LOAD_STACK_GUARD:		case TargetOpcode::LOAD_STACK_GUARD:
expandLoadStackGuard(MIB, *this);		expandLoadStackGuard(MIB, *this);
return true;		return true;
		case X86::PSEUDO_BSWAP32r:
		return ExpandPSEUDO_BSWAP32r(MIB, *this);
}		}
return false;		return false;
}		}

static void addOperands(MachineInstrBuilder &MIB, ArrayRef<MachineOperand> MOs,		static void addOperands(MachineInstrBuilder &MIB, ArrayRef<MachineOperand> MOs,
int PtrOffset = 0) {		int PtrOffset = 0) {
unsigned NumAddrOps = MOs.size();		unsigned NumAddrOps = MOs.size();

[1,948 lines not shown]

lib/Target/X86/X86InstrInfo.td

[761 lines not shown]	def tls64baseaddr : ComplexPattern<i64, 5, "selectTLSADDRAddr",
[tglobaltlsaddr], []>;		[tglobaltlsaddr], []>;

def vectoraddr : ComplexPattern<iPTR, 5, "selectVectorAddr", [],[SDNPWantParent]>;		def vectoraddr : ComplexPattern<iPTR, 5, "selectVectorAddr", [],[SDNPWantParent]>;

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// X86 Instruction Predicate Definitions.		// X86 Instruction Predicate Definitions.
def TruePredicate : Predicate<"true">;		def TruePredicate : Predicate<"true">;

		def Has486Insns : Predicate<"Subtarget->has486Insns()">;
		def No486Insns : Predicate<"!Subtarget->has486Insns()">;
		def Has586Insns : Predicate<"Subtarget->has586Insns()">;
def HasCMov : Predicate<"Subtarget->hasCMov()">;		def HasCMov : Predicate<"Subtarget->hasCMov()">;
def NoCMov : Predicate<"!Subtarget->hasCMov()">;		def NoCMov : Predicate<"!Subtarget->hasCMov()">;

def HasMMX : Predicate<"Subtarget->hasMMX()">;		def HasMMX : Predicate<"Subtarget->hasMMX()">;
def Has3DNow : Predicate<"Subtarget->has3DNow()">;		def Has3DNow : Predicate<"Subtarget->has3DNow()">;
def Has3DNowA : Predicate<"Subtarget->has3DNowA()">;		def Has3DNowA : Predicate<"Subtarget->has3DNowA()">;
def HasSSE1 : Predicate<"Subtarget->hasSSE1()">;		def HasSSE1 : Predicate<"Subtarget->hasSSE1()">;
def UseSSE1 : Predicate<"Subtarget->hasSSE1() && !Subtarget->hasAVX()">;		def UseSSE1 : Predicate<"Subtarget->hasSSE1() && !Subtarget->hasAVX()">;
def PUSHA16 : I<0x60, RawFrm, (outs), (ins), "pushaw", [], IIC_PUSH_A>,		def PUSHA16 : I<0x60, RawFrm, (outs), (ins), "pushaw", [], IIC_PUSH_A>,
OpSize16, Requires<[Not64BitMode]>;		OpSize16, Requires<[Not64BitMode]>;
}		}

let Constraints = "$src = $dst", SchedRW = [WriteALU] in {		let Constraints = "$src = $dst", SchedRW = [WriteALU] in {
// GR32 = bswap GR32		// GR32 = bswap GR32
def BSWAP32r : I<0xC8, AddRegFrm,		def BSWAP32r : I<0xC8, AddRegFrm,
(outs GR32:$dst), (ins GR32:$src),		(outs GR32:$dst), (ins GR32:$src),
"bswap{l}\t$dst",		"bswap{l}\t$dst",
[(set GR32:$dst, (bswap GR32:$src))], IIC_BSWAP>, OpSize32, TB;		[(set GR32:$dst, (bswap GR32:$src))], IIC_BSWAP>, OpSize32, TB,
		Requires<[Has486Insns]>;

def BSWAP64r : RI<0xC8, AddRegFrm, (outs GR64:$dst), (ins GR64:$src),		def BSWAP64r : RI<0xC8, AddRegFrm, (outs GR64:$dst), (ins GR64:$src),
"bswap{q}\t$dst",		"bswap{q}\t$dst",
[(set GR64:$dst, (bswap GR64:$src))], IIC_BSWAP>, TB;		[(set GR64:$dst, (bswap GR64:$src))], IIC_BSWAP>, TB,
		Requires<[Has486Insns]>;
} // Constraints = "$src = $dst", SchedRW		} // Constraints = "$src = $dst", SchedRW

// Bit scan instructions.		// Bit scan instructions.
let Defs = [EFLAGS] in {		let Defs = [EFLAGS] in {
def BSF16rr : I<0xBC, MRMSrcReg, (outs GR16:$dst), (ins GR16:$src),		def BSF16rr : I<0xBC, MRMSrcReg, (outs GR16:$dst), (ins GR16:$src),
"bsf{w}\t{$src, $dst|$dst, $src}",		"bsf{w}\t{$src, $dst|$dst, $src}",
[(set GR16:$dst, EFLAGS, (X86bsf GR16:$src))],		[(set GR16:$dst, EFLAGS, (X86bsf GR16:$src))],
IIC_BIT_SCAN_REG>, PS, OpSize16, Sched<[WriteShift]>;		IIC_BIT_SCAN_REG>, PS, OpSize16, Sched<[WriteShift]>;
def XCHG32ar64 : I<0x90, AddRegFrm, (outs), (ins GR32_NOAX:$src),		def XCHG32ar64 : I<0x90, AddRegFrm, (outs), (ins GR32_NOAX:$src),
OpSize32, Requires<[In64BitMode]>;		OpSize32, Requires<[In64BitMode]>;
let Uses = [RAX], Defs = [RAX] in		let Uses = [RAX], Defs = [RAX] in
def XCHG64ar : RI<0x90, AddRegFrm, (outs), (ins GR64:$src),		def XCHG64ar : RI<0x90, AddRegFrm, (outs), (ins GR64:$src),
"xchg{q}\t{$src, %rax|rax, $src}", [], IIC_XCHG_REG>;		"xchg{q}\t{$src, %rax|rax, $src}", [], IIC_XCHG_REG>;
} // SchedRW		} // SchedRW

let SchedRW = [WriteALU] in {		let SchedRW = [WriteALU] in {
def XADD8rr : I<0xC0, MRMDestReg, (outs GR8:$dst), (ins GR8:$src),		def XADD8rr : I<0xC0, MRMDestReg, (outs GR8:$dst), (ins GR8:$src),
"xadd{b}\t{$src, $dst|$dst, $src}", [], IIC_XADD_REG>, TB;		"xadd{b}\t{$src, $dst|$dst, $src}", [], IIC_XADD_REG>, TB,
		Requires<[Has486Insns]>;
def XADD16rr : I<0xC1, MRMDestReg, (outs GR16:$dst), (ins GR16:$src),		def XADD16rr : I<0xC1, MRMDestReg, (outs GR16:$dst), (ins GR16:$src),
"xadd{w}\t{$src, $dst|$dst, $src}", [], IIC_XADD_REG>, TB,		"xadd{w}\t{$src, $dst|$dst, $src}", [], IIC_XADD_REG>, TB,
OpSize16;		OpSize16, Requires<[Has486Insns]>;
def XADD32rr : I<0xC1, MRMDestReg, (outs GR32:$dst), (ins GR32:$src),		def XADD32rr : I<0xC1, MRMDestReg, (outs GR32:$dst), (ins GR32:$src),
"xadd{l}\t{$src, $dst|$dst, $src}", [], IIC_XADD_REG>, TB,		"xadd{l}\t{$src, $dst|$dst, $src}", [], IIC_XADD_REG>, TB,
OpSize32;		OpSize32, Requires<[Has486Insns]>;
def XADD64rr : RI<0xC1, MRMDestReg, (outs GR64:$dst), (ins GR64:$src),		def XADD64rr : RI<0xC1, MRMDestReg, (outs GR64:$dst), (ins GR64:$src),
"xadd{q}\t{$src, $dst|$dst, $src}", [], IIC_XADD_REG>, TB;		"xadd{q}\t{$src, $dst|$dst, $src}", [], IIC_XADD_REG>, TB,
		Requires<[In64BitMode]>;
} // SchedRW		} // SchedRW

let mayLoad = 1, mayStore = 1, SchedRW = [WriteALULd, WriteRMW] in {		let mayLoad = 1, mayStore = 1, SchedRW = [WriteALULd, WriteRMW] in {
def XADD8rm : I<0xC0, MRMDestMem, (outs), (ins i8mem:$dst, GR8:$src),		def XADD8rm : I<0xC0, MRMDestMem, (outs), (ins i8mem:$dst, GR8:$src),
"xadd{b}\t{$src, $dst|$dst, $src}", [], IIC_XADD_MEM>, TB;		"xadd{b}\t{$src, $dst|$dst, $src}", [], IIC_XADD_MEM>, TB,
		Requires<[Has486Insns]>;
def XADD16rm : I<0xC1, MRMDestMem, (outs), (ins i16mem:$dst, GR16:$src),		def XADD16rm : I<0xC1, MRMDestMem, (outs), (ins i16mem:$dst, GR16:$src),
"xadd{w}\t{$src, $dst|$dst, $src}", [], IIC_XADD_MEM>, TB,		"xadd{w}\t{$src, $dst|$dst, $src}", [], IIC_XADD_MEM>, TB,
OpSize16;		OpSize16, Requires<[Has486Insns]>;
def XADD32rm : I<0xC1, MRMDestMem, (outs), (ins i32mem:$dst, GR32:$src),		def XADD32rm : I<0xC1, MRMDestMem, (outs), (ins i32mem:$dst, GR32:$src),
"xadd{l}\t{$src, $dst|$dst, $src}", [], IIC_XADD_MEM>, TB,		"xadd{l}\t{$src, $dst|$dst, $src}", [], IIC_XADD_MEM>, TB,
OpSize32;		OpSize32, Requires<[Has486Insns]>;
def XADD64rm : RI<0xC1, MRMDestMem, (outs), (ins i64mem:$dst, GR64:$src),		def XADD64rm : RI<0xC1, MRMDestMem, (outs), (ins i64mem:$dst, GR64:$src),
"xadd{q}\t{$src, $dst|$dst, $src}", [], IIC_XADD_MEM>, TB;		"xadd{q}\t{$src, $dst|$dst, $src}", [], IIC_XADD_MEM>, TB,
		Requires<[In64BitMode]>;

}		}

let SchedRW = [WriteALU] in {		let SchedRW = [WriteALU] in {
def CMPXCHG8rr : I<0xB0, MRMDestReg, (outs GR8:$dst), (ins GR8:$src),		def CMPXCHG8rr : I<0xB0, MRMDestReg, (outs GR8:$dst), (ins GR8:$src),
"cmpxchg{b}\t{$src, $dst|$dst, $src}", [],		"cmpxchg{b}\t{$src, $dst|$dst, $src}", [],
IIC_CMPXCHG_REG8>, TB;		IIC_CMPXCHG_REG8>, TB, Requires<[Has486Insns]>;
def CMPXCHG16rr : I<0xB1, MRMDestReg, (outs GR16:$dst), (ins GR16:$src),		def CMPXCHG16rr : I<0xB1, MRMDestReg, (outs GR16:$dst), (ins GR16:$src),
"cmpxchg{w}\t{$src, $dst|$dst, $src}", [],		"cmpxchg{w}\t{$src, $dst|$dst, $src}", [],
IIC_CMPXCHG_REG>, TB, OpSize16;		IIC_CMPXCHG_REG>, TB, OpSize16, Requires<[Has486Insns]>;
def CMPXCHG32rr : I<0xB1, MRMDestReg, (outs GR32:$dst), (ins GR32:$src),		def CMPXCHG32rr : I<0xB1, MRMDestReg, (outs GR32:$dst), (ins GR32:$src),
"cmpxchg{l}\t{$src, $dst|$dst, $src}", [],		"cmpxchg{l}\t{$src, $dst|$dst, $src}", [],
IIC_CMPXCHG_REG>, TB, OpSize32;		IIC_CMPXCHG_REG>, TB, OpSize32, Requires<[Has486Insns]>;
def CMPXCHG64rr : RI<0xB1, MRMDestReg, (outs GR64:$dst), (ins GR64:$src),		def CMPXCHG64rr : RI<0xB1, MRMDestReg, (outs GR64:$dst), (ins GR64:$src),
"cmpxchg{q}\t{$src, $dst|$dst, $src}", [],		"cmpxchg{q}\t{$src, $dst|$dst, $src}", [],
IIC_CMPXCHG_REG>, TB;		IIC_CMPXCHG_REG>, TB, Requires<[In64BitMode]>;
} // SchedRW		} // SchedRW

let SchedRW = [WriteALULd, WriteRMW] in {		let SchedRW = [WriteALULd, WriteRMW] in {
let mayLoad = 1, mayStore = 1 in {		let mayLoad = 1, mayStore = 1 in {
def CMPXCHG8rm : I<0xB0, MRMDestMem, (outs), (ins i8mem:$dst, GR8:$src),		def CMPXCHG8rm : I<0xB0, MRMDestMem, (outs), (ins i8mem:$dst, GR8:$src),
"cmpxchg{b}\t{$src, $dst|$dst, $src}", [],		"cmpxchg{b}\t{$src, $dst|$dst, $src}", [],
IIC_CMPXCHG_MEM8>, TB;		IIC_CMPXCHG_MEM8>, TB, Requires<[Has486Insns]>;
def CMPXCHG16rm : I<0xB1, MRMDestMem, (outs), (ins i16mem:$dst, GR16:$src),		def CMPXCHG16rm : I<0xB1, MRMDestMem, (outs), (ins i16mem:$dst, GR16:$src),
"cmpxchg{w}\t{$src, $dst|$dst, $src}", [],		"cmpxchg{w}\t{$src, $dst|$dst, $src}", [],
IIC_CMPXCHG_MEM>, TB, OpSize16;		IIC_CMPXCHG_MEM>, TB, OpSize16, Requires<[Has486Insns]>;
def CMPXCHG32rm : I<0xB1, MRMDestMem, (outs), (ins i32mem:$dst, GR32:$src),		def CMPXCHG32rm : I<0xB1, MRMDestMem, (outs), (ins i32mem:$dst, GR32:$src),
"cmpxchg{l}\t{$src, $dst|$dst, $src}", [],		"cmpxchg{l}\t{$src, $dst|$dst, $src}", [],
IIC_CMPXCHG_MEM>, TB, OpSize32;		IIC_CMPXCHG_MEM>, TB, OpSize32, Requires<[Has486Insns]>;
def CMPXCHG64rm : RI<0xB1, MRMDestMem, (outs), (ins i64mem:$dst, GR64:$src),		def CMPXCHG64rm : RI<0xB1, MRMDestMem, (outs), (ins i64mem:$dst, GR64:$src),
"cmpxchg{q}\t{$src, $dst|$dst, $src}", [],		"cmpxchg{q}\t{$src, $dst|$dst, $src}", [],
IIC_CMPXCHG_MEM>, TB;		IIC_CMPXCHG_MEM>, TB, Requires<[In64BitMode]>;
}		}

let Defs = [EAX, EDX, EFLAGS], Uses = [EAX, EBX, ECX, EDX] in		let Defs = [EAX, EDX, EFLAGS], Uses = [EAX, EBX, ECX, EDX] in
def CMPXCHG8B : I<0xC7, MRM1m, (outs), (ins i64mem:$dst),		def CMPXCHG8B : I<0xC7, MRM1m, (outs), (ins i64mem:$dst),
"cmpxchg8b\t$dst", [], IIC_CMPXCHG_8B>, TB;		"cmpxchg8b\t$dst", [], IIC_CMPXCHG_8B>, TB,
		Requires<[Has586Insns]>;

let Defs = [RAX, RDX, EFLAGS], Uses = [RAX, RBX, RCX, RDX] in		let Defs = [RAX, RDX, EFLAGS], Uses = [RAX, RBX, RCX, RDX] in
def CMPXCHG16B : RI<0xC7, MRM1m, (outs), (ins i128mem:$dst),		def CMPXCHG16B : RI<0xC7, MRM1m, (outs), (ins i128mem:$dst),
"cmpxchg16b\t$dst", [], IIC_CMPXCHG_16B>,		"cmpxchg16b\t$dst", [], IIC_CMPXCHG_16B>,
TB, Requires<[HasCmpxchg16b]>;		TB, Requires<[HasCmpxchg16b]>;
} // SchedRW		} // SchedRW



lib/Target/X86/X86Subtarget.h

protected:		protected:
X86SSEEnum X86SSELevel;		X86SSEEnum X86SSELevel;

/// MMX, 3DNow, 3DNow Athlon, or none supported.		/// MMX, 3DNow, 3DNow Athlon, or none supported.
X863DNowEnum X863DNowLevel;		X863DNowEnum X863DNowLevel;

/// True if the processor supports X87 instructions.		/// True if the processor supports X87 instructions.
bool HasX87;		bool HasX87;

		/// Target has the instructions added with i486.
		bool Has486Insns;

		/// Target has the instructions added with i586.
		bool Has586Insns;

/// True if this processor has conditional move instructions		/// True if this processor has conditional move instructions
/// (generally pentium pro+).		/// (generally pentium pro+).
bool HasCMov;		bool HasCMov;

/// True if the processor supports X86-64 instructions.		/// True if the processor supports X86-64 instructions.
bool HasX86_64;		bool HasX86_64;

/// True if the processor supports POPCNT.		/// True if the processor supports POPCNT.
bool isTarget64BitLP64() const {		bool isTarget64BitLP64() const {
return In64BitMode && (TargetTriple.getEnvironment() != Triple::GNUX32 &&		return In64BitMode && (TargetTriple.getEnvironment() != Triple::GNUX32 &&
!TargetTriple.isOSNaCl());		!TargetTriple.isOSNaCl());
}		}

PICStyles::Style getPICStyle() const { return PICStyle; }		PICStyles::Style getPICStyle() const { return PICStyle; }
void setPICStyle(PICStyles::Style Style) { PICStyle = Style; }		void setPICStyle(PICStyles::Style Style) { PICStyle = Style; }

bool hasX87() const { return HasX87; }		bool hasX87() const { return HasX87; }
		bool has486Insns() const { return Has486Insns; }
		bool has586Insns() const { return Has586Insns; }
bool hasCMov() const { return HasCMov; }		bool hasCMov() const { return HasCMov; }
bool hasSSE1() const { return X86SSELevel >= SSE1; }		bool hasSSE1() const { return X86SSELevel >= SSE1; }
bool hasSSE2() const { return X86SSELevel >= SSE2; }		bool hasSSE2() const { return X86SSELevel >= SSE2; }
bool hasSSE3() const { return X86SSELevel >= SSE3; }		bool hasSSE3() const { return X86SSELevel >= SSE3; }
bool hasSSSE3() const { return X86SSELevel >= SSSE3; }		bool hasSSSE3() const { return X86SSELevel >= SSSE3; }
bool hasSSE41() const { return X86SSELevel >= SSE41; }		bool hasSSE41() const { return X86SSELevel >= SSE41; }
bool hasSSE42() const { return X86SSELevel >= SSE42; }		bool hasSSE42() const { return X86SSELevel >= SSE42; }
bool hasAVX() const { return X86SSELevel >= AVX; }		bool hasAVX() const { return X86SSELevel >= AVX; }
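
Note on the two new accessors above: in this diff they gate the instructions marked Requires<[Has486Insns]> and Requires<[Has586Insns]> in X86InstrInfo.td. The following is only a standalone sketch of that mapping. MiniSubtarget and the feature-named helpers (hasBSWAP and friends, in the style suggested during review) are hypothetical names for illustration, not part of the patch or of the real X86Subtarget, and the In64BitMode requirement on the 64-bit forms is not modeled.

```cpp
#include <cassert>

// Hypothetical stand-in for the two flags this patch adds to X86Subtarget.
struct MiniSubtarget {
  bool Has486Insns; // instructions added with the i486
  bool Has586Insns; // instructions added with the i586

  // Feature-named wrappers (assumed names) over the generation flags.
  bool hasBSWAP() const { return Has486Insns; }
  bool hasXADD() const { return Has486Insns; }
  bool hasCMPXCHG() const { return Has486Insns; }
  bool hasCMPXCHG8B() const { return Has586Insns; }
};

// Hypothetical factories for the three 32-bit CPU generations discussed here.
inline MiniSubtarget i386Like() { return MiniSubtarget{false, false}; }
inline MiniSubtarget i486Like() { return MiniSubtarget{true, false}; }
inline MiniSubtarget i586Like() { return MiniSubtarget{true, true}; }
```

Named helpers like these would let call sites state which instruction they need rather than which CPU generation introduced it.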

test/CodeGen/X86/2010-10-08-cmpxchg8b.ll

	; RUN: llc < %s -march=x86 -mtriple=i386-apple-darwin | FileCheck %s			; RUN: llc < %s -march=x86 -mtriple=i686-apple-darwin -mcpu=i686 | FileCheck %s
	; PR8297			; PR8297
	;			;
	; On i386, i64 cmpxchg is lowered during legalize types to extract the			; On i386, i64 cmpxchg is lowered during legalize types to extract the
	; 64-bit result into a pair of fixed regs. So creation of the DAG node			; 64-bit result into a pair of fixed regs. So creation of the DAG node
	; happens in a different place. See			; happens in a different place. See
	; X86TargetLowering::ReplaceNodeResults, case ATOMIC_CMP_SWAP.			; X86TargetLowering::ReplaceNodeResults, case ATOMIC_CMP_SWAP.
	;			;
	; Neither Atomic-xx.ll nor atomic_op.ll cover this. Those tests were			; Neither Atomic-xx.ll nor atomic_op.ll cover this. Those tests were

test/CodeGen/X86/atomic-cpus.ll

This file was added.

				; RUN: llc < %s -march=x86-64 -mcpu=corei7 | FileCheck --check-prefix=X64_CX16 --check-prefix=CX8_64 --check-prefix=CX4 --check-prefix=CHECK %s
				; RUN: llc < %s -march=x86-64 -mcpu=x86-64 | FileCheck --check-prefix=X64_NOCX16 --check-prefix=CX8_64 --check-prefix=CX4 --check-prefix=CHECK %s
				; RUN: llc < %s -march=x86-64 -mcpu=corei7 -mattr=-cx16 | FileCheck --check-prefix=X64_NOCX16 --check-prefix=CX8_64 --check-prefix=CX4 --check-prefix=CHECK %s
				; RUN: llc < %s -march=x86 -mcpu=i586 | FileCheck --check-prefix=X32_NOCX16 --check-prefix=CX8_32 --check-prefix=CX4 --check-prefix=CHECK %s
				; RUN: llc < %s -march=x86 -mcpu=i486 | FileCheck --check-prefix=X32_NOCX16 --check-prefix=NOCX8 --check-prefix=CX4 --check-prefix=CHECK %s
				; RUN: llc < %s -march=x86 -mcpu=i386 | FileCheck --check-prefix=X32_NOCX16 --check-prefix=NOCX8 --check-prefix=NOCX4 --check-prefix=CHECK %s

				;; This test checks that various versions of the x86 do, or do not,
				;; support native atomic instructions of different sizes.

				define void @test_i128(i128* %a) nounwind {
				; CHECK-LABEL: test_i128:
				entry:
				; X64_NOCX16: __atomic_compare_exchange_16
				; X32_NOCX16: __atomic_compare_exchange{{$}}
				; X64_CX16: cmpxchg16b
				%0 = cmpxchg i128* %a, i128 1, i128 1 seq_cst seq_cst
				; X64_NOCX16: __atomic_exchange_16
				; X32_NOCX16: __atomic_exchange{{$}}
				; X64_CX16: cmpxchg16b
				%1 = atomicrmw xchg i128* %a, i128 1 seq_cst
				; X64_NOCX16: __atomic_fetch_add_16
				; X32_NOCX16: __atomic_compare_exchange{{$}}
				; X64_CX16: cmpxchg16b
				%2 = atomicrmw add i128* %a, i128 1 seq_cst
				; X64_NOCX16: __atomic_fetch_sub_16
				; X32_NOCX16: __atomic_compare_exchange{{$}}
				; X64_CX16: cmpxchg16b
				%3 = atomicrmw sub i128* %a, i128 1 seq_cst
				; X64_NOCX16: __atomic_fetch_and_16
				; X32_NOCX16: __atomic_compare_exchange{{$}}
				; X64_CX16: cmpxchg16b
				%4 = atomicrmw and i128* %a, i128 1 seq_cst
				; X64_NOCX16: __atomic_fetch_nand_16
				; X32_NOCX16: __atomic_compare_exchange{{$}}
				; X64_CX16: cmpxchg16b
				%5 = atomicrmw nand i128* %a, i128 1 seq_cst
				; X64_NOCX16: __atomic_fetch_or_16
				; X32_NOCX16: __atomic_compare_exchange{{$}}
				; X64_CX16: cmpxchg16b
				%6 = atomicrmw or i128* %a, i128 1 seq_cst
				; X64_NOCX16: __atomic_fetch_xor_16
				; X32_NOCX16: __atomic_compare_exchange{{$}}
				; X64_CX16: cmpxchg16b
				%7 = atomicrmw xor i128* %a, i128 1 seq_cst
				ret void
				}

				define void @test_i64(i64* %a) nounwind {
				; CHECK-LABEL: test_i64:
				entry:
				; NOCX8: __atomic_compare_exchange_8
				; CX8_64: cmpxchgq
				; CX8_32: cmpxchg8b
				%0 = cmpxchg i64* %a, i64 1, i64 1 seq_cst seq_cst
				; NOCX8: __atomic_exchange_8
				; CX8_64: xchgq
				; CX8_32: cmpxchg8b
				%1 = atomicrmw xchg i64* %a, i64 1 seq_cst
				; NOCX8: __atomic_fetch_add_8
				; CX8_64: lock incq
				; CX8_32: cmpxchg8b
				%2 = atomicrmw add i64* %a, i64 1 seq_cst
				; NOCX8: __atomic_fetch_sub_8
				; CX8_64: lock decq
				; CX8_32: cmpxchg8b
				%3 = atomicrmw sub i64* %a, i64 1 seq_cst
				; NOCX8: __atomic_fetch_and_8
				; CX8_64: lock andq
				; CX8_32: cmpxchg8b
				%4 = atomicrmw and i64* %a, i64 1 seq_cst
				; NOCX8: __atomic_fetch_nand_8
				; CX8_64: cmpxchgq
				; CX8_32: cmpxchg8b
				%5 = atomicrmw nand i64* %a, i64 1 seq_cst
				; NOCX8: __atomic_fetch_or_8
				; CX8_64: lock orq
				; CX8_32: cmpxchg8b
				%6 = atomicrmw or i64* %a, i64 1 seq_cst
				; NOCX8: __atomic_fetch_xor_8
				; CX8_64: lock xorq
				; CX8_32: cmpxchg8b
				%7 = atomicrmw xor i64* %a, i64 1 seq_cst
				ret void
				}

				define void @test_i32(i32* %a) nounwind {
				; CHECK-LABEL: test_i32:
				entry:
				; NOCX4: __atomic_compare_exchange_4
				; CX4: lock cmpxchgl
				%0 = cmpxchg i32* %a, i32 1, i32 1 seq_cst seq_cst
				; NOCX4: __atomic_exchange_4
				; CX4: xchgl
				%1 = atomicrmw xchg i32* %a, i32 1 seq_cst
				; NOCX4: __atomic_fetch_add_4
				; CX4: lock incl
				%2 = atomicrmw add i32* %a, i32 1 seq_cst
				; NOCX4: __atomic_fetch_sub_4
				; CX4: lock decl
				%3 = atomicrmw sub i32* %a, i32 1 seq_cst
				; NOCX4: __atomic_fetch_and_4
				; CX4: lock andl
				%4 = atomicrmw and i32* %a, i32 1 seq_cst
				; NOCX4: __atomic_fetch_nand_4
				; CX4: lock cmpxchgl
				%5 = atomicrmw nand i32* %a, i32 1 seq_cst
				; NOCX4: __atomic_fetch_or_4
				; CX4: lock orl
				%6 = atomicrmw or i32* %a, i32 1 seq_cst
				; NOCX4: __atomic_fetch_xor_4
				; CX4: lock xorl
				%7 = atomicrmw xor i32* %a, i32 1 seq_cst
				ret void
				}
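
The check prefixes in atomic-cpus.ll above encode a simple rule: each CPU level has a widest atomic operation that is lowered to native instructions, and anything wider becomes an `__atomic_*` libcall. A minimal sketch of that rule as read off the RUN and CHECK lines, covering only the three widths the test exercises (32, 64, 128 bits). CpuLevel, maxNativeAtomicBits and expectsLibcall are hypothetical names introduced here for illustration, not LLVM APIs.

```cpp
#include <cassert>

// Hypothetical CPU levels mirroring the -mcpu values in the RUN lines:
// i386, i486, i586, x86-64 without cx16, and x86-64 with cx16 (corei7).
enum class CpuLevel { I386, I486, I586, X86_64NoCX16, X86_64CX16 };

// Widest atomic operation, in bits, that the test expects to be lowered to
// native instructions. 0 means even 32-bit operations become libcalls
// (the NOCX4 prefix on the i386 RUN line).
constexpr unsigned maxNativeAtomicBits(CpuLevel L) {
  return L == CpuLevel::I386           ? 0u
         : L == CpuLevel::I486         ? 32u   // CX4 + NOCX8
         : L == CpuLevel::I586         ? 64u   // CX8_32 (cmpxchg8b)
         : L == CpuLevel::X86_64NoCX16 ? 64u   // CX8_64 + X64_NOCX16
                                       : 128u; // X64_CX16 (cmpxchg16b)
}

// True when the test expects an __atomic_* libcall for this width.
constexpr bool expectsLibcall(CpuLevel L, unsigned Bits) {
  return Bits > maxNativeAtomicBits(L);
}

static_assert(expectsLibcall(CpuLevel::I386, 32), "NOCX4 on i386");
static_assert(!expectsLibcall(CpuLevel::I586, 64), "cmpxchg8b on i586");
```

Widths below 32 bits are outside what the test covers, so the sketch says nothing reliable about them.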

test/CodeGen/X86/atomic-flags.ll

	; RUN: llc < %s -mtriple=x86_64-unknown-unknown -verify-machineinstrs | FileCheck %s			; RUN: llc < %s -mtriple=x86_64-unknown-unknown -verify-machineinstrs | FileCheck %s
	; RUN: llc < %s -mtriple=i686-unknown-unknown -verify-machineinstrs | FileCheck %s			; RUN: llc < %s -mtriple=i686-unknown-unknown -mcpu=i686 -verify-machineinstrs | FileCheck %s

	; Make sure that flags are properly preserved despite atomic optimizations.			; Make sure that flags are properly preserved despite atomic optimizations.

	define i32 @atomic_and_flags_1(i8* %p, i32 %a, i32 %b) {			define i32 @atomic_and_flags_1(i8* %p, i32 %a, i32 %b) {
	; CHECK-LABEL: atomic_and_flags_1:			; CHECK-LABEL: atomic_and_flags_1:

	; Generate flags value, and use it.			; Generate flags value, and use it.
	; CHECK: cmpl			; CHECK: cmpl

test/CodeGen/X86/atomic-pointer.ll

	; RUN: llc < %s -mtriple=i686-none-linux -verify-machineinstrs | FileCheck %s			; RUN: llc < %s -mtriple=i686-none-linux -mcpu=i686 -verify-machineinstrs | FileCheck %s

	define i32* @test_atomic_ptr_load(i32** %a0) {			define i32* @test_atomic_ptr_load(i32** %a0) {
	; CHECK: test_atomic_ptr_load			; CHECK: test_atomic_ptr_load
	; CHECK: movl			; CHECK: movl
	; CHECK: movl			; CHECK: movl
	; CHECK: ret			; CHECK: ret
	0:			0:
	%0 = load atomic i32, i32* %a0 seq_cst, align 4			%0 = load atomic i32, i32* %a0 seq_cst, align 4

test/CodeGen/X86/atomic_mi.ll

	; RUN: llc < %s -mtriple=x86_64-unknown-unknown -verify-machineinstrs | FileCheck %s --check-prefix X64			; RUN: llc < %s -mtriple=x86_64-unknown-unknown -verify-machineinstrs | FileCheck %s --check-prefix X64
	; RUN: llc < %s -mtriple=i686-unknown-unknown -verify-machineinstrs | FileCheck %s --check-prefix X32			; RUN: llc < %s -mtriple=i686-unknown-unknown -verify-machineinstrs -mcpu=i686 | FileCheck %s --check-prefix X32
	; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=slow-incdec -verify-machineinstrs | FileCheck %s --check-prefix SLOW_INC			; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=slow-incdec -verify-machineinstrs | FileCheck %s --check-prefix SLOW_INC

	; This file checks that atomic (non-seq_cst) stores of immediate values are			; This file checks that atomic (non-seq_cst) stores of immediate values are
	; done in one mov instruction and not 2. More precisely, it makes sure that the			; done in one mov instruction and not 2. More precisely, it makes sure that the
	; immediate is not first copied uselessly into a register.			; immediate is not first copied uselessly into a register.

	; Similarily, it checks that a binary operation of an immediate with an atomic			; Similarily, it checks that a binary operation of an immediate with an atomic
	; variable that is stored back in that variable is done as a single instruction.			; variable that is stored back in that variable is done as a single instruction.

test/CodeGen/X86/bswap.ll

	; bswap should be constant folded when it is passed a constant argument			; bswap should be constant folded when it is passed a constant argument

	; RUN: llc < %s -march=x86 -mcpu=i686 | FileCheck %s			; RUN: llc < %s -march=x86 -mcpu=i386 | FileCheck --check-prefix=CHECK386 --check-prefix=CHECK %s
	; RUN: llc < %s -march=x86-64 | FileCheck %s --check-prefix=CHECK64			; RUNX: llc < %s -march=x86 -mcpu=i486 | FileCheck --check-prefix=CHECK486 --check-prefix=CHECKBSW --check-prefix=CHECK %s
				; RUN: llc < %s -march=x86-64 | FileCheck --check-prefix=CHECK64 --check-prefix=CHECKBSW --check-prefix=CHECK %s

	declare i16 @llvm.bswap.i16(i16)			declare i16 @llvm.bswap.i16(i16)

	declare i32 @llvm.bswap.i32(i32)			declare i32 @llvm.bswap.i32(i32)

	declare i64 @llvm.bswap.i64(i64)			declare i64 @llvm.bswap.i64(i64)

	define i16 @W(i16 %A) {			define i16 @W(i16 %A) {
	; CHECK-LABEL: W:			; CHECK-LABEL: W:
	; CHECK: rolw $8, %ax			; CHECK: rolw $8, %

	; CHECK64-LABEL: W:
	; CHECK64: rolw $8, %
	%Z = call i16 @llvm.bswap.i16( i16 %A ) ; <i16> [#uses=1]			%Z = call i16 @llvm.bswap.i16( i16 %A ) ; <i16> [#uses=1]
	ret i16 %Z			ret i16 %Z
	}			}

	define i32 @X(i32 %A) {			define i32 @X(i32 %A) {
	; CHECK-LABEL: X:			; CHECK-LABEL: X:
	; CHECK: bswapl %eax			; CHECK386: rorw $8, %ax
				; CHECK386: rorl $16, %eax
	; CHECK64-LABEL: X:			; CHECK386: rorw $8, %ax
	; CHECK64: bswapl %			; CHECKBSW: bswapl %
				[Inline comment by ab] Newline between the checks? And/or shorten CHECK386 -> I386?
	%Z = call i32 @llvm.bswap.i32( i32 %A ) ; <i32> [#uses=1]			%Z = call i32 @llvm.bswap.i32( i32 %A ) ; <i32> [#uses=1]
	ret i32 %Z			ret i32 %Z
	}			}

	define i64 @Y(i64 %A) {			define i64 @Y(i64 %A) {
	; CHECK-LABEL: Y:			; CHECK-LABEL: Y:
	; CHECK: bswapl %eax			; CHECK386: rorw $8, %ax
	; CHECK: bswapl %edx			; CHECK386: rorl $16, %eax
				; CHECK386: rorw $8, %ax
	; CHECK64-LABEL: Y:			; CHECK386: rorw $8, %dx
				; CHECK386: rorl $16, %edx
				; CHECK386: rorw $8, %dx
				; CHECK486: bswapl %eax
				; CHECK486: bswapl %edx
	; CHECK64: bswapq %			; CHECK64: bswapq %
	%Z = call i64 @llvm.bswap.i64( i64 %A ) ; <i64> [#uses=1]			%Z = call i64 @llvm.bswap.i64( i64 %A ) ; <i64> [#uses=1]
	ret i64 %Z			ret i64 %Z
	}			}

	; rdar://9164521			; rdar://9164521
	define i32 @test1(i32 %a) nounwind readnone {			define i32 @test1(i32 %a) nounwind readnone {
	entry:			entry:
	; CHECK-LABEL: test1:			; CHECK-LABEL: test1:
	; CHECK: bswapl [[REG:%.*]]			; CHECK386: rorw $8, %[[REG:.*]]
	; CHECK: shrl $16, [[REG]]			; CHECK386: rorl $16, %e[[REG]]
				; CHECK386: rorw $8, %[[REG]]
	; CHECK64-LABEL: test1:			; CHECK386: shrl $16, %e[[REG]]
	; CHECK64: bswapl [[REG:%.*]]			; CHECKBSW: bswapl [[REG:%.*]]
	; CHECK64: shrl $16, [[REG]]			; CHECKBSW: shrl $16, [[REG]]
	%and = lshr i32 %a, 8			%and = lshr i32 %a, 8
	%shr3 = and i32 %and, 255			%shr3 = and i32 %and, 255
	%and2 = shl i32 %a, 8			%and2 = shl i32 %a, 8
	%shl = and i32 %and2, 65280			%shl = and i32 %and2, 65280
	%or = or i32 %shr3, %shl			%or = or i32 %shr3, %shl
	ret i32 %or			ret i32 %or
	}			}

	define i32 @test2(i32 %a) nounwind readnone {			define i32 @test2(i32 %a) nounwind readnone {
	entry:			entry:
	; CHECK-LABEL: test2:			; CHECK-LABEL: test2:
	; CHECK: bswapl [[REG:%.*]]			; CHECK386: rorw $8, %[[REG:.*]]
	; CHECK: sarl $16, [[REG]]			; CHECK386: rorl $16, %e[[REG]]
				; CHECK386: rorw $8, %[[REG]]
	; CHECK64-LABEL: test2:			; CHECK386: sarl $16, %e[[REG]]
	; CHECK64: bswapl [[REG:%.*]]			; CHECKBSW: bswapl [[REG:%.*]]
	; CHECK64: sarl $16, [[REG]]			; CHECKBSW: sarl $16, [[REG]]
	%and = lshr i32 %a, 8			%and = lshr i32 %a, 8
	%shr4 = and i32 %and, 255			%shr4 = and i32 %and, 255
	%and2 = shl i32 %a, 8			%and2 = shl i32 %a, 8
	%or = or i32 %shr4, %and2			%or = or i32 %shr4, %and2
	%sext = shl i32 %or, 16			%sext = shl i32 %or, 16
	%conv3 = ashr exact i32 %sext, 16			%conv3 = ashr exact i32 %sext, 16
	ret i32 %conv3			ret i32 %conv3
	}			}

	@var8 = global i8 0			@var8 = global i8 0
	@var16 = global i16 0			@var16 = global i16 0

	; The "shl" below can move bits into the high parts of the value, so the			; The "shl" below can move bits into the high parts of the value, so the
	; operation is not a "bswap, shr" pair.			; operation is not a "bswap, shr" pair.

	; rdar://problem/14814049			; rdar://problem/14814049
	define i64 @not_bswap() {			define i64 @not_bswap() {
	; CHECK-LABEL: not_bswap:			; CHECK-LABEL: not_bswap:
	; CHECK-NOT: bswapl			; CHECK-NOT: bswapl
				; CHECK-NOT: bswapq
	; CHECK: ret			; CHECK: ret

	; CHECK64-LABEL: not_bswap:
	; CHECK64-NOT: bswapq
	; CHECK64: ret
	%init = load i16, i16* @var16			%init = load i16, i16* @var16
	%big = zext i16 %init to i64			%big = zext i16 %init to i64

	%hishifted = lshr i64 %big, 8			%hishifted = lshr i64 %big, 8
	%loshifted = shl i64 %big, 8			%loshifted = shl i64 %big, 8

	%notswapped = or i64 %hishifted, %loshifted			%notswapped = or i64 %hishifted, %loshifted

	ret i64 %notswapped			ret i64 %notswapped
	}			}

	; This time, the lshr (and subsequent or) is completely useless. While it's			; This time, the lshr (and subsequent or) is completely useless. While it's
	; technically correct to convert this into a "bswap, shr", it's suboptimal. A			; technically correct to convert this into a "bswap, shr", it's suboptimal. A
	; simple shl works better.			; simple shl works better.

	define i64 @not_useful_bswap() {			define i64 @not_useful_bswap() {
	; CHECK-LABEL: not_useful_bswap:			; CHECK-LABEL: not_useful_bswap:
	; CHECK-NOT: bswapl			; CHECK-NOT: bswapl
				; CHECK-NOT: bswapq
	; CHECK: ret			; CHECK: ret

	; CHECK64-LABEL: not_useful_bswap:
	; CHECK64-NOT: bswapq
	; CHECK64: ret

	%init = load i8, i8* @var8			%init = load i8, i8* @var8
	%big = zext i8 %init to i64			%big = zext i8 %init to i64

	%hishifted = lshr i64 %big, 8			%hishifted = lshr i64 %big, 8
	%loshifted = shl i64 %big, 8			%loshifted = shl i64 %big, 8

	%notswapped = or i64 %hishifted, %loshifted			%notswapped = or i64 %hishifted, %loshifted

	ret i64 %notswapped			ret i64 %notswapped
	}			}

	; Finally, it is OK to just mask off the shl if we know that the value is zero			; Finally, it is OK to just mask off the shl if we know that the value is zero
	; beyond 16 bits anyway. This is a legitimate bswap.			; beyond 16 bits anyway. This is a legitimate bswap.

	define i64 @finally_useful_bswap() {			define i64 @finally_useful_bswap() {
	; CHECK-LABEL: finally_useful_bswap:			; CHECK-LABEL: finally_useful_bswap:
	; CHECK: bswapl [[REG:%.*]]			; CHECK386: rorw $8, %[[REG:.*]]
	; CHECK: shrl $16, [[REG]]			; CHECK386: rorl $16, %e[[REG]]
	; CHECK: ret			; CHECK386: rorw $8, %[[REG]]
				; CHECK386: shrl $16, %e[[REG]]
	; CHECK64-LABEL: finally_useful_bswap:			; CHECK486: bswapl [[REG:%.*]]
				; CHECK486: shrl $16, [[REG]]
				; CHECK486: ret
	; CHECK64: bswapq [[REG:%.*]]			; CHECK64: bswapq [[REG:%.*]]
	; CHECK64: shrq $48, [[REG]]			; CHECK64: shrq $48, [[REG]]
	; CHECK64: ret			; CHECK64: ret

	%init = load i16, i16* @var16			%init = load i16, i16* @var16
	%big = zext i16 %init to i64			%big = zext i16 %init to i64

	%hishifted = lshr i64 %big, 8			%hishifted = lshr i64 %big, 8
	%lomasked = and i64 %big, 255			%lomasked = and i64 %big, 255
	%loshifted = shl i64 %lomasked, 8			%loshifted = shl i64 %lomasked, 8

	%swapped = or i64 %hishifted, %loshifted			%swapped = or i64 %hishifted, %loshifted

	ret i64 %swapped			ret i64 %swapped
	}			}
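
The CHECK386 lines in bswap.ll above expect a 32-bit byte swap to come out as `rorw $8`, `rorl $16`, `rorw $8`. A small sketch of why that sequence is a byte swap, assuming the 16-bit rotate only touches the low half of the register (the partial register update mentioned in the summary). The function names are hypothetical and this models the arithmetic only, not the actual ExpandPSEUDO_BSWAP32r code.

```cpp
#include <cassert>
#include <cstdint>

// Models `rorw $8, %ax`: swap the two low bytes, leave the high 16 bits alone.
constexpr uint32_t rotateLow16By8(uint32_t X) {
  return (X & 0xFFFF0000u) | ((X >> 8) & 0x00FFu) | ((X << 8) & 0xFF00u);
}

// Models `rorl $16, %eax`: exchange the high and low 16-bit halves.
constexpr uint32_t rotate32By16(uint32_t X) {
  return (X >> 16) | (X << 16);
}

// The three-rotate sequence: fix up the low half, swap halves, then fix up
// the new low half (which was the original high half).
constexpr uint32_t bswapViaRotates(uint32_t X) {
  return rotateLow16By8(rotate32By16(rotateLow16By8(X)));
}

// Straightforward shift-and-mask byte swap used as a reference.
constexpr uint32_t bswapReference(uint32_t X) {
  return (X >> 24) | ((X >> 8) & 0x0000FF00u) | ((X << 8) & 0x00FF0000u) |
         (X << 24);
}

static_assert(bswapViaRotates(0x11223344u) == 0x44332211u,
              "three rotates reverse the byte order");
```

For the i64 case in function Y, the same sequence is simply applied to each 32-bit half, which is why the test expects it once on %eax and once on %edx.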

test/CodeGen/X86/cmpxchg-clobber-flags.ll

	; RUN: llc -mtriple=i386-linux-gnu %s -o - | FileCheck %s -check-prefix=i386			; RUN: llc -mtriple=i386-linux-gnu -mcpu=i686 %s -o - | FileCheck %s -check-prefix=i386
	; RUN: llc -mtriple=i386-linux-gnu -pre-RA-sched=fast %s -o - | FileCheck %s -check-prefix=i386f			; RUN: llc -mtriple=i386-linux-gnu -mcpu=i686 -pre-RA-sched=fast %s -o - | FileCheck %s -check-prefix=i386f

	; RUN: llc -mtriple=x86_64-linux-gnu %s -o - | FileCheck %s -check-prefix=x8664			; RUN: llc -mtriple=x86_64-linux-gnu %s -o - | FileCheck %s -check-prefix=x8664
	; RUN: llc -mtriple=x86_64-linux-gnu -pre-RA-sched=fast %s -o - | FileCheck %s -check-prefix=x8664			; RUN: llc -mtriple=x86_64-linux-gnu -pre-RA-sched=fast %s -o - | FileCheck %s -check-prefix=x8664
	; RUN: llc -mtriple=x86_64-linux-gnu -mattr=+sahf %s -o - | FileCheck %s -check-prefix=x8664-sahf			; RUN: llc -mtriple=x86_64-linux-gnu -mattr=+sahf %s -o - | FileCheck %s -check-prefix=x8664-sahf
	; RUN: llc -mtriple=x86_64-linux-gnu -mattr=+sahf -pre-RA-sched=fast %s -o - | FileCheck %s -check-prefix=x8664-sahf			; RUN: llc -mtriple=x86_64-linux-gnu -mattr=+sahf -pre-RA-sched=fast %s -o - | FileCheck %s -check-prefix=x8664-sahf
	; RUN: llc -mtriple=x86_64-linux-gnu -mcpu=corei7 %s -o - | FileCheck %s -check-prefix=x8664-sahf			; RUN: llc -mtriple=x86_64-linux-gnu -mcpu=corei7 %s -o - | FileCheck %s -check-prefix=x8664-sahf

	; TODO: Reenable verify-machineinstr once the if (!AXDead) // FIXME			; TODO: Reenable verify-machineinstr once the if (!AXDead) // FIXME

test/CodeGen/X86/nocx16.ll

This file was deleted.

	; RUN: llc < %s -march=x86-64 -mcpu=corei7 -mattr=-cx16 | FileCheck %s
	define void @test(i128* %a) nounwind {
	entry:
	; CHECK: __atomic_compare_exchange_16
	%0 = cmpxchg i128* %a, i128 1, i128 1 seq_cst seq_cst
	; CHECK: __atomic_exchange_16
	%1 = atomicrmw xchg i128* %a, i128 1 seq_cst
	; CHECK: __atomic_fetch_add_16
	%2 = atomicrmw add i128* %a, i128 1 seq_cst
	; CHECK: __atomic_fetch_sub_16
	%3 = atomicrmw sub i128* %a, i128 1 seq_cst
	; CHECK: __atomic_fetch_and_16
	%4 = atomicrmw and i128* %a, i128 1 seq_cst
	; CHECK: __atomic_fetch_nand_16
	%5 = atomicrmw nand i128* %a, i128 1 seq_cst
	; CHECK: __atomic_fetch_or_16
	%6 = atomicrmw or i128* %a, i128 1 seq_cst
	; CHECK: __atomic_fetch_xor_16
	%7 = atomicrmw xor i128* %a, i128 1 seq_cst
	ret void
	}

test/CodeGen/X86/peephole-na-phys-copy-folding.ll

	; RUN: llc -mtriple=i386-linux-gnu %s -o - | FileCheck %s			; RUN: llc -mtriple=i686-linux-gnu -mcpu=i686 %s -o - | FileCheck %s
	; RUN: llc -mtriple=x86_64-linux-gnu -mattr=+sahf %s -o - | FileCheck %s			; RUN: llc -mtriple=x86_64-linux-gnu -mattr=+sahf %s -o - | FileCheck %s

	; TODO: Reenable verify-machineinstrs once the if (!AXDead) // FIXME in			; TODO: Reenable verify-machineinstrs once the if (!AXDead) // FIXME in
	; X86InstrInfo::copyPhysReg() is resolved.			; X86InstrInfo::copyPhysReg() is resolved.

	; The peephole optimizer can elide some physical register copies such as			; The peephole optimizer can elide some physical register copies such as
	; EFLAGS. Make sure the flags are used directly, instead of needlessly using			; EFLAGS. Make sure the flags are used directly, instead of needlessly using
	; lahf, when possible.			; lahf, when possible.

test/Transforms/AtomicExpand/X86/expand-atomic-rmw-initial-load.ll

	; RUN: opt -S %s -atomic-expand -mtriple=i686-linux-gnu | FileCheck %s			; RUN: opt -S %s -atomic-expand -mtriple=i686-linux-gnu -mcpu=i686 | FileCheck %s

	; This file tests the function `llvm::expandAtomicRMWToCmpXchg`.			; This file tests the function `llvm::expandAtomicRMWToCmpXchg`.
	; It isn't technically target specific, but is exposed through a pass that is.			; It isn't technically target specific, but is exposed through a pass that is.

	define i8 @test_initial_load(i8* %ptr, i8 %value) {			define i8 @test_initial_load(i8* %ptr, i8 %value) {
	%res = atomicrmw nand i8* %ptr, i8 %value seq_cst			%res = atomicrmw nand i8* %ptr, i8 %value seq_cst
	ret i8 %res			ret i8 %res
	}			}
	; CHECK-LABEL: @test_initial_load			; CHECK-LABEL: @test_initial_load
	; CHECK-NEXT: %1 = load i8, i8* %ptr, align 1			; CHECK-NEXT: %1 = load i8, i8* %ptr, align 1
