This is an archive of the discontinued LLVM Phabricator instance.

Introduction of FeatureX87
ClosedPublic

Authored by aturetsk on Oct 22 2015, 5:00 AM.

Details

Reviewers

bruno
nadav
echristo

Commits

rG6a3d561ea036: [X86] Introduction of FeatureX87.
rL264148: [X86] Introduction of FeatureX87.

Summary

Add FeatureX87 in X86 backend.
This is a preparatory change for introducing a new CPU, Lakemont, which doesn't support X87 instructions.

Diff Detail

Event Timeline

aturetsk updated this revision to Diff 38112.Oct 22 2015, 5:00 AM

aturetsk retitled this revision from to Introduction of FeatureX87.

aturetsk updated this object.

aturetsk added a reviewer: nadav.

aturetsk added a subscriber: llvm-commits.

aturetsk added a child revision: D13980: Add "x87" in x86 target feature map.Oct 22 2015, 5:12 AM

Ping.

LGTM.

Hi Andrey,

lib/Target/X86/X86Subtarget.h
403	This looks odd since we do support f32 (not f64) with SSE1. See X86ISelLowering.cpp:553:
    } else if (!Subtarget->useSoftFloat() && X86ScalarSSEf32) {
      // Use SSE for f32, x87 for f64.
      // Set up the FP register classes.
      addRegisterClass(MVT::f32, &X86::FR32RegClass);
      addRegisterClass(MVT::f64, &X86::RFP64RegClass);
      ...
Since not having SSE at all falls back to x87, why not check only for "UseSoftFloat || !hasX87()"? Anyway, I think those should come in a separate patch with appropriate feature testcases.

RKSimon added a subscriber: RKSimon.Nov 2 2015, 1:57 PM

RKSimon added inline comments.

lib/Target/X86/X86.td
57–58	You can reduce the size of the diff if you inherit FeatureX87 in FeatureMMX - all targets which declare FeatureMMX wouldn't then need to be altered.

Hi Bruno and Simon,
Thanks for the review,

lib/Target/X86/X86.td
57–58	I was thinking about it, but decided not to do so. From the technical point of view X87 has nothing to do with MMX so it makes sense to keep them unbound. Also I think it'd be good to have an opportunity to enable MMX/SSE/etc while having X87 disabled - I have no idea which features will be supported in Lakemont successors, but I don't see why that can't be the case. So if you don't mind I'd stick to the current version of the patch.
lib/Target/X86/X86Subtarget.h
403	My logic came from the hypothetical situation where we have SSE/SSE2, but not X87. If we have only SSE, there are no instructions to handle f64, so without X87 we should use soft floats. SSE2 can handle f64, so there is no need to do that.
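The two candidate conditions for useSoftFloat() debated in this thread can be compared with a small stand-alone sketch. It is only an illustration: the `Features` struct, the `SSELevel` enum, and both function names are hypothetical and are not part of X86Subtarget, and the first condition is one reading of the author's explanation rather than the code in the patch.

```cpp
// Hypothetical stand-alone model of the two conditions discussed above.
// None of these names belong to the real X86Subtarget API.
enum SSELevel { NoSSE, SSE1, SSE2 };

struct Features {
  bool UseSoftFloat; // the explicit soft-float option
  bool HasX87;       // models FeatureX87
  SSELevel SSE;      // highest SSE level enabled
};

// One reading of the author's reasoning: without x87, hardware floating
// point is only complete once SSE2 (which handles f64) is available.
constexpr bool softFloatAuthor(Features F) {
  return F.UseSoftFloat || (!F.HasX87 && F.SSE < SSE2);
}

// The reviewer's simpler alternative: any configuration without x87
// is treated as soft float.
constexpr bool softFloatReviewer(Features F) {
  return F.UseSoftFloat || !F.HasX87;
}

// The two conditions disagree only for "-x87,+sse2".
static_assert(!softFloatAuthor(Features{false, false, SSE2}),
              "author's condition keeps hardware FP for -x87,+sse2");
static_assert(softFloatReviewer(Features{false, false, SSE2}),
              "reviewer's condition selects soft float for -x87,+sse2");
```

Under these assumptions the only feature combination on which the two conditions differ is "-x87,+sse2", which is the combination the rest of the review keeps returning to.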

You've done this on an old version of the sources, you'll need to rebase.

One other reply inline.

Thanks!

-eric

lib/Target/X86/X86Subtarget.h
403	I agree with Bruno here. It also might be a good idea to not add this conditional at all and just use the legalizer. Thoughts?

Rebased and fixed questionable condition in useSoftFloat.

aturetsk added inline comments.Nov 25 2015, 6:27 AM

lib/Target/X86/X86Subtarget.h
403	For now I fixed the condition as suggested. Eric, could you please explain in more detail how you suggest using the legalizer?

Added the test.

Ping.

Hi Andrey,

lib/Target/X86/X86Subtarget.h
403	As Eric mentioned, it is better to leave the option check alone:
    bool useSoftFloat() const { return UseSoftFloat; }
Then change X86ISelLowering.cpp:589 from
    } else if (!Subtarget->useSoftFloat()) {
to
    } else if (!Subtarget->useSoftFloat() && Subtarget->hasX87()) {
This should be enough to get the behaviour you want.
test/CodeGen/X86/x87.ll
6	This is missing appropriate checks for instructions you want (or not) to be present in the output.

Hi Bruno,

lib/Target/X86/X86Subtarget.h
403	Thanks for the explanation. I have some concerns about this approach. Look at the test example:
    ; RUN: llc < %s -march=x86 -mattr=-x87,+sse
    define float @test32(float %a, float %b) nounwind readnone {
    entry:
      %0 = fadd float %a, %b
      ret float %0
    ; Generated assembly:
    ;   pushl %eax
    ;   movss 8(%esp), %xmm0
    ;   addss 12(%esp), %xmm0
    ;   movss %xmm0, (%esp)
    ;   flds (%esp)
    ;   popl %eax
    ;   retl
    }
    define double @test64(double %a, double %b) nounwind readnone {
    entry:
      %0 = fadd double %a, %b
      ret double %0
    ; Generated assembly:
    ;   fldl 4(%esp)
    ;   faddl 12(%esp)
    ;   retl
    }
The approach you suggest would generate x87 instructions (you can see the generated assembly in the comments), which is wrong since we have FeatureX87 disabled. The current version of the patch makes the compiler generate soft float calls in test32 and test64 (probably that's not the best assembly for test32 since we enabled sse, but at least it's correct). I understand that the combination "-x87,+sse" does not correspond to any existing CPU, yet llc lets the user request it through the -mattr option. So shouldn't we take care to generate correct assembly even in this case?
test/CodeGen/X86/x87.ll
6	In this test I just want to make sure that x87's fadd won't be generated. So why is "CHECK-NOT: fadd{{.*}}" not enough?

aturetsk added inline comments.Dec 15 2015, 7:56 AM

lib/Target/X86/X86Subtarget.h
403	To sum up, the approach you're suggesting does make the compiler behave as I want, since right now I'm only interested in having -mattr=-x87 work correctly. I'm ready to submit the updated patch. I'm just not sure that this approach is right from the general point of view...

Ping.

bruno added a reviewer: bruno.Jan 5 2016, 4:55 PM

bruno added inline comments.

lib/Target/X86/X86Subtarget.h
403	Hi Andrey, Sorry for not mentioning it explicitly, but the idea here is that we should only use "Subtarget->useSoftFloat()" to represent the state of the soft-float feature, whereas the logic of selecting what is actually supported should be done in X86TargetLowering::X86TargetLowering (something along the lines of the snippet I used, but if that's not enough, please add more logic to guarantee it). Does that make sense? Feel free to address it in an upcoming patch if you wish, but remember to add the "-mattr=-x87,+sse" tests when you do so.
test/CodeGen/X86/x87.ll
6	Ok. Please add simple cases where +x87 is used as well!

Fixed and rebased.

Hello Bruno,

I've added more logic in X86TargetLowering::X86TargetLowering as you suggested and improved the test to cover different combinations of x87 and sse. I also included plenty of float convert instructions in the test, since X86TargetLowering::X86TargetLowering contains non-trivial logic to handle them.

bruno added inline comments.Jan 20 2016, 9:41 AM

lib/Target/X86/X86ISelLowering.cpp
526 ↗	(On Diff #45388)	Why check Subtarget->hasX87() here? We're using SSE for f32 and f64 anyways
560 ↗	(On Diff #45388)	Factor out "!Subtarget->useSoftFloat() && Subtarget->hasX87()" with bool UseX87 = !Subtarget->useSoftFloat() && Subtarget->hasX87(); and use that throughout the checks.
596 ↗	(On Diff #45388)	UseX87 here
630 ↗	(On Diff #45388)	UseX87 here
test/CodeGen/X86/x87.ll
8	This test needs improvements; you can make it tighter by removing the allocas and other unnecessary instructions. Please explicitly check for all the specific instructions you want to match.
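The factored-out UseX87 flag suggested above can be illustrated with a small stand-alone model of how the lowering setup might choose a register file for f32 and f64. This is a simplified sketch, not the code in X86ISelLowering.cpp: `pickFPRegFiles`, `FPChoice`, and the enums are hypothetical names, and the real constructor configures far more than two types. It needs C++14 or later because of the local variable in a constexpr function.

```cpp
// Hypothetical, simplified model of the register-file choice for scalar FP.
// The names below are illustrative and do not exist in LLVM.
enum SSELevel { NoSSE, SSE1, SSE2 };
enum RegFile { SoftFloatCalls, X87Stack, SSERegs };

struct FPChoice {
  RegFile F32; // where f32 values live
  RegFile F64; // where f64 values live
};

constexpr FPChoice pickFPRegFiles(bool UseSoftFloat, bool HasX87,
                                  SSELevel SSE) {
  // The single flag the reviewer asked for, reused by every x87 branch.
  const bool UseX87 = !UseSoftFloat && HasX87;

  if (UseSoftFloat)
    return FPChoice{SoftFloatCalls, SoftFloatCalls};
  if (SSE >= SSE2) // SSE handles both f32 and f64
    return FPChoice{SSERegs, SSERegs};
  if (SSE == SSE1) // SSE for f32; f64 needs x87, otherwise library calls
    return FPChoice{SSERegs, UseX87 ? X87Stack : SoftFloatCalls};
  // No SSE at all: everything goes through x87 if it is usable.
  return UseX87 ? FPChoice{X87Stack, X87Stack}
                : FPChoice{SoftFloatCalls, SoftFloatCalls};
}

// "-x87,+sse": f32 stays in SSE registers, f64 falls back to library calls.
static_assert(pickFPRegFiles(false, false, SSE1).F32 == SSERegs, "");
static_assert(pickFPRegFiles(false, false, SSE1).F64 == SoftFloatCalls, "");
```

Keeping the soft-float option and the x87 feature as separate inputs, and deriving UseX87 once, mirrors the split the reviewers asked for: useSoftFloat() reports only the option, and the lowering code decides what is actually usable.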

Fix remarks.

aturetsk added inline comments.Feb 3 2016, 8:24 AM

lib/Target/X86/X86ISelLowering.cpp
527 ↗	(On Diff #46784)	I get X87 load and store instructions in x87.ll if I don't check hasX87 here. I think changing that would require significant efforts. Since we don't have a CPU which has -x87 but +sse2, I just left the check here.
561 ↗	(On Diff #46784)	Done.
test/CodeGen/X86/x87.ll
9	Done.

bruno added inline comments.Feb 3 2016, 11:35 AM

lib/Target/X86/X86ISelLowering.cpp
527 ↗	(On Diff #46784)	This looks odd. Do you know why it happens, and with which specific target feature combination?
test/CodeGen/X86/x87.ll
32	Please place the checks above the IR instructions you intend to match. Also put a check-label in the beginning of the function.

aturetsk added inline comments.Feb 4 2016, 6:05 AM

lib/Target/X86/X86ISelLowering.cpp
527 ↗	(On Diff #46784)	This happens with -x87,+sse2 in 32-bit mode (64-bit mode seems to be OK). The instruction to blame is "sitofp i64 %l to float". Actually, I was able to get rid of the fild instruction and get a soft float call by adding this condition at line 213:
    if (Subtarget.hasX87() || Subtarget.is64Bit())
      setOperationAction(ISD::SINT_TO_FP, MVT::i64, Custom);
However, I still get the wrong fstp instruction:
    calll __floatdisf
    fstps 20(%esp)
    movss 20(%esp), %xmm0 # xmm0 = mem[0],zero,zero,zero
But what worries me the most is that there may be more such non-obvious cases, even in the current version of the patch, where the test passes. Probably we should take a more straightforward and easier route (until we have a real case where the best code for -x87,+sse/sse2 is required): have a variable "UseX87 = !useSoftFloat() && hasX87()" and replace useSoftFloat() with !UseX87 absolutely everywhere (similar to what was done initially, but keeping useSoftFloat() unchanged through the use of a new variable). This would guarantee that the compiler generates correct code with x87 disabled.

Rebase and improve the test.

aturetsk added inline comments.Feb 19 2016, 7:10 AM

lib/Target/X86/X86ISelLowering.cpp
527 ↗	(On Diff #48485)	The reason the float store appeared is that when we lower a call (to the soft float function in this case) we use the f80 type for the float return if we have SSE. Here is the code snippet from X86ISelLowering.cpp:2428:
    // If we prefer to use the value in xmm registers, copy it out as f80 and
    // use a truncate to move it from fp stack reg to xmm reg.
    bool RoundAfterCopy = false;
    if ((VA.getLocReg() == X86::FP0 || VA.getLocReg() == X86::FP1) &&
        isScalarFPTypeInSSEReg(VA.getValVT())) {
      CopyVT = MVT::f80;
      RoundAfterCopy = (CopyVT != VA.getLocVT());
    }
The comment in the code explains what happens. Thus, currently, even when we use SSE2 we still rely on having X87, and changing that is not trivial. Looking at this now, I'm really inclined to follow Simon's advice to inherit FeatureX87 in FeatureMMX. This way SSE/SSE2 will imply X87, which makes perfect sense since they rely on it. What do you think?
test/CodeGen/X86/x87.ll
33	Done.

bruno added inline comments.Feb 19 2016, 10:17 AM

lib/Target/X86/X86ISelLowering.cpp
527 ↗	(On Diff #48485)	Is there any actual hardware that supports -x87,+sse2? Along the same line of thought: is there any ABI definition that describes the calling convention for this situation? If not, I suggest we don't need to care about this case, though a report_fatal_error sanity check (see the one right above the code snippet you pointed out) in lower call for "32-bit x86 -x87,+sse2" would be nice, since we don't support it. Inheriting FeatureX87 in FeatureMMX makes things easier for implementation purposes, but they look orthogonal to me; if anyone is willing to support -x87,+sse2, it should be supported with code to handle it plus tests. Your patch seems to add FeatureX87 to all current CPUs we support, and that should be enough to guarantee it won't break anything. That said, you can also remove "-x87,+sse2" from your tests.
test/CodeGen/X86/x87.ll
12	There's no FileCheck invocation without check-prefix, so this isn't checking for anything. Use "X86-LABEL" and "NOX87-LABEL" instead.

(I've been quiet since I haven't had anything to add to Bruno's review)

Fix the test, add sanity check in the legalizer.

aturetsk added inline comments.Feb 20 2016, 5:10 AM

lib/Target/X86/X86ISelLowering.cpp
527 ↗	(On Diff #48589)	I believe there is no such hardware or ABI, and I agree with not bothering with the "-x87,+sse2" case until it's really needed. So I left the UseX87 checks in the legalizer only in the places where X87 float register classes are used, and added a sanity check (note that it should not be 32-bit specific; it just so happened that the issue I described appeared in my testcase in 32-bit mode).
test/CodeGen/X86/x87.ll
13	Fixed.
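The sanity check agreed on above can be pictured with a stand-alone sketch: if a call result is assigned to an x87 stack register while x87 is disabled, compilation is aborted instead of emitting x87 instructions. Everything here is an assumption made for illustration: `RetLoc`, `isUnsupportedReturn`, and `checkCallResult` are hypothetical names, the error text is illustrative, and a C++ exception stands in for LLVM's report_fatal_error.

```cpp
#include <stdexcept>

// Hypothetical model of where a call result is returned.
enum RetLoc { RetInGPR, RetInXMM, RetInFP0, RetInFP1 };

// FP0 and FP1 model the x87 stack registers used for FP return values.
constexpr bool needsX87ForReturn(RetLoc Loc) {
  return Loc == RetInFP0 || Loc == RetInFP1;
}

// True for the combination the review decided not to support:
// a value returned on the x87 stack while x87 is disabled.
constexpr bool isUnsupportedReturn(RetLoc Loc, bool HasX87) {
  return needsX87ForReturn(Loc) && !HasX87;
}

// Stand-in for the fatal-error path; the real code would not throw.
inline void checkCallResult(RetLoc Loc, bool HasX87) {
  if (isUnsupportedReturn(Loc, HasX87))
    throw std::runtime_error("x87 register return with x87 disabled");
}

static_assert(isUnsupportedReturn(RetInFP0, false), "");
static_assert(!isUnsupportedReturn(RetInFP0, true), "");
```

The check is deliberately independent of 32-bit versus 64-bit mode, matching the author's note that the problem only happened to show up in a 32-bit test case.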

Thanks Andrey, LGTM

This revision is now accepted and ready to land.Feb 22 2016, 10:18 AM

Closed by commit rL264148: [X86] Introduction of FeatureX87. (authored by aturetsk).Mar 23 2016, 4:19 AM

This revision was automatically updated to reflect the committed changes.

asavonic mentioned this in D98895: [X86][clang] Disable long double type for -mno-x87 option.Apr 26 2021, 8:36 AM

Revision Contents

Path	Size
lib/Target/X86/X86.td	156 lines
lib/Target/X86/X86Subtarget.h	6 lines
lib/Target/X86/X86Subtarget.cpp	1 line
test/CodeGen/X86/x87.ll	10 lines
Diff 41387

lib/Target/X86/X86.td

[... 25 lines not shown ...]	def Mode32Bit : SubtargetFeature<"32bit-mode", "In32BitMode", "true",
"32-bit mode (80386)">;		"32-bit mode (80386)">;
def Mode16Bit : SubtargetFeature<"16bit-mode", "In16BitMode", "true",		def Mode16Bit : SubtargetFeature<"16bit-mode", "In16BitMode", "true",
"16-bit mode (i8086)">;		"16-bit mode (i8086)">;

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// X86 Subtarget features		// X86 Subtarget features
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

		def FeatureX87 : SubtargetFeature<"x87","HasX87", "true",
		"Enable X87 float instructions">;

def FeatureCMOV : SubtargetFeature<"cmov","HasCMov", "true",		def FeatureCMOV : SubtargetFeature<"cmov","HasCMov", "true",
"Enable conditional move instructions">;		"Enable conditional move instructions">;

def FeaturePOPCNT : SubtargetFeature<"popcnt", "HasPOPCNT", "true",		def FeaturePOPCNT : SubtargetFeature<"popcnt", "HasPOPCNT", "true",
"Support POPCNT instruction">;		"Support POPCNT instruction">;

def FeatureFXSR : SubtargetFeature<"fxsr", "HasFXSR", "true",		def FeatureFXSR : SubtargetFeature<"fxsr", "HasFXSR", "true",
"Support fxsave/fxrestore instructions">;		"Support fxsave/fxrestore instructions">;

def FeatureXSAVE : SubtargetFeature<"xsave", "HasXSAVE", "true",		def FeatureXSAVE : SubtargetFeature<"xsave", "HasXSAVE", "true",
"Support xsave instructions">;		"Support xsave instructions">;

def FeatureXSAVEOPT: SubtargetFeature<"xsaveopt", "HasXSAVEOPT", "true",		def FeatureXSAVEOPT: SubtargetFeature<"xsaveopt", "HasXSAVEOPT", "true",
"Support xsaveopt instructions">;		"Support xsaveopt instructions">;

def FeatureXSAVEC : SubtargetFeature<"xsavec", "HasXSAVEC", "true",		def FeatureXSAVEC : SubtargetFeature<"xsavec", "HasXSAVEC", "true",
"Support xsavec instructions">;		"Support xsavec instructions">;

def FeatureXSAVES : SubtargetFeature<"xsaves", "HasXSAVES", "true",		def FeatureXSAVES : SubtargetFeature<"xsaves", "HasXSAVES", "true",
"Support xsaves instructions">;		"Support xsaves instructions">;

def FeatureSSE1 : SubtargetFeature<"sse", "X86SSELevel", "SSE1",		def FeatureSSE1 : SubtargetFeature<"sse", "X86SSELevel", "SSE1",
"Enable SSE instructions",		"Enable SSE instructions",
// SSE codegen depends on cmovs, and all		// SSE codegen depends on cmovs, and all
// SSE1+ processors support them.		// SSE1+ processors support them.
[FeatureCMOV]>;		[FeatureCMOV]>;
def FeatureSSE2 : SubtargetFeature<"sse2", "X86SSELevel", "SSE2",		def FeatureSSE2 : SubtargetFeature<"sse2", "X86SSELevel", "SSE2",
"Enable SSE2 instructions",		"Enable SSE2 instructions",
[FeatureSSE1]>;		[FeatureSSE1]>;
def FeatureSSE3 : SubtargetFeature<"sse3", "X86SSELevel", "SSE3",		def FeatureSSE3 : SubtargetFeature<"sse3", "X86SSELevel", "SSE3",
[... 158 lines not shown ...]
def ProcIntelAtom : SubtargetFeature<"atom", "X86ProcFamily", "IntelAtom",		def ProcIntelAtom : SubtargetFeature<"atom", "X86ProcFamily", "IntelAtom",
"Intel Atom processors">;		"Intel Atom processors">;
def ProcIntelSLM : SubtargetFeature<"slm", "X86ProcFamily", "IntelSLM",		def ProcIntelSLM : SubtargetFeature<"slm", "X86ProcFamily", "IntelSLM",
"Intel Silvermont processors">;		"Intel Silvermont processors">;

class Proc<string Name, list<SubtargetFeature> Features>		class Proc<string Name, list<SubtargetFeature> Features>
: ProcessorModel<Name, GenericModel, Features>;		: ProcessorModel<Name, GenericModel, Features>;

def : Proc<"generic", [FeatureSlowUAMem16]>;		def : Proc<"generic", [FeatureX87, FeatureSlowUAMem16]>;
def : Proc<"i386", [FeatureSlowUAMem16]>;		def : Proc<"i386", [FeatureX87, FeatureSlowUAMem16]>;
def : Proc<"i486", [FeatureSlowUAMem16]>;		def : Proc<"i486", [FeatureX87, FeatureSlowUAMem16]>;
def : Proc<"i586", [FeatureSlowUAMem16]>;		def : Proc<"i586", [FeatureX87, FeatureSlowUAMem16]>;
def : Proc<"pentium", [FeatureSlowUAMem16]>;		def : Proc<"pentium", [FeatureX87, FeatureSlowUAMem16]>;
def : Proc<"pentium-mmx", [FeatureSlowUAMem16, FeatureMMX]>;		def : Proc<"pentium-mmx", [FeatureX87, FeatureSlowUAMem16, FeatureMMX]>;
def : Proc<"i686", [FeatureSlowUAMem16]>;		def : Proc<"i686", [FeatureX87, FeatureSlowUAMem16]>;
def : Proc<"pentiumpro", [FeatureSlowUAMem16, FeatureCMOV]>;		def : Proc<"pentiumpro", [FeatureX87, FeatureSlowUAMem16, FeatureCMOV]>;
def : Proc<"pentium2", [FeatureSlowUAMem16, FeatureMMX, FeatureCMOV,		def : Proc<"pentium2", [FeatureX87, FeatureSlowUAMem16, FeatureMMX,
FeatureFXSR]>;		FeatureCMOV, FeatureFXSR]>;
def : Proc<"pentium3", [FeatureSlowUAMem16, FeatureMMX, FeatureSSE1,		def : Proc<"pentium3", [FeatureX87, FeatureSlowUAMem16, FeatureMMX,
FeatureFXSR]>;		FeatureSSE1, FeatureFXSR]>;
def : Proc<"pentium3m", [FeatureSlowUAMem16, FeatureMMX, FeatureSSE1,		def : Proc<"pentium3m", [FeatureX87, FeatureSlowUAMem16, FeatureMMX,
FeatureFXSR, FeatureSlowBTMem]>;		FeatureSSE1, FeatureFXSR, FeatureSlowBTMem]>;
def : Proc<"pentium-m", [FeatureSlowUAMem16, FeatureMMX, FeatureSSE2,		def : Proc<"pentium-m", [FeatureX87, FeatureSlowUAMem16, FeatureMMX,
FeatureFXSR, FeatureSlowBTMem]>;		FeatureSSE2, FeatureFXSR, FeatureSlowBTMem]>;
def : Proc<"pentium4", [FeatureSlowUAMem16, FeatureMMX, FeatureSSE2,		def : Proc<"pentium4", [FeatureX87, FeatureSlowUAMem16, FeatureMMX,
FeatureFXSR]>;		FeatureSSE2, FeatureFXSR]>;
def : Proc<"pentium4m", [FeatureSlowUAMem16, FeatureMMX, FeatureSSE2,		def : Proc<"pentium4m", [FeatureX87, FeatureSlowUAMem16, FeatureMMX,
FeatureFXSR, FeatureSlowBTMem]>;		FeatureSSE2, FeatureFXSR, FeatureSlowBTMem]>;

// Intel Core Duo.		// Intel Core Duo.
def : ProcessorModel<"yonah", SandyBridgeModel,		def : ProcessorModel<"yonah", SandyBridgeModel,
[FeatureSlowUAMem16, FeatureMMX, FeatureSSE3, FeatureFXSR,		[FeatureX87, FeatureSlowUAMem16, FeatureMMX, FeatureSSE3,
FeatureSlowBTMem]>;		FeatureFXSR, FeatureSlowBTMem]>;

// NetBurst.		// NetBurst.
def : Proc<"prescott",		def : Proc<"prescott",
[FeatureSlowUAMem16, FeatureMMX, FeatureSSE3, FeatureFXSR,		[FeatureX87, FeatureSlowUAMem16, FeatureMMX, FeatureSSE3,
FeatureSlowBTMem]>;		FeatureFXSR, FeatureSlowBTMem]>;
def : Proc<"nocona", [		def : Proc<"nocona", [
		FeatureX87,
FeatureSlowUAMem16,		FeatureSlowUAMem16,
FeatureMMX,		FeatureMMX,
FeatureSSE3,		FeatureSSE3,
FeatureFXSR,		FeatureFXSR,
FeatureCMPXCHG16B,		FeatureCMPXCHG16B,
FeatureSlowBTMem		FeatureSlowBTMem
]>;		]>;

// Intel Core 2 Solo/Duo.		// Intel Core 2 Solo/Duo.
def : ProcessorModel<"core2", SandyBridgeModel, [		def : ProcessorModel<"core2", SandyBridgeModel, [
		FeatureX87,
FeatureSlowUAMem16,		FeatureSlowUAMem16,
FeatureMMX,		FeatureMMX,
FeatureSSSE3,		FeatureSSSE3,
FeatureFXSR,		FeatureFXSR,
FeatureCMPXCHG16B,		FeatureCMPXCHG16B,
FeatureSlowBTMem		FeatureSlowBTMem
]>;		]>;
def : ProcessorModel<"penryn", SandyBridgeModel, [		def : ProcessorModel<"penryn", SandyBridgeModel, [
		FeatureX87,
FeatureSlowUAMem16,		FeatureSlowUAMem16,
FeatureMMX,		FeatureMMX,
FeatureSSE41,		FeatureSSE41,
FeatureFXSR,		FeatureFXSR,
FeatureCMPXCHG16B,		FeatureCMPXCHG16B,
FeatureSlowBTMem		FeatureSlowBTMem
]>;		]>;

// Atom CPUs.		// Atom CPUs.
class BonnellProc<string Name> : ProcessorModel<Name, AtomModel, [		class BonnellProc<string Name> : ProcessorModel<Name, AtomModel, [
ProcIntelAtom,		ProcIntelAtom,
		FeatureX87,
FeatureSlowUAMem16,		FeatureSlowUAMem16,
FeatureMMX,		FeatureMMX,
FeatureSSSE3,		FeatureSSSE3,
FeatureFXSR,		FeatureFXSR,
FeatureCMPXCHG16B,		FeatureCMPXCHG16B,
FeatureMOVBE,		FeatureMOVBE,
FeatureSlowBTMem,		FeatureSlowBTMem,
FeatureLEAForSP,		FeatureLEAForSP,
FeatureSlowDivide32,		FeatureSlowDivide32,
FeatureSlowDivide64,		FeatureSlowDivide64,
FeatureCallRegIndirect,		FeatureCallRegIndirect,
FeatureLEAUsesAG,		FeatureLEAUsesAG,
FeaturePadShortFunctions		FeaturePadShortFunctions
]>;		]>;
def : BonnellProc<"bonnell">;		def : BonnellProc<"bonnell">;
def : BonnellProc<"atom">; // Pin the generic name to the baseline.		def : BonnellProc<"atom">; // Pin the generic name to the baseline.

class SilvermontProc<string Name> : ProcessorModel<Name, SLMModel, [		class SilvermontProc<string Name> : ProcessorModel<Name, SLMModel, [
ProcIntelSLM,		ProcIntelSLM,
		FeatureX87,
FeatureMMX,		FeatureMMX,
FeatureSSE42,		FeatureSSE42,
FeatureFXSR,		FeatureFXSR,
FeatureCMPXCHG16B,		FeatureCMPXCHG16B,
FeatureMOVBE,		FeatureMOVBE,
FeaturePOPCNT,		FeaturePOPCNT,
FeaturePCLMUL,		FeaturePCLMUL,
FeatureAES,		FeatureAES,
FeatureSlowDivide64,		FeatureSlowDivide64,
FeatureCallRegIndirect,		FeatureCallRegIndirect,
FeaturePRFCHW,		FeaturePRFCHW,
FeatureSlowLEA,		FeatureSlowLEA,
FeatureSlowIncDec,		FeatureSlowIncDec,
FeatureSlowBTMem		FeatureSlowBTMem
]>;		]>;
def : SilvermontProc<"silvermont">;		def : SilvermontProc<"silvermont">;
def : SilvermontProc<"slm">; // Legacy alias.		def : SilvermontProc<"slm">; // Legacy alias.

// "Arrandale" along with corei3 and corei5		// "Arrandale" along with corei3 and corei5
class NehalemProc<string Name> : ProcessorModel<Name, SandyBridgeModel, [		class NehalemProc<string Name> : ProcessorModel<Name, SandyBridgeModel, [
		FeatureX87,
FeatureMMX,		FeatureMMX,
FeatureSSE42,		FeatureSSE42,
FeatureFXSR,		FeatureFXSR,
FeatureCMPXCHG16B,		FeatureCMPXCHG16B,
FeatureSlowBTMem,		FeatureSlowBTMem,
FeaturePOPCNT		FeaturePOPCNT
]>;		]>;
def : NehalemProc<"nehalem">;		def : NehalemProc<"nehalem">;
def : NehalemProc<"corei7">;		def : NehalemProc<"corei7">;

// Westmere is a similar machine to nehalem with some additional features.		// Westmere is a similar machine to nehalem with some additional features.
// Westmere is the corei3/i5/i7 path from nehalem to sandybridge		// Westmere is the corei3/i5/i7 path from nehalem to sandybridge
class WestmereProc<string Name> : ProcessorModel<Name, SandyBridgeModel, [		class WestmereProc<string Name> : ProcessorModel<Name, SandyBridgeModel, [
		FeatureX87,
FeatureMMX,		FeatureMMX,
FeatureSSE42,		FeatureSSE42,
FeatureFXSR,		FeatureFXSR,
FeatureCMPXCHG16B,		FeatureCMPXCHG16B,
FeatureSlowBTMem,		FeatureSlowBTMem,
FeaturePOPCNT,		FeaturePOPCNT,
FeatureAES,		FeatureAES,
FeaturePCLMUL		FeaturePCLMUL
]>;		]>;
def : WestmereProc<"westmere">;		def : WestmereProc<"westmere">;

// SSE is not listed here since llvm treats AVX as a reimplementation of SSE,		// SSE is not listed here since llvm treats AVX as a reimplementation of SSE,
// rather than a superset.		// rather than a superset.
class SandyBridgeProc<string Name> : ProcessorModel<Name, SandyBridgeModel, [		class SandyBridgeProc<string Name> : ProcessorModel<Name, SandyBridgeModel, [
		FeatureX87,
FeatureMMX,		FeatureMMX,
FeatureAVX,		FeatureAVX,
FeatureFXSR,		FeatureFXSR,
FeatureCMPXCHG16B,		FeatureCMPXCHG16B,
FeatureSlowBTMem,		FeatureSlowBTMem,
FeatureSlowUAMem32,		FeatureSlowUAMem32,
FeaturePOPCNT,		FeaturePOPCNT,
FeatureAES,		FeatureAES,
FeaturePCLMUL,		FeaturePCLMUL,
FeatureXSAVE,		FeatureXSAVE,
FeatureXSAVEOPT		FeatureXSAVEOPT
]>;		]>;
def : SandyBridgeProc<"sandybridge">;		def : SandyBridgeProc<"sandybridge">;
def : SandyBridgeProc<"corei7-avx">; // Legacy alias.		def : SandyBridgeProc<"corei7-avx">; // Legacy alias.

class IvyBridgeProc<string Name> : ProcessorModel<Name, SandyBridgeModel, [		class IvyBridgeProc<string Name> : ProcessorModel<Name, SandyBridgeModel, [
		FeatureX87,
FeatureMMX,		FeatureMMX,
FeatureAVX,		FeatureAVX,
FeatureFXSR,		FeatureFXSR,
FeatureCMPXCHG16B,		FeatureCMPXCHG16B,
FeatureSlowBTMem,		FeatureSlowBTMem,
FeatureSlowUAMem32,		FeatureSlowUAMem32,
FeaturePOPCNT,		FeaturePOPCNT,
FeatureAES,		FeatureAES,
FeaturePCLMUL,		FeaturePCLMUL,
FeatureXSAVE,		FeatureXSAVE,
FeatureXSAVEOPT,		FeatureXSAVEOPT,
FeatureRDRAND,		FeatureRDRAND,
FeatureF16C,		FeatureF16C,
FeatureFSGSBase		FeatureFSGSBase
]>;		]>;
def : IvyBridgeProc<"ivybridge">;		def : IvyBridgeProc<"ivybridge">;
def : IvyBridgeProc<"core-avx-i">; // Legacy alias.		def : IvyBridgeProc<"core-avx-i">; // Legacy alias.

class HaswellProc<string Name> : ProcessorModel<Name, HaswellModel, [		class HaswellProc<string Name> : ProcessorModel<Name, HaswellModel, [
		FeatureX87,
FeatureMMX,		FeatureMMX,
FeatureAVX2,		FeatureAVX2,
FeatureFXSR,		FeatureFXSR,
FeatureCMPXCHG16B,		FeatureCMPXCHG16B,
FeatureSlowBTMem,		FeatureSlowBTMem,
FeaturePOPCNT,		FeaturePOPCNT,
FeatureAES,		FeatureAES,
FeaturePCLMUL,		FeaturePCLMUL,
[... 10 lines not shown ...]	class HaswellProc<string Name> : ProcessorModel<Name, HaswellModel, [
FeatureRTM,		FeatureRTM,
FeatureHLE,		FeatureHLE,
FeatureSlowIncDec		FeatureSlowIncDec
]>;		]>;
def : HaswellProc<"haswell">;		def : HaswellProc<"haswell">;
def : HaswellProc<"core-avx2">; // Legacy alias.		def : HaswellProc<"core-avx2">; // Legacy alias.

class BroadwellProc<string Name> : ProcessorModel<Name, HaswellModel, [		class BroadwellProc<string Name> : ProcessorModel<Name, HaswellModel, [
		FeatureX87,
FeatureMMX,		FeatureMMX,
FeatureAVX2,		FeatureAVX2,
FeatureFXSR,		FeatureFXSR,
FeatureCMPXCHG16B,		FeatureCMPXCHG16B,
FeatureSlowBTMem,		FeatureSlowBTMem,
FeaturePOPCNT,		FeaturePOPCNT,
FeatureAES,		FeatureAES,
FeaturePCLMUL,		FeaturePCLMUL,
[... 12 lines not shown ...]	class BroadwellProc<string Name> : ProcessorModel<Name, HaswellModel, [
FeatureADX,		FeatureADX,
FeatureRDSEED,		FeatureRDSEED,
FeatureSlowIncDec		FeatureSlowIncDec
]>;		]>;
def : BroadwellProc<"broadwell">;		def : BroadwellProc<"broadwell">;

// FIXME: define KNL model		// FIXME: define KNL model
class KnightsLandingProc<string Name> : ProcessorModel<Name, HaswellModel, [		class KnightsLandingProc<string Name> : ProcessorModel<Name, HaswellModel, [
		FeatureX87,
FeatureMMX,		FeatureMMX,
FeatureAVX512,		FeatureAVX512,
FeatureFXSR,		FeatureFXSR,
FeatureERI,		FeatureERI,
FeatureCDI,		FeatureCDI,
FeaturePFI,		FeaturePFI,
FeatureCMPXCHG16B,		FeatureCMPXCHG16B,
FeaturePOPCNT,		FeaturePOPCNT,
[... 13 lines not shown ...]	class KnightsLandingProc<string Name> : ProcessorModel<Name, HaswellModel, [
FeatureHLE,		FeatureHLE,
FeatureSlowIncDec,		FeatureSlowIncDec,
FeatureMPX		FeatureMPX
]>;		]>;
def : KnightsLandingProc<"knl">;		def : KnightsLandingProc<"knl">;

// FIXME: define SKX model		// FIXME: define SKX model
class SkylakeProc<string Name> : ProcessorModel<Name, HaswellModel, [		class SkylakeProc<string Name> : ProcessorModel<Name, HaswellModel, [
		FeatureX87,
FeatureMMX,		FeatureMMX,
FeatureAVX512,		FeatureAVX512,
FeatureFXSR,		FeatureFXSR,
FeatureCDI,		FeatureCDI,
FeatureDQI,		FeatureDQI,
FeatureBWI,		FeatureBWI,
FeatureVLX,		FeatureVLX,
FeatureCMPXCHG16B,		FeatureCMPXCHG16B,
[... 21 lines not shown ...]	class SkylakeProc<string Name> : ProcessorModel<Name, HaswellModel, [
FeatureXSAVES		FeatureXSAVES
]>;		]>;
def : SkylakeProc<"skylake">;		def : SkylakeProc<"skylake">;
def : SkylakeProc<"skx">; // Legacy alias.		def : SkylakeProc<"skx">; // Legacy alias.


// AMD CPUs.		// AMD CPUs.

def : Proc<"k6", [FeatureSlowUAMem16, FeatureMMX]>;		def : Proc<"k6", [FeatureX87, FeatureSlowUAMem16, FeatureMMX]>;
def : Proc<"k6-2", [FeatureSlowUAMem16, Feature3DNow]>;		def : Proc<"k6-2", [FeatureX87, FeatureSlowUAMem16, Feature3DNow]>;
def : Proc<"k6-3", [FeatureSlowUAMem16, Feature3DNow]>;		def : Proc<"k6-3", [FeatureX87, FeatureSlowUAMem16, Feature3DNow]>;
def : Proc<"athlon", [FeatureSlowUAMem16, Feature3DNowA,		def : Proc<"athlon", [FeatureX87, FeatureSlowUAMem16, Feature3DNowA,
FeatureSlowBTMem, FeatureSlowSHLD]>;		FeatureSlowBTMem, FeatureSlowSHLD]>;
def : Proc<"athlon-tbird", [FeatureSlowUAMem16, Feature3DNowA,		def : Proc<"athlon-tbird", [FeatureX87, FeatureSlowUAMem16, Feature3DNowA,
FeatureSlowBTMem, FeatureSlowSHLD]>;		FeatureSlowBTMem, FeatureSlowSHLD]>;
def : Proc<"athlon-4", [FeatureSlowUAMem16, FeatureSSE1, Feature3DNowA,		def : Proc<"athlon-4", [FeatureX87, FeatureSlowUAMem16, FeatureSSE1,
FeatureFXSR, FeatureSlowBTMem, FeatureSlowSHLD]>;		Feature3DNowA, FeatureFXSR, FeatureSlowBTMem,
def : Proc<"athlon-xp", [FeatureSlowUAMem16, FeatureSSE1, Feature3DNowA,
FeatureFXSR, FeatureSlowBTMem, FeatureSlowSHLD]>;
def : Proc<"athlon-mp", [FeatureSlowUAMem16, FeatureSSE1, Feature3DNowA,
FeatureFXSR, FeatureSlowBTMem, FeatureSlowSHLD]>;
def : Proc<"k8", [FeatureSlowUAMem16, FeatureSSE2, Feature3DNowA,
FeatureFXSR, Feature64Bit, FeatureSlowBTMem,
FeatureSlowSHLD]>;
def : Proc<"opteron", [FeatureSlowUAMem16, FeatureSSE2, Feature3DNowA,
FeatureFXSR, Feature64Bit, FeatureSlowBTMem,
FeatureSlowSHLD]>;
def : Proc<"athlon64", [FeatureSlowUAMem16, FeatureSSE2, Feature3DNowA,
FeatureFXSR, Feature64Bit, FeatureSlowBTMem,
FeatureSlowSHLD]>;
def : Proc<"athlon-fx", [FeatureSlowUAMem16, FeatureSSE2, Feature3DNowA,
FeatureFXSR, Feature64Bit, FeatureSlowBTMem,
FeatureSlowSHLD]>;		FeatureSlowSHLD]>;
def : Proc<"k8-sse3", [FeatureSlowUAMem16, FeatureSSE3, Feature3DNowA,		def : Proc<"athlon-xp", [FeatureX87, FeatureSlowUAMem16, FeatureSSE1,
FeatureFXSR, FeatureCMPXCHG16B, FeatureSlowBTMem,		Feature3DNowA, FeatureFXSR, FeatureSlowBTMem,
FeatureSlowSHLD]>;		FeatureSlowSHLD]>;
def : Proc<"opteron-sse3", [FeatureSlowUAMem16, FeatureSSE3, Feature3DNowA,		def : Proc<"athlon-mp", [FeatureX87, FeatureSlowUAMem16, FeatureSSE1,
FeatureFXSR, FeatureCMPXCHG16B, FeatureSlowBTMem,		Feature3DNowA, FeatureFXSR, FeatureSlowBTMem,
FeatureSlowSHLD]>;		FeatureSlowSHLD]>;
def : Proc<"athlon64-sse3", [FeatureSlowUAMem16, FeatureSSE3, Feature3DNowA,		def : Proc<"k8", [FeatureX87, FeatureSlowUAMem16, FeatureSSE2,
FeatureFXSR, FeatureCMPXCHG16B, FeatureSlowBTMem,		Feature3DNowA, FeatureFXSR, Feature64Bit,
FeatureSlowSHLD]>;		FeatureSlowBTMem, FeatureSlowSHLD]>;
def : Proc<"amdfam10", [FeatureSSE4A, Feature3DNowA, FeatureFXSR,		def : Proc<"opteron", [FeatureX87, FeatureSlowUAMem16, FeatureSSE2,
FeatureCMPXCHG16B, FeatureLZCNT, FeaturePOPCNT,		Feature3DNowA, FeatureFXSR, Feature64Bit,
		FeatureSlowBTMem, FeatureSlowSHLD]>;
		def : Proc<"athlon64", [FeatureX87, FeatureSlowUAMem16, FeatureSSE2,
		Feature3DNowA, FeatureFXSR, Feature64Bit,
		FeatureSlowBTMem, FeatureSlowSHLD]>;
		def : Proc<"athlon-fx", [FeatureX87, FeatureSlowUAMem16, FeatureSSE2,
		Feature3DNowA, FeatureFXSR, Feature64Bit,
FeatureSlowBTMem, FeatureSlowSHLD]>;		FeatureSlowBTMem, FeatureSlowSHLD]>;
def : Proc<"barcelona", [FeatureSSE4A, Feature3DNowA, FeatureFXSR,		def : Proc<"k8-sse3", [FeatureX87, FeatureSlowUAMem16, FeatureSSE3,
FeatureCMPXCHG16B, FeatureLZCNT, FeaturePOPCNT,		Feature3DNowA, FeatureFXSR, FeatureCMPXCHG16B,
FeatureSlowBTMem, FeatureSlowSHLD]>;		FeatureSlowBTMem, FeatureSlowSHLD]>;
		def : Proc<"opteron-sse3", [FeatureX87, FeatureSlowUAMem16, FeatureSSE3,
		Feature3DNowA, FeatureFXSR, FeatureCMPXCHG16B,
		FeatureSlowBTMem, FeatureSlowSHLD]>;
		def : Proc<"athlon64-sse3", [FeatureX87, FeatureSlowUAMem16, FeatureSSE3,
		Feature3DNowA, FeatureFXSR, FeatureCMPXCHG16B,
		FeatureSlowBTMem, FeatureSlowSHLD]>;
		def : Proc<"amdfam10", [FeatureX87, FeatureSSE4A, Feature3DNowA,
		FeatureFXSR, FeatureCMPXCHG16B, FeatureLZCNT,
		FeaturePOPCNT, FeatureSlowBTMem,
		FeatureSlowSHLD]>;
		def : Proc<"barcelona", [FeatureX87, FeatureSSE4A, Feature3DNowA,
		FeatureFXSR, FeatureCMPXCHG16B, FeatureLZCNT,
		FeaturePOPCNT, FeatureSlowBTMem,
		FeatureSlowSHLD]>;

// Bobcat		// Bobcat
def : Proc<"btver1", [		def : Proc<"btver1", [
		FeatureX87,
FeatureMMX,		FeatureMMX,
FeatureSSSE3,		FeatureSSSE3,
FeatureSSE4A,		FeatureSSE4A,
FeatureFXSR,		FeatureFXSR,
FeatureCMPXCHG16B,		FeatureCMPXCHG16B,
FeaturePRFCHW,		FeaturePRFCHW,
FeatureLZCNT,		FeatureLZCNT,
FeaturePOPCNT,		FeaturePOPCNT,
FeatureXSAVE,		FeatureXSAVE,
FeatureSlowSHLD		FeatureSlowSHLD
]>;		]>;

// Jaguar		// Jaguar
def : ProcessorModel<"btver2", BtVer2Model, [		def : ProcessorModel<"btver2", BtVer2Model, [
		FeatureX87,
FeatureMMX,		FeatureMMX,
FeatureAVX,		FeatureAVX,
FeatureFXSR,		FeatureFXSR,
FeatureSSE4A,		FeatureSSE4A,
FeatureCMPXCHG16B,		FeatureCMPXCHG16B,
FeaturePRFCHW,		FeaturePRFCHW,
FeatureAES,		FeatureAES,
FeaturePCLMUL,		FeaturePCLMUL,
FeatureBMI,		FeatureBMI,
FeatureF16C,		FeatureF16C,
FeatureMOVBE,		FeatureMOVBE,
FeatureLZCNT,		FeatureLZCNT,
FeaturePOPCNT,		FeaturePOPCNT,
FeatureXSAVE,		FeatureXSAVE,
FeatureXSAVEOPT,		FeatureXSAVEOPT,
FeatureSlowSHLD		FeatureSlowSHLD
]>;		]>;

// Bulldozer		// Bulldozer
def : Proc<"bdver1", [		def : Proc<"bdver1", [
		FeatureX87,
FeatureXOP,		FeatureXOP,
FeatureFMA4,		FeatureFMA4,
FeatureCMPXCHG16B,		FeatureCMPXCHG16B,
FeatureAES,		FeatureAES,
FeaturePRFCHW,		FeaturePRFCHW,
FeaturePCLMUL,		FeaturePCLMUL,
FeatureMMX,		FeatureMMX,
FeatureAVX,		FeatureAVX,
FeatureFXSR,		FeatureFXSR,
FeatureSSE4A,		FeatureSSE4A,
FeatureLZCNT,		FeatureLZCNT,
FeaturePOPCNT,		FeaturePOPCNT,
FeatureXSAVE,		FeatureXSAVE,
FeatureSlowSHLD		FeatureSlowSHLD
]>;		]>;
// Piledriver		// Piledriver
def : Proc<"bdver2", [		def : Proc<"bdver2", [
		FeatureX87,
FeatureXOP,		FeatureXOP,
FeatureFMA4,		FeatureFMA4,
FeatureCMPXCHG16B,		FeatureCMPXCHG16B,
FeatureAES,		FeatureAES,
FeaturePRFCHW,		FeaturePRFCHW,
FeaturePCLMUL,		FeaturePCLMUL,
FeatureMMX,		FeatureMMX,
FeatureAVX,		FeatureAVX,
FeatureFXSR,		FeatureFXSR,
FeatureSSE4A,		FeatureSSE4A,
FeatureF16C,		FeatureF16C,
FeatureLZCNT,		FeatureLZCNT,
FeaturePOPCNT,		FeaturePOPCNT,
FeatureXSAVE,		FeatureXSAVE,
FeatureBMI,		FeatureBMI,
FeatureTBM,		FeatureTBM,
FeatureFMA,		FeatureFMA,
FeatureSlowSHLD		FeatureSlowSHLD
]>;		]>;

// Steamroller		// Steamroller
def : Proc<"bdver3", [		def : Proc<"bdver3", [
		FeatureX87,
FeatureXOP,		FeatureXOP,
FeatureFMA4,		FeatureFMA4,
FeatureCMPXCHG16B,		FeatureCMPXCHG16B,
FeatureAES,		FeatureAES,
FeaturePRFCHW,		FeaturePRFCHW,
FeaturePCLMUL,		FeaturePCLMUL,
FeatureMMX,		FeatureMMX,
FeatureAVX,		FeatureAVX,
FeatureFXSR,		FeatureFXSR,
FeatureSSE4A,		FeatureSSE4A,
FeatureF16C,		FeatureF16C,
FeatureLZCNT,		FeatureLZCNT,
FeaturePOPCNT,		FeaturePOPCNT,
FeatureXSAVE,		FeatureXSAVE,
FeatureBMI,		FeatureBMI,
FeatureTBM,		FeatureTBM,
FeatureFMA,		FeatureFMA,
FeatureXSAVEOPT,		FeatureXSAVEOPT,
FeatureSlowSHLD,		FeatureSlowSHLD,
FeatureFSGSBase		FeatureFSGSBase
]>;		]>;

// Excavator		// Excavator
def : Proc<"bdver4", [		def : Proc<"bdver4", [
		FeatureX87,
FeatureMMX,		FeatureMMX,
FeatureAVX2,		FeatureAVX2,
FeatureFXSR,		FeatureFXSR,
FeatureXOP,		FeatureXOP,
FeatureFMA4,		FeatureFMA4,
FeatureCMPXCHG16B,		FeatureCMPXCHG16B,
FeatureAES,		FeatureAES,
FeaturePRFCHW,		FeaturePRFCHW,
FeaturePCLMUL,		FeaturePCLMUL,
FeatureF16C,		FeatureF16C,
FeatureLZCNT,		FeatureLZCNT,
FeaturePOPCNT,		FeaturePOPCNT,
FeatureXSAVE,		FeatureXSAVE,
FeatureBMI,		FeatureBMI,
FeatureBMI2,		FeatureBMI2,
FeatureTBM,		FeatureTBM,
FeatureFMA,		FeatureFMA,
FeatureXSAVEOPT,		FeatureXSAVEOPT,
FeatureFSGSBase		FeatureFSGSBase
]>;		]>;

def : Proc<"geode", [FeatureSlowUAMem16, Feature3DNowA]>;		def : Proc<"geode", [FeatureX87, FeatureSlowUAMem16, Feature3DNowA]>;

def : Proc<"winchip-c6", [FeatureSlowUAMem16, FeatureMMX]>;		def : Proc<"winchip-c6", [FeatureX87, FeatureSlowUAMem16, FeatureMMX]>;
def : Proc<"winchip2", [FeatureSlowUAMem16, Feature3DNow]>;		def : Proc<"winchip2", [FeatureX87, FeatureSlowUAMem16, Feature3DNow]>;
def : Proc<"c3", [FeatureSlowUAMem16, Feature3DNow]>;		def : Proc<"c3", [FeatureX87, FeatureSlowUAMem16, Feature3DNow]>;
def : Proc<"c3-2", [FeatureSlowUAMem16, FeatureMMX, FeatureSSE1, FeatureFXSR]>;		def : Proc<"c3-2", [FeatureX87, FeatureSlowUAMem16, FeatureMMX,
		FeatureSSE1, FeatureFXSR]>;

// We also provide a generic 64-bit specific x86 processor model which tries to		// We also provide a generic 64-bit specific x86 processor model which tries to
// be good for modern chips without enabling instruction set encodings past the		// be good for modern chips without enabling instruction set encodings past the
// basic SSE2 and 64-bit ones. It disables slow things from any mainstream and		// basic SSE2 and 64-bit ones. It disables slow things from any mainstream and
// modern 64-bit x86 chip, and enables features that are generally beneficial.		// modern 64-bit x86 chip, and enables features that are generally beneficial.
//		//
// We currently use the Sandy Bridge model as the default scheduling model as		// We currently use the Sandy Bridge model as the default scheduling model as
// we use it across Nehalem, Westmere, Sandy Bridge, and Ivy Bridge which		// we use it across Nehalem, Westmere, Sandy Bridge, and Ivy Bridge which
// covers a huge swath of x86 processors. If there are specific scheduling		// covers a huge swath of x86 processors. If there are specific scheduling
// knobs which need to be tuned differently for AMD chips, we might consider		// knobs which need to be tuned differently for AMD chips, we might consider
// forming a common base for them.		// forming a common base for them.
def : ProcessorModel<"x86-64", SandyBridgeModel,		def : ProcessorModel<"x86-64", SandyBridgeModel,
[FeatureMMX, FeatureSSE2, FeatureFXSR, Feature64Bit,		[FeatureX87, FeatureMMX, FeatureSSE2, FeatureFXSR,
FeatureSlowBTMem ]>;		Feature64Bit, FeatureSlowBTMem ]>;

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Register File Description		// Register File Description
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

include "X86RegisterInfo.td"		include "X86RegisterInfo.td"

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

lib/Target/X86/X86Subtarget.h

		protected:
PICStyles::Style PICStyle;		PICStyles::Style PICStyle;

/// SSE1, SSE2, SSE3, SSSE3, SSE41, SSE42, or none supported.		/// SSE1, SSE2, SSE3, SSSE3, SSE41, SSE42, or none supported.
X86SSEEnum X86SSELevel;		X86SSEEnum X86SSELevel;

/// MMX, 3DNow, 3DNow Athlon, or none supported.		/// MMX, 3DNow, 3DNow Athlon, or none supported.
X863DNowEnum X863DNowLevel;		X863DNowEnum X863DNowLevel;

		/// True if the processor supports X87 instructions.
		bool HasX87;

/// True if this processor has conditional move instructions		/// True if this processor has conditional move instructions
/// (generally pentium pro+).		/// (generally pentium pro+).
bool HasCMov;		bool HasCMov;

/// True if the processor supports X86-64 instructions.		/// True if the processor supports X86-64 instructions.
bool HasX86_64;		bool HasX86_64;

/// True if the processor supports POPCNT.		/// True if the processor supports POPCNT.
		public:
bool isTarget64BitLP64() const {		bool isTarget64BitLP64() const {
return In64BitMode && (TargetTriple.getEnvironment() != Triple::GNUX32 &&		return In64BitMode && (TargetTriple.getEnvironment() != Triple::GNUX32 &&
!TargetTriple.isOSNaCl());		!TargetTriple.isOSNaCl());
}		}

PICStyles::Style getPICStyle() const { return PICStyle; }		PICStyles::Style getPICStyle() const { return PICStyle; }
void setPICStyle(PICStyles::Style Style) { PICStyle = Style; }		void setPICStyle(PICStyles::Style Style) { PICStyle = Style; }

		bool hasX87() const { return HasX87; }
bool hasCMov() const { return HasCMov; }		bool hasCMov() const { return HasCMov; }
bool hasSSE1() const { return X86SSELevel >= SSE1; }		bool hasSSE1() const { return X86SSELevel >= SSE1; }
bool hasSSE2() const { return X86SSELevel >= SSE2; }		bool hasSSE2() const { return X86SSELevel >= SSE2; }
bool hasSSE3() const { return X86SSELevel >= SSE3; }		bool hasSSE3() const { return X86SSELevel >= SSE3; }
bool hasSSSE3() const { return X86SSELevel >= SSSE3; }		bool hasSSSE3() const { return X86SSELevel >= SSSE3; }
bool hasSSE41() const { return X86SSELevel >= SSE41; }		bool hasSSE41() const { return X86SSELevel >= SSE41; }
bool hasSSE42() const { return X86SSELevel >= SSE42; }		bool hasSSE42() const { return X86SSELevel >= SSE42; }
bool hasAVX() const { return X86SSELevel >= AVX; }		bool hasAVX() const { return X86SSELevel >= AVX; }
		public:
bool hasERI() const { return HasERI; }		bool hasERI() const { return HasERI; }
bool hasDQI() const { return HasDQI; }		bool hasDQI() const { return HasDQI; }
bool hasBWI() const { return HasBWI; }		bool hasBWI() const { return HasBWI; }
bool hasVLX() const { return HasVLX; }		bool hasVLX() const { return HasVLX; }
bool hasMPX() const { return HasMPX; }		bool hasMPX() const { return HasMPX; }

bool isAtom() const { return X86ProcFamily == IntelAtom; }		bool isAtom() const { return X86ProcFamily == IntelAtom; }
bool isSLM() const { return X86ProcFamily == IntelSLM; }		bool isSLM() const { return X86ProcFamily == IntelSLM; }
bool useSoftFloat() const { return UseSoftFloat; }		bool useSoftFloat() const { return UseSoftFloat || !hasX87(); }
bruno: This looks odd since we do support f32 (not f64) with SSE1. See X86ISelLowering.cpp:553

    } else if (!Subtarget->useSoftFloat() && X86ScalarSSEf32) {
      // Use SSE for f32, x87 for f64.
      // Set up the FP register classes.
      addRegisterClass(MVT::f32, &X86::FR32RegClass);
      addRegisterClass(MVT::f64, &X86::RFP64RegClass);
      ...

Since not having SSE at all falls back to x87, why not only check for "UseSoftFloat || !hasX87()"? Anyway, I think those should come in a separate patch with appropriate feature testcases.

aturetsk: My logic came from the hypothetical situation where we have SSE/SSE2, but not X87. With SSE alone there are no instructions to handle f64, so without X87 we should use soft floats. SSE2 can handle f64, so there is no need to do that.

echristo: I agree with Bruno here. It also might be a good idea to not add this conditional at all and just use the legalizer. Thoughts?

aturetsk: For now I fixed the condition as suggested. Eric, could you please explain in more detail how you suggest using the legalizer?

bruno: As Eric mentioned, better to leave the option check alone: "bool useSoftFloat() const { return UseSoftFloat; }". Then change X86ISelLowering.cpp:589 from

    } else if (!Subtarget->useSoftFloat()) {

to

    } else if (!Subtarget->useSoftFloat() && Subtarget->hasX87()) {

This should be enough to get the behaviour you want.

aturetsk: Thanks for the explanation. I have some concerns about this approach. Look at this test example:

    ; RUN: llc < %s -march=x86 -mattr=-x87,+sse

    define float @test32(float %a, float %b) nounwind readnone {
    entry:
      %0 = fadd float %a, %b
      ret float %0
    ; Generated assembly:
    ;   pushl %eax
    ;   movss 8(%esp), %xmm0
    ;   addss 12(%esp), %xmm0
    ;   movss %xmm0, (%esp)
    ;   flds (%esp)
    ;   popl %eax
    ;   retl
    }

    define double @test64(double %a, double %b) nounwind readnone {
    entry:
      %0 = fadd double %a, %b
      ret double %0
    ; Generated assembly:
    ;   fldl 4(%esp)
    ;   faddl 12(%esp)
    ;   retl
    }

The approach you suggest would generate x87 instructions (you can see the generated assembly in the comments), which is wrong since we have FeatureX87 disabled. The current version of the patch makes the compiler generate soft-float calls in test32 and test64 (probably not the best assembly for test32 since we enabled SSE, but at least it's correct). I understand that the combination "-x87,+sse" does not correspond to any existing CPU, yet llc lets the user request it through the -mattr option. So shouldn't we take care to generate correct assembly even in this case?

aturetsk: To sum up, the approach you're suggesting does make the compiler behave as I want, since right now I'm only interested in having -mattr=-x87 work correctly. And I'm ready to submit the updated patch. I'm just not sure that this approach is right from the general point of view...

bruno: Hi Andrey, sorry for not mentioning it explicitly, but the idea here is that we should only use "Subtarget->useSoftFloat()" to represent the state of the soft-float feature, whereas the logic of selecting what is actually supported should be done in X86TargetLowering::X86TargetLowering (something along the lines of the snippet I used, but if that's not enough, please add more logic to guarantee it). Does that make sense? Feel free to address it in an upcoming patch if you wish, but remember to add the "-mattr=-x87,+sse" tests when you do so.
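
The rule this thread converges on (keep useSoftFloat() tied to the option alone, and decide per floating-point type in the lowering code) can be sketched as a small standalone C++ model. This is only an illustration under stated assumptions: SubtargetModel, SSELevel, FPLowering, lowerF32 and lowerF64 are hypothetical names that do not exist in LLVM, and the model ignores f80 as well as calling-convention details such as returning floats in ST(0) on 32-bit x86 (the flds in the assembly quoted above).

```cpp
// Hypothetical, self-contained model of the discussion above; not LLVM code.
enum class SSELevel { None, SSE1, SSE2 };
enum class FPLowering { SoftFloat, X87, SSE };

struct SubtargetModel {
  bool UseSoftFloat; // the user-facing soft-float option only
  bool HasX87;       // stands in for FeatureX87
  SSELevel SSE;      // highest SSE level available

  constexpr bool useSoftFloat() const { return UseSoftFloat; }
  constexpr bool hasX87() const { return HasX87; }
};

// f32: SSE1 is enough for scalar single precision; otherwise fall back to
// x87 if present, and to soft-float calls if not.
constexpr FPLowering lowerF32(const SubtargetModel &ST) {
  return ST.useSoftFloat()            ? FPLowering::SoftFloat
         : (ST.SSE >= SSELevel::SSE1) ? FPLowering::SSE
         : ST.hasX87()                ? FPLowering::X87
                                      : FPLowering::SoftFloat;
}

// f64: needs SSE2; with SSE1 only, double precision still depends on x87.
constexpr FPLowering lowerF64(const SubtargetModel &ST) {
  return ST.useSoftFloat()            ? FPLowering::SoftFloat
         : (ST.SSE >= SSELevel::SSE2) ? FPLowering::SSE
         : ST.hasX87()                ? FPLowering::X87
                                      : FPLowering::SoftFloat;
}

// The "-mattr=-x87,+sse" case from the thread: f32 can stay on SSE,
// while f64 has no hardware path left in this model.
constexpr SubtargetModel NoX87WithSSE1{false, false, SSELevel::SSE1};
static_assert(lowerF32(NoX87WithSSE1) == FPLowering::SSE, "f32 uses SSE");
static_assert(lowerF64(NoX87WithSSE1) == FPLowering::SoftFloat,
              "f64 falls back to soft float");
```

In this model, folding !hasX87() into useSoftFloat() would also force f32 to soft float in the SSE1 case, which is the behaviour bruno's first comment calls odd.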

const Triple &getTargetTriple() const { return TargetTriple; }		const Triple &getTargetTriple() const { return TargetTriple; }

bool isTargetDarwin() const { return TargetTriple.isOSDarwin(); }		bool isTargetDarwin() const { return TargetTriple.isOSDarwin(); }
bool isTargetFreeBSD() const { return TargetTriple.isOSFreeBSD(); }		bool isTargetFreeBSD() const { return TargetTriple.isOSFreeBSD(); }
bool isTargetDragonFly() const { return TargetTriple.isOSDragonFly(); }		bool isTargetDragonFly() const { return TargetTriple.isOSDragonFly(); }
bool isTargetSolaris() const { return TargetTriple.isOSSolaris(); }		bool isTargetSolaris() const { return TargetTriple.isOSSolaris(); }
bool isTargetPS4() const { return TargetTriple.isPS4(); }		bool isTargetPS4() const { return TargetTriple.isPS4(); }

lib/Target/X86/X86Subtarget.cpp

		void X86Subtarget::initSubtargetFeatures(StringRef CPU, StringRef FS) {
else if (isTargetDarwin() || isTargetLinux() || isTargetSolaris() ||		else if (isTargetDarwin() || isTargetLinux() || isTargetSolaris() ||
In64BitMode)		In64BitMode)
stackAlignment = 16;		stackAlignment = 16;
}		}

void X86Subtarget::initializeEnvironment() {		void X86Subtarget::initializeEnvironment() {
X86SSELevel = NoSSE;		X86SSELevel = NoSSE;
X863DNowLevel = NoThreeDNow;		X863DNowLevel = NoThreeDNow;
		HasX87 = false;
HasCMov = false;		HasCMov = false;
HasX86_64 = false;		HasX86_64 = false;
HasPOPCNT = false;		HasPOPCNT = false;
HasSSE4A = false;		HasSSE4A = false;
HasAES = false;		HasAES = false;
HasFXSR = false;		HasFXSR = false;
HasXSAVE = false;		HasXSAVE = false;
HasXSAVEOPT = false;		HasXSAVEOPT = false;

test/CodeGen/X86/x87.ll

This file was added.

				; RUN: llc < %s -march=x86 -mattr=-x87 | FileCheck %s
				; RUN: llc < %s -march=x86-64 -mattr=-x87 | FileCheck %s

				; CHECK-NOT: fadd{{.*}}

				define float @foo(float %a, float %b) nounwind readnone {
bruno: This is missing appropriate checks for the instructions you want (or do not want) to be present in the output.

aturetsk: In this test I just want to make sure that x87's fadd won't be generated. So why is "CHECK-NOT: fadd{{.*}}" not enough?

bruno: Ok. Please add simple cases where +x87 is used as well!
				entry:
				%0 = fadd float %a, %b
bruno: This test needs improvements; you can make it tighter by removing the allocas and other unnecessary instructions. Please explicitly check for all the specific instructions you want to match.
				ret float %0
aturetsk: Done.
				}
bruno: Please place the checks above the IR instructions you intend to match. Also put a check-label at the beginning of the function.

aturetsk: Done.

bruno: There's no FileCheck invocation without check-prefix, so this isn't checking for anything. Use "X86-LABEL" and "NOX87-LABEL" instead.

aturetsk: Fixed.
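
The last point in this thread, that a directive is inert unless some FileCheck invocation selects its prefix, can be illustrated with a toy C++ model. This is a sketch only and not how FileCheck is implemented: isDirectiveFor, countDirectives and exampleUsage are hypothetical helpers, and the RUN line and the NOX87 prefix in the sample input are assumptions made for the example rather than the contents of the committed test.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Toy model only. Returns true if Line carries a directive for Prefix,
// i.e. "<Prefix>:" or "<Prefix>-<SUFFIX>:" (for example "NOX87-NOT:").
inline bool isDirectiveFor(const std::string &Line, const std::string &Prefix) {
  for (std::size_t Pos = Line.find(Prefix); Pos != std::string::npos;
       Pos = Line.find(Prefix, Pos + 1)) {
    // Reject matches inside a longer word, e.g. "X87" inside "NOX87".
    if (Pos > 0) {
      unsigned char Prev = static_cast<unsigned char>(Line[Pos - 1]);
      if (std::isalnum(Prev) || Prev == '-' || Prev == '_')
        continue;
    }
    std::size_t End = Pos + Prefix.size();
    if (End < Line.size() && Line[End] == '-') {
      // Skip an upper-case suffix such as LABEL, NOT or NEXT.
      ++End;
      while (End < Line.size() &&
             std::isupper(static_cast<unsigned char>(Line[End])))
        ++End;
    }
    if (End < Line.size() && Line[End] == ':')
      return true;
  }
  return false;
}

// Counts the lines that would be treated as directives for Prefix.
inline std::size_t countDirectives(const std::vector<std::string> &Lines,
                                   const std::string &Prefix) {
  std::size_t N = 0;
  for (const std::string &L : Lines)
    if (isDirectiveFor(L, Prefix))
      ++N;
  return N;
}

inline void exampleUsage() {
  // Hypothetical test file: the only RUN line selects the NOX87 prefix.
  const std::vector<std::string> Test = {
      "; RUN: llc < %s -mattr=-x87 | FileCheck %s --check-prefix=NOX87",
      "; CHECK-LABEL: foo",
      "; NOX87-LABEL: foo",
      "; NOX87-NOT: fadd",
  };
  // Two directives are active under the selected prefix ...
  assert(countDirectives(Test, "NOX87") == 2);
  // ... while the CHECK-LABEL line belongs to a prefix no RUN line selects.
  assert(countDirectives(Test, "CHECK") == 1);
  assert(countDirectives(Test, "X87") == 0);
}
```

Calling exampleUsage() exercises the sample input; the CHECK-LABEL line is counted only under the default CHECK prefix, which none of the modelled invocations uses.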

Revision Contents

Diff 41387

lib/Target/X86/X86.td

lib/Target/X86/X86Subtarget.h

lib/Target/X86/X86Subtarget.cpp

test/CodeGen/X86/x87.ll
