This is an archive of the discontinued LLVM Phabricator instance.

Introduction of FeatureX87
ClosedPublic

Authored by aturetsk on Oct 22 2015, 5:00 AM.

Details

Reviewers

bruno
nadav
echristo

Commits

rG6a3d561ea036: [X86] Introduction of FeatureX87.
rL264148: [X86] Introduction of FeatureX87.

Summary

Add FeatureX87 in X86 backend.
This is a preparatory change for introducing a new CPU - Lakemont - which doesn't support X87 instructions.

Diff Detail

Repository: rL LLVM

Event Timeline

aturetsk updated this revision to Diff 38112. Oct 22 2015, 5:00 AM

aturetsk retitled this revision from to Introduction of FeatureX87.

aturetsk updated this object.

aturetsk added a reviewer: nadav.

aturetsk added a subscriber: llvm-commits.

aturetsk added a child revision: D13980: Add "x87" in x86 target feature map. Oct 22 2015, 5:12 AM

Ping.

LGTM.

Hi Andrey,

lib/Target/X86/X86Subtarget.h
406 ↗	(On Diff #38112)	This looks odd since we do support f32 (not f64) with SSE1. See X86ISelLowering.cpp:553

    } else if (!Subtarget->useSoftFloat() && X86ScalarSSEf32) {
      // Use SSE for f32, x87 for f64.
      // Set up the FP register classes.
      addRegisterClass(MVT::f32, &X86::FR32RegClass);
      addRegisterClass(MVT::f64, &X86::RFP64RegClass);
      ...

Since not having SSE at all falls back to x87, why not only check for "UseSoftFloat || !hasX87()"? Anyway, I think those should come in a separate patch with appropriate feature testcases.

RKSimon added a subscriber: RKSimon. Nov 2 2015, 1:57 PM

RKSimon added inline comments.

lib/Target/X86/X86.td
62 ↗	(On Diff #38112)	You can reduce the size of the diff if you inherit FeatureX87 in FeatureMMX - all targets which declare FeatureMMX wouldn't then need to be altered.

Hi Bruno and Simon,
Thanks for the review,

lib/Target/X86/X86.td
62 ↗	(On Diff #38112)	I was thinking about it, but decided not to do so. From the technical point of view X87 has nothing to do with MMX so it makes sense to keep them unbound. Also I think it'd be good to have an opportunity to enable MMX/SSE/etc while having X87 disabled - I have no idea which features will be supported in Lakemont successors, but I don't see why that can't be the case. So if you don't mind I'd stick to the current version of the patch.
lib/Target/X86/X86Subtarget.h
406 ↗	(On Diff #38112)	My logic came from the hypothetical situation where we have SSE/SSE2 but not X87. With only SSE1 there are no instructions to handle f64, so without X87 we should use soft float. SSE2 can handle f64, so there is no need to do that.
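The argument in this reply can be sketched as a small capability model. This is only an illustration, not code from the patch: the `Features` struct, the `SSELevel` encoding, and the function names are hypothetical, and the model looks only at arithmetic support, ignoring the calling-convention issues that come up later in the thread.

```cpp
#include <cassert>

// Hypothetical stand-in for the subtarget feature bits discussed above.
struct Features {
  bool HasX87;
  int SSELevel; // 0 = no SSE, 1 = SSE1 only, 2 = SSE2 or later
};

enum class FPType { F32, F64 };

// True if some hardware unit can do scalar arithmetic on Ty:
// x87 covers f32 and f64, SSE1 covers f32 only, SSE2 covers both.
constexpr bool hasHardwareFP(Features F, FPType Ty) {
  if (F.HasX87)
    return true;
  if (Ty == FPType::F32)
    return F.SSELevel >= 1;
  return F.SSELevel >= 2;
}

// Soft float is needed for a type when no hardware unit handles it.
constexpr bool needsSoftFloat(Features F, FPType Ty) {
  return !hasHardwareFP(F, Ty);
}

// -x87,+sse: f64 has no hardware support, so it needs soft float.
static_assert(needsSoftFloat(Features{false, 1}, FPType::F64), "");
// -x87,+sse2: SSE2 handles f64, so no soft float is needed for arithmetic.
static_assert(!needsSoftFloat(Features{false, 2}, FPType::F64), "");
```

The sketch decides per type, whereas useSoftFloat() in the patch is a single subtarget-wide answer; that difference is part of what the reviewers question in the following comments.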

You've done this on an old version of the sources, you'll need to rebase.

One other reply inline.

Thanks!

-eric

lib/Target/X86/X86Subtarget.h
406 ↗	(On Diff #38112)	I agree with Bruno here. It also might be a good idea to not add this conditional at all and just use the legalizer. Thoughts?

Rebased and fixed questionable condition in useSoftFloat.

aturetsk added inline comments. Nov 25 2015, 6:27 AM

lib/Target/X86/X86Subtarget.h
403 ↗	(On Diff #41138)	For now I fixed the condition as suggested. Eric, could you please explain in more detail how you suggest using the legalizer?

Added the test.

Ping.

Hi Andrey,

lib/Target/X86/X86Subtarget.h
403 ↗	(On Diff #41387)	As Eric mentioned, better to leave the option check alone:

    bool useSoftFloat() const { return UseSoftFloat; }

Then change X86ISelLowering.cpp:589 from

    } else if (!Subtarget->useSoftFloat()) {

to

    } else if (!Subtarget->useSoftFloat() && Subtarget->hasX87()) {

This should be enough to get the behaviour you want.
test/CodeGen/X86/x87.ll
6 ↗	(On Diff #41387)	This is missing appropriate checks for instructions you want (or not) to be present in the output.

Hi Bruno,

lib/Target/X86/X86Subtarget.h
403 ↗	(On Diff #41387)	Thanks for the explanation. I have some concerns about this approach. Look at the test example:

    ; RUN: llc < %s -march=x86 -mattr=-x87,+sse

    define float @test32(float %a, float %b) nounwind readnone {
    entry:
      %0 = fadd float %a, %b
      ret float %0
    ; Generated assembly:
    ;   pushl %eax
    ;   movss 8(%esp), %xmm0
    ;   addss 12(%esp), %xmm0
    ;   movss %xmm0, (%esp)
    ;   flds (%esp)
    ;   popl %eax
    ;   retl
    }

    define double @test64(double %a, double %b) nounwind readnone {
    entry:
      %0 = fadd double %a, %b
      ret double %0
    ; Generated assembly:
    ;   fldl 4(%esp)
    ;   faddl 12(%esp)
    ;   retl
    }

The approach you suggest would generate x87 instructions (see the generated assembly in the comments), which is wrong since FeatureX87 is disabled. The current version of the patch makes the compiler generate soft float calls in test32 and test64 (probably not the best assembly for test32 since SSE is enabled, but at least it's correct). I understand that the combination "-x87,+sse" does not correspond to any existing CPU, yet llc lets the user request it through the -mattr option. So shouldn't we take care to generate correct assembly even in this case?
test/CodeGen/X86/x87.ll
6 ↗	(On Diff #41387)	In this test I just want to make sure that x87's fadd won't be generated. So why is "CHECK-NOT: fadd{{.*}}" not enough?

aturetsk added inline comments. Dec 15 2015, 7:56 AM

lib/Target/X86/X86Subtarget.h
403 ↗	(On Diff #41387)	To sum up, the approach you're suggesting does make the compiler behave as I want, since right now I'm only interested in having -mattr=-x87 work correctly. And I'm ready to submit the updated patch. I'm just not sure that this approach is right from the general point of view...

Ping.

bruno added a reviewer: bruno. Jan 5 2016, 4:55 PM

bruno added inline comments.

lib/Target/X86/X86Subtarget.h
403 ↗	(On Diff #41387)	Hi Andrey, Sorry for not mentioning it explicitly, but the idea here is that we should only use "Subtarget->useSoftFloat()" to represent the state of the soft-float feature, whereas the logic of selecting what is actually supported should be done in X86TargetLowering::X86TargetLowering (something along the lines of the snippet I used, but if that's not enough, please add more logic to guarantee it). Does that make sense? Feel free to address it in an upcoming patch if you wish, but remember to add the "-mattr=-x87,+sse" tests when you do so.
test/CodeGen/X86/x87.ll
6 ↗	(On Diff #41387)	Ok. Please add simple cases where +x86 is used as well!

Fixed and rebased.

Hello Bruno,

I've added more logic in X86TargetLowering::X86TargetLowering as you suggested and improved the test to cover different combinations of x87 and sse. I also included plenty of float conversion instructions in the test, since X86TargetLowering::X86TargetLowering contains non-trivial logic to handle them.

bruno added inline comments. Jan 20 2016, 9:41 AM

lib/Target/X86/X86ISelLowering.cpp
526 ↗	(On Diff #45388)	Why check Subtarget->hasX87() here? We're using SSE for f32 and f64 anyways
560 ↗	(On Diff #45388)	Factor out "!Subtarget->useSoftFloat() && Subtarget->hasX87()" with bool UseX87 = !Subtarget->useSoftFloat() && Subtarget->hasX87(); and use that throughout the checks.
596 ↗	(On Diff #45388)	UseX87 here
630 ↗	(On Diff #45388)	UseX87 here
test/CodeGen/X86/x87.ll
7 ↗	(On Diff #45388)	This test needs improvements; you can make it tighter by removing the allocas and other unnecessary instructions. Please explicitly check for all the specific instructions you want to match.
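The factoring requested above can be sketched as follows. This is a simplified, self-contained model rather than the code in X86ISelLowering.cpp: `SubtargetModel`, `FPRegClasses`, and `selectFPRegClasses` are hypothetical names, register classes are represented as plain strings, and an empty string stands for "no hardware register class, leave the type to soft-float legalization". The SSE2 branch is shown without an x87 check, following the question raised for line 526.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the X86Subtarget queries used in the review.
struct SubtargetModel {
  bool SoftFloat; // models useSoftFloat()
  bool HasX87;    // models hasX87()
  int SSELevel;   // 0 = no SSE, 1 = SSE1 only, 2 = SSE2 or later
};

// Register class chosen for each scalar FP type ("" means none).
struct FPRegClasses {
  std::string F32;
  std::string F64;
};

FPRegClasses selectFPRegClasses(const SubtargetModel &ST) {
  // Computed once and reused in every branch that touches x87 classes.
  const bool UseX87 = !ST.SoftFloat && ST.HasX87;
  const bool ScalarSSEf32 = ST.SSELevel >= 1;
  const bool ScalarSSEf64 = ST.SSELevel >= 2;

  if (!ST.SoftFloat && ScalarSSEf64)
    return {"FR32", "FR64"};   // SSE for f32 and f64
  if (UseX87 && ScalarSSEf32)
    return {"FR32", "RFP64"};  // SSE for f32, x87 for f64
  if (UseX87)
    return {"RFP32", "RFP64"}; // x87 for both
  return {"", ""};             // no hardware FP register classes
}
```

Under this model, -x87,+sse selects no hardware class for either type, which corresponds to the soft-float calls described earlier in the thread for that combination.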

Fix remarks.

aturetsk added inline comments. Feb 3 2016, 8:24 AM

lib/Target/X86/X86ISelLowering.cpp
527 ↗	(On Diff #46784)	I get X87 load and store instructions in x87.ll if I don't check hasX87 here. I think changing that would require significant effort. Since we don't have a CPU which has -x87 but +sse2, I just left the check here.
561 ↗	(On Diff #46784)	Done.
test/CodeGen/X86/x87.ll
8 ↗	(On Diff #46784)	Done.

bruno added inline comments. Feb 3 2016, 11:35 AM

lib/Target/X86/X86ISelLowering.cpp
527 ↗	(On Diff #46784)	This looks odd. Do you know why it happens, and with which specific target feature combination?
test/CodeGen/X86/x87.ll
31 ↗	(On Diff #46784)	Please place the checks above the IR instructions you intend to match. Also put a check-label in the beginning of the function.

aturetsk added inline comments. Feb 4 2016, 6:05 AM

lib/Target/X86/X86ISelLowering.cpp
527 ↗	(On Diff #46784)	This happens with -x87,+sse2 in 32-bit mode (64-bit mode seems to be OK). The instruction to blame is "sitofp i64 %l to float". Actually I was able to get rid of the fild instruction and get a soft float call by adding a condition at line 213:

    if (Subtarget.hasX87() || Subtarget.is64Bit())
      setOperationAction(ISD::SINT_TO_FP, MVT::i64, Custom);

However I still get the wrong fstp instruction:

    calll __floatdisf
    fstps 20(%esp)
    movss 20(%esp), %xmm0 # xmm0 = mem[0],zero,zero,zero

But what worries me the most is that there may be more such non-obvious cases, even in the current version of the patch, which passes the test. Probably we should take a more straightforward and easier route (until we have a real case where the best code for -x87,+sse/sse2 is required): have a variable "UseX87 = !useSoftFloat() && hasX87()" and replace useSoftFloat() with !UseX87 absolutely everywhere (similar to what was done initially, but keeping useSoftFloat() unchanged through the use of a new variable). This would guarantee that the compiler generates correct code with x87 disabled.

Rebase and improve the test.

aturetsk added inline comments. Feb 19 2016, 7:10 AM

lib/Target/X86/X86ISelLowering.cpp
527 ↗	(On Diff #48485)	The float store appears because, when we lower a call (to the soft float function in this case), we use the f80 type for the float return if we have SSE. Here is the code snippet from X86ISelLowering.cpp:2428:

    // If we prefer to use the value in xmm registers, copy it out as f80 and
    // use a truncate to move it from fp stack reg to xmm reg.
    bool RoundAfterCopy = false;
    if ((VA.getLocReg() == X86::FP0 || VA.getLocReg() == X86::FP1) &&
        isScalarFPTypeInSSEReg(VA.getValVT())) {
      CopyVT = MVT::f80;
      RoundAfterCopy = (CopyVT != VA.getLocVT());
    }

The comment in the code explains what happens. Thus, even when we use SSE2 we currently still rely on having X87, and changing that is not trivial. Looking at this, I'm now really inclined to follow Simon's advice to inherit FeatureX87 in FeatureMMX. This way SSE/SSE2 will imply X87, which makes perfect sense since they rely on it. What do you think?
test/CodeGen/X86/x87.ll
32 ↗	(On Diff #48485)	Done.
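The quoted return-value handling can be modeled in isolation to see when the f80 copy and the extra rounding step appear. This is a sketch under stated assumptions, not LLVM code: the enums and `planReturnCopy` are hypothetical, `CopyVT` is assumed to start out as the location type, and `isScalarFPTypeInSSEReg` is assumed to mean "f32 with SSE1 or later, f64 with SSE2 or later".

```cpp
#include <cassert>

// Hypothetical subset of return locations and value types.
enum class RetLoc { FP0, FP1, XMM0, Other };
enum class ValueType { f32, f64, f80 };

struct CopyPlan {
  ValueType CopyVT;    // type used to copy the value out of the return register
  bool RoundAfterCopy; // whether a truncating round is needed afterwards
};

// Assumed model: f32 lives in xmm registers with SSE1+, f64 with SSE2+.
bool isScalarFPTypeInSSEReg(ValueType VT, int SSELevel) {
  return (VT == ValueType::f32 && SSELevel >= 1) ||
         (VT == ValueType::f64 && SSELevel >= 2);
}

CopyPlan planReturnCopy(RetLoc Loc, ValueType LocVT, ValueType ValVT,
                        int SSELevel) {
  CopyPlan Plan{LocVT, false}; // assumption: CopyVT defaults to the location type
  // Value comes back on the x87 stack but is wanted in an xmm register:
  // copy it out as f80, then round down to the real type.
  if ((Loc == RetLoc::FP0 || Loc == RetLoc::FP1) &&
      isScalarFPTypeInSSEReg(ValVT, SSELevel)) {
    Plan.CopyVT = ValueType::f80;
    Plan.RoundAfterCopy = (Plan.CopyVT != LocVT);
  }
  return Plan;
}
```

In this model a float returned in FP0 with SSE2 enabled always takes the f80 path, and that path is what still requires x87 to be present.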

bruno added inline comments. Feb 19 2016, 10:17 AM

lib/Target/X86/X86ISelLowering.cpp
527 ↗	(On Diff #48485)	Is there actually any hardware that supports -x87,+sse2? Along the same line of thought: is there any ABI definition that describes the calling convention for this situation? If not, I suggest we don't need to care about this case, though a report_fatal_error sanity check (see the one right above the code snippet you pointed out) in call lowering for "32-bit x86 -x87,+sse2" would be nice, since we don't support it. Inheriting FeatureX87 in FeatureMMX makes it easier for implementation purposes but they look orthogonal to me; if anyone is willing to support -x87,+sse2, it should be supported with code to handle it plus tests. Your patch seems to add FeatureX87 to all current CPUs we support, and that should be enough to guarantee it won't break anything. That said, you can also remove "-x87,+sse2" from your tests.
test/CodeGen/X86/x87.ll
11 ↗	(On Diff #48485)	There's no FileCheck invocation without check-prefix, so this isn't checking for anything. Use "X86-LABEL" and "NOX87-LABEL" instead.
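The sanity check suggested above might look roughly like the following standalone model. It is not the code that was committed: `ReturnReg` and `diagnoseX87Return` are hypothetical, the message text is invented for illustration, and a returned string stands in for the call to report_fatal_error so that the sketch can run outside LLVM.

```cpp
#include <cassert>

// Hypothetical subset of the registers a value can be returned in.
enum class ReturnReg { FP0, FP1, XMM0, EAX };

// Returns a diagnostic if a value would be returned on the x87 stack while
// x87 is disabled, or nullptr if the combination is fine. In the backend the
// non-null case would be a fatal error rather than a returned string.
const char *diagnoseX87Return(ReturnReg Reg, bool HasX87) {
  const bool OnX87Stack = (Reg == ReturnReg::FP0 || Reg == ReturnReg::FP1);
  if (OnX87Stack && !HasX87)
    return "x87 register return with x87 disabled"; // illustrative wording
  return nullptr;
}
```

Failing loudly here means an unsupported combination such as -x87,+sse2 is reported at the point where the x87 stack is needed, instead of emitting x87 instructions the target does not have.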

(I've been quiet since I haven't had anything to add to Bruno's review)

Fix the test, add sanity check in the legalizer.

aturetsk added inline comments. Feb 20 2016, 5:10 AM

lib/Target/X86/X86ISelLowering.cpp
527 ↗	(On Diff #48589)	I believe there is no such hardware or ABI, and I agree with not bothering with the "-x87,+sse2" case until it's really needed. So I left UseX87 checks in the legalizer only in the places where X87 float register classes are used and added a sanity check (note that it should not be 32-bit specific; it just so happened that the issue I described appeared in my testcase in 32-bit mode).
test/CodeGen/X86/x87.ll
12 ↗	(On Diff #48589)	Fixed.

Thanks Andrey, LGTM

This revision is now accepted and ready to land. Feb 22 2016, 10:18 AM

Closed by commit rL264148: [X86] Introduction of FeatureX87. (authored by aturetsk). Mar 23 2016, 4:19 AM

This revision was automatically updated to reflect the committed changes.

asavonic mentioned this in D98895: [X86][clang] Disable long double type for -mno-x87 option. Apr 26 2021, 8:36 AM

Diff 51397

llvm/trunk/lib/Target/X86/X86.td

[25 unchanged lines not shown]
def Mode32Bit : SubtargetFeature<"32bit-mode", "In32BitMode", "true",		def Mode32Bit : SubtargetFeature<"32bit-mode", "In32BitMode", "true",
"32-bit mode (80386)">;		"32-bit mode (80386)">;
def Mode16Bit : SubtargetFeature<"16bit-mode", "In16BitMode", "true",		def Mode16Bit : SubtargetFeature<"16bit-mode", "In16BitMode", "true",
"16-bit mode (i8086)">;		"16-bit mode (i8086)">;

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// X86 Subtarget features		// X86 Subtarget features
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

		def FeatureX87 : SubtargetFeature<"x87","HasX87", "true",
		"Enable X87 float instructions">;

def FeatureCMOV : SubtargetFeature<"cmov","HasCMov", "true",		def FeatureCMOV : SubtargetFeature<"cmov","HasCMov", "true",
"Enable conditional move instructions">;		"Enable conditional move instructions">;

def FeaturePOPCNT : SubtargetFeature<"popcnt", "HasPOPCNT", "true",		def FeaturePOPCNT : SubtargetFeature<"popcnt", "HasPOPCNT", "true",
"Support POPCNT instruction">;		"Support POPCNT instruction">;

def FeatureFXSR : SubtargetFeature<"fxsr", "HasFXSR", "true",		def FeatureFXSR : SubtargetFeature<"fxsr", "HasFXSR", "true",
"Support fxsave/fxrestore instructions">;		"Support fxsave/fxrestore instructions">;
[212 unchanged lines not shown]
def ProcIntelAtom : SubtargetFeature<"atom", "X86ProcFamily", "IntelAtom",		def ProcIntelAtom : SubtargetFeature<"atom", "X86ProcFamily", "IntelAtom",
"Intel Atom processors">;		"Intel Atom processors">;
def ProcIntelSLM : SubtargetFeature<"slm", "X86ProcFamily", "IntelSLM",		def ProcIntelSLM : SubtargetFeature<"slm", "X86ProcFamily", "IntelSLM",
"Intel Silvermont processors">;		"Intel Silvermont processors">;

class Proc<string Name, list<SubtargetFeature> Features>		class Proc<string Name, list<SubtargetFeature> Features>
: ProcessorModel<Name, GenericModel, Features>;		: ProcessorModel<Name, GenericModel, Features>;

def : Proc<"generic", [FeatureSlowUAMem16]>;		def : Proc<"generic", [FeatureX87, FeatureSlowUAMem16]>;
def : Proc<"i386", [FeatureSlowUAMem16]>;		def : Proc<"i386", [FeatureX87, FeatureSlowUAMem16]>;
def : Proc<"i486", [FeatureSlowUAMem16]>;		def : Proc<"i486", [FeatureX87, FeatureSlowUAMem16]>;
def : Proc<"i586", [FeatureSlowUAMem16]>;		def : Proc<"i586", [FeatureX87, FeatureSlowUAMem16]>;
def : Proc<"pentium", [FeatureSlowUAMem16]>;		def : Proc<"pentium", [FeatureX87, FeatureSlowUAMem16]>;
def : Proc<"pentium-mmx", [FeatureSlowUAMem16, FeatureMMX]>;		def : Proc<"pentium-mmx", [FeatureX87, FeatureSlowUAMem16, FeatureMMX]>;
def : Proc<"i686", [FeatureSlowUAMem16]>;		def : Proc<"i686", [FeatureX87, FeatureSlowUAMem16]>;
def : Proc<"pentiumpro", [FeatureSlowUAMem16, FeatureCMOV]>;		def : Proc<"pentiumpro", [FeatureX87, FeatureSlowUAMem16, FeatureCMOV]>;
def : Proc<"pentium2", [FeatureSlowUAMem16, FeatureMMX, FeatureCMOV,		def : Proc<"pentium2", [FeatureX87, FeatureSlowUAMem16, FeatureMMX,
FeatureFXSR]>;		FeatureCMOV, FeatureFXSR]>;
def : Proc<"pentium3", [FeatureSlowUAMem16, FeatureMMX, FeatureSSE1,		def : Proc<"pentium3", [FeatureX87, FeatureSlowUAMem16, FeatureMMX,
FeatureFXSR]>;		FeatureSSE1, FeatureFXSR]>;
def : Proc<"pentium3m", [FeatureSlowUAMem16, FeatureMMX, FeatureSSE1,		def : Proc<"pentium3m", [FeatureX87, FeatureSlowUAMem16, FeatureMMX,
FeatureFXSR, FeatureSlowBTMem]>;		FeatureSSE1, FeatureFXSR, FeatureSlowBTMem]>;
def : Proc<"pentium-m", [FeatureSlowUAMem16, FeatureMMX, FeatureSSE2,		def : Proc<"pentium-m", [FeatureX87, FeatureSlowUAMem16, FeatureMMX,
FeatureFXSR, FeatureSlowBTMem]>;		FeatureSSE2, FeatureFXSR, FeatureSlowBTMem]>;
def : Proc<"pentium4", [FeatureSlowUAMem16, FeatureMMX, FeatureSSE2,		def : Proc<"pentium4", [FeatureX87, FeatureSlowUAMem16, FeatureMMX,
FeatureFXSR]>;		FeatureSSE2, FeatureFXSR]>;
def : Proc<"pentium4m", [FeatureSlowUAMem16, FeatureMMX, FeatureSSE2,		def : Proc<"pentium4m", [FeatureX87, FeatureSlowUAMem16, FeatureMMX,
FeatureFXSR, FeatureSlowBTMem]>;		FeatureSSE2, FeatureFXSR, FeatureSlowBTMem]>;

// Intel Core Duo.		// Intel Core Duo.
def : ProcessorModel<"yonah", SandyBridgeModel,		def : ProcessorModel<"yonah", SandyBridgeModel,
[FeatureSlowUAMem16, FeatureMMX, FeatureSSE3, FeatureFXSR,		[FeatureX87, FeatureSlowUAMem16, FeatureMMX, FeatureSSE3,
FeatureSlowBTMem]>;		FeatureFXSR, FeatureSlowBTMem]>;

// NetBurst.		// NetBurst.
def : Proc<"prescott",		def : Proc<"prescott",
[FeatureSlowUAMem16, FeatureMMX, FeatureSSE3, FeatureFXSR,		[FeatureX87, FeatureSlowUAMem16, FeatureMMX, FeatureSSE3,
FeatureSlowBTMem]>;		FeatureFXSR, FeatureSlowBTMem]>;
def : Proc<"nocona", [		def : Proc<"nocona", [
		FeatureX87,
FeatureSlowUAMem16,		FeatureSlowUAMem16,
FeatureMMX,		FeatureMMX,
FeatureSSE3,		FeatureSSE3,
FeatureFXSR,		FeatureFXSR,
FeatureCMPXCHG16B,		FeatureCMPXCHG16B,
FeatureSlowBTMem		FeatureSlowBTMem
]>;		]>;

// Intel Core 2 Solo/Duo.		// Intel Core 2 Solo/Duo.
def : ProcessorModel<"core2", SandyBridgeModel, [		def : ProcessorModel<"core2", SandyBridgeModel, [
		FeatureX87,
FeatureSlowUAMem16,		FeatureSlowUAMem16,
FeatureMMX,		FeatureMMX,
FeatureSSSE3,		FeatureSSSE3,
FeatureFXSR,		FeatureFXSR,
FeatureCMPXCHG16B,		FeatureCMPXCHG16B,
FeatureSlowBTMem,		FeatureSlowBTMem,
FeatureLAHFSAHF		FeatureLAHFSAHF
]>;		]>;
def : ProcessorModel<"penryn", SandyBridgeModel, [		def : ProcessorModel<"penryn", SandyBridgeModel, [
		FeatureX87,
FeatureSlowUAMem16,		FeatureSlowUAMem16,
FeatureMMX,		FeatureMMX,
FeatureSSE41,		FeatureSSE41,
FeatureFXSR,		FeatureFXSR,
FeatureCMPXCHG16B,		FeatureCMPXCHG16B,
FeatureSlowBTMem,		FeatureSlowBTMem,
FeatureLAHFSAHF		FeatureLAHFSAHF
]>;		]>;

// Atom CPUs.		// Atom CPUs.
class BonnellProc<string Name> : ProcessorModel<Name, AtomModel, [		class BonnellProc<string Name> : ProcessorModel<Name, AtomModel, [
ProcIntelAtom,		ProcIntelAtom,
		FeatureX87,
FeatureSlowUAMem16,		FeatureSlowUAMem16,
FeatureMMX,		FeatureMMX,
FeatureSSSE3,		FeatureSSSE3,
FeatureFXSR,		FeatureFXSR,
FeatureCMPXCHG16B,		FeatureCMPXCHG16B,
FeatureMOVBE,		FeatureMOVBE,
FeatureSlowBTMem,		FeatureSlowBTMem,
FeatureLEAForSP,		FeatureLEAForSP,
FeatureSlowDivide32,		FeatureSlowDivide32,
FeatureSlowDivide64,		FeatureSlowDivide64,
FeatureCallRegIndirect,		FeatureCallRegIndirect,
FeatureLEAUsesAG,		FeatureLEAUsesAG,
FeaturePadShortFunctions,		FeaturePadShortFunctions,
FeatureLAHFSAHF		FeatureLAHFSAHF
]>;		]>;
def : BonnellProc<"bonnell">;		def : BonnellProc<"bonnell">;
def : BonnellProc<"atom">; // Pin the generic name to the baseline.		def : BonnellProc<"atom">; // Pin the generic name to the baseline.

class SilvermontProc<string Name> : ProcessorModel<Name, SLMModel, [		class SilvermontProc<string Name> : ProcessorModel<Name, SLMModel, [
ProcIntelSLM,		ProcIntelSLM,
		FeatureX87,
FeatureMMX,		FeatureMMX,
FeatureSSE42,		FeatureSSE42,
FeatureFXSR,		FeatureFXSR,
FeatureCMPXCHG16B,		FeatureCMPXCHG16B,
FeatureMOVBE,		FeatureMOVBE,
FeaturePOPCNT,		FeaturePOPCNT,
FeaturePCLMUL,		FeaturePCLMUL,
FeatureAES,		FeatureAES,
FeatureSlowDivide64,		FeatureSlowDivide64,
FeatureCallRegIndirect,		FeatureCallRegIndirect,
FeaturePRFCHW,		FeaturePRFCHW,
FeatureSlowLEA,		FeatureSlowLEA,
FeatureSlowIncDec,		FeatureSlowIncDec,
FeatureSlowBTMem,		FeatureSlowBTMem,
FeatureLAHFSAHF		FeatureLAHFSAHF
]>;		]>;
def : SilvermontProc<"silvermont">;		def : SilvermontProc<"silvermont">;
def : SilvermontProc<"slm">; // Legacy alias.		def : SilvermontProc<"slm">; // Legacy alias.

// "Arrandale" along with corei3 and corei5		// "Arrandale" along with corei3 and corei5
class NehalemProc<string Name> : ProcessorModel<Name, SandyBridgeModel, [		class NehalemProc<string Name> : ProcessorModel<Name, SandyBridgeModel, [
		FeatureX87,
FeatureMMX,		FeatureMMX,
FeatureSSE42,		FeatureSSE42,
FeatureFXSR,		FeatureFXSR,
FeatureCMPXCHG16B,		FeatureCMPXCHG16B,
FeatureSlowBTMem,		FeatureSlowBTMem,
FeaturePOPCNT,		FeaturePOPCNT,
FeatureLAHFSAHF		FeatureLAHFSAHF
]>;		]>;
def : NehalemProc<"nehalem">;		def : NehalemProc<"nehalem">;
def : NehalemProc<"corei7">;		def : NehalemProc<"corei7">;

// Westmere is a similar machine to nehalem with some additional features.		// Westmere is a similar machine to nehalem with some additional features.
// Westmere is the corei3/i5/i7 path from nehalem to sandybridge		// Westmere is the corei3/i5/i7 path from nehalem to sandybridge
class WestmereProc<string Name> : ProcessorModel<Name, SandyBridgeModel, [		class WestmereProc<string Name> : ProcessorModel<Name, SandyBridgeModel, [
		FeatureX87,
FeatureMMX,		FeatureMMX,
FeatureSSE42,		FeatureSSE42,
FeatureFXSR,		FeatureFXSR,
FeatureCMPXCHG16B,		FeatureCMPXCHG16B,
FeatureSlowBTMem,		FeatureSlowBTMem,
FeaturePOPCNT,		FeaturePOPCNT,
FeatureAES,		FeatureAES,
FeaturePCLMUL,		FeaturePCLMUL,
[9 unchanged lines not shown]
class ProcModel<string Name, SchedMachineModel Model,		class ProcModel<string Name, SchedMachineModel Model,
list<SubtargetFeature> ProcFeatures,		list<SubtargetFeature> ProcFeatures,
list<SubtargetFeature> OtherFeatures> :		list<SubtargetFeature> OtherFeatures> :
ProcessorModel<Name, Model, !listconcat(ProcFeatures, OtherFeatures)>;		ProcessorModel<Name, Model, !listconcat(ProcFeatures, OtherFeatures)>;

// SSE is not listed here since llvm treats AVX as a reimplementation of SSE,		// SSE is not listed here since llvm treats AVX as a reimplementation of SSE,
// rather than a superset.		// rather than a superset.
def SNBFeatures : ProcessorFeatures<[], [		def SNBFeatures : ProcessorFeatures<[], [
		FeatureX87,
FeatureMMX,		FeatureMMX,
FeatureAVX,		FeatureAVX,
FeatureFXSR,		FeatureFXSR,
FeatureCMPXCHG16B,		FeatureCMPXCHG16B,
FeaturePOPCNT,		FeaturePOPCNT,
FeatureAES,		FeatureAES,
FeaturePCLMUL,		FeaturePCLMUL,
FeatureXSAVE,		FeatureXSAVE,
[106 unchanged lines not shown]
]>;		]>;

class CannonlakeProc<string Name> : ProcModel<Name, HaswellModel,		class CannonlakeProc<string Name> : ProcModel<Name, HaswellModel,
CNLFeatures.Value, []>;		CNLFeatures.Value, []>;
def : CannonlakeProc<"cannonlake">;		def : CannonlakeProc<"cannonlake">;

// AMD CPUs.		// AMD CPUs.

def : Proc<"k6", [FeatureSlowUAMem16, FeatureMMX]>;		def : Proc<"k6", [FeatureX87, FeatureSlowUAMem16, FeatureMMX]>;
def : Proc<"k6-2", [FeatureSlowUAMem16, Feature3DNow]>;		def : Proc<"k6-2", [FeatureX87, FeatureSlowUAMem16, Feature3DNow]>;
def : Proc<"k6-3", [FeatureSlowUAMem16, Feature3DNow]>;		def : Proc<"k6-3", [FeatureX87, FeatureSlowUAMem16, Feature3DNow]>;
def : Proc<"athlon", [FeatureSlowUAMem16, Feature3DNowA,		def : Proc<"athlon", [FeatureX87, FeatureSlowUAMem16, Feature3DNowA,
FeatureSlowBTMem, FeatureSlowSHLD]>;		FeatureSlowBTMem, FeatureSlowSHLD]>;
def : Proc<"athlon-tbird", [FeatureSlowUAMem16, Feature3DNowA,		def : Proc<"athlon-tbird", [FeatureX87, FeatureSlowUAMem16, Feature3DNowA,
FeatureSlowBTMem, FeatureSlowSHLD]>;		FeatureSlowBTMem, FeatureSlowSHLD]>;
def : Proc<"athlon-4", [FeatureSlowUAMem16, FeatureSSE1, Feature3DNowA,		def : Proc<"athlon-4", [FeatureX87, FeatureSlowUAMem16, FeatureSSE1,
FeatureFXSR, FeatureSlowBTMem, FeatureSlowSHLD]>;		Feature3DNowA, FeatureFXSR, FeatureSlowBTMem,
def : Proc<"athlon-xp", [FeatureSlowUAMem16, FeatureSSE1, Feature3DNowA,
FeatureFXSR, FeatureSlowBTMem, FeatureSlowSHLD]>;
def : Proc<"athlon-mp", [FeatureSlowUAMem16, FeatureSSE1, Feature3DNowA,
FeatureFXSR, FeatureSlowBTMem, FeatureSlowSHLD]>;
def : Proc<"k8", [FeatureSlowUAMem16, FeatureSSE2, Feature3DNowA,
FeatureFXSR, Feature64Bit, FeatureSlowBTMem,
FeatureSlowSHLD]>;
def : Proc<"opteron", [FeatureSlowUAMem16, FeatureSSE2, Feature3DNowA,
FeatureFXSR, Feature64Bit, FeatureSlowBTMem,
FeatureSlowSHLD]>;
def : Proc<"athlon64", [FeatureSlowUAMem16, FeatureSSE2, Feature3DNowA,
FeatureFXSR, Feature64Bit, FeatureSlowBTMem,
FeatureSlowSHLD]>;
def : Proc<"athlon-fx", [FeatureSlowUAMem16, FeatureSSE2, Feature3DNowA,
FeatureFXSR, Feature64Bit, FeatureSlowBTMem,
FeatureSlowSHLD]>;		FeatureSlowSHLD]>;
def : Proc<"k8-sse3", [FeatureSlowUAMem16, FeatureSSE3, Feature3DNowA,		def : Proc<"athlon-xp", [FeatureX87, FeatureSlowUAMem16, FeatureSSE1,
FeatureFXSR, FeatureCMPXCHG16B, FeatureSlowBTMem,		Feature3DNowA, FeatureFXSR, FeatureSlowBTMem,
FeatureSlowSHLD]>;		FeatureSlowSHLD]>;
def : Proc<"opteron-sse3", [FeatureSlowUAMem16, FeatureSSE3, Feature3DNowA,		def : Proc<"athlon-mp", [FeatureX87, FeatureSlowUAMem16, FeatureSSE1,
FeatureFXSR, FeatureCMPXCHG16B, FeatureSlowBTMem,		Feature3DNowA, FeatureFXSR, FeatureSlowBTMem,
FeatureSlowSHLD]>;		FeatureSlowSHLD]>;
def : Proc<"athlon64-sse3", [FeatureSlowUAMem16, FeatureSSE3, Feature3DNowA,		def : Proc<"k8", [FeatureX87, FeatureSlowUAMem16, FeatureSSE2,
FeatureFXSR, FeatureCMPXCHG16B, FeatureSlowBTMem,		Feature3DNowA, FeatureFXSR, Feature64Bit,
FeatureSlowSHLD]>;		FeatureSlowBTMem, FeatureSlowSHLD]>;
def : Proc<"amdfam10", [FeatureSSE4A, Feature3DNowA, FeatureFXSR,		def : Proc<"opteron", [FeatureX87, FeatureSlowUAMem16, FeatureSSE2,
FeatureCMPXCHG16B, FeatureLZCNT, FeaturePOPCNT,		Feature3DNowA, FeatureFXSR, Feature64Bit,
FeatureSlowBTMem, FeatureSlowSHLD, FeatureLAHFSAHF]>;		FeatureSlowBTMem, FeatureSlowSHLD]>;
def : Proc<"barcelona", [FeatureSSE4A, Feature3DNowA, FeatureFXSR,		def : Proc<"athlon64", [FeatureX87, FeatureSlowUAMem16, FeatureSSE2,
FeatureCMPXCHG16B, FeatureLZCNT, FeaturePOPCNT,		Feature3DNowA, FeatureFXSR, Feature64Bit,
FeatureSlowBTMem, FeatureSlowSHLD, FeatureLAHFSAHF]>;		FeatureSlowBTMem, FeatureSlowSHLD]>;
		def : Proc<"athlon-fx", [FeatureX87, FeatureSlowUAMem16, FeatureSSE2,
		Feature3DNowA, FeatureFXSR, Feature64Bit,
		FeatureSlowBTMem, FeatureSlowSHLD]>;
		def : Proc<"k8-sse3", [FeatureX87, FeatureSlowUAMem16, FeatureSSE3,
		Feature3DNowA, FeatureFXSR, FeatureCMPXCHG16B,
		FeatureSlowBTMem, FeatureSlowSHLD]>;
		def : Proc<"opteron-sse3", [FeatureX87, FeatureSlowUAMem16, FeatureSSE3,
		Feature3DNowA, FeatureFXSR, FeatureCMPXCHG16B,
		FeatureSlowBTMem, FeatureSlowSHLD]>;
		def : Proc<"athlon64-sse3", [FeatureX87, FeatureSlowUAMem16, FeatureSSE3,
		Feature3DNowA, FeatureFXSR, FeatureCMPXCHG16B,
		FeatureSlowBTMem, FeatureSlowSHLD]>;
		def : Proc<"amdfam10", [FeatureX87, FeatureSSE4A, Feature3DNowA,
		FeatureFXSR, FeatureCMPXCHG16B, FeatureLZCNT,
		FeaturePOPCNT, FeatureSlowBTMem, FeatureSlowSHLD,
		FeatureLAHFSAHF]>;
		def : Proc<"barcelona", [FeatureX87, FeatureSSE4A, Feature3DNowA,
		FeatureFXSR, FeatureCMPXCHG16B, FeatureLZCNT,
		FeaturePOPCNT, FeatureSlowBTMem, FeatureSlowSHLD,
		FeatureLAHFSAHF]>;

// Bobcat		// Bobcat
def : Proc<"btver1", [		def : Proc<"btver1", [
		FeatureX87,
FeatureMMX,		FeatureMMX,
FeatureSSSE3,		FeatureSSSE3,
FeatureSSE4A,		FeatureSSE4A,
FeatureFXSR,		FeatureFXSR,
FeatureCMPXCHG16B,		FeatureCMPXCHG16B,
FeaturePRFCHW,		FeaturePRFCHW,
FeatureLZCNT,		FeatureLZCNT,
FeaturePOPCNT,		FeaturePOPCNT,
FeatureSlowSHLD,		FeatureSlowSHLD,
FeatureLAHFSAHF		FeatureLAHFSAHF
]>;		]>;

// Jaguar		// Jaguar
def : ProcessorModel<"btver2", BtVer2Model, [		def : ProcessorModel<"btver2", BtVer2Model, [
		FeatureX87,
FeatureMMX,		FeatureMMX,
FeatureAVX,		FeatureAVX,
FeatureFXSR,		FeatureFXSR,
FeatureSSE4A,		FeatureSSE4A,
FeatureCMPXCHG16B,		FeatureCMPXCHG16B,
FeaturePRFCHW,		FeaturePRFCHW,
FeatureAES,		FeatureAES,
FeaturePCLMUL,		FeaturePCLMUL,
FeatureBMI,		FeatureBMI,
FeatureF16C,		FeatureF16C,
FeatureMOVBE,		FeatureMOVBE,
FeatureLZCNT,		FeatureLZCNT,
FeaturePOPCNT,		FeaturePOPCNT,
FeatureXSAVE,		FeatureXSAVE,
FeatureXSAVEOPT,		FeatureXSAVEOPT,
FeatureSlowSHLD,		FeatureSlowSHLD,
FeatureLAHFSAHF,		FeatureLAHFSAHF,
FeatureFastPartialYMMWrite		FeatureFastPartialYMMWrite
]>;		]>;

// Bulldozer		// Bulldozer
def : Proc<"bdver1", [		def : Proc<"bdver1", [
		FeatureX87,
FeatureXOP,		FeatureXOP,
FeatureFMA4,		FeatureFMA4,
FeatureCMPXCHG16B,		FeatureCMPXCHG16B,
FeatureAES,		FeatureAES,
FeaturePRFCHW,		FeaturePRFCHW,
FeaturePCLMUL,		FeaturePCLMUL,
FeatureMMX,		FeatureMMX,
FeatureAVX,		FeatureAVX,
FeatureFXSR,		FeatureFXSR,
FeatureSSE4A,		FeatureSSE4A,
FeatureLZCNT,		FeatureLZCNT,
FeaturePOPCNT,		FeaturePOPCNT,
FeatureXSAVE,		FeatureXSAVE,
FeatureSlowSHLD,		FeatureSlowSHLD,
FeatureLAHFSAHF		FeatureLAHFSAHF
]>;		]>;
// Piledriver		// Piledriver
def : Proc<"bdver2", [		def : Proc<"bdver2", [
		FeatureX87,
FeatureXOP,		FeatureXOP,
FeatureFMA4,		FeatureFMA4,
FeatureCMPXCHG16B,		FeatureCMPXCHG16B,
FeatureAES,		FeatureAES,
FeaturePRFCHW,		FeaturePRFCHW,
FeaturePCLMUL,		FeaturePCLMUL,
FeatureMMX,		FeatureMMX,
FeatureAVX,		FeatureAVX,
FeatureFXSR,		FeatureFXSR,
FeatureSSE4A,		FeatureSSE4A,
FeatureF16C,		FeatureF16C,
FeatureLZCNT,		FeatureLZCNT,
FeaturePOPCNT,		FeaturePOPCNT,
FeatureXSAVE,		FeatureXSAVE,
FeatureBMI,		FeatureBMI,
FeatureTBM,		FeatureTBM,
FeatureFMA,		FeatureFMA,
FeatureSlowSHLD,		FeatureSlowSHLD,
FeatureLAHFSAHF		FeatureLAHFSAHF
]>;		]>;

// Steamroller		// Steamroller
def : Proc<"bdver3", [		def : Proc<"bdver3", [
		FeatureX87,
FeatureXOP,		FeatureXOP,
FeatureFMA4,		FeatureFMA4,
FeatureCMPXCHG16B,		FeatureCMPXCHG16B,
FeatureAES,		FeatureAES,
FeaturePRFCHW,		FeaturePRFCHW,
FeaturePCLMUL,		FeaturePCLMUL,
FeatureMMX,		FeatureMMX,
FeatureAVX,		FeatureAVX,
[9 unchanged lines not shown]
FeatureXSAVEOPT,		FeatureXSAVEOPT,
FeatureSlowSHLD,		FeatureSlowSHLD,
FeatureFSGSBase,		FeatureFSGSBase,
FeatureLAHFSAHF		FeatureLAHFSAHF
]>;		]>;

// Excavator		// Excavator
def : Proc<"bdver4", [		def : Proc<"bdver4", [
		FeatureX87,
FeatureMMX,		FeatureMMX,
FeatureAVX2,		FeatureAVX2,
FeatureFXSR,		FeatureFXSR,
FeatureXOP,		FeatureXOP,
FeatureFMA4,		FeatureFMA4,
FeatureCMPXCHG16B,		FeatureCMPXCHG16B,
FeatureAES,		FeatureAES,
FeaturePRFCHW,		FeaturePRFCHW,
FeaturePCLMUL,		FeaturePCLMUL,
FeatureF16C,		FeatureF16C,
FeatureLZCNT,		FeatureLZCNT,
FeaturePOPCNT,		FeaturePOPCNT,
FeatureXSAVE,		FeatureXSAVE,
FeatureBMI,		FeatureBMI,
FeatureBMI2,		FeatureBMI2,
FeatureTBM,		FeatureTBM,
FeatureFMA,		FeatureFMA,
FeatureXSAVEOPT,		FeatureXSAVEOPT,
FeatureFSGSBase,		FeatureFSGSBase,
FeatureLAHFSAHF		FeatureLAHFSAHF
]>;		]>;

def : Proc<"geode", [FeatureSlowUAMem16, Feature3DNowA]>;		def : Proc<"geode", [FeatureX87, FeatureSlowUAMem16, Feature3DNowA]>;

def : Proc<"winchip-c6", [FeatureSlowUAMem16, FeatureMMX]>;		def : Proc<"winchip-c6", [FeatureX87, FeatureSlowUAMem16, FeatureMMX]>;
def : Proc<"winchip2", [FeatureSlowUAMem16, Feature3DNow]>;		def : Proc<"winchip2", [FeatureX87, FeatureSlowUAMem16, Feature3DNow]>;
def : Proc<"c3", [FeatureSlowUAMem16, Feature3DNow]>;		def : Proc<"c3", [FeatureX87, FeatureSlowUAMem16, Feature3DNow]>;
def : Proc<"c3-2", [FeatureSlowUAMem16, FeatureMMX, FeatureSSE1, FeatureFXSR]>;		def : Proc<"c3-2", [FeatureX87, FeatureSlowUAMem16, FeatureMMX,
		FeatureSSE1, FeatureFXSR]>;

// We also provide a generic 64-bit specific x86 processor model which tries to		// We also provide a generic 64-bit specific x86 processor model which tries to
// be good for modern chips without enabling instruction set encodings past the		// be good for modern chips without enabling instruction set encodings past the
// basic SSE2 and 64-bit ones. It disables slow things from any mainstream and		// basic SSE2 and 64-bit ones. It disables slow things from any mainstream and
// modern 64-bit x86 chip, and enables features that are generally beneficial.		// modern 64-bit x86 chip, and enables features that are generally beneficial.
//		//
// We currently use the Sandy Bridge model as the default scheduling model as		// We currently use the Sandy Bridge model as the default scheduling model as
// we use it across Nehalem, Westmere, Sandy Bridge, and Ivy Bridge which		// we use it across Nehalem, Westmere, Sandy Bridge, and Ivy Bridge which
// covers a huge swath of x86 processors. If there are specific scheduling		// covers a huge swath of x86 processors. If there are specific scheduling
// knobs which need to be tuned differently for AMD chips, we might consider		// knobs which need to be tuned differently for AMD chips, we might consider
// forming a common base for them.		// forming a common base for them.
def : ProcessorModel<"x86-64", SandyBridgeModel,		def : ProcessorModel<"x86-64", SandyBridgeModel,
[FeatureMMX, FeatureSSE2, FeatureFXSR, Feature64Bit,		[FeatureX87, FeatureMMX, FeatureSSE2, FeatureFXSR,
FeatureSlowBTMem ]>;		Feature64Bit, FeatureSlowBTMem ]>;

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Register File Description		// Register File Description
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

include "X86RegisterInfo.td"		include "X86RegisterInfo.td"

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
[65 lines not shown]

llvm/trunk/lib/Target/X86/X86ISelLowering.cpp


[66 lines not shown; hunk context: static cl::opt<bool> ExperimentalVectorWideningLegalization(]
"x86-experimental-vector-widening-legalization", cl::init(false),		"x86-experimental-vector-widening-legalization", cl::init(false),
cl::desc("Enable an experimental vector type legalization through widening "		cl::desc("Enable an experimental vector type legalization through widening "
"rather than promotion."),		"rather than promotion."),
cl::Hidden);		cl::Hidden);

X86TargetLowering::X86TargetLowering(const X86TargetMachine &TM,		X86TargetLowering::X86TargetLowering(const X86TargetMachine &TM,
const X86Subtarget &STI)		const X86Subtarget &STI)
: TargetLowering(TM), Subtarget(STI) {		: TargetLowering(TM), Subtarget(STI) {
		bool UseX87 = !Subtarget.useSoftFloat() && Subtarget.hasX87();
X86ScalarSSEf64 = Subtarget.hasSSE2();		X86ScalarSSEf64 = Subtarget.hasSSE2();
X86ScalarSSEf32 = Subtarget.hasSSE1();		X86ScalarSSEf32 = Subtarget.hasSSE1();
MVT PtrVT = MVT::getIntegerVT(8 * TM.getPointerSize());		MVT PtrVT = MVT::getIntegerVT(8 * TM.getPointerSize());

// Set up the TargetLowering object.		// Set up the TargetLowering object.

// X86 is weird. It always uses i8 for shift amounts and setcc results.		// X86 is weird. It always uses i8 for shift amounts and setcc results.
setBooleanContents(ZeroOrOneBooleanContent);		setBooleanContents(ZeroOrOneBooleanContent);
[473 lines not shown; hunk context: if (!Subtarget.useSoftFloat() && X86ScalarSSEf64) {]
setOperationAction(ISD::FSIN , MVT::f32, Expand);		setOperationAction(ISD::FSIN , MVT::f32, Expand);
setOperationAction(ISD::FCOS , MVT::f32, Expand);		setOperationAction(ISD::FCOS , MVT::f32, Expand);
setOperationAction(ISD::FSINCOS, MVT::f32, Expand);		setOperationAction(ISD::FSINCOS, MVT::f32, Expand);

// Expand FP immediates into loads from the stack, except for the special		// Expand FP immediates into loads from the stack, except for the special
// cases we handle.		// cases we handle.
addLegalFPImmediate(APFloat(+0.0)); // xorpd		addLegalFPImmediate(APFloat(+0.0)); // xorpd
addLegalFPImmediate(APFloat(+0.0f)); // xorps		addLegalFPImmediate(APFloat(+0.0f)); // xorps
} else if (!Subtarget.useSoftFloat() && X86ScalarSSEf32) {		} else if (UseX87 && X86ScalarSSEf32) {
// Use SSE for f32, x87 for f64.		// Use SSE for f32, x87 for f64.
// Set up the FP register classes.		// Set up the FP register classes.
addRegisterClass(MVT::f32, &X86::FR32RegClass);		addRegisterClass(MVT::f32, &X86::FR32RegClass);
addRegisterClass(MVT::f64, &X86::RFP64RegClass);		addRegisterClass(MVT::f64, &X86::RFP64RegClass);

// Use ANDPS to simulate FABS.		// Use ANDPS to simulate FABS.
setOperationAction(ISD::FABS , MVT::f32, Custom);		setOperationAction(ISD::FABS , MVT::f32, Custom);

[18 lines not shown; hunk context: if (!Subtarget.useSoftFloat() && X86ScalarSSEf64) {]
addLegalFPImmediate(APFloat(-0.0)); // FLD0/FCHS		addLegalFPImmediate(APFloat(-0.0)); // FLD0/FCHS
addLegalFPImmediate(APFloat(-1.0)); // FLD1/FCHS		addLegalFPImmediate(APFloat(-1.0)); // FLD1/FCHS

if (!TM.Options.UnsafeFPMath) {		if (!TM.Options.UnsafeFPMath) {
setOperationAction(ISD::FSIN , MVT::f64, Expand);		setOperationAction(ISD::FSIN , MVT::f64, Expand);
setOperationAction(ISD::FCOS , MVT::f64, Expand);		setOperationAction(ISD::FCOS , MVT::f64, Expand);
setOperationAction(ISD::FSINCOS, MVT::f64, Expand);		setOperationAction(ISD::FSINCOS, MVT::f64, Expand);
}		}
} else if (!Subtarget.useSoftFloat()) {		} else if (UseX87) {
// f32 and f64 in x87.		// f32 and f64 in x87.
// Set up the FP register classes.		// Set up the FP register classes.
addRegisterClass(MVT::f64, &X86::RFP64RegClass);		addRegisterClass(MVT::f64, &X86::RFP64RegClass);
addRegisterClass(MVT::f32, &X86::RFP32RegClass);		addRegisterClass(MVT::f32, &X86::RFP32RegClass);

setOperationAction(ISD::UNDEF, MVT::f64, Expand);		setOperationAction(ISD::UNDEF, MVT::f64, Expand);
setOperationAction(ISD::UNDEF, MVT::f32, Expand);		setOperationAction(ISD::UNDEF, MVT::f32, Expand);
setOperationAction(ISD::FCOPYSIGN, MVT::f64, Expand);		setOperationAction(ISD::FCOPYSIGN, MVT::f64, Expand);
[17 lines not shown; hunk context: if (!Subtarget.useSoftFloat() && X86ScalarSSEf64) {]
addLegalFPImmediate(APFloat(-1.0f)); // FLD1/FCHS		addLegalFPImmediate(APFloat(-1.0f)); // FLD1/FCHS
}		}

// We don't support FMA.		// We don't support FMA.
setOperationAction(ISD::FMA, MVT::f64, Expand);		setOperationAction(ISD::FMA, MVT::f64, Expand);
setOperationAction(ISD::FMA, MVT::f32, Expand);		setOperationAction(ISD::FMA, MVT::f32, Expand);

// Long double always uses X87, except f128 in MMX.		// Long double always uses X87, except f128 in MMX.
if (!Subtarget.useSoftFloat()) {		if (UseX87) {
if (Subtarget.is64Bit() && Subtarget.hasMMX()) {		if (Subtarget.is64Bit() && Subtarget.hasMMX()) {
addRegisterClass(MVT::f128, &X86::FR128RegClass);		addRegisterClass(MVT::f128, &X86::FR128RegClass);
ValueTypeActions.setTypeAction(MVT::f128, TypeSoftenFloat);		ValueTypeActions.setTypeAction(MVT::f128, TypeSoftenFloat);
setOperationAction(ISD::FABS , MVT::f128, Custom);		setOperationAction(ISD::FABS , MVT::f128, Custom);
setOperationAction(ISD::FNEG , MVT::f128, Custom);		setOperationAction(ISD::FNEG , MVT::f128, Custom);
setOperationAction(ISD::FCOPYSIGN, MVT::f128, Custom);		setOperationAction(ISD::FCOPYSIGN, MVT::f128, Custom);
}		}

[1,786 lines not shown; hunk context: if ((CopyVT == MVT::f32 || CopyVT == MVT::f64 || CopyVT == MVT::f128) &&]
report_fatal_error("SSE register return with SSE disabled");		report_fatal_error("SSE register return with SSE disabled");
}		}

// If we prefer to use the value in xmm registers, copy it out as f80 and		// If we prefer to use the value in xmm registers, copy it out as f80 and
// use a truncate to move it from fp stack reg to xmm reg.		// use a truncate to move it from fp stack reg to xmm reg.
bool RoundAfterCopy = false;		bool RoundAfterCopy = false;
if ((VA.getLocReg() == X86::FP0 || VA.getLocReg() == X86::FP1) &&		if ((VA.getLocReg() == X86::FP0 || VA.getLocReg() == X86::FP1) &&
isScalarFPTypeInSSEReg(VA.getValVT())) {		isScalarFPTypeInSSEReg(VA.getValVT())) {
		if (!Subtarget.hasX87())
		report_fatal_error("X87 register return with X87 disabled");
CopyVT = MVT::f80;		CopyVT = MVT::f80;
RoundAfterCopy = (CopyVT != VA.getLocVT());		RoundAfterCopy = (CopyVT != VA.getLocVT());
}		}

Chain = DAG.getCopyFromReg(Chain, dl, VA.getLocReg(),		Chain = DAG.getCopyFromReg(Chain, dl, VA.getLocReg(),
CopyVT, InFlag).getValue(1);		CopyVT, InFlag).getValue(1);
SDValue Val = Chain.getValue(0);		SDValue Val = Chain.getValue(0);

[27,765 lines not shown]

llvm/trunk/lib/Target/X86/X86Subtarget.h

[64 lines not shown; hunk context: protected:]
PICStyles::Style PICStyle;		PICStyles::Style PICStyle;

/// SSE1, SSE2, SSE3, SSSE3, SSE41, SSE42, or none supported.		/// SSE1, SSE2, SSE3, SSSE3, SSE41, SSE42, or none supported.
X86SSEEnum X86SSELevel;		X86SSEEnum X86SSELevel;

/// MMX, 3DNow, 3DNow Athlon, or none supported.		/// MMX, 3DNow, 3DNow Athlon, or none supported.
X863DNowEnum X863DNowLevel;		X863DNowEnum X863DNowLevel;

		/// True if the processor supports X87 instructions.
		bool HasX87;

/// True if this processor has conditional move instructions		/// True if this processor has conditional move instructions
/// (generally pentium pro+).		/// (generally pentium pro+).
bool HasCMov;		bool HasCMov;

/// True if the processor supports X86-64 instructions.		/// True if the processor supports X86-64 instructions.
bool HasX86_64;		bool HasX86_64;

/// True if the processor supports POPCNT.		/// True if the processor supports POPCNT.
[284 lines not shown; hunk context: public:]
bool isTarget64BitLP64() const {		bool isTarget64BitLP64() const {
return In64BitMode && (TargetTriple.getEnvironment() != Triple::GNUX32 &&		return In64BitMode && (TargetTriple.getEnvironment() != Triple::GNUX32 &&
!TargetTriple.isOSNaCl());		!TargetTriple.isOSNaCl());
}		}

PICStyles::Style getPICStyle() const { return PICStyle; }		PICStyles::Style getPICStyle() const { return PICStyle; }
void setPICStyle(PICStyles::Style Style) { PICStyle = Style; }		void setPICStyle(PICStyles::Style Style) { PICStyle = Style; }

		bool hasX87() const { return HasX87; }
bool hasCMov() const { return HasCMov; }		bool hasCMov() const { return HasCMov; }
bool hasSSE1() const { return X86SSELevel >= SSE1; }		bool hasSSE1() const { return X86SSELevel >= SSE1; }
bool hasSSE2() const { return X86SSELevel >= SSE2; }		bool hasSSE2() const { return X86SSELevel >= SSE2; }
bool hasSSE3() const { return X86SSELevel >= SSE3; }		bool hasSSE3() const { return X86SSELevel >= SSE3; }
bool hasSSSE3() const { return X86SSELevel >= SSSE3; }		bool hasSSSE3() const { return X86SSELevel >= SSSE3; }
bool hasSSE41() const { return X86SSELevel >= SSE41; }		bool hasSSE41() const { return X86SSELevel >= SSE41; }
bool hasSSE42() const { return X86SSELevel >= SSE42; }		bool hasSSE42() const { return X86SSELevel >= SSE42; }
bool hasAVX() const { return X86SSELevel >= AVX; }		bool hasAVX() const { return X86SSELevel >= AVX; }
[208 lines not shown]

llvm/trunk/lib/Target/X86/X86Subtarget.cpp

[233 lines not shown; hunk context: void X86Subtarget::initSubtargetFeatures(StringRef CPU, StringRef FS) {]
else if (isTargetDarwin() || isTargetLinux() || isTargetSolaris() ||		else if (isTargetDarwin() || isTargetLinux() || isTargetSolaris() ||
In64BitMode)		In64BitMode)
stackAlignment = 16;		stackAlignment = 16;
}		}

void X86Subtarget::initializeEnvironment() {		void X86Subtarget::initializeEnvironment() {
X86SSELevel = NoSSE;		X86SSELevel = NoSSE;
X863DNowLevel = NoThreeDNow;		X863DNowLevel = NoThreeDNow;
		HasX87 = false;
HasCMov = false;		HasCMov = false;
HasX86_64 = false;		HasX86_64 = false;
HasPOPCNT = false;		HasPOPCNT = false;
HasSSE4A = false;		HasSSE4A = false;
HasAES = false;		HasAES = false;
HasFXSR = false;		HasFXSR = false;
HasXSAVE = false;		HasXSAVE = false;
HasXSAVEOPT = false;		HasXSAVEOPT = false;
[97 lines not shown]

llvm/trunk/test/CodeGen/X86/x87.ll

				; RUN: llc < %s -march=x86 | FileCheck %s -check-prefix=X87
				; RUN: llc < %s -march=x86-64 -mattr=-sse | FileCheck %s -check-prefix=X87
				; RUN: llc < %s -march=x86 -mattr=-x87 | FileCheck %s -check-prefix=NOX87 --implicit-check-not "{{ }}f{{.*}}"
				; RUN: llc < %s -march=x86-64 -mattr=-x87,-sse | FileCheck %s -check-prefix=NOX87 --implicit-check-not "{{ }}f{{.*}}"
				; RUN: llc < %s -march=x86 -mattr=-x87,+sse | FileCheck %s -check-prefix=NOX87 --implicit-check-not "{{ }}f{{.*}}"
				; RUN: llc < %s -march=x86-64 -mattr=-x87,-sse2 | FileCheck %s -check-prefix=NOX87 --implicit-check-not "{{ }}f{{.*}}"

				define void @test(i32 %i, i64 %l, float* %pf, double* %pd, fp128* %pld) nounwind readnone {
				; X87-LABEL: test:
				; NOX87-LABEL: test:
				; X87: fild
				; NOX87: __floatunsisf
				%tmp = uitofp i32 %i to float

				; X87: fild
				; NOX87: __floatdisf
				%tmp1 = sitofp i64 %l to float

				; X87: fadd
				; NOX87: __addsf3
				%tmp2 = fadd float %tmp, %tmp1

				; X87: fstp
				store float %tmp2, float* %pf

				; X87: fild
				; NOX87: __floatunsidf
				%tmp3 = uitofp i32 %i to double

				; X87: fild
				; NOX87: __floatdidf
				%tmp4 = sitofp i64 %l to double

				; X87: fadd
				; NOX87: __adddf3
				%tmp5 = fadd double %tmp3, %tmp4

				; X87: fstp
				store double %tmp5, double* %pd

				; X87: __floatsitf
				; NOX87: __floatsitf
				%tmp6 = sitofp i32 %i to fp128

				; X87: __floatunditf
				; NOX87: __floatunditf
				%tmp7 = uitofp i64 %l to fp128

				; X87: __addtf3
				; NOX87: __addtf3
				%tmp8 = fadd fp128 %tmp6, %tmp7
				store fp128 %tmp8, fp128* %pld

				ret void
				}
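
For readers following the X86ISelLowering.cpp hunks together with the RUN lines in x87.ll, the branch order the patch leaves behind can be summarized with a small standalone model. This is only a sketch, not LLVM code: FakeSubtarget, FPHome, ScalarFPPlan and planScalarFP are hypothetical names invented for illustration, and the SSE2 branch assumes both f32 and f64 are placed in SSE registers, which the excerpt above does not show in full.

```cpp
#include <cassert>

// Hypothetical stand-in for the subtarget queries used in the patch
// (useSoftFloat(), hasX87(), hasSSE1()/hasSSE2()). Not an LLVM type.
struct FakeSubtarget {
  bool SoftFloat;
  bool HasX87;
  int SSELevel; // 0 = no SSE, 1 = SSE1 only, 2 = SSE2 or later
};

// Where a scalar FP type ends up: SSE registers, the x87 stack,
// or no register class at all (operations become library calls).
enum class FPHome { SSE, X87, Libcall };

struct ScalarFPPlan {
  FPHome F32;
  FPHome F64;
};

// Mirrors the if / else-if chain after the patch. The only change the
// patch makes to the chain is that the two x87 branches test UseX87
// instead of just !useSoftFloat().
ScalarFPPlan planScalarFP(const FakeSubtarget &ST) {
  const bool UseX87 = !ST.SoftFloat && ST.HasX87;
  if (!ST.SoftFloat && ST.SSELevel >= 2)
    return {FPHome::SSE, FPHome::SSE}; // assumed: both types in SSE
  if (UseX87 && ST.SSELevel >= 1)
    return {FPHome::SSE, FPHome::X87}; // "Use SSE for f32, x87 for f64."
  if (UseX87)
    return {FPHome::X87, FPHome::X87}; // "f32 and f64 in x87."
  return {FPHome::Libcall, FPHome::Libcall};
}
```

Under this model, a configuration with SSE1 but without x87 (as in the `-mattr=-x87,+sse` RUN line) falls through every branch, which is consistent with that RUN line checking for `__addsf3` under the NOX87 prefix. This is also the case discussed in the inline review comment on X86Subtarget.h.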

Revision Contents

Diff 51397

llvm/trunk/lib/Target/X86/X86.td

llvm/trunk/lib/Target/X86/X86ISelLowering.cpp

llvm/trunk/lib/Target/X86/X86Subtarget.h

llvm/trunk/lib/Target/X86/X86Subtarget.cpp

llvm/trunk/test/CodeGen/X86/x87.ll
