This is an archive of the discontinued LLVM Phabricator instance.

Paths

llvm/include/llvm/Target/Target.td
llvm/lib/Target/X86/X86.td
llvm/lib/Target/X86/X86Subtarget.h
llvm/utils/TableGen/SubtargetEmitter.cpp

Differential D121768

[X86][tablgen] Auto-generate trivial fields and trivial interfaces for target features
Abandoned · Public

Authored by skan on Mar 15 2022, 8:36 PM.

Details

Reviewers

craig.topper
LuoYuanke
pengfei
MaskRay
tmatheson
RKSimon

Summary

A trivial field is always zero-initialized. An interface is trivial if it directly returns
a field of the object or compares a field with a constant value, e.g.

bool hasX87() const { return HasX87; }
bool hasSSE1() const { return X86SSELevel >= SSE1; }

Generating such code automatically saves the effort of writing it by hand.

We start with the X86 target features in this patch and will do the same for other
architectures in follow-up patches.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

skan created this revision. Mar 15 2022, 8:36 PM

Herald added a project: Restricted Project. Mar 15 2022, 8:36 PM

Herald added subscribers: pengfei, hiraditya, dschuff.

skan requested review of this revision. Mar 15 2022, 8:36 PM

Herald added a project: Restricted Project. Mar 15 2022, 8:36 PM

Herald added a subscriber: llvm-commits.

Harbormaster completed remote builds in B154502: Diff 415685. Mar 15 2022, 8:37 PM

skan added reviewers: craig.topper, LuoYuanke, pengfei, MaskRay. Mar 15 2022, 8:39 PM

If the patch contains changes to X86 files, please put X86 in the title. I have email filters that check the subject.

Can you split the NFC renames into their own patch?

craig.topper added inline comments. Mar 15 2022, 9:07 PM

llvm/utils/TableGen/SubtargetEmitter.cpp
1743	You can do if (!Attrs.insert(Attribute).second) continue; instead of calling find and insert.
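
A minimal sketch of the idiom suggested above, assuming `Attrs` is a `std::set<std::string>` (the actual container type in SubtargetEmitter.cpp is not shown in this thread). `uniqueAttributes` is a hypothetical helper used only for illustration. `insert` returns a pair whose `second` member is false when the element was already present, so a single call can replace the find-then-insert sequence.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: returns the names in first-seen order, skipping
// any name that has already been processed.
std::vector<std::string>
uniqueAttributes(const std::vector<std::string> &Names) {
  std::set<std::string> Attrs;
  std::vector<std::string> Result;
  for (const std::string &Attribute : Names) {
    // insert() returns {iterator, bool}; the bool is false for duplicates.
    if (!Attrs.insert(Attribute).second)
      continue;
    Result.push_back(Attribute);
  }
  return Result;
}
```

Calling `uniqueAttributes({"HasX87", "HasCMov", "HasX87"})` yields the two distinct names in their original order.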

pengfei added inline comments. Mar 15 2022, 10:00 PM

llvm/lib/Target/X86/X86.td
21–22	I think `In64BitMode` is more readable; we usually generate 32-bit instructions in 64-bit mode.
33	How about defining a new class like `TrivalSubtargetFeature` so that we don't need to add `, [], 0` to the old one?
llvm/lib/Target/X86/X86InstrInfo.td
882 ↗	(On Diff #415685)	Should this change too? The same applies below.
llvm/lib/Target/X86/X86Subtarget.cpp
56 ↗	(On Diff #415685)	Unrelated change.
llvm/lib/Target/X86/X86Subtarget.h
82	Some comments like this don't appear in X86.td; should we move all these comments there?

skan mentioned this in rG052d37dc7ced: [NFC][X86] Rename some variables and functions about target features. Mar 15 2022, 10:09 PM

In D121768#3384797, @craig.topper wrote:

If the patch contains changes to X86 files, please put X86 in the title. I have email filters that check the subject.

Can you split the NFC renames into their own patch?

Done. Split the NFC change out into https://reviews.llvm.org/rG052d37dc7ced.

Rebase

skan retitled this revision from [tablgen] Auto-generate fields & interfaces for target features to [tablgen][X86] Auto-generate fields & interfaces for target features. Mar 15 2022, 10:40 PM

LiuChen3 added a subscriber: LiuChen3. Mar 15 2022, 11:10 PM

Harbormaster completed remote builds in B154512: Diff 415701. Mar 15 2022, 11:16 PM

Address comments

skan added inline comments. Mar 15 2022, 11:47 PM

llvm/lib/Target/X86/X86.td
21–22	There's no point in having different names for the interface and the member. We already use the interface `is64Bit`.
33	If so, we would have to replace most uses of `SubtargetFeature` with `TrivalSubtargetFeature`. As it stands, we only need to add `, [], 0` for CMOV, so I don't think `TrivalSubtargetFeature` is worth it.

Add class NonTrivalSubtargetFeature

Harbormaster completed remote builds in B154522: Diff 415711. Mar 16 2022, 1:07 AM

Matt added a subscriber: Matt. Mar 16 2022, 7:36 AM

skan retitled this revision from [tablgen][X86] Auto-generate fields & interfaces for target features to [tablgen][X86] Auto-generate fields and trival interace for target features. Mar 16 2022, 10:57 PM

skan edited the summary of this revision.

This looks remarkably similar to an ARM/AArch64 patch I've had up for a couple of weeks: D120906

The main differences seem to be:

I use a more general GET_SUBTARGETINFO_MACRO macro that can be expanded for other uses going forwards, rather than emitting the various bits of C++ directly.
This patch makes a distinction between trivial/nontrivial members, why is this necessary?
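
For readers unfamiliar with the macro approach mentioned above, the following sketch shows the general X-macro technique. It is not the code from D120906: `TOY_SUBTARGET_FEATURES`, `ToySubtarget`, and the three-argument entry shape are assumptions made for illustration, standing in for a TableGen-emitted .inc file.

```cpp
// Hypothetical feature list standing in for a generated .inc file.
// Each entry has the shape X(field, default value, getter name).
#define TOY_SUBTARGET_FEATURES(X) \
  X(HasX87, false, hasX87)        \
  X(HasNOPL, false, hasNOPL)

struct ToySubtarget {
  // First expansion: declare one bool field per feature.
#define DECLARE_FIELD(FIELD, DEFAULT, GETTER) bool FIELD = DEFAULT;
  TOY_SUBTARGET_FEATURES(DECLARE_FIELD)
#undef DECLARE_FIELD

  // Second expansion: declare one trivial getter per feature.
#define DECLARE_GETTER(FIELD, DEFAULT, GETTER) \
  bool GETTER() const { return FIELD; }
  TOY_SUBTARGET_FEATURES(DECLARE_GETTER)
#undef DECLARE_GETTER
};
```

The same list can be expanded a third time for some other purpose (for example, printing feature names) without touching the generator, which is the kind of reuse the comment above appears to describe.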

Add three classes: TrivalFieldSubtargetFeature, TrivalInterfaceSubtargetFeature, TrivalSubtargetFeature for flexibility
Add some comments
Move the description of the feature to the interface if multiple features are related to the field

In D121768#3388689, @tmatheson wrote:

This looks remarkably similar to an ARM/AArch64 patch I've had up for a couple of weeks: D120906

The main differences seem to be:

I use a more general GET_SUBTARGETINFO_MACRO macro that can be expanded for other uses going forwards, rather than emitting the various bits of C++ directly.

This patch makes a distinction between trivial/nontrivial members, why is this necessary?

I'm sorry I didn't notice your patch; otherwise I would have commented on it rather than proposing a new one. Let me check the differences.

In D121768#3388868, @skan wrote:

I'm sorry I didn't notice your patch; otherwise I would have commented on it rather than proposing a new one. Let me check the differences.

There are more differences:

I copy the descriptions of the features to the generated header file, whereas your patch omits them.
I also support enum types, e.g.

bool hasAVX512() const { return X86SSELevel >= AVX512; }

This patch distinguishes trivial from non-trivial members and interfaces because some targets have tricky interfaces, e.g.

bool hasCMov() const { return HasCMov || X86SSELevel >= SSE1 || is64Bit(); }
bool useAA() const override { return UseAA; }
bool hasLAHFSAHF() const { return HasLAHFSAHF64 || !is64Bit(); }

It seems that your patch currently handles them incorrectly.

A feature may set a field to false, while the field's default value may be either true or false. So I provide a way to avoid generating a field that is not zero-initialized. Your patch, however, assumes that a field's default value must be true if a feature sets it to false.

Harbormaster completed remote builds in B154812: Diff 416133. Mar 17 2022, 4:56 AM

skan added a reviewer: tmatheson. Mar 17 2022, 4:58 AM

skan retitled this revision from [tablgen][X86] Auto-generate fields and trival interace for target features to [tablgen][X86] Auto-generate trival fields and interace for target features. Mar 17 2022, 5:07 AM

This patch distinguishes trivial from non-trivial members and interfaces because some targets have tricky interfaces, e.g.
bool hasCMov() const { return HasCMov || X86SSELevel >= SSE1 || is64Bit(); }
bool useAA() const override { return UseAA; }
bool hasLAHFSAHF() const { return HasLAHFSAHF64 || !is64Bit(); }
It seems that your patch currently handles them incorrectly.

It would be better to take this opportunity to fix these inconsistencies between the getter methods and the fields, rather than adding complexity to the tablegen to keep them. If HasCMov is false it is counterintuitive for hasCMov to return true. I handled this by the getter always matching the field name, e.g.

bool enablePostRAScheduler() const override { return usePostRAScheduler(); }
bool useNEONForSinglePrecisionFP() const { return hasNEON() && hasNEONForFP(); }

A feature may set a field to false, while the field's default value may be either true or false. So I provide a way to avoid generating a field that is not zero-initialized. Your patch, however, assumes that a field's default value must be true if a feature sets it to false.

How many examples of this are there, and are they intentional? This also seems like an inconsistency that could lead to confusion. If a field has default value false and enabling the feature sets it to false, at what point is it set to true and which takes precedence?

In D121768#3389065, @tmatheson wrote:
This patch distinguishes trivial from non-trivial members and interfaces because some targets have tricky interfaces, e.g.
bool hasCMov() const { return HasCMov || X86SSELevel >= SSE1 || is64Bit(); }
bool useAA() const override { return UseAA; }
bool hasLAHFSAHF() const { return HasLAHFSAHF64 || !is64Bit(); }
It seems that your patch currently handles them incorrectly.
It would be better to take this opportunity to fix these inconsistencies between the getter methods and the fields, rather than adding complexity to the tablegen to keep them. If HasCMov is false it is counterintuitive for hasCMov to return true. I handled this by the getter always matching the field name, e.g.
bool enablePostRAScheduler() const override { return usePostRAScheduler(); }
bool useNEONForSinglePrecisionFP() const { return hasNEON() && hasNEONForFP(); }

A feature may set a field to false, while the field's default value may be either true or false. So I provide a way to avoid generating a field that is not zero-initialized. Your patch, however, assumes that a field's default value must be true if a feature sets it to false.

How many examples of this are there, and are they intentional? This also seems like an inconsistency that could lead to confusion. If a field has default value false and enabling the feature sets it to false, at what point is it set to true and which takes precedence?

I don't know why the inconsistencies are there; they are probably bug fixes or workarounds themselves. On the other hand, I think we should give developers the flexibility to customize the interface.
Of course, the in-class initializer is applied first; the value set by a feature then overrides the default in the function ParseSubtargetFeatures.
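
The ordering described above can be sketched as follows. This is a toy model, not generated LLVM code: `ToySubtarget`, its fields, the feature strings, and `parseSubtargetFeatures` are all hypothetical stand-ins. It illustrates a field whose default is true and which a feature clears, which is the case the trivial/non-trivial distinction is meant to cover.

```cpp
#include <string>
#include <vector>

struct ToySubtarget {
  // In-class initializers are applied first. Note that one default is true,
  // so this field is not zero-initialized.
  bool HasFastFoo = true;
  bool HasBar = false;

  // Stand-in for the generated feature parser: it runs after the defaults
  // above and overrides them for every enabled feature.
  void parseSubtargetFeatures(const std::vector<std::string> &Enabled) {
    for (const std::string &Feature : Enabled) {
      if (Feature == "slow-foo")
        HasFastFoo = false; // enabling this feature clears the field
      else if (Feature == "bar")
        HasBar = true;
    }
  }
};
```

With no features enabled the defaults stand; enabling the hypothetical "slow-foo" feature flips `HasFastFoo` to false while `HasBar` keeps its default.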

skan mentioned this in D120906: [ARM][AArch64] generate subtarget feature flags [NFC]. Mar 17 2022, 7:44 PM

The cmov mess is because SSE and 64-bit codegen assume cmov exists. We couldn’t make 64-bit or SSE imply the cmov feature, or -mattr=-cmov would disable SSE or 64-bit, which is just completely broken. Thus the OR that ignores attempts to disable cmov.
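
A small sketch of the behaviour described above, built around the `hasCMov()` getter quoted earlier in the thread. `ToyX86Subtarget`, `ToySSELevel`, and the `Is64Bit` field name are assumptions for illustration only; the real X86Subtarget has many more members.

```cpp
// Hypothetical, reduced model of the relevant subtarget state.
enum ToySSELevel { NoSSE, SSE1, SSE2 };

struct ToyX86Subtarget {
  bool HasCMov = false;            // what -mattr=+cmov / -cmov would toggle
  ToySSELevel X86SSELevel = NoSSE;
  bool Is64Bit = false;

  bool is64Bit() const { return Is64Bit; }

  // The OR means that clearing HasCMov has no effect once SSE or 64-bit
  // mode is enabled, because codegen for those assumes cmov exists.
  bool hasCMov() const {
    return HasCMov || X86SSELevel >= SSE1 || is64Bit();
  }
};
```

In this model a subtarget with `X86SSELevel = SSE2` reports `hasCMov()` as true even though the `HasCMov` field itself is false, which is the getter/field mismatch the reviewers are debating.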

llvm/include/llvm/Target/Target.td
1664	Trival->Trivial

Every feature bit is exposed directly to FeatureBits vector at the MC layer using the name of the tablegen def. It is to our advantage to have consistent naming between the Def, the Subtarget field name, and the getter. Thus the concept of the getter being different than the Subtarget field doesn’t make sense because it means the MC layer is still wrong.

Fix typo Trival->Trivial

skan retitled this revision from [tablgen][X86] Auto-generate trival fields and interace for target features to Auto-generate trivial fields and trivial interfaces for target features. Mar 17 2022, 8:34 PM

skan edited the summary of this revision.

In D121768#3391208, @craig.topper wrote:

The cmov mess is because SSE and 64-bit codegen assume cmov exists. We couldn’t make 64-bit or SSE imply the cmov feature, or -mattr=-cmov would disable SSE or 64-bit, which is just completely broken. Thus the OR that ignores attempts to disable cmov.

Thank you for the explanation, Craig! Then I think it's important to have the flexibility to customize the interface, at least for now. Similar interfaces also exist in the AMDGPU target.

Harbormaster completed remote builds in B154970: Diff 416382. Mar 17 2022, 9:04 PM

craig.topper added inline comments. Mar 17 2022, 9:22 PM

llvm/lib/Target/X86/X86.td
103–107	Doesn't hasCmpxchg16b() have a non-trivial implementation?

craig.topper added inline comments. Mar 17 2022, 9:42 PM

llvm/lib/Target/X86/X86.td
103–107	Never mind, I see this uses TrivialFieldSubtargetFeature.

skan mentioned this in D121975: [X86] Rename more target feature related things consistency. NFC. Mar 17 2022, 10:04 PM

Rebase

skan retitled this revision from Auto-generate trivial fields and trivial interfaces for target features to [X86][tablgen] Auto-generate trivial fields and trivial interfaces for target features. Mar 17 2022, 11:14 PM

Harbormaster completed remote builds in B154974: Diff 416394. Mar 17 2022, 11:57 PM

Then I think it's important to have the flexibility to customize the interface, at least for now.

To be clear, I think that flexibility is a *bad* thing in this situation and not justified. Autogenerated code is already kind of magic, so it should be repetitive, boilerplate and consistent. Allowing people to tweak the code generation to behave inconsistently some of the time is a recipe for confusion. It will also encourage more deviation in future. If this is to be used in other backends as well, I would prefer a solution that resulted in a 1-1 relationship between getter and field. IIUC, this is in agreement with what @craig.topper is saying here:

Every feature bit is exposed directly to FeatureBits vector at the MC layer using the name of the tablegen def. It is to our advantage to have consistent naming between the Def, the Subtarget field name, and the getter. Thus the concept of the getter being different than the Subtarget field doesn’t make sense because it means the MC layer is still wrong

To give a concrete example using CMov again, either the feature or the existing getter can be simply renamed, making it clear that hasCMov() is not simply checking for the feature:

/// Autogenerated getter for the autogenerated field CMovFeature (not necessarily a good name).
/// Renamed to avoid changing the name of hasCMov()
bool hasCMovFeature() const { return HasCMovFeature; }

/// People still have the flexibility to create a custom interface in a way that is explicit and obvious to the reader
bool hasCMov() const { return hasCMovFeature() || X86SSELevel >= SSE1 || is64Bit(); }

Given the above, I haven't seen any examples to justify the added complexity to the tablegen classes.

In D121768#3391687, @tmatheson wrote:
Then I think it's important to have the flexibility to customize the interface, at least for now.

To be clear, I think that flexibility is a *bad* thing in this situation and not justified. Autogenerated code is already kind of magic, so it should be repetitive, boilerplate and consistent. Allowing people to tweak the code generation to behave inconsistently some of the time is a recipe for confusion. It will also encourage more deviation in future. If this is to be used in other backends as well, I would prefer a solution that resulted in a 1-1 relationship between getter and field. IIUC, this is in agreement with what @craig.topper is saying here:

Every feature bit is exposed directly to FeatureBits vector at the MC layer using the name of the tablegen def. It is to our advantage to have consistent naming between the Def, the Subtarget field name, and the getter. Thus the concept of the getter being different than the Subtarget field doesn’t make sense because it means the MC layer is still wrong

To give a concrete example using CMov again, either the feature or the existing getter can be simply renamed, making it clear that hasCMov() is not simply checking for the feature:
/// Autogenerated getter for the autogenerated field CMovFeature (not necessarily a good name).
/// Renamed to avoid changing the name of hasCMov()
bool hasCMovFeature() const { return HasCMovFeature; }

/// People still have the flexibility to create a custom interface in a way that is explicit and obvious to the reader
bool hasCMov() const { return hasCMovFeature() || X86SSELevel >= SSE1 || is64Bit(); }
Given the above, I haven't seen any examples to justify the added complexity to the tablegen classes.

Your method cannot get rid of non-trivial features like useAA, which has a virtual interface. In addition, as an LLVM backend developer, I often need to hack these interfaces to do some performance tuning or testing work, so the flexibility is quite important.

Your method cannot get rid of non-trivial features like useAA, which has a virtual interface.

I gave an example above of exactly this:

bool enablePostRAScheduler() const override { return usePostRAScheduler(); }

In addition, as an LLVM backend developer, I often need to hack these interfaces to do some performance tuning or testing work, so the flexibility is quite important.

Could you be more specific? I can't imagine how consistent getter/field naming prevents performance or testing work.

Still seeing no justification for the added complexity of 2 new fields to SubtargetFeature, 3 new SubtargetFeature subclasses, in order to avoid a couple of function renames. We already have custom predicates.

In D121768#3391810, @tmatheson wrote:
Your method cannot get rid of non-trivial features like useAA, which has a virtual interface.

I gave an example above of exactly this:
bool enablePostRAScheduler() const override { return usePostRAScheduler(); }
In addition, as an LLVM backend developer, I often need to hack these interfaces to do some performance tuning or testing work, so the flexibility is quite important.

Could you be more specific? I can't imagine how consistent getter/field naming prevents performance or testing work.

Still seeing no justification for the added complexity of 2 new fields to SubtargetFeature, 3 new SubtargetFeature subclasses, in order to avoid a couple of function renames. We already have custom predicates.

Let me describe the scenario:

The features defined in the TD file depend on each other. The simplest case is that features A and B both depend on feature C for some historical hardware reason. When we disable C by passing
knobs like -mattr=-C or -mnoC, features A and B are disabled too. However, A or B may not actually depend on C in software; that is, the compiler could emit instructions from A without emitting instructions from C.
A handwritten (non-trivial) interface gives us a quick way to untie such dependencies. If we'd like to disable C without affecting A and B, we can simply write:

bool hasC() const { return false; }

That's the flexibility I mean, and there can be more complicated cases.

Besides, adding two bits to a class is very common in a DSL such as TableGen. It's worth it because flexibility is preserved and code is reduced at the same time.

RKSimon added a reviewer: RKSimon. Mar 18 2022, 7:53 AM

In D121768#3392153, @skan wrote:
Let me describe the scenario:

The features defined in the TD file depend on each other. The simplest case is that features A and B both depend on feature C for some historical hardware reason. When we disable C by passing
knobs like -mattr=-C or -mnoC, features A and B are disabled too. However, A or B may not actually depend on C in software; that is, the compiler could emit instructions from A without emitting instructions from C.
A handwritten (non-trivial) interface gives us a quick way to untie such dependencies. If we'd like to disable C without affecting A and B, we can simply write:
bool hasC() const { return false; }
That's the flexibility I mean, and there can be more complicated cases.

Since you would have to edit the .td to make it non-trivial and then edit the .h file, couldn't you instead rename the field in the .td so the autogenerated getter has a different name, and then add the custom getter using the old name?

Is useAA() really supported? It's off by default, nothing implies FeatureUseAA, and no lit tests pass -mattr=+use-aa. Seems like we should look at its history; it might be a candidate for deletion.

In D121768#3392554, @craig.topper wrote:
Since you would have to edit the .td to make it non-trivial and then edit the .h file, couldn't you instead rename the field in the .td so the autogenerated getter has a different name, and then add the custom getter using the old name?

That is less straightforward than my method. As I said, this is the simplest case; non-trivial interfaces let us untie and retie dependencies in more complicated cases. For example, we could add a knob like -force-noC to the backend and then define the interface like

bool hasC() const { return C && !ForceNoC; }

so that we can control the dependency without building two compilers.
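
A minimal sketch of the knob idea above. In LLVM such a flag would typically be declared with `cl::opt<bool>`; to keep the example self-contained a plain global stands in for it here. `ForceNoC`, `ToySubtarget`, and the `HasC` field name are hypothetical, and the wiring of the flag to `-force-noC` is not shown.

```cpp
// Stand-in for a backend command-line flag (hypothetical; a real backend
// would more likely use cl::opt<bool> from llvm/Support/CommandLine.h).
static bool ForceNoC = false;

struct ToySubtarget {
  bool HasC = false; // set from the target feature string in a real backend

  // Handwritten (non-trivial) getter: the feature is reported only when it
  // is enabled and the override knob is off.
  bool hasC() const { return HasC && !ForceNoC; }
};
```

Toggling `ForceNoC` at run time changes what `hasC()` reports without rebuilding the compiler, while features that query their own fields are unaffected.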

In D121768#3393857, @skan wrote:
That is less straightforward than my method. As I said, this is the simplest case; non-trivial interfaces let us untie and retie dependencies in more complicated cases. For example, we could add a knob like -force-noC to the backend and then define the interface like
bool hasC() const { return C && !ForceNoC; }
so that we can control the dependency without building two compilers.

I don't understand these use cases. How often are you doing these kinds of things? The dependencies in the compiler are supposed to be as loose as possible. BMI2 doesn't imply BMI1 for example.

Also, D120906 drops these comments from the header file by moving them to the TD file. I'm not sure whether that causes a documentation issue. We keep the comments in the generated header file, which seems better.

In D121768#3393882, @craig.topper wrote:
I don't understand these use cases. How often are you doing these kinds of things? The dependencies in the compiler are supposed to be as loose as possible. BMI2 doesn't imply BMI1 for example.

But AVX implies SSE. In fact, it's the common case used in my development. Why shouldn't we have such usage?

In D121768#3393895, @skan wrote:
But AVX implies SSE. In fact, it's the common case used in my development. Why shouldn't we have such usage?

But I guarantee the compiler will break if you force SSE to false and keep AVX. So that’s not a reasonable case.

In D121768#3393895, @skan wrote:
But AVX implies SSE. In fact, it's the common case used in my development. Why shouldn't we have such usage?

Let me describe another scenario. We could have different front ends for LLVM, e.g. clang, flang, aocc or anything else. Some of them are open-source and some of them are not. We could use these front ends to emit the LLVM IR, then use the LLVM middle/back end to optimize and translate it. New features can be transparent to the front end if the backend can override the target features. I have to say that "-mattr" is hard to use, or not usable at all, when LTO is enabled. In that situation, adding a knob in the backend is pretty useful. Flexibility is quite important.
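
The backend-knob idea argued for in this exchange can be illustrated with a small sketch. It is not code from the patch: `ToySubtarget`, `makeAllEnabled` and the global `ForceNoC` are hypothetical names, and the knob is modelled as a plain global rather than a real LLVM command-line option.

```cpp
#include <cassert>

// Hypothetical backend knob, standing in for the "-force-noC" idea from the
// thread. It is a plain global here, not a real LLVM command-line option.
static bool ForceNoC = false;

// Toy subtarget. In the scenario, A and B are tied to C in the .td file, yet
// the code generator could still use A and B when C is hidden.
struct ToySubtarget {
  bool HasA = false;
  bool HasB = false;
  bool HasC = false;

  // Trivial getters: each returns its field unchanged.
  bool hasA() const { return HasA; }
  bool hasB() const { return HasB; }

  // Non-trivial getter: the knob masks C without clearing the field,
  // so A and B are left untouched.
  bool hasC() const { return HasC && !ForceNoC; }
};

// Builds a subtarget with all three features enabled.
inline ToySubtarget makeAllEnabled() {
  ToySubtarget ST;
  ST.HasA = ST.HasB = ST.HasC = true;
  return ST;
}
```

With `ForceNoC` set, `hasC()` reports false while `hasA()` and `hasB()` keep reporting true. Whether the rest of a backend tolerates a feature masked this way is a separate question that the sketch does not address.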

In D121768#3393897, @skan wrote:
Let me describe another scenario. We could have different front ends for LLVM, e.g. clang, flang, aocc or anything else. Some of them are open-source and some of them are not. We could use these front ends to emit the LLVM IR, then use the LLVM middle/back end to optimize and translate it. New features can be transparent to the front end if the backend can override the target features. I have to say that "-mattr" is hard to use, or not usable at all, when LTO is enabled. In that situation, adding a knob in the backend is pretty useful. Flexibility is quite important.

So you want feature specific knobs that you have to pass through those front end command line interfaces with something like clang’s -mllvm?

In my opinion we should add more interfaces in X86TargetParser.cpp so that all frontends can learn what features llvm supports without a hardcoded list like in Options.td. And those frontends should use the target-feature attribute.

In D121768#3393896, @craig.topper wrote:

But I guarantee the compiler will break if you force SSE to false and keep AVX. So that’s not a reasonable case.

It's a check in X86ISelLowering.cpp, we can relax it.

In D121768#3393913, @craig.topper wrote:

So you want feature specific knobs that you have to pass through those front end command line interfaces with something like clang’s -mllvm?

In my opinion we should add more interfaces in X86TargetParser.cpp so that all frontends can learn what features llvm supports without a hardcoded list like in Options.td. And those frontends should use the target-feature attribute.

Some front ends are not open-source. We cannot teach them.

In D121768#3393915, @skan wrote:

Some front ends are not open-source. We cannot teach them.

If we provide good library interfaces the non open source projects can implement it themselves.

There’s no guarantee that a way to set an llvm override exists in another frontend. Last I knew ispc for example didn’t have an equivalent of -mllvm. I think I heard once that Apple's version of clang doesn’t support -mllvm.

In D121768#3393914, @skan wrote:

It's a check in X86ISelLowering.cpp, we can relax it.

Yes, but if you did that work then you should probably change the avx to not imply sse.

The dependencies we have are largely due to implementation details of the features within the compiler. If we relax those implementation details we should relax the dependencies. For the most part I don’t think you can just relax a dependency without making additional code changes.

The AVX512 implementation has quite a few special cases to avoid assuming dependencies except where it was really necessary. It would be simpler if avx512bw implied avx512dq, for example, but no documentation of such a dependency exists.
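
The "implies" relationship discussed here can be sketched as a small propagation routine. This is a toy model under stated assumptions, not LLVM's implementation: `FeatureSet`, `ImpliesMap`, `impliesTable`, `enableFeature` and `disableFeature` are hypothetical names. The avx and avx2 edges follow the X86.td entries in this revision, and bmi2 is deliberately listed without an edge to bmi, matching the remark earlier in the thread.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

using FeatureSet = std::set<std::string>;
using ImpliesMap = std::map<std::string, std::vector<std::string>>;

// Toy "implies" table: each key lists the features it directly implies.
inline const ImpliesMap &impliesTable() {
  static const ImpliesMap Table = {
      {"sse4.2", {}},
      {"avx", {"sse4.2"}},
      {"avx2", {"avx"}},
      {"bmi", {}},
      {"bmi2", {}},
  };
  return Table;
}

// Enabling a feature also enables everything it transitively implies.
inline void enableFeature(FeatureSet &Enabled, const std::string &Name) {
  if (!Enabled.insert(Name).second)
    return; // Already enabled; nothing more to do.
  auto It = impliesTable().find(Name);
  if (It == impliesTable().end())
    return;
  for (const std::string &Implied : It->second)
    enableFeature(Enabled, Implied);
}

// Disabling a feature also disables every enabled feature that implies it.
inline void disableFeature(FeatureSet &Enabled, const std::string &Name) {
  Enabled.erase(Name);
  for (const auto &Entry : impliesTable()) {
    const std::vector<std::string> &Deps = Entry.second;
    bool ImpliesName = std::find(Deps.begin(), Deps.end(), Name) != Deps.end();
    if (ImpliesName && Enabled.count(Entry.first) != 0)
      disableFeature(Enabled, Entry.first);
  }
}
```

In this model, disabling sse4.2 after enabling avx2 removes avx and avx2 as well, while bmi2 is unaffected because nothing ties it to another feature. Removing an edge from the table is the model's equivalent of relaxing a dependency.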

In D121768#3393916, @craig.topper wrote:

If we provide good library interfaces the non open source projects can implement it themselves.

There’s no guarantee that a way to set an llvm override exists in another frontend. Last I knew ispc for example didn’t have an equivalent of -mllvm. I think I heard once that Apple's version of clang doesn’t support -mllvm.

We don't have such "good library interfaces" yet, so we should keep the flexibility until then. At least for now, we cannot force a non-open-source front end to support that plugin.

In D121768#3393923, @craig.topper wrote:

Yes, but if you did that work then you should probably change the avx to not imply sse.

The dependencies we have are largely due to implementation details of the features within the compiler. If we relax those implementation details we should relax the dependencies. For the most part I don’t think you can just relax a dependency without making additional code changes.

The AVX512 implementation has quite a few special cases to avoid assuming dependencies except where it was really necessary. It would be simpler if avx512bw implied avx512dq, for example, but no documentation of such a dependency exists.

I agree that the correct approach is to relax the dependencies too. But it takes more effort and is not quick enough.

In D121768#3393935, @skan wrote:

I agree that the correct approach is to relax the dependencies too. But it takes more effort and is not quick enough.

I'm saying that a dependency currently exists, that changing the subtarget "has" function will most likely result in a crash or a miscompile without other changes. Do you disagree with that?

In D121768#3393936, @craig.topper wrote:

I'm saying that a dependency currently exists, that changing the subtarget "has" function will most likely result in a crash or a miscompile without other changes. Do you disagree with that?

I agree, but we can improve.

In D121768#3393937, @skan wrote:

I agree, but we can improve.

Then why does it need to be super easy to change the "has" function if it will most likely require more work to make that change functional?

In D121768#3393938, @craig.topper wrote:

Then why does it need to be super easy to change the "has" function if it will most likely require more work to make that change functional?

I agree this case is not so reasonable.

In D121768#3393927, @skan wrote:

We don't have such "good library interfaces" yet, so we should keep the flexibility until then. At least for now, we cannot force a non-open-source front end to support that plugin.

How about this case?

In D121768#3393949, @skan wrote:

How about this case?

Never mind. In theory I could do some work to enable something like "-mattr" when LTO is enabled, so a non-trivial getter is not necessary.

I have one last question for @RKSimon:

D120906 drops comments in the header file by moving them to the TD file. Does this cause any documentation issue?

If it does, I think we should use TableGen to emit the C++ bits directly rather than use a macro to expand the code.
If it's not an issue, I'm tempted to abandon this patch and do something similar for X86, as in D120906.
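
For readers unfamiliar with the macro-expansion alternative mentioned in this question, here is a minimal sketch of the general technique. It assumes a generated feature list consumed through a macro. `TOY_SUBTARGET_FEATURES`, `DECLARE_FIELD`, `DEFINE_GETTER` and `ToySubtarget` are hypothetical names and are not claimed to match what D120906 or this patch emits.

```cpp
#include <cassert>

// Hypothetical stand-in for a TableGen-generated feature list.
// Each entry is (field name, default value, getter name).
#define TOY_SUBTARGET_FEATURES(X)                                              \
  X(HasX87, false, hasX87)                                                     \
  X(HasNOPL, false, hasNOPL)                                                   \
  X(HasPOPCNT, false, hasPOPCNT)

class ToySubtarget {
public:
// Expand the list once to declare the fields...
#define DECLARE_FIELD(FIELD, DEFAULT, GETTER) bool FIELD = DEFAULT;
  TOY_SUBTARGET_FEATURES(DECLARE_FIELD)
#undef DECLARE_FIELD

// ...and once more to define the trivial getters.
#define DEFINE_GETTER(FIELD, DEFAULT, GETTER)                                  \
  bool GETTER() const { return FIELD; }
  TOY_SUBTARGET_FEATURES(DEFINE_GETTER)
#undef DEFINE_GETTER
};
```

The fields are public only to keep the sketch short. Because the members come from macro expansion, no per-field comment appears next to a declaration in the header, which is the documentation concern raised in the question above.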

skan added a comment. Mar 19 2022, 3:42 AM

This comment was removed by skan.

Thanks all for the discussion!

Herald added a subscriber: StephenFan. Mar 22 2022, 5:05 AM

Revision Contents

Path                                        Size
llvm/include/llvm/Target/Target.td          29 lines
llvm/lib/Target/X86/X86.td                  335 lines
llvm/lib/Target/X86/X86Subtarget.h          612 lines
llvm/utils/TableGen/SubtargetEmitter.cpp    117 lines

Diff 416394

llvm/include/llvm/Target/Target.td

Show First 20 Lines • Show All 1,631 Lines • ▼ Show 20 Lines	class Target {
// for all opcodes if this flag is set to 0.		// for all opcodes if this flag is set to 0.
int AllowRegisterRenaming = 0;		int AllowRegisterRenaming = 0;
}		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// SubtargetFeature - A characteristic of the chip set.		// SubtargetFeature - A characteristic of the chip set.
//		//
class SubtargetFeature<string n, string a, string v, string d,		class SubtargetFeature<string n, string a, string v, string d,
list<SubtargetFeature> i = []> {		list<SubtargetFeature> i = [], bits<2> t = 0> {
// Name - Feature name. Used by command line (-mattr=) to determine the		// Name - Feature name. Used by command line (-mattr=) to determine the
// appropriate target chip.		// appropriate target chip.
//		//
string Name = n;		string Name = n;

// Attribute - Attribute to be set by feature.		// Attribute - Attribute to be set by feature.
//		//
string Attribute = a;		string Attribute = a;

// Value - Value the attribute to be set to by feature.		// Value - Value the attribute to be set to by feature.
//		//
string Value = v;		string Value = v;

// Desc - Feature description. Used by command line (-mattr=) to display help		// Desc - Feature description. Used by command line (-mattr=) to display help
// information.		// information.
//		//
string Desc = d;		string Desc = d;

// Implies - Features that this feature implies are present. If one of those		// Implies - Features that this feature implies are present. If one of those
// features isn't set, then this one shouldn't be set either.		// features isn't set, then this one shouldn't be set either.
//		//
list<SubtargetFeature> Implies = i;		list<SubtargetFeature> Implies = i;

		// TrivialField - Auto-generate a trivial field for this feature.
		craig.topper: Trival->Trivial
		// A trivial field is always zero-initialized.
		//
		bit TrivialField = t{0};

		// TrivialInterface - Auto-generate a trivial interface for this feature.
		// The body of a trivial interface must be one of these forms:
		// 1. return Attribute
		// 2. return Attribute >= Value
		//
		bit TrivialInterface = t{1};
}		}

		// A SubtargetFeature that has a trivial field.
		class TrivialFieldSubtargetFeature<string n, string a, string v, string d,
		list<SubtargetFeature> i = []>
		: SubtargetFeature<n, a, v, d, i, 1>;

		// A SubtargetFeature that has a trivial interface.
		class TrivialInterfaceSubtargetFeature<string n, string a, string v, string d,
		list<SubtargetFeature> i = []>
		: SubtargetFeature<n, a, v, d, i, 2>;

		// A SubtargetFeature that has a trivial field and a trivial interface.
		class TrivialSubtargetFeature<string n, string a, string v, string d,
		list<SubtargetFeature> i = []>
		: SubtargetFeature<n, a, v, d, i, 3>;

/// Specifies a Subtarget feature that this instruction is deprecated on.		/// Specifies a Subtarget feature that this instruction is deprecated on.
class Deprecated<SubtargetFeature dep> {		class Deprecated<SubtargetFeature dep> {
SubtargetFeature DeprecatedFeatureMask = dep;		SubtargetFeature DeprecatedFeatureMask = dep;
}		}

/// A custom predicate used to determine if an instruction is		/// A custom predicate used to determine if an instruction is
/// deprecated or not.		/// deprecated or not.
class ComplexDeprecationPredicate<string dep> {		class ComplexDeprecationPredicate<string dep> {
▲ Show 20 Lines • Show All 123 Lines • Show Last 20 Lines
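
The Target.td change above packs two flags into the `bits<2> t` parameter: bit 0 feeds `TrivialField` and bit 1 feeds `TrivialInterface`, so the three helper classes pass 1, 2 and 3. The sketch below decodes that value and shows the two getter bodies the patch calls trivial. `TrivialKind`, `decodeTrivialBits` and `ToySubtarget` are hypothetical names used only for illustration; nothing here is the generated output of SubtargetEmitter.cpp.

```cpp
#include <cassert>

// Which pieces would be auto-generated for a feature.
struct TrivialKind {
  bool Field;     // bit 0 of t: generate a zero-initialized field
  bool Interface; // bit 1 of t: generate a trivial getter
};

// Decode the two-bit value the way the patched Target.td reads it.
inline TrivialKind decodeTrivialBits(unsigned T) {
  TrivialKind Kind;
  Kind.Field = (T & 1u) != 0;
  Kind.Interface = (T & 2u) != 0;
  return Kind;
}

// Toy subtarget showing the two getter forms listed in the patch comment.
enum SSELevel { NoSSE, SSE1, SSE2 };

struct ToySubtarget {
  bool HasX87 = false;
  SSELevel X86SSELevel = NoSSE;

  // Form 1: return Attribute
  bool hasX87() const { return HasX87; }
  // Form 2: return Attribute >= Value
  bool hasSSE1() const { return X86SSELevel >= SSE1; }
};
```

A value of 1 corresponds to a feature such as cmov in this revision, which gets a generated field but keeps a hand-written getter.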

llvm/lib/Target/X86/X86.td

	Show All 12 Lines

	// Get the target-independent interfaces which we are implementing...			// Get the target-independent interfaces which we are implementing...
	//			//
	include "llvm/Target/Target.td"			include "llvm/Target/Target.td"

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// X86 Subtarget state			// X86 Subtarget state
	//			//
				// disregarding specific ABI / programming model
	def Is64Bit : SubtargetFeature<"64bit-mode", "Is64Bit", "true",			def Is64Bit : TrivialSubtargetFeature<"64bit-mode", "Is64Bit", "true",
	pengfei: I think `In64BitMode` is more readable, we usually generate 32 bit instructions under 64 bit mode.
	skan: It's meaningless to have different names for the interface and the member. We already use the interface `is64Bit`.
	"64-bit mode (x86_64)">;			"64-bit mode (x86_64)">;
	def Is32Bit : SubtargetFeature<"32bit-mode", "Is32Bit", "true",			def Is32Bit : TrivialSubtargetFeature<"32bit-mode", "Is32Bit", "true",
	"32-bit mode (80386)">;			"32-bit mode (80386)">;
	def Is16Bit : SubtargetFeature<"16bit-mode", "Is16Bit", "true",			def Is16Bit : TrivialSubtargetFeature<"16bit-mode", "Is16Bit", "true",
	"16-bit mode (i8086)">;			"16-bit mode (i8086)">;

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// X86 Subtarget ISA features			// X86 Subtarget ISA features
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	def FeatureX87 : SubtargetFeature<"x87","HasX87", "true",			def FeatureX87 : TrivialSubtargetFeature<"x87","HasX87", "true",
				pengfei: How about defining a new class like `TrivalSubtargetFeature` so that we don't need to add `, [], 0` to the old one?
				skan: If so, then we have to replace most of `SubtargetFeature` with `TrivalSubtargetFeature`. Now we only need to add `, [], 0` for CMOV, so I think `TrivalSubtargetFeature` is not worth it.
	"Enable X87 float instructions">;			"Enable X87 float instructions">;

	def FeatureNOPL : SubtargetFeature<"nopl", "HasNOPL", "true",			def FeatureNOPL : TrivialSubtargetFeature<"nopl", "HasNOPL", "true",
	"Enable NOPL instruction">;			"Enable NOPL instruction (generally pentium pro+)">;

	def FeatureCMOV : SubtargetFeature<"cmov","HasCMOV", "true",			def FeatureCMOV : TrivialFieldSubtargetFeature<"cmov","HasCMOV", "true",
	"Enable conditional move instructions">;			"Enable conditional move instructions">;

	def FeatureCMPXCHG8B : SubtargetFeature<"cx8", "HasCMPXCHG8B", "true",			def FeatureCMPXCHG8B : TrivialSubtargetFeature<"cx8", "HasCMPXCHG8B", "true",
	"Support CMPXCHG8B instructions">;			"Support CMPXCHG8B instructions">;

	def FeatureCRC32 : SubtargetFeature<"crc32", "HasCRC32", "true",			def FeatureCRC32 : TrivialSubtargetFeature<"crc32", "HasCRC32", "true",
	"Enable SSE 4.2 CRC32 instruction">;			"Enable SSE 4.2 CRC32 instruction">;

	def FeaturePOPCNT : SubtargetFeature<"popcnt", "HasPOPCNT", "true",			def FeaturePOPCNT : TrivialSubtargetFeature<"popcnt", "HasPOPCNT", "true",
	"Support POPCNT instruction">;			"Support POPCNT instruction">;

	def FeatureFXSR : SubtargetFeature<"fxsr", "HasFXSR", "true",			def FeatureFXSR : TrivialSubtargetFeature<"fxsr", "HasFXSR", "true",
	"Support fxsave/fxrestore instructions">;			"Support fxsave/fxrestore instructions">;

	def FeatureXSAVE : SubtargetFeature<"xsave", "HasXSAVE", "true",			def FeatureXSAVE : TrivialSubtargetFeature<"xsave", "HasXSAVE", "true",
	"Support xsave instructions">;			"Support xsave instructions">;

	def FeatureXSAVEOPT: SubtargetFeature<"xsaveopt", "HasXSAVEOPT", "true",			def FeatureXSAVEOPT: TrivialSubtargetFeature<"xsaveopt", "HasXSAVEOPT", "true",
	"Support xsaveopt instructions",			"Support xsaveopt instructions",
	[FeatureXSAVE]>;			[FeatureXSAVE]>;

	def FeatureXSAVEC : SubtargetFeature<"xsavec", "HasXSAVEC", "true",			def FeatureXSAVEC : TrivialSubtargetFeature<"xsavec", "HasXSAVEC", "true",
	"Support xsavec instructions",			"Support xsavec instructions",
	[FeatureXSAVE]>;			[FeatureXSAVE]>;

	def FeatureXSAVES : SubtargetFeature<"xsaves", "HasXSAVES", "true",			def FeatureXSAVES : TrivialSubtargetFeature<"xsaves", "HasXSAVES", "true",
	"Support xsaves instructions",			"Support xsaves instructions",
	[FeatureXSAVE]>;			[FeatureXSAVE]>;

	def FeatureSSE1 : SubtargetFeature<"sse", "X86SSELevel", "SSE1",			def FeatureSSE1 : TrivialSubtargetFeature<"sse", "X86SSELevel", "SSE1",
	"Enable SSE instructions">;			"Enable SSE instructions">;
	def FeatureSSE2 : SubtargetFeature<"sse2", "X86SSELevel", "SSE2",			def FeatureSSE2 : TrivialSubtargetFeature<"sse2", "X86SSELevel", "SSE2",
	"Enable SSE2 instructions",			"Enable SSE2 instructions",
	[FeatureSSE1]>;			[FeatureSSE1]>;
	def FeatureSSE3 : SubtargetFeature<"sse3", "X86SSELevel", "SSE3",			def FeatureSSE3 : TrivialSubtargetFeature<"sse3", "X86SSELevel", "SSE3",
	"Enable SSE3 instructions",			"Enable SSE3 instructions",
	[FeatureSSE2]>;			[FeatureSSE2]>;
	def FeatureSSSE3 : SubtargetFeature<"ssse3", "X86SSELevel", "SSSE3",			def FeatureSSSE3 : TrivialSubtargetFeature<"ssse3", "X86SSELevel", "SSSE3",
	"Enable SSSE3 instructions",			"Enable SSSE3 instructions",
	[FeatureSSE3]>;			[FeatureSSE3]>;
	def FeatureSSE41 : SubtargetFeature<"sse4.1", "X86SSELevel", "SSE41",			def FeatureSSE41 : TrivialSubtargetFeature<"sse4.1", "X86SSELevel", "SSE41",
	"Enable SSE 4.1 instructions",			"Enable SSE 4.1 instructions",
	[FeatureSSSE3]>;			[FeatureSSSE3]>;
	def FeatureSSE42 : SubtargetFeature<"sse4.2", "X86SSELevel", "SSE42",			def FeatureSSE42 : TrivialSubtargetFeature<"sse4.2", "X86SSELevel", "SSE42",
	"Enable SSE 4.2 instructions",			"Enable SSE 4.2 instructions",
	[FeatureSSE41]>;			[FeatureSSE41]>;
	// The MMX subtarget feature is separate from the rest of the SSE features			// The MMX subtarget feature is separate from the rest of the SSE features
	// because it's important (for odd compatibility reasons) to be able to			// because it's important (for odd compatibility reasons) to be able to
	// turn it off explicitly while allowing SSE+ to be on.			// turn it off explicitly while allowing SSE+ to be on.
	def FeatureMMX : SubtargetFeature<"mmx","X863DNowLevel", "MMX",			def FeatureMMX : TrivialSubtargetFeature<"mmx","X863DNowLevel", "MMX",
	"Enable MMX instructions">;			"Enable MMX instructions">;
	def Feature3DNow : SubtargetFeature<"3dnow", "X863DNowLevel", "ThreeDNow",			def Feature3DNow : TrivialSubtargetFeature<"3dnow", "X863DNowLevel", "ThreeDNow",
	"Enable 3DNow! instructions",			"Enable 3DNow! instructions",
	[FeatureMMX]>;			[FeatureMMX]>;
	def Feature3DNowA : SubtargetFeature<"3dnowa", "X863DNowLevel", "ThreeDNowA",			def Feature3DNowA : TrivialSubtargetFeature<"3dnowa", "X863DNowLevel", "ThreeDNowA",
	"Enable 3DNow! Athlon instructions",			"Enable 3DNow! Athlon instructions",
	[Feature3DNow]>;			[Feature3DNow]>;
	// All x86-64 hardware has SSE2, but we don't mark SSE2 as an implied			// All x86-64 hardware has SSE2, but we don't mark SSE2 as an implied
	// feature, because SSE2 can be disabled (e.g. for compiling OS kernels)			// feature, because SSE2 can be disabled (e.g. for compiling OS kernels)
	// without disabling 64-bit mode. Nothing should imply this feature bit. It			// without disabling 64-bit mode. Nothing should imply this feature bit. It
	// is used to enforce that only 64-bit capable CPUs are used in 64-bit mode.			// is used to enforce that only 64-bit capable CPUs are used in 64-bit mode.
	def FeatureX86_64 : SubtargetFeature<"64bit", "HasX86_64", "true",			def FeatureX86_64 : TrivialSubtargetFeature<"64bit", "HasX86_64", "true",
	"Support 64-bit instructions">;			"Support 64-bit instructions">;
	def FeatureCMPXCHG16B : SubtargetFeature<"cx16", "HasCMPXCHG16B", "true",			def FeatureCMPXCHG16B : TrivialFieldSubtargetFeature<"cx16", "HasCMPXCHG16B", "true",
	"64-bit with cmpxchg16b",			"64-bit with cmpxchg16b (this is true for most "
				"x86-64 chips, but not the first AMD chips)",
	[FeatureCMPXCHG8B]>;			[FeatureCMPXCHG8B]>;
	def FeatureSSE4A : SubtargetFeature<"sse4a", "HasSSE4A", "true",			def FeatureSSE4A : TrivialSubtargetFeature<"sse4a", "HasSSE4A", "true",
				craig.topper: Doesn't hasCmpxchg16b() have a non-trivial implementation?
				craig.topper: Never mind, I see this uses TrivialFieldSubtargetFeature.
	"Support SSE 4a instructions",			"Support SSE 4a instructions",
	[FeatureSSE3]>;			[FeatureSSE3]>;

	def FeatureAVX : SubtargetFeature<"avx", "X86SSELevel", "AVX",			def FeatureAVX : TrivialSubtargetFeature<"avx", "X86SSELevel", "AVX",
	"Enable AVX instructions",			"Enable AVX instructions",
	[FeatureSSE42]>;			[FeatureSSE42]>;
	def FeatureAVX2 : SubtargetFeature<"avx2", "X86SSELevel", "AVX2",			def FeatureAVX2 : TrivialSubtargetFeature<"avx2", "X86SSELevel", "AVX2",
	"Enable AVX2 instructions",			"Enable AVX2 instructions",
	[FeatureAVX]>;			[FeatureAVX]>;
	def FeatureFMA : SubtargetFeature<"fma", "HasFMA", "true",			def FeatureFMA : TrivialSubtargetFeature<"fma", "HasFMA", "true",
	"Enable three-operand fused multiple-add",			"Enable three-operand fused multiple-add",
	[FeatureAVX]>;			[FeatureAVX]>;
	def FeatureF16C : SubtargetFeature<"f16c", "HasF16C", "true",			def FeatureF16C : TrivialSubtargetFeature<"f16c", "HasF16C", "true",
	"Support 16-bit floating point conversion instructions",			"Support 16-bit floating point conversion instructions",
	[FeatureAVX]>;			[FeatureAVX]>;
	def FeatureAVX512 : SubtargetFeature<"avx512f", "X86SSELevel", "AVX512",			def FeatureAVX512 : TrivialSubtargetFeature<"avx512f", "X86SSELevel", "AVX512",
	"Enable AVX-512 instructions",			"Enable AVX-512 instructions",
	[FeatureAVX2, FeatureFMA, FeatureF16C]>;			[FeatureAVX2, FeatureFMA, FeatureF16C]>;
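
The avx, avx2 and avx512f definitions above share one X86SSELevel field rather than a boolean each, which is the "compares a field w/ a constant value" case from the summary. A rough C++ sketch of that pattern follows. It assumes that enabling a feature only ever raises the level and that each getter is a >= comparison; ToySSELevel, ToyLevelSubtarget and enable() are hypothetical names, not taken from the patch.

```cpp
#include <cassert>

// Hypothetical ordered level enum; later entries include all earlier ones.
enum ToySSELevel { NoSSE, SSE1, SSE2, SSE3, SSSE3, SSE41, SSE42, AVX, AVX2, AVX512 };

struct ToyLevelSubtarget {
  ToySSELevel X86SSELevel = NoSSE; // zero-initialized, like a trivial field

  // Assumed behaviour: enabling a feature raises the level, never lowers it.
  void enable(ToySSELevel Level) {
    if (X86SSELevel < Level)
      X86SSELevel = Level;
  }

  // Trivial interfaces of the "compare a field with a constant" kind.
  bool hasAVX() const { return X86SSELevel >= AVX; }
  bool hasAVX2() const { return X86SSELevel >= AVX2; }
  bool hasAVX512() const { return X86SSELevel >= AVX512; }
};
```

Because the getters compare against an ordered enum, enabling avx2 in this sketch also makes hasAVX() true without a second field.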
	def FeatureERI : SubtargetFeature<"avx512er", "HasERI", "true",			def FeatureERI : TrivialSubtargetFeature<"avx512er", "HasERI", "true",
	"Enable AVX-512 Exponential and Reciprocal Instructions",			"Enable AVX-512 Exponential and Reciprocal Instructions",
	[FeatureAVX512]>;			[FeatureAVX512]>;
	def FeatureCDI : SubtargetFeature<"avx512cd", "HasCDI", "true",			def FeatureCDI : TrivialSubtargetFeature<"avx512cd", "HasCDI", "true",
	"Enable AVX-512 Conflict Detection Instructions",			"Enable AVX-512 Conflict Detection Instructions",
	[FeatureAVX512]>;			[FeatureAVX512]>;
	def FeatureVPOPCNTDQ : SubtargetFeature<"avx512vpopcntdq", "HasVPOPCNTDQ",			def FeatureVPOPCNTDQ : TrivialSubtargetFeature<"avx512vpopcntdq", "HasVPOPCNTDQ",
	"true", "Enable AVX-512 Population Count Instructions",			"true", "Enable AVX-512 Population Count Instructions",
	[FeatureAVX512]>;			[FeatureAVX512]>;
	def FeaturePFI : SubtargetFeature<"avx512pf", "HasPFI", "true",			def FeaturePFI : TrivialSubtargetFeature<"avx512pf", "HasPFI", "true",
	"Enable AVX-512 PreFetch Instructions",			"Enable AVX-512 PreFetch Instructions",
	[FeatureAVX512]>;			[FeatureAVX512]>;
	def FeaturePREFETCHWT1 : SubtargetFeature<"prefetchwt1", "HasPREFETCHWT1",			def FeaturePREFETCHWT1 : TrivialSubtargetFeature<"prefetchwt1", "HasPREFETCHWT1",
	"true",			"true",
	"Prefetch with Intent to Write and T1 Hint">;			"Prefetch with Intent to Write and T1 Hint">;
	def FeatureDQI : SubtargetFeature<"avx512dq", "HasDQI", "true",			def FeatureDQI : TrivialSubtargetFeature<"avx512dq", "HasDQI", "true",
	"Enable AVX-512 Doubleword and Quadword Instructions",			"Enable AVX-512 Doubleword and Quadword Instructions",
	[FeatureAVX512]>;			[FeatureAVX512]>;
	def FeatureBWI : SubtargetFeature<"avx512bw", "HasBWI", "true",			def FeatureBWI : TrivialSubtargetFeature<"avx512bw", "HasBWI", "true",
	"Enable AVX-512 Byte and Word Instructions",			"Enable AVX-512 Byte and Word Instructions",
	[FeatureAVX512]>;			[FeatureAVX512]>;
	def FeatureVLX : SubtargetFeature<"avx512vl", "HasVLX", "true",			def FeatureVLX : TrivialSubtargetFeature<"avx512vl", "HasVLX", "true",
	"Enable AVX-512 Vector Length eXtensions",			"Enable AVX-512 Vector Length eXtensions",
	[FeatureAVX512]>;			[FeatureAVX512]>;
	def FeatureVBMI : SubtargetFeature<"avx512vbmi", "HasVBMI", "true",			def FeatureVBMI : TrivialSubtargetFeature<"avx512vbmi", "HasVBMI", "true",
	"Enable AVX-512 Vector Byte Manipulation Instructions",			"Enable AVX-512 Vector Byte Manipulation Instructions",
	[FeatureBWI]>;			[FeatureBWI]>;
	def FeatureVBMI2 : SubtargetFeature<"avx512vbmi2", "HasVBMI2", "true",			def FeatureVBMI2 : TrivialSubtargetFeature<"avx512vbmi2", "HasVBMI2", "true",
	"Enable AVX-512 further Vector Byte Manipulation Instructions",			"Enable AVX-512 further Vector Byte Manipulation Instructions",
	[FeatureBWI]>;			[FeatureBWI]>;
	def FeatureIFMA : SubtargetFeature<"avx512ifma", "HasIFMA", "true",			def FeatureIFMA : TrivialSubtargetFeature<"avx512ifma", "HasIFMA", "true",
	"Enable AVX-512 Integer Fused Multiple-Add",			"Enable AVX-512 Integer Fused Multiple-Add",
	[FeatureAVX512]>;			[FeatureAVX512]>;
	def FeaturePKU : SubtargetFeature<"pku", "HasPKU", "true",			def FeaturePKU : TrivialSubtargetFeature<"pku", "HasPKU", "true",
	"Enable protection keys">;			"Enable protection keys">;
	def FeatureVNNI : SubtargetFeature<"avx512vnni", "HasVNNI", "true",			def FeatureVNNI : TrivialSubtargetFeature<"avx512vnni", "HasVNNI", "true",
	"Enable AVX-512 Vector Neural Network Instructions",			"Enable AVX-512 Vector Neural Network Instructions",
	[FeatureAVX512]>;			[FeatureAVX512]>;
	def FeatureAVXVNNI : SubtargetFeature<"avxvnni", "HasAVXVNNI", "true",			def FeatureAVXVNNI : TrivialSubtargetFeature<"avxvnni", "HasAVXVNNI", "true",
	"Support AVX_VNNI encoding",			"Support AVX_VNNI encoding",
	[FeatureAVX2]>;			[FeatureAVX2]>;
	def FeatureBF16 : SubtargetFeature<"avx512bf16", "HasBF16", "true",			def FeatureBF16 : TrivialSubtargetFeature<"avx512bf16", "HasBF16", "true",
	"Support bfloat16 floating point",			"Support bfloat16 floating point",
	[FeatureBWI]>;			[FeatureBWI]>;
	def FeatureBITALG : SubtargetFeature<"avx512bitalg", "HasBITALG", "true",			def FeatureBITALG : TrivialSubtargetFeature<"avx512bitalg", "HasBITALG", "true",
	"Enable AVX-512 Bit Algorithms",			"Enable AVX-512 Bit Algorithms",
	[FeatureBWI]>;			[FeatureBWI]>;
	def FeatureVP2INTERSECT : SubtargetFeature<"avx512vp2intersect",			def FeatureVP2INTERSECT : TrivialSubtargetFeature<"avx512vp2intersect",
	"HasVP2INTERSECT", "true",			"HasVP2INTERSECT", "true",
	"Enable AVX-512 vp2intersect",			"Enable AVX-512 vp2intersect",
	[FeatureAVX512]>;			[FeatureAVX512]>;
	// FIXME: FP16 scalar intrinsics use the type v8f16, which is supposed to be			// FIXME: FP16 scalar intrinsics use the type v8f16, which is supposed to be
	// guarded under condition hasVLX. So we imply it in FeatureFP16 currently.			// guarded under condition hasVLX. So we imply it in FeatureFP16 currently.
	// FIXME: FP16 conversion between f16 and i64 customize type v8i64, which is			// FIXME: FP16 conversion between f16 and i64 customize type v8i64, which is
	// supposed to be guarded under condition hasDQI. So we imply it in FeatureFP16			// supposed to be guarded under condition hasDQI. So we imply it in FeatureFP16
	// currently.			// currently.
	def FeatureFP16 : SubtargetFeature<"avx512fp16", "HasFP16", "true",			def FeatureFP16 : TrivialSubtargetFeature<"avx512fp16", "HasFP16", "true",
	"Support 16-bit floating point",			"Support 16-bit floating point",
	[FeatureBWI, FeatureVLX, FeatureDQI]>;			[FeatureBWI, FeatureVLX, FeatureDQI]>;
	def FeaturePCLMUL : SubtargetFeature<"pclmul", "HasPCLMUL", "true",			def FeaturePCLMUL : TrivialSubtargetFeature<"pclmul", "HasPCLMUL", "true",
	"Enable packed carry-less multiplication instructions",			"Enable packed carry-less multiplication instructions",
	[FeatureSSE2]>;			[FeatureSSE2]>;
	def FeatureGFNI : SubtargetFeature<"gfni", "HasGFNI", "true",			def FeatureGFNI : TrivialSubtargetFeature<"gfni", "HasGFNI", "true",
	"Enable Galois Field Arithmetic Instructions",			"Enable Galois Field Arithmetic Instructions",
	[FeatureSSE2]>;			[FeatureSSE2]>;
	def FeatureVPCLMULQDQ : SubtargetFeature<"vpclmulqdq", "HasVPCLMULQDQ", "true",			def FeatureVPCLMULQDQ : TrivialSubtargetFeature<"vpclmulqdq", "HasVPCLMULQDQ", "true",
	"Enable vpclmulqdq instructions",			"Enable vpclmulqdq instructions",
	[FeatureAVX, FeaturePCLMUL]>;			[FeatureAVX, FeaturePCLMUL]>;
	def FeatureFMA4 : SubtargetFeature<"fma4", "HasFMA4", "true",			def FeatureFMA4 : TrivialSubtargetFeature<"fma4", "HasFMA4", "true",
	"Enable four-operand fused multiple-add",			"Enable four-operand fused multiple-add",
	[FeatureAVX, FeatureSSE4A]>;			[FeatureAVX, FeatureSSE4A]>;
	def FeatureXOP : SubtargetFeature<"xop", "HasXOP", "true",			def FeatureXOP : TrivialSubtargetFeature<"xop", "HasXOP", "true",
	"Enable XOP instructions",			"Enable XOP instructions",
	[FeatureFMA4]>;			[FeatureFMA4]>;
	def FeatureSSEUnalignedMem : SubtargetFeature<"sse-unaligned-mem",			def FeatureSSEUnalignedMem : TrivialSubtargetFeature<"sse-unaligned-mem", "HasSSEUnalignedMem",
	"HasSSEUnalignedMem", "true",			"true", "Allow unaligned memory operands with SSE "
	"Allow unaligned memory operands with SSE instructions">;			"instructions (This may require setting a configuration "
	def FeatureAES : SubtargetFeature<"aes", "HasAES", "true",			"bit in the processor)">;
				def FeatureAES : TrivialSubtargetFeature<"aes", "HasAES", "true",
	"Enable AES instructions",			"Enable AES instructions",
	[FeatureSSE2]>;			[FeatureSSE2]>;
	def FeatureVAES : SubtargetFeature<"vaes", "HasVAES", "true",			def FeatureVAES : TrivialSubtargetFeature<"vaes", "HasVAES", "true",
	"Promote selected AES instructions to AVX512/AVX registers",			"Promote selected AES instructions to AVX512/AVX registers",
	[FeatureAVX, FeatureAES]>;			[FeatureAVX, FeatureAES]>;
	def FeatureTBM : SubtargetFeature<"tbm", "HasTBM", "true",			def FeatureTBM : TrivialSubtargetFeature<"tbm", "HasTBM", "true",
	"Enable TBM instructions">;			"Enable TBM instructions">;
	def FeatureLWP : SubtargetFeature<"lwp", "HasLWP", "true",			def FeatureLWP : TrivialSubtargetFeature<"lwp", "HasLWP", "true",
	"Enable LWP instructions">;			"Enable LWP instructions">;
	def FeatureMOVBE : SubtargetFeature<"movbe", "HasMOVBE", "true",			def FeatureMOVBE : TrivialSubtargetFeature<"movbe", "HasMOVBE", "true",
	"Support MOVBE instruction">;			"Support MOVBE instruction">;
	def FeatureRDRAND : SubtargetFeature<"rdrnd", "HasRDRAND", "true",			def FeatureRDRAND : TrivialSubtargetFeature<"rdrnd", "HasRDRAND", "true",
	"Support RDRAND instruction">;			"Support RDRAND instruction">;
	def FeatureFSGSBase : SubtargetFeature<"fsgsbase", "HasFSGSBase", "true",			def FeatureFSGSBase : TrivialSubtargetFeature<"fsgsbase", "HasFSGSBase", "true",
	"Support FS/GS Base instructions">;			"Support FS/GS Base instructions">;
	def FeatureLZCNT : SubtargetFeature<"lzcnt", "HasLZCNT", "true",			def FeatureLZCNT : TrivialSubtargetFeature<"lzcnt", "HasLZCNT", "true",
	"Support LZCNT instruction">;			"Support LZCNT instruction">;
	def FeatureBMI : SubtargetFeature<"bmi", "HasBMI", "true",			def FeatureBMI : TrivialSubtargetFeature<"bmi", "HasBMI", "true",
	"Support BMI instructions">;			"Support BMI instructions">;
	def FeatureBMI2 : SubtargetFeature<"bmi2", "HasBMI2", "true",			def FeatureBMI2 : TrivialSubtargetFeature<"bmi2", "HasBMI2", "true",
	"Support BMI2 instructions">;			"Support BMI2 instructions">;
	def FeatureRTM : SubtargetFeature<"rtm", "HasRTM", "true",			def FeatureRTM : TrivialSubtargetFeature<"rtm", "HasRTM", "true",
	"Support RTM instructions">;			"Support RTM instructions">;
	def FeatureADX : SubtargetFeature<"adx", "HasADX", "true",			def FeatureADX : TrivialSubtargetFeature<"adx", "HasADX", "true",
	"Support ADX instructions">;			"Support ADX instructions">;
	def FeatureSHA : SubtargetFeature<"sha", "HasSHA", "true",			def FeatureSHA : TrivialSubtargetFeature<"sha", "HasSHA", "true",
	"Enable SHA instructions",			"Enable SHA instructions",
	[FeatureSSE2]>;			[FeatureSSE2]>;
	def FeatureSHSTK : SubtargetFeature<"shstk", "HasSHSTK", "true",			def FeatureSHSTK : TrivialSubtargetFeature<"shstk", "HasSHSTK", "true",
	"Support CET Shadow-Stack instructions">;			"Support CET Shadow-Stack instructions">;
	def FeaturePRFCHW : SubtargetFeature<"prfchw", "HasPRFCHW", "true",			def FeaturePRFCHW : TrivialSubtargetFeature<"prfchw", "HasPRFCHW", "true",
	"Support PRFCHW instructions">;			"Support PRFCHW instructions">;
	def FeatureRDSEED : SubtargetFeature<"rdseed", "HasRDSEED", "true",			def FeatureRDSEED : TrivialSubtargetFeature<"rdseed", "HasRDSEED", "true",
	"Support RDSEED instruction">;			"Support RDSEED instruction">;
	def FeatureLAHFSAHF64 : SubtargetFeature<"sahf", "HasLAHFSAHF64", "true",			def FeatureLAHFSAHF64 : TrivialSubtargetFeature<"sahf", "HasLAHFSAHF64", "true",
	"Support LAHF and SAHF instructions in 64-bit mode">;			"Support LAHF and SAHF instructions in 64-bit mode">;
	def FeatureMWAITX : SubtargetFeature<"mwaitx", "HasMWAITX", "true",			def FeatureMWAITX : TrivialSubtargetFeature<"mwaitx", "HasMWAITX", "true",
	"Enable MONITORX/MWAITX timer functionality">;			"Enable MONITORX/MWAITX timer functionality">;
	def FeatureCLZERO : SubtargetFeature<"clzero", "HasCLZERO", "true",			def FeatureCLZERO : TrivialSubtargetFeature<"clzero", "HasCLZERO", "true",
	"Enable Cache Line Zero">;			"Enable Cache Line Zero">;
	def FeatureCLDEMOTE : SubtargetFeature<"cldemote", "HasCLDEMOTE", "true",			def FeatureCLDEMOTE : TrivialSubtargetFeature<"cldemote", "HasCLDEMOTE", "true",
	"Enable Cache Demote">;			"Enable Cache Demote">;
	def FeaturePTWRITE : SubtargetFeature<"ptwrite", "HasPTWRITE", "true",			def FeaturePTWRITE : TrivialSubtargetFeature<"ptwrite", "HasPTWRITE", "true",
	"Support ptwrite instruction">;			"Support ptwrite instruction">;
	def FeatureAMXTILE : SubtargetFeature<"amx-tile", "HasAMXTILE", "true",			def FeatureAMXTILE : TrivialSubtargetFeature<"amx-tile", "HasAMXTILE", "true",
	"Support AMX-TILE instructions">;			"Support AMX-TILE instructions">;
	def FeatureAMXINT8 : SubtargetFeature<"amx-int8", "HasAMXINT8", "true",			def FeatureAMXINT8 : TrivialSubtargetFeature<"amx-int8", "HasAMXINT8", "true",
	"Support AMX-INT8 instructions",			"Support AMX-INT8 instructions",
	[FeatureAMXTILE]>;			[FeatureAMXTILE]>;
	def FeatureAMXBF16 : SubtargetFeature<"amx-bf16", "HasAMXBF16", "true",			def FeatureAMXBF16 : TrivialSubtargetFeature<"amx-bf16", "HasAMXBF16", "true",
	"Support AMX-BF16 instructions",			"Support AMX-BF16 instructions",
	[FeatureAMXTILE]>;			[FeatureAMXTILE]>;
	def FeatureINVPCID : SubtargetFeature<"invpcid", "HasINVPCID", "true",			def FeatureINVPCID : TrivialSubtargetFeature<"invpcid", "HasINVPCID", "true",
	"Invalidate Process-Context Identifier">;			"Invalidate Process-Context Identifier">;
	def FeatureSGX : SubtargetFeature<"sgx", "HasSGX", "true",			def FeatureSGX : TrivialSubtargetFeature<"sgx", "HasSGX", "true",
	"Enable Software Guard Extensions">;			"Enable Software Guard Extensions">;
	def FeatureCLFLUSHOPT : SubtargetFeature<"clflushopt", "HasCLFLUSHOPT", "true",			def FeatureCLFLUSHOPT : TrivialSubtargetFeature<"clflushopt", "HasCLFLUSHOPT", "true",
	"Flush A Cache Line Optimized">;			"Flush A Cache Line Optimized">;
	def FeatureCLWB : SubtargetFeature<"clwb", "HasCLWB", "true",			def FeatureCLWB : TrivialSubtargetFeature<"clwb", "HasCLWB", "true",
	"Cache Line Write Back">;			"Cache Line Write Back">;
	def FeatureWBNOINVD : SubtargetFeature<"wbnoinvd", "HasWBNOINVD", "true",			def FeatureWBNOINVD : TrivialSubtargetFeature<"wbnoinvd", "HasWBNOINVD", "true",
	"Write Back No Invalidate">;			"Write Back No Invalidate">;
	def FeatureRDPID : SubtargetFeature<"rdpid", "HasRDPID", "true",			def FeatureRDPID : TrivialSubtargetFeature<"rdpid", "HasRDPID", "true",
	"Support RDPID instructions">;			"Support RDPID instructions">;
	def FeatureWAITPKG : SubtargetFeature<"waitpkg", "HasWAITPKG", "true",			def FeatureWAITPKG : TrivialSubtargetFeature<"waitpkg", "HasWAITPKG", "true",
	"Wait and pause enhancements">;			"Wait and pause enhancements">;
	def FeatureENQCMD : SubtargetFeature<"enqcmd", "HasENQCMD", "true",			def FeatureENQCMD : TrivialSubtargetFeature<"enqcmd", "HasENQCMD", "true",
	"Has ENQCMD instructions">;			"Has ENQCMD instructions">;
	def FeatureKL : SubtargetFeature<"kl", "HasKL", "true",			def FeatureKL : TrivialSubtargetFeature<"kl", "HasKL", "true",
	"Support Key Locker kl Instructions",			"Support Key Locker kl Instructions",
	[FeatureSSE2]>;			[FeatureSSE2]>;
	def FeatureWIDEKL : SubtargetFeature<"widekl", "HasWIDEKL", "true",			def FeatureWIDEKL : TrivialSubtargetFeature<"widekl", "HasWIDEKL", "true",
	"Support Key Locker wide Instructions",			"Support Key Locker wide Instructions",
	[FeatureKL]>;			[FeatureKL]>;
	def FeatureHRESET : SubtargetFeature<"hreset", "HasHRESET", "true",			def FeatureHRESET : TrivialSubtargetFeature<"hreset", "HasHRESET", "true",
	"Has hreset instruction">;			"Has hreset instruction">;
	def FeatureSERIALIZE : SubtargetFeature<"serialize", "HasSERIALIZE", "true",			def FeatureSERIALIZE : TrivialSubtargetFeature<"serialize", "HasSERIALIZE", "true",
	"Has serialize instruction">;			"Has serialize instruction">;
	def FeatureTSXLDTRK : SubtargetFeature<"tsxldtrk", "HasTSXLDTRK", "true",			def FeatureTSXLDTRK : TrivialSubtargetFeature<"tsxldtrk", "HasTSXLDTRK", "true",
	"Support TSXLDTRK instructions">;			"Support TSXLDTRK instructions">;
	def FeatureUINTR : SubtargetFeature<"uintr", "HasUINTR", "true",			def FeatureUINTR : TrivialSubtargetFeature<"uintr", "HasUINTR", "true",
	"Has UINTR Instructions">;			"Has UINTR Instructions">;
	def FeaturePCONFIG : SubtargetFeature<"pconfig", "HasPCONFIG", "true",			def FeaturePCONFIG : TrivialSubtargetFeature<"pconfig", "HasPCONFIG", "true",
	"platform configuration instruction">;			"platform configuration instruction">;
	def FeatureMOVDIRI : SubtargetFeature<"movdiri", "HasMOVDIRI", "true",			def FeatureMOVDIRI : TrivialSubtargetFeature<"movdiri", "HasMOVDIRI", "true",
	"Support movdiri instruction">;			"Support movdiri instruction (direct store integer)">;
	def FeatureMOVDIR64B : SubtargetFeature<"movdir64b", "HasMOVDIR64B", "true",			def FeatureMOVDIR64B : TrivialSubtargetFeature<"movdir64b", "HasMOVDIR64B", "true",
	"Support movdir64b instruction">;			"Support movdir64b instruction (direct store 64 bytes)">;

	// Ivy Bridge and newer processors have enhanced REP MOVSB and STOSB (aka			// Ivy Bridge and newer processors have enhanced REP MOVSB and STOSB (aka
	// "string operations"). See "REP String Enhancement" in the Intel Software			// "string operations"). See "REP String Enhancement" in the Intel Software
	// Development Manual. This feature essentially means that REP MOVSB will copy			// Development Manual. This feature essentially means that REP MOVSB will copy
	// using the largest available size instead of copying bytes one by one, making			// using the largest available size instead of copying bytes one by one, making
	// it at least as fast as REPMOVS{W,D,Q}.			// it at least as fast as REPMOVS{W,D,Q}.
	def FeatureERMSB			def FeatureERMSB
	: SubtargetFeature<			: TrivialSubtargetFeature<
	"ermsb", "HasERMSB", "true",			"ermsb", "HasERMSB", "true",
	"REP MOVS/STOS are fast">;			"REP MOVS/STOS are fast">;

	// Icelake and newer processors have Fast Short REP MOV.			// Icelake and newer processors have Fast Short REP MOV.
	def FeatureFSRM			def FeatureFSRM
	: SubtargetFeature<			: TrivialSubtargetFeature<
	"fsrm", "HasFSRM", "true",			"fsrm", "HasFSRM", "true",
	"REP MOVSB of short lengths is faster">;			"REP MOVSB of short lengths is faster">;

	def FeatureSoftFloat			def FeatureSoftFloat
	: SubtargetFeature<"soft-float", "UseSoftFloat", "true",			: TrivialSubtargetFeature<"soft-float", "UseSoftFloat", "true",
	"Use software floating point features">;			"Use software floating point features">;

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// X86 Subtarget Security Mitigation features			// X86 Subtarget Security Mitigation features
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	// Lower indirect calls using a special construct called a `retpoline` to			// Lower indirect calls using a special construct called a `retpoline` to
	// mitigate potential Spectre v2 attacks against them.			// mitigate potential Spectre v2 attacks against them.
	def FeatureRetpolineIndirectCalls			def FeatureRetpolineIndirectCalls
	: SubtargetFeature<			: TrivialSubtargetFeature<
	"retpoline-indirect-calls", "UseRetpolineIndirectCalls", "true",			"retpoline-indirect-calls", "UseRetpolineIndirectCalls", "true",
	"Remove speculation of indirect calls from the generated code">;			"Remove speculation of indirect calls from the generated code">;

	// Lower indirect branches and switches either using conditional branch trees			// Lower indirect branches and switches either using conditional branch trees
	// or using a special construct called a `retpoline` to mitigate potential			// or using a special construct called a `retpoline` to mitigate potential
	// Spectre v2 attacks against them.			// Spectre v2 attacks against them.
	def FeatureRetpolineIndirectBranches			def FeatureRetpolineIndirectBranches
	: SubtargetFeature<			: TrivialSubtargetFeature<
	"retpoline-indirect-branches", "UseRetpolineIndirectBranches", "true",			"retpoline-indirect-branches", "UseRetpolineIndirectBranches", "true",
	"Remove speculation of indirect branches from the generated code">;			"Remove speculation of indirect branches from the generated code">;

	// Deprecated umbrella feature for enabling both `retpoline-indirect-calls` and			// Deprecated umbrella feature for enabling both `retpoline-indirect-calls` and
	// `retpoline-indirect-branches` above.			// `retpoline-indirect-branches` above.
	def FeatureRetpoline			def FeatureRetpoline
	: SubtargetFeature<"retpoline", "DeprecatedUseRetpoline", "true",			: TrivialSubtargetFeature<"retpoline", "DeprecatedUseRetpoline", "true",
	"Remove speculation of indirect branches from the "			"Remove speculation of indirect branches from the "
	"generated code, either by avoiding them entirely or "			"generated code, either by avoiding them entirely or "
	"lowering them with a speculation blocking construct",			"lowering them with a speculation blocking construct",
	[FeatureRetpolineIndirectCalls,			[FeatureRetpolineIndirectCalls,
	FeatureRetpolineIndirectBranches]>;			FeatureRetpolineIndirectBranches]>;

	// Rely on external thunks for the emitted retpoline calls. This allows users			// Rely on external thunks for the emitted retpoline calls. This allows users
	// to provide their own custom thunk definitions in highly specialized			// to provide their own custom thunk definitions in highly specialized
	// environments such as a kernel that does boot-time hot patching.			// environments such as a kernel that does boot-time hot patching.
	def FeatureRetpolineExternalThunk			def FeatureRetpolineExternalThunk
	: SubtargetFeature<			: TrivialSubtargetFeature<
	"retpoline-external-thunk", "UseRetpolineExternalThunk", "true",			"retpoline-external-thunk", "UseRetpolineExternalThunk", "true",
	"When lowering an indirect call or branch using a `retpoline`, rely "			"When lowering an indirect call or branch using a `retpoline`, rely "
	"on the specified user provided thunk rather than emitting one "			"on the specified user provided thunk rather than emitting one "
	"ourselves. Only has effect when combined with some other retpoline "			"ourselves. Only has effect when combined with some other retpoline "
	"feature", [FeatureRetpolineIndirectCalls]>;			"feature", [FeatureRetpolineIndirectCalls]>;

	// Mitigate LVI attacks against indirect calls/branches and call returns			// Mitigate LVI attacks against indirect calls/branches and call returns
	def FeatureLVIControlFlowIntegrity			def FeatureLVIControlFlowIntegrity
	: SubtargetFeature<			: TrivialSubtargetFeature<
	"lvi-cfi", "UseLVIControlFlowIntegrity", "true",			"lvi-cfi", "UseLVIControlFlowIntegrity", "true",
	"Prevent indirect calls/branches from using a memory operand, and "			"Prevent indirect calls/branches from using a memory operand, and "
	"precede all indirect calls/branches from a register with an "			"precede all indirect calls/branches from a register with an "
	"LFENCE instruction to serialize control flow. Also decompose RET "			"LFENCE instruction to serialize control flow. Also decompose RET "
	"instructions into a POP+LFENCE+JMP sequence.">;			"instructions into a POP+LFENCE+JMP sequence.">;

	// Enable SESES to mitigate speculative execution attacks			// Enable SESES to mitigate speculative execution attacks
	def FeatureSpeculativeExecutionSideEffectSuppression			def FeatureSpeculativeExecutionSideEffectSuppression
	: SubtargetFeature<			: TrivialSubtargetFeature<
	"seses", "UseSpeculativeExecutionSideEffectSuppression", "true",			"seses", "UseSpeculativeExecutionSideEffectSuppression", "true",
	"Prevent speculative execution side channel timing attacks by "			"Prevent speculative execution side channel timing attacks by "
	"inserting a speculation barrier before memory reads, memory writes, "			"inserting a speculation barrier before memory reads, memory writes, "
	"and conditional branches. Implies LVI Control Flow integrity.",			"and conditional branches. Implies LVI Control Flow integrity.",
	[FeatureLVIControlFlowIntegrity]>;			[FeatureLVIControlFlowIntegrity]>;
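
The trailing lists on the retpoline and seses definitions above name features that are implied when the feature is enabled. A small C++ sketch of that transitive enabling is given below. The table is copied by hand from the definitions above, and ToyImplies, FeatureSet and enableFeature are hypothetical names for illustration, not LLVM APIs.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

using FeatureSet = std::set<std::string>;

// Hypothetical implication table, transcribed from the definitions above.
static const std::map<std::string, std::vector<std::string>> ToyImplies = {
    {"retpoline", {"retpoline-indirect-calls", "retpoline-indirect-branches"}},
    {"retpoline-external-thunk", {"retpoline-indirect-calls"}},
    {"seses", {"lvi-cfi"}},
};

// Enable a feature and, recursively, everything it implies.
inline void enableFeature(FeatureSet &Enabled, const std::string &Name) {
  // insert() reports whether the name was new, so no separate find() is needed.
  if (!Enabled.insert(Name).second)
    return; // already enabled; this also stops on cycles
  auto It = ToyImplies.find(Name);
  if (It == ToyImplies.end())
    return;
  for (const std::string &Implied : It->second)
    enableFeature(Enabled, Implied);
}
```

In this sketch, enabling "retpoline" turns on both of the narrower retpoline features, which matches its description as an umbrella feature.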

	// Mitigate LVI attacks against data loads			// Mitigate LVI attacks against data loads
	def FeatureLVILoadHardening			def FeatureLVILoadHardening
	: SubtargetFeature<			: TrivialSubtargetFeature<
	"lvi-load-hardening", "UseLVILoadHardening", "true",			"lvi-load-hardening", "UseLVILoadHardening", "true",
	"Insert LFENCE instructions to prevent data speculatively injected "			"Insert LFENCE instructions to prevent data speculatively injected "
	"into loads from being used maliciously.">;			"into loads from being used maliciously.">;

	def FeatureTaggedGlobals			def FeatureTaggedGlobals
	: SubtargetFeature<			: TrivialSubtargetFeature<
	"tagged-globals", "AllowTaggedGlobals", "true",			"tagged-globals", "AllowTaggedGlobals", "true",
	"Use an instruction sequence for taking the address of a global "			"Use an instruction sequence for taking the address of a global "
	"that allows a memory tag in the upper address bits.">;			"that allows a memory tag in the upper address bits.">;

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// X86 Subtarget Tuning features			// X86 Subtarget Tuning features
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	def TuningSlowSHLD : SubtargetFeature<"slow-shld", "IsSHLDSlow", "true",			def TuningSlowSHLD : TrivialSubtargetFeature<"slow-shld", "IsSHLDSlow", "true",
	"SHLD instruction is slow">;			"SHLD instruction is slow">;

	def TuningSlowPMULLD : SubtargetFeature<"slow-pmulld", "IsPMULLDSlow", "true",			def TuningSlowPMULLD : TrivialSubtargetFeature<"slow-pmulld", "IsPMULLDSlow", "true",
	"PMULLD instruction is slow">;			"PMULLD instruction is slow">;

	def TuningSlowPMADDWD : SubtargetFeature<"slow-pmaddwd", "IsPMADDWDSlow",			def TuningSlowPMADDWD : TrivialSubtargetFeature<"slow-pmaddwd", "IsPMADDWDSlow",
	"true",			"true",
	"PMADDWD is slower than PMULLD">;			"PMADDWD is slower than PMULLD">;

	// FIXME: This should not apply to CPUs that do not have SSE.			// FIXME: This should not apply to CPUs that do not have SSE.
	def TuningSlowUAMem16 : SubtargetFeature<"slow-unaligned-mem-16",			def TuningSlowUAMem16 : TrivialSubtargetFeature<"slow-unaligned-mem-16",
	"IsUnalignedMem16Slow", "true",			"IsUnalignedMem16Slow", "true",
	"Slow unaligned 16-byte memory access">;			"Slow unaligned 16-byte memory access">;

	def TuningSlowUAMem32 : SubtargetFeature<"slow-unaligned-mem-32",			def TuningSlowUAMem32 : TrivialSubtargetFeature<"slow-unaligned-mem-32",
	"IsUnalignedMem32Slow", "true",			"IsUnalignedMem32Slow", "true",
	"Slow unaligned 32-byte memory access">;			"Slow unaligned 32-byte memory access">;

	def TuningLEAForSP : SubtargetFeature<"lea-sp", "UseLeaForSP", "true",			def TuningLEAForSP : TrivialSubtargetFeature<"lea-sp", "UseLeaForSP", "true",
	"Use LEA for adjusting the stack pointer">;			"Use LEA for adjusting the stack pointer "
				"(This is an optimization for Intel Atom processors)">;

	def TuningSlowDivide32 : SubtargetFeature<"idivl-to-divb",			def TuningSlowDivide32 : TrivialSubtargetFeature<"idivl-to-divb",
	"HasSlowDivide32", "true",			"HasSlowDivide32", "true",
	"Use 8-bit divide for positive values less than 256">;			"Use 8-bit divide for positive values less than 256">;

	def TuningSlowDivide64 : SubtargetFeature<"idivq-to-divl",			def TuningSlowDivide64 : TrivialSubtargetFeature<"idivq-to-divl",
	"HasSlowDivide64", "true",			"HasSlowDivide64", "true",
	"Use 32-bit divide for positive values less than 2^32">;			"Use 32-bit divide for positive values less than 2^32">;

	def TuningPadShortFunctions : SubtargetFeature<"pad-short-functions",			def TuningPadShortFunctions : TrivialSubtargetFeature<"pad-short-functions",
	"PadShortFunctions", "true",			"PadShortFunctions", "true",
	"Pad short functions">;			"Pad short functions to prevent a stall when returning "
				"too early">;

	// On some processors, instructions that implicitly take two memory operands are			// On some processors, instructions that implicitly take two memory operands are
	// slow. In practice, this means that CALL, PUSH, and POP with memory operands			// slow. In practice, this means that CALL, PUSH, and POP with memory operands
	// should be avoided in favor of a MOV + register CALL/PUSH/POP.			// should be avoided in favor of a MOV + register CALL/PUSH/POP.
	def TuningSlowTwoMemOps : SubtargetFeature<"slow-two-mem-ops",			def TuningSlowTwoMemOps : TrivialSubtargetFeature<"slow-two-mem-ops",
	"SlowTwoMemOps", "true",			"SlowTwoMemOps", "true",
	"Two memory operand instructions are slow">;			"Two memory operand instructions are slow">;

	def TuningLEAUsesAG : SubtargetFeature<"lea-uses-ag", "LeaUsesAG", "true",			def TuningLEAUsesAG : TrivialSubtargetFeature<"lea-uses-ag", "LeaUsesAG", "true",
	"LEA instruction needs inputs at AG stage">;			"LEA instruction needs inputs at AG stage">;

	def TuningSlowLEA : SubtargetFeature<"slow-lea", "SlowLEA", "true",			def TuningSlowLEA : TrivialSubtargetFeature<"slow-lea", "SlowLEA", "true",
	"LEA instruction with certain arguments is slow">;			"LEA instruction with certain arguments is slow">;

	def TuningSlow3OpsLEA : SubtargetFeature<"slow-3ops-lea", "Slow3OpsLEA", "true",			// True if the LEA instruction has all three source operands: base, index,
				// and offset or if the LEA instruction uses base and index registers where
				// the base is EBP, RBP,or R13
				def TuningSlow3OpsLEA : TrivialSubtargetFeature<"slow-3ops-lea", "Slow3OpsLEA", "true",
	"LEA instruction with 3 ops or certain registers is slow">;			"LEA instruction with 3 ops or certain registers is slow">;

	def TuningSlowIncDec : SubtargetFeature<"slow-incdec", "SlowIncDec", "true",			/// True if INC and DEC instructions are slow when writing to flags
				def TuningSlowIncDec : TrivialSubtargetFeature<"slow-incdec", "SlowIncDec", "true",
	"INC and DEC instructions are slower than ADD and SUB">;			"INC and DEC instructions are slower than ADD and SUB">;

	def TuningPOPCNTFalseDeps : SubtargetFeature<"false-deps-popcnt",			def TuningPOPCNTFalseDeps : TrivialSubtargetFeature<"false-deps-popcnt",
	"HasPOPCNTFalseDeps", "true",			"HasPOPCNTFalseDeps", "true",
	"POPCNT has a false dependency on dest register">;			"POPCNT has a false dependency on dest register">;

	def TuningLZCNTFalseDeps : SubtargetFeature<"false-deps-lzcnt-tzcnt",			def TuningLZCNTFalseDeps : TrivialSubtargetFeature<"false-deps-lzcnt-tzcnt",
	"HasLZCNTFalseDeps", "true",			"HasLZCNTFalseDeps", "true",
	"LZCNT/TZCNT have a false dependency on dest register">;			"LZCNT/TZCNT have a false dependency on dest register">;

	def TuningSBBDepBreaking : SubtargetFeature<"sbb-dep-breaking",			def TuningSBBDepBreaking : TrivialSubtargetFeature<"sbb-dep-breaking",
	"HasSBBDepBreaking", "true",			"HasSBBDepBreaking", "true",
	"SBB with same register has no source dependency">;			"SBB with same register has no source dependency">;

	// On recent X86 (port bound) processors, its preferable to combine to a single shuffle			// On recent X86 (port bound) processors, its preferable to combine to a single shuffle
	// using a variable mask over multiple fixed shuffles.			// using a variable mask over multiple fixed shuffles.
	def TuningFastVariableCrossLaneShuffle			def TuningFastVariableCrossLaneShuffle
	: SubtargetFeature<"fast-variable-crosslane-shuffle",			: TrivialSubtargetFeature<"fast-variable-crosslane-shuffle",
	"HasFastVariableCrossLaneShuffle",			"HasFastVariableCrossLaneShuffle",
	"true", "Cross-lane shuffles with variable masks are fast">;			"true", "Cross-lane shuffles with variable masks are fast">;
	def TuningFastVariablePerLaneShuffle			def TuningFastVariablePerLaneShuffle
	: SubtargetFeature<"fast-variable-perlane-shuffle",			: TrivialSubtargetFeature<"fast-variable-perlane-shuffle",
	"HasFastVariablePerLaneShuffle",			"HasFastVariablePerLaneShuffle",
	"true", "Per-lane shuffles with variable masks are fast">;			"true", "Per-lane shuffles with variable masks are fast">;

	// On some X86 processors, a vzeroupper instruction should be inserted after			// On some X86 processors, a vzeroupper instruction should be inserted after
	// using ymm/zmm registers before executing code that may use SSE instructions.			// using ymm/zmm registers before executing code that may use SSE instructions.
	def TuningInsertVZEROUPPER			def TuningInsertVZEROUPPER
	: SubtargetFeature<"vzeroupper",			: TrivialSubtargetFeature<"vzeroupper",
	"InsertVZEROUPPER",			"InsertVZEROUPPER",
	"true", "Should insert vzeroupper instructions">;			"true", "Should insert vzeroupper instructions">;

	// TuningFastScalarFSQRT should be enabled if scalar FSQRT has shorter latency			// TuningFastScalarFSQRT should be enabled if scalar FSQRT has shorter latency
	// than the corresponding NR code. TuningFastVectorFSQRT should be enabled if			// than the corresponding NR code. TuningFastVectorFSQRT should be enabled if
	// vector FSQRT has higher throughput than the corresponding NR code.			// vector FSQRT has higher throughput than the corresponding NR code.
	// The idea is that throughput bound code is likely to be vectorized, so for			// The idea is that throughput bound code is likely to be vectorized, so for
	// vectorized code we should care about the throughput of SQRT operations.			// vectorized code we should care about the throughput of SQRT operations.
	// But if the code is scalar that probably means that the code has some kind of			// But if the code is scalar that probably means that the code has some kind of
	// dependency and we should care more about reducing the latency.			// dependency and we should care more about reducing the latency.
	def TuningFastScalarFSQRT			def TuningFastScalarFSQRT
	: SubtargetFeature<"fast-scalar-fsqrt", "HasFastScalarFSQRT",			: TrivialSubtargetFeature<"fast-scalar-fsqrt", "HasFastScalarFSQRT",
	"true", "Scalar SQRT is fast (disable Newton-Raphson)">;			"true", "Scalar SQRT is fast (disable Newton-Raphson)">;
	def TuningFastVectorFSQRT			def TuningFastVectorFSQRT
	: SubtargetFeature<"fast-vector-fsqrt", "HasFastVectorFSQRT",			: TrivialSubtargetFeature<"fast-vector-fsqrt", "HasFastVectorFSQRT",
	"true", "Vector SQRT is fast (disable Newton-Raphson)">;			"true", "Vector SQRT is fast (disable Newton-Raphson)">;

	// If lzcnt has equivalent latency/throughput to most simple integer ops, it can			// If lzcnt has equivalent latency/throughput to most simple integer ops, it can
	// be used to replace test/set sequences.			// be used to replace test/set sequences.
	def TuningFastLZCNT			def TuningFastLZCNT
	: SubtargetFeature<			: TrivialSubtargetFeature<
	"fast-lzcnt", "HasFastLZCNT", "true",			"fast-lzcnt", "HasFastLZCNT", "true",
	"LZCNT instructions are as fast as most simple integer ops">;			"LZCNT instructions are as fast as most simple integer ops">;

	// If the target can efficiently decode NOPs upto 7-bytes in length.			// If the target can efficiently decode NOPs upto 7-bytes in length.
	def TuningFast7ByteNOP			def TuningFast7ByteNOP
	: SubtargetFeature<			: TrivialSubtargetFeature<
	"fast-7bytenop", "HasFast7ByteNOP", "true",			"fast-7bytenop", "HasFast7ByteNOP", "true",
	"Target can quickly decode up to 7 byte NOPs">;			"Target can quickly decode up to 7 byte NOPs">;

	// If the target can efficiently decode NOPs upto 11-bytes in length.			// If the target can efficiently decode NOPs upto 11-bytes in length.
	def TuningFast11ByteNOP			def TuningFast11ByteNOP
	: SubtargetFeature<			: TrivialSubtargetFeature<
	"fast-11bytenop", "HasFast11ByteNOP", "true",			"fast-11bytenop", "HasFast11ByteNOP", "true",
	"Target can quickly decode up to 11 byte NOPs">;			"Target can quickly decode up to 11 byte NOPs">;

	// If the target can efficiently decode NOPs upto 15-bytes in length.			// If the target can efficiently decode NOPs upto 15-bytes in length.
	def TuningFast15ByteNOP			def TuningFast15ByteNOP
	: SubtargetFeature<			: TrivialSubtargetFeature<
	"fast-15bytenop", "HasFast15ByteNOP", "true",			"fast-15bytenop", "HasFast15ByteNOP", "true",
	"Target can quickly decode up to 15 byte NOPs">;			"Target can quickly decode up to 15 byte NOPs">;

	// Sandy Bridge and newer processors can use SHLD with the same source on both			// Sandy Bridge and newer processors can use SHLD with the same source on both
	// inputs to implement rotate to avoid the partial flag update of the normal			// inputs to implement rotate to avoid the partial flag update of the normal
	// rotate instructions.			// rotate instructions.
	def TuningFastSHLDRotate			def TuningFastSHLDRotate
	: SubtargetFeature<			: TrivialSubtargetFeature<
	"fast-shld-rotate", "HasFastSHLDRotate", "true",			"fast-shld-rotate", "HasFastSHLDRotate", "true",
	"SHLD can be used as a faster rotate">;			"SHLD can be used as a faster rotate">;

	// Bulldozer and newer processors can merge CMP/TEST (but not other			// Bulldozer and newer processors can merge CMP/TEST (but not other
	// instructions) with conditional branches.			// instructions) with conditional branches.
	def TuningBranchFusion			def TuningBranchFusion
	: SubtargetFeature<"branchfusion", "HasBranchFusion", "true",			: TrivialSubtargetFeature<"branchfusion", "HasBranchFusion", "true",
	"CMP/TEST can be fused with conditional branches">;			"CMP/TEST can be fused with conditional branches">;

	// Sandy Bridge and newer processors have many instructions that can be			// Sandy Bridge and newer processors have many instructions that can be
	// fused with conditional branches and pass through the CPU as a single			// fused with conditional branches and pass through the CPU as a single
	// operation.			// operation.
	def TuningMacroFusion			def TuningMacroFusion
	: SubtargetFeature<"macrofusion", "HasMacroFusion", "true",			: TrivialSubtargetFeature<"macrofusion", "HasMacroFusion", "true",
	"Various instructions can be fused with conditional branches">;			"Various instructions can be fused with conditional branches">;

	// Gather is available since Haswell (AVX2 set). So technically, we can			// Gather is available since Haswell (AVX2 set). So technically, we can
	// generate Gathers on all AVX2 processors. But the overhead on HSW is high.			// generate Gathers on all AVX2 processors. But the overhead on HSW is high.
	// Skylake Client processor has faster Gathers than HSW and performance is			// Skylake Client processor has faster Gathers than HSW and performance is
	// similar to Skylake Server (AVX-512).			// similar to Skylake Server (AVX-512).
	def TuningFastGather			def TuningFastGather
	: SubtargetFeature<"fast-gather", "HasFastGather", "true",			: TrivialSubtargetFeature<"fast-gather", "HasFastGather", "true",
	"Indicates if gather is reasonably fast">;			"Indicates if gather is reasonably fast "
				"This is true for Skylake client and all AVX-512 CPUs">;

	def TuningPrefer128Bit			def TuningPrefer128Bit
	: SubtargetFeature<"prefer-128-bit", "Prefer128Bit", "true",			: TrivialSubtargetFeature<"prefer-128-bit", "Prefer128Bit", "true",
	"Prefer 128-bit AVX instructions">;			"Prefer 128-bit AVX instructions">;

	def TuningPrefer256Bit			def TuningPrefer256Bit
	: SubtargetFeature<"prefer-256-bit", "Prefer256Bit", "true",			: TrivialSubtargetFeature<"prefer-256-bit", "Prefer256Bit", "true",
	"Prefer 256-bit AVX instructions">;			"Prefer 256-bit AVX instructions">;

	def TuningPreferMaskRegisters			def TuningPreferMaskRegisters
	: SubtargetFeature<"prefer-mask-registers", "PreferMaskRegisters", "true",			: TrivialSubtargetFeature<"prefer-mask-registers", "PreferMaskRegisters", "true",
	"Prefer AVX512 mask registers over PTEST/MOVMSK">;			"Prefer AVX512 mask registers over PTEST/MOVMSK">;

	def TuningFastBEXTR : SubtargetFeature<"fast-bextr", "HasFastBEXTR", "true",			def TuningFastBEXTR : TrivialSubtargetFeature<"fast-bextr", "HasFastBEXTR", "true",
	"Indicates that the BEXTR instruction is implemented as a single uop "			"Indicates that the BEXTR instruction is implemented as a single uop "
	"with good throughput">;			"with good throughput">;

	// Combine vector math operations with shuffles into horizontal math			// Combine vector math operations with shuffles into horizontal math
	// instructions if a CPU implements horizontal operations (introduced with			// instructions if a CPU implements horizontal operations (introduced with
	// SSE3) with better latency/throughput than the alternative sequence.			// SSE3) with better latency/throughput than the alternative sequence.
	def TuningFastHorizontalOps			def TuningFastHorizontalOps
	: SubtargetFeature<			: TrivialSubtargetFeature<
	"fast-hops", "HasFastHorizontalOps", "true",			"fast-hops", "HasFastHorizontalOps", "true",
	"Prefer horizontal vector math instructions (haddp, phsub, etc.) over "			"Prefer horizontal vector math instructions (haddp, phsub, etc.) over "
	"normal vector instructions with shuffles">;			"normal vector instructions with shuffles">;

	def TuningFastScalarShiftMasks			def TuningFastScalarShiftMasks
	: SubtargetFeature<			: TrivialSubtargetFeature<
	"fast-scalar-shift-masks", "HasFastScalarShiftMasks", "true",			"fast-scalar-shift-masks", "HasFastScalarShiftMasks", "true",
	"Prefer a left/right scalar logical shift pair over a shift+and pair">;			"Prefer a left/right scalar logical shift pair over a shift+and pair">;

	def TuningFastVectorShiftMasks			def TuningFastVectorShiftMasks
	: SubtargetFeature<			: TrivialSubtargetFeature<
	"fast-vector-shift-masks", "HasFastVectorShiftMasks", "true",			"fast-vector-shift-masks", "HasFastVectorShiftMasks", "true",
	"Prefer a left/right vector logical shift pair over a shift+and pair">;			"Prefer a left/right vector logical shift pair over a shift+and pair">;

	def TuningFastMOVBE			def TuningFastMOVBE
	: SubtargetFeature<"fast-movbe", "HasFastMOVBE", "true",			: TrivialSubtargetFeature<"fast-movbe", "HasFastMOVBE", "true",
	"Prefer a movbe over a single-use load + bswap / single-use bswap + store">;			"Prefer a movbe over a single-use load + bswap / single-use bswap + store">;

	def TuningUseSLMArithCosts			def TuningUseSLMArithCosts
	: SubtargetFeature<"use-slm-arith-costs", "UseSLMArithCosts", "true",			: TrivialSubtargetFeature<"use-slm-arith-costs", "UseSLMArithCosts", "true",
	"Use Silvermont specific arithmetic costs">;			"Use Silvermont specific arithmetic costs">;

	def TuningUseGLMDivSqrtCosts			def TuningUseGLMDivSqrtCosts
	: SubtargetFeature<"use-glm-div-sqrt-costs", "UseGLMDivSqrtCosts", "true",			: TrivialSubtargetFeature<"use-glm-div-sqrt-costs", "UseGLMDivSqrtCosts", "true",
	"Use Goldmont specific floating point div/sqrt costs">;			"Use Goldmont specific floating point div/sqrt costs">;

	// Enable use of alias analysis during code generation.			// Enable use of alias analysis during code generation.
	def FeatureUseAA : SubtargetFeature<"use-aa", "UseAA", "true",			def FeatureUseAA : TrivialFieldSubtargetFeature<"use-aa", "UseAA", "true",
	"Use alias analysis during codegen">;			"Use alias analysis during codegen">;

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// X86 CPU Families			// X86 CPU Families
	// TODO: Remove these - use general tuning features to determine codegen.			// TODO: Remove these - use general tuning features to determine codegen.
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	// Bonnell			// Bonnell
	def ProcIntelAtom : SubtargetFeature<"", "IsAtom", "true", "Is Intel Atom processor">;			def ProcIntelAtom : TrivialSubtargetFeature<"", "IsAtom", "true", "Is Intel Atom processor">;

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// Register File Description			// Register File Description
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	include "X86RegisterInfo.td"			include "X86RegisterInfo.td"
	include "X86RegisterBanks.td"			include "X86RegisterBanks.td"
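The `TrivialSubtargetFeature` definitions above are meant to let TableGen emit the backing field and its accessor, so that X86Subtarget.h only has to include the generated file. The following is a minimal sketch of that idea, not the patch's actual output. The `DEMO_*` macros, `DemoSubtarget`, `DemoSSEEnum`, the setter helpers, and the getter names are hypothetical stand-ins for the sections of X86GenSubtargetInfo.inc guarded by `GET_SUBTARGETINFO_FEATURE_FIELD` and `GET_SUBTARGETINFO_FEATURE_INTERFACE`.

```cpp
#include <cassert>

// Illustrative sketch only: the DEMO_* macros stand in for the two guarded
// sections of the TableGen-generated X86GenSubtargetInfo.inc. The real
// generated text and getter names are not shown in this review.
enum DemoSSEEnum { NoSSE, SSE1, SSE2 };

// Stand-in for the GET_SUBTARGETINFO_FEATURE_FIELD section:
// every trivial field is zero-initialized.
#define DEMO_FEATURE_FIELDS                                                    \
  bool HasFastLZCNT = false;                                                   \
  bool Slow3OpsLEA = false;                                                    \
  DemoSSEEnum X86SSELevel = NoSSE;

// Stand-in for the GET_SUBTARGETINFO_FEATURE_INTERFACE section:
// each accessor returns a field or compares it with a constant.
#define DEMO_FEATURE_INTERFACES                                                \
  bool hasFastLZCNT() const { return HasFastLZCNT; }                           \
  bool slow3OpsLEA() const { return Slow3OpsLEA; }                             \
  bool hasSSE1() const { return X86SSELevel >= SSE1; }

class DemoSubtarget {
protected:
  DEMO_FEATURE_FIELDS

public:
  DEMO_FEATURE_INTERFACES

  // Hypothetical helpers that mimic feature-string parsing enabling a feature.
  void enableFastLZCNT() { HasFastLZCNT = true; }
  void setSSELevel(DemoSSEEnum Level) { X86SSELevel = Level; }
};

// Usage: every feature reads as off until it is explicitly enabled.
inline void demoUsage() {
  DemoSubtarget ST;
  assert(!ST.hasFastLZCNT() && !ST.slow3OpsLEA() && !ST.hasSSE1());
  ST.enableFastLZCNT();
  ST.setSSELevel(SSE2);
  assert(ST.hasFastLZCNT() && ST.hasSSE1());
}
```

Non-trivial accessors such as `hasCMOV()`, which combine several conditions, would still be written by hand in the header.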


llvm/lib/Target/X86/X86Subtarget.h

enum X863DNowEnum {		enum X863DNowEnum {
NoThreeDNow, MMX, ThreeDNow, ThreeDNowA		NoThreeDNow, MMX, ThreeDNow, ThreeDNowA
};		};

/// Which PIC style to use		/// Which PIC style to use
PICStyles::Style PICStyle;		PICStyles::Style PICStyle;

const TargetMachine &TM;		const TargetMachine &TM;

/// SSE1, SSE2, SSE3, SSSE3, SSE41, SSE42, or none supported.		#define GET_SUBTARGETINFO_FEATURE_FIELD
X86SSEEnum X86SSELevel = NoSSE;		#include "X86GenSubtargetInfo.inc"

/// MMX, 3DNow, 3DNow Athlon, or none supported.
X863DNowEnum X863DNowLevel = NoThreeDNow;

/// Is this a Intel Atom processor?
bool IsAtom = false;

/// True if the processor supports X87 instructions.
bool HasX87 = false;

/// True if the processor supports CMPXCHG8B.
bool HasCMPXCHG8B = false;

/// True if this processor has NOPL instruction
/// (generally pentium pro+).
pengfei: Some comments like this don't appear in X86.td; should we move all of these comments there?
bool HasNOPL = false;

/// True if this processor has conditional move instructions
/// (generally pentium pro+).
bool HasCMOV = false;

/// True if the processor supports X86-64 instructions.
bool HasX86_64 = false;

/// True if the processor supports POPCNT.
bool HasPOPCNT = false;

/// True if the processor supports SSE4A instructions.
bool HasSSE4A = false;

/// Target has AES instructions
bool HasAES = false;
bool HasVAES = false;

/// Target has FXSAVE/FXRESTOR instructions
bool HasFXSR = false;

/// Target has XSAVE instructions
bool HasXSAVE = false;

/// Target has XSAVEOPT instructions
bool HasXSAVEOPT = false;

/// Target has XSAVEC instructions
bool HasXSAVEC = false;

/// Target has XSAVES instructions
bool HasXSAVES = false;

/// Target has carry-less multiplication
bool HasPCLMUL = false;
bool HasVPCLMULQDQ = false;

/// Target has Galois Field Arithmetic instructions
bool HasGFNI = false;

/// Target has 3-operand fused multiply-add
bool HasFMA = false;

/// Target has 4-operand fused multiply-add
bool HasFMA4 = false;

/// Target has XOP instructions
bool HasXOP = false;

/// Target has TBM instructions.
bool HasTBM = false;

/// Target has LWP instructions
bool HasLWP = false;

/// True if the processor has the MOVBE instruction.
bool HasMOVBE = false;

/// True if the processor has the RDRAND instruction.
bool HasRDRAND = false;

/// Processor has 16-bit floating point conversion instructions.
bool HasF16C = false;

/// Processor has FS/GS base insturctions.
bool HasFSGSBase = false;

/// Processor has LZCNT instruction.
bool HasLZCNT = false;

/// Processor has BMI1 instructions.
bool HasBMI = false;

/// Processor has BMI2 instructions.
bool HasBMI2 = false;

/// Processor has VBMI instructions.
bool HasVBMI = false;

/// Processor has VBMI2 instructions.
bool HasVBMI2 = false;

/// Processor has Integer Fused Multiply Add
bool HasIFMA = false;

/// Processor has RTM instructions.
bool HasRTM = false;

/// Processor has ADX instructions.
bool HasADX = false;

/// Processor has SHA instructions.
bool HasSHA = false;

/// Processor has PRFCHW instructions.
bool HasPRFCHW = false;

/// Processor has RDSEED instructions.
bool HasRDSEED = false;

/// Processor has LAHF/SAHF instructions in 64-bit mode.
bool HasLAHFSAHF64 = false;

/// Processor has MONITORX/MWAITX instructions.
bool HasMWAITX = false;

/// Processor has Cache Line Zero instruction
bool HasCLZERO = false;

/// Processor has Cache Line Demote instruction
bool HasCLDEMOTE = false;

/// Processor has MOVDIRI instruction (direct store integer).
bool HasMOVDIRI = false;

/// Processor has MOVDIR64B instruction (direct store 64 bytes).
bool HasMOVDIR64B = false;

/// Processor has ptwrite instruction.
bool HasPTWRITE = false;

/// Processor has Prefetch with intent to Write instruction
bool HasPREFETCHWT1 = false;

/// True if SHLD instructions are slow.
bool IsSHLDSlow = false;

/// True if the PMULLD instruction is slow compared to PMULLW/PMULHW and
// PMULUDQ.
bool IsPMULLDSlow = false;

/// True if the PMADDWD instruction is slow compared to PMULLD.
bool IsPMADDWDSlow = false;

/// True if unaligned memory accesses of 16-bytes are slow.
bool IsUnalignedMem16Slow = false;

/// True if unaligned memory accesses of 32-bytes are slow.
bool IsUnalignedMem32Slow = false;

/// True if SSE operations can have unaligned memory operands.
/// This may require setting a configuration bit in the processor.
bool HasSSEUnalignedMem = false;

/// True if this processor has the CMPXCHG16B instruction;
/// this is true for most x86-64 chips, but not the first AMD chips.
bool HasCMPXCHG16B = false;

/// True if the LEA instruction should be used for adjusting
/// the stack pointer. This is an optimization for Intel Atom processors.
bool UseLeaForSP = false;

/// True if POPCNT instruction has a false dependency on the destination register.
bool HasPOPCNTFalseDeps = false;

/// True if LZCNT/TZCNT instructions have a false dependency on the destination register.
bool HasLZCNTFalseDeps = false;

/// True if an SBB instruction with same source register is recognized as
/// having no dependency on that register.
bool HasSBBDepBreaking = false;

/// True if its preferable to combine to a single cross-lane shuffle
/// using a variable mask over multiple fixed shuffles.
bool HasFastVariableCrossLaneShuffle = false;

/// True if its preferable to combine to a single per-lane shuffle
/// using a variable mask over multiple fixed shuffles.
bool HasFastVariablePerLaneShuffle = false;

/// True if vzeroupper instructions should be inserted after code that uses
/// ymm or zmm registers.
bool InsertVZEROUPPER = false;

/// True if there is no performance penalty for writing NOPs with up to
/// 7 bytes.
bool HasFast7ByteNOP = false;

/// True if there is no performance penalty for writing NOPs with up to
/// 11 bytes.
bool HasFast11ByteNOP = false;

/// True if there is no performance penalty for writing NOPs with up to
/// 15 bytes.
bool HasFast15ByteNOP = false;

/// True if gather is reasonably fast. This is true for Skylake client and
/// all AVX-512 CPUs.
bool HasFastGather = false;

/// True if hardware SQRTSS instruction is at least as fast (latency) as
/// RSQRTSS followed by a Newton-Raphson iteration.
bool HasFastScalarFSQRT = false;

/// True if hardware SQRTPS/VSQRTPS instructions are at least as fast
/// (throughput) as RSQRTPS/VRSQRTPS followed by a Newton-Raphson iteration.
bool HasFastVectorFSQRT = false;

/// True if 8-bit divisions are significantly faster than
/// 32-bit divisions and should be used when possible.
bool HasSlowDivide32 = false;

/// True if 32-bit divides are significantly faster than
/// 64-bit divisions and should be used when possible.
bool HasSlowDivide64 = false;

/// True if LZCNT instruction is fast.
bool HasFastLZCNT = false;

/// True if SHLD based rotate is fast.
bool HasFastSHLDRotate = false;

/// True if the processor supports macrofusion.
bool HasMacroFusion = false;

/// True if the processor supports branch fusion.
bool HasBranchFusion = false;

/// True if the processor has enhanced REP MOVSB/STOSB.
bool HasERMSB = false;

/// True if the processor has fast short REP MOV.
bool HasFSRM = false;

/// True if the short functions should be padded to prevent
/// a stall when returning too early.
bool PadShortFunctions = false;

/// True if two memory operand instructions should use a temporary register
/// instead.
bool SlowTwoMemOps = false;

/// True if the LEA instruction inputs have to be ready at address generation
/// (AG) time.
bool LeaUsesAG = false;

/// True if the LEA instruction with certain arguments is slow
bool SlowLEA = false;

/// True if the LEA instruction has all three source operands: base, index,
/// and offset or if the LEA instruction uses base and index registers where
/// the base is EBP, RBP,or R13
bool Slow3OpsLEA = false;

/// True if INC and DEC instructions are slow when writing to flags
bool SlowIncDec = false;

/// Processor has AVX-512 PreFetch Instructions
bool HasPFI = false;

/// Processor has AVX-512 Exponential and Reciprocal Instructions
bool HasERI = false;

/// Processor has AVX-512 Conflict Detection Instructions
bool HasCDI = false;

/// Processor has AVX-512 population count Instructions
bool HasVPOPCNTDQ = false;

/// Processor has AVX-512 Doubleword and Quadword instructions
bool HasDQI = false;

/// Processor has AVX-512 Byte and Word instructions
bool HasBWI = false;

/// Processor has AVX-512 Vector Length eXtenstions
bool HasVLX = false;

/// Processor has AVX-512 16 bit floating-point extenstions
bool HasFP16 = false;

/// Processor has PKU extenstions
bool HasPKU = false;

/// Processor has AVX-512 Vector Neural Network Instructions
bool HasVNNI = false;

/// Processor has AVX Vector Neural Network Instructions
bool HasAVXVNNI = false;

/// Processor has AVX-512 bfloat16 floating-point extensions
bool HasBF16 = false;

/// Processor supports ENQCMD instructions
bool HasENQCMD = false;

/// Processor has AVX-512 Bit Algorithms instructions
bool HasBITALG = false;

/// Processor has AVX-512 vp2intersect instructions
bool HasVP2INTERSECT = false;

/// Processor supports CET SHSTK - Control-Flow Enforcement Technology
/// using Shadow Stack
bool HasSHSTK = false;

/// Processor supports Invalidate Process-Context Identifier
bool HasINVPCID = false;

/// Processor has Software Guard Extensions
bool HasSGX = false;

/// Processor supports Flush Cache Line instruction
bool HasCLFLUSHOPT = false;

/// Processor supports Cache Line Write Back instruction
bool HasCLWB = false;

/// Processor supports Write Back No Invalidate instruction
bool HasWBNOINVD = false;

/// Processor support RDPID instruction
bool HasRDPID = false;

/// Processor supports WaitPKG instructions
bool HasWAITPKG = false;

/// Processor supports PCONFIG instruction
bool HasPCONFIG = false;

/// Processor support key locker instructions
bool HasKL = false;

/// Processor support key locker wide instructions
bool HasWIDEKL = false;

/// Processor supports HRESET instruction
bool HasHRESET = false;

/// Processor supports SERIALIZE instruction
bool HasSERIALIZE = false;

/// Processor supports TSXLDTRK instruction
bool HasTSXLDTRK = false;

/// Processor has AMX support
bool HasAMXTILE = false;
bool HasAMXBF16 = false;
bool HasAMXINT8 = false;

/// Processor supports User Level Interrupt instructions
bool HasUINTR = false;

/// Enable SSE4.2 CRC32 instruction (Used when SSE4.2 is supported but
/// function is GPR only)
bool HasCRC32 = false;

/// Processor has a single uop BEXTR implementation.
bool HasFastBEXTR = false;

/// Try harder to combine to horizontal vector ops if they are fast.
bool HasFastHorizontalOps = false;

/// Prefer a left/right scalar logical shifts pair over a shift+and pair.
bool HasFastScalarShiftMasks = false;

/// Prefer a left/right vector logical shifts pair over a shift+and pair.
bool HasFastVectorShiftMasks = false;

/// Prefer a movbe over a single-use load + bswap / single-use bswap + store.
bool HasFastMOVBE = false;

/// Use a retpoline thunk rather than indirect calls to block speculative
/// execution.
bool UseRetpolineIndirectCalls = false;

/// Use a retpoline thunk or remove any indirect branch to block speculative
/// execution.
bool UseRetpolineIndirectBranches = false;

/// Deprecated flag, query `UseRetpolineIndirectCalls` and
/// `UseRetpolineIndirectBranches` instead.
bool DeprecatedUseRetpoline = false;

/// When using a retpoline thunk, call an externally provided thunk rather
/// than emitting one inside the compiler.
bool UseRetpolineExternalThunk = false;

/// Prevent generation of indirect call/branch instructions from memory,
/// and force all indirect call/branch instructions from a register to be
/// preceded by an LFENCE. Also decompose RET instructions into a
/// POP+LFENCE+JMP sequence.
bool UseLVIControlFlowIntegrity = false;

/// Enable Speculative Execution Side Effect Suppression
bool UseSpeculativeExecutionSideEffectSuppression = false;

/// Insert LFENCE instructions to prevent data speculatively injected into
/// loads from being used maliciously.
bool UseLVILoadHardening = false;

/// Use an instruction sequence for taking the address of a global that allows
/// a memory tag in the upper address bits.
bool AllowTaggedGlobals = false;

/// Use software floating point for code generation.
bool UseSoftFloat = false;

/// Use alias analysis during code generation.
bool UseAA = false;

/// The minimum alignment known to hold of the stack frame on		/// The minimum alignment known to hold of the stack frame on
/// entry to the function and which must be maintained by every function.		/// entry to the function and which must be maintained by every function.
Align stackAlignment = Align(4);		Align stackAlignment = Align(4);

Align TileConfigAlignment = Align(4);		Align TileConfigAlignment = Align(4);

/// Max. memset / memcpy size that is turned into rep/movs, rep/stos ops.		/// Max. memset / memcpy size that is turned into rep/movs, rep/stos ops.
///		///
// FIXME: this is a known good value for Yonah. How about others?		// FIXME: this is a known good value for Yonah. How about others?
unsigned MaxInlineSizeThreshold = 128;		unsigned MaxInlineSizeThreshold = 128;

/// Indicates target prefers 128 bit instructions.
bool Prefer128Bit = false;

/// Indicates target prefers 256 bit instructions.
bool Prefer256Bit = false;

/// Indicates target prefers AVX512 mask registers.
bool PreferMaskRegisters = false;

/// Use Silvermont specific arithmetic costs.
bool UseSLMArithCosts = false;

/// Use Goldmont specific floating point div/sqrt costs.
bool UseGLMDivSqrtCosts = false;

/// What processor and OS we're targeting.		/// What processor and OS we're targeting.
Triple TargetTriple;		Triple TargetTriple;

/// GlobalISel related APIs.		/// GlobalISel related APIs.
std::unique_ptr<CallLowering> CallLoweringInfo;		std::unique_ptr<CallLowering> CallLoweringInfo;
std::unique_ptr<LegalizerInfo> Legalizer;		std::unique_ptr<LegalizerInfo> Legalizer;
std::unique_ptr<RegisterBankInfo> RegBankInfo;		std::unique_ptr<RegisterBankInfo> RegBankInfo;
std::unique_ptr<InstructionSelector> InstSelector;		std::unique_ptr<InstructionSelector> InstSelector;

private:		private:
/// Override the stack alignment.		/// Override the stack alignment.
MaybeAlign StackAlignOverride;		MaybeAlign StackAlignOverride;

/// Preferred vector width from function attribute.		/// Preferred vector width from function attribute.
unsigned PreferVectorWidthOverride;		unsigned PreferVectorWidthOverride;

/// Resolved preferred vector width from function attribute and subtarget		/// Resolved preferred vector width from function attribute and subtarget
/// features.		/// features.
unsigned PreferVectorWidth = UINT32_MAX;		unsigned PreferVectorWidth = UINT32_MAX;

/// Required vector width from function attribute.		/// Required vector width from function attribute.
unsigned RequiredVectorWidth;		unsigned RequiredVectorWidth;

/// True if compiling for 64-bit, false for 16-bit or 32-bit.
bool Is64Bit = false;

/// True if compiling for 32-bit, false for 16-bit or 64-bit.
bool Is32Bit = false;

/// True if compiling for 16-bit, false for 32-bit or 64-bit.
bool Is16Bit = false;

X86SelectionDAGInfo TSInfo;		X86SelectionDAGInfo TSInfo;
// Ordering here is important. X86InstrInfo initializes X86RegisterInfo which		// Ordering here is important. X86InstrInfo initializes X86RegisterInfo which
// X86TargetLowering needs.		// X86TargetLowering needs.
X86InstrInfo InstrInfo;		X86InstrInfo InstrInfo;
X86TargetLowering TLInfo;		X86TargetLowering TLInfo;
X86FrameLowering FrameLowering;		X86FrameLowering FrameLowering;

public:		public:
private:		private:
/// Initialize the full set of dependencies so we can use an initializer		/// Initialize the full set of dependencies so we can use an initializer
/// list for X86Subtarget.		/// list for X86Subtarget.
X86Subtarget &initializeSubtargetDependencies(StringRef CPU,		X86Subtarget &initializeSubtargetDependencies(StringRef CPU,
StringRef TuneCPU,		StringRef TuneCPU,
StringRef FS);		StringRef FS);
void initSubtargetFeatures(StringRef CPU, StringRef TuneCPU, StringRef FS);		void initSubtargetFeatures(StringRef CPU, StringRef TuneCPU, StringRef FS);

public:		public:
/// Is this x86_64? (disregarding specific ABI / programming model)
bool is64Bit() const {
return Is64Bit;
}

bool is32Bit() const {
return Is32Bit;
}

bool is16Bit() const {
return Is16Bit;
}

/// Is this x86_64 with the ILP32 programming model (x32 ABI)?		/// Is this x86_64 with the ILP32 programming model (x32 ABI)?
bool isTarget64BitILP32() const {		bool isTarget64BitILP32() const {
return Is64Bit && (TargetTriple.isX32() \|\| TargetTriple.isOSNaCl());		return Is64Bit && (TargetTriple.isX32() \|\| TargetTriple.isOSNaCl());
}		}

/// Is this x86_64 with the LP64 programming model (standard AMD64, no x32)?		/// Is this x86_64 with the LP64 programming model (standard AMD64, no x32)?
bool isTarget64BitLP64() const {		bool isTarget64BitLP64() const {
return Is64Bit && (!TargetTriple.isX32() && !TargetTriple.isOSNaCl());		return Is64Bit && (!TargetTriple.isX32() && !TargetTriple.isOSNaCl());
}		}

PICStyles::Style getPICStyle() const { return PICStyle; }		PICStyles::Style getPICStyle() const { return PICStyle; }
void setPICStyle(PICStyles::Style Style) { PICStyle = Style; }		void setPICStyle(PICStyles::Style Style) { PICStyle = Style; }

bool hasX87() const { return HasX87; }		#define GET_SUBTARGETINFO_FEATURE_INTERFACE
bool hasCMPXCHG8B() const { return HasCMPXCHG8B; }		#include "X86GenSubtargetInfo.inc"
bool hasNOPL() const { return HasNOPL; }
// SSE codegen depends on cmovs, and all SSE1+ processors support them.		// SSE codegen depends on cmovs, and all SSE1+ processors support them.
// All 64-bit processors support cmov.		// All 64-bit processors support cmov.
bool hasCMOV() const { return HasCMOV \|\| X86SSELevel >= SSE1 \|\| is64Bit(); }		bool hasCMOV() const { return HasCMOV \|\| X86SSELevel >= SSE1 \|\| is64Bit(); }
bool hasSSE1() const { return X86SSELevel >= SSE1; }		bool useAA() const override { return UseAA; }
bool hasSSE2() const { return X86SSELevel >= SSE2; }
bool hasSSE3() const { return X86SSELevel >= SSE3; }
bool hasSSSE3() const { return X86SSELevel >= SSSE3; }
bool hasSSE41() const { return X86SSELevel >= SSE41; }
bool hasSSE42() const { return X86SSELevel >= SSE42; }
bool hasAVX() const { return X86SSELevel >= AVX; }
bool hasAVX2() const { return X86SSELevel >= AVX2; }
bool hasAVX512() const { return X86SSELevel >= AVX512; }
bool hasInt256() const { return hasAVX2(); }		bool hasInt256() const { return hasAVX2(); }
bool hasSSE4A() const { return HasSSE4A; }
bool hasMMX() const { return X863DNowLevel >= MMX; }
bool hasThreeDNow() const { return X863DNowLevel >= ThreeDNow; }
bool hasThreeDNowA() const { return X863DNowLevel >= ThreeDNowA; }
bool hasPOPCNT() const { return HasPOPCNT; }
bool hasAES() const { return HasAES; }
bool hasVAES() const { return HasVAES; }
bool hasFXSR() const { return HasFXSR; }
bool hasXSAVE() const { return HasXSAVE; }
bool hasXSAVEOPT() const { return HasXSAVEOPT; }
bool hasXSAVEC() const { return HasXSAVEC; }
bool hasXSAVES() const { return HasXSAVES; }
bool hasPCLMUL() const { return HasPCLMUL; }
bool hasVPCLMULQDQ() const { return HasVPCLMULQDQ; }
bool hasGFNI() const { return HasGFNI; }
// Prefer FMA4 to FMA - it's better for commutation/memory folding and
// has equal or better performance on all supported targets.
bool hasFMA() const { return HasFMA; }
bool hasFMA4() const { return HasFMA4; }
bool hasAnyFMA() const { return hasFMA() || hasFMA4(); }		bool hasAnyFMA() const { return hasFMA() || hasFMA4(); }
bool hasXOP() const { return HasXOP; }
bool hasTBM() const { return HasTBM; }
bool hasLWP() const { return HasLWP; }
bool hasMOVBE() const { return HasMOVBE; }
bool hasRDRAND() const { return HasRDRAND; }
bool hasF16C() const { return HasF16C; }
bool hasFSGSBase() const { return HasFSGSBase; }
bool hasLZCNT() const { return HasLZCNT; }
bool hasBMI() const { return HasBMI; }
bool hasBMI2() const { return HasBMI2; }
bool hasVBMI() const { return HasVBMI; }
bool hasVBMI2() const { return HasVBMI2; }
bool hasIFMA() const { return HasIFMA; }
bool hasRTM() const { return HasRTM; }
bool hasADX() const { return HasADX; }
bool hasSHA() const { return HasSHA; }
bool hasPRFCHW() const { return HasPRFCHW; }
bool hasPREFETCHWT1() const { return HasPREFETCHWT1; }
bool hasPrefetchW() const {		bool hasPrefetchW() const {
// The PREFETCHW instruction was added with 3DNow but later CPUs gave it		// The PREFETCHW instruction was added with 3DNow but later CPUs gave it
// its own CPUID bit as part of deprecating 3DNow. Intel eventually added		// its own CPUID bit as part of deprecating 3DNow. Intel eventually added
// it and KNL has another that prefetches to L2 cache. We assume the		// it and KNL has another that prefetches to L2 cache. We assume the
// L1 version exists if the L2 version does.		// L1 version exists if the L2 version does.
return hasThreeDNow() || hasPRFCHW() || hasPREFETCHWT1();		return hasThreeDNow() || hasPRFCHW() || hasPREFETCHWT1();
}		}
bool hasSSEPrefetch() const {		bool hasSSEPrefetch() const {
// We implicitly enable these when we have a write prefix supporting cache		// We implicitly enable these when we have a write prefix supporting cache
// level OR if we have prfchw, but don't already have a read prefetch from		// level OR if we have prfchw, but don't already have a read prefetch from
// 3dnow.		// 3dnow.
return hasSSE1() || (hasPRFCHW() && !hasThreeDNow()) || hasPREFETCHWT1();		return hasSSE1() || (hasPRFCHW() && !hasThreeDNow()) || hasPREFETCHWT1();
}		}
bool hasRDSEED() const { return HasRDSEED; }
bool hasLAHFSAHF() const { return HasLAHFSAHF64 || !is64Bit(); }		bool hasLAHFSAHF() const { return HasLAHFSAHF64 || !is64Bit(); }
bool hasMWAITX() const { return HasMWAITX; }
bool hasCLZERO() const { return HasCLZERO; }
bool hasCLDEMOTE() const { return HasCLDEMOTE; }
bool hasMOVDIRI() const { return HasMOVDIRI; }
bool hasMOVDIR64B() const { return HasMOVDIR64B; }
bool hasPTWRITE() const { return HasPTWRITE; }
bool isSHLDSlow() const { return IsSHLDSlow; }
bool isPMULLDSlow() const { return IsPMULLDSlow; }
bool isPMADDWDSlow() const { return IsPMADDWDSlow; }
bool isUnalignedMem16Slow() const { return IsUnalignedMem16Slow; }
bool isUnalignedMem32Slow() const { return IsUnalignedMem32Slow; }
bool hasSSEUnalignedMem() const { return HasSSEUnalignedMem; }
bool hasCMPXCHG16B() const { return HasCMPXCHG16B && is64Bit(); }		bool hasCMPXCHG16B() const { return HasCMPXCHG16B && is64Bit(); }
bool useLeaForSP() const { return UseLeaForSP; }
bool hasPOPCNTFalseDeps() const { return HasPOPCNTFalseDeps; }
bool hasLZCNTFalseDeps() const { return HasLZCNTFalseDeps; }
bool hasSBBDepBreaking() const { return HasSBBDepBreaking; }
bool hasFastVariableCrossLaneShuffle() const {
return HasFastVariableCrossLaneShuffle;
}
bool hasFastVariablePerLaneShuffle() const {
return HasFastVariablePerLaneShuffle;
}
bool insertVZEROUPPER() const { return InsertVZEROUPPER; }
bool hasFastGather() const { return HasFastGather; }
bool hasFastScalarFSQRT() const { return HasFastScalarFSQRT; }
bool hasFastVectorFSQRT() const { return HasFastVectorFSQRT; }
bool hasFastLZCNT() const { return HasFastLZCNT; }
bool hasFastSHLDRotate() const { return HasFastSHLDRotate; }
bool hasFastBEXTR() const { return HasFastBEXTR; }
bool hasFastHorizontalOps() const { return HasFastHorizontalOps; }
bool hasFastScalarShiftMasks() const { return HasFastScalarShiftMasks; }
bool hasFastVectorShiftMasks() const { return HasFastVectorShiftMasks; }
bool hasFastMOVBE() const { return HasFastMOVBE; }
bool hasMacroFusion() const { return HasMacroFusion; }
bool hasBranchFusion() const { return HasBranchFusion; }
bool hasERMSB() const { return HasERMSB; }
bool hasFSRM() const { return HasFSRM; }
bool hasSlowDivide32() const { return HasSlowDivide32; }
bool hasSlowDivide64() const { return HasSlowDivide64; }
bool padShortFunctions() const { return PadShortFunctions; }
bool slowTwoMemOps() const { return SlowTwoMemOps; }
bool leaUsesAG() const { return LeaUsesAG; }
bool slowLEA() const { return SlowLEA; }
bool slow3OpsLEA() const { return Slow3OpsLEA; }
bool slowIncDec() const { return SlowIncDec; }
bool hasCDI() const { return HasCDI; }
bool hasVPOPCNTDQ() const { return HasVPOPCNTDQ; }
bool hasPFI() const { return HasPFI; }
bool hasERI() const { return HasERI; }
bool hasDQI() const { return HasDQI; }
bool hasBWI() const { return HasBWI; }
bool hasVLX() const { return HasVLX; }
bool hasFP16() const { return HasFP16; }
bool hasPKU() const { return HasPKU; }
bool hasVNNI() const { return HasVNNI; }
bool hasBF16() const { return HasBF16; }
bool hasVP2INTERSECT() const { return HasVP2INTERSECT; }
bool hasBITALG() const { return HasBITALG; }
bool hasSHSTK() const { return HasSHSTK; }
bool hasCLFLUSHOPT() const { return HasCLFLUSHOPT; }
bool hasCLWB() const { return HasCLWB; }
bool hasWBNOINVD() const { return HasWBNOINVD; }
bool hasRDPID() const { return HasRDPID; }
bool hasWAITPKG() const { return HasWAITPKG; }
bool hasPCONFIG() const { return HasPCONFIG; }
bool hasSGX() const { return HasSGX; }
bool hasINVPCID() const { return HasINVPCID; }
bool hasENQCMD() const { return HasENQCMD; }
bool hasKL() const { return HasKL; }
bool hasWIDEKL() const { return HasWIDEKL; }
bool hasHRESET() const { return HasHRESET; }
bool hasSERIALIZE() const { return HasSERIALIZE; }
bool hasTSXLDTRK() const { return HasTSXLDTRK; }
bool hasUINTR() const { return HasUINTR; }
bool hasCRC32() const { return HasCRC32; }
bool useRetpolineIndirectCalls() const { return UseRetpolineIndirectCalls; }
bool useRetpolineIndirectBranches() const {
return UseRetpolineIndirectBranches;
}
bool hasAVXVNNI() const { return HasAVXVNNI; }
bool hasAMXTILE() const { return HasAMXTILE; }
bool hasAMXBF16() const { return HasAMXBF16; }
bool hasAMXINT8() const { return HasAMXINT8; }
bool useRetpolineExternalThunk() const { return UseRetpolineExternalThunk; }

// These are generic getters that OR together all of the thunk types		// These are generic getters that OR together all of the thunk types
// supported by the subtarget. Therefore useIndirectThunk*() will return true		// supported by the subtarget. Therefore useIndirectThunk*() will return true
// if any respective thunk feature is enabled.		// if any respective thunk feature is enabled.
bool useIndirectThunkCalls() const {		bool useIndirectThunkCalls() const {
return useRetpolineIndirectCalls() || useLVIControlFlowIntegrity();		return useRetpolineIndirectCalls() || useLVIControlFlowIntegrity();
}		}
bool useIndirectThunkBranches() const {		bool useIndirectThunkBranches() const {
return useRetpolineIndirectBranches() || useLVIControlFlowIntegrity();		return useRetpolineIndirectBranches() || useLVIControlFlowIntegrity();
}		}

bool preferMaskRegisters() const { return PreferMaskRegisters; }
bool useSLMArithCosts() const { return UseSLMArithCosts; }
bool useGLMDivSqrtCosts() const { return UseGLMDivSqrtCosts; }
bool useLVIControlFlowIntegrity() const { return UseLVIControlFlowIntegrity; }
bool allowTaggedGlobals() const { return AllowTaggedGlobals; }
bool useLVILoadHardening() const { return UseLVILoadHardening; }
bool useSpeculativeExecutionSideEffectSuppression() const {
return UseSpeculativeExecutionSideEffectSuppression;
}

unsigned getPreferVectorWidth() const { return PreferVectorWidth; }		unsigned getPreferVectorWidth() const { return PreferVectorWidth; }
unsigned getRequiredVectorWidth() const { return RequiredVectorWidth; }		unsigned getRequiredVectorWidth() const { return RequiredVectorWidth; }

// Helper functions to determine when we should allow widening to 512-bit		// Helper functions to determine when we should allow widening to 512-bit
// during codegen.		// during codegen.
// TODO: Currently we're always allowing widening on CPUs without VLX,		// TODO: Currently we're always allowing widening on CPUs without VLX,
// because for many cases we don't have a better option.		// because for many cases we don't have a better option.
bool canExtendTo512DQ() const {		bool canExtendTo512DQ() const {
Show All 10 Lines	#include "X86GenSubtargetInfo.inc"
}		}

bool useBWIRegs() const {		bool useBWIRegs() const {
return hasBWI() && useAVX512Regs();		return hasBWI() && useAVX512Regs();
}		}

bool isXRaySupported() const override { return is64Bit(); }		bool isXRaySupported() const override { return is64Bit(); }

/// TODO: to be removed later and replaced with suitable properties
bool isAtom() const { return IsAtom; }
bool useSoftFloat() const { return UseSoftFloat; }
bool useAA() const override { return UseAA; }

/// Use mfence if we have SSE2 or we're on x86-64 (even if we asked for		/// Use mfence if we have SSE2 or we're on x86-64 (even if we asked for
/// no-sse2). There isn't any reason to disable it if the target processor		/// no-sse2). There isn't any reason to disable it if the target processor
/// supports it.		/// supports it.
bool hasMFence() const { return hasSSE2() || is64Bit(); }		bool hasMFence() const { return hasSSE2() || is64Bit(); }

const Triple &getTargetTriple() const { return TargetTriple; }		const Triple &getTargetTriple() const { return TargetTriple; }

bool isTargetDarwin() const { return TargetTriple.isOSDarwin(); }		bool isTargetDarwin() const { return TargetTriple.isOSDarwin(); }
▲ Show 20 Lines • Show All 148 Lines • Show Last 20 Lines
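
The right-hand side of X86Subtarget.h above replaces the hand-written accessors with a `#define GET_SUBTARGETINFO_FEATURE_INTERFACE` followed by an `#include` of the generated file. The following is a minimal, self-contained sketch of that idea, not the patch's actual output: the two `DEMO_*` macros are hypothetical stand-ins for the guarded regions of `X86GenSubtargetInfo.inc`, `DemoSubtarget` and `makeWithSSELevel` are invented for illustration, and the hand-written `hasCMOV()` is simplified (the real one also checks `is64Bit()`).

```cpp
#include <cassert>

// Hypothetical stand-in for the GET_SUBTARGETINFO_FEATURE_FIELD region:
// every trivial field is zero-initialized.
#define DEMO_FEATURE_FIELDS                                                    \
  bool HasX87 = {};                                                            \
  unsigned X86SSELevel = {};

// Hypothetical stand-in for the GET_SUBTARGETINFO_FEATURE_INTERFACE region:
// each accessor returns a field or compares it with a constant.
#define DEMO_FEATURE_INTERFACES                                                \
  bool hasX87() const { return HasX87; }                                       \
  bool hasSSE1() const { return X86SSELevel >= SSE1; }                         \
  bool hasSSE2() const { return X86SSELevel >= SSE2; }

class DemoSubtarget {
public:
  enum X86SSEEnum : unsigned { NoSSE, SSE1, SSE2 };

  // In a real build these two lines would be "#define GET_..." plus
  // "#include" of the generated .inc file.
  DEMO_FEATURE_FIELDS
  DEMO_FEATURE_INTERFACES

  // Non-trivial accessors stay hand-written (simplified here).
  bool hasCMOV() const { return HasCMOV || X86SSELevel >= SSE1; }
  bool HasCMOV = false;
};

// Illustration-only helper: builds a subtarget with the given SSE level.
inline DemoSubtarget makeWithSSELevel(DemoSubtarget::X86SSEEnum Level) {
  DemoSubtarget ST;
  ST.X86SSELevel = Level;
  return ST;
}
```

A default-constructed `DemoSubtarget` reports no features, while `makeWithSSELevel(DemoSubtarget::SSE2)` satisfies both `hasSSE1()` and `hasSSE2()` because the level comparison is `>=`.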

llvm/utils/TableGen/SubtargetEmitter.cpp

Show First 20 Lines • Show All 116 Lines • ▼ Show 20 Lines	class SubtargetEmitter {
void EmitSchedModelHelpers(const std::string &ClassName, raw_ostream &OS);		void EmitSchedModelHelpers(const std::string &ClassName, raw_ostream &OS);
void emitSchedModelHelpersImpl(raw_ostream &OS,		void emitSchedModelHelpersImpl(raw_ostream &OS,
bool OnlyExpandMCInstPredicates = false);		bool OnlyExpandMCInstPredicates = false);
void emitGenMCSubtargetInfo(raw_ostream &OS);		void emitGenMCSubtargetInfo(raw_ostream &OS);
void EmitMCInstrAnalysisPredicateFunctions(raw_ostream &OS);		void EmitMCInstrAnalysisPredicateFunctions(raw_ostream &OS);

void EmitSchedModel(raw_ostream &OS);		void EmitSchedModel(raw_ostream &OS);
void EmitHwModeCheck(const std::string &ClassName, raw_ostream &OS);		void EmitHwModeCheck(const std::string &ClassName, raw_ostream &OS);
void ParseFeaturesFunction(raw_ostream &OS, unsigned NumFeatures,		void ParseFeaturesFunction(raw_ostream &OS);
unsigned NumProcs);

public:		public:
SubtargetEmitter(RecordKeeper &R, CodeGenTarget &TGT)		SubtargetEmitter(RecordKeeper &R, CodeGenTarget &TGT)
: TGT(TGT), Records(R), SchedModels(TGT.getSchedModels()),		: TGT(TGT), Records(R), SchedModels(TGT.getSchedModels()),
Target(TGT.getName()) {}		Target(TGT.getName()) {}

void run(raw_ostream &o);		void run(raw_ostream &o);
};		};
▲ Show 20 Lines • Show All 1,545 Lines • ▼ Show 20 Lines	void SubtargetEmitter::EmitHwModeCheck(const std::string &ClassName,
}		}
OS << " return 0;\n}\n";		OS << " return 0;\n}\n";
}		}

//		//
// ParseFeaturesFunction - Produces a subtarget specific function for parsing		// ParseFeaturesFunction - Produces a subtarget specific function for parsing
// the subtarget features string.		// the subtarget features string.
//		//
void SubtargetEmitter::ParseFeaturesFunction(raw_ostream &OS,		void SubtargetEmitter::ParseFeaturesFunction(raw_ostream &OS) {
unsigned NumFeatures,		OS << "\n#ifdef GET_SUBTARGETINFO_TARGET_DESC\n";
unsigned NumProcs) {		OS << "#undef GET_SUBTARGETINFO_TARGET_DESC\n\n";

		OS << "#include \"llvm/Support/Debug.h\"\n";
		OS << "#include \"llvm/Support/raw_ostream.h\"\n\n";

std::vector<Record*> Features =		std::vector<Record*> Features =
Records.getAllDerivedDefinitions("SubtargetFeature");		Records.getAllDerivedDefinitions("SubtargetFeature");
llvm::sort(Features, LessRecord());		llvm::sort(Features, LessRecord());

OS << "// ParseSubtargetFeatures - Parses features string setting specified\n"		OS << "// ParseSubtargetFeatures - Parses features string setting specified\n"
<< "// subtarget options.\n"		<< "// subtarget options.\n"
<< "void llvm::";		<< "void llvm::";
OS << Target;		OS << Target;
OS << "Subtarget::ParseSubtargetFeatures(StringRef CPU, StringRef TuneCPU, "		OS << "Subtarget::ParseSubtargetFeatures(StringRef CPU, StringRef TuneCPU, "
<< "StringRef FS) {\n"		<< "StringRef FS) {\n"
<< " LLVM_DEBUG(dbgs() << \"\\nFeatures:\" << FS);\n"		<< " LLVM_DEBUG(dbgs() << \"\\nFeatures:\" << FS);\n"
<< " LLVM_DEBUG(dbgs() << \"\\nCPU:\" << CPU);\n"		<< " LLVM_DEBUG(dbgs() << \"\\nCPU:\" << CPU);\n"
<< " LLVM_DEBUG(dbgs() << \"\\nTuneCPU:\" << TuneCPU << \"\\n\\n\");\n";		<< " LLVM_DEBUG(dbgs() << \"\\nTuneCPU:\" << TuneCPU << \"\\n\\n\");\n";

if (Features.empty()) {		OS << " InitMCProcessorInfo(CPU, TuneCPU, FS);\n";
OS << "}\n";
return;
}

OS << " InitMCProcessorInfo(CPU, TuneCPU, FS);\n"		if (!Features.empty())
<< " const FeatureBitset &Bits = getFeatureBits();\n";		OS << " const FeatureBitset &Bits = getFeatureBits();\n";
		Lint: Pre-merge checks: clang-format: please reformat the code.

for (Record *R : Features) {		for (Record *R : Features) {
// Next record		// Next record
StringRef Instance = R->getName();		StringRef Instance = R->getName();
StringRef Value = R->getValueAsString("Value");		StringRef Value = R->getValueAsString("Value");
StringRef Attribute = R->getValueAsString("Attribute");		StringRef Attribute = R->getValueAsString("Attribute");

if (Value=="true" || Value=="false")		if (Value=="true" || Value=="false")
OS << " if (Bits[" << Target << "::"		OS << " if (Bits[" << Target << "::"
<< Instance << "]) "		<< Instance << "]) "
<< Attribute << " = " << Value << ";\n";		<< Attribute << " = " << Value << ";\n";
else		else
OS << " if (Bits[" << Target << "::"		OS << " if (Bits[" << Target << "::"
<< Instance << "] && "		<< Instance << "] && "
<< Attribute << " < " << Value << ") "		<< Attribute << " < " << Value << ") "
<< Attribute << " = " << Value << ";\n";		<< Attribute << " = " << Value << ";\n";
}		}

OS << "}\n";		OS << "}\n";
		OS << "#endif // GET_SUBTARGETINFO_TARGET_DESC\n\n";

		// Attribute that has multiple related features
		std::set<StringRef> NonUniqueAttrs;
		// Attribute that is not boolean
		std::set<StringRef> NonBooleanAttrs;
		// Attribute and Description
		std::set<std::pair<StringRef,StringRef>> AttrDescs;
		Lint: Pre-merge checks: clang-format: please reformat the code - std::set<std::pair<StringRef,StringRef>> AttrDescs; + std::set<std::pair<StringRef, StringRef>> AttrDescs;
		// Attribute, Value and Description
		std::set<std::tuple<StringRef, StringRef, StringRef>> AttrValDescs;
		auto isBoolean = [](StringRef S) { return S == "true" || S == "false"; };
		for (Record *R : Features) {
		StringRef Attribute = R->getValueAsString("Attribute");
		craig.topper (inline comment): You can do if (!Attrs.insert(Attribute).second) continue; instead of calling find and insert.
		StringRef Value = R->getValueAsString("Value");
		if (!isBoolean(Value))
		NonBooleanAttrs.insert(Attribute);
		if(!AttrDescs.insert({Attribute,""}).second)
		Lint: Pre-merge checks: clang-format: please reformat the code - if(!AttrDescs.insert({Attribute,""}).second) + if (!AttrDescs.insert({Attribute, ""}).second)
		NonUniqueAttrs.insert(Attribute);
		}
		for (Record *R : Features) {
		// Whether we need to emit a trivial field for this feature
		bool HasTrivialField = R->getValueAsBit("TrivialField");
		// Whether we need to emit a trivial interface for this feature
		bool HasTrivialInterface = R->getValueAsBit("TrivialInterface");

		// Add description to field if only one feature sets it, otherwise
		// add description to interface
		StringRef Attribute = R->getValueAsString("Attribute");
		bool IsUniqueAttribute =
		NonUniqueAttrs.find(Attribute) == NonUniqueAttrs.end();
		StringRef Desc = R->getValueAsString("Desc");
		if (!HasTrivialField) {
		// No need to emit a trivial field
		AttrDescs.erase({Attribute,""});
		Lint: Pre-merge checks: clang-format: please reformat the code - AttrDescs.erase({Attribute,""}); + AttrDescs.erase({Attribute, ""});
		} else if (IsUniqueAttribute) {
		AttrDescs.erase({Attribute, ""});
		AttrDescs.insert({Attribute, Desc});
		}
		// No need to emit a trivial interface
		if (!HasTrivialInterface)
		continue;
		StringRef Value = R->getValueAsString("Value");
		if (IsUniqueAttribute)
		AttrValDescs.insert(std::make_tuple(Attribute, Value, ""));
		else
		AttrValDescs.insert(std::make_tuple(Attribute, Value, Desc));
		}
		OS << "\n#ifdef GET_SUBTARGETINFO_FEATURE_FIELD\n";
		OS << "#undef GET_SUBTARGETINFO_FEATURE_FIELD\n\n";
		// Print fields for features
		for (auto AttrDesc: AttrDescs) {
		Lint: Pre-merge checks: clang-format: please reformat the code - for (auto AttrDesc: AttrDescs) { + for (auto AttrDesc : AttrDescs) {
		StringRef Attribute = AttrDesc.first;
		StringRef Desc = AttrDesc.second;
		// Print comments for this feature
		if (!Desc.empty())
		OS << "/// " << Desc << "\n";
		if (NonBooleanAttrs.find(Attribute) == NonBooleanAttrs.end())
		OS << "bool ";
		else
		OS << "unsigned ";
		// Any trivial feature's field should be zero-initialized
		OS << Attribute << " = {};\n";
		}
		OS << "#endif // GET_SUBTARGETINFO_FEATURE_FIELD\n\n";

		OS << "\n#ifdef GET_SUBTARGETINFO_FEATURE_INTERFACE\n";
		// Print interfaces for features
		for (auto AttrValDesc : AttrValDescs) {
		StringRef Attribute;
		StringRef Value;
		StringRef Desc;
		std::tie(Attribute, Value, Desc) = AttrValDesc;
		// Print comments for this feature
		if (!Desc.empty())
		OS << "/// " << Desc << "\n";
		OS << "bool ";
		if (isBoolean(Value))
		OS << toLower(Attribute[0]) << Attribute.substr(1) << "() const { return "
		<< Attribute;
		else
		OS << "has" << Value << "() const { return " << Attribute
		<< " >= " << Value;
		OS << "; }\n";
		}
		OS << "#undef GET_SUBTARGETINFO_FEATURE_INTERFACE\n\n";
		OS << "#endif // GET_SUBTARGETINFO_FEATURE_INTERFACE\n\n";
}		}
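
The first loop in the new code classifies attributes that are set by more than one feature, using the `insert(...).second` idiom suggested in the inline review comment. A minimal sketch of that idiom, assuming plain `std::string` in place of LLVM's `StringRef` and a hypothetical `findNonUniqueAttrs` helper that is not part of the patch:

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// One (Attribute, Value) pair per SubtargetFeature record; a simplified
// stand-in for the Record objects that TableGen provides.
using FeatureList = std::vector<std::pair<std::string, std::string>>;

// Returns every attribute that is set by more than one feature.
// Hypothetical helper for illustration only.
inline std::set<std::string> findNonUniqueAttrs(const FeatureList &Features) {
  std::set<std::string> Seen;
  std::set<std::string> NonUnique;
  for (const auto &Feature : Features) {
    // std::set::insert returns {iterator, bool}; the bool is false when the
    // key was already present, so no separate find() call is needed.
    if (!Seen.insert(Feature.first).second)
      NonUnique.insert(Feature.first);
  }
  return NonUnique;
}
```

With features for `HasX87`, `X86SSELevel = SSE1` and `X86SSELevel = SSE2`, only `X86SSELevel` is reported as non-unique, which is the case where the patch attaches the description to the interface rather than to the field.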

void SubtargetEmitter::emitGenMCSubtargetInfo(raw_ostream &OS) {		void SubtargetEmitter::emitGenMCSubtargetInfo(raw_ostream &OS) {
OS << "namespace " << Target << "_MC {\n"		OS << "namespace " << Target << "_MC {\n"
<< "unsigned resolveVariantSchedClassImpl(unsigned SchedClass,\n"		<< "unsigned resolveVariantSchedClassImpl(unsigned SchedClass,\n"
<< " const MCInst MI, const MCInstrInfo MCII, unsigned CPUID) {\n";		<< " const MCInst MI, const MCInstrInfo MCII, unsigned CPUID) {\n";
emitSchedModelHelpersImpl(OS, /* OnlyExpandMCPredicates */ true);		emitSchedModelHelpersImpl(OS, /* OnlyExpandMCPredicates */ true);
OS << "}\n";		OS << "}\n";
▲ Show 20 Lines • Show All 109 Lines • ▼ Show 20 Lines	#endif
} else		} else
OS << "nullptr, nullptr, nullptr";		OS << "nullptr, nullptr, nullptr";
OS << ");\n}\n\n";		OS << ");\n}\n\n";

OS << "} // end namespace llvm\n\n";		OS << "} // end namespace llvm\n\n";

OS << "#endif // GET_SUBTARGETINFO_MC_DESC\n\n";		OS << "#endif // GET_SUBTARGETINFO_MC_DESC\n\n";

OS << "\n#ifdef GET_SUBTARGETINFO_TARGET_DESC\n";		ParseFeaturesFunction(OS);
OS << "#undef GET_SUBTARGETINFO_TARGET_DESC\n\n";

OS << "#include \"llvm/Support/Debug.h\"\n";
OS << "#include \"llvm/Support/raw_ostream.h\"\n\n";
ParseFeaturesFunction(OS, NumFeatures, NumProcs);

OS << "#endif // GET_SUBTARGETINFO_TARGET_DESC\n\n";

// Create a TargetSubtargetInfo subclass to hide the MC layer initialization.		// Create a TargetSubtargetInfo subclass to hide the MC layer initialization.
OS << "\n#ifdef GET_SUBTARGETINFO_HEADER\n";		OS << "\n#ifdef GET_SUBTARGETINFO_HEADER\n";
OS << "#undef GET_SUBTARGETINFO_HEADER\n\n";		OS << "#undef GET_SUBTARGETINFO_HEADER\n\n";

std::string ClassName = Target + "GenSubtargetInfo";		std::string ClassName = Target + "GenSubtargetInfo";
OS << "namespace llvm {\n";		OS << "namespace llvm {\n";
OS << "class DFAPacketizer;\n";		OS << "class DFAPacketizer;\n";
▲ Show 20 Lines • Show All 91 Lines • Show Last 20 Lines
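
The interface-emission loop in `ParseFeaturesFunction` derives each accessor from an (Attribute, Value) pair: a boolean value yields an accessor named after the attribute with its first letter lower-cased, and any other value yields `has<Value>()` comparing the attribute with `>=`. The sketch below reproduces that string-building rule with `std::string` instead of `StringRef` and `raw_ostream`; `emitTrivialInterface` and `isBooleanValue` are hypothetical names, and the description comment line is omitted.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Mirrors the patch's isBoolean lambda.
inline bool isBooleanValue(const std::string &S) {
  return S == "true" || S == "false";
}

// Builds the text of one trivial accessor. Attribute must be non-empty.
// Hypothetical helper for illustration only.
inline std::string emitTrivialInterface(const std::string &Attribute,
                                        const std::string &Value) {
  assert(!Attribute.empty() && "attribute name must not be empty");
  std::string Out = "bool ";
  if (isBooleanValue(Value)) {
    // HasX87 -> hasX87(): lower-case only the first character.
    Out += static_cast<char>(
        std::tolower(static_cast<unsigned char>(Attribute[0])));
    Out += Attribute.substr(1);
    Out += "() const { return " + Attribute;
  } else {
    // X86SSELevel with value SSE1 -> hasSSE1(), compared with >=.
    Out += "has" + Value + "() const { return " + Attribute + " >= " + Value;
  }
  Out += "; }\n";
  return Out;
}
```

For example, `emitTrivialInterface("HasX87", "true")` produces the `hasX87()` line quoted in the summary, and `emitTrivialInterface("X86SSELevel", "SSE1")` produces the `hasSSE1()` line.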