Add option for some cores which are better to use csel instead of csinv or csinc.
Diff Detail
Event Timeline
lib/Target/AArch64/AArch64Subtarget.cpp | ||
---|---|---|
72–74 ↗ | (On Diff #89749) | This is unnecessary, the target feature will already set it. |
lib/Target/AArch64/AArch64.td | ||
---|---|---|
54 | Can't this be mapped in the cost model? |
If the core doesn't support CSEL, let's mark it as unavailable. If it does, but the cost is bad (for whatever reason), this really really should be in the cost model...
lib/Target/AArch64/AArch64InstrInfo.td | ||
---|---|---|
46 | Wait, "prefer" or "don't use"? This is confusing... |
Thanks for comment Renato.
This patch was inspired by below ARM's code. May I change this? What is your suggestion for the name "Don't use".?
// Some targets (e.g. Cortex-A9) prefer VMOVSR to VMOVDRR even when using NEON // for scalar FP, as this allows more effective execution domain optimization. def FeaturePreferVMOVSR : SubtargetFeature<"prefer-vmovsr", "PreferVMOVSR", "true", "Prefer VMOVSR">;
def UseVMOVSR : Predicate<"Subtarget->preferVMOVSR() ||" "!Subtarget->useNEONForSinglePrecisionFP()">; def DontUseVMOVSR : Predicate<"!Subtarget->preferVMOVSR() &&" "Subtarget->useNEONForSinglePrecisionFP()">;
Hi Renato,
If the core doesn't support CSEL, let's mark it as unavailable. If it does, but the cost is bad (for whatever reason), this really really should be in the cost model...
I have a bit more background on this as Junmo has talked to me offline. The core *does* support CSINV and friends, but they aren't as efficient as CSEL. Therefore, Junmo would like to disable the patterns that generate CSINV and friends (this could increase register pressure, but for his target the payoff is great enough).
James
I thought as much, but that's a bad design and shouldn't repeat.
We had similar discussions with the Samsung and Code Aurora people and we managed to work around all the issues. Shouldn't be too hard to do the same here.
We have actually removed a lot of code that was doing similar things, so I don't think we should add it back again.
cheers,
-renato
Hi Renato,
If it does, but the cost is bad (for whatever reason), this really really should be in the cost model...
For my own understanding - how does the cost model impact the choices ISel makes? I didn't think it could. Or are you saying *in addition to* disabling the relevant Pat<>s with a feature, this should *also* be reflected in the cost model?
James
After internal(Samsung only) discussion, revert patch about Exynos-M3 part.
Renato, before we add new feature for core, we use the code likes
IsExynosM1 or IsCortexA57 ...
But after discussion, we move to use Feature things instead of above code.
I am not sure that cost model can cover all the things what uArchitecture's all the specific details.
This is correct.
I am not sure that cost model can cover all the things what uArchitecture's all the specific details.
I have to agree with you and James. We have had discussions about this in the past and in some cases, it's possible, in others, it's not.
This is because the current ISel has some thresholds which can be used as very rudimentary cost models, but unless we have a way to do this generically, any work around will be as hacky as a feature.
However, this patch goes one extra step: It changes the table-gen definitions to *forbid* CSEL based on a feature called "PreferCSEL". This is confusing and a misrepresentation of what the flag actually means.
IIRC, other flags have C++ code in the style:
if (FeaturePreferFoo) emitFOO(); else emitBAR();
which, for ISel purposes, it's the same effect as your table-gen solution. However, this is not true for the assembler. CSEL is valid on that core, and if I try to assemble files with it in this new table-gen, it will bail saying that the instruction requires "DontUseCSEL", which in itself is quite opaque and meaningless to the user.
So, using the flag as a dynamic choice during ISel only should be fine for the time being, but not adding it to the table-gen as a hard requirement.
Makes sense?
--renato
Thanks for your opinion & example Renato.
When I asked about this issue to James, actually he suggested to use term "SlowCSINV".
In ARMInstrInfo.td, there is definition use "DontUse" or "Use" even if they support instructions as I mentioned before.
def UseVMOVSR : Predicate<"Subtarget->preferVMOVSR() ||" "!Subtarget->useNEONForSinglePrecisionFP()">; def DontUseVMOVSR : Predicate<"!Subtarget->preferVMOVSR() &&" "Subtarget->useNEONForSinglePrecisionFP()">;
So, are they also changed to "HasFastVMOVSR" and "HasSlowVMOVSR" ?
And
def DontUseCSEL : Predicate<"!Subtarget->preferCSEL()">;
And Is this ok that change the definition from DontUseCSEL to HasFastCSEL?
lib/Target/AArch64/AArch64InstrInfo.td | ||
---|---|---|
46 | I think that the proposed name change would make this closer to the intended meaning: def HasFastCSINCV : Predicate<"!Subtarget->HasSlowCSINCV()">; | |
1137–1138 | With the name change I'm proposing, I think this reads more natural and is easier to understand: def : Pat<(AArch64csel (i32 0), (i32 1), (i32 imm:$cc), NZCV), (CSINCWr WZR, WZR, (i32 imm:$cc))>, Requires<[HasFastCSINCV]>; | |
lib/Target/AArch64/AArch64Subtarget.h | ||
79–80 | Sorry to be late in my review feedback here. Wouldn't "HasSlowCSINCV" be more natural and make the intent a tiny bit easier to understand? e.g. // If true, CSINV or CSINC will not be generated, using CSEL instead. bool HasSlowCSINCV = false; |
@evandro
This is code review about the case "Some targets using CSEL is more prefer than CSINV, CSINC.".
And there were good discussions which way is better & correct for handling the issue. As you said, there is no specific target for this feature,now.
I just wanted to get a confirm from other members, How to handle when we want to do this.
@kristof.beyls
Thanks for comments. Your suggestion looks better.
There may be more cases resulting in CSET that perhaps should remain CSEL. In order to deal with all such situations, instead of just these matched by these Pats, It seems to me that TargetLowering would be more flexible, whatever the reasons a target has for this.
A few other things missing:
- A core where this is set by default? Otherwise, why do we have this?
- Tests. Regardless of cores, tests with and without that feature must exist.
cheers,
--renato
lib/Target/AArch64/AArch64InstrInfo.td | ||
---|---|---|
1137–1138 | Yup. | |
lib/Target/AArch64/AArch64Subtarget.h | ||
79–80 | I agree with Kristof here. |
Can't this be mapped in the cost model?