This is an archive of the discontinued LLVM Phabricator instance.

AArch64: Do not test for CPUs, use SubtargetFeatures
ClosedPublic

Authored by MatzeB on May 27 2016, 6:09 PM.

Details

Summary

Testing for specific CPUs has a number of problems, better use subtarget
features:

  • When some tweak is added for a specific CPU it is often desirable for the next version of that CPU as well, yet we often forget to add it.
  • It is hard to keep track of checks scattered around the target code; Declaring all target specifics together with the CPU in the tablegen file is a clear representation.
  • Subtarget features can be tweaked from the command line.

To discourage people from using CPU checks in the future I removed the
isCortexXX(), isCyclone(), ... functions. I added an getProcFamily()
function for exceptional circumstances but made it clear in the comment
that usage is discouraged.

Reformat feature list in AArch64.td to have 1 feature per line in
alphabetical order to simplify merging and sorting for out of tree
tweaks.

No functional change intended.

Diff Detail

Repository
rL LLVM

Event Timeline

MatzeB updated this revision to Diff 58874.May 27 2016, 6:09 PM
MatzeB retitled this revision from to AArch64: Do not test for CPUs, use SubtargetFeatures.
MatzeB updated this object.
MatzeB set the repository for this revision to rL LLVM.
MatzeB added subscribers: llvm-commits, anemet.
rovka added a subscriber: rovka.May 30 2016, 8:41 AM

LGTM, but you should wait for someone more versed to approve.

lib/Target/AArch64/AArch64.td
72

typo: Predictable

lib/Target/AArch64/AArch64ISelLowering.cpp
8813

You could make this comment sound more generic (e.g. use "Some subtargets" instead of "Cyclone").

lib/Target/AArch64/AArch64InstrInfo.cpp
1524

I think you can remove or rephrase this comment.

1804

I think you can remove/rephrase this comment as well.

rengolin edited edge metadata.May 30 2016, 11:16 AM

Hi Matthias,

I think this is a really good change to make, but I have some comments in addition to Diana...

cheers,
--renato

lib/Target/AArch64/AArch64.td
76

Can't this be a property?

102

This sounds like an odd one.

lib/Target/AArch64/AArch64Subtarget.cpp
153

What does load balancing FP have to do with PBQP?

Also, this moves from just A57 to both A57 and A53.

lib/Target/AArch64/AArch64Subtarget.h
36

I'm not convinced that this will discourage people to avoid CPU checking more than having the isCPU methods... But it also won't encourage.

49

Since we only have one, maybe try to not use the CPU name in it? Otherwise, we'd encourage things like "CycloneLike", which aren't much of an advantage of isCyclone.

Arnaud, please check the PBQP constraint, if it makes sense...

MatzeB updated this revision to Diff 59118.May 31 2016, 1:24 PM
MatzeB edited edge metadata.
MatzeB marked 3 inline comments as done.

Thanks for the reviews and irc discussions!
Some of the subtarget features I felt indeed strange because they were adjusting some cost settings rather than just enabling/disabling a tweak. We lack a good mechanism to specify those in tablegen today so instead I opted to move all of those into a newly introduced initializeProperties() function.

We should support setting non-boolean properties based on the CPU with a tablegen feature. However that is material for a future patch.

rengolin accepted this revision.Jun 1 2016, 3:46 AM
rengolin edited edge metadata.

Thanks Matthias, LGTM!

lib/Target/AArch64/AArch64Subtarget.cpp
63

Maybe a FIXME line saying this should also be table-gen'd. (your future patch)

This revision is now accepted and ready to land.Jun 1 2016, 3:46 AM
mssimpso added inline comments.
lib/Target/AArch64/AArch64Subtarget.cpp
80

Hi Matthias,

I haven't yet looked at this patch in much detail. But would you mind double checking the Exynos properties before committing this change? I didn't think Exynos and Kryo shared these value. Shouldn't MaxInterleaveFactor = 2 and VectorInsertExtractBaseCost = 3 for Exynos? The Kryo numbers look correct. Perhaps there's a missing "break" here?

MatzeB marked 2 inline comments as done.Jun 1 2016, 11:16 AM
MatzeB added inline comments.
lib/Target/AArch64/AArch64Subtarget.cpp
80

Indeed there are too many fallthroughs happening here, I'll double check this switch and update the diff here.

MatzeB updated this revision to Diff 59283.Jun 1 2016, 2:45 PM
MatzeB edited edge metadata.

Remove unintented switch fallthroughs in initializeProperties().

I'll wait at least for @mssimpso and @aadg approval before committing this.

mssimpso added inline comments.Jun 2 2016, 6:18 AM
lib/Target/AArch64/AArch64Subtarget.cpp
80

Hi Matthias,

The switch looks good to me now. Thanks.

aadg accepted this revision.Jun 2 2016, 10:45 AM
aadg edited edge metadata.

This looks OK to me from a PBQP perspective (see comment).

lib/Target/AArch64/AArch64Subtarget.cpp
153

FP load balancing has to be done with a specific pass when using the GreedyRegAlloc. PBQPRegAlloc can do it right away (no need for a separate pass), but the allocator need to learn the FP load balancing constraint it has to solve (on top of the "standard" constraints).

Enabling it for both A57 and A53 is I think the right thing to do.

This revision was automatically updated to reflect the committed changes.
evandro edited edge metadata.Jun 9 2016, 8:20 AM

Somewhat an RFC, but perhaps ARMProcFamilyEnum values should be defined as masks so that, when checking for multiple families, one can write:
Subtarget.getProcFamily() & (AArch64Subtarget::CortexA57 | AArch64Subtarget::ExynosM1)

Instead of:
Subtarget.getProcFamily() == AArch64Subtarget::CortexA57 || Subtarget.getProcFamily() == AArch64Subtarget::ExynosM1

Thoughts?

Somewhat an RFC, but perhaps ARMProcFamilyEnum values should be defined as masks so that, when checking for multiple families,

Hi Evandro,

That'd be a good idea if we were trying to encourage people to use them, but we're not.

They're just there because not all could be easily moved into features, but they should, eventually. And all new such decisions should be made into features, so we should also avoid usage of that function like the plague. :)

cheers,
--renato

Oi, Renato.

I'm for discouraging the use of Subtarget.getProcFamily(), but I don't think that it's realistic to suppose that every instance should be replaced with a feature. The only use for now can hardly be folded into a feature. How would you envision such minutiae as features?

Obrigado.

I'm for discouraging the use of Subtarget.getProcFamily(), but I don't think that it's realistic to suppose that every instance should be replaced with a feature. The only use for now can hardly be folded into a feature. How would you envision such minutiae as features?

I don't think we'll be able to (or if it's even desirable that we do), but continue discouraging its usage (by making it awkward) will move the balance towards more features, less CPU names. I think that's all we're trying to do.

Makes sense?

Flw,
--renato

MatzeB added a comment.Jun 9 2016, 1:07 PM

Oi, Renato.

I'm for discouraging the use of Subtarget.getProcFamily(), but I don't think that it's realistic to suppose that every instance should be replaced with a feature. The only use for now can hardly be folded into a feature. How would you envision such minutiae as features?

Obrigado.

I just wanted to stress again that we really really should try to create subtarget features as much as possible (or alternatively initialize properties in initializeProperties()). The last remaining usage of getProcFamily() in isAsCheapAsAMove() smells bad to me. I just don't have a good alternative yet. It would be nice to express things there in terms of the scheduling model, but that needs some more investigation in how to that exactly.

Not everything is visible at initializeProperties(). For example, were a target to set its own function alignment, setPrefFunctionAlignment() cannot be invoked outside of the TargetLowering hierarchy.

MatzeB added a comment.Jun 9 2016, 1:57 PM

Not everything is visible at initializeProperties(). For example, were a target to set its own function alignment, setPrefFunctionAlignment() cannot be invoked outside of the TargetLowering hierarchy.

You could still add an unsigned PrefFunctionAlignment; field in SubtargetInfo and then do setPrefFunctionAlignment(Subtarget.getPrefFunctionAlignment());` in the TargetLowering constructor. We already use this tactic for some properties of TargetTransformInfo (CacheLineSize, PrefetchDistance, ...).

It does require adding an extra indirection but I believe it is worth it: Otherwise when you add a new CPU how will you find all the places relevant to your new CPU? What if you want to experiment with code generation pre silicon in simulators (say the next version of your CPU pretty much behaves like the previous one but has no problem with misaligned functions anymore). I also liked the fact how adding a subtarget feature forced you to write a little description...

OK, let me hack a patch and take the discussion to a concrete case and let's see what can be sensibly done there.

Thanks for your patience.