- remove FeatureCustomCheapAsMoveHandling: when target features that affect isAsCheapAsAMove can be given on the command line or passed via attributes, every sub-target effectively has custom handling
- remove special handling of FMOVD0/etc: FMOV with an immediate zero operand is never[1] more expensive than an FMOV with a register operand.
- remove special handling of COPY - copy is trivially as cheap as itself
- make the function default to the MachineInstr attribute isAsCheapAsAMove
- remove special handling of ANDWrr/etc and of ANDWri/etc: the fallback MachineInstr attribute is already non-zero.
- remove special handling of ADDWri/SUBWri/ADDXri/SUBXri: these always[1] have one-cycle latency with maximum (for the micro-architecture) throughput
- check if MOVi32Imm/MOVi64Imm can be expanded into a "cheap" sequence of instructions
There is a little twist in determining whether a MOVi32Imm/MOVi64Imm is "as-cheap-as-a-move". Even if one of these pseudo-instructions needs to be expanded to more than one MOVZ, MOVN, or MOVK instruction, materialisation may be preferable to allocating a register to hold the constant. For the moment, a cutoff at two instructions seems like a reasonable compromise.
[1] according to 19 software optimisation manuals
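The two-instruction cutoff can be illustrated with a minimal standalone sketch. This is not the code from the patch and not LLVM API: `countMovSequence` and `isCheapImmediate` are hypothetical names, and the sketch assumes only MOVZ/MOVN followed by MOVK are available, so it ignores other encodings (such as ORR with a logical immediate) that the real expansion may use.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper (not LLVM API): number of MOVZ/MOVN + MOVK instructions
// needed to materialise Imm in a register of BitSize (32 or 64) bits.
static unsigned countMovSequence(uint64_t Imm, unsigned BitSize) {
  assert(BitSize == 32 || BitSize == 64);
  unsigned NonZero = 0; // 16-bit chunks that differ from 0x0000
  unsigned NonOnes = 0; // 16-bit chunks that differ from 0xFFFF
  for (unsigned I = 0; I < BitSize / 16; ++I) {
    uint16_t Chunk = static_cast<uint16_t>(Imm >> (16 * I));
    if (Chunk != 0x0000)
      ++NonZero;
    if (Chunk != 0xFFFF)
      ++NonOnes;
  }
  // MOVZ writes one chunk and zeroes the rest; MOVN writes one chunk and sets
  // the rest to ones. Every remaining chunk that is still wrong costs a MOVK.
  // At least one instruction is always needed.
  return std::max(1u, std::min(NonZero, NonOnes));
}

// Assumed cutoff from the description above: at most two instructions.
static bool isCheapImmediate(uint64_t Imm, unsigned BitSize) {
  return countMovSequence(Imm, BitSize) <= 2;
}
```

Under these assumptions, `0x12345678` as a 32-bit immediate needs a MOVZ plus one MOVK and counts as cheap, while `0x0001000200030004` as a 64-bit immediate needs four instructions and does not.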
Event Timeline
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp | ||
---|---|---|
818 | Is it worth using AArch64_IMM::expandMOVImm with checking the Insns.size() <= 2? It might be a little slower, but more precise and should handle any canBeExpandedToORR / canBeExpandedToMOVZNK /anything else it learns about in the future. Does this need to be LLVM_ATTRIBUTE_ALWAYS_INLINE? Those kinds of decisions are usually best left to the optimizer. | |
823 | What do you mean by micro-architecture dependent? |
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp | ||
---|---|---|
823 | Existing comment, perhaps means use a sub-target hook, instead of a target hook. |