@paulwalker-arm Here is a hastily prepared patch for the problem we discussed earlier today. I'm a little out of the VLS loop, so perhaps there is a better way to handle this...
For VLS lowering, the DAGCombiner does not match fixed-width vector FMAs at CodeGenOpt::Aggressive, because at that optimization level fixed-width FMA matching is deferred to the MachineCombiner. However, by the time the MachineCombiner runs, the fixed-width vectors have already been lowered to scalable vectors, so no FMAs are generated at all. This patch corrects that by allowing the DAGCombiner to match FMAs when we are using VLS.
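To make the intended decision concrete, here is a rough stand-alone sketch of the logic described above. It is a simplified model, not the code in this patch: `OptLevel`, `deferFMAToMachineCombiner`, and the `UsesVLS` flag are hypothetical names invented for illustration and do not correspond to LLVM's actual interfaces.

```cpp
// Hypothetical model of the decision; these names are NOT LLVM APIs.
enum class OptLevel { None, Less, Default, Aggressive };

// Returns true when FMA formation should be left to the MachineCombiner,
// and false when the DAGCombiner should match FMAs itself.
//
// UsesVLS is an assumed flag meaning "fixed-width vectors will be lowered
// to scalable vectors". In that case the MachineCombiner only ever sees
// scalable vectors, so deferring would lose the FMA.
constexpr bool deferFMAToMachineCombiner(OptLevel Level, bool UsesVLS) {
  return Level >= OptLevel::Aggressive && !UsesVLS;
}
```

Under this model, the only case that changes is Aggressive with VLS: matching moves back to the DAGCombiner, while every other combination behaves as before.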
clang-format: please reformat the code