This adds extra scalar handling to isFMAFasterThanFMulAndFAdd, allowing the target-independent code to handle more folds in more situations (for example, if the fast-math flags are present but the global AllowFPOpFusion option isn't). It also splits HasSlowFPVFMx out of HasSlowFPVMLx, so that VFMA and VMLA can be controlled separately if needed.
Details
Diff Detail
Event Timeline
llvm/lib/Target/ARM/ARMISelLowering.cpp | ||
---|---|---|
15024 | Is there a way this logic can sit in Subtarget, so that it isn't duplicated as both a tablegen predicate and code here? I'm hopeless with our FP architectures... does FullFP16 imply VFP4? |
llvm/lib/Target/ARM/ARMISelLowering.cpp | ||
---|---|---|
15024 | Yeah, that sounds good. I'll try and move it around. I'm pretty sure FullFP16 implies fp-armv8, so at least VFP4. |
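To make the suggestion concrete, here is a minimal, self-contained sketch of what keeping the check in one subtarget-level place might look like. It is not the code from this patch: `MockSubtarget`, its fields, the `useFPVFMx*` helpers, `ScalarFP` and `isFMAFasterModel` are all hypothetical names chosen for illustration, and the exact feature conditions are assumptions based only on the discussion above (VFP4 as the baseline for fused multiply-add, FullFP16 for half precision, and a separate "slow VFMx" tuning flag).

```cpp
// Hypothetical stand-in for the subtarget feature bits discussed in the review.
// None of these names are taken from the actual patch.
struct MockSubtarget {
  bool HasVFP4 = false;     // assumed baseline for scalar fused multiply-add
  bool HasFullFP16 = false; // assumed requirement for f16 arithmetic
  bool HasFP64 = false;     // assumed requirement for f64 arithmetic
  bool SlowFPVFMx = false;  // tuning flag: fused ops are slow on this core

  // One place that answers "should fused multiply-add be used?", so that a
  // tablegen predicate and the lowering code could both refer to it.
  bool useFPVFMx() const { return HasVFP4 && !SlowFPVFMx; }
  bool useFPVFMx16() const { return useFPVFMx() && HasFullFP16; }
  bool useFPVFMx64() const { return useFPVFMx() && HasFP64; }
};

// Simplified model of the scalar types the hook would be asked about.
enum class ScalarFP { F16, F32, F64 };

// Model of the scalar part of the hook: defer entirely to the subtarget.
bool isFMAFasterModel(const MockSubtarget &ST, ScalarFP Ty) {
  switch (Ty) {
  case ScalarFP::F16:
    return ST.useFPVFMx16();
  case ScalarFP::F32:
    return ST.useFPVFMx();
  case ScalarFP::F64:
    return ST.useFPVFMx64();
  }
  return false; // unknown type: be conservative
}
```

In this sketch the slow-VFMx flag is independent of any VMLA/VMLS tuning, which mirrors the intent of splitting the two features: setting `SlowFPVFMx` turns off fusion for every scalar type without touching anything else.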