This is patch 2/2, following up on D53314, to prevent fusing of mul + add sequences.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
test/CodeGen/ARM/fusedMAC.ll | ||
---|---|---|
7 ↗ | (On Diff #169796) | should the f64 version not also be tested? |
should the f64 version not also be tested?
Both the M4 and M33 have only Single-Precision VFP support. Thus f64 operations result in EABI calls and so I don't think there's much to test here. I have cleaned up the RUN line in the test case though as we only need the -mcpu=cortex-m33 part and not the other attributes.
Good point. Would it be worth adding a test for the M7 though? We seem to be a little lacking in our m-class FP tests.
test/CodeGen/ARM/fusedMAC.ll | ||
---|---|---|
2 ↗ | (On Diff #169819) | this is a funny looking triple for the M33 |
Would it be worth adding a test for the M7 though? We seem to be a little lacking in our m-class FP tests.
Yep, agreed. Added tests for the M7 and M4.
this is a funny looking triple for the M33
Oops, copy-paste, but now fixed.
Cheers
With one bonus question, are the fused operations fast on the M7..?
:-)
From a first glance, it looks like we should give the M7 a similar treatment.
But I will double check, also run benchmarks, and will follow-up if necessary (which, again, looks very likely).