This is an archive of the discontinued LLVM Phabricator instance.

[ARM][NFCI] Do not fuse VADD and VMUL, continued (1/2)
ClosedPublic

Authored by SjoerdMeijer on Oct 16 2018, 1:56 AM.

Details

Summary

This is a follow-up to rL342874, which stopped fusing muls and adds into VMLAs
for performance reasons on the Cortex-M4 and Cortex-M33. This is a series of 2
patches that tries to achieve the same for VFMA. The second column in the
table below shows what we were generating before rL342874, the third column
what changed with rL342874, and the last column what we want with these 2
patches:


| Opt   |  < rL342874   |  >= rL342874   |  these 2 patches  |
|------------------------------------------------------------|
|-O3    |     vmla      |      vmul      |       vmul        |
|       |               |      vadd      |       vadd        |
|------------------------------------------------------------|
|-Ofast |     vfma      |      vfma      |       vmul        |
|       |               |                |       vadd        |
|------------------------------------------------------------|
|-Oz    |     vmla      |      vmla      |       vmla        |
|------------------------------------------------------------|
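For readers who want to see the source pattern behind the table, here is a minimal C sketch of a multiply feeding an add. The function names `mul_add` and `dot` are hypothetical and are not taken from the patch or its regression test. Which instructions actually come out depends on the target CPU, the optimization level, and the floating-point contraction settings, so treat this only as an illustration of the input shape.

```c
/* Hypothetical example, not part of the patch or its tests. */

/* A multiply whose result feeds an add: the shape that instruction
 * selection can turn into a single VMLA/VFMA, or keep as a separate
 * VMUL followed by a VADD. */
float mul_add(float acc, float a, float b) {
    return acc + a * b;
}

/* A dot product is a chain of the same multiply-then-add pattern. */
float dot(const float *x, const float *y, int n) {
    float acc = 0.0f;
    for (int i = 0; i < n; ++i)
        acc = mul_add(acc, x[i], y[i]);
    return acc;
}
```

Note that VFMA is a fused operation with a single rounding step, so a compiler only forms it when floating-point contraction is permitted, which is consistent with it appearing in the -Ofast row of the table.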

This patch, 1/2, is a cleanup of the spaghetti predicate logic on the different
VMLA and VFMA codegen rules, so that we can make the final functional change in
patch 2/2 in D53315. It also fixes a typo in the regression test added in rL342874.

Diff Detail

Repository
rL LLVM

Event Timeline

SjoerdMeijer created this revision. Oct 16 2018, 1:56 AM
SjoerdMeijer edited the summary of this revision. Oct 16 2018, 1:56 AM
SjoerdMeijer edited the summary of this revision. Oct 16 2018, 2:01 AM
samparker added inline comments. Oct 16 2018, 2:14 AM
lib/Target/ARM/ARMInstrInfo.td
363 ↗(On Diff #169793)

I think moving the VFP4 check into the useFPVMLx method would help make this easier to read.

Thanks for the suggestion. I had a look, and it turns out we don't need it at all, because the VFP checks and predicates are already on the rules. Thus we can simplify the UseFPVMLx predicate even further by removing the VFP check from it.

samparker accepted this revision. Oct 16 2018, 5:55 AM

Great, LGTM

This revision is now accepted and ready to land. Oct 16 2018, 5:55 AM
nhaehnle removed a subscriber: nhaehnle. Oct 16 2018, 8:18 AM
This revision was automatically updated to reflect the committed changes.