This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] add mem ref to unscaled pairs
ClosedPublic

Authored by sebpop on Feb 10 2016, 2:06 PM.

Details

Summary

This matches the code used to create mem refs for scaled pairs.
We need this change to enable creation of more pairs: we used to return true
when checking for hasOrderedMemoryRef() in the check for alias analysis.

Patch by Sebastian Pop and Abderrazek Zaafrani.

Diff Detail

Event Timeline

sebpop updated this revision to Diff 47518.Feb 10 2016, 2:06 PM
sebpop retitled this revision from to [AArch64] add mem ref to unscaled pairs.
sebpop updated this object.
sebpop added reviewers: rengolin, kristof.beyls.
sebpop added a subscriber: llvm-commits.
az added a subscriber: az.Feb 10 2016, 2:12 PM
sebpop updated this revision to Diff 47535.Feb 10 2016, 3:08 PM

Update patch to apply to current trunk.

Thanks for working on this. To be clear, this change does not increase the number of paired instructions. Adding the MMOs allows the MI scheduler to reason about dependencies and in turn allows more scheduling freedom.

Do you have any performance results?

Chad

Also, please reference the PR that was filed recently for this change.

Actually this change does add new ldp/stp pairs as it enables the mayAlias analysis to reason about the memref.
Of course you need the next patch http://revioews.llvm.org/D17098 to make the dependence analysis functional.
Thus the testcase for this change comes with that next patch.

For the PR you mention, do you mean https://llvm.org/bugs/show_bug.cgi?id=26358
?

Actually this change does add new ldp/stp pairs as it enables the mayAlias analysis to reason about the memref.
Of course you need the next patch http://revioews.llvm.org/D17098 to make the dependence analysis functional.
Thus the testcase for this change comes with that next patch.

Ah, correct. With D17098 it will create more paired instructions.

For the PR you mention, do you mean https://llvm.org/bugs/show_bug.cgi?id=26358
?

Yes. From the PR "Then mergePairedInsns() in the AArch64 load store optimizer needs to be fixed so that the pair instructions include their MMOs."

az added a comment.Feb 11 2016, 8:17 AM

Do you have any performance results?

For the performance testing, our local benchmark shows that this patch improved performance for 7 tests. The improvement ranges between 0.5% and 3%. We also have few noises (very small improvement or degradation).

mcrosier accepted this revision.Feb 11 2016, 8:41 AM
mcrosier added a reviewer: mcrosier.
This revision is now accepted and ready to land.Feb 11 2016, 8:41 AM
In D17097#350017, @az wrote:

Do you have any performance results?

For the performance testing, our local benchmark shows that this patch improved performance for 7 tests. The improvement ranges between 0.5% and 3%. We also have few noises (very small improvement or degradation).

To be clear, we tested both D17097 and D17098 together. I don't think just D17097 by itself would have this impact.
With D17098 we have seen several places where the MI scheduler was able to move insns past ldp/stp, as can be seen in the changes we had to do to the tests.

Can we have a test?

This is fairly difficult to test, but one possibility would be to dump the dependency information and show that there are fewer dependencies in the MI scheduler after this change..

gberry added a subscriber: gberry.Feb 11 2016, 12:16 PM

Another possibility would be to dump the MI after AArch64LoadStoreOptimizer and check for the MMOs directly in the dump text.

Can we have a test?

The testcase added in D17098 contains two functions @st2 and @st4 that would not contain two stp without the current change.

Can we have a test?

The testcase added in D17098 contains two functions @st2 and @st4 that would not contain two stp without the current change.

I'd much prefer these type of tests over grepping debug output. If @t.p.northover doesn't object, I'm fine with landing this patch as is and include the test cases in the followup patch.

t.p.northover accepted this revision.Feb 18 2016, 2:16 PM
t.p.northover added a reviewer: t.p.northover.

I prefer that kind of test too. Seems reasonable to pick up the tests with D17098.

mcrosier closed this revision.Mar 8 2016, 9:22 AM

Committed in r262942.