This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Check correct instructions for load/store rescheduling.
ClosedPublic

Authored by efriedma on Feb 24 2017, 6:26 PM.

Details

Summary

See also https://reviews.llvm.org/D30124. I don't think this can actually cause a miscompile, but it's not obvious.

This code starts from the high end of the sorted vector of offsets, and works backwards: it tries to find contiguous offsets, process them, then pops them from the end of the vector. Most of the code agrees with this order of processing, but one loop doesn't: it instead processes elements from the low end of the vector (which are nodes with unrelated offsets). Fix that loop to process the correct elements.

This has a few implications. One, we don't incorrectly return early when processing multiple groups of offsets in the same block (which allows rescheduling prera-ldst-insertpt.mir). Two, we pick the correct insert point for loads, so they're correctly sorted (which affects the scheduling of vldm-liveness.ll). I think it might also impact some of the heuristics slightly.

Diff Detail

Repository
rL LLVM

Event Timeline

efriedma created this revision.Feb 24 2017, 6:26 PM
rengolin edited edge metadata.Feb 28 2017, 6:21 AM

This change doesn't look obvious to me. Can you add a short description of what's happening here and why this is the fix?

efriedma edited the summary of this revision. (Show Details)Feb 28 2017, 11:25 AM

Sorry; added a better description to the summary. Quoting for the mailing list:

See also https://reviews.llvm.org/D30124. I don't think this can actually cause a miscompile, but it's not obvious.

This code starts from the high end of the sorted vector of offsets, and works backwards: it tries to find contiguous offsets, process them, then pops them from the end of the vector. Most of the code agrees with this order of processing, but one loop doesn't: it instead processes elements from the low end of the vector (which are nodes with unrelated offsets). Fix that loop to process the correct elements.

This has a few implications. One, we don't incorrectly return early when processing multiple groups of offsets in the same block (which allows rescheduling prera-ldst-insertpt.mir). Two, we pick the correct insert point for loads, so they're correctly sorted (which affects the scheduling of vldm-liveness.ll). I think it might also impact some of the heuristics slightly.

rengolin accepted this revision.Feb 28 2017, 12:59 PM

Thanks Eli, that makes more sense. A single nit, if you could move the check lines right before/after each test, so that we can easily compare them. Otherwise, LGTM. Thanks!

This revision is now accepted and ready to land.Feb 28 2017, 12:59 PM
This revision was automatically updated to reflect the committed changes.