Page MenuHomePhabricator

[DAGCombine] Improve Load-Store Forwarding

Authored by niravd on Jul 11 2018, 12:47 PM.



Extend analysis forwarding loads from preceeding stores to work with
extended loads and truncated stores to the same address so long as the
load is fully subsumed by the store.

Hexagon's swp-epilog-phis.ll and swp-memrefs-epilog1.ll test are
deleted as they've no longer seem to be relevant.

Diff Detail


Event Timeline

niravd created this revision.Jul 11 2018, 12:47 PM
rnk added inline comments.Aug 15 2018, 1:27 PM
12377–12382 ↗(On Diff #155047)

This whole if looks like it should be its own helper, and even then it could be broken up more.

12382–12385 ↗(On Diff #155047)

This use of auto is not consistent with LLVM's coding standards:

12417 ↗(On Diff #155047)

TODO" seems like a typo

12421–12424 ↗(On Diff #155047)

These conditionals look like they can be simplified.

12449–12450 ↗(On Diff #155047)

This switch could be its own helper function, extendingLoadToExtension, or something like that. Putting switches into helpers is nice since it can save you a local variable and saves the break; line in each case.

niravd updated this revision to Diff 162252.Aug 23 2018, 12:52 PM

Resolve comments.

RKSimon added inline comments.Sep 14 2018, 3:18 AM
12622 ↗(On Diff #162252)

Not sure if this is that useful (especially as similar helpers typically return 1 for scalars)?

12626 ↗(On Diff #162252)

Should you update Val even if you return false?

12661 ↗(On Diff #162252)

I think you need breaks here, else add LLVM_FALLTHROUGHs

niravd updated this revision to Diff 165521.Sep 14 2018, 8:58 AM
niravd marked an inline comment as done.

Resolve comments

niravd added inline comments.Sep 14 2018, 9:04 AM
12622 ↗(On Diff #162252)

I have no strong opinions on this. The only reason for the zero is 0 is to distinguish a singleton vector and it's element type. It's saves having to check if compared types are both isVector or not isVector.

12626 ↗(On Diff #162252)

We can either assign Val when we consider the store, or before when we pass it in. I have a mild preference for this way as it avoid having to check that Val is ST->getValue(). We don't need to use Val afterwards if it's false; we can replace the continue on 12724 with a "return SDValue();", but I think it's a bit cleaner looking this way.

12661 ↗(On Diff #162252)

Right. Thanks.

niravd added a comment.Oct 9 2018, 8:20 AM

Ping again.

RKSimon added inline comments.Oct 9 2018, 9:12 AM
12622 ↗(On Diff #162252)

How about numVectorEltsOrZero ?

niravd updated this revision to Diff 168840.Oct 9 2018, 11:31 AM

Rebase to tip and address rksimon's renaming suggestion.

niravd marked an inline comment as done.Oct 9 2018, 11:32 AM
rnk accepted this revision.Oct 9 2018, 2:03 PM

I think it looks good. I think this got lost in the gulf of responsibility between multiple reviewers.

This revision is now accepted and ready to land.Oct 9 2018, 2:03 PM
This revision was automatically updated to reflect the committed changes.
This comment was removed by evandro.

This commit (found via git bisect) looks like it is the cause of which is derived from the Linux kernel and is a regression from LLVM 7.0. We get an internal fault in instruction selection for ARM state for pre ARM-v7 architectures (test uses armv4t but it is reproducible on v5t and v6).

A test case and details of how to reproduce can be found in the PR. It is possible that this has exposed a limitation in the Arm backend or the optimisation may not be suitable for it. Would you be able to take a look?