This is an archive of the discontinued LLVM Phabricator instance.

WIP: Make MergeConsecutiveStores look at other stores on same chain
ClosedPublic

Authored by arsenm on Jul 13 2015, 4:35 PM.

Details

Reviewers
hfinkel
Summary

When combiner AA is enabled, look at stores on the same chain.
Non-aliasing stores are moved to the same chain so the existing
code fails because it expects to find an adajcent store on a consecutive
chain.

There are some ordering issues by adding this so the optimal store merge
is not found, and I'm not sure how best to solve them.

Because of how DAGCombiner tries these store combines,
MergeConsecutiveStores doesn't see the correct set of stores on the chain
when it visits the other stores. Each store individually has its chain
fixed before trying to merge consecutive stores, and then tries to merge
stores from that point before the other stores have been processed to
have their chains fixed.

Suppose you have 4 32-bit stores that should be merged into 1 vector
store. One store is visited first, fixing the chain. What happens is
because not all of the store chains have yet been fixed, 2 of the stores
are merged. The other 2 stores later have their chains fixed,
but because the other stores were already merged, they have different
memory types and merging the two different sized stores is not
supported and would be more difficult to handle.

I'm able to get what I want to happen by only performing
MergeConsecutiveStores on the legalized DAG, at which point
all of the nonaliasing stores have had their chains fixed. However,
it is also desirable to have MergeConsecutivStores process stores of illegal
types like it does now, so I'm not sure how to best ensure all stores
have their chains fixed before store merging happens. Should
FindBetterChain search through consecutive chains and try to handle
other consecutive stores at the same time? Should there be some DAG
preprocess step to fix all store chains or something else?

Diff Detail

Event Timeline

arsenm updated this revision to Diff 29624.Jul 13 2015, 4:35 PM
arsenm retitled this revision from to WIP: Make MergeConsecutiveStores look at other stores on same chain.
arsenm updated this object.
arsenm added a reviewer: hfinkel.
arsenm added a subscriber: llvm-commits.
hfinkel edited edge metadata.Jul 26 2015, 6:30 AM

Looking for stores sharing the same chain for merging seems like the right thing to do. This is what you should get from memcpy legalization, etc. anyway.

Shouldn't, however, this be part of the visitor loop below that walks the chain instead of only searching the top (or bottom, depending on how you look at it) store?

lib/CodeGen/SelectionDAG/DAGCombiner.cpp
10795

Why? Expanding memcpy, etc. also creates these kinds of parallel stores.

Looking for stores sharing the same chain for merging seems like the right thing to do. This is what you should get from memcpy legalization, etc. anyway.

Shouldn't, however, this be part of the visitor loop below that walks the chain instead of only searching the top (or bottom, depending on how you look at it) store?

The idea was if we can rely on FindBetterChain canonicalizing the chain layout, we wouldn't have to worry about stores on other chains and could delete the existing loop here. I'll post another rough version in a minute which more or less copies the loop in MergeConsecutiveStores and changes it so that visitSTORE tries to FindBetterChain on possible store merging candidates, which should make MergeConsecutiveStores only need to consider other users of the same chain. Since this only happens if combiner-alias-analysis is enabled, my questions are why isn't it on by default? Would it be OK to say that if you want consecutive store merging combiner AA should be enabled, or should I try to keep these supporting both?

I would prefer to canonicalize how non-aliasing load/store chains are arranged, rather than having MergeConsecutiveStores do it because I would like to write target DAG combines for more exotic load/store merging. For example I would like to merge certain non-adjacent loads and certain cases of load one address space and store to another.

arsenm updated this revision to Diff 31188.Jul 31 2015, 11:18 PM
arsenm edited edge metadata.

Try to call FindBetterChain on possible store merging candidates during visitSTORE

hfinkel added inline comments.Aug 16 2015, 10:58 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
10795

As I said before, this comment is misleading. That situation also commonly occurs when expanding memcpy intrinsics, expanding vector stores of illegal type, etc.

10812

Based on your review comments, you're relying here on the fact that, when AA is being used, FindBetterChain canonicalizes non-aliasing stores to be on the same chain. This needs at least a comment.

11405

Could we be consistent and do this even when UseAA is false? This code ends up calling isAlias, but isAlias does things when UseAA is false (it just does more things when UseAA is true). If we did that, could be remove the special case for UseAA in getStoreMergeAndAliasCandidates (because we'd again have a single canonical form for the store chains).

14242

We don't seem to use this for anything other than this. Remove it?

arsenm added inline comments.Aug 17 2015, 2:32 PM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
11405

That just exposes more instances of the same fundamental problem, that different combines currently expect different chain layouts. Ideally we could probably do this, however that will require more work. Doing this without AA will move more chains, but most of the code is not ready for this (which is why I assume CombinerAA is not on by default now).

If you remove the UseAA check here (and in current code), you many of the same test failures as you do if you enable combiner-aa by default. Eventually when more combines are fixed to look at the same chain, we could probably do this.

arsenm updated this revision to Diff 32344.Aug 17 2015, 2:51 PM

Fix comment. Add test updates

hfinkel accepted this revision.Aug 27 2015, 6:09 PM
hfinkel edited edge metadata.

LGTM.

lib/CodeGen/SelectionDAG/DAGCombiner.cpp
11405

Doing this without AA will move more chains, but most of the code is not ready for this (which is why I assume CombinerAA is not on by default now).

It is on by default for some targets (whatever targets return true from useAA() in their TargetSubtargetInfo subclass).

If you remove the UseAA check here (and in current code), you many of the same test failures as you do if you enable combiner-aa by default. Eventually when more combines are fixed to look at the same chain, we could probably do this.

Okay, please add a FIXME explaining this here.

This revision is now accepted and ready to land.Aug 27 2015, 6:09 PM
arsenm closed this revision.Aug 28 2015, 10:32 AM

r246307