This is an archive of the discontinued LLVM Phabricator instance.

[DAGCombine] Move AND nodes to multiple load leaves
ClosedPublic

Authored by samparker on Nov 3 2017, 10:19 AM.

Details

Summary

Search from AND nodes to find whether they can be propagated back to loads, so that the AND and load can be combined into a narrow load. We search through OR, XOR and other AND nodes and all the leaves are required to be loads or constants.

Diff Detail

Event Timeline

samparker created this revision.Nov 3 2017, 10:19 AM
samparker retitled this revision from [CodeGen] DAG combine AND nodes to load leaves to [DAGCombine] Move AND nodes to multiple load leaves.Nov 6 2017, 1:26 AM
fhahn added a subscriber: fhahn.Nov 16 2017, 7:05 AM
RKSimon edited edge metadata.Nov 27 2017, 2:56 AM

Some minor style comments

lib/CodeGen/SelectionDAG/DAGCombiner.cpp
505

Add comment descriptions for both of these

3741
for (unsigned i = 0, e = N->getNumOperands(); i < e; ++i) {
3742

You should be able to use this:

SDValue Op = N->getOperand(i);

and most "Op->" can be replaced with "Op."

samparker updated this revision to Diff 124365.Nov 27 2017, 6:12 AM

Thanks Simon, I've made adjustments according to your comments.

niravd edited edge metadata.Nov 27 2017, 10:53 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
3739

IIUC if you see a non-handled case (AND, OR, XOR, LOAD) you will give up because we cannot fully remove the AND.

What about partial success? Consider: AND (OR (LOAD x) (ADD y z)) MASK -> OR (ZEXTLOAD x) (AND (ADD y z) MASK). As long as we only generate 1 or fewer ANDs its a net win*. I also suspect that narrowing loads regardless would be a net-positive for SimplifyDemandedBits, though other reviews would know more about it than I do.

3758

if (isAndLoadExtLoad(Mask, Load, Load->getValueType(0), ExtVT, NarrowLoad) && NarrowLoad) {

3773

I believe you need a return here or something like (AND (AND (ADD x y) MASK1) (LOAD z) MASK2) will result in the ADD missing MASK2.

3809

The call to CombineTo after this should update the users so you should do an update. IIUC the only reason for this is to commute the operands in some cases. I suspect this isn't necessary or should be done by applying the check to both operands.

test/CodeGen/ARM/and-load-combine.ll
1

Please commit the test cases separately so we can see the differences more easily in phabricator.

samparker added inline comments.Nov 30 2017, 3:49 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
3739

Ok, thanks, I'll look into it.

3809

The issue here was that I need to replace one of Users operands with an AND that masks part of one of User's original operands. I originally tried to create the AND and then use ReplaceAllUsesWith on the AND, in the hope that it would update User. But then because the AND is using the value that I tried to replace, a cyclic dependency is introduced. So the only way I could see is by creating a completely new node. I then call ReplaceAllUsesWith so that I can delete the original User, thus allowing the load to be combined with the new AND node.

test/CodeGen/ARM/and-load-combine.ll
1

Sorry, I'm confused as to why that would make it easier?

niravd added inline comments.Nov 30 2017, 7:32 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
3809

Ah! Take a look at UpdateNodeOperands. It should let you undo the construction of a cyclic dependency.

test/CodeGen/ARM/and-load-combine.ll
1

It's because the CHECK-NOT: lsl lines are not intuitively obvious as to why they are there (I'm still not sure why those were selected, but presumably it's important). IIUC all that matters is that originally the larger loads are being used. That's very easy to see with with a prepatch test case checking for the original code generated. I suppose additional CHECK-NOTs would work as well, but I'm not sure if that confuses the CHECK-NOT: lsl

spatel added inline comments.Nov 30 2017, 7:46 AM
test/CodeGen/ARM/and-load-combine.ll
1

Just a drive-by comment about codegen checks: there's a script to auto-generate the FileCheck lines at llvm/utils/update_llc_test_checks.py (and it has support for ARM triples, but not all of the ones in this test file - any help in getting more triples to work with the script is greatly appreciated!).

If you use that, then it becomes very simple to add tests with complete checking *before* you make a functional change that will change the asm. This makes it easier to review exactly what changes. It also avoids the inherent fragility of 'CHECK-NOT' (imagine that someone has an 'lsl' string in the path to this test file and FileCheck sees that path string).

samparker added inline comments.Dec 1 2017, 1:23 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
3809

Great, thanks.

test/CodeGen/ARM/and-load-combine.ll
1

Ok, thank you both. I'll make the changes and try the script, hopefully with a patch for it as well.

samparker updated this revision to Diff 125341.Dec 4 2017, 7:36 AM

Now allowing a single node in the tree that will still require the top bits to be masked off. We're also now allowing extending nodes if the original type was small enough. The tests were committed before so hopefully the diff is easier to read.

niravd added inline comments.Dec 4 2017, 8:33 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
3755

If you pull out the the checking and recording of NodeToMask to after the switch and return immediately on sucessful cases, you can out the default case and allows us to succeed with an "and of load" if one of the loads is not a valid extLoad.

3805

You should be able to replace most of this by using ReplaceAllUsesWith and then using UpdateNodeOperands to fix up the operands of the And Node. i.e.:

DAG.ReplaceAllUsesWith(N, And);
DAG.UpdateNodeOperands(And, SDValue(N,0), Mask));

It's probably worth inlining this function as well.

3834

IMO, this fast fail check isn't worth while as it's shaving off a smallish constant amount of wotk.

samparker added inline comments.Dec 4 2017, 8:47 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
3755

I'm not sure that works here, as I want to check all the operands and an early exit on the successful case could miss other valid nodes.

3805

But as 'And' uses 'N', won't this cause the cyclic dependency that I'm trying to avoid? Or is there a nice way of initialising a dummy 'And' node that doesn't use N?

niravd added inline comments.Dec 4 2017, 9:17 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
3755

Sorry, I miswrote. You should "continue" to the next iteration of the loop on success.

3805

It will but it won't loop forever and you'll immediately fix it. See the implementation of SelectionDAG::makeEquivalentMemoryOrdering.

samparker added inline comments.Dec 5 2017, 12:34 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
3755

Ah, ok, I'll give that a go.

3805

Great, thanks.

samparker updated this revision to Diff 125505.Dec 5 2017, 5:51 AM

I've removed the default case, which is now handled on exit from the switch. I've also removed the helper function to create the new mask and instead perform the masking inline.

niravd accepted this revision.Dec 5 2017, 6:04 AM

Modulo the extra NodeToMask lines in zero extend case, this LGTM. Thanks!

lib/CodeGen/SelectionDAG/DAGCombiner.cpp
3779

Looks like you missed removing this case.

This revision is now accepted and ready to land.Dec 5 2017, 6:04 AM

Oh yeah, I'll fix it before committing. Thanks for all your help.

This revision was automatically updated to reflect the committed changes.