Search from AND nodes to find whether they can be propagated back to loads, so that the AND and load can be combined into a narrow load. We search through OR, XOR and other AND nodes and all the leaves are required to be loads or constants.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
Some minor style comments
lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
505 ↗ | (On Diff #121508) | Add comment descriptions for both of these |
3741 ↗ | (On Diff #121508) | for (unsigned i = 0, e = N->getNumOperands(); i < e; ++i) { |
3742 ↗ | (On Diff #121508) | You should be able to use this: SDValue Op = N->getOperand(i); and most "Op->" can be replaced with "Op." |
lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
3739 ↗ | (On Diff #124365) | IIUC if you see a non-handled case (AND, OR, XOR, LOAD) you will give up because we cannot fully remove the AND. What about partial success? Consider: AND (OR (LOAD x) (ADD y z)) MASK -> OR (ZEXTLOAD x) (AND (ADD y z) MASK). As long as we only generate 1 or fewer ANDs its a net win*. I also suspect that narrowing loads regardless would be a net-positive for SimplifyDemandedBits, though other reviews would know more about it than I do. |
3758 ↗ | (On Diff #124365) | if (isAndLoadExtLoad(Mask, Load, Load->getValueType(0), ExtVT, NarrowLoad) && NarrowLoad) { |
3773 ↗ | (On Diff #124365) | I believe you need a return here or something like (AND (AND (ADD x y) MASK1) (LOAD z) MASK2) will result in the ADD missing MASK2. |
3809 ↗ | (On Diff #124365) | The call to CombineTo after this should update the users so you should do an update. IIUC the only reason for this is to commute the operands in some cases. I suspect this isn't necessary or should be done by applying the check to both operands. |
test/CodeGen/ARM/and-load-combine.ll | ||
1 ↗ | (On Diff #124365) | Please commit the test cases separately so we can see the differences more easily in phabricator. |
lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
3739 ↗ | (On Diff #124365) | Ok, thanks, I'll look into it. |
3809 ↗ | (On Diff #124365) | The issue here was that I need to replace one of Users operands with an AND that masks part of one of User's original operands. I originally tried to create the AND and then use ReplaceAllUsesWith on the AND, in the hope that it would update User. But then because the AND is using the value that I tried to replace, a cyclic dependency is introduced. So the only way I could see is by creating a completely new node. I then call ReplaceAllUsesWith so that I can delete the original User, thus allowing the load to be combined with the new AND node. |
test/CodeGen/ARM/and-load-combine.ll | ||
1 ↗ | (On Diff #124365) | Sorry, I'm confused as to why that would make it easier? |
lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
3809 ↗ | (On Diff #124365) | Ah! Take a look at UpdateNodeOperands. It should let you undo the construction of a cyclic dependency. |
test/CodeGen/ARM/and-load-combine.ll | ||
1 ↗ | (On Diff #124365) | It's because the CHECK-NOT: lsl lines are not intuitively obvious as to why they are there (I'm still not sure why those were selected, but presumably it's important). IIUC all that matters is that originally the larger loads are being used. That's very easy to see with with a prepatch test case checking for the original code generated. I suppose additional CHECK-NOTs would work as well, but I'm not sure if that confuses the CHECK-NOT: lsl |
test/CodeGen/ARM/and-load-combine.ll | ||
---|---|---|
1 ↗ | (On Diff #124365) | Just a drive-by comment about codegen checks: there's a script to auto-generate the FileCheck lines at llvm/utils/update_llc_test_checks.py (and it has support for ARM triples, but not all of the ones in this test file - any help in getting more triples to work with the script is greatly appreciated!). If you use that, then it becomes very simple to add tests with complete checking *before* you make a functional change that will change the asm. This makes it easier to review exactly what changes. It also avoids the inherent fragility of 'CHECK-NOT' (imagine that someone has an 'lsl' string in the path to this test file and FileCheck sees that path string). |
Now allowing a single node in the tree that will still require the top bits to be masked off. We're also now allowing extending nodes if the original type was small enough. The tests were committed before so hopefully the diff is easier to read.
lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
3755 ↗ | (On Diff #125341) | If you pull out the the checking and recording of NodeToMask to after the switch and return immediately on sucessful cases, you can out the default case and allows us to succeed with an "and of load" if one of the loads is not a valid extLoad. |
3805 ↗ | (On Diff #125341) | You should be able to replace most of this by using ReplaceAllUsesWith and then using UpdateNodeOperands to fix up the operands of the And Node. i.e.: DAG.ReplaceAllUsesWith(N, And); It's probably worth inlining this function as well. |
3834 ↗ | (On Diff #125341) | IMO, this fast fail check isn't worth while as it's shaving off a smallish constant amount of wotk. |
lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
3755 ↗ | (On Diff #125341) | I'm not sure that works here, as I want to check all the operands and an early exit on the successful case could miss other valid nodes. |
3805 ↗ | (On Diff #125341) | But as 'And' uses 'N', won't this cause the cyclic dependency that I'm trying to avoid? Or is there a nice way of initialising a dummy 'And' node that doesn't use N? |
lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
3755 ↗ | (On Diff #125341) | Sorry, I miswrote. You should "continue" to the next iteration of the loop on success. |
3805 ↗ | (On Diff #125341) | It will but it won't loop forever and you'll immediately fix it. See the implementation of SelectionDAG::makeEquivalentMemoryOrdering. |
I've removed the default case, which is now handled on exit from the switch. I've also removed the helper function to create the new mask and instead perform the masking inline.
Modulo the extra NodeToMask lines in zero extend case, this LGTM. Thanks!
lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
3779 ↗ | (On Diff #125505) | Looks like you missed removing this case. |