I use alias analysis to define whether the given instruction is a barrier for load hoisting.
When I find 2 identical loads, I check all preceding instructions in the both basic blocks.
I'll also look at sinking stores in the next patch.
Details
Diff Detail
Event Timeline
As a general note, please upload patches with full context, see: http://llvm.org/docs/Phabricator.html#requesting-a-review-via-the-web-interface
lib/Transforms/Scalar/MergedLoadStoreMotion.cpp | ||
---|---|---|
270 | This is not really sufficient, we also need to check the other possible Load flags (atomic properties, volatile, alignment). This is easy to do: Load0->isSameOperationAs(Load1) (it looks like this is all guarded on a check that the load isSimple, so we know the atomic properties and volatile are a non-issue, but the alignment still needs checking ). Also, this line has a typo, (Load0->getType() == Load0->getType()) is always true. | |
test/Transforms/InstMerge/ld_hoist1.ll | ||
20 | I think these checks could be more specific. I think you can explicitly match the first gep and load and then match the second get (picking up the name by regex) but specifying the form, and then making sure the second load uses that address. | |
65 | Remove unnecessary attributes. | |
69 | Remove unnecessary metadata. |
I followed Hal's comments and used IsIndenticalToWhenDefefined() - the same interface is used in SimplifyCFG.
isSameOperationAs() does the same with some excluded flags.
I also did all other suggested changes.
Thanks.
Does the new implementation catch any cases the old one didn't? There is extra cost by walking the range of instruction twice, and perhaps a small change would have the same effect.
lib/Transforms/Scalar/MergedLoadStoreMotion.cpp | ||
---|---|---|
265 | Typo. Did you mean IsIdenticalToDefined()? Or IsIdendicalLoad()? | |
267 | Are you checking the range for load0 twice? See overall comment. | |
374 | The optimization should work when one load is used outside the block. The optimization can work when both loads are used outside but we have to deal with the PHI nodes accordingly. |
Does the new implementation catch any cases the old one didn't? There is extra cost by walking the range of instruction twice, and perhaps a small change would have the same effect.
[Demikhovsky, Elena] Exactly, not every store is a barrier for load hoisting. I check this with AliasAnalysis. I attached a test case where it works.
lib/Transforms/Scalar/MergedLoadStoreMotion.cpp | ||
---|---|---|
265 | IsIdenticalToDefined is a standard function to compare 2 instructions. | |
267 | No. I check only once for each load. I can't say that instruction is a barrier before I see the load itself. Because I go back from bb start to the current instruction, I do it after I see two identical loads. | |
374 | I did not change the current behavior in this case, but I plan to work more on these optimizations. Conditional load/store are barriers for vectorizer. |
lib/Transforms/Scalar/MergedLoadStoreMotion.cpp | ||
---|---|---|
265 | The old code used AA and checked for a MustAlias result. The new code is just checking operand quality, this makes the check less powerful. Please keep the original form of the check *using MustAlias) and use isSameOperationAs as I had suggested. The problem is that, on either side of the diamond, you could have address-equivalent but type-inequivalent GEPs. BasicAA can easily show that the resulting offset is the same, but they'll never really be hoisted (I can't say how common this is, and I presume that we don't have a regression test for it, but let's not reduce our capability to handle that situation here). |
The problem is that, on either side of the diamond, you could have address-equivalent but type-inequivalent GEPs. BasicAA can easily show that the resulting offset is the same, but they'll never really be hoisted (I can't say how common this is, and I presume that we don't have a regression test for it, but let's not reduce our capability to handle that situation here).
I checked the following code:
int *iin = (int *)in;
if (trigger[i] > 0) {
out[i] = in[i] + (float) 0.5;
}
else {
out[i] /= 3; iout[i] += iin[i];
}
And tried to hoist loading in[i]. isSameOperationAs() filters it out because the types are different.
I uploaded the new diff anyway.
And tried to hoist loading in[i]. isSameOperationAs() filters it out because the types are different.
Understood, thanks for checking. When dealing with C++ classes, etc. there are a lot of bitcasts flying around, and I did not want to change behavior here without knowing what we might regress. We probably should handle the "one side will need a bitcast" case, but that is a separate issue.
LGTM, thanks!
Please see below. I still think the compile-time concern needs more attention. Ignoring that the patch LGTM.
Thank you. Committed revision 221091.
I still think the compile-time concern needs more attention.
I do this check only after I see 2 identical loads. The alternative is checking barriers for every Load0 before I know that I have a candidate for hoisting in the second block.
Btw, SimplifyCFG() does half way by hoisting identical instructions from diamond, but we do load hoisting in a separate pass. Here I have a compile-time concern.
The same about store sinking.
- Elena
Typo. Did you mean IsIdenticalToDefined()? Or IsIdendicalLoad()?