When performing a call slot optimization to a GEP dest, it will currently usually not apply, because the GEP is directly before the memcpy and as such does not dominate the call. We should move it above the call if that satisfies the domination requirement. I think that a constant-index GEP is the only useful thing to move here, as otherwise isDereferenceablePointer couldn't look through it anyway. As such I'm not trying to generalize this further.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp | ||
---|---|---|
920 | I'm wondering if it would make sense to extend the dominates() API to accept a Value* instead of Instruction* as first argument. Handling "arguments, constants and globals dominate everything" seems like something the DominatorTree API should do, not the caller. |
llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp | ||
---|---|---|
920 | agreed. |
Hi, we have a regression that started with this commit. The problem happens when GEP %43 is moved before the call on line 49: https://paste.mozilla.org/gYqBnKes . Can you see anything obviously wrong with that move?
If not, our repro is quite large and I fear it will take me a very long time to fully understand what went wrong and/or get a runnable reduced test case. At this point I'm not even sure whether this transformation was itself the culprit or merely uncovered a problem elsewhere. Hoping you might be able to quickly spot something and save me lots of time!
I thiiiink I'm starting to see this a little better, and the problem is that the memcpy clobbers the optimized call slot.
Before performCallSlotOptzn
call func(..., X, ...) load from X GEP store to Y
After performCallSlotOptzn
GEP call func(..., Y, ...) load from X store to Y ; oops
Am I interpreting that correctly?
@dmajor If the call slot optimization applies, then that load/store pair should also get dropped as part of the optimization. I don't immediately see how we can end up both changing the call argument and not dropping the load/store, if I understand right what is happening in your case.
You're right, I was looking too narrowly at performCallSlotOptzn. The IR is sane after MemCpyOptPass as a whole. Our badness actually happens later in the pipeline, sorry for the noise.
I'm wondering if it would make sense to extend the dominates() API to accept a Value* instead of Instruction* as first argument. Handling "arguments, constants and globals dominate everything" seems like something the DominatorTree API should do, not the caller.