This is an archive of the discontinued LLVM Phabricator instance.

[MemCpyOpt] Move GEP during call slot optimization
ClosedPublic

Authored by nikic on Oct 17 2020, 6:59 AM.

Details

Summary

When performing a call slot optimization to a GEP dest, it will currently usually not apply, because the GEP is directly before the memcpy and as such does not dominate the call. We should move it above the call if that satisfies the domination requirement. I think that a constant-index GEP is the only useful thing to move here, as otherwise isDereferenceablePointer couldn't look through it anyway. As such I'm not trying to generalize this further.

Diff Detail

Event Timeline

nikic created this revision.Oct 17 2020, 6:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 17 2020, 6:59 AM
nikic requested review of this revision.Oct 17 2020, 6:59 AM
nikic added inline comments.
llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
920

I'm wondering if it would make sense to extend the dominates() API to accept a Value* instead of Instruction* as first argument. Handling "arguments, constants and globals dominate everything" seems like something the DominatorTree API should do, not the caller.

MSxDOS added a subscriber: MSxDOS.Oct 17 2020, 7:51 AM
jdoerfert added inline comments.Oct 17 2020, 9:32 AM
llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
920

agreed.

nikic updated this revision to Diff 300029.Oct 22 2020, 10:07 AM

Rebase over generalized dominates() API.

This revision is now accepted and ready to land.Oct 22 2020, 10:11 AM
This revision was automatically updated to reflect the committed changes.
dmajor added a subscriber: dmajor.Oct 28 2020, 3:03 PM

Hi, we have a regression that started with this commit. The problem happens when GEP %43 is moved before the call on line 49: https://paste.mozilla.org/gYqBnKes . Can you see anything obviously wrong with that move?

If not, our repro is quite large and I fear it will take me a very long time to fully understand what went wrong and/or get a runnable reduced test case. At this point I'm not even sure whether this transformation was itself the culprit or merely uncovered a problem elsewhere. Hoping you might be able to quickly spot something and save me lots of time!

I thiiiink I'm starting to see this a little better, and the problem is that the memcpy clobbers the optimized call slot.

Before performCallSlotOptzn

call func(..., X, ...)
load from X
GEP
store to Y

After performCallSlotOptzn

GEP
call func(..., Y, ...)
load from X
store to Y  ; oops

Am I interpreting that correctly?

nikic added a comment.Oct 29 2020, 1:18 AM

@dmajor If the call slot optimization applies, then that load/store pair should also get dropped as part of the optimization. I don't immediately see how we can end up both changing the call argument and not dropping the load/store, if I understand right what is happening in your case.

@dmajor If the call slot optimization applies, then that load/store pair should also get dropped as part of the optimization. I don't immediately see how we can end up both changing the call argument and not dropping the load/store, if I understand right what is happening in your case.

You're right, I was looking too narrowly at performCallSlotOptzn. The IR is sane after MemCpyOptPass as a whole. Our badness actually happens later in the pipeline, sorry for the noise.