This is an archive of the discontinued LLVM Phabricator instance.

Optionally extend alloca lifetimes to allow for callslot optimization to happen
Needs ReviewPublic

Authored by dotdash on Mar 1 2015, 7:49 AM.
This revision needs review, but there are no reviewers specified.

Details

Reviewers
None
Summary

Currently, memcpy callslot optimization doesn't happen if the memcpy's
destination starts its lifetime between the call and the memcpy. We can
make the optimization happen by extending the allocas lifetime to start
before the call, but we must only do that if the callslot optimization
succeeds, because otherwise we might lose some opportunities for stack
slot sharing

Diff Detail

Event Timeline

dotdash updated this revision to Diff 20959.Mar 1 2015, 7:49 AM
dotdash retitled this revision from to Optionally extend alloca lifetimes to allow for callslot optimization to happen.
dotdash updated this object.
dotdash edited the test plan for this revision. (Show Details)
dotdash added a subscriber: Unknown Object (MLST).
reames added a subscriber: reames.Mar 2 2015, 11:40 AM

I'm reasonable sure this code is correct, but I'm not sure it's as well structured as it could be. I reserve the right to be wrong here; you may just need to explain a few things. :)

lib/Transforms/Scalar/MemCpyOptimizer.cpp
891

I'm a bit concerned about running these queries unconditionally. MDA is rather expensive.

894

I'm not clear I understand the problem here. Is the issue that MD->getDependency is returning the lifetime_start as the dependency but MD->getPointerDependency(Src) is not?

Also, would it be *correct* to always use the source pointer dependency here? Or do you need to consider both source and dest dependencies when considering the callSlotOpt? Depending on the implementation of the callSlotOpt, I could see that going either way.

p.s. You may be interested in the bug I just filed on a related issue. http://llvm.org/bugs/show_bug.cgi?id=22758

dotdash added inline comments.Mar 4 2015, 2:18 AM
lib/Transforms/Scalar/MemCpyOptimizer.cpp
894

Yes, MD->getDependency returns the lifetime_start as the dependency.

It would be incorrect to always use the source pointer dependency here, because we would then apply the call slot optimization to cases like this:

memset(%src, 0, 1000)
memset(%dst, 1, 1000)
foo(dst)
memcpy(%dst, %src, 1000)
foo(dst)

transforming it to

memset(%dst, 0, 1000)
memset(%dst, 1, 1000)
foo(dst)
foo(dst)

Unfortunately the same is true for this patch when %dst has multiple lifetime ranges like this:

memset(%src, 0, 1000)
lifetime_start(1000, %dst)
memset(%dst, 1, 1000)
foo(dst)
lifetime_end(1000, %dst)
lifetime_start(1000, %dst)
memcpy(%dst, %src, 1000)
foo(dst)
lifetime_end(1000, %dst)

I'll have to rethink this.