Currently, call slot optimization requires that if the destination is an
argument, the argument has the sret attribute. This is to ensure that
the memory access won't trap. In addition to sret, we can also allow the
optimization to happen for arguments that have the new dereferenceable
attribute, which gives the same guarantee.
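The intended check can be sketched in isolation as follows. This is a minimal model and not the code in MemCpyOptimizer.cpp: `ArgInfo` and `destArgIsSafe` are hypothetical names introduced here for illustration, and the sketch assumes that a missing dereferenceable attribute is reported as 0 bytes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the attributes of a function argument.
// This is NOT an LLVM type; it only models the two facts the check needs.
struct ArgInfo {
  bool hasSRet;             // the argument carries the sret attribute
  std::uint64_t derefBytes; // bytes promised by dereferenceable(N); 0 if absent
};

// Returns true when storing srcSize bytes through the argument is known
// not to trap, under the assumptions stated above.
bool destArgIsSafe(const ArgInfo &arg, std::uint64_t srcSize) {
  // Enough known-dereferenceable bytes: the sret check can be skipped.
  if (arg.derefBytes >= srcSize)
    return true;
  // Otherwise fall back to the existing sret requirement.
  return arg.hasSRet;
}

// Small self-check covering the cases discussed in the review.
void checkExamples() {
  assert(destArgIsSafe(ArgInfo{false, 16}, 16)); // dereferenceable(16) suffices
  assert(destArgIsSafe(ArgInfo{true, 0}, 16));   // plain sret still accepted
  assert(!destArgIsSafe(ArgInfo{false, 8}, 16)); // too few bytes and no sret
}
```

With this shape, dereferenceable is an additional way to pass the check rather than a new requirement: an argument with only sret is accepted exactly as before.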
lib/Transforms/Scalar/MemCpyOptimizer.cpp | ||
---|---|---|
637 | But if you do it this way you're now *requiring* the dereferenceable attribute, and we don't want to do that. What you want to do is keep the existing logic for sret, and just add additional support for dereferenceable. |
lib/Transforms/Scalar/MemCpyOptimizer.cpp | ||
---|---|---|
637 | How so? The new check just skips the sret check if we already know that we have enough dereferenceable bytes. If there is no dereferenceable attribute, getDereferenceableBytes() returns 0, which is < srcSize, so we do the sret check, and false is returned only if that check fails. |
LGTM, thanks!
lib/Transforms/Scalar/MemCpyOptimizer.cpp | ||
---|---|---|
637 | You're right; sorry about that. |