This is an archive of the discontinued LLVM Phabricator instance.

[MemCpyOpt] Perform call slot optimizations through GEPs
AbandonedPublic

Authored by dotdash on Dec 1 2017, 6:47 AM.

Details

Summary

Call slot optimization needs to know that writing to the destination of the
memcpy won't trap, so it has to know the number of dereferenceable bytes of the
destination. Currently it can only handle this for allocas and function
arguments that are directly written to. We can improve upon this by handling
simple GEPs with constant offsets.

This allows call slot optimizations to happen in code like this:

struct Foo {

int o[200];

};

struct Bar {

char p;
Foo f;

};

__attribute__((noinline))

Foo moo()
{

return {0};

}

Bar goo()
{

Bar f;
f.f = moo();
return f;

}

Which is a slightly modified version of the code in PR35134 that works around
the missing alias information in the original C++ code.

It also makes the optimization happen in the Rust code that originally lead to
the above PR.

Event Timeline

dotdash created this revision.Dec 1 2017, 6:47 AM

Hi,

could someone take a look at this or maybe suggest a different reviewer if I chose poorly, please?

Thanks a lot!

efriedma edited reviewers, added: sanjoy, hfinkel, majnemer; removed: rsmith.Dec 13 2017, 11:56 AM
efriedma added subscribers: sunfish, efriedma.
rnk added a comment.Dec 14 2017, 10:27 AM

I think you could split this up and write more targetted tests. The sret change needs a test and is probably separable.

hfinkel added inline comments.Dec 14 2017, 11:05 AM
lib/IR/Value.cpp
661

arraySize -> ArraySize

663

Because of this especially, please make DerefBytes a uint64_t (instead of just unsigned).

lib/Transforms/Scalar/MemCpyOptimizer.cpp
843

destSize -> DestSize (and similar for other variables below as per our coding conventions)

In D40723#955555, @rnk wrote:

I think you could split this up and write more targetted tests. The sret change needs a test and is probably separable.

I didn't realize that phabricator only displays the list of commits, but provides no access to the individual diffs.

So I should submit the fixes incrementally in multiple, sequential review requests, right?

In D40723#955555, @rnk wrote:

I think you could split this up and write more targetted tests. The sret change needs a test and is probably separable.

I didn't realize that phabricator only displays the list of commits, but provides no access to the individual diffs.

So I should submit the fixes incrementally in multiple, sequential review requests, right?

Yes. Please post multiple phabricator review requests.

dotdash abandoned this revision.Dec 15 2017, 5:46 AM

Will replace with a series of individual review requests.