This is an archive of the discontinued LLVM Phabricator instance.

Optimize double storing by memset; memcpy
ClosedPublic

Authored by ab on Mar 6 2013, 8:03 AM.

Details

Summary

A common idiom in some code is to do the following:

memset(dest, 0, dest_size);
memcpy(dest, src, src_size);

This patch implements a rewrite to avoid storing to the same location twice. The above code is rewritten as:

memcpy(dest, src, dest_size);
memset((char *)dest + dest_size, setVal,

src_size > dest_size ? src_size - dest_size : 0);

Diff Detail

Repository
rL LLVM

Event Timeline

Note that the the compile times aren't close, as my change is against a different version than the baseline.

grosbach requested changes to this revision.Mar 8 2013, 2:45 PM
grosbach added inline comments.
lib/Transforms/IPO/PassManagerBuilder.cpp
119 ↗(On Diff #1192)

This seems unrelated to the optimization?

lib/Transforms/Scalar/MemCpyOptimizer.cpp
847 ↗(On Diff #1192)

This should copy SrcSize bytes, not DestSize, right?

849 ↗(On Diff #1192)

"NewDest" is a bit vague. Better would be referencing that this is the first byte of the trailing bit we'll be zeroing explicitly.

854 ↗(On Diff #1192)

What happens if LenDiff is negative?

863 ↗(On Diff #1192)

We should check for the (probably fairly common) case of both SrcLen and DstLen being constants, in which case we don't have to emit instructions to do the arithmetic.

Likewise, we should check if DestLen==SrcLen. That'd be a bit silly in the user's code, but this sort of pattern can often come from autogenerated code that's not smart enough to elide the bzero() in that case. We should just do it for them.

868 ↗(On Diff #1192)

The DEBUG() line needs a lot more context about what it's logging. For anyone looking at debug output that's not actively working on this exact optimization, the instruction dump() here is going to be baffling.

1075 ↗(On Diff #1192)

Seems not related to the optimization? If we want to add good DEBUG() output for these transforms, that's great, but it should be done as a separate patch first.

These are all valid points. Thanks for the review.

Joel

ab commandeered this revision.Oct 24 2014, 12:00 PM
ab added a reviewer: joel_k_jones.
ab updated this revision to Diff 15441.Oct 24 2014, 3:11 PM
ab edited edge metadata.

Tweak Joel's patch here and there, mainly:

  • correct a size problem
  • try to keep as much alignment as possible
ab planned changes to this revision.Nov 12 2014, 3:35 PM
ab updated this revision to Diff 16954.Dec 4 2014, 2:20 PM
ab retitled this revision from Optimize double storing by memset; memcpy (Take two) to Optimize double storing by memset; memcpy.
ab removed reviewers: grosbach, joel_k_jones.

Ping for review!

Reworded a few of the comments.

ab added a comment.Mar 5 2015, 3:13 PM

Ping for an ancient patch? =)

ab updated this revision to Diff 22810.Mar 27 2015, 11:43 AM
ab updated this object.
ab added a subscriber: nicholas.

Ping! (also, rebase and harden a bit.)

-Ahmed

alexr added a subscriber: alexr.Mar 27 2015, 2:26 PM

This only handles overlap at the start of the memset. Does it make sense to handle other overlaps?

jmolloy accepted this revision.Apr 17 2015, 10:19 AM
jmolloy added a reviewer: jmolloy.
jmolloy added a subscriber: jmolloy.

Hi Ahmed,

This looks OK to me, I think, unless anyone else has objections?

Cheers,

James

This revision is now accepted and ready to land.Apr 17 2015, 10:19 AM

Looks reasonable to me as well.

This revision was automatically updated to reflect the committed changes.