This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Optimize some memccpy calls to memcpy/null
ClosedPublic

Authored by xbolva00 on Sep 26 2019, 9:07 AM.

Details

Summary

return memccpy(d, "helloworld", 'r', 20)

>

return memcpy(d, "helloworld", 8 /* pos of 'r' in string */), d + 8

Event Timeline

xbolva00 created this revision.Sep 26 2019, 9:07 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 26 2019, 9:07 AM
xbolva00 updated this revision to Diff 221996.Sep 26 2019, 11:55 AM
xbolva00 retitled this revision from [InstCombine] Optimize some memccpy calls to memcpy to [InstCombine] Optimize some memccpy calls to memcpy/null.

Implemented more patterns.

xbolva00 updated this revision to Diff 222001.EditedSep 26 2019, 12:18 PM

More tests

Godbolt for verification: https://godbolt.org/z/2rsnFe

(Comments using argument names from the signature void *memccpy(void *dest, const void *src, int c, size_t n);.)

Please add a testcase where the optimization triggers and "c" is 0. (I think getConstantStringInfo trims the null terminator by default?)

It should be possible to optimize cases where "c" doesn't appear in "src" if "n" is smaller than the length of "src".

test/Transforms/InstCombine/memccpy.ll
57 ↗(On Diff #222001)

This is obviously wrong; you've removed the side-effect.

xbolva00 marked an inline comment as done.Sep 26 2019, 3:45 PM

Please add a testcase where the optimization triggers and "c" is 0. (I think getConstantStringInfo trims the null terminator by default?)

Ah, right.

test/Transforms/InstCombine/memccpy.ll
57 ↗(On Diff #222001)

oops

xbolva00 updated this revision to Diff 222039.Sep 26 2019, 3:46 PM

Added new tests + fixed issues.

xbolva00 updated this revision to Diff 222040.Sep 26 2019, 3:48 PM

Renamed one test function

where "c" doesn't appear in "src" if "n" is smaller than the length of "src".

We dont even need to know n. If c is not in src, we have just memcpy(dst, src, n), no?

We dont even need to know n. If c is not in src, we have just memcpy(dst, src, n), no?

I think it's possible for there to be valid data after the array returned by getConstantStringInfo, in some cases. For example, if you're doing a memccpy out of something like struct X { char a[10]; char b[10]; }. Maybe worth adding a testcase.

xbolva00 updated this revision to Diff 222045.Sep 26 2019, 4:20 PM

Added test

xbolva00 added a comment.EditedSep 26 2019, 4:21 PM

I think it's possible for there to be valid data after the array returned by getConstantStringInfo, in some cases.

I added test with string like "blabla\0x" and stop char is 'x'. As test shows, we leave this case as is.

Sorry if I misunderstood you.

xbolva00 marked an inline comment as done.Sep 26 2019, 4:27 PM
xbolva00 added inline comments.
test/Transforms/InstCombine/memccpy.ll
91 ↗(On Diff #222045)

@efriedma ?

Maybe you meant this case?

e.g. n = 20

and we have block of memory "helloworld" "saaaattttt" and stop is char is 's'. In reality we copy 11 chars, but memcpy would copy n = 20 chars.

xbolva00 updated this revision to Diff 222053.EditedSep 26 2019, 4:59 PM

One more test

This comment was removed by xbolva00.
xbolva00 updated this revision to Diff 222061.Sep 26 2019, 5:29 PM

Use Str.size() instead of GetStringLength

xbolva00 updated this revision to Diff 222230.Sep 27 2019, 12:59 PM

Handle case when dst == src

Any more comments here?

jdoerfert added inline comments.Oct 29 2019, 12:29 PM
lib/Transforms/Utils/SimplifyLibCalls.cpp
1126 ↗(On Diff #222230)

Unclear why we name Size.

1137 ↗(On Diff #222230)

Simplification nits: Move the 1136 check to 1130, then simplify 1131, and swap the cases in 1133.

1138 ↗(On Diff #222230)

I do not know so I ask, is SExt the right choice here or ZExt? StopChar could be i8 -1 right? Now the question is, what do we want to find in SrcStr, -1 or 255?

1151 ↗(On Diff #222230)

I guess it is always i8Ty but it would make it easier to read if we do not mix i8Ty and CI->getType(). (Just stick with the latter and CI->getType()->getPointerElementType()).

1152 ↗(On Diff #222230)

The return value is wrong if std::min(Pos + 1, N->getZExtValue()) is not Pos + 1 because then we should have return null. (I think)

xbolva00 marked 5 inline comments as done.Oct 29 2019, 12:46 PM
xbolva00 added inline comments.
lib/Transforms/Utils/SimplifyLibCalls.cpp
1126 ↗(On Diff #222230)

Will changed.

1137 ↗(On Diff #222230)

Ok.

1138 ↗(On Diff #222230)

Inspired by optimizeStrChr, that code uses Str.find(CharC->getSExtValue()); so I think it is OK.

1151 ↗(On Diff #222230)

Ok.

1152 ↗(On Diff #222230)

Yeah.

jdoerfert added inline comments.Oct 29 2019, 8:12 PM
lib/Transforms/Utils/SimplifyLibCalls.cpp
1138 ↗(On Diff #222230)

I dislike this reason, maybe that case was missed there as well. Could we make a test (and execute it) with a string that has i8 -1 as character and we look for that one with memccpy just to be sure?

Btw, the sext vs zext question is my last request, I think this is fine otherwise.

Harbormaster completed remote builds in B41374: Diff 230644.
xbolva00 updated this revision to Diff 230645.Nov 22 2019, 6:51 AM

Removed unneeded variable.

Btw, the sext vs zext question is my last request, I think this is fine otherwise.

Did you add a test or check for this?

This comment was removed by xbolva00.

Ah, ok, another test is needed.

This comment was removed by xbolva00.
This comment was removed by xbolva00.
xbolva00 updated this revision to Diff 230962.Nov 25 2019, 1:14 PM

Updated with sext or zext testcase.

jdoerfert accepted this revision.Nov 25 2019, 9:10 PM

LGTM.

llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
1142

Please add a comment here explaining this.

This revision is now accepted and ready to land.Nov 25 2019, 9:10 PM
xbolva00 updated this revision to Diff 231026.Nov 26 2019, 1:54 AM

Added a comment.

This revision was automatically updated to reflect the committed changes.