This is an archive of the discontinued LLVM Phabricator instance.

[BuildLibCalls] Add noalias for strcat and stpcpy
ClosedPublic

Authored by xbolva00 on Sep 25 2020, 12:58 PM.

Details

Summary

strcat:
destination and source shall not overlap. (http://www.cplusplus.com/reference/cstring/strcat/)

stpcpy:
The strings may not overlap, and the destination string dest must be large enough to receive the copy. (https://man7.org/linux/man-pages/man3/stpcpy.3.html)

Diff Detail

Event Timeline

xbolva00 created this revision.Sep 25 2020, 12:58 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 25 2020, 12:58 PM

LGTM, I checked libc on macOS which also states that overlapping string args are UB for both strcpy & strcat. It seems a bit odd that those were missed out initially, so please wait a bit with committing, in case there's something I am missing.

It's conceivable that we might not want to mark strcpy/stpcpy etc. noalias for the same reason llvm.memcpy isn't noalias: standard implementation tolerate exact overlap, and the benefit from forbidding that is small. (Actually, I'm not sure any optimizations benefit at the moment.) Maybe that's being overconservative, though.

(This doesn't apply to strcat.)

xbolva00 added a comment.EditedSep 25 2020, 1:42 PM

That would be really overconservative

some tools and gcc warns and exploits this restriction.

fhahn added a comment.Sep 25 2020, 2:07 PM

It's conceivable that we might not want to mark strcpy/stpcpy etc. noalias for the same reason llvm.memcpy isn't noalias: standard implementation tolerate exact overlap, and the benefit from forbidding that is small. (Actually, I'm not sure any optimizations benefit at the moment.) Maybe that's being overconservative, though.

Hmm, it appears we already use noalias for strcpy/strncpy and this patch only additionally adds it to strcat/stpcpy.

noalias on its own is probably not too useful in this context, but I think without noalias, we cannot mark either argument as readonly/writeonly, which prevents some DSE opportunities (https://bugs.llvm.org/show_bug.cgi?id=47644). IIRC the main reason to allow exact overlap for llvm.memcpy was because clang generated code that violated the assumption. I think for the memcpy libfunc we still add noalias for both arguments.

If everyone is satisfied the noalias markings won't cause any practical issues, that's fine.

noalias on its own is probably not too useful in this context, but I think without noalias, we cannot mark either argument as readonly/writeonly

That isn't how readonly works. From LangRef: "On an argument, this attribute indicates that the function does not write through this pointer argument, even though it may write to the memory that the pointer points to."

fhahn added a comment.Sep 25 2020, 2:13 PM

noalias on its own is probably not too useful in this context, but I think without noalias, we cannot mark either argument as readonly/writeonly

That isn't how readonly works. From LangRef: "On an argument, this attribute indicates that the function does not write through this pointer argument, even though it may write to the memory that the pointer points to."

Oh right, I forgot about that exception.

FWIW, noalias on these arguments is, by itself, not doing anything. It does something if you know how much memory is accessed by the function through such an argument *for sure* and what other memory is accessed for sure. Such memory regions are not overlapping. (This is way too vague actually but I guess it doesn't matter for the gist.)

FWIW, noalias on these arguments is, by itself, not doing anything. It does something if you know how much memory is accessed by the function through such an argument *for sure* and what other memory is accessed for sure. Such memory regions are not overlapping. (This is way too vague actually but I guess it doesn't matter for the gist.)

so, the patch as is, is OK for you all?

This revision is now accepted and ready to land.Sep 27 2020, 10:06 AM
This revision was automatically updated to reflect the committed changes.