This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] snprintf(NULL, 0, "%s", str) to strlen(str)
Needs ReviewPublic

Authored by xbolva00 on Sep 5 2021, 7:49 AM.

Details

Diff Detail

Event Timeline

xbolva00 created this revision.Sep 5 2021, 7:49 AM
xbolva00 requested review of this revision.Sep 5 2021, 7:49 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 5 2021, 7:49 AM
xbolva00 added inline comments.Sep 5 2021, 7:50 AM
llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
2597

snprintf returns int, strlen returns size_t so... trunc should be fine?

bjope added inline comments.Sep 15 2021, 10:29 AM
llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
2594

Is there a specific use-case for doing this (I guess a user could use strlen instead if that is what they want). Is it to save cycles due to strlen probably being faster, or because it is more likely that there are subsequent transformations involving strlen?

2597

Isn't the transform only valid if the number of bits in CI is greater than the number of bits in V?
I.e. the unsigned value returned by strlen must fit into the non-negative part of the signed destination type. And given that constraint we always need the trunc.

bjope added inline comments.Sep 15 2021, 10:33 AM
llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
2597

Well, given that constraint we always need a zext, not trunc.

xbolva00 added inline comments.Sep 15 2021, 10:45 AM
llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
2597

zext? We need change from i64 (from newly added strlen) to i32 (return type of snprintf)

bjope added inline comments.Sep 15 2021, 10:46 AM
llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
2597

If we consider the case when the value returned by strlen as size_t would be out-of-range for the non-negative range of int, then I guess snprintf should return a negative value? Or is the result undefined (I haven't seen any documentation saying anything about such cases for snprintf, while for strlen it says that the result in undefined if no null-termination is found)?

Either way it would be ok to do the transformation when sizeof(size_t)==sizeof(int), because that would give a negative value if the strlen result is out-of-range for the int.

But if sizeof(size_t)>sizeof(int), then maybe we need to saturate the value instead of doing a trunc, just to be safe?

bjope added inline comments.Sep 1 2023, 8:44 AM
llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
2597

I was looking through old phabricator reviews still open, and tried to understand this conversation :-)

(And I think that I might have mixed up which function that returned int and size_t in some comments.)

What I've found out since last time is that, as described here https://www.austingroupbugs.net/view.php?id=761, the C standard (C11?) seems to have clarified that behavior is undefined if "The number of characters or wide characters transmitted by a formatted output function (or written to an array, or that would have been written to an array) is greater than INT_MAX (7.21.6.1, 7.29.2.1).".

And IIUC the snprintf would never return a negative value indicating conversion error for %s.

So it seems like we can assume that the snprintf return a value in the range [0, INT_MAX] or else it would be undefined.

Then I think we either need to check that sizeof(size_t) >= sizeof(int) (e.g. by comparing the type size for CI and V), to ensure that trunc is correct choice when the sizes aren't equal. If sizeof(size_t) < sizeof(int) (which probably is more unlikely) we could just skip doing the optimization (or we could make sure there will be a zext, e.g. by using CreateIntCast(V, CI->getType(), false)).

Herald added a project: Restricted Project. · View Herald TranscriptSep 1 2023, 8:44 AM