This is an archive of the discontinued LLVM Phabricator instance.

[SimplifyLibCalls] Fold more memchr calls
ClosedPublic

Authored by msebor on Mar 31 2022, 10:28 AM.

Details

Summary

This patch extends the LibCallSimplifier::optimizeMemChr folder to perform two more simplifications:

  1. folding the case where the size is equal to 1 to a byte comparison, analogously to other such cases already handled by other folders
  2. simplifying to a size comparison the case where the sought character with constant value is known either to appear in the constant array or not to appear, regardless of the size argument

I have tested the patch by bootstrapping LLVM + Clang on x86_64-linux and running check-all.

Diff Detail

Event Timeline

msebor created this revision.Mar 31 2022, 10:28 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 31 2022, 10:28 AM
msebor requested review of this revision.Mar 31 2022, 10:28 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 31 2022, 10:28 AM
nikic added a comment.Apr 1 2022, 3:21 AM

I've committed your tests with baseline checks in https://github.com/llvm/llvm-project/commit/371d2ed3f3d7f67602ff9956510f3e2d9df3d5b0, can you please rebase on top of that?

llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
907

Why does this zext Val and CharVal? Isn't the trunc of CharVal to i8 and a comparison in i8 type sufficient?

925

Should be n <= Pos in the comment to match the code.

934

Should be MaxLen <= Pos?

941

Not sure why we need to fall through to the code below here -- or maybe more generally, can we just drop the MaxLen == UINT64_MAX condition above and always emit the icmp+select sequence? Will get folded for the case where the size is known.

1003

Is the code below here rendered redundant by your new code above?

nikic retitled this revision from Fold more memchr calls to [SimplifyLibCalls] Fold more memchr calls.Apr 1 2022, 3:22 AM
xbolva00 added inline comments.Apr 1 2022, 3:56 AM
llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
924

!LenC ?

942

If possible better to check for LenC than comparing with UINT64_MAX..

msebor added a comment.Apr 1 2022, 1:07 PM

Thanks again for committing the baseline tests for me and (to both of you) for your comments. Let me upload a revised patch.

llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
907

I believe it is. It just needs an adjustment to the existing memchr.ll test.

924

Sure.

925

Done.

934

Either works thanks to what follows but sure.

941

Good idea! It simplifies the rest of the function as well.

942

Done.

1003

Yes, I have removed it.

msebor updated this revision to Diff 419843.Apr 1 2022, 1:12 PM

Revised patch with changes suggested in review.

xbolva00 added inline comments.Apr 1 2022, 1:19 PM
llvm/test/Transforms/InstCombine/memchr-2.ll
108

But IR says N < 3

And gep is 2, not 3.

Confusing.

nikic accepted this revision.Apr 1 2022, 1:32 PM

LGTM

llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
900

I think MaxLen has lost most of it's purpose now, and you can just do a LenC->isOne() check in the preceding branch?

This revision is now accepted and ready to land.Apr 1 2022, 1:32 PM
msebor added inline comments.Apr 1 2022, 1:48 PM
llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
900

There's another use in Str = Str.substr(0, MaxLen); on line 939.

The latest diff is harder to follow because of the new early return from the function. Sorry about that. I debated making that change and maybe I shouldn't have...

llvm/test/Transforms/InstCombine/memchr-2.ll
108

Sadly there are a few copy and paste typos in the comments and names of the functions (should be fold_xxx not call_xxx). Let me fix them.

msebor updated this revision to Diff 419855.Apr 1 2022, 2:14 PM

Correct typos in test comments and test function names in memchr-2.ll

@nikic if the final version looks good can you please commit the final patch for me when you have a chance?

I think you could ask for commit access now if you plan to contribute more things than just this patch.

Then you can precommit tests with fixes for names and typos. After than, rebase, regenerate tests, and commit .

xbolva00 added inline comments.Apr 1 2022, 2:28 PM
llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
900

LenC->isOne() seems like as a good suggestion.

Then you can use..

Str = Str.substr(0, LenC->getZExtValue());

One less variable to keep in mind when reading.

efriedma added inline comments.Apr 1 2022, 2:47 PM
llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
973

This implicitly truncates MaxLen to 32 bits on 32-bit targets, which is probably not what you want.

msebor added inline comments.Apr 1 2022, 3:39 PM
llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
973

You're right, silent truncation indeed isn't what I want here (or what I would have expected). Thanks for pointing out this gotcha!

It seems to be a pre-existing bug in the code made more apparent by the introduction of the MaxLen variable. Let me fix it and post another update.

nikic closed this revision.Apr 4 2022, 2:05 AM

I've taken the liberty of splitting this up into the following four commits while landing the change:
https://github.com/llvm/llvm-project/commit/0f088757449d63182fb424dce7856213739c563e
https://github.com/llvm/llvm-project/commit/d18991debfde6be335a14af3c7d2dcdecbcd758d
https://github.com/llvm/llvm-project/commit/5197d2791f908815134ad48d7b966de2d8c47eeb
https://github.com/llvm/llvm-project/commit/5ccfd5f6d43069a5bc152b335bc704c6d2584ddc

The length truncation can be addressed separately, as it's a pre-existing issue (and it looks like there are a few more substr calls in the same file that should be handled at the same time).

I've taken the liberty of splitting this up into the following four commits while landing the change:
https://github.com/llvm/llvm-project/commit/0f088757449d63182fb424dce7856213739c563e
https://github.com/llvm/llvm-project/commit/d18991debfde6be335a14af3c7d2dcdecbcd758d
https://github.com/llvm/llvm-project/commit/5197d2791f908815134ad48d7b966de2d8c47eeb
https://github.com/llvm/llvm-project/commit/5ccfd5f6d43069a5bc152b335bc704c6d2584ddc

The length truncation can be addressed separately, as it's a pre-existing issue (and it looks like there are a few more substr calls in the same file that should be handled at the same time).

Great, nice small patches!

Thanks for the patches! :)