This is an archive of the discontinued LLVM Phabricator instance.

[AggressiveInstCombine] Fold strcmp for short string literals with size 2
ClosedPublic

Authored by kitaisreal on Jul 19 2023, 12:55 PM.

Details

Summary

Fold strcmp for short string literals with size 2.
Depends D155742.

Diff Detail

Event Timeline

kitaisreal created this revision.Jul 19 2023, 12:55 PM
kitaisreal requested review of this revision.Jul 19 2023, 12:55 PM
goldstein.w.n added inline comments.Jul 19 2023, 1:33 PM
llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
909

I would just generalize this to something like: "Try to expand strcmp(P, string_literal) where strlen(string_literal) <= 2"

927

instead of == 1 || == 2 how about just <= 2? Does zero length ever occur? If so thats probably worth handling as well.

1011

This is not really easy to read. It also seems to be largerly a duplicate of the size == 1 logic.
Can you hoist it to a helper that performs the check for a given char index?

kitaisreal marked 3 inline comments as done.

Updated implementation.

goldstein.w.n added inline comments.Jul 20 2023, 9:26 AM
llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
997

Since a precondition is the strcmp is only used for equality, why do you zext here.

1002

is NSW right?

goldstein.w.n added inline comments.Jul 20 2023, 9:28 AM
llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
1002

Likewise, do you need the sub at all? Can you just br on the compare of the two characters and return the compare result itself?

kitaisreal marked 3 inline comments as done.Jul 20 2023, 12:21 PM
kitaisreal added inline comments.
llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
997

Above we check that strcmp is only used in any comparison with zero, not just equality comparison.
zest is required because in strcmp we compare characters as unsigned characters.
Please check discussion in previous pull request which provides more details https://reviews.llvm.org/D154725 file AggressiveInstCombine.cpp line 949.

1002

We need sub because we want to support not only equality compare strcmp result with zero strcmp(P, 'a') == 0, but also strcmp(P, 'a') < 0 and strcmp(P, 'a') > 0.
It seems that it can be a good idea to simplify strcmp(P, 'a') == 0 to something like P[0] == 'a' && P[1] == '\0'. It seems that compiler cannot do this right now.
NSW added to follow standard strcmp implementation from glibc https://godbolt.org/z/Pd9xKcnE8.

kitaisreal marked 2 inline comments as done.Jul 21 2023, 7:15 AM
kitaisreal set the repository for this revision to rG LLVM Github Monorepo.

Re-upload the same patch with mentioning correct repository LLVM Github Monorepo.

goldstein.w.n accepted this revision.Jul 26 2023, 9:48 AM

LGTM.

Not really familiar enough with AggressiveInstCombine to be only reviewer here so please wait at least a few days to a second signoff.

llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
1007

nit can you rename *SubZero* variables -> *SubIsZero* or *Eq* (I first read 'SubZero' as negative).

This revision is now accepted and ready to land.Jul 26 2023, 9:48 AM
nikic accepted this revision.Jul 26 2023, 2:22 PM

Looks good to me as well.

llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
939

Drop braces (and maybe combine these two conditions?)

I will fix nit picks in separate patch, where I plan to add more efficient implementations for strcmp(P, 'x') == 0, strcmp(P, 'x') != 0.

kitaisreal marked 2 inline comments as done.Jul 28 2023, 9:39 AM

Fixed nit picks in follow up revision https://reviews.llvm.org/D156556.