This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Fold memcmp of constant arrays and variable size
ClosedPublic

Authored by msebor on Jun 14 2022, 10:23 AM.

Details

Summary

The memcmp simplifier is limited to folding to constants calls with constant arrays and constant sizes. This change adds the ability to simplify memcmp(A, B, N) calls with constant A and B and variable N to the pseudocode equivalent of

N <= Pos ? 0 : (A < B ? -1 : B < A ? +1 : 0)

where Pos is the offset of the first mismatch between A and B.

Diff Detail

Event Timeline

msebor created this revision.Jun 14 2022, 10:23 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 14 2022, 10:23 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
msebor requested review of this revision.Jun 14 2022, 10:23 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 14 2022, 10:23 AM
courbet added inline comments.Jun 15 2022, 12:10 AM
llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
1163–1168

This feels a bit convoluted. Why not something like:

Value *Zero = ConstantInt::get(CI->getType(), 0);

uint64_t MinSize = std::min(LStr.size(), RStr.size());
for (uint64_t Pos = 0; Pos < MinSize; ++Pos) {
  if (LStr[Pos] != RStr[Pos]) {
    Value *MaxSize = ConstantInt::get(Size->getType(), Pos);
    Value *Cmp = B.CreateICmp(ICmpInst::ICMP_ULE, Size, MaxSize);
    int IRes = LStr[Pos] < RStr[Pos] ? -1 : 1;
    Value *Res = ConstantInt::get(CI->getType(), IRes);
    return B.CreateSelect(Cmp, Zero, Res);
  }
}
// One array is a leading part of the other of equal or greater size.   
// Fold the result to zero.  Size is assumed to be in bounds, since  
// otherwise the call would be undefined. 
return  Zero;
1276

[nit] It feels weird to have if(!constant) { /*handle general case*/ } /*handle constant case*/ . Can you rewrite it as if(constant) { /*handle constant case*/ } /*handle general case*/ ?

nikic added inline comments.Jun 15 2022, 12:35 AM
llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
1276

We can probably also always check the variable case first (because it also works fine for the all-constant case, right? The extra condition will fold afterwards) and drop the getConstantStringInfo() handling from optimizeMemCmpConstantSize().

msebor updated this revision to Diff 437334.Jun 15 2022, 2:22 PM

Revision 2 of the patch including:

  • Handle same arguments in the new optimizeMemCmpVarSize and adjust loop.
  • Remove constant string handling from optimizeMemCmpConstantSize.
  • Remove special cases from LibCallSimplifier::optimizeMemCmpBCmpCommon.
llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
1163–1168

Sure, I can do that, thanks for the suggestion. (I have a follow-up patch that extends the handling to strncmp so the code won't look exactly the same but should stay close enough.)

1276

Yes, that sounds good to me.

msebor added inline comments.Jun 15 2022, 3:43 PM
llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
1170

Whoops, the comparison is required to be done in unsigned char. Let me fix that.

I also spoke too soon about the loop The suggested form seems to have made the strncmp handling quite a bit more cumbersome than the original.

msebor updated this revision to Diff 437390.Jun 15 2022, 4:09 PM

Revision 3 performs the memcmp character comparison in unsigned char.

I've left the loop as suggested but I expect to need to go back to something closer to the original in the strncmp followup.

courbet accepted this revision.Jun 15 2022, 11:23 PM

I've left the loop as suggested but I expect to need to go back to something closer to the original in the strncmp followup.

SGTM, but let's do it when we get there :)

llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
1004

Please run clang-format before pushing.

This revision is now accepted and ready to land.Jun 15 2022, 11:23 PM
nikic accepted this revision.Jun 16 2022, 1:02 AM

LG. I think this will temporarily regress handling of zeroinitializer, but that should get addressed by D127443 soon.

llvm/test/Transforms/InstCombine/memcmp-6.ll
14

charcater -> character

17

The pcmp argument is dead.

This revision was landed with ongoing or failed builds.Jun 17 2022, 9:37 AM
This revision was automatically updated to reflect the committed changes.