This is an archive of the discontinued LLVM Phabricator instance.

[Support] Add fast path for StringRef::find with needle of length 2.
ClosedPublic

Authored by ishitatsuyuki on Sep 11 2022, 12:59 AM.

Details

Summary

InclusionRewriter on Windows (CRLF line endings) will exercise this in a
hot path. Calling memcmp repeatedly would be highly suboptimal for that
use case, so give it a specialized path.

Diff Detail

Event Timeline

ishitatsuyuki created this revision.Sep 11 2022, 12:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 11 2022, 12:59 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
ishitatsuyuki requested review of this revision.Sep 11 2022, 12:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 11 2022, 12:59 AM
ishitatsuyuki added a subscriber: nikic.

(Carried over from D133658)

A note that the N=2 search code is mostly mirrored from glibc; I believe it's short enough to not cause copyright issues but please let me know if I should be more defensive here.

https://codebrowser.dev/glibc/glibc/string/memmem.c.html#70

efriedma added inline comments.
llvm/lib/Support/StringRef.cpp
165

How does this compare to just the following?

do {
  if (std::memcmp(Start, Needle, 2) == 0)
    return Start - Data;
  ++Start;
} while (Start < Stop);
return npos;

Much easier to read, and the performance is probably competitive.

Use a simplified implementation for clarity.

ishitatsuyuki marked an inline comment as done.Sep 18 2022, 8:41 AM
ishitatsuyuki added inline comments.
llvm/lib/Support/StringRef.cpp
165

Thanks, memcmp is not as fast (always loads the Needle in the loop and loads 2 byte of Start instead of 1) but it should still provide sufficient benefit from inline expansion, so I've adopted the strategy.

ishitatsuyuki marked an inline comment as done.Sep 18 2022, 8:41 AM
efriedma accepted this revision.Sep 19 2022, 9:34 AM

LGTM

llvm/lib/Support/StringRef.cpp
174

As a future optimization, we could also look at making StringRef::count cache BadCharSkip.

This revision is now accepted and ready to land.Sep 19 2022, 9:34 AM

I don’t have commit rights, so can anyone commit this and https://reviews.llvm.org/D133658?

Ping, can anyone commit this?