This is an archive of the discontinued LLVM Phabricator instance.

[LIBC] Add an optimized memcmp implementation for AArch64
ClosedPublic

Authored by avieira on Jul 5 2021, 10:31 AM.

Details

Summary

Hi,

Here is an optimized memcmp for AArch64. This uses __builtin_memcmp_inline for which I have a RFC at D105440 .

I also changed some general implementations to prefer using Equal where possible and less ThreeWayCmp's. I have not benchmarked these changes on x86 though. Let me know what ya think.

I also added extra testing as I found the existing tests to be lacking at times.

Kind regards,
Andre

Diff Detail

Event Timeline

avieira created this revision.Jul 5 2021, 10:31 AM
avieira requested review of this revision.Jul 5 2021, 10:31 AM
Matt added a subscriber: Matt.Jul 5 2021, 10:33 AM

Thx for the patch @avieira .
Can you run clang-format on all the files? (it should also fix include ordering issues).
I'll have a look at __builtin_memcmp_inline shortly.

libc/src/string/aarch64/memcmp.cpp
18

Let's use const char *lhs, const char *rhs, size_t count here and do the casting in the calling function.

34–51

The logic here is not straightforward, could you try to rewrite it with less state? Maybe with the help of helper functions?

libc/src/string/memory_utils/elements.h
467–469

Thx for the fix!

libc/src/string/memory_utils/elements_aarch64.h
16

You'd need to #include <src/string/memory_utils/elements.h>

libc/test/src/string/memcmp_test.cpp
46–50

The correctness of this test depends on the state set up by the previous test.
Could you explicitly set the state of the buffers before each tests? (e.g. memset(lhs, 'a', sizeof(lhs));)
It makes the test more maintainable.

avieira updated this revision to Diff 356913.Jul 7 2021, 3:37 AM

I think I've addressed all the comments and style issues.

A few more nits but almost there.

libc/src/string/aarch64/memcmp.cpp
34

This would be return ThreeWayCompare<Tail<_16>>(lhs, rhs, count);

libc/src/string/memory_utils/elements_aarch64.h
12

I think there should be a newline here, can you rerun clang-format?

61

#endif // __ARM_NEON

libc/test/src/string/memcmp_test.cpp
38

static constexpr size_t kMaxSize = 1024; then use it below.

avieira updated this revision to Diff 356928.Jul 7 2021, 5:44 AM
avieira added inline comments.
libc/src/string/aarch64/memcmp.cpp
34–51

Yeah I originally wrote it like that because I found it reduced code-size, but the new approach actually works better :)

libc/src/string/memory_utils/elements_aarch64.h
12

clang-format didn't add one, so I added it manually.

gchatelet accepted this revision.Jul 7 2021, 6:09 AM

One more nit but it's good to go.

libc/src/string/memory_utils/elements_aarch64.h
63

Indeed :)

libc/test/src/string/memcmp_test.cpp
44

here as well and below.

This revision is now accepted and ready to land.Jul 7 2021, 6:09 AM
avieira updated this revision to Diff 356936.Jul 7 2021, 6:35 AM
avieira added inline comments.
libc/test/src/string/memcmp_test.cpp
44

Doh!

gchatelet accepted this revision.Jul 7 2021, 7:04 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJul 7 2021, 7:59 AM