This is an archive of the discontinued LLVM Phabricator instance.

[libc] rewrite aarch64 memcmp implementation
ClosedPublic

Authored by gchatelet on Jul 23 2021, 2:30 AM.

Details

Summary

This patch is simply rearranging the code layout so it's easier to understand.

Diff Detail

Event Timeline

gchatelet created this revision.Jul 23 2021, 2:30 AM
gchatelet requested review of this revision.Jul 23 2021, 2:30 AM

@avieira I've rearranged the conditions so it's easier to understand (no else statements). I think the generated code is the same. Can you confirm that it performs the same?

libc/src/string/aarch64/memcmp.cpp
43

@avieira shouldn't this be Skip<64> instead of Skip<32> here?

avieira added inline comments.Jul 23 2021, 3:39 AM
libc/src/string/aarch64/memcmp.cpp
43

No I think 32 is right, It only goes through two _16 Equals before reaching this on lines 34 and 38. However, since it's guaranteed >= 64 we could do something like Skip<32>::Then<Chained<_32,Loop<_16>> (untested).

I'll benchmark it and let you know.

gchatelet updated this revision to Diff 361160.Jul 23 2021, 4:50 AM
  • Fix build

Just updated the patch it was not building...

LGTM

libc/src/string/aarch64/memcmp.cpp
43

Sorry it took so long, kinda fell through the cracks. But yeah use this patch as is, don't make it Skip<64>.

I benchmarked the Chained<_32, Loop<_16>> approach and it lead to slightly worst codegen. FYI I didn't use Chained as that requires the Tail to be a constant size operation, so I had locally created a new element to do a sequence, almost like a 'ChainEnd' where the tail can be variable size and you can't chain it further.

gchatelet accepted this revision.Jul 29 2021, 5:48 AM
gchatelet marked an inline comment as done.

Thx for the review I'll let the build bot run first and submit this patch.

This revision is now accepted and ready to land.Jul 29 2021, 5:48 AM
This revision was automatically updated to reflect the committed changes.