This is an archive of the discontinued LLVM Phabricator instance.

[libc] New version of the mem* framework
ClosedPublic

Authored by gchatelet on Oct 4 2022, 1:56 AM.

Details

Summary

This version is more composable and also simpler at the expense of being more explicit and more verbose.
This patch is not meant to be submitted but gives an idea of the change.
Codegen can be checked in https://godbolt.org/z/6z1dEoWbs by removing the "static inline" before individual functions.

Unittests are coming.

Suggested review order:

  • utils
  • op_base
  • op_builtin
  • op_generic
  • op_x86 / op_aarch64
  • *_implementations.h

Diff Detail

Event Timeline

gchatelet created this revision.Oct 4 2022, 1:56 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptOct 4 2022, 1:56 AM
gchatelet requested review of this revision.Oct 4 2022, 1:56 AM
gchatelet retitled this revision from [libc] New version of the framework to [libc] New version of the mem* framework.Oct 4 2022, 1:57 AM
courbet added inline comments.Oct 4 2022, 5:00 AM
libc/src/string/memory_utils/utils.h
57

offset_to_align_up would be consistent with existing terminology (__builtin_align_up and co.) and would avoid having two meanings for next.

69–70

This terminology is not consistent to what's above. It should be offset_to_cacheline_up or something like this. Also it would not hurt to comment the behaviour explicitly.

109–145

What's strict about that scalar :) ?

168

Maybe // precondition: &p1 != &p2, because __restrict is not part of the signature.

courbet added inline comments.Oct 4 2022, 5:10 AM
libc/src/string/memory_utils/op_base.h
8 ↗(On Diff #464920)

A file comment would help.

23 ↗(On Diff #464920)

It would be nice to define some terminology and have a better name for T, maybe Op ?

gchatelet updated this revision to Diff 465016.Oct 4 2022, 8:00 AM
  • Address comments
gchatelet marked 4 inline comments as done.Oct 4 2022, 8:02 AM
gchatelet added inline comments.
libc/src/string/memory_utils/utils.h
69–70

It is unused, I've removed it.

courbet added inline comments.Oct 4 2022, 8:24 AM
libc/src/string/memory_utils/README.md
28

// Note that 0 to 4 bytes are written twice.

69

I would say "specializations"

88

__builtin_memset_inline, given that you're talking about memset here.

libc/src/string/memory_utils/op_generic.h
20

"with a single native operation" ?

48

Pairs ?

67

Any reason fro splat to store ? I would expect static inline Type splat(uint8_t value).

95

x86 + SSE does not have this.

100

d

128

Support is not necessarily a linear range of powers of 2: e.g. x86+SSE which has 1,2,4,16

gchatelet updated this revision to Diff 465361.Oct 5 2022, 5:54 AM
gchatelet marked 8 inline comments as done.
  • Address comments and make feature usage explicit
gchatelet updated this revision to Diff 465363.Oct 5 2022, 6:11 AM
  • Improve documentation
gchatelet updated this revision to Diff 465368.Oct 5 2022, 6:23 AM
gchatelet marked 2 inline comments as done.
  • Fix missing uint64_t for 32bit platforms
gchatelet marked an inline comment as not done.Oct 5 2022, 6:25 AM
gchatelet added inline comments.
libc/src/string/memory_utils/op_generic.h
128

I've removed the support for uint64_t on 32bit platforms

courbet added inline comments.Oct 6 2022, 4:15 AM
libc/src/string/memory_utils/op_generic.h
354–360

not sure why there are two diagrams (here and below)

libc/src/string/memory_utils/utils.h
128

MemCmpResultType, or MemCmpReturnType. (I would expect the type of the symbol to be "pointer to function").

courbet added inline comments.Oct 6 2022, 4:36 AM
libc/src/string/memory_utils/op_aarch64.h
37

Does the class need to be a template ?

82

you might want to get someone more familiar with Aarch64 to review this part.

libc/src/string/memory_utils/op_generic.h
94

Do we support any non-32-bit platform ? e.g. 8- ot 16- bit ones ?

95

Should we conditionally enable on platforms we *know* support 64 bit natively, rather than conditionally disable ?

libc/src/string/memory_utils/op_x86.h
107

please document why the static_cast here is fine.

117

maybe make all casts from bool to uint32_t explicit ?

181

bytes

libc/src/string/memory_utils/utils.h
129

Technically bcmp and memcmp return int, not uint32_t, so you need to check that sizeof(uint32_t) <= sizeof(int) on the platform, and the final bcmp implementation LLVM_LIBC_FUNCTION(int, bcmp) should bitcast to int.

gchatelet updated this revision to Diff 465746.Oct 6 2022, 8:11 AM
gchatelet marked 9 inline comments as done.
  • Address comments
gchatelet added inline comments.Oct 6 2022, 8:13 AM
libc/src/string/memory_utils/op_aarch64.h
37

It does for the loop and tail function below.
I'd rather not have the template on the function instead of the struct because it will break code symmetry in the implementation.

libc/src/string/memory_utils/op_generic.h
94

No. I've added a static_assert.

354–360

This is to illustrate source vs destination alignment.
I've slightly edited the diagrams to point out where the buffer is actually aligned.

courbet added inline comments.Oct 7 2022, 12:30 AM
libc/src/string/memory_utils/memcmp_implementations.h
58–79

can we maybe implement these 3 functions as a template over the x86:xxx:MemCmp function to avoid code duplication ?

105

can you comment on why 32 here ?

gchatelet updated this revision to Diff 466014.Oct 7 2022, 2:26 AM
  • Fix builtin return type
  • factor generic implementation for memcmp
gchatelet marked 2 inline comments as done.Oct 7 2022, 2:30 AM
gchatelet added inline comments.
libc/src/string/memory_utils/memcmp_implementations.h
105

The implementations here are not final, I need to benchmark them thoroughly before pushing the code.

gchatelet marked an inline comment as done.Oct 7 2022, 2:34 AM

godbolt link to check codegen https://godbolt.org/z/fMTvfM3xn (remove static inline before mem* functions of interest at EOF)

courbet accepted this revision.Oct 7 2022, 2:48 AM
This revision is now accepted and ready to land.Oct 7 2022, 2:48 AM
  • Also updating memmove / bcmp and memcmp algorithm
gchatelet updated this revision to Diff 467130.Oct 12 2022, 7:16 AM
  • Updated the bcmp, memcmp, memmove and memcpy implementations
  • Updated utils.h function names as courbet@ suggested (align_up)
gchatelet updated this revision to Diff 467149.Oct 12 2022, 7:50 AM
  • Fix wrongly renamed variable in utils_test
gchatelet updated this revision to Diff 467152.Oct 12 2022, 7:58 AM
  • Also update bazel config
This revision was automatically updated to reflect the committed changes.

Why does this keep getting reverted and relanded? There is nothing in the commit description besides the default one from git, can you please include one? https://llvm.org/docs/GettingStarted.html#reverting-a-change

Why does this keep getting reverted and relanded? There is nothing in the commit description besides the default one from git, can you please include one? https://llvm.org/docs/GettingStarted.html#reverting-a-change

Thx for pointing me to the doc. I was unaware of this process.
Also my apologies here for the noise. Admittedly this is painful for everyone in CC 🙇

I initially thought I could fix the bug easily but there were different ones that I couldn't get to see because I don't have the hardware nor presubmit bots to catch these errors before submitting.
FTR errors were:

  • missing include for aarch64
  • off by one error in the implementation of memmove for embedded devices (ARM32)
  • warning as error for unused functions breaking the build for certain configurations
  • typos for embedded versions of bcmp anc memcmp.

This appears to cause some subtle misbehavior on Arm. I am working on the author to get a good testcase and will revert shortly.

libc/test/src/string/memory_utils/CMakeLists.txt