This is an archive of the discontinued LLVM Phabricator instance.

[libc] Add utils for memory functions
ClosedPublic

Authored by gchatelet on Jan 27 2020, 6:45 AM.

Details

Summary

This patch adds a few low level functions needed to build libc memory functions.

Diff Detail

Event Timeline

gchatelet created this revision.Jan 27 2020, 6:45 AM
Herald added a project: Restricted Project. · View Herald Transcript
gchatelet marked 7 inline comments as done.Jan 27 2020, 6:56 AM

A few questions come with this patch and will help establish guidelines for the next ones.

libc/cmake/modules/LLVMLibCRules.cmake
320 ↗(On Diff #240561)

I had to test for non entrypoint libraries first as the INTERFACE do not have the "TARGET_TYPE" property.

libc/src/memory/constants.h
1 ↗(On Diff #240561)

I assume these files will stay as implementation details and are not to be exposed but feel free to challenge this assumption.

12 ↗(On Diff #240561)
libc/src/memory/utils.h
20 ↗(On Diff #240561)

Do we prefer to annotate all functions static instead?

libc/test/src/memory/utils_test.cpp
12 ↗(On Diff #240561)

Is this ok to rely on the libc++ for the tests?

25 ↗(On Diff #240561)

@sivachandra there are no ASSERT_TRUE/ASSERT_FALSE and ASSERT_EQ do not handle bool values so I had to cast is_power2_ro_zero to int

85 ↗(On Diff #240561)

@sivachandra ASSERT_EQ needs to have the same type on both sides so I had to use the L suffix literal. Technically we should write intptr_t(0) here and below but it hurts readability.

abrachet added inline comments.
libc/src/memory/constants.h
46–48 ↗(On Diff #240561)

Maybe comment the original #if's, 3 #endif's gets a little hard to follow.

libc/src/memory/utils.h
14–15 ↗(On Diff #240561)

We'll need to wait for @sivachandra for this one, because I really don't know. But look how mman.h gets size_t, https://github.com/llvm/llvm-project/blob/master/libc/config/linux/api.td#L7
https://github.com/llvm/llvm-project/blob/master/libc/config/linux/api.td#L106
I don't think this internal header needs to go through the same process but we don't have stddef or stdint, and I think we stopped including library headers like this.

Also for @sivachandra the definition of size_t includes stddef is that ok?

20 ↗(On Diff #240561)

The coding standard says so, yes.

57 ↗(On Diff #240561)

Is inline necessary?

libc/test/src/memory/utils_test.cpp
12 ↗(On Diff #240561)

No. In any case the burden to use int [N] instead of std::array<int, N> is very small.

24 ↗(On Diff #240561)

No need for the brackets.

Not that it matters here but I did end up finding the mailing list archive where they talked about preferring signed types for loops if you're curious. http://lists.llvm.org/pipermail/llvm-dev/2019-June/132890.html

82 ↗(On Diff #240561)

No need to specify the return, it doesn't really even help readability really when the return type could not possibly be clearer in this case.

101–102 ↗(On Diff #240561)

Should there be a space here?

MaskRay added inline comments.Jan 27 2020, 11:06 AM
libc/src/memory/utils.h
14–15 ↗(On Diff #240561)

stddef.h is ok. Several headers define size_t, but that list does not include stdint.h . So, yes, we need stddef.h

sivachandra added inline comments.Jan 27 2020, 11:36 AM
libc/cmake/modules/LLVMLibCRules.cmake
320 ↗(On Diff #240561)

This is OK. I have another change to add header only libraries which should improve the situation a little.

libc/src/memory/CMakeLists.txt
8 ↗(On Diff #240561)

Its odd that you have to add link libraries for header only libraries. Is it because this is the only way to setup the dependencies correctly?

libc/src/memory/constants.h
1 ↗(On Diff #240561)

This is OK. A nit: the use of plural "constants" made me look for more than one constants. But, if I am reading correctly, there is only one constant. So, unless you think that there will be more constants added in the near future, why not call it just cacheline_size.h?

46–48 ↗(On Diff #240561)

For me, this is OK. But, as @abrachet points out, to improve readability I prefer making this a generated header. It also makes it consistent with the rest of the code base.

$> cat cacheline_size.h.def
<license header>
%%include(${machine_cacheline_size})

and add a rule for the header like this:

add_gen_hdr(
  cacheline_size,
  DEF_FILE
    cacheline_size.h.def
  GEN_HDR
    cacheline_size.h
  PARAMS
    machine_cacheline_size=${LIBC_TARGET_MACHINE}_cacheline_size.h.inc
  DATA_FILES
    ${LIBC_TARGET_MACHINE}_cacheline_size.h.inc # That is, put machine specific cacheline sizes in different files.
)
libc/src/memory/utils.h
14–15 ↗(On Diff #240561)

A high level point is, we do not intend to add freestanding C headers like stddef and stdint to LLVM libc. So, including them directly is fine as they do not interfere with our headers.

About picking size_t: in a public header, to avoid name pollution, we need to "pick" only size_t and not anything else. In an internal header however, including everything from stddef is OK. But, adding a comment to say that stddef is being included for size_t will be good. [For stdint, it is obvious so you need not add such a comment.]

20 ↗(On Diff #240561)

To avoid static, would an always_inline attribute help?

libc/test/src/memory/utils_test.cpp
12 ↗(On Diff #240561)

Agree with @abrachet. However, if you want convenience functions like size et al, I am OK with adding a class like LLVM's ArrayRef. In fact, I have another patch in the works to help with your prep here, so I can add it there.

25 ↗(On Diff #240561)

The reason why I have chosen not to have ASSERT_[TRUE|FALSE], and not to support ASSERT_EQ for bool is this: In a C library, a return value is often more than tri-state. Hence, not having support for bool forces one to check for the exact return value.

I can see that it is being an annoyance for cases such as yours. But, can you actually make functions like is_power2_or_zero return an int value?

85 ↗(On Diff #240561)

This was done for making sure that exact types are being compared. So, using intptr_t is the right thing to do. However, to make it less verbose, you can probably define a convenience like:

using I = intptr_t;
101–102 ↗(On Diff #240561)

Yes :)

sivachandra added inline comments.Jan 27 2020, 3:02 PM
libc/src/CMakeLists.txt
4 ↗(On Diff #240561)

Sorry, I did not catch it in the first pass: I think the memory functions are declared in string.h. In which case, you should not be adding a new directory called memory, but make use of the existing string directory.

gchatelet marked 28 inline comments as done.Jan 29 2020, 6:43 AM
gchatelet added inline comments.
libc/cmake/modules/LLVMLibCRules.cmake
320 ↗(On Diff #240561)

Sounds good to me. I'm waiting for it to be in and I'll revert this change.

libc/src/CMakeLists.txt
4 ↗(On Diff #240561)

I expect quite a few files in here, how about a details subfolder?

libc/src/memory/CMakeLists.txt
8 ↗(On Diff #240561)

It seems to be the way to declare transitive dependency on header only libraries yes.
https://cmake.org/pipermail/cmake/2017-October/066366.html

If someone knows a better way I'm happy to take any suggestions.

libc/src/memory/constants.h
1 ↗(On Diff #240561)

Yes definitely. Thx!

libc/src/memory/utils.h
20 ↗(On Diff #240561)

Let's stick to static for now, I'll probably need the always_inline attribute at some point though.

57 ↗(On Diff #240561)

no indeed

libc/test/src/memory/utils_test.cpp
12 ↗(On Diff #240561)

Keeping as is for now and waiting for D73530 to be in to move to ArrayRef

24 ↗(On Diff #240561)

Not that it matters here but I did end up finding the mailing list archive where they talked about preferring signed types for loops if you're curious. http://lists.llvm.org/pipermail/llvm-dev/2019-June/132890.html

Yeah I've followed the thread :) I've not been convinced by the signed arguments: I've seen code where using size_t instead of plain int lead to faster code.

25 ↗(On Diff #240561)

Thx for the explanation I understand the rationale now.

This makes sense when testing pure C code but most of the implementation for the llvm-libc will heavily rely on C++ so it seems like an unnecessary burden.

For instance utils.h has a templated function so it does not compile as pure C.

I believe all tests are written in C++ as well (LibCUnitTest is itself C++)

I can see that it is being an annoyance for cases such as yours. But, can you actually make functions like is_power2_or_zero return an int value?

Then I'd have to think about the tri-state in the implementation as well which is not ideal. bool is a great C++ tool that I want to use.

How about the opposite?
We could have macros to disable the bool versions when testing the C functions themselves?

TEST(TestCFunc, strlen) {
START_LIBC_UNIT_TEST_C
/// ASSERT_[TRUE|FALSE] and ASSERT_EQ for bool are disabled
STOP_LIBC_UNIT_TEST_C
}

I might be biased but I think that most of the unittests will be devoted to the internals of the library and not the actual lic functions.

We could go even simpler with a one liner at the top of the test file UNIT_TESTS_FOR_C_FUNCTION to disable the problematic checks. WDTY?

gchatelet updated this revision to Diff 241147.Jan 29 2020, 6:50 AM
gchatelet marked 8 inline comments as done.
  • address comments

I think there are a few files left over which should have been removed in lieu of their string/details/ counterparts.

libc/src/CMakeLists.txt
4 ↗(On Diff #240561)

+1

libc/test/src/memory/utils_test.cpp
25 ↗(On Diff #240561)

Most unit tests I presume would end up being for C library functions which even if written in C++ will not return bool. -1 for START_LIBC_UNIT_TEST_C it's just too ugly and verbose. At most we could do something like

#define LIBC_WANT_BOOL_SPECIALIZATIONS
#include "utils/UnitTest/Test.h"

But really I think this is too much too. Realistically we can provide *_EQ that handles bool because no such type exists in C so no library functions would return it, and as long as we don't have *_TRUE, *_FALSE we still have the safety Siva is hoping to achieve. I don't have a strong preference though.

libc/test/src/string/details/utils_test.cpp
17 ↗(On Diff #241147)

Nit: we don't need ({...}) it can just be {...}`. But I won't it if you prefer how it is currently.

79–81 ↗(On Diff #241147)

forge could be in global scope? Also, is const meaningful here?

100–101 ↗(On Diff #241147)

newline between these

sivachandra added inline comments.Jan 29 2020, 3:35 PM
libc/src/CMakeLists.txt
4 ↗(On Diff #240561)

I am OK even without a sub-directory. But, in case you want to add one, name it memory_utils may be?

libc/src/memory/CMakeLists.txt
8 ↗(On Diff #240561)

You should not need all of this now I think. Just use add_header_library.

libc/test/src/memory/utils_test.cpp
25 ↗(On Diff #240561)

Even I do not like the enable/disable marcos. But, what do you think about this which adds [EXPECT|ASSERT]_[TRUE|FALSE]: https://reviews.llvm.org/D73668

In this test, you can use them like:

if (kExpectedValues[i] == 1)
  ASSERT_TRUE(is_power2_or_zero(i));
else
  ASSERT_FALSE(is_power2_or_zero(i));
abrachet added inline comments.Jan 29 2020, 3:54 PM
libc/test/src/string/details/utils_test.cpp
25 ↗(On Diff #241147)

I think all of the ASSERT_*'s could be EXPECT_* instead. As a matter of preference I think ASSERT should be used to test invariants that would lead to other failures or crashes. No strong preference.

gchatelet marked 13 inline comments as done.

address comments and rebase

libc/test/src/memory/utils_test.cpp
25 ↗(On Diff #240561)

Sounds good, waiting for D73668 to be in then.
Quick question though why do you prefer a EXPECT_TRUE/EXPECT_FALSE over a EXPECT_EQ(bool, bool) with an explicit instantiation? explicit would guard against automatic promotions to bool.

libc/test/src/string/details/utils_test.cpp
17 ↗(On Diff #241147)

Acknowledged. I'm keeping it as is because of the switch to cpp::ArrayRef

25 ↗(On Diff #241147)

Yes indeed, my mistake.

  • Revert LLVMLibCRules.cmake and fix tests
gchatelet marked an inline comment as done.Jan 30 2020, 12:59 AM
gchatelet added inline comments.
libc/utils/CPP/ArrayRef.h
84 ↗(On Diff #241363)

It's necessary to scope size, we can move this to another patch though.

sivachandra added inline comments.Jan 30 2020, 8:42 AM
libc/utils/CPP/ArrayRef.h
84 ↗(On Diff #241363)

Sorry, I thought I had the a this-> to go with it. Will land the fix soon.

sivachandra accepted this revision.Jan 30 2020, 11:41 AM

Just one nit, but LGTM.

libc/src/string/memory_utils/utils.h
10

s/MEMOR/MEMORY

This revision is now accepted and ready to land.Jan 30 2020, 11:41 AM
gchatelet updated this revision to Diff 241645.Jan 31 2020, 1:13 AM
gchatelet marked 2 inline comments as done.

Address comments and rebase

This revision was automatically updated to reflect the committed changes.