This is an archive of the discontinued LLVM Phabricator instance.

[Support] Improve zero-size allocation with safe_malloc, etc.
ClosedPublic

Authored by andusy on Jun 21 2019, 1:30 PM.

Details

Summary

The current implementations of the memory allocation functions mistake a nullptr returned from std::malloc, std::calloc, or std::realloc as a failure. The behaviour for each of std::malloc, std::calloc, and std::realloc when the size is 0 is implementation defined, and may return a nullptr.

If the system function returns nullptr, check if the size is 0 and return a non-null pointer using malloc.

Diff Detail

Event Timeline

andusy created this revision.Jun 21 2019, 1:30 PM

Good find!

I don't believe this is the right way to go, however. As you point out malloc(0) is undefined behavior, I get you are trying to define it here, but I'm not sure it fits in with the ethos of this routine being 'safe', in my opinion. I would think that asking for 0 bytes would be indicative of a bug that can be found in testing debug builds. Personally, assert(Sz && "Tried to allocate 0") seems like it would suffice, but I see why you went in this direction.

I don't believe this is the right way to go, however. As you point out malloc(0) is undefined behavior

It isn't. It is implementation-defined whether or not it returns a null pointer. In either case, it returns a value on which free can be safely called.

I would think that asking for 0 bytes would be indicative of a bug that can be found in testing debug builds. Personally, assert(Sz && "Tried to allocate 0") seems like it would suffice, but I see why you went in this direction.

malloc and friends are supposed to succeed when asked for 0 bytes except when there is insufficient space. The caller is within its rights to ask for 0 bytes and expect a value on which it may call free except when there is insufficient space.

It isn't. It is implementation-defined

You're right it isn't undefined as in nondeterministic, disregard this then.

malloc and friends are supposed to succeed when asked for 0 bytes except when there is insufficient space

I don't know of a situation where I would want to call malloc to allocate 0, which is why I thought assert made more sense, it's not about rights of calling malloc(3). If my code calls safe_malloc(0), I want to know about it, and the assert also solves aborting and printing a no memory error when errno != ENOMEM.

Just a quick grep of only safe_malloc, it seems like none of its clients would ever call it with 0 in a normal circumstance, although I may be wrong here.

For reference, GNU's widely internally used xmalloc which is analogous to this one in its contract does the same as you propose, it has an if (size == 0) size = 1; at the beginning. libcxx also does this for its implementation of new and I assume other C++ STL implementations do the same, so almost all memory allocated in llvm (assuming make_unique and unique_ptr::unique_ptr call new) will not notify us of asking for 0 bytes. For more (arguably useless) reference, mmap(2) which is also widely used in the code base (at least in my experience but only for file IO) does not allow length to be 0.

I would wait to see what other reviewers think. I just figured I would throw in the assert(Sz) idea, but maybe I'm the only one who thinks it makes sense :)

malloc(0) can be used to allocate a zero-sized object. To avoid having a null pointer to the zero-sized object, implementations often change to allocate 1 byte instead. So this patch makes sense to me.

Just a quick grep of only safe_malloc, it seems like none of its clients would ever call it with 0 in a normal circumstance, although I may be wrong here.

The call to safe_calloc in https://llvm.org/doxygen/SparseSet_8h_source.html can request 0 bytes in a "normal circumstance". This change solves a portability issue with these functions as-is. Requiring a change to the client code to never pass in zero is a change in the contract of these functions. It is not definitively wrong for the client code to request zero bytes, and requiring the client code to add extra special-case handling can have drawbacks.

The change in this patch is within the scope of being a bug fix, and the implications of the change is rather well understood. A change to add the assertion that the request was not for zero bytes will broadly affect platforms where the system malloc does perform allocation when asked for zero bytes, the impact of which is not entirely known. What is known is that there is existing client code that would need additional changes should the assertion be added.

xingxue accepted this revision.Jun 24 2019, 9:51 AM

LGTM.

This revision is now accepted and ready to land.Jun 24 2019, 9:51 AM

What is known is that there is existing client code that would need additional changes should the assertion be added.

Agreed. it's just that malloc(0 /* -> 1 */) is expensive in that it fragments the memory pool for no good reason (there is a case that this doesn't matter because all llvm tools are short lived programs, but I don't think that's a reason to ignore fragmentation). It would be better design if the clients calling safe_* took steps to not do this even if the C standard allows it, in my opinion.

There are of course reasons to not go this way as you two have pointed out, sorry for all the noise.

LGTM with minor comment (that can be applied when committing).

llvm/include/llvm/Support/MemAlloc.h
28

Suggested comment before this line (and similarly for the other two functions):

// It is implementation-defined whether allocation occurs if the space
// requested is zero (ISO/IEC 9899:2018 7.22.3). Retry, requesting non-zero,
// if the space requested was zero.
This revision was automatically updated to reflect the committed changes.