If BumpPtrAllocator is asked to allocate zero bytes—common when creating an array which might be empty—it will usually violate either its no-null attribute or its no-aliasing attribute. Modify it to consistently return nullptr and remove the attribute claiming that it never will.
Details
Diff Detail
- Repository
- rL LLVM
- Build Status
Buildable 22016 Build 22016: arc lint + arc unit
Event Timeline
I'm not sure whether it's better to do this or to remove LLVM_ATTRIBUTE_RETURNS_NONNULL. I'll defer to Hans or others on that, since I'm not a frequent LLVM contributor these days.
I will note that it's a little funny to allocate a different char for every possible BumpPtrAllocatorImpl instantiation. Maybe that should be moved outside the template?
I don't think we should remove LLVM_ATTRIBUTE_RETURNS_NONNULL. But where do zero-size allocations come up? Maybe we should assert that's not allowed? The current patch won't return unique pointers, which the caller might expect.
Zero-size allocations usually happen when an array or string happens to have zero elements. I'm trying to clean up UBSan failures in the Swift compiler, and there are several places where this happens there, such as when an array of code completions has no results. We could guard every call like this with an if/else, or we could wrap llvm::BumpPtrAllocator with something that has the if/else statements, or we could change llvm::BumpPtrAllocator so that it handles zero-size allocations without breaking its own contract.
If we agree that the third approach is the right one, then the implementations I can think of are to allow nullptr returns (but then people might think we return null on error), allocate a byte even on size-zero allocations (probably adds a branch, plus it wastes a little memory), or some variation of this patch's solution (returning the same address on consecutive size-zero allocations). This implementation seemed like it would be the fastest and least invasive, so it's the one I chose, but I'm not the expert here.
If we're going to keep the LLVM_ATTRIBUTE_RETURNS_NONNULL and LLVM_ATTRIBUTE_RETURNS_NOALIAS attributes, and we probably should, as well as handle zero-sized allocations, I think the only alternative is adding a `if (Size == 0) Size = 1` check.
If we do take this answer, we should *still* go to all clients and see if a zero-length check makes sense. "Copying" an empty string or empty array should definitely not result in an allocation.
If we end up in a place where you’d still want to avoid empty allocations, then I think we should go back to the previous implementation. Current clients are already receiving duplicate addresses, so this won’t cause any new bugs. We can clearly document the behavior.
An alternative might be to return deliberately invalid addresses, for instance by returning addresses from a guard page. That seems overengineered, though—I think duplicate addresses are fine.
In that case, should we just assert about it in the allocator?
I think we'd have to drop the LLVM_ATTRIBUTE_RETURNS_NOALIAS attribute though, and the behaviour would differ from malloc and operator new, which both guarantee unique return address (and it's implementation defined whether they allow zero-sized allocs). I don't know how much that matters, though.
If we do take this answer, we should *still* go to all clients and see if a zero-length check makes sense. "Copying" an empty string or empty array should definitely not result in an allocation.
In that case, should we just assert about it in the allocator?
I'm happy with that, but that's a heck of a breaking change for a low-level LLVM API.
I locally tried adding an assert(Size > 0), but also making Allocate<T>()--which doesn't have any of these attributes--return nullptr for zero elements. This design broke 1,060 tests, mostly in clang but a few in LLVM too. So asserting is going to break a ton of code. (I also tried making it waste some space, but only if you don't go through Allocate<T>()—this worked fine.)
So I think our options are:
- Assert and cause widespread breakage.
- Remove at least one of the two attributes.
- Waste a little memory per allocation (if it's not done through Allocate<T>()).
- Allocate from a guard page, or otherwise return unique invalid pointers.
I think the best solution is to make BumpPtrAllocator::Allocate() consistently return nullptr for zero-size allocations.
Zero-size allocations are already extremely common in the wild; allocating space unnecessarily just imposes a softer
requirement for users to avoid doing it; a guard page is massive overkill; the noalias guarantee seems more valuable;
and users already had to be prepared to occasionally receive nullptr returns for zero-size allocations anyway.
I added the LLVM_ATTRIBUTE_RETURNS_NONNULL attribute back in r216192 and saw a significant speedup of Clang due to this, because it allowed us to avoid null checks in placement new operations.
But starting with Clang 3.7 or GCC 8, this doesn't seem to matter anymore: https://godbolt.org/z/QabXA-
In that case dropping the attribute seems fine. But please measure first.
@hans So do you think it's okay to go at this point, or do you want more benchmarking?
(This is my first contribution to LLVM, so I can't commit it myself—if you're satisfied with it, I'd appreciate it if you could do it.)
I'd like to see some kind of benchmark to verify this doesn't have negative performance impact.
Even not so old GCC 7.3 leaves null checks there.. And since we have v4.9 (or so) as a GCC minimum supported version to build LLVM, I think we should avoid this change for now.
FWIW, just because a lot of code hits the assert for non-zero size does not (to me) mean that isn't the correct approach.
I actually think that the API is better if allocating zero size is disallowed. I think making clients think about how it makes sense in *their* case to handle this degenerate case is better than trying to do something inside the allocator. I think we'll constantly make one set of users of the API or the other unhappy: either we allocate when we shouldn't to avoid returning a nullptr, or we make clients deal with a potentially-null return.
(Some emails I sent don't seem to have gotten through.)
This is a little rough because it uses time(1), but here's twenty compiles of oggenc.c (1.7 MB) at -O3 on macOS with as much extraneous stuff closed as possible:
Without change:
103.81 real 101.95 user 1.48 sys
With change:
101.31 real 99.55 user 1.46 sys
Additional runs under less controlled conditions (e.g. a web browser open in the background) didn't necessarily show the changed version being faster, but they were always quite close. Test runs the other day with bzip2.c (cited in the original commit) also looked pretty close, but it compiled so fast that I had a tough time getting solid numbers. I had trouble getting gcc.c to compile on macOS.
I understand that perspective, but I'm not really sure there's much "dealing" for clients to do. With the patch in place, the allocator returns null only if it allocates zero bytes, meaning there's no space reserved for that pointer and it should never be dereferenced anyway. The best way to handle a zero-byte allocation is probably to leave null whatever variable you would have otherwise set to point to the buffer—and that's exactly what would happen with this patch. So I don't think the null solution will make a set of users unhappy unless they're being pedantic.
(And as a practical matter, as a first-time contributor I'm just not sure what steps I should take before making a breaking change of that magnitude. I assume you'd want to give Clang and other projects a heads-up, but I have no idea who should be involved.)