This is an archive of the discontinued LLVM Phabricator instance.

Introduce allocsize with arbitrary expressions
Needs ReviewPublic

Authored by george.burgess.iv on Jan 20 2016, 6:14 PM.
This revision needs review, but there are no reviewers specified.

Details

Reviewers
None
Summary

Note: I'm not thrilled with the compromises this patch makes; it's more here to show an approach we can take to get allocsize to support arbitrary expressions in LLVM. If we decide that the changes made are acceptable, I'll tidy this up and request a more in-depth review.

This patch introduces a more general version of the allocsize attribute proposed in D14933. It would allow code like the following to be lowered to LLVM:

// Allocates an ObjectHeader, with N bytes of space for an Object after it. Returns a pointer
// right after the header.
Object *object_malloc(int N) __attribute__((alloc_size_ex(N + sizeof(ObjectHeader), sizeof(ObjectHeader)));

Object *checked_malloc(int N) {
	Object *O = object_malloc(N);
	if (__builtin_sizeof(O, 0) < N)
		__builtin_trap();
	return O;
}

The resulting LLVM IR would look something like (simplified; clang would probably throw in allocas, etc.):

declare i8* @object_malloc(i32 %N) allocsize({i32, i32}(i32)* @__object_malloc_alloc_size_fn, [0])

; Assuming sizeof(ObjectHeader) == 8
define private {i32, i32} @__object_malloc_alloc_size_fn(i32 %N) unnamed_addr {
	%1 = add i32 %N, 8
	%2 = insertelement {i32, i32} {i32 0, i32 8}, i32 %N, 0
	ret {i32, i32} %2
}

define i8* @checked_malloc(i32 %N) {
	%1 = call i8* @object_malloc(i32 %N)
	%2 = call i32 @llvm.objectsize.i32.p0i8(i8* %1)
	%3 = icmp lt %2, %N
	br i1 %3, label %Trap, label %Return
Trap:
	call void @llvm.trap()
	unreachable
Return:
	ret i8* %1
}

@llvm.compiler.used = appending global [1 x {i32, i32}(i32)*] [
	@__object_malloc_alloc_size_fn
]

This design buys us moderately low-overhead support for allocsize with arbitrary expressions, and doesn’t unreasonably impact code size. (read: Ideally, @__object_malloc_alloc_size_fn should be discarded by the linker, so the only issue we have is that allocation functions tagged with different allocsize functions can’t be deduped. In practice, this probably isn’t an issue, because if two functions return different looking results, they probably don’t have identical bodies.)

The implementation of it, however, is not ideal. It requires the following:

  • Attributes must become mutable, which makes relative ordering unstable across RAUWs, and breaks uniquing a bit.
  • Attributes aren’t Users, so an external mechanism is needed to handle the lifetime of e.g. @__object_malloc_alloc_size_fn from above. @llvm.compiler.used is the current suggestion.
    • N.B. AssertingVH can't be used with ^, because Functions are deleted before Attributes. So we wait to assert(false) on deletion until the user actually tries to use the Attribute. This is a great place for subtle bugs to creep in unnoticed. This can be fixed by having a IsBeingDeleted member in LLVMContextImpl that's only true when LLVMContextImpl is being destructed, but... ew.
  • Type safety isn't ideal; Attribute::PseudoCall has a Constant * instead of a Function * for Fn because BitcodeReader hands us a non-Function placeholder. Fixing it probably involves duplicating a lot of methods.
  • Custom trivial call folding had to be added, because we never *actually* call e.g. @__object_malloc_alloc_size_fn in the code. In some cases, it's correct to emit this call; in others, it’s incorrect. Namely, when the allocsize function has side-effects/can trap/...
  • And more!

Any suggestions for how to better solve these problems are highly appreciated. The best overall solution seems to be handling most/all of this in Clang alone. We lose the (probably substantial) benefits of inlining/general optimizations, which makes alloc_size_ex less helpful, but doing so would almost certainly be a less error-prone approach.

Diff Detail

Event Timeline

george.burgess.iv retitled this revision from to Introduce allocsize with arbitrary expressions.
george.burgess.iv updated this object.
george.burgess.iv added a subscriber: llvm-commits.