Add the allocsize function-level attribute to LLVM.
This attribute allows users to tag arbitrary functions as "malloc-like", so intrinsics like @llvm.objectsize can give higher quality results without requiring modifications to LLVM itself.
Paths
| Differential D14933
Add the allocsize attribute to LLVM ClosedPublic Authored by george.burgess.iv on Nov 23 2015, 12:56 PM.
Details Summary Add the allocsize function-level attribute to LLVM. This attribute allows users to tag arbitrary functions as "malloc-like", so intrinsics like @llvm.objectsize can give higher quality results without requiring modifications to LLVM itself.
Diff Detail Event Timelinegeorge.burgess.iv updated this object. Comment Actions I was wondering whether we could go a bit more generic and support arbitrary expressions (like llvm.assume). This design only supports allocation functions that return a pointer to the beginning of a buffer with size p1 * p2 * ... * pn, where pi are the parameters. It would be nice to support arbitrary expressions for size and offset. There are many cases in practice where these would be helpful. Of course then we would need to expose a generalized version of gcc's attribute to applications. Comment Actions
I'm fine allowing this attribute to store a Function* or Value* that LLVM tries to constant fold at each callsite, but I can't see how llvm.assume fits into the picture at all. Can you give a concrete example of what you're trying to accomplish with the addition you proposed? Comment Actions
I mentioned assume as a precedent for something that takes an arbitrary expression which is not codegen'ed. george.burgess.iv mentioned this in D16390: Introduce allocsize with arbitrary expressions.Jan 20 2016, 6:14 PM Comment ActionsMy stab at allowing LLVM to support arbitrary expressions in allocsize is available at D16390. It needs some refactoring/cleanup, but it gives you an idea of what I think we need to do to support allocsize with arbitrary expressions well in LLVM. If we end up _really_ wanting to support that, I'll fix the other patch up a bit and submit it for actual-review. Personally, I'm sketched out by the things that we need to do in that patch, so I think I'd rather continue with a review on this one. If the consensus is that the other patch is worth the modifications that it brings, I'll clean it up + submit it for actual-review. Comment Actions
Comment Actions D16390 is indeed quite repulsive; what do you think of annotating the callsites instead? It does increase bitcode size, but perhaps something like this would feel more natural: define i8* @checked_malloc(i32 %N) { %1 = call i8* @object_malloc(i32 %N) %2 = call i32 @llvm.objectsize.i32.p0i8(i8* %1) %max = add i32 %N, 8 %c = icmp ule %max, %2 call void @llvm.assume(i1 %c) %3 = icmp lt %2, %N br i1 %3, label %Trap, label %Return ... } You could also introduce dedicated intrinsics (and maybe even wrap the calls with them, to make the lookup trivial). Comment Actions Let me modify your code sample a bit, so it represents what the transformation would really look like if we wanted to support arbitrary expressions.
So let's walk through what needs to happen here:
Because of this, we either have to get rid of @__autogenerated_get_object_size_of_object_malloc before inlining, or we need to mark it noinline so it's not impossible to remove from functions. The former case means we lose our allocsize *really* early, and the latter means we probably get slower code (especially if the function amounts to return {N - 8, 8};) Also, we'd need a highly restricted user function constant folder to make this work, just like in D16390. Plus, @llvm.assume says on the tin that it makes optimizing things harder, so this makes optimizing any function that calls an allocation function harder. Ouch. Okay, let's try something else. define i8* @checked_malloc(i32 %N) { %1 = call i8* @object_malloc(i32 %N) call void @llvm.assert_object_size_is(i8* %1, {i32, i32}(i32)* @__autogenerated_get_object_size_of_object_malloc, i32 %N) ... } This seems more reasonable. Here are the problems it poses, as I see it:
...And this is the best alternate design I can think of. Does this match what you were thinking? If so, I'm not at all convinced that it's strictly better than what I'm presenting here, which prevents no optimizations, is very lightweight, doesn't require special handling everywhere, doesn't need a custom constant folder, etc. Honestly, and this is me speaking from a position of complete ignorance as to what the actual real-life use case of this would be, in what case are tiny wrapper functions unacceptable? The only things I have to pull from are here: http://lists.llvm.org/pipermail/cfe-dev/2012-June/022272.html -- can you tell me why the below alternatives don't work just as well? Or give me an example where it wouldn't be possible to write a similar wrapper? I'm unable to think of one. // Original char *my_strdup(char *str) __attribute__((alloc_size_ex(strlen(str)+1))); // Modified extern char *_my_strdup_impl(const char *str, int N) __attribute__((alloc_size(1))); inline char *my_strdup(const char *str) { size_t total_len = strlen(str)+1; return _my_strdup_impl(str, total_len); } // ------------------ // Original void *my_complex_alloc(int n, int size, int add) __attribute__((alloc_size_ex(n * size + add))); // Modified -- _impl returns the start of the buffer extern void *_my_complex_alloc_impl(int total_bytes, int n, int size, int add) __attribute__((alloc_size(0))); inline void *my_complex_alloc(int n, int size, int add) { return _my_complex_alloc_impl(n * size + add, n, size, add); } // ------------------ // Original char *middle(int size) __attribute__((alloc_size_ex(size, size/2))); // Modified - Again, _middle_impl returns the start of the buffer extern char *_middle_impl(int size) __attribute__((alloc_size(0))); inline char *middle(int size) { return _middle_impl(size) + size/2; } Comment Actions Hi George, Have you considered using operand bundles to express arbitrary define i8* @checked_malloc(i32 %N) { %alloc_size = add i32 %N, 8 %1 = call i8* @object_malloc(i32 %N) [ "alloc-size"(i32 %alloc_size) ] %2 = call i32 @llvm.objectsize.i32.p0i8(i8* %1) %3 = icmp lt %2, %N br i1 %3, label %Trap, label %Return ... } and you'd fold (objectsize (call fn(...) [ "alloc-size"(N) ])) to (N). The caveat, of course, is that in this scheme you won't be able to
Comment Actions Before going further:
FWIW, I agree. One could justify the expression support as simplifying the transition for users (the alternative you show does require manual work), but I think this attribute is enough. That said, the expression support (D16390) is an interesting thought experiment (Disclaimer: just a curious bystander). So, I was thinking you'd IRGen the whole expression at every callsite; and this is what I meant with "wrapping" the calls: define i8* @checked_malloc(i32 %N) { %p = call i8* @object_malloc(i32 %N) %size = add i32 %N, 8 %1 = call i8* @llvm.assume_alloc_size_and_offset.p0i8(i8* %p, i32 %size, i32 -8) %2 = call i32 @llvm.objectsize.i32.p0i8(i8* %1) %3 = icmp lt %2, %N br i1 %3, label %Trap, label %Return ... } I think this avoids all problems except:
Also, +1 to Sanjoy's operand bundles approach: I think it's equivalent but cleaner. (Same caveat applies.) Comment Actions Thank you both for the feedback/thoughts. @sanjoy -
TIL those exist -- neat! I think that would be a better approach than requiring custom intrinsics in the example I posted above. @ab -
I agree, but I'm not sure how strong of an argument that would be, given that in most cases, the difference seems to be writing a few expressions in an attribute vs writing those same expressions in a small wrapper.
Yay :)
Ah, okay. I ruled that out initially because of the "side-effects must be allowed" restriction, but didn't consider it when we lifted that. Apologies.
Agreed Thought experiments aside, ISTM that no one's unhappy with the design of this attribute as-is. Is this correct? Comment Actions Just some minor comment I wrote a few weeks ago, but didn't get to the end of the review (and won't have time in the short term).
george.burgess.iv marked 3 inline comments as done. Comment ActionsAddressed all feedback -- thank you!
Comment Actions Rebased and split. I split this patch in hopes of making it a bit easier to review. This review now just adds the allocsize attribute. With this patch applied alone, allocsize can sit on functions and look pretty, but is useless; the review for the split out bits (coming in ~5 minutes ;) ) makes llvm.objectsize actually use the allocsize attribute. I intend to commit both patches together.
Comment Actions LGTM, but please see inline comment.
This revision is now accepted and ready to land.Apr 11 2016, 3:50 PM george.burgess.iv edited edge metadata. george.burgess.iv marked 6 inline comments as done. Comment ActionsAddressed all feedback.
Closed by commit rL266032: Add the allocsize attribute to LLVM. (authored by • gbiv). · Explain WhyApr 11 2016, 6:11 PM This revision was automatically updated to reflect the committed changes.
Revision Contents
Diff 40965 docs/LangRef.rst
include/llvm/Bitcode/LLVMBitCodes.h
include/llvm/IR/Attributes.h
include/llvm/IR/Attributes.td
lib/Analysis/MemoryBuiltins.cpp
lib/AsmParser/LLLexer.cpp
lib/AsmParser/LLParser.h
lib/AsmParser/LLParser.cpp
lib/AsmParser/LLToken.h
lib/Bitcode/Writer/BitcodeWriter.cpp
lib/IR/AttributeImpl.h
lib/IR/Attributes.cpp
lib/IR/Verifier.cpp
test/Transforms/InstCombine/allocsize.ll
test/Verifier/alloc-size-failedparse.ll
test/Verifier/allocsize.ll
|
The two parameters thing is not "super clean", but I can see how it models the reality (i.e. malloc vs calloc).
Maybe specify it as: allocsize(<EltSize>[, <NumElts = 1>])