This patch fixes calculating of builtin_object_size if it depends on condition. Before this patch compiler didn't know how to calculate the object size when it finds condition that cannot be eliminated. This patch enables calculating of builtin_object_size even in case when condition cannot be eliminated by choosing minimum or maximum value as result from condition. Choosing minimum or maximum value from condition is based on second argument of __builtin_object_size function.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
Hi! Thanks for doing this.
include/llvm/Analysis/MemoryBuiltins.h | ||
---|---|---|
36 | Style nit: This should be CamelCase, not all caps. | |
39 | These identifiers seem really general to be putting them into the llvm namespace. Can we make this an enum class instead? Or can we prefix them all with something like OSM_? | |
137–139 | Please document what Mode does here. | |
lib/Analysis/MemoryBuiltins.cpp | ||
557 | Do we not need to handle Phis, as well? Consider: void sideeffect(); int foo(int N) { char Buf[100]; char *Ptr = Buf; if (N > 0) { sideeffect(); Ptr = Buf + 50; } return __builtin_object_size(Ptr, 0); } ...Which turns into: define i32 @foo(i32 %N) { entry: %Buf = alloca [100 x i8], align 16 %0 = getelementptr inbounds [100 x i8], [100 x i8]* %Buf, i64 0, i64 0 %cmp = icmp sgt i32 %N, 0 br i1 %cmp, label %if.then, label %if.end if.then: ; preds = %entry tail call void (...) @sideeffect() #4 %add.ptr = getelementptr inbounds [100 x i8], [100 x i8]* %Buf, i64 0, i64 50 br label %if.end if.end: ; preds = %if.then, %entry %Ptr.0 = phi i8* [ %add.ptr, %if.then ], [ %0, %entry ] %1 = call i64 @llvm.objectsize.i64.p0i8(i8* %Ptr.0, i1 false) %conv = trunc i64 %1 to i32 ret i32 %conv } | |
565 | This seems like it would fail given: int foo(int N) { char Big[20]; char Small[10]; char *Ptr = N ? Big + 10 : Small; return __bos(Ptr); } ...And it looks like this was a bug in the original implementation, too. Yay bugs. :) If you'd like to fix it, feel free; otherwise, I'll do it after your patch lands. If you decide to fix it, please add a test-case that doesn't go through CGP. | |
569 | Is there a reason we can't just compare TrueSide.first and FalseSide.first directly? | |
569 | Do we really want to ignore offsets here? It seems that doing would cause this to break on code like: int foo(int N) { char Small[10]; char Large[20]; char *Ptr = N ? Small : Large + 19; return __bos(Ptr); } | |
573 | Same comments as the Mode == MIN conditional above | |
lib/CodeGen/CodeGenPrepare.cpp | ||
1788 | Style nit: Add another space of indentation please | |
1792 | Can we just say Mode == MIN in the ternary here? | |
test/CodeGen/X86/builtin-condition.ll | ||
1 ↗ | (On Diff #51533) | I recommend using opt instead, so we don't have to compare against asm. Given that our test cases should be simple, you can probably get away with changing the run line to: ; RUN: opt -codegenprepare -S < %s | FileCheck %s ...And it will output LLVM IR with all of the objectsize calls lowered for you :) |
34 ↗ | (On Diff #51533) | Can we add a test-case for something like int foo(int N) { char Small[10]; char Large[20]; char *Ptr = N ? Small : Large + 19; return __bos(Ptr); } Too, please? |
Comments addressed. I was thinking to resolve handling Phis and including offset in calculating min and max values (for select and for phi) in next patch if it is OK with you.
I was thinking to resolve handling Phis and including offset in calculating min and max values (for select and for phi) in next patch if it is OK with you.
Sorry, "next patch" meaning the next revision to this review, or a different patch after this is committed?
If the former: sounds good!
If the latter: I'm fine if you want to commit Phi handling separately, but the offset issue seems like it would introduce a bug that breaks existing, correct code. I don't think that would be good. :)
include/llvm/Analysis/MemoryBuiltins.h | ||
---|---|---|
139 | Typo: "correspondig" |
Ok, I will resolve offset problem in this revision. Handling Phis will be different patch after this is committed.
So, I have question about offset problem. For this example:
int foo(int N) {
char Small[10]; char Large[20]; char *Ptr = N ? Small : Large - 19; return __builtin_object_size(Ptr,0);
}
int main() {
size_t ret; ret = foo(0); printf("\n%d\n", ret); return 0;
}
gcc gives 39 as result, I don't think that is correct value, what is your opinion about this case and similar cases ?
In clang's lib/AST/ExprConstant.cpp, it seems that we always return 0 if
our offset from the object's start is negative (or if the offset is greater
than the size of the object). I think this behavior is sane, and that it
would be odd if LLVM and clang acted differently here, so I'd just hand
back 0 bytes if our offset is negative.
lib/Analysis/MemoryBuiltins.cpp | ||
---|---|---|
572 | What do you think if we return unknown value if one of sides (true or false) has offset out of bounds, or you have some advice to return something else? Second proposal is to return some invalid value in SizeOffsetType object? |
lib/Analysis/MemoryBuiltins.cpp | ||
---|---|---|
572 |
I think that the goal of this code is to basically turn: __builtin_object_size(a ? b : c, N); into max(__builtin_object_size(b, N), __builtin_object_size(c, N)); (or min, if N is 2 or 3). If this is correct, I would expect foo below to return 3 if it isn't inlined: int foo(int i) { char c[3]; return __builtin_object_size(i ? c : c + 5, 0); } ...because __bos(c, 0) should hand back 3, __bos(c + 5, 0) should hand back 0, and max(3, 0) is 3. So, I don't see how returning unknown makes sense. |
Thanks for fixing the offset bug! Just a few rather small comments, and I think we're good.
lib/Analysis/MemoryBuiltins.cpp | ||
---|---|---|
393 | Is there a reason this method needs to exist on the visitor? Either way, please take SizeOffsetType by const&. | |
394 | Nit: Data.second.isNegative() | |
576 | Nit: The braces serve no purpose, and I don't think they were here before. Please remove them. :) | |
582 | And here |
lib/Analysis/MemoryBuiltins.cpp | ||
---|---|---|
393 | This part of code is called few times, and maybe will be called and for phi node case too, so I think it's OK to put it in a method. |
lib/Analysis/MemoryBuiltins.cpp | ||
---|---|---|
393 | ...But
Because of point 1, it should ideally be a free function. Because of point 2, it should ideally be local to MemoryBuiltins.cpp. So I'm not seeing how it makes more sense to make this a public method. |
Sorry for the multiple comments -- one more thing:
test/CodeGen/X86/builtin-condition.ll | ||
---|---|---|
1 ↗ | (On Diff #52918) | Nit: This should probably go into test/Transforms/CodeGenPrepare, now that it's no longer generating x86. :) |
This LGTM with two more comments. Thanks so much for the patch! :)
include/llvm/Analysis/MemoryBuiltins.h | ||
---|---|---|
146 | This has no users outside of MemoryBuiltins.cpp, so we can remove this. | |
lib/Analysis/MemoryBuiltins.cpp | ||
371 | Please make this static APInt getSizeWithOverflow(const SizeOffsetType &Data) {, and move it above llvm::getObjectSize (so llvm::getObjectSize can use it :) ). |
lib/Analysis/MemoryBuiltins.cpp | ||
---|---|---|
557 | We used to handle PHIs, but then we had a compile-time regression because alias analysis got too slow. We may introduce a switch, though, to be more or less aggressive. |
Style nit: This should be CamelCase, not all caps.