Cfl-aa used to bail out and make the most conservative assumption on any external library functions (any input or output parameter may alias just about anything). This patch tries to make it less conservative when seeing an external function that allocate/deallocate heap memory, e.g. malloc(), calloc(), and free().
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
Tests would be appreciated here, too, please. :)
lib/Analysis/CFLAliasAnalysis.cpp | ||
---|---|---|
387 ↗ | (On Diff #58906) | This looks a bit sketchy, because isAllocLikeFn returns true for strdup/strndup, both of which copy arbitrary bits from a source to a destination. It also looks like BasicAA (line 766) opts to be conservative with strdup/strndup. That said, given how we currently handle external function calls + conversions of pointers to ints (and back), I'm not sure if allowing strdup here can break anything that has a reasonable chance of success in the first place. ...So, I'm really tempted to say "please do the same thing as BasicAA here and add a TODO that we'll address when external values/other CFLAA features are in, so this doesn't come back to subtly bite us in the future." |
Unless i'm missing something, this is not the correct handling. Happy to be
convinced.
The normal behavior (and that specified in the papers) is to treat malloc
as if it output the address of a unique variable.
This is because while it introduces no aliases itself, it does introduce a
variable that can be aliased.
If that variable is reached through other mechanisms, it may be an alias :)
If you say they do *nothing*, as you've done here, nothing will ever alias
the result of malloc, even something that *should*.
Am i missing something?
I agree.
If you say they do *nothing*, as you've done here, nothing will ever alias
the result of malloc, even something that *should*.Am i missing something?
"Nothing" is currently what happens to AllocaInsts. I just tried to be consistent with how AllocaInst is treated.
Also, "nothing" only applies to the malloc() inst/AllocaInst itself. If any other instructions use it, the allocated value should get linked to other part of the function.
Replaced isAllocLikeFn() with isMallocLikeFn() || isCallocLikeFn(), just to be consistent with BasicAA.
Also, added a test case.
Danny - ...Good point. I forgot that we were super conservative when something isn't in the sets.
Jia - Can we please change this to just add an edge from the malloc/calloc result (and free argument) to itself with zero stratified attributes, so that we have the relevant pointer in our sets? :)
We need to do this because, if the result of the malloc is otherwise unused, we'll not see it in the stratified sets. If something's not in the stratified sets, we'll assume it was added after we ran on the function, so we're conservative and always just say MayAlias. If we're not doing the same thing with otherwise-unused allocas, we probably should be.
Added additional edges in case the return value of malloc() is not used.
I'm not sure about free(), though. How could an actual argument becomes "unused"? Isn't the argument itself a use?
You are right. I think we are doing this for unused allocas, outside GetEdgeVisitor.
Any reason for (1) not adding unused value to StratifiedSets, and (2) treating values not in StratifiedSets conservatively?
We treat values that don't exist in StratifiedSets conservatively because the IR can be modified after we run CFLAA on it. Imagine that we started with:
define void @foo() { %1 = alloca [2 x i32], align 8 ; some code here ret void }
CFLAA ran, etc. and a later pass introduced a GEP of %1, like so:
define void @foo() { %1 = alloca [2 x i32], align 8 %2 = GEP %1, i32 0, i32 0 ; some code here ret void }
...And a later pass queries CFLAA for whether %1 and %2 alias for whatever reason. Because %2 wasn't there at build-time, we can't offer an accurate answer without doing extra work, so we always answer conservatively.
As for why we don't add unused pointers, probably bugs. If you can find a case where we're not adding pointers, please say so. :)
lib/Analysis/CFLAliasAnalysis.cpp | ||
---|---|---|
392 ↗ | (On Diff #58986) | I think we need to do the same edge hack with free calls, as well :) Consider: @g = ; some global ptr define void @foo() { %a = alloca i32 ; code that only uses %a call void @free(i8* @g) } If we don't, @g won't be in the sets, so we'll consider it to alias anything, which is too conservative. |
Hmm, this is strange. To me the one and only right thing to do after IR changes is to invalidate the cfl-aa pass and re-run the analysis. In general it is not safe to persist alias analysis result across IR modifications. In your example, what if, instead of someone adding a new value to the IR, a new store instruction is added and the points-to set of some existing value gets altered? The analysis result can't be used anyway.
Updated test case.
The previous test case makes alias queries on poisoned pointers and alloca-ed pointers. We'll always get back NoAlias result for those queries even before applying the patch. The updated version tests for the right thing.
LGTM'ed this over email. Doing so here, too, to make life easier for code historians.