This is an archive of the discontinued LLVM Phabricator instance.

LLVM CFL Alias Analysis -- Algorithm
AbandonedPublic

Authored by george.burgess.iv on Jul 16 2014, 6:38 PM.

Details

Summary

Note: WeightedBidirectedGraph and StratifiedSets are in their own patch titled "LLVM CFL Alias Analysis -- Supporting Data Structures".

This patch contains the algorithm for the CFL Alias Analysis described here: https://docs.google.com/document/d/1nGFKMmr-HbdEiag9G1GeWurgOV0CweSUjLXFr3LAwqg/edit . Currently, we don't have any extremely fancy features (no interprocedural analysis, no field sensitivity, etc.), and we do best sitting behind BasicAA + TBAA. In such a configuration, we take ~0.6-0.8% of total compile time, and give ~7-8% NoAlias responses to queries TBAA and BasicAA couldn't answer when bootstrapping LLVM. In testing this on other projects, we've seen up to 10.5% of queries dropped by BasicAA+TBAA answered with NoAlias by this algorithm. While an extensive investigation hasn't been conducted, the code being compiled ran properly + passed the unit tests we ran. So, the bugs remaining are probably super subtle. :)

WRT design quirks:

This is implemented as an ImmutablePass that holds its own state. Upon making this a FunctionPass, it gets constructed, but nothing ever calls it.

Other notes:

Chandler had the idea of making a simple FunctionPass that drops the StratifiedSets cache after inlining. Because AFAICT most of our sets are realized prior to inlining, we'll probably be able to give back better results if we do this.
I still have ~3.5 weeks of my internship, during which the code will change more, and I'll do extensive testing/profiling. Places I plan to change are noted in the code.

Diff Detail

Event Timeline

gbiv updated this revision to Diff 11550.Jul 16 2014, 6:38 PM
gbiv retitled this revision from to LLVM CFL Alias Analysis -- Algorithm.
gbiv updated this object.
gbiv edited the test plan for this revision. (Show Details)
gbiv added a project: deleted.
gbiv added a subscriber: Unknown Object (MLST).
dberlin edited edge metadata.Jul 21 2014, 12:33 PM

Some preliminary comments on this one as well

lib/Analysis/CFLAliasAnalysis.cpp
121

Unless i'm miscounting braces, doesn't this cause this function to always return MayAlias?
I assume this was just debugging code that was left in when you were running final numbers, since i've seen the actual numbers :)

142

AssignTo appears unused, and could you add a comment about what type of statement you expect each to represent.

293

Can you talk a bit about why the Insert* calls are Reference edges, instead of say, assignto?

gbiv added inline comments.Jul 24 2014, 11:06 AM
lib/Analysis/CFLAliasAnalysis.cpp
121

Yup, it's from testing timing/etc. My apologies/will be removed in the next revision.. :)

142

It's meant to be the counterpart for AssignFrom, because we're using a bidirected graph. Will add comments and will condense AssignFrom/AssignTo down to Assign if the From/To ends up giving us nothing.

293

Good point! Fixed in next rev.

gbiv updated this revision to Diff 11893.Jul 25 2014, 1:14 PM
gbiv added a reviewer: hfinkel.

+hfinkel

Addressed Danny's comments, fixed a bug where we wouldn't properly handle exceptions, (i.e. in try {} catch (Foo& f) {}, f would be ignored, when it really needs to be treated like an aliasing external.) and added an extra field to Edge so interprocedural is easier to integrate.

hfinkel edited edge metadata.Jul 26 2014, 5:56 PM

Hi again,

+ algorithm. It does not depend on types The algorithm is a mixture of the one
+
described here: www.cs.cornell.edu/~rugina/papers/popl08.pdf and here:
+// www.cse.cuhk.edu.hk/lyu/_media/paper/pldi2013.pdf -- to summarize the papers,

Again, actual references please.

+static bool getPossibleTargets(Inst *, SmallVectorImpl<Function *> *);

Why is this not SmallVectorImpl<Function *> & ?

+ There are certain instructions (i.e. FenceInst, etc.) that we ignore.
+
This detects those.
+static bool hasUsefulEdges(Instruction *inst);

This returns true when we don't ignore the instruction?

+ // TODO: Maybe merge this with AssignFrom if it carries no benefit.

Please place this comment near the relevant code.

+ // TODO: Rename this + Reference

Okay, please do.

+ %b = load %a (%b Dereference %a)
+
%b = extractelement %a, 0 (%b Dereference %a)

I suppose this second item is what motivates the renaming?

+ /// If a function's sets arecurrently being built, it is marked

are currently

+struct CFLAliasAnalysis : public ImmutablePass, public AliasAnalysis {
+private:
...
+ DenseMap<Function *, Optional<StratifiedSets<Value *>>> cache;

This is an ImmutablePass and keeps a cache of Function* -- This is dangerous because a function would be deleted and new one created at the same address. You probably want to use a CallbackVH from llvm/IR/ValueHandle.h so that you can get notified if the Fuction is removed and remove it from the cache. (The ScalarEvolution analysis does something like this).

+ // This should be covered by BasicAA, but it's super cheap to check.

As a genral note, you must not constuct an AA pass assuming that BasicAA will filter for you. Your pass must independently return correct answers for all cases.

+ Currently everything returns an Optional<bool> because it makes life easier
+
to be able to selectively disable support for instructions and troubleshoot
+ our coverage of the instructions we'll get. I plan to remove this as this AA
+
gets tested more/nearer finalization.

Alright, please add a FIXME, an assert, an llvm_unreachable, or something.

+// Nothing - Unknown instruction. Program should die.

Let's leave dying for another day.

(Actually, if you're ready to remove the Optional now, please do).

+ CFLAliasAnalysis *aa;
+ SmallVectorImpl<Edge> *output;

These can be refeences?

+ Optional<bool> visitInstruction(Instruction &) { return NoneType(); }

llvm_unreachable("CFL AA does not know about this kind of instruction");

+ Optional<bool> visitBinaryOperator(BinaryOperator &inst) {
+ auto *op1 = inst.getOperand(0);
+ auto *op2 = inst.getOperand(1);
+ output->push_back({&inst, op1, EdgeWeight::AssignFrom, false});
+ output->push_back({&inst, op2, EdgeWeight::AssignFrom, false});

Do we actually need edges for all binary operators, or only those that could be part of pointer arithmetic? Meaning, can we exclude the floating-point operators, for example?

+ for (unsigned i = 0, e = inst.getNumIncomingValues(); i < e; ++i) {
+ Value *val = inst.getIncomingValue(i);
+ output->push_back({&inst, val, EdgeWeight::AssignFrom, false});
+ }

Does it matter if you add duplicate edges here? A funny property of LLVM phis is that they can have duplicate entries for the same predecessor block so long as the values also match.

+ static bool isFunctionExternal(Function *fn) {
+ return fn->getBasicBlockList().empty();
+ }

This is not the correct way to test for this. I think you want !fn->isDeclaration() && fn->hasLocalLinkage().

+ bool onlyReadMemory =
+ std::all_of(fns.begin(), fns.end(),
+ [](const Function *fn) { return fn->onlyReadsMemory(); });

What does this check have to do with IPA? If this property is true, then it is true regardless of whether the function is local or not. This is how it is defined (it is definitional, not speculative).

+ for (Value *v : inst.arg_operands()) {
+ output->push_back({&inst, v, EdgeWeight::AssignFrom, true});
+ }

Do you need this for calls returning void?

+ Optional<bool> visitShuffleVectorInst(ShuffleVectorInst &inst) {
+ auto *from1 = inst.getOperand(0);
+ auto *from2 = inst.getOperand(1);
+ output->push_back({&inst, from1, EdgeWeight::AssignFrom, false});
+ output->push_back({&inst, from2, EdgeWeight::AssignFrom, false});
+ return true;
+ }

Many vector shuffles have an Undef second operand, I assume you don't need an edge for those. On that thought, you should probably filter out edges from undef everywhere.

+class GetTargetValueVisitor
+ : public InstVisitor<GetTargetValueVisitor, Optional<Value *>> {
+public:
+ Optional<Value *> visitInstruction(Instruction &) { return NoneType(); }
+ Optional<Value *> visitCastInst(CastInst &inst) { return &inst; }
+ Optional<Value *> visitBinaryOperator(BinaryOperator &inst) { return &inst; }
+ Optional<Value *> visitLoadInst(LoadInst &inst) { return &inst; }
+ Optional<Value *> visitSelectInst(SelectInst &inst) { return &inst; }
+ Optional<Value *> visitAllocaInst(AllocaInst &inst) { return &inst; }
+ Optional<Value *> visitPHINode(PHINode &inst) { return &inst; }
+ Optional<Value *> visitCallInst(CallInst &inst) { return &inst; }
+ Optional<Value *> visitInvokeInst(InvokeInst &inst) { return &inst; }
+ Optional<Value *> visitShuffleVectorInst(ShuffleVectorInst &inst) {
+ return &inst;
+ }
...

Just provide a default using visitInstruction only override the special cases.

+ edges. True if we could, false if we couldn't. If you hand this function an
+
instruction it didn't expect, we break. Yay. (For more information on what it
+// expects/how to add instruction support, see GetEdgesVisitor)

Make sure you handle all instructions, obviously, then remove the latter part of this comment.

+===----------------------------------------------------------------------===
+ Definitions of static functions declared at the top/middle of the file
+
===----------------------------------------------------------------------===//

This comment is unnecessary.

+// TODO: Make this look at virtual calls, etc.

Exactly what do you have in mind?

+ bool isExcitingTerminator =
+ isa<InvokeInst>(inst) || isa<LandingPadInst>(inst);

Exciting! Just fold this into the return conditional.

+static bool isAliasingExternal(Value *val) {
+ bool isGlobal = isa<GlobalValue>(val);
+ auto *arg = dyn_cast<Argument>(val);
+ bool isAliasArg = arg != nullptr && !arg->hasNoAliasAttr();
+ return isGlobal || isAliasArg;
+}

You're missing handling for byval arguments. Look at isIdentifiedObject and friends in AliasAnalysis.cpp

+ Aside: We may remove graph construction entirely, because it doesn't really
+
buy us much that we don't already have. I'd like to add interprocedural

Can you please be more specific about what it does or does not buy us?

+ if (!maybeFnA.hasValue() && !maybeFnB.hasValue()) {
+ llvm_unreachable("Can't find Funcation A or Function B");
+ }

Funcation -> Function; also be more specific, how about "Don't know how to extract the parent function from this value".

+ // Ensure we're looking at the same function for both variables.
+ assert(!maybeFnB.hasValue() || *maybeFnB == *maybeFnA);

() && "Interprocedural queries not supported"

+ if (!maybeA.hasValue()) {
+ return NoneType();
+ }
+
+ auto maybeB = sets.find(valB);
+ if (!maybeB.hasValue()) {
+ return NoneType();
+ }

I see no reason to use Optional here. Just return MayAlias.

Thanks again,
Hal

  • Original Message -----

From: "George Burgess IV" <gbiv@google.com>
To: gbiv@google.com, nlewycky@google.com, chandlerc@gmail.com, dberlin@dberlin.org, hfinkel@anl.gov
Cc: llvm-commits@cs.uiuc.edu
Sent: Friday, July 25, 2014 3:14:14 PM
Subject: Re: [PATCH] LLVM CFL Alias Analysis -- Algorithm

+hfinkel

Addressed Danny's comments, fixed a bug where we wouldn't properly
handle exceptions, (i.e. in try {} catch (Foo& f) {}, f would be
ignored, when it really needs to be treated like an aliasing
external.) and added an extra field to Edge so interprocedural is
easier to integrate.

http://reviews.llvm.org/D4551

Files:

include/llvm/Analysis/Passes.h
include/llvm/InitializePasses.h
include/llvm/LinkAllPasses.h
lib/Analysis/CFLAliasAnalysis.cpp
lib/Analysis/CMakeLists.txt
sanjoy added a subscriber: sanjoy.Jul 29 2014, 8:19 PM

Few comments inline.

lib/Analysis/CFLAliasAnalysis.cpp
49

This might be better named as parentFunctionOfValue. valueToFunction sounds a lot like it just does dyn_cast<Function>(va).

57

Can this use the CallSite class?

304

It may be possible to use CallSite here too.

340

Are you sure this is correct? In the comment on enum class EdgeWeight you have:

// The edge used when our edge goes from a value to a handle that may have
// contained it at some point. Examples:
// %b = load %a              (%a Reference %b)
// %b = extractelement %a, 0 (%a Reference %b)
Reference

but here

Optional<bool> visitExtractElementInst(ExtractElementInst &inst) {
  // inst :: %b = extractelementinst %a, 0
  auto *ptr = inst.getVectorOperand();   // ptr = %a
  auto *val = &inst;   // val = %b
  // new edge = (val -> ptr, Reference) => (%b Reference %a)
  output->push_back({val, ptr, EdgeWeight::Reference, false});
  return true;
}

I may be totally missing something though, so please feel free to assert the correctness of this function without a specific reason -- I'll rebuild my understand then. :)

gbiv updated this revision to Diff 12056.Jul 30 2014, 10:02 PM
gbiv edited edge metadata.

Thanks to everyone who's reviewed this so far. :)

Addressing comments:
hfinkel:
I'm aware that some of the code examples are sketchy, but they are "valid" C++ that turns into "valid" LLVM IR. So we technically need to support it.

Do we actually need edges for all binary operators, or only those that could be part of pointer arithmetic? Meaning, can we exclude the floating-point operators, for example?

Can floats not be a part of pointer arithmetic with -fno-strict-aliasing on?
int* a = new int;
int* b = (int*)((float)a + 1.0);

Does it matter if you add duplicate edges here? A funny property of LLVM phis is that they can have duplicate entries for the same predecessor block so long as the values also match.

Duplicate edges cause a little bit more memory and a few more CPU cycles to be taken during set construction. After the StratifiedSets are constructed, it makes no difference if we have one or ten of the same edge.

What does this check have to do with IPA? If this property is true, then it is true regardless of whether the function is local or not. This is how it is defined (it is definitional, not speculative).

Good catch -- that method was more of a stub than anything; I renamed it, but keep in mind that part of IPA is ridding ourselves of it.

Do you need this for calls returning void?

Yes. It unifies all of the arguments into one set for us, which is useful with e.g. std::swap(T&, T&);

Many vector shuffles have an Undef second operand, I assume you don't need an edge for those. On that thought, you should probably filter out edges from undef everywhere.

AFAIK, Undef isa<Constant> && !isAliasingExternal, so they're filtered out for free in buildSetsFrom. :)

Exactly what do you have in mind?

Updated TODO to be more descriptive; for indirect calls, we may be able to figure out all possible targets of the call, and just try IPA on all of them.

Exciting!

I think so too.

You're missing handling for byval arguments. Look at isIdentifiedObject and friends in AliasAnalysis.cpp

Will add in next rev

Can you please be more specific about what it does or does not buy us?

AFAICT, graph construction at the moment just replicates the use graph we already get for free from Value*, and tags each edge with an extra attribute or two. Theoretically, it shouldn't be *too* hard to remove the graph and calculate Weights/AliasingExternals on the fly, which would save us the time+memory of building the graph.

sanjoy:

Can this use the CallSite class?

I might have missed something, but I don't see what the CallSite class would really buy us here -- the goal of that method is to grab us one or more (in the case of a call through a function pointer, or similar) of the functions that can be called for a given CallInst/InvokeInst. All I can see for CallSite is the same functionality that we get from Inst->getCalledFunction().

Are you sure this is correct

Excellent catch! :) The docs on EdgeWeight were flipped -- my mistake.

I don't have any additional comments on this one, so again, you should
wait for hal (or someone else) to give you a final okay so you can be
sure everyone's issues have been addressed.

gbiv updated this revision to Diff 12242.Aug 6 2014, 11:26 AM
gbiv added a subscriber: george.burgess.iv.

+Personal account (internship ends on Friday)

Added some interprocedural analysis into the algorithm, changed from StratifiedSets<Value*> to FunctionInfo everywhere, and added support for StratifiedAttrs.

Hi George,

Will this new design supporting "strict aliasing" finally?

Kevin posted this http://comments.gmane.org/gmane.comp.compilers.llvm.devel/75298.

For that case, if we use "gcc -fno-strict-aliasing -O2", the case behaves correctly. But by default gcc is enabling -strict-aliasing.

Any thoughts?

Thanks,
-Jiangning

Hi! Thanks for your comments :)

Kevin posted this http://comments.gmane.org/gmane.comp.compilers.llvm.devel/75298.

AFAICT, that's not a valid C program. The C89 standard section 3.1.2.6 says "All declarations that refer to the same object or function shall have compatible type; otherwise the behavior is undefined." In this case, the types int and struct heap are not compatible. So, undefined behavior results. For more on type compatibility in C89, see section 3.1.2.6 of the standard. Compiling the executable with -fno-strict-aliasing -O2 on gcc 4.8.2 & clang 3.4.1, it seems that they both produce *very* similar assembly. In terms of pure instruction count, clang actually "wins" by a single instruction.

Will this new design supporting "strict aliasing" finally?

This AA algorithm entirely ignores types at the moment. That being said, if we have an API that allows us to extract C/C++ types from LLVM IR, I don't see adding type sensitivity being a necessarily difficult task. (If you're only looking for a pass that deals with types, then you may find that the TypeBasedAliasAnalysis pass interesting.)

Thanks again,
George

Hi George,

Thanks for your answer!

I'm asking this question, because the feature -strict-aliasing can improve
benchmark performance a lot. And even one of spec2000 benchmarks, I can see
~6% performance improvement.

I simplified one of the cases as below,

{code}
$ cat alias.c
typedef struct {int x; int y;} S;
S **ps;
int i;
main()
{

do {
  ps[i]->x = i;
  i++;
} while (i);

}
$ ~/llvm/build/bin/clang --target=aarch64-linux-gnuabi alias.c -S -O2
alias.c:4:1: warning: type specifier missing, defaults to 'int'
[-Wimplicit-int]
main()
^
1 warning generated.
$ cat alias.s
.text
.file "alias.c"
.globl main
.align 2
.type main,@function
main: @main
BB#0: %entry
adrp x9, ps
adrp x8, i
ldr x9, [x9, :lo12:ps]
ldr w10, [x8, :lo12:i]
.LBB0_1:
%do.body

// =>This Inner Loop Header: Depth=1

ldr x11, [x9, w10, sxtw #3]
str w10, [x11]

  • ldr w10, [x8, :lo12:i] // inside the loop*

add w10, w10, #1 // =1

  • str w10, [x8, :lo12:i] // inside the loop *

cbnz w10, .LBB0_1
BB#2: %do.end
mov w0, wzr
ret
.Ltmp1:
.size main, .Ltmp1-main

.type i,@object @i
.comm i,4,4
.type ps,@object
@ps
.comm ps,8,8

.ident "clang version 3.6.0 "
$ aarch64-linux-gnu-gcc alias.c -S -O2
$ cat alias.s
.cpu generic
.file "alias.c"
.section .text.startup,"ax",%progbits
.align 2
.global main
.type main, %function
main:
adrp x4, i
adrp x1, ps

  • ldr w0, [x4,#:lo12:i] // hoisted out loop*

ldr x1, [x1,#:lo12:ps]
add x1, x1, x0, sxtw 3
b .L3
.L5:
mov w0, w2
.L3:
ldr x3, [x1],8
adds w2, w0, 1
str w0, [x3]
bne .L5
*str w2, [x4,#:lo12:i] // sink out of loop*
ret
.size main, .-main
.comm i,4,4
.comm ps,8,8
.ident "GCC: (Ubuntu/Linaro 4.8.1-10ubuntu7) 4.8.1"
{code}

So for this case, gcc can hoist/sink load/store out of loop, but llvm can't.

Is the goal of your new design to replace the current "BasicAA+TBAA"?

Sorry for my off topic discussion, but I want to understand what is going
to happen around this. Our stakeholders really want performance improvement.

Based on your last reply, my understanding is we should improve TBAA, but
will it still work together with your new design? Should we pay effort on
current TBAA pass to solve the problem?

Thanks,
-Jiangning

2014-08-08 2:40 GMT+08:00 George Burgess IV <gbiv@google.com>:

Hi! Thanks for your comments :)

Kevin posted this

http://comments.gmane.org/gmane.comp.compilers.llvm.devel/75298.
AFAICT, that's not a valid C program. The C89 standard section 3.1.2.6
says "All declarations that refer to the same object or function shall have
compatible type; otherwise the behavior is undefined." In this case, the
types int and struct heap are not compatible. So, undefined behavior
results. For more on type compatibility in C89, see section 3.1.2.6 of the
standard. Compiling the executable with -fno-strict-aliasing -O2 on gcc
4.8.2 & clang 3.4.1, it seems that they both produce *very* similar
assembly. In terms of pure instruction count, clang actually "wins" by a
single instruction.

Will this new design supporting "strict aliasing" finally?

This AA algorithm entirely ignores types at the moment. That being said,
if we have an API that allows us to extract C/C++ types from LLVM IR, I
don't see adding type sensitivity being a necessarily difficult task. (If
you're only looking for a pass that deals with types, then you may find
that the TypeBasedAliasAnalysis pass interesting.)

Thanks again,
George

http://reviews.llvm.org/D4551

Just wanted to ping this since George is on his way back to school.
I'd appreciate it if folks who made comments would confirm/deny that they
are satisfied those are addressed and are okay with this being committed.

nlewycky edited edge metadata.Aug 20 2014, 6:08 PM

I think this is worth landing and any bug fixes can be incremental development?

lib/Analysis/CFLAliasAnalysis.cpp
171

handles -> Handles, I think>

242–243

"disabled while trying to find a bug"? An example that explains when this comes up in production use would be better. Unless it really is a debugging feature?

281

"I < E" is usually "I != E" in llvm, though that causes no particular benefit here.

287–290

What about the index arguments? Suppose you have

inty = ptrtoint ptry
GEP (ptrx-ptry), inty

?

292–297

Only the true and false values but not the condition, eh. You wanna hear something horrible?

void foo(char *p) {
  uintptr_t x = 0;
  uintptr_t y = (uintptr_t)p;
  for (int i = 0; i < 8*sizeof(char*); ++i, x <<= 1) {
    int bit = (y>>(8*sizeof(char*)-1)) & 1;
    x += bit ? 1 : 0;  // SelectInst here.
  }
  char *q = (char*)x;
  // don't answer noalias(p, q).
}
313–315

Eh? I would've expected it's external if declaration || !hasLocalLinkage. Is this function the negative of its name?

387

Does RetVals change size inside this loop? If not, the "for (unsigned X = 0, XE = RetVals.size(); X != XE; ++X) {" formulation makes that clear.

(Oh and again at line 381 for Parameters, 415 for Arguments.)

407–408

You could also check Fn->isVarArgs() way up in the early exit detection code instead? If it's not varargs then #params = #args? Unless the call is indirect (ie., through a bitcast).

415

Typo, "extermely" -> "extremely".

739–740

auto Node = Worklist.pop_back_val()?

794

"Lengthy"?

I agree with Nick; once current comments are addressed, I think this can go in and we'll work on it from there.

lib/Analysis/CFLAliasAnalysis.cpp
12

Missing '.' after types.

64

I think this is fine, but FYI, there is another way of doing it. Instead of taking an Inst* you could take a const CallSite & (which can automatically be constructed by either). If someone else has a preference here, feel free to say so.

96

I don't like the term "weight" here -- the different weights here are really unordered edge types. Can we call this EdgeType instead?

221

I think this deserves a comment. This will catch all GlobalVariable comparisons. Maybe something like:
// Comparisons between global variables and other constants should be handled by BasicAA.

416

It feels like there should be an early-backedge in this loop for parameters that are noalias/byval somewhere.

679

Remove "all"

794

Also, what is lengthy about them? This is really the query interface.

799

It is not obvious what this assert means. Please make this:

assert(DummyPair.second && "Here is what this means");
george.burgess.iv commandeered this revision.Feb 3 2016, 11:50 AM
george.burgess.iv added a reviewer: gbiv.

Commandeering to abandon; this went in a long time ago as part of D5106.

george.burgess.iv abandoned this revision.Feb 3 2016, 11:50 AM