Patch for Bug 27478. Make BasicAliasAnalysis claims NoAlias if two GEPs index different fields of the same structure.
Details
Diff Detail
Event Timeline
lib/Analysis/BasicAliasAnalysis.cpp | ||
---|---|---|
841 | This loop never verifies they are the same struct, just A struct. In general, your entire calculation is essentially trying to figure out (offset, size) of the entire set of accesses. It happens that, for the same struct, you can guarantee that if they go to different (offset, size) they can't alias (because you can't really overlay the structs in a different way). But this is not, in general, true because you don't know where they are really starting from unless you see the allocation site (IE this could be a gep of a gep of a gep of a ...) | |
867 | This is not right. Just because what is being loaded is a struct type, means nothing. (Because LLVM is not C, things that are not legal in C *are* legal in LLVM). You need to verify that what they are *accessing through* is the same struct type, as the top of your loop says. Not that the loaded element is a struct type. | |
868 | This loop is really hard to follow the logic of because of where it breaks and continues. |
lib/Analysis/BasicAliasAnalysis.cpp | ||
---|---|---|
841 | @dberlin Thank you for your comment. This loop is invoked only when the pointer operand of GEP1 and GEP2 are same (there is an assertion above). Doesn't this guarantee that they are starting from the same point regardless of their allocation site? The logic is that if two GEPs sharing the same base pointer access through the same index until index 'i-1' but have a different value at index 'i', and if the index typed at index 'i-1' is a struct, then at index 'i' they access two different fields of the same struct (I assumed 'same struct' because two GEPs share the same pointer operand and same index until 'i-1'). |
Doesn't this guarantee that they are starting from the same point
regardless of their allocation site?
Yes, I missed that in the context :)
The logic is that if two GEPs sharing the same base pointer access through
the same index until index 'i-1' but have a different value at index 'i',
and if the index typed at index 'i-1' is a struct, then at index 'i' they
access two different fields of the same struct (I assumed 'same struct'
because two GEPs share the same pointer operand and same index until 'i-1').
Okay. Because of the way the loop is structured, this isn't really clear
that this is what is happening, despite the comment
:)
@dberlin, thank you for accepting the diff. It would be great if someone can commit this diff for me as I don't have the privilege yet.
If you can please remove the whitespace changes, and update the diff, i'll
commit it.
I get apply failures due to them.
@@ -904,7 +938,7 @@
// Because they cannot partially overlap and because fields in an array // cannot overlap, if we can prove the final indices are different between // GEP1 and GEP2, we can conclude GEP1 and GEP2 don't alias.
+
// If the last indices are constants, we've already checked they don't // equal each other so we can exit early. if (C1 && C2)
@@ -977,7 +1011,7 @@
the highest %f1 can be is (%alloca + 3). This means %random can
not be higher
than (%alloca - 1), and so is not inbounds, a contradiction.
bool BasicAAResult::isGEPBaseAtNegativeOffset(const GEPOperator *GEPOp,
- const DecomposedGEP &DecompGEP, const DecomposedGEP &DecompAlloca,
+ const DecomposedGEP &DecompGEP, const DecomposedGEP &DecompAlloca,
uint64_t AllocaAccessSize) { // If the alloca access size is unknown, or the GEP isn't inbounds, bail. if (AllocaAccessSize == MemoryLocation::UnknownSize || !GEPOp->isInBounds())
It
Rebased. @dberlin, I'm sorry. I should have rebased to the most recent version. It seems that the change in the variable name caused the conflict. I removed the whitespace changes from the diff as well.
Hi Taewook,
I have reverted this, as it seems to cause failures in selfhosting and sanitizer builds.
See
http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules/builds/16280
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-autoconf/builds/20377
etc
I don't know if it is due to undenied behavior in the original code, or your patch missing semantics, but you should investigate.
Happy to recommit it once we've got it working.
llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp | ||
---|---|---|
842 ↗ | (On Diff #59256) | You aren't allowed to make aliasing assumptions based on the type of a GEP; it's just pointer math. "inbounds" just means that the result has to be in bounds relative to the allocation as a whole. See http://llvm.org/docs/LangRef.html#getelementptr-instruction . Even if a frontend generates "nice" code, LLVM itself performs transformations which will break any assumptions based on the type of a GEP (for example, LLVM will transform a bitcast into a GEP). |
llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp | ||
---|---|---|
842 ↗ | (On Diff #59256) | @eli.friedman Thanks for the comment. I'm afraid I didn't understand the point of your comments. This patch is for the case where two GEPs have the same pointer operand and access through the same indices until a certain point. For such case, not only types, but also indexed values should be identical at that point. If so, when the next index values are different, it means that two GEPs will point different fields of the same struct. It would be great if you could suggest an example that this patch generates wrong result. |
His point is that types in GEP's are just suggestions (see the GEP faq).
The type itself is meaningless. You can take the result, bitcast it to a
int*, whatever.
You can also take an int *, bitcast it to a struct, gep it, and use the
result.
Example:
struct A { unsigned char a[10]; unsigned char b; }; void f(A* x) { for (unsigned i = 0; i < 11; ++i) x->b += ((unsigned char*)x)[i]; }
Errr, this is not two GEPS of a struct type base pointer with the same
type, with a minimum of two operands, etc.
It's
%6 = getelementptr inbounds i8, i8* %5, i64 %4 %9 = getelementptr inbounds %struct.A, %struct.A* %x, i32 0, i32 1
This code will do nothing with this.
Want to try again?
:)
and the obvious modification still gives right answers (ie using x->a):
%5 = getelementptr inbounds %struct.A, %struct.A* %x, i32 0, i32 0 %6 = getelementptr inbounds [10 x i8], [10 x i8]* %5, i32 0, i32 0 %7 = getelementptr inbounds i8, i8* %6, i64 %4 %8 = load i8, i8* %7, align 1 %9 = zext i8 %8 to i32 %10 = getelementptr inbounds %struct.A, %struct.A* %x, i32 0, i32 1
%5 and %10 do *not* alias.
The code says nothing about %10 and %6 or %7
My example run through "clang -S -emit-llvm -O2 -fno-unroll-loops":
define void @_Z1fP1A(%struct.A* nocapture %x) #0 { entry: %b = getelementptr inbounds %struct.A, %struct.A* %x, i64 0, i32 1 %.pre = load i8, i8* %b, align 1, !tbaa !1 br label %for.body for.cond.cleanup: ; preds = %for.body ret void for.body: ; preds = %for.body, %entry %0 = phi i8 [ %.pre, %entry ], [ %add, %for.body ] %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ] %arrayidx = getelementptr inbounds %struct.A, %struct.A* %x, i64 0, i32 0, i64 %indvars.iv %1 = load i8, i8* %arrayidx, align 1, !tbaa !5 %add = add i8 %0, %1 store i8 %add, i8* %b, align 1, !tbaa !1 %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1 %exitcond = icmp eq i64 %indvars.iv.next, 11 br i1 %exitcond, label %for.cond.cleanup, label %for.body }
Note in particular %b = getelementptr inbounds %struct.A, %struct.A* %x, i64 0, i32 1 vs. %arrayidx = getelementptr inbounds %struct.A, %struct.A* %x, i64 0, i32 0, i64 %indvars.iv.
llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp | ||
---|---|---|
842 ↗ | (On Diff #59256) | Yes I think Eli is right here, if you have struct { int arr_a[2]; int arr_b[2]; }; then in LLVM IR the -1 th element of arr_b is the 1 st element of arr_a. Generally I don't think you can do much better than using GEP::accumulateConstantOffset if there is a trailing non-constant index. But maybe in cases like GEP(PTR, 1, unknown, 4) you can play some tricks with divisibility? |
You generally aren't going to be able to do anything without frontend info.
The one trick you can play is actually to use pointer alignment (if it's
there).
@eli.friedman Thank you for bring up this example :) Actually your example is very close with the code snippet that motivates me to work on this patch. The only difference is that in my code index variable i iterates up to 10, not 11.
One thing I'd like to make sure is that the unit of "allocated object" in http://llvm.org/docs/LangRef.html#getelementptr-instruction is the entire struct, not an each field of the struct. I realized that if the value of %indvars.iv is 10 then %arrayidx and %b have the same valid value, but I assumed that it is still fine to aggressively claim NoAlias for such case because accessing %arrayidx will result poison value. However, if the entire struct is the unit of allocated object, then this patch is invalid.
@sanjoy Yes I missed the case of the index value being negative as well (Ahmed pointed out the same point as well in the mailing list). Although it may not be hard to fix the patch to check the sign of indexes, I think it is not worth because this patch has other problems (what Eli pointed) as well.
@dberlin Thank you for your time in reviewing this patch! I'll try to figure out if it is possible to exploit clang annotated metadata to enable this optimization, because this is why gcc outperforms llvm for some of our internal benchmarks. It would be great if you can give me suggestions as well.
You definitely should produce metadata to do this optimization .
(the other alternative, adding some attribute to GEP, tends to be vastly
non-preferred)
Clang already has this: -fstrict-aliasing, which enables struct-path
aware TBAA, via !tbaa metadata.
That TBAA isn't sufficient is intentional though: you're probably
hitting the return PartialAlias at the end of BasicAAResult::aliasGEP
(PR9971).
-Ahmed
"allocated object" in this context means the entire object returned by something like malloc() or the alloca instruction.
If you have an idea for how to improve the wording, please propose a patch for LangRef (llvm/docs/LangRef.rst).
So I'm thinking of adding metadata that indicates the original type of the pointer operand to GEP instructions. With metadata, an analysis can tell if GEP's pointer operand is transformed from bitcast operation or not. What do you think?
So, i'm honestly a bit confused.
Do you mean "what the original language pointer type" was?
Because LLVM does not know enough to know what the language rules are, and
so you'd generally want something lower level, like we have now with !tbaa,
etc.
Do you mean something else?
If your goal is to distinguish whether the original accesses were to
different field offsets, and the language says that means they cna't alias,
you should produce that as metadata.
If your goal is something else, we should start with the goal and try to
design something for it, and hopefully nearby use cases (if any) :)
@dberlin I meant the same level of type information as tbaa for GEP's pointer operand.
For eli's example above, if the original access were to the array field of the struct, clang generates GEP like
%20 = load %struct.A*, %struct.A** %3, align 8 %21 = getelementptr inbounds %struct.A, %struct.A* %20, i32 0, i32 0 %22 = getelementptr inbounds [10 x i8], [10 x i8]* %21, i64 0, i64 %19
But if the access were to char* casted pointer, then it is like
%20 = load %struct.A*, %struct.A** %3, align 8 %21 = bitcast %struct.A* %20 to i8* %22 = getelementptr inbounds i8, i8* %21, i64 %19
and these instructions are eventually combined to
%10 = getelementptr inbounds %struct.A, %struct.A* %0, i64 0, i32 0, i64 %9
So by knowing the original pointer operand type of GEP, it is possible to tell if the original accesses were to the field of the struct itself or to the casted pointer, assuming that clang generates bitscast operation for every type casting.
@dberlin Thank you for your comments. My original plan was making each GEP to understand its source-level type with metadata, but now I also agree on you that the approach will not add much value. Instead, I'm working on adding metadata when clang observes struct field access in the source. The metadata will tell the name of the struct and the index of the accessed field. Does that sound more reasonable?
Yes, but i would go with "offset and size". name and index does not tell
you whether they overlap.
Think of unions :)
@dberlin I was planning to add metadata only for structures (not unions), but even in that case "offset and size" seems like a better idea. Thanks!
Just FYI;
If you want to get this metadata into trunk, you are going to have to have
a real design RFC for it, and, as part of that, explain why you can't or
shouldn't generalize something existing (like the struct path metadata
etc), to this case.
Not suggesting you can't make that argument, of course, just making sure
you realize you'll have to actually be able to defend new metadata, we
don't just add everything :)
@dberlin Thank you for letting me know :) Actually one of my concern is that the scope of the problem that can be solved by adding this metadata might be too limited. Before writing RFC, I'll think a bit more about if there is a way to improve the design of this metadata so it can be useful to other alias analysis problems as well.
Please add a period.