This is an archive of the discontinued LLVM Phabricator instance.

[BasicAA] Add add-hoc rule to distinguish different fields in arrays of structs.
ClosedPublic

Authored by ab on Feb 5 2015, 5:46 PM.

Details

Summary

As discussed on the ML, here is a draft patch that implements some BasicAA improvements to struct GEPs.
It's very naive, but does the trick. I don't think it can reuse much of the GEP decomposition machinery, so it just checks indices manually.

How does it sound?

Hidden behind a flag to ease pre-commit testing; on the few added testcases, we go from:

5 no alias responses (4.9%)

to

25 no alias responses (24.7%)

Also, I obviously added Chandler's counterexample, thanks again ;)

Diff Detail

Repository
rL LLVM

Event Timeline

ab updated this revision to Diff 19449.Feb 5 2015, 5:46 PM
ab retitled this revision from to [BasicAA] Add add-hoc rule to distinguish different fields in arrays of structs..
ab updated this object.
ab edited the test plan for this revision. (Show Details)
ab added subscribers: chandlerc, hfinkel.
chandlerc added inline comments.Feb 6 2015, 11:40 AM
lib/Analysis/BasicAliasAnalysis.cpp
45–46 ↗(On Diff #19449)

Why do we need a flag here? If we discover bugs, we can just revert to green.

1021 ↗(On Diff #19449)

Please extract this into a helper function and use early return to simplify all of the flag usage.

1033–1034 ↗(On Diff #19449)

I would add to this comment an explanation of why we don't need to consider the array indices. I had to think quite some time to fully convince myself this is correct. Something along the lines of:

Because both GEPs begin indexing from the exact same pointer, and only index through arrays prior to the constant index into a struct, the struct that GEP1 indexes into and the struct that GEP2 indexes into must either precisely overlap or be completely disjoint. Because they cannot partially overlap, indexing into different non-overlapping fields of the struct will never alias.

1037–1043 ↗(On Diff #19449)

Sink all of this into the lambda and capture LastIndexedType instead? Then you can just pass C1 and C2 to the lambda.

1055 ↗(On Diff #19449)

No need for else after return.

ab added inline comments.Feb 6 2015, 5:08 PM
lib/Analysis/BasicAliasAnalysis.cpp
45–46 ↗(On Diff #19449)

Right, we don't, it was only in case you had some nice corner cases you wanted to test.

1033–1034 ↗(On Diff #19449)

Added the explanation verbatim, thanks!

1037–1043 ↗(On Diff #19449)

Eh, you'd still need to get the offsets; you might as well avoid doing it twice. I removed the *Idx variables though, makes it more readable.

ab updated this revision to Diff 19520.Feb 6 2015, 5:11 PM
ab added a subscriber: Unknown Object (MLST).

Various cleanups suggested by Chandler.

chandlerc added inline comments.Feb 6 2015, 5:55 PM
lib/Analysis/BasicAliasAnalysis.cpp
1016–1020 ↗(On Diff #19520)

I still think this would all be much more readable in a separate static helper function rather than being inlined (even using lambdas heavily like this to get early return).

This lambda wouldn't be needed, the code would be much less indented, and it would be a good place to hang the high-level comments about this approach, and a good place to add other structural analysis of GEPs in the future.

ab updated this revision to Diff 19527.Feb 6 2015, 7:00 PM

Extract analysis into separate function.

ab added a comment.Feb 6 2015, 7:02 PM

Thanks for the review!

lib/Analysis/BasicAliasAnalysis.cpp
1016–1020 ↗(On Diff #19520)

Fair enough; did that, and made it return AA::AliasResult. The name is overly specific, but for now that's fine.

chandlerc accepted this revision.Feb 6 2015, 8:04 PM
chandlerc added a reviewer: chandlerc.

This is looking really nice now. A minor comment below to try to lay some groundwork for any future enhancements here. Feel free to submit with that addressed.

lib/Analysis/BasicAliasAnalysis.cpp
893–898 ↗(On Diff #19527)

I would probably comment this and name it as something more general so someone else would think to add more logic here rather than in the function below.

Specifically, I think the right interesting criteria is when we have two GEPs off of the exact same pointer.

If we come up with any other ways to analyze two GEPs themselves when starting from the same pointer and concluding no-alias, it would make sense to me to add that logic here.

(Naturally, I would hoist the pointer operand comparison up to the caller, and just leave the check on the number if indices here.

This revision is now accepted and ready to land.Feb 6 2015, 8:04 PM
This revision was automatically updated to reflect the committed changes.