This is an archive of the discontinued LLVM Phabricator instance.

[LangRef] Update the TBAA section
ClosedPublic

Authored by sanjoy on Nov 17 2016, 7:38 PM.

Details

Summary

Update the TBAA section to mention the struct path TBAA that LLVM
implements today. This is not a proposal or change in semantics -- it
is intended only to document what LLVM already does today.

This is related to https://reviews.llvm.org/D26438 where I've tried to
implement some of the constraints as verifier checks.

Diff Detail

Repository
rL LLVM

Event Timeline

sanjoy updated this revision to Diff 78460.Nov 17 2016, 7:38 PM
sanjoy retitled this revision from to [LangRef] Update the TBAA section.
sanjoy updated this object.
sanjoy added a subscriber: llvm-commits.
dberris added inline comments.Nov 18 2016, 3:26 AM
docs/LangRef.rst
4600–4601 ↗(On Diff #78460)

nit: You probably want to hide this as a comment? Not sure how/whether "TODO:" gets rendered differently from normal text in the final HTML.

anna added inline comments.Nov 18 2016, 8:25 AM
docs/LangRef.rst
4438 ↗(On Diff #78460)

I'm confused here. By definition, scalar type descriptors don't contain other types. So, how can it have another scalar type descriptor as a parent?

4452 ↗(On Diff #78460)

Nit: Extra word 'use'.
use type descriptors to describe the *location*

4494–4496 ↗(On Diff #78460)

Could you please explain why this is needed?

4537 ↗(On Diff #78460)

Is Root an identifier in TBAA?

4546–4547 ↗(On Diff #78460)

The mandatory (AccessTy, AccessTy, 0) for the result of GetAliasingLocations(tag3)refers to (CharScalarTy, IntScalarTy, 0) right?

Can we have more than 2 AccessTy?

4547 ↗(On Diff #78460)

Nit: extra comma? 'This means that other than `tag3` itself,'

4555–4556 ↗(On Diff #78460)

Do we need this here? The representation of access tags explains it in detail.

4575–4576 ↗(On Diff #78460)

Could we show the MDString argument for the struct type descriptors in the type system descriptor graph in the above example:

InnerStructTy = {"InnerStructType", (IntScalarTy, 0), (FloatScalarTy, 4)}
4577 ↗(On Diff #78460)

'After this name, the struct type...'

sanjoy added inline comments.Nov 18 2016, 2:42 PM
docs/LangRef.rst
4600–4601 ↗(On Diff #78460)

I'll ditch this before checkin -- this was more of a reminder to myself than anything else.

sanjoy updated this revision to Diff 81702.Dec 15 2016, 6:24 PM
sanjoy marked 6 inline comments as done.
  • review
sanjoy added inline comments.Dec 15 2016, 6:28 PM
docs/LangRef.rst
4494–4496 ↗(On Diff #78460)

I don't want to get into the details in the LangRef (the section is already quite large now), but this requirement stems from MDNode::getMostGenericTBAA. It takes two struct tags as arguments (StructA, ScalarA, OffsetA) and (StructB, ScalarB, OffsetB) and first "coarsens" them to (ScalarA, ScalarA, OffsetA) and (ScalarB, ScalarB, OffsetB). It then tries to find a common alias tag for (ScalarA, ScalarA, OffsetA) and (ScalarB, ScalarB, OffsetB). The first step is wrong if (ScalarA, ScalarA, OffsetA) does not alias (StructA, ScalarA, OffsetA) (same for the other one) since then that step won't be a "coarsening" any more, and we'd go from aliasing some set of locations to aliasing a different set of locations, and the final result will be wrong.

As an example, say we have:

struct A {
  float f;
  int i;
}

and I want to compute the most generic metadata for the two accesses a.f = .. and a.i = ...

In the correct case, a.f will be tagged with (A, float, 0) and a.i will be tagged with (A, int, 4). getMostGenericTBAA will coarsen these to (float, float, 0) (i.e. any float access) and (int, int 0) (i.e. any int access) and will eventually resolve them to a common (char, char, 0) tag, which in C means it basically aliases with everything since char can be used to access locations of any type. In particular, it aliases with both the tags for a.i and a.f.

Now say we make a mistake, and tag a.i with (A, float, 4) (and a.f is the same). getMostGenericTBAA will resolve these to a common (float, float, 0) tag. However, (float, float, 0) does not alias with the incorrect tag for a.i (by the rules stated in GetAliasingLocations), which is wrong -- we wanted a tag that aliases both a.f and a.i.

4546–4547 ↗(On Diff #78460)

These are tuples, and I'm showing the output of GetAliasingLocations

manmanren edited edge metadata.Dec 20 2016, 11:27 AM

Thanks for working on this, Sanjoy!

Manman

docs/LangRef.rst
4440 ↗(On Diff #81702)

I think we should not call scalar type descriptors as a subset of struct type descriptors. The 2nd field for scalar type descriptors means the parent node. Internally we have an offset field to make alias analysis easy.

4453 ↗(On Diff #81702)

Do we actually calculate this set in the source code? It is probably easier to understand if you provide a high-level description instead of the pseudo code. There are descriptions in the comments of TypeBasedAliasAnalysis.cpp.

4561 ↗(On Diff #81702)

After this the name --> After the name

4563 ↗(On Diff #81702)

The 2N th operand

4565 ↗(On Diff #81702)

constant field --> contained field

4577 ↗(On Diff #81702)

I actually prefer the descriptions for scalar type descriptors before this patch.
The first field is an identity field. It is a metadata string, which uniquely identifies the type. The second field identifies the type's parent node in the tree. The third field is optional...

Can we talk about scalar type descriptors first, so when talking about struct type descriptors, we can refer to scalar type descriptors?
Such as here ------
The first operand is an `MDString` which, like for scalar type descriptors

sanjoy updated this revision to Diff 84479.Jan 14 2017, 9:30 PM
sanjoy edited edge metadata.
  • address review
  • reduce the level of detail to make things less cluttered

Hey Sanjoy, I want to accept this, but i can't tell if you've addressed all the comments because they aren't marked done.
In particular, it looks like manman was not happy with some of the wording.
It seems like you reverted that part, but not positive.

sanjoy marked 13 inline comments as done.Feb 13 2017, 9:35 AM

Marking some inline comments as done.

docs/LangRef.rst
4577 ↗(On Diff #78460)

This bit of text is no longer there.

manmanren accepted this revision.Feb 13 2017, 1:26 PM

LGTM. Sorry for the delay,

I wonder where the previous versions of the patch are. The history only lists a single revision.

Manman

This revision is now accepted and ready to land.Feb 13 2017, 1:26 PM
sanjoy marked an inline comment as done.Feb 13 2017, 2:47 PM

LGTM. Sorry for the delay,

I wonder where the previous versions of the patch are. The history only lists a single revision.

I am able to see the previous versions:

This revision was automatically updated to reflect the committed changes.