Page MenuHomePhabricator

[CodeGen] Add initial support for union members in TBAA

Authored by kosarev on Oct 31 2017, 8:15 AM.



The basic idea behind this patch is that since in strict aliasing mode all accesses to union members require their outermost enclosing union objects to be specified explicitly, then for a couple given accesses to union members of the form


it is known they can only alias if both p and q point to the same union type and offset ranges of members a.b.c... and x.y.z.. overlap. Note that the actual types of the members do not matter.

Specifically, in this patch we do the following:

  • Make unions to be valid TBAA base access types. This enables generation of TBAA type descriptors for unions.
  • Encode union types as structures with a single member of a special "union member" type. Currently we do not encode information about sizes of types, but conceptually such union members are considered to be of the size of the whole union.
  • Encode accesses to direct and indirect union members, including member arrays, as accesses to these special members. All accesses to members of a union thus get the same offset, which is the offset of the union they are part of. This means the existing LLVM TBAA machinery is able to handle such accesses with no changes.

While this is already an improvement comparing to the current situation, that is, representing all union accesses as may-alias ones, there are further changes planned to complete the support for unions. One of them is storing information about access sizes so we can distinct accesses to non-overlapping union members, including accesses to different elements of member arrays. Another change is encoding type sizes in order to make it possible to compute offsets within constant-indexed array elements. These enhancements will be addressed with separate patches.

Diff Detail


Event Timeline

kosarev created this revision.Oct 31 2017, 8:15 AM

Colleagues, please let me know if there are any concerns about this patch. It was specifically designed to not cause any substantial changes, but I may be missing something important. Thanks.

rjmccall edited edge metadata.Nov 12 2017, 2:26 PM

I think the implementation looks fine. The metadata design also sounds good to me, but I'd like Hal to weigh in before you commit.

Thanks John.

Hal, do you see any problems with this patch?

hfinkel accepted this revision.Nov 28 2017, 9:36 PM

I think this looks fine (and I agree that the underlying premise makes sense). Before you commit, could you also please add some tests with nested union members?

This revision is now accepted and ready to land.Nov 28 2017, 9:36 PM
kosarev updated this revision to Diff 124726.Nov 29 2017, 5:44 AM
  • Added tests with nested unions.
  • Rebased.

Thanks Hal. Will commit it tomorrow.

This revision was automatically updated to reflect the committed changes.
aheejin added a subscriber: aheejin.Dec 1 2017, 2:41 PM

After this patch, a couple of gcc torture tests started to fail. Could you possibly look at this?

Once you confirm the bug, could you possibly revert the patch?

Once you confirm the bug, could you possibly revert the patch?

I agree. We should revert this. The relevant part of the test case is:

short *q;
p->u.vec[i] = 0;
q = &p->u.vec[16];
*q = 1;
return p->u.vec[i];

demonstrating that you can't use "union member" as the access type here. You need to use the actual access type, or something derived from it, because non-type-changing scalar accesses are still allowed to alias with the union-member accesses.

Thank you for confirming and reverting!

Herald added a project: Restricted Project. · View Herald TranscriptJun 12 2019, 8:31 AM