This is an archive of the discontinued LLVM Phabricator instance.

[clang][CodeGen] Use base subobject type layout for potentially-overlapping fields
ClosedPublic

Authored by dzhidzhoev on Dec 9 2022, 1:11 PM.

Details

Summary

RecordLayoutBuilder assumes the size of a potentially-overlapping field
with non-zero size as the size of the base subobject type corresponding
to the field type.
Make CGRecordLayoutBuilder to acknowledge that in order to avoid incorrect
padding insertion.

Without this patch, test fails on assertion

Assertion failed: (Offset >= Size), function insertPadding, file CGRecordLayoutBuilder.cpp, line 802.

Diff Detail

Event Timeline

dzhidzhoev created this revision.Dec 9 2022, 1:11 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 9 2022, 1:11 PM
dzhidzhoev requested review of this revision.Dec 9 2022, 1:11 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 9 2022, 1:11 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
dzhidzhoev edited the summary of this revision. (Show Details)Dec 9 2022, 1:11 PM
efriedma added inline comments.Dec 13 2022, 1:25 PM
clang/include/clang/AST/Decl.h
3076

Maybe clarify that this also checks the object is of class type? That's technically not part of the C++ standard definition of a "potentially-overlapping subobject". (I'm not sure what rule, if any, prevents non-class objects from overlapping.)

clang/lib/CodeGen/CGRecordLayoutBuilder.cpp
387–390

Extra parentheses.

clang/test/CodeGenCXX/no-unique-address-3.cpp
61

It would be nice to also explicitly check the LLVM IR types, instead of just checking that emitting LLVM IR doesn't crash.

(We also try to avoid using -emit-obj from clang tests, since that only works if the relevant backend is linked into clang.)

Addressed @efriedma comments.

This revision is now accepted and ready to land.Dec 14 2022, 2:09 PM

clang-format'ed.

This revision was landed with ongoing or failed builds.Dec 15 2022, 4:12 AM
This revision was automatically updated to reflect the committed changes.

I'm not sure what the libcxx failure was that caused you to revert this, but we also saw a clang crasher as a result of this. clang/lib/AST/Decl.cpp:4300 in unsigned int clang::FieldDecl::getBitWidthValue(const ASTContext &) const: isBitField() && "not a bitfield". I'll try to reduce it.

I'm not sure what the libcxx failure was that caused you to revert this, but we also saw a clang crasher as a result of this. clang/lib/AST/Decl.cpp:4300 in unsigned int clang::FieldDecl::getBitWidthValue(const ASTContext &) const: isBitField() && "not a bitfield". I'll try to reduce it.

Here:

template <class F, class X, class Y>
struct Foo {
  Foo(F, X, Y y) : y_(y) {}
  Y y_;
  union FooHolder {
    int foo;
  };
  [[no_unique_address]] FooHolder foo_holder_;
};
template <typename Factory, typename X, typename Y>
void MakeFoo(Factory f, X x, Y y) {
  Foo(f, x, y);
}
void Start() {
  MakeFoo(0, int(), [] {});
}
dzhidzhoev reopened this revision.Dec 16 2022, 6:36 AM

I'm not sure what the libcxx failure was that caused you to revert this, but we also saw a clang crasher as a result of this. clang/lib/AST/Decl.cpp:4300 in unsigned int clang::FieldDecl::getBitWidthValue(const ASTContext &) const: isBitField() && "not a bitfield". I'll try to reduce it.

Thank you!
The problem was that we should not try to get the base subobject class of union fields.

This revision is now accepted and ready to land.Dec 16 2022, 6:36 AM
dzhidzhoev updated this revision to Diff 483519.EditedDec 16 2022, 6:37 AM

Ignore union fields. Get base subobject of struct and class types only.

Does the following work with the updated patch?

class Empty {};
class UnionClass : Empty {
  [[no_unique_address]] union X {
  private:
    Empty x;
    alignas(2) char C;
  } U;
  char C;
};
UnionClass L;
dzhidzhoev added a comment.EditedDec 16 2022, 1:01 PM

Does the following work with the updated patch?

class Empty {};
class UnionClass : Empty {
  [[no_unique_address]] union X {
  private:
    Empty x;
    alignas(2) char C;
  } U;
  char C;
};
UnionClass L;

Unfortunately no, it fails on the same assert as the current trunk as soon as unions are ignored.

*** Dumping AST Record Layout
         0 | class Empty (empty)
           | [sizeof=1, dsize=1, align=1,
           |  nvsize=1, nvalign=1]

*** Dumping AST Record Layout
         0 | union UnionClass::X
         0 |   class Empty x (empty)
         0 |   char C
           | [sizeof=2, dsize=1, align=2,
           |  nvsize=1, nvalign=2]

*** Dumping AST Record Layout
         0 | class UnionClass
         0 |   class Empty (base) (empty)
         2 |   union UnionClass::X U
         2 |     class Empty x (empty)
         2 |     char C
         3 |   char C
           | [sizeof=4, dsize=4, align=2,
           |  nvsize=4, nvalign=2]

Fixed crashes reported by @efriedma and @rupprecht.
"Base subobject type" layout (e.g. type layout without suffix padding)
is calculated for unions and final classes too here, in order to
lay out fields with [[no_unique_address]].

Clang-format'ted

Fixed crash for 'struct C' in no-unique-address-3.cpp.
Without the fix, packedness for regular union layout in provided test
and for [[no_unique_address]] union layout turns out to be different.
This causes assertion on CGRecordLayoutBuilder.cpp:902.
Fix is similar to what is done on CGRecordLayoutBuilder.cpp:790,794.

efriedma added inline comments.Jan 25 2023, 9:57 AM
clang/lib/CodeGen/CGRecordLayoutBuilder.cpp
370

Should this be if (Layout.getSize() % StorageAlignment || Layout.getDataSize() % StorageAlignment)? The dependency on isNoUniqueAddress is a bit confusing.

dzhidzhoev added inline comments.Feb 2 2023, 8:32 AM
clang/lib/CodeGen/CGRecordLayoutBuilder.cpp
370

For base class subobjects, base class DataSize is used to compute the whole object size in RecordLayoutBuilder.cpp, and base subobject LLVM type (with ".base" suffix) is computed in addition to standard layout LLVM type in CGRecordLayoutBuilder.cpp.

Currently, both base subobject and standard layout LLVM types of the same class are supposed to agree on packedness, as stated in bb51300970b7. Thus, if one is packed, both are marked as packed, as done in CGRecordLowering::determinePacked.

For data members of class/struct type, declared with [[no_unique_address]] attribute, DataSize is used to compute the whole object size in RecordLayoutBuilder.cpp, but standard layout LLVM type is used to represent the field in the whole class's LLVM type.
This patch proposes to use the base subobject LLVM type instead of the default LLVM type of a class to lay out a no_unique_address member of this class type.

Nextly, the patch proposes to create an unpadded LLVM type for unions, in addition to the default LLVM type. It is used to lay out the union members having no_unique_address attribute. Existing code for creating base subobject layout of classes are reused for unions, therefore, packedness of potentially-overlapping union LLVM type and of default union LLVM types must agree as well as for classes.

dzhidzhoev marked an inline comment as not done.Feb 2 2023, 8:38 AM

Existing code creating base subobject layout of classes is reused for unions *

efriedma added inline comments.Feb 7 2023, 7:14 PM
clang/lib/CodeGen/CGRecordLayoutBuilder.cpp
370

That doesn't really get at what I was asking.

LayoutSize % StorageAlignment || Layout.getDataSize() % StorageAlignment is basically equivalent to isNoUniqueAddress ? Layout.getDataSize() % StorageAlignment != 0 : (Layout.getSize() % StorageAlignment || Layout.getDataSize() % StorageAlignment). From my understanding, you want isNoUniqueAddress=true to agree with isNoUniqueAddress=false on whether the result is packed. So why are the checks different?

Simplified condition in lowerUnion.

dzhidzhoev added inline comments.Feb 15 2023, 5:09 PM
clang/lib/CodeGen/CGRecordLayoutBuilder.cpp
370

Thanks, sorry for misunderstanding.

It seems that if Layout.getSize() isn't aligned by StorageAlignment, Layout.getDataSize() isn't aligned by it too, thus Layout.getDataSize() % StorageAlignment is enough in that condition (at least, it doesn't break any check-clang tests).
I have added an assert on that assumption.

dzhidzhoev marked an inline comment as not done.Feb 15 2023, 5:11 PM