This is an archive of the discontinued LLVM Phabricator instance.

[ADT] Add IntrusiveVariant, VariantTraits, and new STLForwardCompat
Needs ReviewPublic

Authored by scott.linder on Mar 11 2021, 9:05 PM.

Details

Summary

Add a compact, restricted version of the C++17 std::variant called
IntrusiveVariant. IntrusiveVariant stores the dynamic tag at the
beginning of every alternative type object, allowing for more efficient
packing in some cases.

Add an llvm::visit, which is compatible with std::visit and is generic
over any variant-like type which implements llvm::VariantTraits. Also
add llvm::visitSameAlternative which can be used when it is known that
all variants hold the same alternative type.

Also adds make_array, type_identity, in_place_index, and in_place_type
to STLForwardCompat.h

Diff Detail

Event Timeline

scott.linder created this revision.Mar 11 2021, 9:05 PM
scott.linder requested review of this revision.Mar 11 2021, 9:05 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 11 2021, 9:05 PM
whisperity added inline comments.
llvm/include/llvm/ADT/IntrusiveVariant.h
13

Itself

Who? The IntrusiveVariant or the union members?

36–37

I think the implementation detail in the parens could go somewhere else. If this little bit of trivia shouldn't interest the average client library, it shouldn't be placed into an obvious position.

37–39

So is it visit or Visit?

55

Is it possible to suggest here writing an implicit conversion operator (operator int() const { return Int; }) and an assignment operator (AltInt& operator=(int Int)) to the wrapped type? Writing

if (V.holdsAlternative<AltDouble>())
{
  AltDouble AD = V.get<AltDouble>();
  takes_a_double(AD.getDouble());
  AD.setDouble(8.f);
}

is just incredibly superfluous when std::variant does this already with the get<> and the assignment operator.

I'm more concerned about the assignment operator. Can we assign a (mutable) reference to the variant's alternative?

139

The referred macro(?) is not defined anywhere.

whisperity added inline comments.Mar 18 2021, 9:54 AM
llvm/include/llvm/ADT/IntrusiveVariant.h
38

How much is the limit. Where is it stored? Might need a mention of the actual value (or symbol name) here.

60–68

I honestly believe for usability perspectives we should offer the user the option not to write all this out by hand. Just a simple wrapper over a double is 9 lines (or 11 if we include the assignment and implicit conversion from the suggestion above).

This is just waaaaaay too much. If we have like 2 or 3 variations in the variant, we are not much better (if any) than writing out the enum+union ourselves.

At least please consider providing a simple macro that wraps any alternate type into such a variant-capable thing.

For example, I'd like to use a variant like as follows. This is how one would use (std|boost)::variant, right?

struct Something { /* 3 data members */ };
struct SomeOtherThing { /* 3 other data members of different type! */ };


Variant<Something, SomeOtherThing> ThisOrThat;

Now I understand this doesn't work in your case directly. The whole intrusiveness would mean that I have to deal with either making the initial data member private, and then what happens to my other data members that should (or, must!) be public instead?

As a user, I am happy to put up with having to write two additional lines, i.e.

struct Something { /* 3 data members */ };
struct SomeOtherThing { /* 3 other data members of different type! */ };

MAKE_VARIANT_ALTERNATIVE_TYPE(Something);
MAKE_VARIANT_ALTERNATIVE_TYPE(SomeOtherThing);

Variant<AltSomething, AltSomeOtherThing> ThisOrThat;
//      ^~~           ^~~  Might be tricky to figure out!

But the idea is that some of the boilerplate required to make my types (or wrappers thereof) compatible with a working variant (which I love the idea of, by the way!) should not require me to write almost the same insane amount of code as I would have to with a conventional tagged union.

145–146

Are instances of this class only instantiated in the global scope? What's the guarantee that Index will initialise to anything?

I'm not qualified with template magic and I am definitely not a person of authority on this library, so all my comments come from a generally strictly "user's perspective". I would love to have a variant type (it's surprising there never has been one in the code for so long), and I am happy I found this patch instead of trying to get into writing a similar thing myself...

llvm/include/llvm/ADT/IntrusiveVariant.h
208–209

There is a bit too much template "magic" to wrap my head around so quickly, but I have a question: does this implementation allow non-default-constructible alternative types? Or any chance that it forbids default-constructible ones? What's the stance on that. The examples in the leading documentation seem to indicate that all alternatives are non-default-ctorable... However, the "formal" requirements only say they must be trivially-dtorable and standard layout. What's the stance on default constructibleness?

scott.linder added a comment.EditedMar 19 2021, 3:14 PM

Thank you for taking a look @whisperity! I'm glad there is interest from others in the new type, I was worried at first it may be too niche of a use.

I didn't notice your comments until late today, but next week I will make a pass over them, respond, and update the patch.

I had intended to add more potentially-interested reviewers, possibly by posting to llvm-dev, but I was also in the process of rebasing the patchset which precipitated me factoring this out into include/ADT and wanted to make sure it was at the very least usable in that context, so thank you for adding some!

I'm also leaving a few comments I had already drafted as a reminder to myself.

llvm/include/llvm/ADT/IntrusiveVariant.h
216

Should be a move constructor?

258

Maybe rename to FallibleIntrusiveVariantVisitor and fallibleVisit or visitOrError?

llvm/unittests/ADT/IntrusiveVariantTest.cpp
28

These could be simplified, e.g. ASSERT_TRUE(V.hasAlternative<A>())

I'm glad there is interest from others in the new type, I was worried at first it may be too niche of a use.

At face value, I'd like to use it in D75041 in the following context, instead of the tagged union, do have an Optional<Variant<ConvertingCtor, ConversionOperator>> UserConversion. 🙂

enum UserDefinedConversionKind { UDCK_None, UDCK_Ctor, UDCK_Oper };

struct UserDefinedConvertingConstructor {
  const CXXConstructorDecl *Fun;
  QualType ConstructorParameterType;
  QualType UserDefinedType;
};

struct UserDefinedConversionOperator {
  const CXXConversionDecl *Fun;
  QualType UserDefinedType;
  QualType ConversionOperatorResultType;
};

/// The details of the user-defined conversion involved, as a tagged union.
union {
  char None;
  UserDefinedConvertingConstructor UDConvCtor;
  UserDefinedConversionOperator UDConvOp;
};
UserDefinedConversionKind UDConvKind;
scott.linder marked 5 inline comments as done.
  • Address feedback
  • Fix bug where last alternative type could not be emplaced
  • Misc cleanup
scott.linder added inline comments.Mar 24 2021, 4:42 PM
llvm/include/llvm/ADT/IntrusiveVariant.h
13

I tried to reword this a little to clarify, let me know if it helps or if you still see room for improvement.

The "alternative types" wording is lifted from std::variant, and that's what I was attempting to refer to here.

36–37

Good point, I think it can simply be deleted as the comments in detail should be sufficient to capture this. I also changed the phrasing "the set of alternative types must be unique" as I don't think it really says what I intended it to. Let me know what you think.

38

I added static constexpr members to IntrusiveVariant to describe the limits, and just referred to them everywhere the limits are mentioned.

The only place where this breaks down is in the macro to generate the actual switch in the implementation of llvm::visit and similar, but there are already other similar caveats and I see no way to overcome them without ditching the approach entirely.

55

Is it possible to suggest here writing an implicit conversion operator (operator int() const { return Int; }) and an assignment operator (AltInt& operator=(int Int)) to the wrapped type? Writing

I think in the context of this (admittedly over-simplified) example that makes perfect sense, but the moment you have anything more complex than a wrapper around another type it is not as useful anymore. For example I don't see any opportunity to simplify the following with implicit conversions:

if (V.holdsAlternative<Frobinator>()) {
  reticulate(V.get<Frobinator>().getSplines());
  V.get<Frobinator>().setWidget(getCachedWidgets());
}
if (V.holdsAlternative<AltDouble>())
{
  AltDouble AD = V.get<AltDouble>();
  takes_a_double(AD.getDouble());
  AD.setDouble(8.f);
}

is just incredibly superfluous when std::variant does this already with the get<> and the assignment operator.

I'm more concerned about the assignment operator. Can we assign a (mutable) reference to the variant's alternative?

Yes there is a non-const overload for IntrusiveVariant::get that you can assign into if the alternative type provides the appropriate copy/move assignment operator.

60–68

I agree there is a good chunk of boilerplate, but AFAICT there is no clean way to avoid this, and the real utility of the new type is in providing a type-safe, modern interface to a tagged union which is squeezed into the smallest possible memory footprint. For example instead of:

switch (MyTaggedUnion.getKind()) {
case MyTaggedUnion::K_int:
  // use MyTaggedUnion.getInt(), don't accidentally use any other accessors...
  break;
case MyTaggedUnion::K_float:
  // use MyTaggedUnion.getFloat(), don't accidentally use any other accessors...
  break;
case MyTaggedUnion::K_IntPair:
  // use MyTaggedUnion.getFirstInt(), don't accidentally use any other accessors...
  // use MyTaggedUnion.getSecondInt(), don't accidentally use any other accessors...
  break;
}

You can write:

visit(makeVisitor(
         [](AltInt I) { /* use I.getInt() */},
         [](AltFloat) { /* use I.getFloat() */},
         [](AltIntPair IP) { /* use IP.getFirstInt() and IP.getSecondInt() */}
       ), MyVariant);

And at several points things which would be runtime UB in the hand-written tagged-union case are now compile-time errors.

In trying to work through providing a simpler interface I run into problems with each approach. If you are just naively wrapping another type, then in general you are losing out on the optimization the IntrusiveVariant was designed around, which is to get more optimal packing when the "wrapped" type has spare padding that the tag could live in. Consider for example:

struct Inner { char X; int Y; };
struct Outer { char C; Inner I; };
struct Flat { char C; char X; int Y; };
static_assert(sizeof(Flat) < sizeof(Outer), "");

So the only way I see to implement something along the lines of what you suggest is to have a handful of macros to "begin", "add_member", and "end" an alternative:

BEGIN_VARIANT_ALTERNATIVE_TYPE(AltSomething)
ADD_MEMBER_VARIANT_ALTERNATIVE_TYPE(TypeA, A);
ADD_MEMBER_VARIANT_ALTERNATIVE_TYPE(TypeB, B);
ADD_MEMBER_VARIANT_ALTERNATIVE_TYPE(TypeC, C);
END_VARIANT_ALTERNATIVE_TYPE()

Where the above would expand to a class definition.

This seems less attractive than just having some boilerplate, and it doesn't cover the reasonable cases where a class doesn't want to expose all of its members uniformly.

208–209

There is a bit too much template "magic" to wrap my head around so quickly

You and me both :^) I tried to skip as many features as seemed reasonable in an effort to keep the amount of template/preprocessor magic to the bare minimum, but the result is still pretty dense :(

does this implementation allow non-default-constructible alternative types? Or any chance that it forbids default-constructible ones? What's the stance on that.

This supports both default-constructible and non-default-constructible alternative types. The possibilities are:

The type pack ArgTs is empty, in which case it is essentially equivalent to:

template <std::size_t I>
UnionImpl(InPlaceIndex<I>) {

Which will result in default-constructing the appropriate union member if it is default-constructible, otherwise it will result in a compile-time error indicating overload resolution could not find an appropriate constructor.

Or, the type pack is not empty, and we forward whatever arguments along to the appropriate union member's non-default constructor.

The examples in the leading documentation seem to indicate that all alternatives are non-default-ctorable

I can add an example of a type that is default-constructible, but there is explicit documentation as part of IntrusiveVariant itself that an IntrusiveVariant is default-constructible iff the first alternative type is:

/// A default constructed IntrusiveVariant holds a default constructed value
/// of its first alternative. Only enabled if the first alternative has a
/// default constructor.
template <int B = std::is_default_constructible<detail::NthType<0, Ts...>>{},
           typename std::enable_if_t<B, int> = 0>
constexpr IntrusiveVariant() : Union(detail::InPlaceIndex<0>{}) {}

And otherwise the default constructors are just constructors with no arguments, and you can emplace using any alternative type constructor.

Looks like this visitor infrastructure has some overlap with D99560 and D99561 - be good to deduplicate some work here, perhaps.

Perhaps PointerUnion could be implemented as a derivation from this intrusive variant in a follow-up change, subsuming D99561

Looks like this visitor infrastructure has some overlap with D99560 and D99561 - be good to deduplicate some work here, perhaps.

Yes, I agree that factoring out llvm::makeVisitor and llvm::visit and making it work over all variant-like types makes sense. It's unfortunate that even once we move to C++17 we won't be able to re-open std to specialize std::visit for our types, at least if I understand correctly. We may just be stuck with llvm::visit and std::visit living side-by-side?

I also think things like InPlaceType could be moved somewhere general; llvm::Optional has an in_place_t as a detail, so pulling all of those new disambiguation types into a common place would also be good.

Perhaps PointerUnion could be implemented as a derivation from this intrusive variant in a follow-up change, subsuming D99561

I'm not as familiar with the implementation of PointerUnion, I just noted when I did a pass over the LLVM container types that it wouldn't be sufficient for the specific thing I was working on, so I started writing my own compact variant-like type. At the very least I think we should have a common llvm::visit with specializations for both llvm::PointerUnion and llvm::IntrusiveVariant.

Perhaps PointerUnion could be implemented as a derivation from this intrusive variant in a follow-up change, subsuming D99561

I'm not as familiar with the implementation of PointerUnion, I just noted when I did a pass over the LLVM container types that it wouldn't be sufficient for the specific thing I was working on, so I started writing my own compact variant-like type. At the very least I think we should have a common llvm::visit with specializations for both llvm::PointerUnion and llvm::IntrusiveVariant.

Hopefully/probably overloads, rather than specializations (it's not possible to partially specialize a function template, in any case), but yeah -

llvm/include/llvm/ADT/IntrusiveVariant.h
160–167

Can we instead use/generalize something like the type traits used for PointerUnion that declare the number of spare bits in the type? Since we already have those in LLVM (& to macro-y).

277–281

Not sure I follow why this ends up as a runtime error - what if the overload with unreachable (down at 307) wasn't defined? Wouldn't that produce a compile-time error only when the index is out of bounds?

328–344

There's a fair bit of macro expansion stuff here - any chance it can be restructured to use variadic templates to generalize better/reduce the macro usage?

430–433

Technically C++ doesn't allow access to a common initial sequence except for types with standard layout, I think? So types with non-trivial copy ops, etc, wouldn't be usable here. (I wonder what PointerUnion does? I guess it just stuffs everything in void* or uintptr_t, which isn't exactly applicable here - but maybe some aligned storage would be suitable here)

scott.linder added inline comments.Apr 2 2021, 3:42 PM
llvm/include/llvm/ADT/IntrusiveVariant.h
160–167

My understanding of the PointerUnion case is that the standard allows the bitwise manipulation of pointer representations so long as you jump through the appropriate casting hoops (i.e. to (u)intptr_t and back) and ensure you end up with the same representation before casting back. For objects of arbitrary standard-layout type I don't know that we have any neat way to ensure we don't invoke UB, as the "padding" bits in the object representation are not guaranteed to be preserved over most uses of the struct, AFAIK.

That said, if I'm just missing something I would be very happy to remove DECLARE_INTRUSIVE_ALTERNATIVE. I currently see it as an acceptable concession needed to meet the requirements of the "common initial sequence" rule, which seems like the only way to get the compact layout without invoking UB.

277–281

AFAICT the only way to get that behavior is to have a separate overload for every size of IntrusiveVariant we support, as there is no way to recursively generate the switch itself, and if we just declare a switch with some arbitrary number of case limbs (in my patch this is 32) then it is a compile-time error to use any IntrusiveVariant with fewer than 32 alternatives.

In the simplest case:

visit(Callable C, IntrusiveVariant<A> IV) { // i.e. toy example of a monomorphized visit for a variant with a single alternative type A
  switch (IV.index()) {
  case 0: C(IV.Union.getMember(InPlaceIndex{0})); break;
  case 1: C(IV.Union.getMember(InPlaceIndex{1})); break; // This does not resolve to any overload of getMember, and is a compile-time error.
  }
}

In the above, if we only wrote one template for visit and it has two case limbs in the switch, then it will only compile for variants with exactly two alternative types. If we could recursively generate the body of visit we could do this without the unreachable workaround, but I don't see how to.

328–344

I think this is getting at the same issue as above, and I don't see any way to generate a switch with a compile-time determined number of case limbs using variadic templates :/

430–433

Yes, that's my understanding too, and I do static_assert the standard-layout-ness of all alternative types and mention this restriction in the file header comment.

maybe some aligned storage would be suitable here

I think this is the point I may be missing up above, but if I understand what you're getting at the idea would be to basically:

  • Allocate an array of char with the maximum alignment of any of the alternative types
  • If possible, put our index tag in any padding left over after the value representation of all the alternative types (as I mentioned above, this is the part I'm unsure about concerning UB)
  • Manage all of the construction, copying, moving, destruction of alternative types in the array of char

If there is a way to do this without relying on UB, then it seems reasonable to me. I do worry it would end up being a lot of code, in particular I think we would need to do something like the libcxx implementation of std::variant to support all of the possible combinations of levels of trivial-ness of construction, copying, destruction, etc. which was something I wanted to avoid doing purely to make the code simpler.

dblaikie added a subscriber: rsmith.Apr 6 2021, 4:47 PM
dblaikie added inline comments.
llvm/include/llvm/ADT/IntrusiveVariant.h
160–167

Wondering if it might be more intuitive for users to work from the back instead - bit more work in the implementation, though?

eg: if I've got a {int, char} struct, and I use some tool to dump the layout and find I have 3 bytes of tail padding I could create an array of 3 chars at the end ({int, char, char[3]}) and specialize some template to say "I have 3 bytes of tail padding char array" - guess that still doesn't provide compatibility/foundation to implement pointer union on. Would have to declare the spare space in bits, and the aliasing would still be a problem for the pointer case

I guess the system you've got means you don't have to specialize another template, etc - the existence of a friend-accessed member with a specific name you can check for is enough evidence the type has agreed to be intrusively varianted.

I'd love to here @rsmith's or other C++-y folks thoughts on this, as it's some non-trivial design with interesting overlap on existing constructs.

277–281

I've missed a step in the logic here.

I'm still trying to wrap my head around: What happens if the function defined below with llvm_unreachable was instead not defined at all? Where is the code that's creating a dead call to this code such that it needs to be defined, but is guaranteed never to be called?

Separately, as for the switches: Presumably some recursion could be used instead. Yeah, it doesn't scale well especially at -O0 codegen.

Oh, maybe an array of function pointers could be used - a function template templated on the index? Would avoid the recursion.

430–433

Ah. Rather than using the common initial sequence rule - could you instead take a pointer and reinterpret_cast it - knowing that the member is at the start of the object? I would've thought that would be valid even without the common initial sequence rule - but maybe there's some pointer providence that makes that approach invalid?

(eg: if I know struct x { int; }; starts with an int, and I have x *y = &some_x; can I not reinterpret_cast y to int* and dereference it?)

but, yeah - partly I was thinking of putting it at the end rather than the beginning - and, yeah, I'm not sure what happens if you have some of these chars at the end of the object, then some trailing after the object (if some of the objects are of different sizes) it may be fussy to correctly address all those bytes (struct-internal padding, and external padding) in a simple/uniform way.

Sorry for the delay in replying, you pose some very good questions and I wanted to revisit the reasoning that led me to my current approach in as thorough a way as possible (which meant it took some time, and I didn't get around to it right away).

llvm/include/llvm/ADT/IntrusiveVariant.h
277–281

What happens if the function defined below with llvm_unreachable was instead not defined at all?

In short, it is a compile-time error to use any IntrusiveVariant without exactly MaxNumberOfAlternatives alternative types. In effect, the compiler sees exactly one switch with exactly MaxNumberOfAlternatives non-default cases, each of which calls FallibleIntrusiveVariantVisitor::operator(). If the IntrusiveVariant has fewer than MaxNumberOfAlternatives, some of those calls do not resolve to any overload and the compiler complains.

More generally, with the approach I'm taking, when calling visit defined with N non-default switch cases on an IntrusiveVariant with M alternative types:

  • If N == M, each alternative type corresponds to exactly one case, and exactly one meaningful overload of FallibleIntrusiveVariantVisitor::operator().
  • If N < M, the first M alternative types correspond to exactly one case, and exactly one meaningful overload of FallibleIntrusiveVariantVisitor::operator(). The remaining M - N cases correspond to the default: case, and we hit a runtime error. This should be a compile time error, which I manually enforce with static_assert(sizeof...(Ts) <= MaxNumberOfAlternatives); this isn't particularly pretty, but with the simplest switch-based approach I don't know how to achieve anything better.
  • If N > M, the first M alternative types correspond to exactly one case, and exactly one meaningful overload of FallibleIntrusiveVariantVisitor::operator(). However, there are N - M additional cases in the switch, which call "invalid" or "non-meaningful" overloads of FallibleIntrusiveVariantVisitor::operator(). We must provide a definition of these functions to satisfy the compiler/linker, but by construction we ensure they can never actually be called at runtime. To try to convey this to the compiler, we define them to call llvm_unreachable.

Separately, as for the switches: Presumably some recursion could be used instead. Yeah, it doesn't scale well especially at -O0 codegen.

Oh, maybe an array of function pointers could be used - a function template templated on the index? Would avoid the recursion.

Both of these are definitely viable, and have different tradeoffs. I landed on switch-based because:

  • It demands the least amount of work by the compiler to be efficient at runtime. As you mention, this matters most at -O0
  • It demands the least amount of work by the compiler to produce a compact binary. This is (or at least was) apparently an issue in some real world cases with e.g. libc++'s array-of-function-pointer based approach, resulting in pretty significant code-bloat. It has led to multiple attempts to incorporate a switch-based approach into libc++'s implementation of visit.
  • It seemed (definitely subjectively) like the simplest to write and read. The need for the "fallible" adapter might negate some of this, though, so I'm not so certain this is true anymore.

The current state of this in libc++ seems to be that the switch-based approach has been landed and reverted multiple times, each time the revert noting the general approach is still sound, but some issue with the exact implementation was identified and could not be fixed in a short enough window to avoid reverting. The most recent revert is https://reviews.llvm.org/D91662

The fix to this in my patch is to simply not implement the full generality, and have some relatively small limit on the number of alternative type supported. That way the code generated scales linearly with the number of unique IntrusiveVariant types instantiated, and has a pretty small constant factor (a 32-case switch where each branch is a call). At higher optimization levels I would expect all of this to be pretty routinely optimized away, but I haven't actually verified that.

Maybe @mpark, @ldionne, @EricWF have some guidance on what the best implementation strategy for an LLVM-specific visit might be?

scott.linder retitled this revision from [ADT] Add IntrusiveVariant to [ADT] Add IntrusiveVariant, in_place_index, and in_place_type.
scott.linder edited the summary of this revision. (Show Details)

Rebase and do some cleanup

Could you reupload this (or maybe you can handle this just by editing the revision here on the web UI?) using arc/phab's ability to flag dependent patches/part of a series, hopefully that'll make it easier to apply the patch to test things? (tried just applying this patch and of course no template named 'TypeAtIndex')

llvm/include/llvm/ADT/IntrusiveVariant.h
277–281

Yeah, would be happy to hear from the libc++ folks - though, for what it's worth I think maybe the array of function pointers might be simpler (once the patch series is updated so it's easier to apply for testing, I'll poke around with a few options) for our purposes without having to necessarily have the same priorities a highly general library like libc++ has.

Could you reupload this (or maybe you can handle this just by editing the revision here on the web UI?) using arc/phab's ability to flag dependent patches/part of a series, hopefully that'll make it easier to apply the patch to test things? (tried just applying this patch and of course no template named 'TypeAtIndex')

Yes, I had meant to from the start, I think it should be correct now if you want to try again

llvm/include/llvm/ADT/IntrusiveVariant.h
277–281

That's fair, thinking about it again the fact that we are not intending to support visiting multiple variants means the array will be 1-dimensional, which should make it easier to read and reduce the code-bloat in the unoptimized case. I can also draft an implementation of that

Side thought: If this is going to use a prefix for the discriminator - could that be a base class that all types being used as union members have to derive from? No need for a macro, etc. Could use an offsetof comparison to make sure that not only is it a base class, but the first non-empty base class.

(still haven't been able to easily apply this locally - looks like it hit a conflict with c6f20d70a8c93ab3d50c9777c477fe040460a664 - though I thought arc would sync to the right version to apply the patch... not sure what's going on - might just wait to try that until the other patches have been committed)

scott.linder retitled this revision from [ADT] Add IntrusiveVariant, in_place_index, and in_place_type to [ADT] Add IntrusiveVariant, VariantTraits, and new STLForwardCompat.
scott.linder edited the summary of this revision. (Show Details)

Try to factor out the implementation of visit to be based on specializations
of a type VariantTraits. This should make it possible to share one
implementation with other variant-like types, for example PointerUnion.

Switch to an array-of-functions approach which doesn't have any restrictions on
the number of alternative types or the number of variants visited at once.

There are still some WIP things, including SFINAE tests on at least
llvm::visit, a lot of documentation, and the trait implementation for
PointerUnion. Other functions like holds_alternative and get could also be
handled in the same way.

I wanted to get feedback on the general approach before going any further.

Side thought: If this is going to use a prefix for the discriminator - could that be a base class that all types being used as union members have to derive from? No need for a macro, etc. Could use an offsetof comparison to make sure that not only is it a base class, but the first non-empty base class.

(still haven't been able to easily apply this locally - looks like it hit a conflict with c6f20d70a8c93ab3d50c9777c477fe040460a664 - though I thought arc would sync to the right version to apply the patch... not sure what's going on - might just wait to try that until the other patches have been committed)

Unfortunately having any non-static data members as part of a base class will make the alternative types non-standard-layout, which means we can't use the "common initial sequence" rule to pack them.

Side thought: If this is going to use a prefix for the discriminator - could that be a base class that all types being used as union members have to derive from? No need for a macro, etc. Could use an offsetof comparison to make sure that not only is it a base class, but the first non-empty base class.

(still haven't been able to easily apply this locally - looks like it hit a conflict with c6f20d70a8c93ab3d50c9777c477fe040460a664 - though I thought arc would sync to the right version to apply the patch... not sure what's going on - might just wait to try that until the other patches have been committed)

Unfortunately having any non-static data members as part of a base class will make the alternative types non-standard-layout, which means we can't use the "common initial sequence" rule to pack them.

But would we need to use the common initial sequence rule? If the base object was at the start of the layout - a pointer to the start of the layout would be a valid pointer to the base object, yeah?

scott.linder added a comment.EditedJun 3 2021, 12:22 PM

Side thought: If this is going to use a prefix for the discriminator - could that be a base class that all types being used as union members have to derive from? No need for a macro, etc. Could use an offsetof comparison to make sure that not only is it a base class, but the first non-empty base class.

(still haven't been able to easily apply this locally - looks like it hit a conflict with c6f20d70a8c93ab3d50c9777c477fe040460a664 - though I thought arc would sync to the right version to apply the patch... not sure what's going on - might just wait to try that until the other patches have been committed)

Unfortunately having any non-static data members as part of a base class will make the alternative types non-standard-layout, which means we can't use the "common initial sequence" rule to pack them.

But would we need to use the common initial sequence rule? If the base object was at the start of the layout - a pointer to the start of the layout would be a valid pointer to the base object, yeah?

AFAIK that isn't true: we can't assume the base subobject is at the start of the layout of the derived object unless both are standard-layout. The first example I could think of where the assumption breaks down with any reasonable compiler is multiple inheritence, e.g. in https://godbolt.org/z/K9rnxbPo9 it is impossible to know where to find the base subobject for the dynamic alternative object when all we have is a pointer to whatever is providing storage for it. Put another way, we need to already know the dynamic derived type to do this, which is the thing we are after in the first place.

The example in the above godbolt is:

#include <cstdint>

struct Base { int tag; };
struct Unrelated { int foo; };
struct AlternativeA : Base { int bar; };
struct AlternativeB : Unrelated, Base { int baz; };

AlternativeA A;
AlternativeB B;

uintptr_t AddrA() { return (uintptr_t)(void*)&A; }
uintptr_t AddrAtoBase() { return (uintptr_t)(void*)(Base*)&A; }

uintptr_t AddrB() { return (uintptr_t)(void*)&B; }
uintptr_t AddrBtoBase() { return (uintptr_t)(void*)(Base*)&B; }
uintptr_t AddrBtoUnrelated() { return (uintptr_t)(void*)(Unrelated*)&B; }

Where in trunk clang AddrB() != AddrBtoBase().

Maybe there are not other examples where a reasonable compiler doesn't lay out a sole base class at the start of the derived class, but even still it violates strict aliasing if we aren't talking about standard-layout types.

I also tried to look deeper into how this might be implemented using aligned storage, which you suggested earlier. As you point out, there are potential provenance issues. The pointer to the object which provides storage cannot be cast to a derived type without std::launder in at least some cases, although std::launder doesn't exist until C++17 and it isn't clear what the correct interpretation is for a C++14 compiler. Additionally, your example of reinterpret_cast to access the first member is only valid for standard-layout types.

The suggestion to put the tag at the end, starting after the object representation of the biggest alternative type, seems the most feasible, assuming:

  • We don't need std::launder pre-C++17
  • No reasonable compiler we support for compiling LLVM will be modifying trailing padding, i.e. bits in the object representation that are not a part of the value representation. The only thing I found with a quick search is __builtin_clear_padding which seems to be intended for use with atomics https://reviews.llvm.org/D87974?id=292987
  • We can figure out some way to know where the padding in each alternative object starts. You suggested a traits-like mechanism where the alternative itself tries to occupy all the space it expects to be padding with a char array, and makes our implementation aware of it. I don't think I'm happy with that approach for the reasons you mention, and because it seems very non-ergonomic for the user, much more than requiring them to invoke a macro or derive from a base class.

If we can come up with a reliable way to know what padding bits are available, and are comfortable with the fact that compilers are not interested in clobbering them, then I think the std::launder issue seems like a minor detail we can figure out one way or another.

Do you have any thoughts on how I might go about tackling this? It would be really nice to have a cleaner interface to IntrusiveVariant, but I'm still just stuck on how to achieve it in a portable way.

Edit: Also, thank you for being so thorough and responsive in reviewing this! I'm excited to see the design moving in a cleaner direction :^)

Edit: Where I say "it violates strict aliasing" above, I'm not sure if I'm using that term too loosely, but to clarify: the specific case of a reinterpret_cast between a pointer to a standard-layout struct and its first member/base subobject is allowed by (9.2.19):

If a standard-layout class object has any non-static data members, its address is the same as the address of its first non-static data member. Otherwise, its address is the same as the address of its first base class subobject (if any). [ Note: There might therefore be unnamed padding within a standard-layout struct object, but not at its beginning, as necessary to achieve appropriate alignment. — end note ]

combined with (5.2.9.13):

A prvalue of type “pointer to cv1 void” can be converted to a prvalue of type “pointer to cv2 T,” where T is an object type and cv2 is the same cv-qualification as, or greater cv qualification than, cv1. The null pointer value is converted to the null pointer value of the destination type. If the original pointer value represents the address A of a byte in memory and A satisfies the alignment requirement of T, then the resulting pointer value represents the same address as the original pointer value, that is, A. The result of any other such pointer conversion is unspecified. A value of type pointer to object converted to “pointer to cv void” and back, possibly with different cv-qualification, shall have its original value.

I.e. they have the same address, so you can reinterpret_cast between them. C++17 adds a pointer-interconvertible relation which seems to formalize this more.

Side thought: If this is going to use a prefix for the discriminator - could that be a base class that all types being used as union members have to derive from? No need for a macro, etc. Could use an offsetof comparison to make sure that not only is it a base class, but the first non-empty base class.

(still haven't been able to easily apply this locally - looks like it hit a conflict with c6f20d70a8c93ab3d50c9777c477fe040460a664 - though I thought arc would sync to the right version to apply the patch... not sure what's going on - might just wait to try that until the other patches have been committed)

Unfortunately having any non-static data members as part of a base class will make the alternative types non-standard-layout, which means we can't use the "common initial sequence" rule to pack them.

But would we need to use the common initial sequence rule? If the base object was at the start of the layout - a pointer to the start of the layout would be a valid pointer to the base object, yeah?

AFAIK that isn't true: we can't assume the base subobject is at the start of the layout of the derived object unless both are standard-layout.

My hope/thought was/is to use offsetof to check that the member is at the start of the object. Though, yes, for maximal portability, offsetof does require standard layout (though it seems both clang and gcc support it at least in some more cases than that: https://godbolt.org/z/Pjr6YEqMq )

In any case - even if it still leaves us with the standard layout restriction - using a base class rather than a macro+member seems like a nicer/more usable solution?

But yeah, that still means standard layout, and potentially using the common initial sequence rules - I think I'm OK with that. Type punning's probably OK there too, but common initial sequence seems OK. And since it's only one byte - I'm OK with going at the front rather than the end.

But, yeah, I think using a base class rather than a macro would be good & I think it's as valid.

llvm/unittests/ADT/IntrusiveVariantTest.cpp
47–76

Could this be a template (& 3 typedefs), rather than a macro?

Side thought: If this is going to use a prefix for the discriminator - could that be a base class that all types being used as union members have to derive from? No need for a macro, etc. Could use an offsetof comparison to make sure that not only is it a base class, but the first non-empty base class.

(still haven't been able to easily apply this locally - looks like it hit a conflict with c6f20d70a8c93ab3d50c9777c477fe040460a664 - though I thought arc would sync to the right version to apply the patch... not sure what's going on - might just wait to try that until the other patches have been committed)

Unfortunately having any non-static data members as part of a base class will make the alternative types non-standard-layout, which means we can't use the "common initial sequence" rule to pack them.

But would we need to use the common initial sequence rule? If the base object was at the start of the layout - a pointer to the start of the layout would be a valid pointer to the base object, yeah?

AFAIK that isn't true: we can't assume the base subobject is at the start of the layout of the derived object unless both are standard-layout.

My hope/thought was/is to use offsetof to check that the member is at the start of the object. Though, yes, for maximal portability, offsetof does require standard layout (though it seems both clang and gcc support it at least in some more cases than that: https://godbolt.org/z/Pjr6YEqMq )

In any case - even if it still leaves us with the standard layout restriction - using a base class rather than a macro+member seems like a nicer/more usable solution?

The moment we have non-static data members in both a derived class and its base class the derived class becomes non-standard-layout, though. Even just the following will not work:

struct VariantHeader {
  uint8_t Tag;
};

// SomeAlternative is not standard-layout
struct SomeAlternative : VariantHeader {
  int Foo;
};

But yeah, that still means standard layout, and potentially using the common initial sequence rules - I think I'm OK with that. Type punning's probably OK there too, but common initial sequence seems OK. And since it's only one byte - I'm OK with going at the front rather than the end.

But, yeah, I think using a base class rather than a macro would be good & I think it's as valid.

Side thought: If this is going to use a prefix for the discriminator - could that be a base class that all types being used as union members have to derive from? No need for a macro, etc. Could use an offsetof comparison to make sure that not only is it a base class, but the first non-empty base class.

(still haven't been able to easily apply this locally - looks like it hit a conflict with c6f20d70a8c93ab3d50c9777c477fe040460a664 - though I thought arc would sync to the right version to apply the patch... not sure what's going on - might just wait to try that until the other patches have been committed)

Unfortunately having any non-static data members as part of a base class will make the alternative types non-standard-layout, which means we can't use the "common initial sequence" rule to pack them.

But would we need to use the common initial sequence rule? If the base object was at the start of the layout - a pointer to the start of the layout would be a valid pointer to the base object, yeah?

AFAIK that isn't true: we can't assume the base subobject is at the start of the layout of the derived object unless both are standard-layout.

My hope/thought was/is to use offsetof to check that the member is at the start of the object. Though, yes, for maximal portability, offsetof does require standard layout (though it seems both clang and gcc support it at least in some more cases than that: https://godbolt.org/z/Pjr6YEqMq )

In any case - even if it still leaves us with the standard layout restriction - using a base class rather than a macro+member seems like a nicer/more usable solution?

The moment we have non-static data members in both a derived class and its base class the derived class becomes non-standard-layout, though. Even just the following will not work:

struct VariantHeader {
  uint8_t Tag;
};

// SomeAlternative is not standard-layout
struct SomeAlternative : VariantHeader {
  int Foo;
};

Ah, thanks for the correction, I missed the "Has all non-static data members and bit-fields declared in the same class (either all in the derived or all in some base)" part of the Standard Layout requirement.

@rsmith got any thoughts on this? I'd rather not have a macro stamping out this member, though I appreciate wanting to have it only be accessible to the variant type (so wanting it to be private+friended) - I do rather like/prefer being able to use a base class, though it means going off-spec with offsetof usage in a non-standard-layout type (& possibly making the offsetof check only done on certain compilers - crossing our fingers that the compilers we didn't test on don't break the layout requirement))

scott.linder edited the summary of this revision. (Show Details)
  • Clean up, rename, and document the VariantTraits interfaces
  • Expose IntrusiveVariant::index() as it is already available through the traits
  • Fix a bug in IntrusiveVariant comparison functions

Ping

I did try to write an offsetof-based version using inheritance instead of the "common initial sequence" rule, but I am not sure how to do the necessary checks statically. The best I can see to do would be to have some dynamic ASSERTs to confirm that casting the alternative type to the base type does not change the underlying representation (i.e. if both pointers are cast to uintptr_t they are equal).

Add an extra function makeThunkForSameAlternative to force old versions of
gcc to deduce the return value of the overloaded "thunk" being instantiated
before deducing the template arguments for make_array.

Workaround for https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64194

@rsmith @dexonsmith @aaron.ballman - folks who might have some opinions on C++ API design. Any thoughts on this patch in general, but also in particular how the discriminator is represented in supported classes? (I'd like to avoid macros, and there are a few tradeoffs around that)

ldionne removed a subscriber: ldionne.Dec 16 2021, 10:36 AM

Sorry, I don't have bandwidth to chime in.

No worries, I understand the patch is pretty big and the template code is (unfortunately) dense.

If it would help any other reviewers I can also try to split things up further. One easy axis to do this is the addition of llvm::make_array, and another is VariantTraits.h, both of which can be reviewed independently. Just let me know if this would simplify things for anyone!

I also intend to address an issue with return type inference for overloaded visitor operator()s, like the standards body did with http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0655r0.pdf in C++20 (essentially adding a visit<R> where you can specify the return type). This makes some common use-cases with makeVisitor nicer to look at.

Implement explicit return version of llvm::visit

When the common return type of the visitor is ambiguous it is useful to be able
to specify a type to coerce the result to. This implements the equivalent
std::visit addition in C++20.