Page MenuHomePhabricator

[clang] adds unary type transformations as compiler built-ins

Authored by cjdb on Dec 22 2021, 9:53 PM.




  • __add_const
  • __add_volatile
  • __add_cv
  • __add_lvalue_reference
  • __add_pointer
  • __add_rvalue_reference
  • __decay
  • __make_signed
  • __make_unsigned
  • __remove_all_extents
  • __remove_extent
  • __remove_const
  • __remove_volatile
  • __remove_cv
  • __remove_pointer
  • __remove_reference
  • __remove_cvref

These are all compiler built-in equivalents of the unary type traits
found in [[meta.trans]][1]. The compiler already has all of the
information it needs to answer these transformations, so we can skip
needing to make partial specialisations in standard library
implementations (we already do this for a lot of the query traits). This
will hopefully improve compile times, as we won't need use as much
memory in such a base part of the standard library.


Co-authored-by: zoecarver

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
rsmith added inline comments.Apr 11 2022, 1:07 PM

I'd prefer that you restructure this to first compute a StringRef corresponding to the name of the trait and then use Out << 'u' << Name.size() << Name. (That'd make the miscounting error that a previous version of this patch had with __remove_pointer impossible.)


Do we need different handling for __underlying_type versus the other unary type transforms here?


If we want a file name in llvm_unreachable and it's not providing one, I think that's something we should add centrally rather than in this one usage -- I'd prefer that you not pass __FILE__ here and instead produce a separate patch to extend llvm_unreachable if needed. (Also you seem to be missing any whitespace in the resulting string between the filename and the start of the message.)


Do not pass English-language text fragments to diagnostics -- we aim for our diagnostics to (at least in principle) be translatable. Instead, add a new diagnostic for this situation (or perhaps just use diag::err_expected, tok::l_paren).


Can we give this a better name, given that it's not used for only typeof? Maybe something like setTypeArgumentRange?

The patch description doesn't match the patch: we have __remove_cv in the description but __remove_cv_qualifiers in the patch. It seems unfortunate to me to use a long-term bad name for our trait to work around a (likely) short-term problem with a specific version of libstdc++. Perhaps we could demote the __remove_cv keyword to a plain identifier if it's not followed by a ( instead (we've done similar things for other trait keywords that have conflicted with libstdc++); would that be enough to resolve the incompatibility?

cjdb updated this revision to Diff 423122.Apr 15 2022, 9:21 AM
cjdb marked 14 inline comments as done.

responds to all feedback from the last round, except for handling __remove_cv in libstdc++ land

cjdb marked an inline comment as done.Apr 15 2022, 9:22 AM
cjdb updated this revision to Diff 423225.Apr 16 2022, 12:08 AM

achieves harmony with libstdc++

cjdb updated this revision to Diff 423253.Apr 16 2022, 12:11 PM
  • catches some missing things that only handled __underlying_type
  • makes __remove_pointer remove address spaces (otherwise it'd create types like int __attribute__((address_space(1))))
cjdb marked an inline comment as done.Apr 16 2022, 12:15 PM
cjdb added inline comments.

I don't think handling address spaces is worth the trouble: we can always do __add_pointer(int __attribute__((address_space(1)))) if desired (I added two tests to confirm this). I did need to fix __remove_pointer so that it also removes address spaces, so thank you for flagging this :)

Generally looking good to me, mostly just drive-by commentary at this point.


Should we find a way we can drop the 19 here so that this doesn't become incorrect when someone adds anything to TransformTypeTraits.def?


I don't know why this offends my sensibilities, but it does. It seems like the label should be with the other (case) labels to my lizard brain, but the current way is also correct.


Well, at least my sensibilities are consistent about pearl clutching, as this one also jumped out at me. :-D


And yet this one didn't offend my sensibilities. It's the curly braces that's doing it. :-D



(Also, I'm a bit concerned that this feature requires so many gotos. Each one is reasonable in isolation, but we're adding a fair amount of total parsing complexity with those. That said, I don't have a better solution to suggest.)

cjdb updated this revision to Diff 423716.Apr 19 2022, 1:14 PM

removes discrete number of elements in small vector

Per offline discussion with @aaron.ballman, not going to change the labels' indentation (so it remains consistent with clang-tidy)

cjdb marked 5 inline comments as done.Apr 19 2022, 1:16 PM
This revision is now accepted and ready to land.Apr 20 2022, 9:45 AM
rsmith added inline comments.Mon, Apr 25, 5:05 PM

This file should undef the TRANSFORM_TYPE_TRAIT_DEF macro at the end, like our other .def files do. Then you can remove all of the #undefs elsewhere in this patch.


Consider using the .def file here.


I think the existence of the IsCPlusPlus flag may have the opposite effect of the one that's intended:

  • Without this flag, looking only at the declaration of this function, I would assume you're not supposed to call this function except in C++ mode.
  • With this flag, looking only at the declaration of this function, I would assume that passing in IsCPlusPlus = false would cause the function to always return false (because there are no reference types).

It might be better to define "referenceable" as "function type with no ref-qualifier nor cv-qualifier, object type, or reference type", and just accept calls to isReferenceable across all language modes.


It's better not to test whether the type is a function type (walking past sugar etc) more than once.


Please keep the formatting consistent here, even if you need to override clang-format.


It'd be nice to just compare T against the least and greatest trait value -- this linear scan seems more expensive than we might want. (And sure, we don't care about efficiency given that this function is only used by asserts, but I don't think we should assume it'll remain that way.)

Eg, something like:

constexpr TST Traits[] = {
#define TRANSFORM_TYPE_TRAIT_DEF(_, Trait) TST_##Trait,
#include "clang/AST/TransformTypeTraits.def"
static_assert(std::is_sorted(std::begin(Traits), std::end(Traits)));
return T == TST_atomic || T == TST_typename || T == TST_typeofType || (T >= Traits[0] && T <= std::end(Traits)[-1]);

This should just be u, not Uu.


You're missing the I...E around this argument.


Our normal convention is to put the { after the last label in a sequence of labels.


The name signature you're using for this function is inconsistent with the conventions in the rest of the parser: when a Parser::Parse... function with a bool return type returns true, it means "I have failed and issued a diagnostic". For "parse this if possible and return whether you did", we usually use Parser::MaybeParse... (eg, MaybeParseAttributes).

Alternatively you could do the one-token lookahead here. If the setKind call is here, it'll be a lot clearer why it makes sense to goto ParseIdentifier;.

Also, for workarounds for a standard library issue, we normally include a // HACK: comment explaining what we're doing and why.


Our normal convention is to put the { after the last label in a sequence of labels.


Here you're checking for trait< and treating it as an identifier; elsewhere you're checking for trait( and treating it as a builtin. Is there a reason to treat trait followed by a token other than ( or < inconsistently?


As before, { should go after the last label.


This assert will start failing if someone adds another trait that isn't in this range. Can we do something more robust?

One possibility would be to extend the .def file with macros for selecting the first and last trait only. You could also use that in DeclSpec::isTypeRep(). Another would be to factor out the part of isTypeRep() that deals with this so you can call it from here.


This check seems unnecessary: we shouldn't call any of these functions for a dependent type, and it looks like we don't.


Should we support vector types here?


I don't understand this comment. char, signed char, and unsigned char are three different types, and you can certainly tell them apart if you want to, for example using BaseType->isSpecificBuiltinType(BuiltinType::SChar) or Context.hasSameType(BaseType, Context.SignedCharTy) to detect signed char.


This is !BaseType->isAnyPointerType().

What about block pointers? I think this patch is doing the right thing here, by saying that this only applies to pointers spelled with * (plain and obj-c pointers) and not to pointers spelled with ^, but it seems worth calling out to ensure that I'm not the only one who's thought about it :)


Type attributes are non-standard, so we can't answer this question by looking at the standard. Nonetheless, doing what getDecayedType does seems like the best choice.


This seems inconsistent: the specification says that std::decay_t produces std::remove_cv_t in this case, but your implementation of std::remove_cv_t doesn't remove restrict and this does. If this is intentional, I think it warrants a comment.


Should we even expose this trait in non-C++ languages? Making this a C++-only keyword seems better than making it a no-op in C.


This looks like it'll lose all qualifiers other than CVR qualifiers. Presumably that's not the intent. (This will also unnecessarily remove type sugar in some cases, but that's not super important here.) Please also don't use copy-list-initialization to form a QualType; see

See the suggested edits for an alternative, simpler formulation (that preserves as much type sugar as possible).


It doesn't look like this will properly remove qualifiers from within an array type in general, eg:

using T = const volatile int;
using U = T[3];
using V = __remove_cv(U);

Also, you can't use the opaque value of a Qualifiers object like this -- it's opaque, and is not always just a CVR mask, which is what the QualType constructor wants. This will assert or otherwise misbehave if there are non-CVR qualifiers on the type.

The simple thing to do would be:

Qualifiers Quals;
QualType Unqual = Context.getUnqualifiedArrayType(BaseType, Quals);
// ... remove stuff from Quals ...
QualType Result = Context.getQualifiedType(BaseType, Quals);

It'd be nice to leave the type sugar alone to the extent that we can, but it doesn't really matter too much given that this will usually be wrapped inside a type-sugar-erasing template anyway.


This is wrong: if, say, int and long are the same bit-width, this will give the same type for __make_unsigned(int) and __make_unsigned(long), where they are required to produce unsigned int and unsigned long respectively.

Please look at Context.getCorrespondingSignedType and Context.getCorrespondingUnsignedType to see how to do this properly; you may be able to call those functions directly in some cases, but be careful about the cases where we're required to return the lowest-rank int type of the given size. Note that getIntTypeForBitwidth does *not* do that; rather, it produces the *preferred* type of the given width, and for WebAssembly and AVR it produces something other than the lowest-rank type in practice in some cases.


As before, the opaque value is, well, opaque, and you shouldn't be using it in this way. Also, you just created Underlying so you know it's unqualified, so there's no need to ask for its qualifiers.

See suggested edit. Alternatively (#1), if you want to preserve all qualifiers:

Underlying = Context.getQualifiedType(BaseType.getQualifiers());

Alternatively (#2) we could strictly follow the standard wording and preserve only cv-qualifiers and not restrict. I think preserving all qualifiers is more in line with the intent, but preserving only const and volatile is probably the best match for what a C++ implementation of the type trait would do. *shrug*


If it's easy, can you undo the unrelated formatting changes in this file? That'll make future archaeology a little easier and help reduce the chance of merge conflicts. (Not a bit deal though.)


Do we have a corresponding test for the MS ABI?


Let's keep all the tests for this together -- please move these to namespace test55.


Is there a reason for this wrapper? You're directly using __is_same below, why not directly use __remove_pointer too?


I expected to see some tests for ObjC pointers in here, but I don't see any. I'd like to see __remove_pointer(id) and @class X; static_assert(__is_same(__remove_pointer(X*, X)));


Please use static_assert instead of faking it with an array.

5 ↗(On Diff #423716)

We generally name these tests for libstdc++ workarounds as libstdcxx_..._hack.cpp


Consider using the trait directly instead of wrapping it in an alias.


Please use static_assert.


This is should have some test coverage for restrict and for at least one non-standard qualifier. (Similarly for the remove_cvref and decay tests.)


You seem to be missing coverage that __remove_pointer(const int*) preserves the const.
You also seem to be missing coverage that this works on a cv-qualified pointer, eg __remove_pointer(int *const) -> int.


It seems strange to remove the address space qualifier here, given that this trait doesn't remove cv-qualifiers.


The tests you have here for references to pointers-to-members seem excessive (throughout); I don't think there's any reason to think that a reference to pointer-to-member would be so different from a reference to any other type that would justify this level of testing of the former case.

The cases that seem interesting here are the ones for non-referenceable types, but those are covered separately below.


It's weird that we permit an abominable function type to show up in the argument of a trait during template instantiation but not when written directly. That behavior should really be consistent, and I think we should consistently allow it. That is, we should allow:

static_assert(__is_same(__add_pointer(int () &), int () &));

... just like we allow it if you hide the usage of the traits inside a template. That'd make this easier to test, too: you can then put these tests with the other tests for the corresponding traits. (This should only require passing DeclaratorContext::TemplateArg to ParseTypeName instead of the default of DeclaratorContext::TypeName in the places where we parse type traits, reflecting that we want to use the rules for template arguments in these contexts.) This would also fix an incompatibility between GCC and Clang:

That said, I don't think we need to block this patch on this change, so if you'd prefer to not do this now, that's fine :)


This is the wrong result type, it should be long long.


It'd be useful to test enums with different underlying types. However, this test is not portable: if short and int are the same size, this is required to produce short, not int. It'd be good to have some test coverage of that quirk too. Perhaps this is easiest to see with a test like:

enum E : long long {};
static_assert(__is_same(__make_signed(E), long));

... which should hold in cases where long and long long are the same size and are larger than int.


Please also test the case where the qualifiers are attached outside the array type:

using T = int[1][2];
static_assert(__is_same(__remove_all_extents(const T), const int), "");
cjdb updated this revision to Diff 429331.Fri, May 13, 12:56 PM
cjdb marked 39 inline comments as done.

responds to the vast majority of open feedback


Comment added, but I'd prefer to keep the function call so it keeps the logic self-contained and named.


Is it a conforming extension for make_signed_t<int __attribute__((ext_vector_type(2)))> to work?


Done, and added a test.


I think this is a no-op?


This makes me very happy.


As with decay, I think it's best if we pretend restrict is a standard qualifier, either for everything or for nothing. However, I don't think that restrict is a valid contender here. Isn't it only applicable to pointers, which aren't transformed by this pair?


git-clang-format seems to insist these are formatted despite the lines not being touched. I'm not fond of arguing with clang-format.


Per offline discussion, I think this is no longer being requested?


The rationale here is that we can assign __remove_pointer, etc., to a using declaration, which is how they're exposed to users.


Not knowing a lick of ObjC this was a fun one to get working (I didn't know that @class X had to be global).


Same as in


Does int __attribute__((address_space(1))) make sense as a type? I thought it had to be a pointee.


I've removed all but one pointer-to-member function.


Gonna defer to a separate CL, but will get it done one day soon :-)


I agree, but what about when long is smaller than int?

cjdb added inline comments.Fri, May 13, 12:58 PM

Bleh, this is impossible. What I meant to ask is what happens when long and int are the same size, I think?

cjdb updated this revision to Diff 429336.Fri, May 13, 1:08 PM


cjdb updated this revision to Diff 429476.Sat, May 14, 11:51 AM
  • fixes wchar_t (wasn't supported) and charX_t (all three went to an 8-bit type)
  • makes underlying type for signed wchar_t int rather than wchar_t
  • adds tests for these types