This is an archive of the discontinued LLVM Phabricator instance.

[clang] adds unary type transformations as compiler built-ins

Authored by cjdb on Dec 22 2021, 9:53 PM.




  • __add_lvalue_reference
  • __add_pointer
  • __add_rvalue_reference
  • __decay
  • __make_signed
  • __make_unsigned
  • __remove_all_extents
  • __remove_extent
  • __remove_const
  • __remove_volatile
  • __remove_cv
  • __remove_pointer
  • __remove_reference
  • __remove_cvref

These are all compiler built-in equivalents of the unary type traits
found in [[meta.trans]][1]. The compiler already has all of the
information it needs to answer these transformations, so we can skip
needing to make partial specialisations in standard library
implementations (we already do this for a lot of the query traits). This
will hopefully improve compile times, as we won't need use as much
memory in such a base part of the standard library.


Co-authored-by: zoecarver

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
rsmith added inline comments.Apr 25 2022, 5:05 PM

Consider using the trait directly instead of wrapping it in an alias.


This is should have some test coverage for restrict and for at least one non-standard qualifier. (Similarly for the remove_cvref and decay tests.)


You seem to be missing coverage that __remove_pointer(const int*) preserves the const.
You also seem to be missing coverage that this works on a cv-qualified pointer, eg __remove_pointer(int *const) -> int.


It seems strange to remove the address space qualifier here, given that this trait doesn't remove cv-qualifiers.


The tests you have here for references to pointers-to-members seem excessive (throughout); I don't think there's any reason to think that a reference to pointer-to-member would be so different from a reference to any other type that would justify this level of testing of the former case.

The cases that seem interesting here are the ones for non-referenceable types, but those are covered separately below.


It's weird that we permit an abominable function type to show up in the argument of a trait during template instantiation but not when written directly. That behavior should really be consistent, and I think we should consistently allow it. That is, we should allow:

static_assert(__is_same(__add_pointer(int () &), int () &));

... just like we allow it if you hide the usage of the traits inside a template. That'd make this easier to test, too: you can then put these tests with the other tests for the corresponding traits. (This should only require passing DeclaratorContext::TemplateArg to ParseTypeName instead of the default of DeclaratorContext::TypeName in the places where we parse type traits, reflecting that we want to use the rules for template arguments in these contexts.) This would also fix an incompatibility between GCC and Clang:

That said, I don't think we need to block this patch on this change, so if you'd prefer to not do this now, that's fine :)


This is the wrong result type, it should be long long.


It'd be useful to test enums with different underlying types. However, this test is not portable: if short and int are the same size, this is required to produce short, not int. It'd be good to have some test coverage of that quirk too. Perhaps this is easiest to see with a test like:

enum E : long long {};
static_assert(__is_same(__make_signed(E), long));

... which should hold in cases where long and long long are the same size and are larger than int.


Please also test the case where the qualifiers are attached outside the array type:

using T = int[1][2];
static_assert(__is_same(__remove_all_extents(const T), const int), "");
cjdb updated this revision to Diff 429331.May 13 2022, 12:56 PM
cjdb marked 39 inline comments as done.

responds to the vast majority of open feedback


Comment added, but I'd prefer to keep the function call so it keeps the logic self-contained and named.


Is it a conforming extension for make_signed_t<int __attribute__((ext_vector_type(2)))> to work?


This makes me very happy.


As with decay, I think it's best if we pretend restrict is a standard qualifier, either for everything or for nothing. However, I don't think that restrict is a valid contender here. Isn't it only applicable to pointers, which aren't transformed by this pair?


Done, and added a test.


I think this is a no-op?

54 ↗(On Diff #423716)

git-clang-format seems to insist these are formatted despite the lines not being touched. I'm not fond of arguing with clang-format.

1113 ↗(On Diff #423716)

Per offline discussion, I think this is no longer being requested?

7 ↗(On Diff #423716)

The rationale here is that we can assign __remove_pointer, etc., to a using declaration, which is how they're exposed to users.

12–46 ↗(On Diff #423716)

Not knowing a lick of ObjC this was a fun one to get working (I didn't know that @class X had to be global).


Same as in


Does int __attribute__((address_space(1))) make sense as a type? I thought it had to be a pointee.


I've removed all but one pointer-to-member function.


Gonna defer to a separate CL, but will get it done one day soon :-)


I agree, but what about when long is smaller than int?

cjdb added inline comments.May 13 2022, 12:58 PM

Bleh, this is impossible. What I meant to ask is what happens when long and int are the same size, I think?

cjdb updated this revision to Diff 429336.May 13 2022, 1:08 PM


cjdb updated this revision to Diff 429476.May 14 2022, 11:51 AM
  • fixes wchar_t (wasn't supported) and charX_t (all three went to an 8-bit type)
  • makes underlying type for signed wchar_t int rather than wchar_t
  • adds tests for these types
cjdb updated this revision to Diff 436838.Jun 14 2022, 10:02 AM
cjdb marked 2 inline comments as done.

rebases to head, adds a comment to make something clearer

1750 ↗(On Diff #423716)

The parser needs to treat trait< as an identifier to accommodate libstdc++'s usage of a few of these as alias templates. I've added a comment to explain why in the code.


Gentle ping.


Gentle ping.

aaron.ballman added inline comments.Jun 14 2022, 10:13 AM

Does int attribute((address_space(1))) make sense as a type? I thought it had to be a pointee.

It doesn't have to be on a pointer or a pointee type.

rsmith added inline comments.Jun 14 2022, 1:58 PM

This change seems surprising. Can you explain what's going on here?


If we now allow unsigned types to be passed in here, can we do so consistently?

This seems to do the wrong thing for a vector of scoped enumeration type. Is that a problem for the builtins?


(Continuation of suggested edits in comment a few lines below.)


Perhaps we can handle all the plain unsigned types (other than wchar_t and char, which are special) here? This would also cover the fixed-point types.


Similar comments for this function as previous one.


Remove this line too. No point building an RAII scope with nothing in it.

1750 ↗(On Diff #423716)

I agree we need to treat trait< as an identifier and trait( as the builtin. My question is, why do we have inconsistent treatment of the remaining cases (trait followed by any other token)? For example, MaybeParseTypeTransformTypeSpecifier treats trait + 1 as an identifier. But this code treats it as the builtin name.

I think there are two choices that make the most sense, if they work:

  1. trait( is the builtin and any other utterance is an identifier, or
  2. trait< is an identifier, using trait = is an identifier, and any other utterance is the builtin.

I think I prefer option (2), given that it's defining the narrower special case. But all I'm really looking for here is a consistent story one way or the other, if it's feasible to have one.


I don't think this comment pulls its weight any more, now that you're calling a named function for this check. Maybe just remove it?


Yes, it's a conforming extension compared to C++20; the standard places no restrictions on what a program that uses a vendor extension type such as a vector type does.

On reflection, I'm not sure it would be a conforming extension to OpenCL (in which such vector types exist as part of the language standard -- in particular, ext_vector_type is intended to follow the OpenCL rules). So let's not support such things until / unless we get some direction from the OpenCL folks.


With the changes suggested for getCorresponding*Type, I think we can remove the BitInt special case here.


The comment on this producing the wrong underlying type for enums in some cases doesn't seem to be addressed. You need to produce the lowest-rank type of the given size, which is not what getIntTypeForBitwidth does.


You're splitting the type into unqualified type and quailfiers twice here, once in the call to getQualifiers and again in getSplitUnqualifiedType. It'd be better to only split it once.


Do we really want to remove the address space here? The libc++ and libstdc++ implementations of the trait don't do that ( and it makes the behavior of __remove_pointer inconsistent with the behavior of __add_pointer. Can we just produce the pointee type intact, including all of its qualifiers?


It's not correct to treat a Qualifiers opaque value as if it has meaning for anything other than converting back from an opaque value. The second argument here is a cvr mask, which is not the same thing.

Use Context.getQualifiedType to combine a type with some qualifiers.


As above, you can't use getAsOpaqueValue like this, and should avoid splitting the type twice.

Note that the pattern used here is only correct because we know that Underlying cannot be an array type.


Why "in C++"? It looks like it does this in C too, which is probably appropriate. However, this is a divergence from what std::decay_t does in libc++ and libstdc++, where it does not remove __restrict.


Looks like you forgot to remove the old code when adding the new code?


As before, the way you were splitting and reconstructing types wasn't correct.


LLVM style is to restrict the scopes of variables as much as possible, and there's no point splitting the type if we're not going to use the result.

rsmith added inline comments.Jun 14 2022, 2:07 PM

Specifically, the rule you need to implement for enums is that the resulting type "names the [un]signed integer type with smallest rank (6.8.5) for which sizeof(T) == sizeof(type)". So the first of [un]signed char, [un]signed short, [un]signed int, [un]signed long, [un]signed long long that is the same size as the enum, and if none of those work then an extended integer type.

For example, if int and long are the same size, then given enum E : long {};, __make_unsigned(E) is required to produce unsigned int, not unsigned long. This is not what getIntTypeForBitwidth does in general; rather, it answers the target-specific question of "which integer type would you like to be used for an integer that is N bits wide?".

cjdb updated this revision to Diff 439202.Jun 22 2022, 4:20 PM
cjdb marked 20 inline comments as done.

applies Richard's feedback (or most of it)


Motivated by __make_signed(wchar_t) previously returning wchar_t, and that it was seemingly inconsistent with getUnsignedWCharType below.


I think not, given we're not worrying about vector types right now?


Can we get rid of this function entirely in that case?

1750 ↗(On Diff #423716)

I'd like for it to be (2) as well, but based on how we do alias-declarations, I'm concerned that will have performance implications for a rare occurrence.


That sentence is specifically calling out that __restrict is being removed in C++ mode too. Tightened up the wording. I'm partial to __decay removing __restrict because it would be what language decay does.

cjdb updated this revision to Diff 439953.Jun 24 2022, 10:24 PM

rebases onto main

cjdb updated this revision to Diff 440078.Jun 26 2022, 10:25 AM


rsmith added inline comments.Jul 7 2022, 1:47 PM

It looks to me like this breaks the implementation of a GCC compatibility feature. In GCC 8 and before, signed wchar_t is [accepted and means wchar_t]( and similarly unsigned wchar_t is accepted and means unsigned int, and Clang provides that behavior here; with this change, signed wchar_t will resolve to int instead in Clang, deviating from GCC's behavior for this GCC compatibility feature.

I think I'd be happy to see that GCC compatibility feature removed entirely, especially given that GCC 9 onwards removed it, it's an error by default in Clang, and essentially no-one seems to be using it, but that should not be done in this patch, and we shouldn't change its behavior here either.


This isn't the correct behavior for make_unsigned. We need to pick the integer type with the lowest rank with the same size as the enumeration (see the corresponding wording). For example, if the underlying type of E is long and sizeof(long) == sizeof(int), then make_unsigned<E> is required to produce unsigned int not unsigned long.

It might be that the best approach here is for BuiltinChangeSignedness to only call getCorrespondingUnsignedType when T is a signed or unsigned integral type (notably excluding char16_t and friends -- see next comment -- but including _BitInt), and otherwise for it to look through the signed / unsigned integer types for the first one with the same size as T and pick that.


Mapping char16_t -> unsigned short seems target-specific to me. If plain unsigned char is 16 bits wide, then make_unsigned<char16_t> should be unsigned char. Same for char32_t, wchar_t.


We don't need both of these. Just the llvm_unreachable would suffice.

1755–1756 ↗(On Diff #440078)

Is it feasible to also produce this warning for the corresponding case in MaybeParseTypeTransformTypeSpecifier / the callers of that function?


The name that's wanted by BuildPointerType is the name of the pointer variable, not the name of the type the pointer points to. Eg, for int &*p; it wants to say "p declared as a pointer to a reference". Passing in the name of the base type will mean that __add_pointer(class X&) could produce diagnostics like "X declared as a pointer to a reference", which is nonsense. Fortunately, that diagnostic is impossible here because we never pass it a reference, so you should be able to just pass a null DeclarationName instead.


BuildPointerType can fail and return a null QualType; you'll need to detect that and do something different here -- either return the error to the caller or fall back to the original type for error recovery.


No need to add Pointee's qualifiers to Pointee; it already has them.


As above, this isn't the right name to pass, and we should not pass any name given that we don't have one.


BuildReferenceType can fail and return a null QualType, and that will need to be handled here.


See comments in getCorrespondingUnsignedType: these don't perform the slightly odd "lowest rank type with this size" computation specified in the standard, in the case where the given type is an enum or character type.


I'm still seeing 39 pointer-to-member tests here in check_remove_cvref, 39 in check_decay, and 25 in check_remove_reference, all of which seem to be testing the same thing (that these traits don't look "inside" a pointer-to-member's pointee type in any way). Can those be cut down a bit?

cjdb marked 13 inline comments as done.Jul 30 2022, 11:47 AM
cjdb added inline comments.
1755–1756 ↗(On Diff #440078)

Will this risk sending out the warning twice, and if so, can the diagnostic engine handle that?

cjdb marked an inline comment as done.Jul 30 2022, 11:48 AM
cjdb updated this revision to Diff 449184.Aug 1 2022, 10:26 PM
cjdb edited the summary of this revision. (Show Details)

responds to Richard's feedback

I had accidentally uploaded this to D116280, apologies if there was any confusion.

cjdb updated this revision to Diff 450586.Aug 6 2022, 9:37 PM

rebases to ToT

rsmith accepted this revision.Aug 8 2022, 2:52 PM

A handful more comments, but I think we've basically converged here. I'm happy to take another look after you address these if you'd like (or you could ask someone else for a final pass), or for you to land this once you're happy.

Before landing, it'd be good to patch libc++ to use these intrinsics and run its testsuite to look for any unexpected behavior changes, if you've not already done so with this version of the patch.


I think we don't need this case any more.


This comment still needs to be addressed.

1066–1067 ↗(On Diff #450586)

Just curious: why do we only handle six of the traits here, rather than all of them?


I've been pondering the proper behavior for make_signed on enum E : _BitInt(N) {}. I think the standard wording here is broken (see my email to the lib reflector, subject "make_signed / make_unsigned and padding / extended integer types"), but I'm not certain what the right rule is. Let's say that always producing a _BitInt type in that case is good enough for now. :)


Have you considered moving the calls to getUnaryTransformType into this function? Right now they're scattered among the Builtin* helper functions, and it's hard to be sure that every code path calls it. I think it'd be a lot simpler to create the UnaryTransformType wrapper once, in a single place.


Remove the static -- it's not correct to use a static local for this. There can be multiple Sema instances, and this will pick the types from whichever one was used the first time this function was called.


We don't have an __int128 on 32-bit targets. Maybe below you can form an ArrayRef and slice off the last element if __int128 is unsupported?


Please only compute the size of BaseType once.


Can this fail for an enum whose underlying type is _BitInt(N) for some unusual N?

cjdb updated this revision to Diff 451918.Aug 11 2022, 10:59 AM
cjdb marked 10 inline comments as done.

responds to @richardsmith's feedback

cjdb added a comment.Aug 11 2022, 11:02 AM

Thanks for patiently reviewing! I'll do the libc++ stuff and a pilot run in some code. Hopefully can get it merged tomorrow :-)


Could've sworn I did this one, but it might've gotten mixed up in the D116280 switcheroo.

1750 ↗(On Diff #423716)

You haven't countered this concern, so I'm going to close for now and will happily revisit if this one just slipped by unnoticed.

1066–1067 ↗(On Diff #450586)

Why indeed. Perhaps these were the original six I worked on and didn't come back to update this.


Nice catch. This wasn't supposed to happen, but I ended up changing the logic so that _BitInt is considered here instead.

cjdb updated this revision to Diff 452001.Aug 11 2022, 3:22 PM

tl;dr fixes corner-case bug uncovered by libc++

I'd accidentally inverted the logic for when __int128 and unsigned __int128 would be available for underlying enum types. This led to the types being swapped (??) and there wasn't a test to catch it.

Pre-commit CI found build errors that should be addressed.

1–2 ↗(On Diff #452001)

Have to fix the formatting manually though.


Errr, this isn't Python, so that [-1] terrifies me. It took me a good solid ten minutes to convince myself this was actually valid. Any interest in something along the lines of this form?


I believe you can.


An interesting test case for you to consider (both for enumeration underlying types as well as types in general): signed _BitInt requires at least two bits, so trying to do a make_signed on _BitInt(1) should be diagnosed.

cjdb updated this revision to Diff 452259.Aug 12 2022, 11:46 AM
cjdb marked 5 inline comments as done.

applies Aaron's feedback

cjdb marked an inline comment as done.Aug 12 2022, 12:04 PM
cjdb added inline comments.

I have many questions about the decisions I made in this snippet.


It looks like other things also have this as just empty, so maybe it's best to keep it?

aaron.ballman accepted this revision.Aug 12 2022, 12:16 PM

LGTM, though I did have a nit about diagnostic wording that you can feel free to take or leave at your discretion. Thank you for this work, this was a very large patch with a whole lot of review comments, but I think what we ended up with is really great!


It was definitely a great way to wake up this morning. "Wait, what? WHAT?!?" followed by a bunch of furious standards reading. :-D


That's fine by me!


This diagnostic is a bit unfortunate because it was given a non-bool integer, so I don't know that users will know how to fix the issue from the text. I leave it to your discretion where to split that diagnostic up with a %select so that _BitInt is special or not; I don't think users will hit this diagnostic all that often, but I do have a little bit of a worry about template metaprogramming accidentally getting into this state.

cjdb updated this revision to Diff 452285.Aug 12 2022, 1:06 PM

adds a clearer diagnostic for unsigned _BitInt(1)

cjdb updated this revision to Diff 452535.Aug 14 2022, 10:11 AM

moves TransformTypeTraits.def from clang/AST to clang/Basic

There are dowstream issues with having this definition file in the
former, because of the Bazel rules describing the relationship
between Basic and AST. Since Basic is the base and moving it there
is an NFC, I've taken the initiative to move it ahead of downstrem
usage. (Updating changelist for transperancy, but still merging
with main today.)

This revision was landed with ongoing or failed builds.Aug 14 2022, 10:34 AM
This revision was automatically updated to reflect the committed changes.
thakis added a subscriber: thakis.Aug 14 2022, 11:14 AM

This breaks building on windows:

Please take a look and revert for now if it takes a while to fix.

This also broke building with GCC 9 on Ubuntu 20.04:

[5/225] ASTNodeAPI.json
FAILED: tools/clang/lib/Tooling/ASTNodeAPI.json
cd /home/martin/code/llvm-project/llvm/build/tools/clang/lib/Tooling && /home/martin/code/llvm-project/llvm/build/bin/clang-ast-dump --skip-processing=0 -I /home/martin/code/llvm-project/llvm/build/lib/clang/16.0.0/include -I /home/martin/code/llvm-project/llvm/tools/clang/include -I /home/martin/code/llvm-project/llvm/build/tools/clang/include -I /home/martin/code/llvm-project/llvm/build/include -I /home/martin/code/llvm-project/llvm/include -I /usr/include/c++/9 -I /usr/include/x86_64-linux-gnu/c++/9 -I /usr/include/c++/9/backward -I /usr/lib/gcc/x86_64-linux-gnu/9/include -I /usr/local/include -I /usr/include/x86_64-linux-gnu -I /usr/include --json-output-path /home/martin/code/llvm-project/llvm/build/tools/clang/lib/Tooling/ASTNodeAPI.json
clang-ast-dump: ../tools/clang/lib/Sema/DeclSpec.cpp:833: bool clang::DeclSpec::SetTypeSpecType(clang::DeclSpec::TST, clang::SourceLocation, const char*&, unsigned int&, const clang::PrintingPolicy&): Assertion `!isDeclRep(T) && !isTypeRep(T) && !isExprRep(T) && "rep required for these type-spec kinds!"' failed.
Aborted (core dumped)

Reverted this (and follow-ups) in aacf1a9742f714dd432117d82d19a007289c3dee for now.

cjdb reopened this revision.Aug 21 2022, 6:04 PM
This revision is now accepted and ready to land.Aug 21 2022, 6:04 PM
cjdb updated this revision to Diff 454350.Aug 21 2022, 6:06 PM

changes so this patch doesn't break things:

  1. changes from begin/end to data/size
  2. changes from ArrayRef to array since an ArrayRef to a prvalue is undefined (and causes segfaults when building with GCC)
This revision was landed with ongoing or failed builds.Aug 21 2022, 8:04 PM
This revision was automatically updated to reflect the committed changes.

I believe this patch might have broken LLDB bots

I've confirmed that d2d77e050b32 is good while e9ef45635b77 (this commit) breaks some tests.

To repro, a build of llvm+lldb+libcxx+libcxxapi and

# inside build dir
/bin/llvm-lit -v tools/lldb/test/API/lang/objc/modules-objc-property/

Alternatively, running the check-lldb target should do it. But there is one other unrelated failure that might show up in that target

Forgot to add the failure message:

/usr/include/c++/v1/experimental/propagate_const:138:11: error: no template named 'remove_reference_t'; did you mean 'remove_reference'?
  typedef remove_reference_t<decltype(*_VSTD::declval<_Tp&>())> element_type;
cjdb added a comment.Aug 22 2022, 9:38 AM

Thanks for flagging this. Based on the fact it only seems to be remove_reference_t, I'm wondering if this is an issue with D131732 (which was merged immediately after D116203).

Thanks for flagging this. Based on the fact it only seems to be remove_reference_t, I'm wondering if this is an issue with D131732 (which was merged immediately after D116203).

I'm not sure, to expand on my previous message, I tested these two commits which appear one right after the other in the log:

  • e9ef45635b77 (14 hou..) cjdb@g.. [clang] adds unary type transformations as compiler built-ins
  • d2d77e050b32 (15 hou..) Ting.W.. [PowerPC][Coroutines] Add tail-call check with call information for coroutines

The tests pass on d2d77e050b32 and fail on e9ef45635b77

@cjdb Would you mind reverting the patch until we figured out a solution to unblock the CI?

cjdb added a comment.EditedAug 22 2022, 11:29 AM

@cjdb Would you mind reverting the patch until we figured out a solution to unblock the CI?

Sure, go ahead. I'm not even able to get to a point where I can repro the issue, as I get the following:

Command Output (stdout):
lldb version 16.0.0git ( revision 11ce014a121ed0bbdbb2af6ea20c75f85e89fbe9)
  clang revision 11ce014a121ed0bbdbb2af6ea20c75f85e89fbe9
  llvm revision 11ce014a121ed0bbdbb2af6ea20c75f85e89fbe9
Unable to load lldb extension module.  Possible reasons for this include:
  1) LLDB was built with LLDB_ENABLE_PYTHON=0
  2) PYTHONPATH and PYTHONHOME are not set correctly.  PYTHONHOME should refer to
     the version of Python that LLDB built and linked against, and PYTHONPATH
     should contain the Lib directory for the same python distro, as well as the
     location of LLDB's site-packages folder.
  3) A different version of Python than that which was built against is exported in
     the system's PATH environment variable, causing conflicts.
  4) The executable '/tmp/lldb-breaks/bin/lldb' could not be found.  Please check 
     that it exists and is executable.

Command Output (stderr):
Traceback (most recent call last):
  File "/home/cjdb/projects/llvm-bisect/lldb/test/API/", line 7, in <module>
  File "/home/cjdb/projects/llvm-bisect/lldb/packages/Python/lldbsuite/test/", line 892, in run_suite
    import lldb
ModuleNotFoundError: No module named 'lldb'

1 and 4 seem to be fine.

cjdb added a comment.Aug 22 2022, 2:43 PM

I think there's a forward fix available, but since I can't repro locally, I'm going to need to push it to main and observe how the tools fare.

I think there's a forward fix available, but since I can't repro locally, I'm going to need to push it to main and observe how the tools fare.

I need to be AFK for a bit now, but I can test locally on my end if you upload the patch here

This broke us in emscripten as well (building with trunk clang against a recent-but-not-trunk version of libcxx). I can test the fix if you want.

cjdb added a comment.Aug 22 2022, 4:17 PM

e137fb6fb85 should fix this issue. Thanks for your patience!