This is an archive of the discontinued LLVM Phabricator instance.

Add new 'preferred_name' attribute.
ClosedPublic

Authored by rsmith on Nov 11 2020, 5:18 PM.

Details

Summary

This attribute permits a typedef to be associated with a class template
specialization as a preferred way of naming that class template
specialization. This permits us to specify that (for example) the
preferred way to express 'std::basic_string<char>' is as 'std::string'.

The attribute is applied to the various class templates in libc++ that have
corresponding well-known typedef names.

Diff Detail

Event Timeline

rsmith created this revision.Nov 11 2020, 5:18 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptNov 11 2020, 5:18 PM
Herald added a reviewer: Restricted Project. · View Herald Transcript
Herald added a subscriber: jdoerfert. · View Herald Transcript
rsmith requested review of this revision.Nov 11 2020, 5:18 PM

I really like this attribute, thank you for working on this!

clang/include/clang/Basic/Attr.td
2367

This seems like one we should exempt from C code, WDYT? If you agree, you can change it to Clang<"preferred_name", /*AllowInC*/ 0>

2369

Would it make sense for this to be a variadic parameter of type arguments (with a constraint that at least one type be named)? This way you can write: [[clang::preferred_name(string, wstring)]] instead of [[clang::preferred_name(string), clang::preferred_name(wstring)]]

clang/include/clang/Basic/AttrDocs.td
4398

May want to be clear that this includes alias declarations as well as typedefs (the example helps though). Also, you should mention that the typedef cannot include qualifiers.

clang/lib/AST/TypePrinter.cpp
1353

const auto *?

clang/lib/Sema/SemaDeclAttr.cpp
1394

Given that this is checking properties of the type used in the attribute, I wonder if it makes sense to note the original declaration. I left a note on a test case below about this.

clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
557

Can you fix these two lint warnings?

clang/test/SemaTemplate/attributes.cpp
79

This looks like a pretty reasonable declaration but only fails because of an issue with the declaration of Z -- should we point out more explicitly that the qualifier on the alias declaration is the issue?

rsmith updated this revision to Diff 304984.Nov 12 2020, 3:28 PM
rsmith marked 5 inline comments as done.
  • Address review comments.
  • Extend libc++ changes to cover <regex>.
  • Bugfixes from further testing.
rsmith updated this revision to Diff 304985.Nov 12 2020, 3:29 PM
  • Remove debugging code.
rsmith added inline comments.Nov 12 2020, 3:39 PM
clang/include/clang/Basic/Attr.td
2367

Done. For my information, what difference does this make? The Subjects list already means that it can't ever be applied in C -- does this effectively exclude it from __has_c_attribute (and maybe change the diagnostic on use), or does it go deeper than that?

2369

I did think a little bit about that, and I don't mind going in that direction. I have only very weak justifications for the current approach:

  • I think we would want to allow the attribute to be repeated and accumulate regardless (see the <iosfwd> + <string> example for std::basic_string in libc++), and I would prefer to not have two different syntaxes for the same thing.
  • The separate attributes give us some room for extensibility in the future, by adding new optional arguments, though I don't have a great example for such a possible extension.
  • Implementation simplicity: when instantiating a template with preferred_name(X), we either drop the attribute or retain it depending on whether the name is X, and if there were multiple names, we'd need to do something more sophisticated. (Also the trivial simplicity of only iterating over one list of types, rather than a list of lists.)
  • I expect users of this attribute to be very rare (libc++, maybe a couple of types in Abseil, maybe a handful of types in boost, probably not much else), so optimizing for them may not be worth any added complexity, and I wouldn't expect any significant difference in parse time / memory usage / etc from the combined model versus the separate model.

On balance I weakly prefer the one-type-per-attribute model, but I'm happy to let you make the call.

2372

Incidentally, this flag appears to be backwards? If set to 1, we instantiate the attribute with the declaration; if set to 0 we instantiate only with the definition.

rsmith marked an inline comment as not done.Nov 12 2020, 3:40 PM
aaron.ballman accepted this revision.Nov 13 2020, 5:15 AM

The attribute parts LGTM, but I did have a question about the libc++ parts.

clang/include/clang/Basic/Attr.td
2367

You got the important points -- it changes the feature testing macro behavior and impacts diagnostics, but otherwise, not a whole lot of functional difference.

2369

On balance I weakly prefer the one-type-per-attribute model, but I'm happy to let you make the call.

That all sounds like compelling rationale to me, so I'm fine with the current approach. The biggest benefit I can see to the other approach (aside from shorter code) is that it feels like it would be slightly faster to parse in these very commonly included header files, but that may wash out in the extra work to instantiate the template anyway.

2372

Huh, that does seem backwards. I'll investigate and maybe swap the polarity if it seems sensible, thanks for mentioning it!

libcxx/include/iosfwd
188

We always define _LIBCPP_PREFERRED_NAME so is this actually needed?

245

Same question here.

Quuxplusone added inline comments.
libcxx/include/regex
2520

Why does this attribute go on the class template? Shouldn't it be an attribute on the typedef, so that you don't have to repeat yourself? I mean, I'd much rather see

template<class T> class BasicFoo { };
using [[preferred]] Foo = BasicFoo<A>;
using [[preferred]] WFoo = BasicFoo<B>;

than

template<class> class BasicFoo;
using Foo = BasicFoo<A>;
using WFoo = BasicFoo<B>;
template<class T> class [[preferred(Foo, WFoo)]] BasicFoo { };

The latter repeats every identifier one extra time, compared to the former.

And then, in fact, couldn't you go one step further and say that typedefs in the same scope as the class template itself should always implicitly have this attribute? Even if the attribute doesn't appear in the source code, we still want to print basic_regex<char> as regex, right? It shouldn't cost any additional work for the compiler to figure that out.

rsmith updated this revision to Diff 305222.Nov 13 2020, 11:27 AM
  • Properly disable redundant redeclarations if preferred_name attribute is
rsmith marked 2 inline comments as done.Nov 13 2020, 11:28 AM
rsmith added inline comments.
libcxx/include/iosfwd
188

Thanks, I was trying to avoid the redundant redeclarations when the attribute is unavailable, but clearly this doesn't do that! Fixed.

libcxx/include/regex
2520

I don't think we want arbitrary typedefs to be able to declare themselves to be names for the class template after the fact: this is a decision that the (author of the) template itself should be making, not a decision for the (author of the) typedef. Also, we need the information when looking at the class template, not when looking at the typedef, so attaching it to the typedef would require us to internally attach it to the class template instead anyway, and generally it's undesirable and problematic to mutate an existing declaration -- it's much cleaner to introduce new properties of a declaration by redeclaring it. It's certainly not ideal that this results in extra forward declarations in some cases, but given that this attribute is only expected to be written a few dozen times in total, it doesn't seem worth worrying about too much.

I don't think we want typedefs to automatically get this behavior, even if they're in the same namespace. I think it would be problematic if, for example:

namespace absl {
  template<typename T> struct basic_string_view;
  using format_arg = basic_string_view<char>;
  std::string format(format_arg, ...);
}

... resulted in absl::string_view sometimes displaying as the (to a user) unrelated type absl::format_arg depending on (presumably) #include order, and I don't think we want to be in the business of inventing heuristics to pick the "right" typedef.

aaron.ballman added inline comments.Nov 16 2020, 6:38 AM
libcxx/include/regex
2520

I don't think we want arbitrary typedefs to be able to declare themselves to be names for the class template after the fact: this is a decision that the (author of the) template itself should be making, not a decision for the (author of the) typedef.

+1

libcxx/include/regex
2520

generally it's undesirable and problematic to mutate an existing declaration -- it's much cleaner to introduce new properties of a declaration by redeclaring it

Well, I can't argue with that one. :)

I see your point about the author of the typedef vs. the author of the template, but actually I'm ambivalent about your format_arg example. I could vaguely imagine that someone might want to see absl::format_arg in an error message, in the context of a call to absl::format, at least. But, I can see the point that hypotheticals aren't relevant; this attribute is specifically tailored for the STL's idiom of "class template with a bunch of closely associated typedefs," and the STL is monolithic enough that the class template always knows the exact names of those typedefs (I mean, in the STL, it's reasonable for a human to say that basic_string_view is string_view, in a way that it is not format_arg)... plus what you said above about Clang's internal implementation; so that's a good reason to stick with putting the attribute on the template.

What the attribute achieves is great, however I must say I'm really with Arthur's original comment regarding the ergonomics of it.

IMO, it makes a lot more sense to permit the typedef author to pick how their typedef is going to be "named" by the compiler. If they pick something crazy or misleading, it's really their problem.

I think that the fact we need to re declare everything shows how the ergonomics would be better if we could just add the attribute to the typedef. See for example how we're re-declaring a bunch of stuff in <iosfwd>. How bad of an idea is it to try putting the attribute on typedefs instead, and how strongly are you opposed to it? Because from a naive user perspective, having to redeclare the class with a closed set of preferred names feels awkward (IMO, of course).

libcxx/include/__config
1334

Can you please indent inside the #if?

rsmith marked an inline comment as done.Nov 16 2020, 5:23 PM

What the attribute achieves is great, however I must say I'm really with Arthur's original comment regarding the ergonomics of it.

I do agree, the ergonomics aren't good. Due to the way that instantiations of attributes on templates work, we end up needing to see the attribute no later than on the definition of the template, which forces the template to be declared at least twice, and similarly the typedef needs to be named at least twice (once to declare it, and once to annotate it on the template). It would certainly be more convenient to attach the attribute to the typedef.

IMO, it makes a lot more sense to permit the typedef author to pick how their typedef is going to be "named" by the compiler.

Well, this attribute doesn't affect how the typedef is named -- if we know the type was named via a typedef, we'll use that typedef-name to describe the type regardless of this attribute. For example, if you have using Name = std::basic_string<char>; and then use the type Name, we're going to call the type Name everywhere we have tracked enough information to know that you used a typedef. Rather, the attribute affects how a template specialization is named in the case where we can't track it back to a typedef or other type sugar -- for example, if in a use of Name we lose the association back to the typedef Name and only know that it's std::basic_string<char>, we'll call it std::string instead. So this really changes a property of the template (or more accurately, a specialization of the template), not a property of the typedef.

I think allowing the attribute on a typedef would be misleading: people would, quite reasonably, expect that they can specify the attribute on an arbitrary typedef and that typedef name would be preferred when printing that type in general. We could in principle support such a generalized attribute on typedefs, but I don't think that would be a good idea. Such an attribute would be very complex to support (primarily due to lazy loading of module files), and I'd expect it to be heavily abused, with people "renaming" built-in types, pointer types, etc. in ways that are unhelpful to users of their code.

If they pick something crazy or misleading, it's really their problem.

I don't entirely agree with this part -- such misuse would be a problem for anyone who uses the header containing the typedef and now sees type names in diagnostics that have nothing to do with the code in question. Nonetheless, misuse is the problem of the people misusing the attribute and their users; it's not necessarily our problem, and we don't necessarily need to design an attribute that can't be misused.

I think that the fact we need to re declare everything shows how the ergonomics would be better if we could just add the attribute to the typedef. See for example how we're re-declaring a bunch of stuff in <iosfwd>. How bad of an idea is it to try putting the attribute on typedefs instead, and how strongly are you opposed to it? Because from a naive user perspective, having to redeclare the class with a closed set of preferred names feels awkward (IMO, of course).

How much should we concern ourselves with ergonomics in this instance? Most of the intended uses of this attribute (at least by how often they'll affect diagnostics) are included in this patch, and are in code that we don't expect many people to look at most of the time, that is maintained by domain experts, and that is generally optimized for other properties than readability. However, this is certainly intended to be used by third-party library authors; otherwise we could just add a hard-coded table to Clang itself. So I think ergonomics do matter a little, but that other factors are probably more important.

We could allow the attribute on typedefs, and internally reverse the sense of it and attach it to the template specialization instead, but that would be incredibly messy -- in order to maintain AST invariants and source fidelity, we'd need to synthesize a new declaration of the template specialization to attach the attribute to, or something similar. We'd still need the attribute to appear before the template definition, though -- unless we rework how attribute instantiation is done -- so that's not really a general solution to the ergonomics issue. Or we could store a side-table, which would also bring with it a lot of complexity, particularly when we come to lazily load such information from module files. In my view, the added implementation complexity from attaching the attribute to a typedef outweighs the ergonomic benefit.

There's another design approach we could follow, that would keep the attribute on the template but avoid the awkwardness of requiring the typedef to appear first: we could completely divorce this feature from typedefs. Instead, we could say the attribute is of the form preferred_name(type, "name"), where type is a specialization of the type to which the attribute is attached, and "name" is a name used in diagnostics when referring to that type, which is printed as if it's a member of the enclosing namespace (but there'd be no enforcement that such a member actually exists and declares the corresponding type). What do you think?

I think that the fact we need to re declare everything shows how the ergonomics would be better if we could just add the attribute to the typedef. See for example how we're re-declaring a bunch of stuff in <iosfwd>. How bad of an idea is it to try putting the attribute on typedefs instead, and how strongly are you opposed to it? Because from a naive user perspective, having to redeclare the class with a closed set of preferred names feels awkward (IMO, of course).

FWIW, I'd be pretty strongly opposed to putting the attribute on the typedef instead, for all the reasons Richard points out. It's the wrong subject for the attribute given the semantics of how the attribute behaves and I'd liken it somewhat to writing an attribute on a function declaration when the semantics of the attribute impact a parameter of the function instead. While the function and its parameter are certainly related (more strongly than the template and the typedef in this case), it's just the wrong place to write the attribute.

How much should we concern ourselves with ergonomics in this instance? Most of the intended uses of this attribute (at least by how often they'll affect diagnostics) are included in this patch, and are in code that we don't expect many people to look at most of the time, that is maintained by domain experts, and that is generally optimized for other properties than readability. However, this is certainly intended to be used by third-party library authors; otherwise we could just add a hard-coded table to Clang itself. So I think ergonomics do matter a little, but that other factors are probably more important.

I think the ergonomics matter because it's a documented attribute that people will use outside of the STL, but I don't imagine this being an attribute that gets written frequently. I'm fine if the usage is a bit verbose in this case.

We could allow the attribute on typedefs, and internally reverse the sense of it and attach it to the template specialization instead, but that would be incredibly messy -- in order to maintain AST invariants and source fidelity, we'd need to synthesize a new declaration of the template specialization to attach the attribute to, or something similar. We'd still need the attribute to appear before the template definition, though -- unless we rework how attribute instantiation is done -- so that's not really a general solution to the ergonomics issue. Or we could store a side-table, which would also bring with it a lot of complexity, particularly when we come to lazily load such information from module files. In my view, the added implementation complexity from attaching the attribute to a typedef outweighs the ergonomic benefit.

There's another design approach we could follow, that would keep the attribute on the template but avoid the awkwardness of requiring the typedef to appear first: we could completely divorce this feature from typedefs. Instead, we could say the attribute is of the form preferred_name(type, "name"), where type is a specialization of the type to which the attribute is attached, and "name" is a name used in diagnostics when referring to that type, which is printed as if it's a member of the enclosing namespace (but there'd be no enforcement that such a member actually exists and declares the corresponding type). What do you think?

I'm not opposed to that approach, but I don't think the design is as nice as what's originally proposed in terms of behavior. With the current approach, the name that gets used in diagnostics is one that shows up in the code as an actual declaration of something and I think that's a useful property to enforce. I don't like a design where the user can write an arbitrary string literal that has no enforcement that it actually names a type in the user's program. While doing such a thing is up to the author of the attribute and so it's sort of a "well then don't do that if it's nonsense" situation, it's also not hard to get accidental typos that slip through code reviews but would be caught by using the actual type system itself (which provides extra value like diagnostics about the mistake or possible typo correction).

[...]

Thanks for your detailed explanation. I did not understand the philosophy of the attribute, and it's now clear to me that it shouldn't be tied to the typedef, indeed.

There's another design approach we could follow, that would keep the attribute on the template but avoid the awkwardness of requiring the typedef to appear first: we could completely divorce this feature from typedefs. Instead, we could say the attribute is of the form preferred_name(type, "name"), where type is a specialization of the type to which the attribute is attached, and "name" is a name used in diagnostics when referring to that type, which is printed as if it's a member of the enclosing namespace (but there'd be no enforcement that such a member actually exists and declares the corresponding type). What do you think?

I think this is pretty interesting, actually. While it does seem a bit more loosey-goosey in terms of the type system, I can imagine useful applications of being able to use arbitrary strings, such as providing better diagnostics for specific types without having to actually have an associated typedef. Just to clarify, that would look like this, right?

template<class _CharT, class _Traits>
class [[preferred_name(basic_string_view<char>, "string_view")]]
      [[preferred_name(basic_string_view<char8_t>, "u8string_view")]]
      [[preferred_name(basic_string_view<char16_t>, "u16string_view")]]
      [[preferred_name(basic_string_view<char32_t>, "u32string_view")]]
      [[preferred_name(basic_string_view<wchar_t>, "wstring_view")]]
basic_string_view {
    ...
};

I don't want to ask for something unreasonable.. but I think that's a neat design that's actually more powerful than the current one. I would be curious to hear what other people think as well, to see whether I'm the only one preferring the more free-form approach.

[...]

Thanks for your detailed explanation. I did not understand the philosophy of the attribute, and it's now clear to me that it shouldn't be tied to the typedef, indeed.

There's another design approach we could follow, that would keep the attribute on the template but avoid the awkwardness of requiring the typedef to appear first: we could completely divorce this feature from typedefs. Instead, we could say the attribute is of the form preferred_name(type, "name"), where type is a specialization of the type to which the attribute is attached, and "name" is a name used in diagnostics when referring to that type, which is printed as if it's a member of the enclosing namespace (but there'd be no enforcement that such a member actually exists and declares the corresponding type). What do you think?

I think this is pretty interesting, actually. While it does seem a bit more loosey-goosey in terms of the type system, I can imagine useful applications of being able to use arbitrary strings, such as providing better diagnostics for specific types without having to actually have an associated typedef.

How would that work for users - they would get error messages from the compiler using type names that don't exist in the source code? I'd have thought that would be quite confusing.

Just to clarify, that would look like this, right?

template<class _CharT, class _Traits>
class [[preferred_name(basic_string_view<char>, "string_view")]]
      [[preferred_name(basic_string_view<char8_t>, "u8string_view")]]
      [[preferred_name(basic_string_view<char16_t>, "u16string_view")]]
      [[preferred_name(basic_string_view<char32_t>, "u32string_view")]]
      [[preferred_name(basic_string_view<wchar_t>, "wstring_view")]]
basic_string_view {
    ...
};

I don't want to ask for something unreasonable.. but I think that's a neat design that's actually more powerful than the current one. I would be curious to hear what other people think as well, to see whether I'm the only one preferring the more free-form approach.

How would that work for users - they would get error messages from the compiler using type names that don't exist in the source code? I'd have thought that would be quite confusing.

Yes, if a library author decides to say something like:

template<class _CharT, class _Traits>
class [[preferred_name(basic_string_view<char>, "ahaha I'm such a troll")]]
basic_string_view {
    ...
};

Then you might get compiler errors that are not super helpful. I don't think the fact that such nonsense is doable means that we shouldn't give this control to library authors.

For instance, I can easily imagine a library that provides an API where some types shouldn't be named (for example expression templates). In that case, you might want to describe a type by a string along the lines of decltype(some-expression), which could potentially be a lot more useful than the ability to refer to a typedef. Does this sort of usage ring true to someone else?

Pinging fellow library folks in case they have an opinion. @mattcalabrese @david_stone @mpark

How would that work for users - they would get error messages from the compiler using type names that don't exist in the source code? I'd have thought that would be quite confusing.

Yes, if a library author decides to say something like:

template<class _CharT, class _Traits>
class [[preferred_name(basic_string_view<char>, "ahaha I'm such a troll")]]
basic_string_view {
    ...
};

Then you might get compiler errors that are not super helpful. I don't think the fact that such nonsense is doable means that we shouldn't give this control to library authors.

It's certainly a cost/risk to weigh against the benefits. (both intentional weird uses like that, but also accidental misuses like typos).

For instance, I can easily imagine a library that provides an API where some types shouldn't be named (for example expression templates). In that case, you might want to describe a type by a string along the lines of decltype(some-expression), which could potentially be a lot more useful than the ability to refer to a typedef. Does this sort of usage ring true to someone else?

A concrete/real-world example might be helpful, if you happen to have one on hand.

For instance, I can easily imagine a library that provides an API where some types shouldn't be named (for example expression templates). In that case, you might want to describe a type by a string along the lines of decltype(some-expression), which could potentially be a lot more useful than the ability to refer to a typedef. Does this sort of usage ring true to someone else?

A concrete/real-world example might be helpful, if you happen to have one on hand.

I can imagine something like this being useful. But I think we would want a substantially more powerful mechanism to produce those strings. For example, suppose we have an expression template library for matrices:

template<MatrixType a, MatrixType b> detail::mul<a, b> operator*(a, b);
template<MatrixType a, MatrixType b> detail::add<a, b> operator+(a, b);

... then given Matrix3f a, b, we might want the type of a * b + a to be printed as something like decltype(Matrix3f() * Matrix3f() + Matrix3f()) rather than as detail::add<detail::mul<Matrix3f, Matrix3f>, Matrix3f> (and in practice the types of expression template intermediaries tend to be a lot more complex than this). If we had a way to ask for a type to be pretty-printed as a string, we could imagine:

template<MatrixType a, MatrixType b>
struct [[clang::preferred_name(mul<a, b>, "decltype(" + sample_expr_of_type<mul<a,b>>() + ")")]] mul { ... };

... for some suitable function sample_expr_of_type. I don't think we're there yet -- this would require substantially more compile-time power than we have (formatting type strings, and string concatenations resulting in something that we can then feed back into the attribute). So I don't think a proposal that lets us only include string literals as the second attribute argument really gets us where we'd want to be. (This would also require a new kind of delayed attribute parsing: we would need to delay parsing the attribute argument until after mul is declared and in scope -- or we would need to require the template to be declared twice as we currently do.)

While we might not be ready for this right now, going from [[preferred_name(type)]] to [[preferred_name(type, spelling)]] would be a compatible extension if we wanted to do that in the future.

ldionne accepted this revision as: Restricted Project.Nov 18 2020, 1:35 PM

A concrete/real-world example might be helpful, if you happen to have one on hand.

See what Richard included in his comment.

@rsmith

Ok, thanks for thinking this through. I do agree we'd most likely want a more powerful mechanism for creating these strings.

I'm still only "meh" on the current design, especially because I see the ergonomics as a high barrier, but this isn't blocking. So, LGTM from libc++'s perspective -- thanks for having a discussion.

This revision is now accepted and ready to land.Nov 18 2020, 1:35 PM
ldionne requested changes to this revision.Nov 18 2020, 1:37 PM

Actually, apologies -- I might have accepted this too quickly.

We can stick with this design, but I'd like to understand why #if _LIBCPP_HAS_NO_PREFERRED_NAME is necessary in <iosfwd>, and also the CI is failing on MacOS.

libcxx/include/iosfwd
188

Is that really needed? What's the issue with having redundant declarations?

This revision now requires changes to proceed.Nov 18 2020, 1:37 PM

We can stick with this design, but I'd like to understand why #if _LIBCPP_HAS_NO_PREFERRED_NAME is necessary in <iosfwd>, and also the CI is failing on MacOS.

You mean the HWAddressSanitizer test failure? That appears to be a flake. Looking through recent failures I found more that look the same: https://reviews.llvm.org/B79364 https://reviews.llvm.org/B79363 https://reviews.llvm.org/B79358

libcxx/include/iosfwd
188

It's not necessary. I'm happy to remove it and redeclare the templates unconditionally if you prefer.

We can stick with this design, but I'd like to understand why #if _LIBCPP_HAS_NO_PREFERRED_NAME is necessary in <iosfwd>, and also the CI is failing on MacOS.

You mean the HWAddressSanitizer test failure? That appears to be a flake. Looking through recent failures I found more that look the same: https://reviews.llvm.org/B79364 https://reviews.llvm.org/B79363 https://reviews.llvm.org/B79358

I was talking about his issue in the libc++ CI. See https://buildkite.com/llvm-project/libcxx-ci/builds/422#8e9a6d80-32ff-429e-a3de-e7ecc111c2fb (gotta look at the raw log).

/tmp/buildkite-agent/builds/libcxx-mbp-local-1/llvm-project/libcxx-ci/libcxx/include/iosfwd:190:34: error: redefinition of 'ios' as different kind of symbol

    class _LIBCPP_PREFERRED_NAME(ios) _LIBCPP_PREFERRED_NAME(wios) basic_ios;

                                 ^
libcxx/include/iosfwd
188

Yes, I'd prefer that, and removing _LIBCPP_HAS_PREFERRED_NAME.

rsmith updated this revision to Diff 307496.Nov 24 2020, 6:50 PM
  • Remove _LIBCPP_HAS_NO_PREFERRED_NAME
ldionne accepted this revision as: Restricted Project.Nov 25 2020, 12:09 PM

LGTM from the libc++ point of view. The CI is passing -- those failures are flaky modules tests that we need to fix.

ldionne accepted this revision.Nov 25 2020, 12:10 PM
This revision is now accepted and ready to land.Nov 25 2020, 12:10 PM

LGTM from the libc++ point of view. The CI is passing -- those failures are flaky modules tests that we need to fix.

Perhaps we need to specify -fmodules-validate-system-headers in the test so Clang doesn't assume that system headers are unchanged?

LGTM from the libc++ point of view. The CI is passing -- those failures are flaky modules tests that we need to fix.

Perhaps we need to specify -fmodules-validate-system-headers in the test so Clang doesn't assume that system headers are unchanged?

Oh, would that be it? We're not including libc++ headers as system headers when running the tests, though (and we're disabling the system header pragma). Do you think that might still be the issue?

I tried a dumb workaround with D92131, but it's arguably not great. I'd love to have your thoughts on that. We've been seeing these issues for a while when we make some types of changes in the __config_site.

LGTM from the libc++ point of view. The CI is passing -- those failures are flaky modules tests that we need to fix.

Perhaps we need to specify -fmodules-validate-system-headers in the test so Clang doesn't assume that system headers are unchanged?

Oh, would that be it? We're not including libc++ headers as system headers when running the tests, though (and we're disabling the system header pragma). Do you think that might still be the issue?

The module map lists std as being [system] module, so I think the headers will still end up being treated as system headers. So yes, I think there's a possibility that is still the issue.

I tried a dumb workaround with D92131, but it's arguably not great. I'd love to have your thoughts on that.

I'm not sure that'll make any difference: at least in my setup, %t expands to the same path on subsequent invocations of the same test, so you'll still reuse module caches from one test run to the next. In Clang tests for this sort of thing, we have explicit RUN: rm -rf %t/ModuleCache lines to clean up any stale module cache before running a test... but that shouldn't really be necessary; Clang should be able to detect when the files are out of date.

LGTM from the libc++ point of view. The CI is passing -- those failures are flaky modules tests that we need to fix.

Perhaps we need to specify -fmodules-validate-system-headers in the test so Clang doesn't assume that system headers are unchanged?

Oh, would that be it? We're not including libc++ headers as system headers when running the tests, though (and we're disabling the system header pragma). Do you think that might still be the issue?

The module map lists std as being [system] module, so I think the headers will still end up being treated as system headers. So yes, I think there's a possibility that is still the issue.

I tried a dumb workaround with D92131, but it's arguably not great. I'd love to have your thoughts on that.

I'm not sure that'll make any difference: at least in my setup, %t expands to the same path on subsequent invocations of the same test, so you'll still reuse module caches from one test run to the next. In Clang tests for this sort of thing, we have explicit RUN: rm -rf %t/ModuleCache lines to clean up any stale module cache before running a test... but that shouldn't really be necessary; Clang should be able to detect when the files are out of date.

Ok, thanks for the info. I'll use -fmodules-validate-system-headers instead. Anyway, this patch LGTM.

This revision was landed with ongoing or failed builds.Dec 7 2020, 12:55 PM
This revision was automatically updated to reflect the committed changes.
jyknight added inline comments.
clang/include/clang/Basic/Attr.td
2368

I wonder if this attribute ought to be allowed to be used on any Record (which would make it potentially also valid for C). In case you also wanted to do something like this in the future:

class any;
using any_copyable = any;

class [[clang::preferred_name(any_copyable)]] any { ... };
rsmith added inline comments.Dec 10 2020, 6:11 PM
clang/include/clang/Basic/Attr.td
2368

I think that would be reasonable, if the generalized attribute would see usage. If we're going to generalize it that way, I think it'd make sense to allow it on enums too.

While looking into some debug info issues* I've come across what I think are some inconsistencies in the handling of this attribute - @rsmith mind taking a look?

Firstly, it looks like something isn't handled when it comes to templates where only the declaration of a template with a preferred name is instantiated, but not the definition:

template<typename T>
struct t1;
using t1i = t1<int>;
template<typename T1>
struct
#ifdef PREF
__attribute__((__preferred_name__(t1i)))
#endif
t1 {
};

template<typename T>
struct t2 {
};
int main() {
  t2<t1i> v1;
#ifdef DEF
  t1i v2;
#else
  t1<long> v2; // just leave this here so AST dump is similar with/without the t1i definition
#endif
}

Compile with -DPREF -Xclang -ast-dump and compare the output with/without -DDEF. With the definition, the ast dump uses the preferred name when printing the use of t1i as t2's template argument, without the definition it dumps as the raw t1<int> instead:

CXXConstructorDecl 0x10d9cd48 <col:8> col:8 implicit constexpr t2 'void (const t2<t1i> &)' inline default trivial noexcept-unevaluated 0x10d9cd48
CXXConstructorDecl 0x120cb0f8 <col:8> col:8 implicit constexpr t2 'void (const t2<t1<int>> &)' inline default trivial noexcept-unevaluated 0x120cb0f8

(unrelated to this, but related to how I came across this: I think it might be desirable to have a PrintingPolicy to turn off using the preferred name, specifically so we can use this when rendering a template name in DWARF - so the name is consistent with the DW_TAG_template_type_parameter - better chance of cross-compiler compatibility, etc - I've considered going the other way, and using the preferred name in the DW_TAG_template_type_parameter but worry that'll cause even more divergence between compilers & make things more difficult for the consumer (but they could support it, if they looked through typedefs/alias templates when doing structural equivalence between template descriptions in DWARF (which they kind of have to do anyway, because there's a bunch of other ways that names aren't exactly the same all the time - which, maybe that's a reason not to worry about the use of preferred names in the textual description today? Since the names aren't directly matchable anyway... - I haven't tested that too much to see how much divergence there is ignored by existing consumers))

  • The preferred name shows up in template type parameters - but that then mismatches with the DW_TAG_template_type_parameter which uses the underlying type. I have been experimenting with a patch to use the preferred type even there, but that might cause more significant/problematic mismatch between DWARF produced by different compilers, unless they're able to look through typedefs/alias templates when structurally comparing two type descriptions.

Oh, I should say I wasn't able to get this behavior to be exposed in a diagnostic in my limited testing - and attempting to add diagnostic cases caused me to lose the reproduction in the ast dumping, perhaps because I added some AST that tickled/fixed the disparity somehow, though I'm pretty sure I avoided actually instantiating the definition of the template...

@rsmith - would it make sense to disable preferred names use when PrintingPolicy::PrintCanonicalTypes is used? It's used in CGDebugInfo, but also in Sema::findFailedBooleanCondition - so maybe we need another PrintingPolicy flag for this so CGDebugInfo can use the flag, but Sema::findFailedBooleanCondition can keep using preferred names?