This is an archive of the discontinued LLVM Phabricator instance.

[C2X] N3007 Type inference for object definitions
ClosedPublic

Authored by to268 on Sep 5 2022, 4:30 AM.

Details

Summary

This patches implements the auto keyword from the N3007 standard specification.
This allows deducing the type of the variable like in C++

auto nb = 1;
auto chr = 'A';
auto str = "String";

The list of statements which allows the usage of auto:

  • Basic variables declarations (int, float, double, char, char*...)
  • Macros declaring a variable with the auto type

The list of statements which will not work with the auto keyword:

  • auto arrays
  • sizeof(), alignas()
  • auto parameters, auto return type
  • auto as a struct/typedef member
  • uninitialized auto variables
  • auto in an union
  • auto as a enum type specifier
  • auto casts
  • auto in an compound literals

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
to268 edited the summary of this revision. (Show Details)Sep 5 2022, 4:39 AM
to268 updated this revision to Diff 458089.Sep 5 2022, 9:04 PM

I have fixed my CodeGen test with the windows platform.
(Linux uses an i64 for the long type instead windows is using an i32)

to268 updated this revision to Diff 458092.Sep 5 2022, 9:40 PM

Fixed DeclSpec.cpp formatting

shafik added a subscriber: shafik.Sep 6 2022, 11:43 AM

The proposal has several nice tests and I don't they are all covered here e.g.:

{
  double a = 7;
  double b = 9;
  {
    double b = b * b;       // undefined, uses uninitialized variable without address
to268 updated this revision to Diff 458358.Sep 6 2022, 10:08 PM

I've added a test case with scopes.

to268 added a comment.Sep 6 2022, 10:11 PM

Also @aaron.ballman said me that he was thinking about maybe adding an extension to add support for auto in a function return or in a parameter list.

cor3ntin added a subscriber: cor3ntin.EditedSep 7 2022, 12:55 AM

Also @aaron.ballman said me that he was thinking about maybe adding an extension to add support for auto in a function return or in a parameter list.

I think extensions are best left to a separate pr, if at all.
Supporting deduced return type sounds fine, with a warning.
Supporting auto template parameters... Not so much, as it would in effect bring function templates, overloading and mangling into C... I don't think we want to go there.

I think we should consider supporting auto in older language modes though.

Or at least, improving the diagnostics for auto in older language modes.

type specifier missing, defaults to 'int'.

The probability that someone tries to use auto as a storage specifier and wants an implicit int at the same time is about 0.

This is awesome, thank you for working on it! @to268 and I had a really good discussion during my office hours and what we decided for next steps were:

0) Next time you upload a patch, please use -U9999 to give the patch more context for reviewers to see.

  1. Need to handle function pointers: auto (*)(void) and void (*)(auto) should both be rejected - you may want to consider adding logic to Sema::BuildPointerType() to semantically restrict forming such a function pointer type.
  2. Need a ton of random tests:
auto file_scope_1 = 12;
// This <might> be something we accept because the redundant storage class specifier is not
// specifying automatic storage duration due to the lack of a type specifier, or we might reject
// because of the redundant storage class specifier. I have a question out to the author to clarify.
auto auto file_scope_2 = 12;

void func(void) {
  int i;
  _Generic(i, auto : 0); // Should reject auto in an association
  _Alignof(auto); // Should also reject
  _Alignas(auto); // Should also reject


  auto j = ({ auto x = 12; x; }); // Should work, x and j should both be int
  auto auto k = 12; // 99% this is intended to be accepted; first `auto` is the storage class specifier, second `auto` is a redundant storage class specifier

  const int ci = 12;
  auto yup = ci;
  yup = 12; // Okay, the type of yup is int, not const int

  _Atomic(auto) atom1 = 12; // Should reject, asking WG14 about it
  _Atomic auto atom2 = 12; // Should also reject
}
`

It's worth noting that auto in C is a very, very strange beast. It's a storage-class-specifier, not a type-name. If there's a storage class specifier and NO TYPE NAME, then the type is inferred. So grammar productions that use type-name without also using storage-class-specifier should reject any use of auto because that's not a valid type name.

clang/test/C/C2X/n3007.c
7

Rather than include the system header, you should use _Alignas and _Alignof spellings.

46

Heh, this doesn't have much to do with structs. :-)

51

An additional tests here that's interesting is:

_Static_assert(_Generic(test_char_ptr, const char * : 1, char * : 2) == 2, "C is weird");
clang/test/Parser/c2x-auto.c
10

Should also test a structure member just to have the coverage.

clang/test/Sema/c2x-auto.c
4

Same suggestion here as above.

12–21

This seems to be duplicated from the parser tests -- there is only a need to test this once (and in each of these cases it seems to be a parsing restriction, so the tests should live in Parser).

33

This should have a FIXME comment -- we should produce a better diagnostic here. That we give an error is correct though!

C2x 6.5.2.5p4 specifies that compound literal is rewritten as if it were a declaration: SC typeof(T) ID = { IL }; (where SC is the storage class, T is the type, ID is a program-wide unique ID, and IL is the initializer list).

C2x 6.7.2.5p1 only allows typeof on an expression or a type-name, and type-name does not include auto as an option.

So the rewrite makes this invalid.

clang/www/c_status.html
1196

This should use class unreleased and specify Clang 16 instead of Yes ("Yes" basically means "we have no idea when we started supporting something, we've probably supported this forever")

shafik added inline comments.Sep 12 2022, 9:14 AM
clang/test/Sema/c2x-auto.c
50

When I made the comment about the example from the proposal, this was what I was thinking about.

FYI: regarding auto auto, I've heard back from the author of N3007 and the intent was to disallow this construct ("others" was meant to exclude auto and "except typedef" was meant to add onto the list of exclusions).

One thought that occurred to me is that the way C has this specified is awfully effectively the same way implicit int worked. It may be worth exploring a change to our implicit int functionality to instead generate an implicit auto type when in C2x mode.

to268 marked 6 inline comments as done.EditedSep 13 2022, 10:24 PM
auto auto k = 12; // 99% this is intended to be accepted; first `auto` is the storage class specifier, second `auto` is a redundant storage class specifier

It seems that specifying auto twice isn't supported at the time being, so i think i should fix that too.

auto auto test = 12; // error: cannot combine with previous 'auto' declaration specifier
auto auto k = 12; // 99% this is intended to be accepted; first `auto` is the storage class specifier, second `auto` is a redundant storage class specifier

It seems that specifying auto twice isn't supported at the time being, so i think i should fix that too.

auto auto test = 12; // error: cannot combine with previous 'auto' declaration specifier

Yup! I just learned yesterday that the error there is correct and intended, at least according to the author of WG N3007.

to268 updated this revision to Diff 461790.Sep 20 2022, 10:19 PM
to268 marked 2 inline comments as done.

I've added more test cases, added a diagnostic when using auto in compound literals and advised fixes

Also i have found that we don't parse the compound literal with the auto keyword correctly in ParseExpr.cpp between line 939 to 973 which is the beginning of Parser::ParseCastExpression(...)

int test_cl = (int){12};      // Compound literal is detected
auto test_cl2 = (auto){12};   // Compound literal is not detected

I've haven't dig deeper yet but maybe it's because we are not including auto when trying to guess if it's a compound literal due to the fact that auto is a storage-class-specifier

clang/test/Sema/c2x-auto.c
50

Do i need to treat shadowing when using auto as invalid?

One thought that occurred to me is that the way C has this specified is awfully effectively the same way implicit int worked. It may be worth exploring a change to our implicit int functionality to instead generate an implicit auto type when in C2x mode.

I've been thinking about this more and I'm starting to make myself a bit uncomfortable with the current approach, at least in terms of how we're handling it on the parsing side of things. I think it's reasonable for us to form an AutoType when we eventually get down to forming the type. But I'm uncomfortable with not modeling the language when parsing. e.g., I think we want to parse auto as a storage class specifier rather than a type specifier, and we want to rely on the lack of a type specifier coupled with use of auto as a storage class specifier to determine the situations where we want to infer a type. The resulting semantics should be basically equivalent, but this ensures that we're properly parsing as the language expects which helps us be forwards-compatible with future changes in C that build on top of this being a storage class specifier rather than a type.

Adding a few more reviewers for some extra opinions on this.

clang/include/clang/Basic/DiagnosticParseKinds.td
374 ↗(On Diff #461790)
clang/lib/Parse/ParseExpr.cpp
1515 ↗(On Diff #461790)

Why would this not be handled from Sema::ActOnCompoundLiteral()?

clang/test/C/C2x/n3007.c
7 ↗(On Diff #461790)

Why? My reading of the grammar is that const auto is valid. const is a type-specifier-qualifier declaration-specifier, and auto is a storage-class-specifier declaration-specifier, and a declaration is allowed to use a sequence of declaration-specifiers.

15 ↗(On Diff #461790)
clang/test/Sema/c2x-auto.c
50

You shouldn't need to do anything special there, I think. It should naturally fall out that you cannot use a variable in its own initialization. Note, you don't even need an inner scope for that: https://godbolt.org/z/EYa3fMxqx (example is compiled in C++ mode).

shafik added inline comments.Sep 21 2022, 11:17 AM
clang/test/Sema/c2x-auto.c
50

The scope of the identifier begins after its declarator so a is being used in it's own deduction which can not be.

e.g.

double a = 1.0;
  {
      int a = a*a; // undefined behavior a is used uninitialized within it's own initialization
  }
to268 updated this revision to Diff 463415.Sep 27 2022, 10:12 PM
to268 marked 3 inline comments as done.

I've fixed the diagnostic message and all of the bad TODOs comments from the Sema test.
I've explained in details why i'm only able to handle auto in compound literals.

One thought that occurred to me is that the way C has this specified is awfully effectively the same way implicit int worked. It may be worth exploring a change to our implicit int functionality to instead generate an implicit auto type when in C2x mode.

That's something is was trying to do a few weeks earlier, so i'll work on that too.

I've been thinking about this more and I'm starting to make myself a bit uncomfortable with the current approach, at least in terms of how we're handling it on the parsing side of things. I think it's reasonable for us to form an AutoType when we eventually get down to forming the type. But I'm uncomfortable with not modeling the language when parsing. e.g., I think we want to parse auto as a storage class specifier rather than a type specifier, and we want to rely on the lack of a type specifier coupled with use of auto as a storage class specifier to determine the situations where we want to infer a type. The resulting semantics should be basically equivalent, but this ensures that we're properly parsing as the language expects which helps us be forwards-compatible with future changes in C that build on top of this being a storage class specifier rather than a type.

I agree, that's a better approach to design it like that, i'll change my approach when i'll be able to generate an implicit auto.

Here are all the details that i've said earlier

clang/lib/Parse/ParseExpr.cpp
1515 ↗(On Diff #461790)

You're right, it's better to handle this in Sema::ActOnCompoundLiteral()
The problem is that right now, the usage of the auto keyword in a compound literal isn't parsed as a compound literal.
This compound literal is unable to reach Sema::ActOnCompoundLiteral() because the parser emits that it's an invalid expression.

To summarize, i'm unable to handle handle a compound literal that uses the auto keyword inside Sema::ActOnCompoundLiteral()
if it's never going to be called.

int test_ncl = (int){12};           // Parsed as a CL
auto test_cl = (auto){12};          // Not parsed as a CL and emits "expected expression"
/*
 * |-DeclStmt 0x562180ede8b0 <line:17:5, col:29>
 * | `-VarDecl 0x562180ede730 <col:5, col:28> col:9 test_ncl 'int' cinit
 * |   `-ImplicitCastExpr 0x562180ede898 <col:20, col:28> 'int' <LValueToRValue>
 * |     `-CompoundLiteralExpr 0x562180ede870 <col:20, col:28> 'int' lvalue
 * |       `-InitListExpr 0x562180ede828 <col:25, col:28> 'int'
 * |         `-IntegerLiteral 0x562180ede7b0 <col:26> 'int' 12
 * |-DeclStmt 0x562180ede970 <line:18:5, col:30>
 * | `-VarDecl 0x562180ede908 <col:5, col:10> col:10 invalid test_cl 'auto'
 * `-ReturnStmt 0x562180ede9a8 <line:19:5, col:12>
 *   `-IntegerLiteral 0x562180ede988 <col:12> 'int' 0
 */
1515 ↗(On Diff #461790)

When we are parsing an expression between parentheses in Parser::ParseParenExpression() line 2972,
we are calling Parser::isTypeIdInParens() which when using C will return true if it's a type specifier,
which is not the case for the auto keyword.

Ultimately, we fall into the conditional block of Parser::ParseParenExpression() line 3191,
which will return an ExprError() (AKA: a parsing error).

This is why we are unable to forbid a compound literal using the auto keyword at the
Semantic Analysis stage and why this feature is not working out of the box in the first place.

clang/test/C/C2x/n3007.c
7 ↗(On Diff #461790)

It's a mistake, i don't remember why i've added this comment in the first place and i've forgot his existence

aaron.ballman added inline comments.Sep 28 2022, 9:45 AM
clang/lib/Parse/ParseExpr.cpp
1515 ↗(On Diff #461790)

When we are parsing an expression between parentheses in Parser::ParseParenExpression() line 2972,

we are calling Parser::isTypeIdInParens() which when using C will return true if it's a type specifier,
which is not the case for the auto keyword.

Ahhhh, that makes a lot of sense!

to268 updated this revision to Diff 465280.Oct 4 2022, 10:30 PM
This comment was removed by to268.
to268 updated this revision to Diff 465312.Oct 5 2022, 1:02 AM

Added a check to not diagnose a missing type specifier in C2x mode

to268 updated this revision to Diff 471535.Oct 28 2022, 6:48 AM

Reworked the parsing side of the patch

to268 added a comment.Oct 28 2022, 6:52 AM

Also i'm not sure if must add C2x to this condition just in case (falling back or add an assertion)

clang/lib/Sema/DeclSpec.cpp
1362

Do we need to include C2x just in case?

Thank you for the update to this!

clang/lib/Parse/ParseExpr.cpp
1526 ↗(On Diff #471535)

I don't think this is quite right; I suspect we're going to need more complicated logic here. Consider a case like: (auto const){ 12 } or (auto *){ nullptr } where two tokens ahead is ) and not {. (Note, these type specifications can be pretty arbitrarily complex.)

Given that this is a temporary workaround for a lack of support for storage class specifiers in compound literals, perhaps another approach is to not try to catch uses in compound literals. Instead, we could add tests with FIXME comments to demonstrate the behavior we get with compound literals now, and when we do that storage class specifier work, we will (hopefully) break those tests and come back to make them correct.

1530–1531 ↗(On Diff #471535)

This looks a bit less visually jarring to me (the indentation differences were distracting). WDYT?

clang/lib/Sema/DeclSpec.cpp
1362

I don't think so -- we're in a language where the type specifier *is* optional for C2x.

1370–1372
clang/test/C/C2x/n3007.c
13 ↗(On Diff #471535)

I'd also like a test for:

_Atomic auto i = 12;
_Atomic(auto) j = 12;

_Atomic(int) foo(void);
auto k = foo(); // Do we get _Atomic(int) or just int?
36 ↗(On Diff #471535)

I think this test should be _Alignas(auto) right?

40–41 ↗(On Diff #471535)

I think other array cases we want to test are:

auto a = { 1 , 2 }; // Error
auto b = { 1, }; // OK
auto c = (int [3]){ 1, 2, 3 }; // OK
56 ↗(On Diff #471535)

I'd also like to ensure we reject:

typedef auto (*fp)(void);
typedef void (*fp)(auto);

_Generic(0, auto : 1);

and we should ensure we properly get integer promotions:

short s;
auto a = s;
_Generic(a, int : 1);

and a test for use of an explicit pointer declarator (which is UB that we can define if we want to):

int a;
auto *ptr = &a; // Okay?
auto *ptr = a; // Error
clang/test/Sema/c2x-auto.c
28

This one should not be accepted -- the grammar productions for the initialization allow for {0} and {0,} but no more than one element in the braces.

to268 marked 7 inline comments as done.Nov 13 2022, 10:43 PM

This is a status update of the patch.

clang/lib/Parse/ParseExpr.cpp
1526 ↗(On Diff #471535)

I think it's better to make a test of the actual behavior than trying to correct this case.

1530–1531 ↗(On Diff #471535)

That's better but i remember that clang-format has failed on that.

clang/test/C/C2x/n3007.c
13 ↗(On Diff #471535)

The _Atomic qualifier is dropped so the type of k is an int

40–41 ↗(On Diff #471535)

These statements are not working:

auto b = { 1, }; // Not valid
auto d = { 1 };  // Not valid

I think it's related to the compound literal.

56 ↗(On Diff #471535)
> and we should ensure we properly get integer promotions:
>

short s;
auto a = s;
_Generic(a, int : 1);

I got the error controlling expression type 'short' not compatible with any generic association type so we don't promote from short to int

to268 updated this revision to Diff 478113.Nov 27 2022, 11:18 AM
to268 marked 5 inline comments as done.

Added test cases and added an explicit auto* diagnostic

to268 updated this revision to Diff 481339.Dec 8 2022, 9:44 AM

Removed compound literal diagnostic in ParseExpr

to268 updated this revision to Diff 487427.Jan 9 2023, 7:23 AM

I spoke with @to268 during office hours about the current status of the NB comments for this feature in the C committee. For the moment, he's going to pause work on this patch until the C committee weighs in on whether auto is a type specifier or not. The committee meets the week of Jan 23, so we should have a better understanding of direction by the week of Jan 30. If the C committee turns auto into a type specifier, we may go back to the original implementation of this feature which exposes the C++ feature in C (and add diagnostics around the things that need to be constrained in C such as use in a function signature).

It surprised me that there are no type inference messages? The type of this auto is double. I only found warnings of misuse of auto and codegen tests.

I spoke with @to268 during office hours about the current status of the NB comments for this feature in the C committee. For the moment, he's going to pause work on this patch until the C committee weighs in on whether auto is a type specifier or not. The committee meets the week of Jan 23, so we should have a better understanding of direction by the week of Jan 30. If the C committee turns auto into a type specifier, we may go back to the original implementation of this feature which exposes the C++ feature in C (and add diagnostics around the things that need to be constrained in C such as use in a function signature).

The minutes from last week's meeting aren't available yet, but I can give an update from my personal notes. Alex Gilding wrote a proposal to change the specification mechanisms away from "storage class specifier with no type specifier" and to an actual type specifier in WG14 N3076 (https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3076.pdf). WG14 briefly looked at the paper, but given how late in the cycle we are, we did not want to adopt new specification wording that is so drastically different from the existing wording. So we repaired the national body comments with WG14 N3079 (https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3079.htm) instead.

Does WG14 want something along the lines of N3076 in a future revision of the standard? 10 (for)/0 (against)/7 (abstain) -- consensus

So we now have sufficient support within WG14 for Clang to expose our C++ implementation of auto in C (with additional diagnostics for functionality C doesn't yet support like auto in function signatures) instead of implementing it from scratch as a weird storage class specifier/missing type specifier. However, we did make some other repairs to the feature so the current wording in the working draft (WG14 N3054) is not the latest. Specifically, we adopted US-122 and US-123 changes from WG14 N3079.

WG14 is expecting to have another week of meetings to work on NB comments (week of Feb 13) and we are expecting to have a second round of NB comments. I think it's okay for us to restart work on this patch now that we have a direction, but we might want to hold off on landing the patch until after we have a better idea of the final form of the specification. If there are no comments about it in CD2, then we're fine, but it's possible for CD2 to make breaking changes to the feature, so it's mostly about being aware of that situation.

It surprised me that there are no type inference messages? The type of this auto is double. I only found warnings of misuse of auto and codegen tests.

Can you give a code example of what you expected to see a diagnostic for but aren't getting one?

It surprised me that there are no type inference messages? The type of this auto is double. I only found warnings of misuse of auto and codegen tests.

Can you give a code example of what you expected to see a diagnostic for but aren't getting one?

This is probably only for testing and not for Clang users:

auto Float = 3.0; // note that Float is a float
to268 updated this revision to Diff 504381.Mar 11 2023, 9:34 AM

Patch update: Reverting auto as a type specifier.
I am currently figuring out what behavior the auto keyword need to match.

to268 updated this revision to Diff 514051.Apr 16 2023, 1:39 PM

I have improved some diagnostics and i am now testing nullptr with auto.

to268 updated this revision to Diff 514789.Apr 18 2023, 5:14 PM

I have *hopefully* fixed my broken patch.
This is all the things in need to address:

auto str2[] = "string";
[[clang::overloadable]] auto test(auto x) { return x; }

and maybe retry to support auto in compound literals,
becaue i have seen heavy macro users that want compound literals to work.

I think it's safe to say that we will not support all features from N3076 in this patch.

to268 updated this revision to Diff 520447.May 8 2023, 11:14 AM

Fixed auto char arrays with explicit brackets

c
auto foo[] = "bar";

Thank you for the continued work on this! In general, I think the test coverage should verify the deduced type information more often. e.g., if we accept the code, there should probably be a static_assert nearby that ensures the deduced type is what we'd expect.

I applied the patch locally and tried out various things, and here's some test coverage I think we're lacking.

// Important: these two declarations are at file scope, we're testing that the storage
// class specifier is ignored because we're deducing the type.
auto i = 12; // Ok, deduces int
auto j = 12ull; // Ok, deduces unsigned long long

// This is testing whether storage specifier order matters:
void func() {
  const auto static i = 12; // Ok, static variable of type const int
}

I have *hopefully* fixed my broken patch.
This is all the things in need to address:

auto str2[] = "string";
[[clang::overloadable]] auto test(auto x) { return x; }

and maybe retry to support auto in compound literals,
becaue i have seen heavy macro users that want compound literals to work.

I think it's safe to say that we will not support all features from N3076 in this patch.

My testing of your patch suggests those aren't issues left to address (perhaps). e.g.,

auto test[] = "foo";
static_assert(_Generic(test, char * : 1, default : 0));

this matches what I'd expect (no warning, no failed assertion; string literals have type char [] in C, not const char[].

My testing of __attribute__((overloadable)) with your patch is that we currently reject use of auto in those functions. I think that's good initial behavior -- if we want to allow auto there, that's probably better left to a follow-up patch anyway because I'd expect there to be some debate around whether that's a good idea or not.

As for compound literals, I think we've determined they're not supported by C currently.

clang/test/C/C2x/n3007.c
20 ↗(On Diff #520447)

This diagnostic is incorrect -- I believe this code should be accepted.

21 ↗(On Diff #520447)

This diagnostic is correct.

31–32 ↗(On Diff #520447)

We should test that the types match our expectations:

double A[3] = { 0 };
auto pA = A; // pA is double * due to array decay on A
static_assert(_Generic(pA, double * : 1, default : 0));
auto qA = &A; // qA is double (*)[3]
static_assert(_Generic(qA, double (*)[3] : 1, default : 0));
46 ↗(On Diff #520447)

I think this would make it more clear that what's being tested is the _Alignas behavior.

56–57 ↗(On Diff #520447)

This should now be resolved in N3096 so the FIXME comment can be removed.

66 ↗(On Diff #520447)

This is awfully close to the examples from the paper (6.7.9p3):

auto p1 = (struct { int a; } *)0; // error expected

struct s;
auto p2 = (struct s { int a; } *)0; // error expected

They are expected to be diagnosed due to changes in N3006 (underspecified object definitions), so I think it's fine to add a FIXME comment with the test cases instead of trying to also implement N3006 at the same time. We currently list that as unknown in our status page, but I think it's clearly a no now.

100 ↗(On Diff #520447)

No longer accurate (see comments below), so I think this FIXME comment can be removed.

106–107 ↗(On Diff #520447)

I'd like to see a test that passes -pedantic which diagnoses these declarations with a warning about accepting this code being a Clang extension. (The standard requires we support auto ident = expr; but leaves other declarator pieces up to the implementation.)

Note, 6.7.9p2 made this implementation-defined behavior, so we also need to add some documentation to meet that requirement.

clang/test/Parser/c2x-auto.c
47–52

The C2x diagnostics here are kind of mean because if you give an initializer, we then reject the code anyway. If possible, it'd be nice to reject this for use of auto with an array declarator rather than the lack of an initializer.

79–82

This is tested below with a more clear test.

clang/test/Sema/c2x-auto.c
27–28

This FIXME is no longer accurate, I think it's not clear whether we should accept or reject this code from reading the standard, but I'm now thinking we should reject it for two reasons: 1) 6.7.9p1 says "A declaration for which the type is inferred shall contain the storage-class specifier auto", but a compound literal is not a declaration, it's an expression. What's more, it doesn't have a direct declarator or an equals sign as specified in p2. 2) GCC rejects: https://godbolt.org/z/K8sqr9qTT

to268 updated this revision to Diff 533214.Jun 21 2023, 4:26 AM
to268 marked 8 inline comments as done.

I have updated and fixed all minors comments about the test cases.
I have added the diagnostic to warn users about using a Clang extension when declaring an explicit auto* in pedantic mode.
I need to debug more about the two remaining diagnostics to fix:

  • Wrong diagnostic when using the _Atomic attribute
  • Misleading diagnostic where it should say that auto is not allowed but says that [...] requires an initializer.
to268 updated this revision to Diff 537331.Jul 5 2023, 6:18 AM

I have fixed the misleading diagnostic with auto array declarations.
The diagnostics now says that 'auto' is not allowed in array declarations.

I am not sure with the way i have currently implemented it in SemaDecl.cpp, maybe it can be done in a cleaner way.

to268 updated this revision to Diff 537422.Jul 5 2023, 10:36 AM

Minor code reformatting

Thank you for your patience while WG14 finalized the feature and I wrapped my head around the moving parts involved. I think this is shaping up nicely, but there's still some diagnostic issues. The biggest ones are:

  • We should have an extension warning for array use with string literals auto str[] = "testing";
  • We should be careful about testing underspecified declarations in a way that give the impression they're an extension, so we should sprinkle more fixme comments around the tests
  • _Atomic auto x = 12; is now something we should be accepting and deduce as an _Atomic int

I think there's a typo in the patch summary:

auto in an compound statement

I think you mean "as the type of a compound literal"?

clang/include/clang/Basic/DiagnosticSemaKinds.td
240–242 ↗(On Diff #537422)

I think we'll need this diagnostic for more than just * in a declaration. Consider a case like:

auto val = (struct X { int y; }){ 0 };

accepting this is also an extension because it declares y, which is not an "ordinary" identifier. See 6.7.9p2-3

Also, because this is about the syntax used, I think this might be a diagnostic that should live in DiagnosticParseKinds.td (but maybe it's correct here, I've not reviewed enough yet to see).

2417 ↗(On Diff #537422)

I'm surprised to see decltype(auto) here as decltype isn't a keyword in C so it'd be invalid syntax; let's switch it to something that makes it obvious that we should never issue that particular part of the diagnostic in C.

clang/lib/Sema/SemaDecl.cpp
12684–12685 ↗(On Diff #537422)

Comment can be re-flowed to 80 columns and the comment should end in a period. (Weirdly, Phabricator isn't letting me suggest edits just within this file.)

12687–12688 ↗(On Diff #537422)

This can switch to using !isa_and_present<StringLiteral, InitListExpr>(Init) (the _and_nonnull variant is quietly deprecated).

12690 ↗(On Diff #537422)

Please add a /* */ comment before 23 detailing what that magic number will be printing.

clang/lib/Sema/SemaTemplateDeduction.cpp
4780 ↗(On Diff #537422)
4789–4792 ↗(On Diff #537422)

I think this diagnostic should happen from the parser rather then when determining the concrete type; that ensures consistent diagnosis of the syntax regardless of whether we needed to deduce the type or not.

clang/test/C/C2x/n3007.c
6–22 ↗(On Diff #537422)

Adding one more test case for restrict-qualified pointers.

25 ↗(On Diff #537422)

What we finally landed on in WG14 is that this should be accepted. In this form, _Atomic is being used as a type qualifier (not a type specifier), and 6.7.9p2 explicitly talks about qualifiers: "The inferred type of the declared object is the type of the assignment expression after lvalue, array to pointer or function to pointer conversion, additionally qualified by qualifiers and amended by attributes as they appear in the declaration specifiers, if any."

So this should form an _Atomic int initialized to the value 12, and that matches GCC's behavior: https://godbolt.org/z/s468s86z6

32 ↗(On Diff #537422)

The first error on this line isn't helpful; the second error is the only one I'd expect to see here.

63 ↗(On Diff #537422)

To make it clear that use of auto within sizeof is the issue, not the initialization itself.

81–84 ↗(On Diff #537422)

Might need to re-flow the comment as well.

102 ↗(On Diff #537422)

This should also get the "is a clang extension" warning because the declarators include [].

116 ↗(On Diff #537422)
clang/test/CodeGen/auto.c
16
20–23

I'd like to see a sema test for this:

auto t = ({
  auto b = 12;
  b;
)};
_Generic(t, int : 1);
clang/test/Parser/c2x-auto.c
72–73
98
123

This one should be fixed and accepted.

125

I'd like to see parsing tests for:

auto ident [[clang::annotate("this works")]] = 12; // No diagnostic expected

We should also add a test that this C++'ism doesn't work in C:

int a = auto(0);
clang/test/Sema/c2x-auto.c
75

This should also get an extension warning.

120

Some additional test cases to consider:

_Complex auto i = 12.0; // Should be rejected because _Complex is a type specifier; however, 
                        // when auto becomes a type specifier, this should be accepted. GCC
                        // rejects but if we end up accepting, I won't be sad -- we'll need an
                        // extension warning in that case though.

void foo(void) {
  extern void foo(int a, int array[({ auto x = 12; x;})]); // This is a use of `auto` within function prototype
                                                           // scope, but should still be accepted.
}
clang/www/c_status.html
1193

This can be committed as an NFC change separate from this review, given that it's slightly orthogonal.

  • _Atomic auto x = 12; is now something we should be accepting and deduce as an _Atomic int

The committee is discussing this again on the reflectors. Thus far, almost everyone reads the standard the same way as GCC did with their implementation, which matches what I suggest above. However, there are folks who are claiming we should not be able to deduce the derived type because _Atomic forms an entirely new type and thus isn't actually a qualifier (and there are some words in the standard that could maybe be read as supporting that). The most conservative approach is to reject using _Atomic auto for right now so users don't build a reliance on it. Eventually WG14 will make a decision and we can relax that diagnostic then if we need to. Sorry for the confusion on this topic!

clang/test/Sema/c2x-auto.c
120

The suggested comment I have isn't fully correct. It should be rejected because _Complex is a type specifier, but when auto becomes a type specifier, I think _Complex perhaps should still not deduce. Consider this analogous case (which could be a fun test case as well):

signed auto a = 1L;

signed is a type specifier as well, and this is not accepted in C++ (so we don't want to accept it in C either).

to268 edited the summary of this revision. (Show Details)Jul 13 2023, 12:31 PM
to268 marked an inline comment as done.Jul 13 2023, 12:55 PM

Thank you for your feedback @aaron.ballman!
I really appreciate that you are pointing out all my formatting mistakes, and giving me more guidelines.
The direction of the patch is clearer now.

I think there's a typo in the patch summary:

auto in an compound statement

I think you mean "as the type of a compound literal"?

Yes i was a mistake, I have edited the summary to fix that typo.

  • We should have an extension warning for array use with string literals auto str[] = "testing";

This makes sense, since auto arrays are prohibited in the standard.

The committee is discussing this again on the reflectors. Thus far, almost everyone reads the standard the same way as GCC did with their implementation, which matches what I suggest above. However, there are folks who are claiming we should not be able to deduce the derived type because _Atomic forms an entirely new type and thus isn't actually a qualifier (and there are some words in the standard that could maybe be read as supporting that). The most conservative approach is to reject using _Atomic auto for right now so users don't build a reliance on it. Eventually WG14 will make a decision and we can relax that diagnostic then if we need to. Sorry for the confusion on this topic!

I was wondering about the support of _Atomic auto, i will add new error diagnostic to prohibit _Atomic auto, thank you for addressing that topic!

to268 updated this revision to Diff 541538.Jul 18 2023, 8:05 AM
to268 marked 17 inline comments as done.

Added recommendations from @aaron.ballman.
Adjusting diagnostic messages still need to be done.

clang/test/Sema/c2x-auto.c
120
void foo(void) {
  extern void foo(int a, int array[({ auto x = 12; x;})]); // This is a use of `auto` within function prototype
                                                           // scope, but should still be accepted.
}

I think you made a mistake there by using the same function name as the outer one.

to268 updated this revision to Diff 550874.Aug 16 2023, 1:55 PM
to268 marked 4 inline comments as done.

This update is mostly a followup of a change upstream that consist of renaming all C2x references to C23.

to268 updated this revision to Diff 556008.Sep 6 2023, 5:23 AM

This updates fixes the _Atomic auto diagnostic, the previous diagnostic was confusing since it was referencing a C++ feature (std::is_trivially_copyable).
I have a few diagnostic issues remaining, some parts need to take place as earlier in the Parsing stage and be diagnosed in the Sema stage.
PS: I'll migrate this patch to GH at the end of September if I need to.

to268 updated this revision to Diff 557115.Sep 20 2023, 7:53 AM

Fixed the initializer list diagnotic

to268 added a comment.Sep 20 2023, 7:58 AM

I need thoughts about this.

clang/lib/Sema/SemaTemplateDeduction.cpp
4805–4806 ↗(On Diff #557115)

It feels wrong to check if it's not C++ and changing the diagnostic if we are in C, I only emit the fixed diagnostic in C23 mode for now.

aaron.ballman accepted this revision.Oct 4 2023, 12:18 PM

I spotted some minor issues, but this LGTM for the initial implementation. Thank you for all your effort on this project!!

There are some outstanding issues that should be handled in a follow up, however:

void foo() {
   auto l = (struct S { int x, y; }){ 1, 2 }; // Underspecified declaration, should diagnose

  _Atomic int i;
  _Static_assert(_Generic(&i, _Atomic auto *: 1)); // Gets two errors instead of just 1

  auto int i = 12; // At local and file scope, now gives: warning: 'auto' storage class specifier is not permitted in C++11, and will not be supported in future releases

  auto __attribute__((mode(HI))) i = 12; // Gives confused diagnostics, but GCC also doesn't accept currently either
  _Static_assert(_Generic(i, short : 1)); // But if we accepted the above, this should pass. 
}

but these all seem minor enough to be handled in follow-up work.

Do you need me to land this on your behalf? If so, I can fix up the NFC changes I found for you, but what email address would you like me to use for patch attribution?

clang/lib/Sema/DeclSpec.cpp
1367–1368
clang/lib/Sema/SemaDecl.cpp
12869 ↗(On Diff #557115)

/*in array decl*/ instead of here

clang/test/C/C2x/n3007.c
32 ↗(On Diff #537422)

To be clear, I mean this example should have one diagnostic instead of two:

_Static_assert(_Generic(&i, _Atomic auto *: 1)); // expected-error {{_Atomic cannot be applied to type 'auto' in C23}} \
                                                    expected-error {{'auto' not allowed here}}
clang/test/Sema/c2x-auto.c
111–112
This revision is now accepted and ready to land.Oct 4 2023, 12:18 PM
to268 added a comment.Oct 4 2023, 12:56 PM

Do you need me to land this on your behalf? If so, I can fix up the NFC changes I found for you, but what email address would you like me to use for patch attribution?

It is probably the best thing to do, and yes I need you to land this on your behalf. Guillot Tony <tony.guillot@protonmail.com>

There are some outstanding issues that should be handled in a follow up

I will start to work on these issues, when the patch has been landed.

Thank you for the review!

This revision was automatically updated to reflect the committed changes.
to268 updated this revision to Diff 557614.Oct 5 2023, 8:04 AM

Fix arm buildbot issues

This revision was landed with ongoing or failed builds.Oct 5 2023, 9:16 AM