fixed: https://github.com/llvm/llvm-project/issues/62447
C don't support DependentSizedArrayType, use ConstantArrayType with nullptr as SizeExpr
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
clang/lib/Sema/SemaDecl.cpp | ||
---|---|---|
4399 | I wonder why it is not an InvalidDecl() as well? I don't see that we check both of these anywhere else. |
in SemaType.cpp#BuildArrayType, It will generate DependentSizedArrayType which cannot be merged.
So we can either add additional check in MergeVarDeclTypes or directly ignore to generate this type in BuildArrayType.
Which one is better?
} else if (ArraySize->isTypeDependent() || ArraySize->isValueDependent()) { T = Context.getDependentSizedArrayType(T, ArraySize, ASM, Quals, Brackets);
I am not sure where the right place to fix this is, or when it is correct to use containsErrors()
Maybe @aaron.ballman or @rsmith might have some advice
so containsErrors on an expression just means it has a RecoveryExpr in it. MUCH of the time a RecoveryExpr is just created with a Dependent type, thus the Dependent Array type. I would suggest rather than just returning 'no type' (via the empty qual type), if we find that the Array Size contains errors (I don't get the if CPlusPlus check?), we could either:
1- Make it an IncompleteArrayType.
2- Make it a ConstantArrayType, with a correctly-typed RecoveryExpr (that is, not dependent, still a size_t/etc expression, BUT with no value) for the bounds type. I would think it is better to do this ~2584 though.
Dependent type for CPP is a common case due to template. But for mergeTypes in C, dependent type cannot be handled correctly.
clang/lib/Sema/SemaType.cpp | ||
---|---|---|
2582 ↗ | (On Diff #519675) | So I'm still not sure this is the 'right' away about it. I think we should instead properly handle the containsErrors case and just always create a non-dependent sized array, except with the RecoveryExpr having the correct type. |
clang/lib/Sema/SemaType.cpp | ||
---|---|---|
2582 ↗ | (On Diff #519675) | For CPP, dependent sized array is acceptable and necessary. For C, I don't think SemaType is a correctly way to resolve TypoExpr, It should be done by ActOnXXX. |
clang/lib/Sema/SemaType.cpp | ||
---|---|---|
2582 ↗ | (On Diff #519675) | Why is the dependent array necessary for C++? I also don't get your C logic here... I think that Nullptr should be a RecoveryExpr of the proper type. |
So I'm still not sure this is the 'right' away about it. I think we should instead properly handle the containsErrors case and just always create a non-dependent sized array, except with the RecoveryExpr having the correct type.
You are right. Fix as your suggestion.
clang/lib/Sema/SemaType.cpp | ||
---|---|---|
2583 ↗ | (On Diff #519972) | Actually thinking further... rather than create a NEW RecoveryExpr, is there a problem KEEPING the ArraySize expression? It'd likely give more information/keep more AST consistency. ALSO, creating this as size-0 has some implications we probably don't want. I wonder if a different for the placeholder would be better? I'D ALSO suggest hoisting this ArraySize->containsError out of the else-if and into its own branch (rather than inside the dependent checks). |
use 1 replace 0 as length
clang/lib/Sema/SemaType.cpp | ||
---|---|---|
2583 ↗ | (On Diff #519972) |
Agree.
Do you think we can modify it as 1? But I don't find a better way to identifier it as undef or other.
Done |
clang/lib/Sema/SemaType.cpp | ||
---|---|---|
2583 ↗ | (On Diff #519972) | I don't have good evidence for what else we could do (whether an incomplete or variable type would be better), but at least 1 is 'legal', so perhaps we can live with this for now and see if there is any fallout. What happens if we try to 'merge' a valid and an invalid here? Does it make the experience worse? So same example you added below, but the 1st one has no error in its brackets? |
clang/lib/Sema/SemaType.cpp | ||
---|---|---|
2583 ↗ | (On Diff #519972) | I add test case for this scenario. I think it will not confuse developer. |
@aaron.ballman : Can you comment, particularly on the 'x2' and 'x3' examples here? I think our hackery here to get the AST back in a reasonable position here is unfortunate, and leads to some pretty awkward errors. I'm not sure what else we could do?
We can't make it a dependent type, as it can then not be merged correctly.
We can't really make it a variable array type, since we would then have those types in places where it isn't 'legal'.
We would have a perhaps similar problem with the incomplete array type as the VLA, but at least we would diagnose char[], right?
My original thoughts were to make this still be a constant-array-type, keeping the 'error' size expression, but with a '1' for the extent, but I'm second guessing here based on those errors.
clang/test/Sema/merge-decls.c | ||
---|---|---|
101 | Newline at end of file still needed. |
I'm not sure at the moment... I'd like to have aaron take a look and chat it over with him.
clang/lib/Sema/SemaType.cpp | ||
---|---|---|
2582 ↗ | (On Diff #519675) |
It's how we handle error recovery. I agree with @HerrCai0907 that we should keep the dependent-sized array in C++. To me, the real question here is: what happens when a recovery expression is needed in order to form a type? Should we even *get* a recovery expression in this case? (And should the answer be different between C and C++?) CC @hokein and @sammccall who were active on https://reviews.llvm.org/D89046 which introduced this issue. I'm not certain what design y'all were thinking of for this sort of situation. |
clang/test/AST/ast-dump-types-errors.cpp | ||
5 ↗ | (On Diff #520808) | I don't think we should be changing the behavior in C++, this is a C-specific issue. |
Unless I'm wildly misremembering (@hokein knows better), the type should also be dependent in C.
The intuition is that the code has some meaning that depends on how the error is resolved (by the programmer changing the code!), and that "dependent" is a good first-approximation way to model this (defer further analysis, suppress diagnostics) and "contains-errors" allows us to specifically handle the cases that need it.
Of course this allows dependent constructs to occur in new places, in non-template code in C++ (this was handled quite early), and also in C (this is why the "all languages" D89046 came later).
I'm surprised this only came up now! Ideally I think we'd use DependentSizedArrayType as the types of these variables, dependent types are supposed to work in C. It seems OK to have mergeTypes() return QualType() for types that have errors in them - this will result in a followup diagnostic that ideally we'd drop, but we also emit that diagnostic in this case in C++. You could also make the case for returning the first type, or the type with errors, or the type without errors (i.e. *assume* that the types are the same after errors are fixed).
The dependent type should be able to work for C (we have extended clang to support dependence in C as RecoveryExpr heavily relies on clang's dependence mechanism). We're probably missing some places that need to handle dependent-type for C (like this case), we should fix it by making the C-only codepath properly support dependence.
That being said, it is expected to see dependent types for C, we should not change the type in order to fix the crash.
It seems OK to have mergeTypes() return QualType() for types that have errors in them - this will result in a followup diagnostic that ideally we'd drop, but we also emit that diagnostic in this case in C++. You could also make the case for returning the first type, or the type with errors, or the type without errors (i.e. *assume* that the types are the same after errors are fixed).
The mergeTypes is a widely-used method (for C and C++ code paths), and the "non-dependent-type" seems like an important invariant for this method. I'm nervous about removing it, I think it'd be better to keep it. Similar to other callers, we can make the caller (here it is MergeVarDeclTypes) guarantee this invariant.
An "easy" alternative, is to bail out if we see the containsErrors bit on the Old or New type in MergeVarDeclTypes. A bonus is that it can drop a not-quite-useful followup diagnostic (redefinition of 'x' with a different type: 'char[((sizeof (<recovery-expr>(d))) == 8)]' vs 'char[((sizeof (<recovery-expr>(d)) == 8))]') for C++.
I think this is looking fine to me. Thanks for your patience on this, I know it was a tough review.
clang/lib/Sema/SemaDecl.cpp | ||
---|---|---|
4399 | NIT: 80 line character limit per line. |
I wonder why it is not an InvalidDecl() as well? I don't see that we check both of these anywhere else.