This is an archive of the discontinued LLVM Phabricator instance.

[Sema] avoid merge error type
ClosedPublic

Authored by HerrCai0907 on May 1 2023, 12:52 PM.

Details

Summary

fixed: https://github.com/llvm/llvm-project/issues/62447
C don't support DependentSizedArrayType, use ConstantArrayType with nullptr as SizeExpr

Diff Detail

Event Timeline

HerrCai0907 created this revision.May 1 2023, 12:52 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 1 2023, 12:52 PM
HerrCai0907 requested review of this revision.May 1 2023, 12:52 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 1 2023, 12:52 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
shafik added inline comments.May 2 2023, 10:45 AM
clang/lib/Sema/SemaDecl.cpp
4399

I wonder why it is not an InvalidDecl() as well? I don't see that we check both of these anywhere else.

in SemaType.cpp#BuildArrayType, It will generate DependentSizedArrayType which cannot be merged.
So we can either add additional check in MergeVarDeclTypes or directly ignore to generate this type in BuildArrayType.
Which one is better?

} else if (ArraySize->isTypeDependent() || ArraySize->isValueDependent()) {
  T = Context.getDependentSizedArrayType(T, ArraySize, ASM, Quals, Brackets);

fix in Sema::BuildArrayType

shafik added a subscriber: rsmith.May 2 2023, 9:12 PM

in SemaType.cpp#BuildArrayType, It will generate DependentSizedArrayType which cannot be merged.
So we can either add additional check in MergeVarDeclTypes or directly ignore to generate this type in BuildArrayType.
Which one is better?

} else if (ArraySize->isTypeDependent() || ArraySize->isValueDependent()) {
  T = Context.getDependentSizedArrayType(T, ArraySize, ASM, Quals, Brackets);

I am not sure where the right place to fix this is, or when it is correct to use containsErrors()

Maybe @aaron.ballman or @rsmith might have some advice

in SemaType.cpp#BuildArrayType, It will generate DependentSizedArrayType which cannot be merged.
So we can either add additional check in MergeVarDeclTypes or directly ignore to generate this type in BuildArrayType.
Which one is better?

} else if (ArraySize->isTypeDependent() || ArraySize->isValueDependent()) {
  T = Context.getDependentSizedArrayType(T, ArraySize, ASM, Quals, Brackets);

I am not sure where the right place to fix this is, or when it is correct to use containsErrors()

Maybe @aaron.ballman or @rsmith might have some advice

so containsErrors on an expression just means it has a RecoveryExpr in it. MUCH of the time a RecoveryExpr is just created with a Dependent type, thus the Dependent Array type. I would suggest rather than just returning 'no type' (via the empty qual type), if we find that the Array Size contains errors (I don't get the if CPlusPlus check?), we could either:
1- Make it an IncompleteArrayType.
2- Make it a ConstantArrayType, with a correctly-typed RecoveryExpr (that is, not dependent, still a size_t/etc expression, BUT with no value) for the bounds type. I would think it is better to do this ~2584 though.

Dependent type for CPP is a common case due to template. But for mergeTypes in C, dependent type cannot be handled correctly.

getConstantArrayType instal QualType()

use nullpt as SizeExpr

HerrCai0907 edited the summary of this revision. (Show Details)May 4 2023, 3:19 PM
erichkeane added inline comments.May 4 2023, 4:27 PM
clang/lib/Sema/SemaType.cpp
2582 ↗(On Diff #519675)

So I'm still not sure this is the 'right' away about it. I think we should instead properly handle the containsErrors case and just always create a non-dependent sized array, except with the RecoveryExpr having the correct type.

HerrCai0907 added inline comments.May 4 2023, 10:02 PM
clang/lib/Sema/SemaType.cpp
2582 ↗(On Diff #519675)

For CPP, dependent sized array is acceptable and necessary.

For C, I don't think SemaType is a correctly way to resolve TypoExpr, It should be done by ActOnXXX.

erichkeane added inline comments.May 5 2023, 6:51 AM
clang/lib/Sema/SemaType.cpp
2582 ↗(On Diff #519675)

Why is the dependent array necessary for C++? I also don't get your C logic here... I think that Nullptr should be a RecoveryExpr of the proper type.

So I'm still not sure this is the 'right' away about it. I think we should instead properly handle the containsErrors case and just always create a non-dependent sized array, except with the RecoveryExpr having the correct type.

You are right. Fix as your suggestion.

erichkeane accepted this revision.May 8 2023, 6:06 AM
This revision is now accepted and ready to land.May 8 2023, 6:06 AM
erichkeane requested changes to this revision.May 8 2023, 11:32 AM
erichkeane added inline comments.
clang/lib/Sema/SemaType.cpp
2583 ↗(On Diff #519972)

Actually thinking further... rather than create a NEW RecoveryExpr, is there a problem KEEPING the ArraySize expression? It'd likely give more information/keep more AST consistency.

ALSO, creating this as size-0 has some implications we probably don't want. I wonder if a different for the placeholder would be better?

I'D ALSO suggest hoisting this ArraySize->containsError out of the else-if and into its own branch (rather than inside the dependent checks).

This revision now requires changes to proceed.May 8 2023, 11:32 AM
HerrCai0907 marked 2 inline comments as done.

use 1 replace 0 as length

clang/lib/Sema/SemaType.cpp
2583 ↗(On Diff #519972)

KEEPING the ArraySize expression

Agree.

creating this as size-0 has some implications we probably don't want.

Do you think we can modify it as 1? But I don't find a better way to identifier it as undef or other.

hoisting this ArraySize->containsError out of the else-if

Done

erichkeane added inline comments.May 9 2023, 6:36 AM
clang/lib/Sema/SemaType.cpp
2583 ↗(On Diff #519972)

I don't have good evidence for what else we could do (whether an incomplete or variable type would be better), but at least 1 is 'legal', so perhaps we can live with this for now and see if there is any fallout.

What happens if we try to 'merge' a valid and an invalid here? Does it make the experience worse? So same example you added below, but the 1st one has no error in its brackets?

add more test case

add more test cases

HerrCai0907 added inline comments.May 9 2023, 1:45 PM
clang/lib/Sema/SemaType.cpp
2583 ↗(On Diff #519972)

I add test case for this scenario. I think it will not confuse developer.

@aaron.ballman : Can you comment, particularly on the 'x2' and 'x3' examples here? I think our hackery here to get the AST back in a reasonable position here is unfortunate, and leads to some pretty awkward errors. I'm not sure what else we could do?

We can't make it a dependent type, as it can then not be merged correctly.
We can't really make it a variable array type, since we would then have those types in places where it isn't 'legal'.
We would have a perhaps similar problem with the incomplete array type as the VLA, but at least we would diagnose char[], right?

My original thoughts were to make this still be a constant-array-type, keeping the 'error' size expression, but with a '1' for the extent, but I'm second guessing here based on those errors.

clang/test/Sema/merge-decls.c
101

Newline at end of file still needed.

Maybe return QualType() as a temporary solution to avoid crash? @erichkeane

Maybe return QualType() as a temporary solution to avoid crash? @erichkeane

I'm not sure at the moment... I'd like to have aaron take a look and chat it over with him.

aaron.ballman added inline comments.
clang/lib/Sema/SemaType.cpp
2582 ↗(On Diff #519675)

Why is the dependent array necessary for C++?

It's how we handle error recovery. I agree with @HerrCai0907 that we should keep the dependent-sized array in C++.

To me, the real question here is: what happens when a recovery expression is needed in order to form a type? Should we even *get* a recovery expression in this case? (And should the answer be different between C and C++?)

CC @hokein and @sammccall who were active on https://reviews.llvm.org/D89046 which introduced this issue. I'm not certain what design y'all were thinking of for this sort of situation.

clang/test/AST/ast-dump-types-errors.cpp
5 ↗(On Diff #520808)

I don't think we should be changing the behavior in C++, this is a C-specific issue.

Unless I'm wildly misremembering (@hokein knows better), the type should also be dependent in C.

The intuition is that the code has some meaning that depends on how the error is resolved (by the programmer changing the code!), and that "dependent" is a good first-approximation way to model this (defer further analysis, suppress diagnostics) and "contains-errors" allows us to specifically handle the cases that need it.

Of course this allows dependent constructs to occur in new places, in non-template code in C++ (this was handled quite early), and also in C (this is why the "all languages" D89046 came later).

I'm surprised this only came up now! Ideally I think we'd use DependentSizedArrayType as the types of these variables, dependent types are supposed to work in C. It seems OK to have mergeTypes() return QualType() for types that have errors in them - this will result in a followup diagnostic that ideally we'd drop, but we also emit that diagnostic in this case in C++. You could also make the case for returning the first type, or the type with errors, or the type without errors (i.e. *assume* that the types are the same after errors are fixed).

Unless I'm wildly misremembering (@hokein knows better), the type should also be dependent in C.

The intuition is that the code has some meaning that depends on how the error is resolved (by the programmer changing the code!), and that "dependent" is a good first-approximation way to model this (defer further analysis, suppress diagnostics) and "contains-errors" allows us to specifically handle the cases that need it.

Of course this allows dependent constructs to occur in new places, in non-template code in C++ (this was handled quite early), and also in C (this is why the "all languages" D89046 came later).

I'm surprised this only came up now! Ideally I think we'd use DependentSizedArrayType as the types of these variables, dependent types are supposed to work in C.

The dependent type should be able to work for C (we have extended clang to support dependence in C as RecoveryExpr heavily relies on clang's dependence mechanism). We're probably missing some places that need to handle dependent-type for C (like this case), we should fix it by making the C-only codepath properly support dependence.
That being said, it is expected to see dependent types for C, we should not change the type in order to fix the crash.

It seems OK to have mergeTypes() return QualType() for types that have errors in them - this will result in a followup diagnostic that ideally we'd drop, but we also emit that diagnostic in this case in C++. You could also make the case for returning the first type, or the type with errors, or the type without errors (i.e. *assume* that the types are the same after errors are fixed).

The mergeTypes is a widely-used method (for C and C++ code paths), and the "non-dependent-type" seems like an important invariant for this method. I'm nervous about removing it, I think it'd be better to keep it. Similar to other callers, we can make the caller (here it is MergeVarDeclTypes) guarantee this invariant.

An "easy" alternative, is to bail out if we see the containsErrors bit on the Old or New type in MergeVarDeclTypes. A bonus is that it can drop a not-quite-useful followup diagnostic (redefinition of 'x' with a different type: 'char[((sizeof (<recovery-expr>(d))) == 8)]' vs 'char[((sizeof (<recovery-expr>(d)) == 8))]') for C++.

revert untill first version

erichkeane accepted this revision.May 19 2023, 11:22 AM

I think this is looking fine to me. Thanks for your patience on this, I know it was a tough review.

clang/lib/Sema/SemaDecl.cpp
4399

NIT: 80 line character limit per line.

This revision is now accepted and ready to land.May 19 2023, 11:22 AM
hokein accepted this revision.May 19 2023, 11:11 PM

thanks, looks good to me.

clang/test/Sema/merge-decls.c
96

It would be helpful to mention the related github issue), can you rename the x to something like test8_gh62447 ?

96

you can simplify the test further, char x[d.undef == 8] should be enough to trigger the crash.

simplify test case

This revision was landed with ongoing or failed builds.May 20 2023, 2:19 AM
This revision was automatically updated to reflect the committed changes.