This is an archive of the discontinued LLVM Phabricator instance.


Differential D80925

Fix compiler crash when an expression parsed in the tentative parsing and must be claimed in the another evaluation context.
Closed, Public

Authored by ABataev on Jun 1 2020, 7:30 AM.

Details

Reviewers

rjmccall
rsmith

Commits

rG2f7269b6773d: Fix compiler crash when an expression parsed in the tentative parsing and must…

Summary

Clang crashes when trying to finish a function body. MaybeODRUseExprs is
not empty because a const static data member was parsed in the outer
evaluation context during the call to isTypeIdInParens(). That call builds
an annot_primary_expr token, which is later parsed by
ParseConstantExpression() in the inner constant-expression evaluation context.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

ABataev created this revision. Jun 1 2020, 7:30 AM

Herald added a project: Restricted Project. Jun 1 2020, 7:30 AM

Harbormaster completed remote builds in B58615: Diff 267613. Jun 1 2020, 9:39 AM

Narrowly this seems to fix the immediate problem, but I feel like we're in trouble if tentative parsing is changing the semantic context in ways that persist. In particular, I'm concerned that we could end up tentatively parsing an ODR use as part of an expression and then completely discarding it, causing Sema to think that there's an ODR use later because it never sees an L2R conversion (because the expression is not actually used). Probably the most architectural thing would be for tentative expression parsing to push a possibly-unevaluated context, and then when we claim an expression annotation token we can do the retroactive work necessary to make it an expression in the proper context. We already have most of the logic to support that because of C99 sizeof, which is usually not evaluated but can be in the narrow circumstance of a VLA.

If we do decide to solve this more narrowly, then we should audit our use of the tentative-parsing queries to make sure that we're pushing contexts consistently, and we should leave comments in places like this to make sure that maintainers understand the subtleties.
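To make the ODR-use concern concrete, here is a minimal C++ sketch (the names `Foo`, `read_value`, `odr_use`, and `aligned_local_ok` are illustrative and not taken from the patch). Reading a const static data member as a constant, including as an `alignas` operand, applies an lvalue-to-rvalue conversion and is not an odr-use; binding a reference to it is an odr-use, which is why the out-of-class definition is needed here.

```cpp
#include <cassert>
#include <cstdint>

struct Foo {
  // The in-class initializer makes the member usable in constant expressions.
  static const int vec_align_bytes = 32;
};

// Out-of-class definition. It is needed only because odr_use() below
// binds a reference to the member, which requires an actual object.
const int Foo::vec_align_bytes;

// Not an odr-use: the lvalue-to-rvalue conversion yields the constant value.
int read_value() { return Foo::vec_align_bytes; }

// An odr-use: the reference binds to the object itself.
const int *odr_use() {
  const int &ref = Foo::vec_align_bytes;
  return &ref;
}

// The alignas operand is a constant expression, so the member is only read.
bool aligned_local_ok() {
  alignas(Foo::vec_align_bytes) double a = 0.0;
  auto addr = reinterpret_cast<std::uintptr_t>(&a);
  return addr % static_cast<std::uintptr_t>(Foo::vec_align_bytes) == 0;
}

// Small self-check that exercises all three functions.
void self_check() {
  assert(read_value() == 32);
  assert(*odr_use() == 32);
  assert(aligned_local_ok());
}
```

Whether a given reference ends up as an odr-use therefore depends on what happens to the expression afterwards, which is the information a tentative parse does not yet have.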

In D80925#2066728, @rjmccall wrote:

Narrowly this seems to fix the immediate problem, but I feel like we're in trouble if tentative parsing is changing the semantic context in ways that persist. In particular, I'm concerned that we could end up tentatively parsing an ODR use as part of an expression and then completely discarding it, causing Sema to think that there's an ODR use later because it never sees an L2R conversion (because the expression is not actually used). Probably the most architectural thing would be for tentative expression parsing to push a possibly-unevaluated context, and then when we claim an expression annotation token we can do the retroactive work necessary to make it an expression in the proper context. We already have most of the logic to support that because of C99 sizeof, which is usually not evaluated but can be in the narrow circumstance of a VLA.

If we do decide to solve this more narrowly, then we should audit our use of the tentative-parsing queries to make sure that we're pushing contexts consistently, and we should leave comments in places like this to make sure that maintainers understand the subtleties.

So, you suggest to not create annot_primary_expr during tentative parsing and revert parsing completely, right?

In D80925#2066915, @ABataev wrote:

So, you suggest to not create annot_primary_expr during tentative parsing and revert parsing completely, right?

Not creating the annotation doesn't help if we're still making Sema calls. Also, I assume we're making the annotation token intentionally, probably to avoid re-doing the lookup. But I do think we could recognize that we're doing this, push an unevaluated context in tentative parsing, and then call TransformToPotentiallyEvaluated when we see the token in expression parsing.

In D80925#2067145, @rjmccall wrote:

Not creating the annotation doesn't help if we're still making Sema calls. Also, I assume we're making the annotation token intentionally, probably to avoid re-doing the lookup. But I do think we could recognize that we're doing this, push an unevaluated context in tentative parsing, and then call TransformToPotentiallyEvaluated when we see the token in expression parsing.

Ah, got it, will try to implement it.
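The approach agreed on above can be pictured with a small toy model. This is only a sketch under loud assumptions: `ToySema`, `ScopedContext`, `note_reference`, `note_l2r_conversion`, and `simulate` are hypothetical stand-ins, not Clang's classes or APIs, and the model ignores everything except a per-context list of pending odr-use candidates.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-ins only; none of these are Clang's real types.
enum class EvalKind { Unevaluated, PotentiallyEvaluated, ConstantEvaluated };

struct ToySema {
  struct Context {
    EvalKind kind;
    std::vector<std::string> maybe_odr_uses;  // pending odr-use candidates
  };
  std::vector<Context> stack;

  ToySema() { stack.push_back(Context{EvalKind::PotentiallyEvaluated, {}}); }

  // A name reference becomes a candidate unless the context is unevaluated.
  void note_reference(const std::string &name) {
    if (stack.back().kind != EvalKind::Unevaluated)
      stack.back().maybe_odr_uses.push_back(name);
  }

  // An lvalue-to-rvalue conversion clears a candidate in the current context.
  void note_l2r_conversion(const std::string &name) {
    auto &pending = stack.back().maybe_odr_uses;
    for (auto it = pending.begin(); it != pending.end(); ++it) {
      if (*it == name) {
        pending.erase(it);
        return;
      }
    }
  }
};

// RAII guard that pushes a context and pops it again on scope exit.
class ScopedContext {
  ToySema &sema;

public:
  ScopedContext(ToySema &s, EvalKind k) : sema(s) {
    sema.stack.push_back(ToySema::Context{k, {}});
  }
  ~ScopedContext() { sema.stack.pop_back(); }
  ScopedContext(const ScopedContext &) = delete;
  ScopedContext &operator=(const ScopedContext &) = delete;
};

// Returns how many stale candidates remain in the outer context.
std::size_t simulate(bool tentative_parse_unevaluated) {
  ToySema sema;
  if (tentative_parse_unevaluated) {
    // Proposed approach: the tentative parse records nothing.
    ScopedContext guard(sema, EvalKind::Unevaluated);
    sema.note_reference("vec_align_bytes");
  } else {
    // Original behavior: the candidate lands in the outer context.
    sema.note_reference("vec_align_bytes");
  }
  {
    // The real parse claims the cached expression in an inner context.
    ScopedContext inner(sema, EvalKind::ConstantEvaluated);
    if (tentative_parse_unevaluated)
      sema.note_reference("vec_align_bytes");  // stands in for the rebuild
    sema.note_l2r_conversion("vec_align_bytes");
  }
  return sema.stack.back().maybe_odr_uses.size();
}

// Small self-check covering both variants.
void self_check() {
  assert(simulate(false) == 1);
  assert(simulate(true) == 0);
}
```

In this model, `simulate(false)` leaves one stale candidate in the outer context, loosely mirroring the non-empty MaybeODRUseExprs from the summary, while `simulate(true)` leaves none because the candidate is only recorded once the expression is claimed in its final context.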

Reworked after comments.

Harbormaster completed remote builds in B58685: Diff 267743. Jun 1 2020, 5:54 PM

rjmccall added inline comments. Jun 1 2020, 7:14 PM

clang/lib/Parse/ParseExpr.cpp

1009

Pushing an unevaluated context here isn't correct. Here we're parsing the expression for real and should already be in the right context for TransformToPotentiallyEvaluated.

clang/lib/Parse/ParseTentative.cpp

1279

Suggestion:

// Tentative parsing may not be done in the right evaluation context
// for the ultimate expression.  Enter an unevaluated context to prevent
// Sema from immediately e.g. treating this lookup as a potential ODR-use.
// If we generate an expression annotation token and the parser actually
// claims it as an expression, we'll transform the expression to a
// potentially-evaluated one then.

ABataev marked an inline comment as done. Jun 2 2020, 5:09 AM

ABataev added inline comments.

clang/lib/Parse/ParseExpr.cpp

1009

`TransformToPotentiallyEvaluated` expects that the innermost context is reevaluated. By the time we approach this branch, we're already out of the original unevaluated context. So, we need to create fake unevaluated context to mimic the original one. Function `TransformToPotentiallyEvaluated` copies the context state from the previous context to this new one and rebuilds the expression correctly. Instead, I can add a new function that transforms the expression in the current context and use it in `TransformToPotentiallyEvaluated`.

ABataev retitled this revision from "Fix compiler crash when trying to parse alignment argument as a constant expression." to "Fix compiler crash when tentative parsing is unsuccessful." Jun 2 2020, 5:15 AM

ABataev retitled this revision from "Fix compiler crash when tentative parsing is unsuccessful." to "Fix compiler crash when an expression parsed in the tentative parsing and must be claimed in the another evaluation context."

rjmccall added inline comments. Jun 2 2020, 9:46 AM

clang/lib/Parse/ParseExpr.cpp

1009

> TransformToPotentiallyEvaluated expects that the innermost context is reevaluated. By the time we approach this branch, we're already out of the original unevaluated context.

Oh, I see. TransformToPotentiallyEvaluated expects that it will still be in a (temporary) unevaluated context and then looks through that context to build it in the surrounding context. (In fact, it actually changes the current context to be that outer context and never changes it back.) So we need to push an unevaluated context to balance things out.

I actually really dislike that design approach; the caller should be expected to have popped off the evaluation context under which the operand was parsed, and we should rebuild in the current context. There's way too much subtle reasoning about the exact global state here. But given that it is what it is, please leave a comment explaining why pushing a context is necessary here.

Rebase + fixes

LGTM

This revision is now accepted and ready to land. Jun 2 2020, 10:34 AM

Harbormaster completed remote builds in B58787: Diff 267927. Jun 2 2020, 12:05 PM

Closed by commit rG2f7269b6773d: Fix compiler crash when an expression parsed in the tentative parsing and must… (authored by ABataev). Jun 2 2020, 12:05 PM

This revision was automatically updated to reflect the committed changes.

This is at best only a partial fix. Sema::NC_ContextIndependentExpr is supposed to be used (unsurprisingly) only if we form a context-independent annotation, but here we're forming a context-dependent expression that depends on whether it appears in an unevaluated context. I think the better approach would be to fix the case in Sema::ClassifyName that violates context-independence instead (there's a FIXME there for this issue).

This fix changed us from producing a bad AST if the member reference was not supposed to be evaluated, to producing a bad AST if the member reference was supposed to be evaluated and is type-dependent -- we now crash in CodeGen on this invalid code:

struct C { void g(); };
template<typename T> struct A {
  T x;
  static void f() {
    (x.g());
  }
};
void h() { A<C>::f(); }

... because we now incorrectly form an unevaluated DeclRefExpr for x when disambiguating between a cast and a parenthesized expression (and we don't fix it due to the added "type-dependent" check).
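For readers unfamiliar with the ambiguity mentioned here, a minimal sketch (with illustrative names `T`, `x`, `as_cast`, and `as_subtraction` that are not from the patch) shows why the parser must classify the parenthesized name before it can decide how to parse what follows:

```cpp
#include <cassert>

using T = int;    // T names a type
const int x = 5;  // x names a variable

// "(T)-3" is a C-style cast applied to the unary expression "-3".
int as_cast() { return (T)-3; }

// "(x)-3" is a parenthesized expression followed by a binary minus.
int as_subtraction() { return (x)-3; }

// Small self-check for both readings.
void self_check() {
  assert(as_cast() == -3);
  assert(as_subtraction() == 2);
}
```

The two function bodies have the same token shape and differ only in what the name inside the parentheses refers to, so the name lookup has to happen during disambiguation, before the final evaluation context is known.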

I'm going to try to fix this a different way, by fixing the bad case in Sema::ClassifyName instead.

In D80925#2177446, @rsmith wrote:

I'm going to try to fix this a different way, by fixing the bad case in Sema::ClassifyName instead.

Done in llvmorg-12-init-1234-g23d6525cbdc.

Revision Contents

Path                                          Size
clang/include/clang/Basic/TokenKinds.def      3 lines
clang/lib/Parse/ParseExpr.cpp                 15 lines
clang/lib/Parse/ParseTentative.cpp            9 lines
clang/lib/Parse/Parser.cpp                    3 lines
clang/test/AST/alignas_maybe_odr_cleanup.cpp  15 lines

Diff 267948

clang/include/clang/Basic/TokenKinds.def

@@ ANNOTATION(template_id) // annotation for a C++ template-id that names a
 // might not have explicit template arguments),
 // e.g. "C", "C<int>".
 ANNOTATION(non_type) // annotation for a single non-type declaration
 ANNOTATION(non_type_undeclared) // annotation for an undeclared identifier that
 // was assumed to be an ADL-only function name
 ANNOTATION(non_type_dependent) // annotation for an assumed non-type member of
 // a dependent base class
 ANNOTATION(primary_expr) // annotation for a primary expression
+ANNOTATION(
+    uneval_primary_expr) // annotation for a primary expression which should be
+                         // transformed to potentially evaluated
 ANNOTATION(decltype) // annotation for a decltype expression,
 // e.g., "decltype(foo.bar())"

 // Annotation for #pragma unused(...)
 // For each argument inside the parentheses the pragma handler will produce
 // one 'pragma_unused' annotation token followed by the argument token.
 PRAGMA_ANNOTATION(pragma_unused)

clang/lib/Parse/ParseExpr.cpp

@@ ExprResult Parser::ParseCastExpression(CastParseKind ParseKind,
   case tok::kw___objc_yes:
   case tok::kw___objc_no:
     return ParseObjCBoolLiteral();

   case tok::kw_nullptr:
     Diag(Tok, diag::warn_cxx98_compat_nullptr);
     return Actions.ActOnCXXNullPtrLiteral(ConsumeToken());

+  case tok::annot_uneval_primary_expr:
   case tok::annot_primary_expr:
     Res = getExprAnnotation(Tok);
+    if (SavedKind == tok::annot_uneval_primary_expr) {
+      if (Expr *E = Res.get()) {
+        if (!E->isTypeDependent() && !E->containsErrors()) {
+          // TransformToPotentiallyEvaluated expects that it will still be in a
+          // (temporary) unevaluated context and then looks through that context
+          // to build it in the surrounding context. So we need to push an
+          // unevaluated context to balance things out.
+          EnterExpressionEvaluationContext Unevaluated(
+              Actions, Sema::ExpressionEvaluationContext::Unevaluated,
+              Sema::ReuseLambdaContextDecl);
+          Res = Actions.TransformToPotentiallyEvaluated(Res.get());
+        }
+      }
+    }
     ConsumeAnnotationToken();
     if (!Res.isInvalid() && Tok.is(tok::less))
       checkPotentialAngleBracket(Res);
     break;

   case tok::annot_non_type:
   case tok::annot_non_type_dependent:
   case tok::annot_non_type_undeclared: {

clang/lib/Parse/ParseTentative.cpp

@@ if (!getLangOpts().ObjC && Next.is(tok::identifier))
       return TPResult::True;

     if (Next.isNot(tok::coloncolon) && Next.isNot(tok::less)) {
       // Determine whether this is a valid expression. If not, we will hit
       // a parse error one way or another. In that case, tell the caller that
       // this is ambiguous. Typo-correct to type and expression keywords and
       // to types and identifiers, in order to try to recover from errors.
       TentativeParseCCC CCC(Next);
+      // Tentative parsing may not be done in the right evaluation context
+      // for the ultimate expression. Enter an unevaluated context to prevent
+      // Sema from immediately e.g. treating this lookup as a potential ODR-use.
+      // If we generate an expression annotation token and the parser actually
+      // claims it as an expression, we'll transform the expression to a
+      // potentially-evaluated one then.
+      EnterExpressionEvaluationContext Unevaluated(
+          Actions, Sema::ExpressionEvaluationContext::Unevaluated,
+          Sema::ReuseLambdaContextDecl);
       switch (TryAnnotateName(&CCC)) {
       case ANK_Error:
         return TPResult::Error;
       case ANK_TentativeDecl:
         return TPResult::False;
       case ANK_TemplateName:
         // In C++17, this could be a type template for class template argument
         // deduction. Try to form a type annotation for it. If we're in a

clang/lib/Parse/Parser.cpp

@@ case Sema::NC_Type: {
     setTypeAnnotation(Tok, Ty);
     Tok.setAnnotationEndLoc(Tok.getLocation());
     Tok.setLocation(BeginLoc);
     PP.AnnotateCachedTokens(Tok);
     return ANK_Success;
   }

   case Sema::NC_ContextIndependentExpr:
-    Tok.setKind(tok::annot_primary_expr);
+    Tok.setKind(Actions.isUnevaluatedContext() ? tok::annot_uneval_primary_expr
+                                               : tok::annot_primary_expr);
     setExprAnnotation(Tok, Classification.getExpression());
     Tok.setAnnotationEndLoc(NameLoc);
     if (SS.isNotEmpty())
       Tok.setLocation(SS.getBeginLoc());
     PP.AnnotateCachedTokens(Tok);
     return ANK_Success;

   case Sema::NC_NonType:

clang/test/AST/alignas_maybe_odr_cleanup.cpp

This file was added.

+// RUN: %clang_cc1 -fsyntax-only %s -ast-dump | FileCheck %s
+
+struct FOO {
+  static const int vec_align_bytes = 32;
+  void foo() {
+    double a alignas(vec_align_bytes);
+    ;
+  }
+};
+
+// CHECK: AlignedAttr {{.*}} alignas
+// CHECK: ConstantExpr {{.+}} 'int' Int: 32
+// CHECK: ImplicitCastExpr {{.*}} 'int' <LValueToRValue>
+// CHECK: DeclRefExpr {{.*}} 'const int' lvalue Var {{.*}} 'vec_align_bytes' 'const int' non_odr_use_constant
+// CHECK: NullStmt
