This is an archive of the discontinued LLVM Phabricator instance.

[Sema] Correct typos in LHS, RHS before building a binop expression.
ClosedPublic

Authored by vsapsai on Aug 30 2017, 6:19 PM.

Details

Summary

Specifically, typo correction should be done before dispatching between
different kinds of binary operations like pseudo-object assignment,
overloaded binary operation, etc.

Without this change we hit an assertion

Assertion failed: (!LHSExpr->hasPlaceholderType(BuiltinType::PseudoObject)), function CheckAssignmentOperands

when in Objective-C we reference a property without self and there are
2 equally good typo correction candidates: ivar and a class name. In
this case LHS expression in BuildBinOp is

CXXDependentScopeMemberExpr
`-TypoExpr

and instead of handling Obj-C property assignment as pseudo-object
assignment, we call CreateBuiltinBinOp which corrects typo to

ObjCPropertyRefExpr '<pseudo-object type>'
`-ImplicitCastExpr
  `-ObjCIvarRefExpr
    `-ImplicitCastExpr
      `-DeclRefExpr

but cannot handle pseudo-objects and asserts about it (indirectly,
through CheckAssignmentOperands).

There is no crash in Objective-C++ but I've added a test case because in
the fix we execute different paths for C and C++. I've also considered
removing typo correction from CreateBuiltinBinOp and it doesn't cause
any test failures. But I decided to keep it because CreateBuiltinBinOp
sometimes is called directly, not through BuildBinOp.

rdar://problem/33102722

Event Timeline

vsapsai created this revision.Aug 30 2017, 6:19 PM
ahatanak edited edge metadata.Sep 1 2017, 12:11 PM

Is it possible to avoid creating CXXDependentScopeMemberExpr in the first place? It seems to me that we shouldn't be creating a CXXDependentScopeMemberExpr for an ObjC property access.

Or perhaps there are reasons to type-correct it later?

Is it possible to avoid creating CXXDependentScopeMemberExpr in the first place? It seems to me that we shouldn't be creating a CXXDependentScopeMemberExpr for an ObjC property access.

Or perhaps there are reasons to type-correct it later?

In Sema::ActOnMemberAccessExpr we handle dependent member expressions before other member references

if (Base->getType()->isDependentType() || Name.isDependentName() ||
    isDependentScopeSpecifier(SS)) {
  return ActOnDependentMemberExpr(Base, Base->getType(), IsArrow, OpLoc, SS,
                                  TemplateKWLoc, FirstQualifierInScope,
                                  NameInfo, TemplateArgs);
}

ActOnMemberAccessExtraArgs ExtraArgs = {S, Id, ObjCImpDecl};
return BuildMemberReferenceExpr(Base, Base->getType(), OpLoc, IsArrow, SS,
                                TemplateKWLoc, FirstQualifierInScope,
                                NameInfo, TemplateArgs, S, &ExtraArgs);

In this case variable Base is a TypoExpr and it is created as type-dependent regardless of the current language.

TypoExpr(QualType T)
    : Expr(TypoExprClass, T, VK_LValue, OK_Ordinary,
           /*isTypeDependent*/ true,
           /*isValueDependent*/ true,
           /*isInstantiationDependent*/ true,
           /*containsUnexpandedParameterPack*/ false) {
  assert(T->isDependentType() && "TypoExpr given a non-dependent type");
}

As a result we end up with CXXDependentScopeMemberExpr when reference a member of a TypoExpr. I'm not entirely sure but it looks like it was designed to work this way.

I was thinking about correcting the typo in Sema::ActOnMemberAccessExpr before checking whether Base is dependent. Doing so would cause BuildMemberReferenceExpr to be called instead of ActOnDependentMemberExpr.

I thought that would be a simpler approach, but I'm actually not sure which approach is better. I think this patch should be fine too as long as creating CXXDependentScopeMemberExpr for an ObjC property access doesn't cause a problem before typo correction.

What do you think?

Correcting typo in Sema::ActOnMemberAccessExpr is an interesting idea and it almost works. After such change there is a single failure in clang/test/SemaCXX/typo-correction-cxx11.cpp

void run(A *annotations) {
  map new_annotations;

  auto &annotation = *annotations;
  auto new_it = new_annotations.find(5);
  auto &new_anotation = new_it.second;  // expected-note {{'new_anotation' declared here}}
  new_annotation->Swap(&annotation);  // expected-error {{use of undeclared identifier 'new_annotation'; did you mean 'new_anotation'?}}
}

because we don't mention did you mean 'new_anotation'? anymore. I haven't investigated in depth what's causing it but my guess is that general typo correction doesn't suggest replacement due to ambiguity between new_anotation and new_annotations, and prevents member-specific typo correction from handling it.

I think having CXXDependentScopeMemberExpr for an ObjC property should be fine as we are dealing with invalid code and all typos are already type-dependent and instantiation-dependent though it doesn't apply to ObjC.

I agree that my fix isn't specific to the code triggering the assertion. But broader fix can be useful as it covers more cases. After checking SemaExpr.cpp and its multiple CorrectDelayedTyposInExpr, CorrectTypo, CorrectTypoDelayed I have an impression that typo correction absent in Sema::BuildBinOp is an omission worth fixing.

ahatanak accepted this revision.Sep 14 2017, 2:06 PM

I see. It looks like there is an ambiguity between 'new_anotation' and 'new_annotations' that can't be resolved in ActOnMemberAccessExpr because they have the same distance from 'new_annotation'. This doesn't happen if the full expression (new_annotation->Swap) is typo-corrected because 'new_anotation' will be the only valid choice in that case.

I don't think there is a problem with creating CXXDependentScopeMemberExpr for ObjC property access, so this looks fine.

This revision is now accepted and ready to land.Sep 14 2017, 2:06 PM
This revision was automatically updated to reflect the committed changes.

Thanks for the review.