This is an archive of the discontinued LLVM Phabricator instance.

[Sema] Fix infinite typo correction loop.
ClosedPublic

Authored by vsapsai on May 24 2018, 11:07 AM.

Details

Summary

NumTypos guard value ~0U doesn't prevent from creating new delayed typos. When
you create new delayed typos during typo correction, value ~0U wraps around to
0. When NumTypos is 0 we can miss some typos and treat an expression as it can
be typo-corrected. But if the expression is still invalid after correction, we
can get stuck in infinite loop trying to correct it.

Fix by not using value ~0U so that NumTypos correctly reflects the number of
typos.

rdar://problem/38642201

Diff Detail

Repository
rC Clang

Event Timeline

vsapsai created this revision.May 24 2018, 11:07 AM
arphaman added inline comments.May 24 2018, 3:50 PM
clang/test/SemaCXX/typo-correction-delayed.cpp
146 ↗(On Diff #148456)

Loosing typo correction for 'getX' is fine, however, I think we still want to typo-correct 'variableX' to 'variable' here.

rsmith added a subscriber: rsmith.May 24 2018, 4:38 PM

Rather than disabling typo correction in TransformTypos, it would be preferable to attempt to immediately correct them. (That should allow the variableX.getX() case to still work.)

vsapsai planned changes to this revision.May 25 2018, 7:19 PM

After looking into this more, I think there are 2 different bugs: one with infinite loop and another with DelayedTypos.empty() && "Uncorrected typos!" assertion. And disabling typo correction happened to fix both of them.

Infinite loop seems to be avoidable by not using ~0U as the guard value, need to investigate further.

And uncorrected dangling delayed typo bug has a different cause. When we are checking potential corrections, we have roughly the following behaviour

1. structVarTypo -> structVar; structVarTypo2 -> structVar;
     after correction discover more typos
     fieldNameTypo -> fieldName; fieldNameTypo2 -> fieldName;  // Overall correction fails but newly discovered typos are processed and removed.

2. structVarTypo -> <empty correction>; structVarTypo2 -> structVar;
     correction fails early, don't discover more typos

3. structVarTypo -> structVar; structVarTypo2 -> <empty correction>;
     correction fails but we discover fieldNameTypo and the way correction fails we don't attempt to correct this new delayed typo

4. structVarTypo -> <empty correction>; structVarTypo2 -> <empty correction>;
     correction fails early, don't discover more typos

So the typo from step 3 is the one that remains till ~Sema.

I've spent some time looking into Richard's suggestion to correct typos immediately. That gets more involved than I expected and I want to finish investigating my other ideas. Now I think that my original approach with disabling typo correction just hides the issue instead of fixing it. And I feel like immediate typo correction can be hiding the actual issue too but it is too early to say.

vsapsai updated this revision to Diff 148841.May 28 2018, 6:22 PM
  • Fix only infinite loop and don't touch the assertion failure.

New commit message should be:

[Sema] Fix infinite typo correction loop.

NumTypos guard value ~0U doesn't prevent from creating new delayed typos. When
you create new delayed typos during typo correction, value ~0U wraps around to
0. When NumTypos is 0 we can miss some typos and treat an expression as it can
be typo-corrected. But if the expression is still invalid after correction, we
can get stuck in infinite loop trying to correct it.

Fix by not using value ~0U so that NumTypos correctly reflects the number of
typos.

rdar://problem/38642201

Some debugging details that can be useful but too specific for the commit message. When we have infinite loop, the steps are:

  1. Detect typos structVar1, structVar2
  2. Correct typos structVar1, structVar2
    1. Replace structVar1 with structVar. Add corresponding object to TransformTypos::TypoExprs.
    2. Detect typo fieldName1. NumTypos becomes 0.
    3. Try to CorrectDelayedTyposInExpr for fieldName1 TypoExpr. As there are no typos, no expression transformation.
    4. Replace structVar2 with structVar. Add corresponding object to TransformTypos::TypoExprs.
    5. Detect typo fieldName2. NumTypos becomes 1.
      1. Replace fieldName2 with fieldName
    6. Figure out that structVar.fieldName.member2 is invalid, CheckAndAdvanceTypoExprCorrectionStreams
  3. Try different TypoCorrection for structVar1. As it is empty, the correction fails.
  4. CheckAndAdvanceTypoExprCorrectionStreams. Reset correction stream for structVar1 and keep going with typo correction because structVar2 typo correction stream isn't finished.
  5. Try different TypoCorrection for structVar1.
    1. Replace structVar1 with structVar
    2. Detect typo fieldName1
      1. Replace fieldName1 with fieldName
    3. Figure out that structVar.fieldName.member1 is invalid, CheckAndAdvanceTypoExprCorrectionStreams
  6. Try different TypoCorrection for structVar1. As it is empty, the correction fails.
  7. CheckAndAdvanceTypoExprCorrectionStreams over and over again.

After my fix the steps are:

  1. Detect typos structVar1, structVar2
  2. Correct typos structVar1, structVar2
    1. Replace structVar1 with structVar. Add corresponding object to TransformTypos::TypoExprs.
    2. Detect typo fieldName1. NumTypos becomes 3.
      1. Replace fieldName1 with fieldName
    3. Figure out that structVar.fieldName.member1 is invalid, CheckAndAdvanceTypoExprCorrectionStreams
  3. Try different TypoCorrection for structVar1. As it is empty, the correction fails.
  4. All available typo corrections were tried because TransformTypos::TypoExprs contains only structVar1. Complete the typo correction.
vsapsai retitled this revision from [Sema] Disable creating new delayed typos while correcting existing. to [Sema] Fix infinite typo correction loop..Jun 11 2018, 9:58 AM
vsapsai edited the summary of this revision. (Show Details)

Ping.

rsmith accepted this revision.Jun 25 2018, 3:53 PM
This revision is now accepted and ready to land.Jun 25 2018, 3:53 PM
This revision was automatically updated to reflect the committed changes.

Thanks for the review, Richard.