This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Fix HitMapping assertion in Tokens.cpp
Needs ReviewPublic

Authored by vabridgers on Apr 5 2020, 11:49 AM.

Details

Summary

Extend test cases for tokens, and remove assertion that is unneeded and
hitting in Tokens.cpp.

Fixes Bugzilla https://bugs.llvm.org/show_bug.cgi?id=45428

Diff Detail

Event Timeline

vabridgers created this revision.Apr 5 2020, 11:49 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 5 2020, 11:49 AM
vabridgers updated this revision to Diff 255193.Apr 5 2020, 1:40 PM

Remove extraneous test case

Thanks for tracking this down. Unfortunately I don't think removing the assertion is enough, the code doesn't handle this case correctly.

clang/unittests/Tooling/Syntax/TokensTest.cpp
487

This is the case that always worked, right? Not sure we need to add this test.

505

From the linked bug, the varags isn't essential to this crash.
The linked example can be simplified a little further by removing foo:

#define N 1
#define ID(X) X
#define ID2 ID
ID2(N)

This makes it clearer that the trigger is using a macro in a arg to an "indirect" function macro (i.e. the function macro's name is expanded from another macro).

519

The problem with just removing the assertion is that this behavior is wrong: it's mapping M0(, onto { 42 ; } - i.e. the whole M0 macro expansion. Then all the other mappings are empty. Compare to the baseline case above...

This should be [M0_42, ;_46)] => [{_6, ;_10] I think.

Oh dear, this seems like a design bug.

TokenBuffer only attempts to record "top-level" expansions, i.e. FOO(BAR(BAZ)) will record the expansion of FOO, and say the tokens FOO ( BAR ( BAZ ) ) were expanded into some other thing in an essentially-opaque way.
The problem is this conflates two notions of "top-level"

  • macro use is in main file (this is what the PPCallbacks checks)
  • tokens emitted by the expansion are in the final expanded token stream (this is what makes the code correct, and I think what we're asserting)

The second seems more fundamental, so we should try patching the code to avoid the former. But I worry there are subtle assumptions of this scattered through...

Thank you for the comments. I'll keep looking at this to find a proper fix, but any hints you may have would be greatly appreciated (debugging tips, strategies, areas of code to focus on). This code is new to me, so I may not be as efficient at tracking this down as yourself or Ilya. Best!

Thank you for the comments. I'll keep looking at this to find a proper fix, but any hints you may have would be greatly appreciated (debugging tips, strategies, areas of code to focus on). This code is new to me, so I may not be as efficient at tracking this down as yourself or Ilya. Best!

Hope you don't mind, I ended up writing up a fix myself. (Well, I got frustrated with TokenBuffer's internals being complicated, and rewrote some of it in a way that makes this bug easy to fix.) D77615 is the proposed fix for this bug, and D77614 is the required refactoring.

FWIW I don't have much great advice about debugging this stuff beyond minimizing the example as far as possible and instrumenting with lots of printfs...

ayup, no problem and thanks for the fix. I'll abandon this change. Best!

sammccall resigned from this revision.Sep 18 2020, 2:04 AM