This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Provide patched diagnostics with preamble patch
ClosedPublic

Authored by kadircet on Feb 1 2023, 10:36 AM.

Details

Summary

Translates diagnostics from baseline preamble to relevant modified
contents.

Translation is done by looking for a set of lines that have the same
contents in diagnostic/note/fix ranges inside baseline and modified
contents.

A diagnostic is preserved if its main range is outside of main file or
there's a translation from baseline to modified contents. Later on fixes
and notes attached to that diagnostic with relevant ranges are also
translated and preserved.

Depends on D143095

Diff Detail

Event Timeline

kadircet created this revision.Feb 1 2023, 10:36 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 1 2023, 10:36 AM
Herald added a subscriber: arphaman. · View Herald Transcript
kadircet requested review of this revision.Feb 1 2023, 10:36 AM

It looks like this fixes up the location only of diagnostics attached to particular directives (#include) based on some "deep" idea about the content of the directive (the spelled header name).

Some shortcomings:

  • This misses diagnostics attached to other directives / continued lines in macro definitions / etc
  • it allows diagnostics to be translated across lines even if the directive spelling changed (in which case the ranges are incorrect since only line numbers are updated.)
  • it requires modelling the directive content in some way that needs to be extended if we find another place that diagnostics can be attached

We discussed the idea of attaching to the text of the line, which seems pretty generic. Needs some handling of duplicate line content but seems like something pretty naive (closest line number with matching content?) would work well, be just as simple and more generic. Any reason this alternative was rejected?

clang-tools-extra/clangd/Preamble.h
162

this looks like a gratuitous copy - ParsedAST copies again

nridge added a subscriber: nridge.Feb 4 2023, 7:17 PM
kadircet marked an inline comment as done.Feb 7 2023, 1:06 AM

It looks like this fixes up the location only of diagnostics attached to particular directives (#include) based on some "deep" idea about the content of the directive (the spelled header name).

Some shortcomings:

  • This misses diagnostics attached to other directives / continued lines in macro definitions / etc
  • it allows diagnostics to be translated across lines even if the directive spelling changed (in which case the ranges are incorrect since only line numbers are updated.)
  • it requires modelling the directive content in some way that needs to be extended if we find another place that diagnostics can be attached

We discussed the idea of attaching to the text of the line, which seems pretty generic. Needs some handling of duplicate line content but seems like something pretty naive (closest line number with matching content?) would work well, be just as simple and more generic. Any reason this alternative was rejected?

agreed this is better, also somewhat more elegant on the implementation side. switching to a line-content based translation, while keeping the logic around limiting diagnostics to single lines. as multiple lines seem more wonky and unclear how common they're.

kadircet updated this revision to Diff 495411.Feb 7 2023, 1:06 AM
  • Change to a line based translation logic
sammccall added inline comments.Feb 8 2023, 12:13 AM
clang-tools-extra/clangd/ParsedAST.cpp
678

llvm::append_range(Diags, Patch->patchedDiags()) ?

clang-tools-extra/clangd/Preamble.cpp
369

llvm::append_range(SP.Lines, llvm::split(Contents, "\n")) ?

631

nit: comment sense is the opposite of the function name

maybe say why this is an important property, or give the function a higher-level name?

632

this function + translateDiag look like a validate/parse pair that need to be kept in sync.

they could be a single translateDiag() function that could fail, instead.

637

this check looks dubious to me:

  • File is a human-readable string with ill-defined format, comparing them for equality isn't safe, you want InsideMainFile instead
  • if the note points to another file, then it's not that translation is impossible, but rather no translation is needed
  • if the note points to the same file, then why can't we apply the same translation?

I think this should rather be something like:

bool translateDiag(DiagBase& D, bool IsMainDiag) {
  if (IsMainDiag) {
    for (const auto& N : cast<Diag>(D).Notes) {
      if (!translateDiag(N, /*IsMainDiag=*/false))
        return false;
    }
  }
  if (D.InsideMainFile) {
    // attempt patching range of D itself
  }
}
672

nit: name seems backwards (but as noted below I think this function can go away entirely)

697

If I'm understanding the algorithm right:

  • we make a map of line content => original line number
  • we iterate through each line of the modified preamble, and:
    • find the corresponding original line
    • look for diagnostics to translate

Why not loop over diagnostics instead of lines in the modified preamble? There should be a lot fewer, and you avoid the lines-to-diags map.

699

large preamble with no diagnostics seems pretty plausible, maybe bail out early to avoid building the content map?

762–763

I think this change and the IncludeKey class can be reverted now?

clang-tools-extra/clangd/Preamble.h
160

the stuff about #include/#define doesn't seem to correspond to anything in the code (obsolete?)
and below

174–175

if you want to undoxygen this comment, also remove \p?

sammccall added inline comments.Feb 8 2023, 12:45 AM
clang-tools-extra/clangd/Preamble.cpp
640

Oh yeah, one more thing :-)

Currently if the note/fix aren't at this line we're bailing out of the diagnostic entirely.

We have three behaviors available:

  1. patch the note/fix range as best we can and emit it
  2. drop the note/fix but keep the diagnostic
  3. drop the diagnostic entirely

I think we should generally prefer 1 when we can, maybe there's a case for choosing 2 sometimes (heuristics about when diagnostics have probably been invalidated, with comments), and we should ~never do 3.

kadircet updated this revision to Diff 496900.Feb 13 2023, 2:51 AM
kadircet marked 12 inline comments as done.
  • Change patching logic to iterate over diags instead of preamble lines
  • Change translation logic to:
    • Preserve a diagnostic if its range can be translated.
    • Preserve all the notes whose range can be translated.
    • Preserve all the fixes whose edit ranges can be translated.
  • Drop restrictions on ranges being a single line, make sure contents for all the lines are matched between modified and basline contents.
  • Add couple new tests to demonstrate what's preserved.
kadircet updated this revision to Diff 496905.Feb 13 2023, 2:58 AM
  • Reflow comments
kadircet updated this revision to Diff 496907.Feb 13 2023, 3:03 AM
kadircet edited the summary of this revision. (Show Details)
  • Update commit message
sammccall added inline comments.Feb 16 2023, 11:03 AM
clang-tools-extra/clangd/Preamble.cpp
633

I think this function is too long with too many local lambdas, consider converting it to a class instead (call me old-fashioned!)

653

we're assuming that the ranges are within the preamble and everyone agrees about the bounds. If not, BaselineStart may not be a valid index into BaselineScan.Lines

This assumption seems broadly reasonable, but is *very much* not locally enforced. I'd suggest a defensive check or at least an assert.

656

this doesn't look right: you're first deciding which possible starting point is closest, and then deciding whether it matches. So a range that matches can be masked by a range where only the first line matches, if the latter is closer.

691

probably doesn't matter, but you're inconsistent about moving vs copying range

714

reasoning about the fields you're *not* rewriting feels fragile. (Here and in TranslateFix).

Consider copying the whole object and mutating in place (bool TranslateDiag(Diag&) together with erase_if)

clang-tools-extra/clangd/Preamble.h
161

nit: blank line after

clang-tools-extra/clangd/unittests/PreambleTests.cpp
765

i'm confused about what this comment is getting at - a note without a diag is not even representable

kadircet updated this revision to Diff 498308.Feb 17 2023, 3:19 AM
kadircet marked 6 inline comments as done.
  • Move patch translation logic to a separate class
  • Perform eager copies and use bool translateX(X&) type of APIs instead of returning optionals.
  • Perform multi line content check for each alternative, not just the closest match.
clang-tools-extra/clangd/Preamble.cpp
633

well it's gonna be an equally long class, but sure :D

653

oops, thanks! both here and also the multi-ine check below.

656

i thought because this would be rare it's fine to not do that, but the same applies in the other direction as well :D
moved the content based matching inside the loop to consider a line as alternative only after it matches the contents.

714

well i am a little worried about copying around notes/fixes just to drop them again (which can have quite some strings). but probably rare enough in practice to not matter.

clang-tools-extra/clangd/unittests/PreambleTests.cpp
765

well, it is to make sure we're not promoting an orphaned note to a main diag. updated the comment but happy to drop if you think it's confusing rather than being helpful.

sammccall accepted this revision.Feb 17 2023, 4:33 AM

LG thanks!

clang-tools-extra/clangd/Preamble.cpp
490

up to you, but I find avoiding iterators easier to read

for (int AlternateLine : CurrentContentsToLine.lookup(OldLines[OldStart]))

(or with extracted variables)

503

why not just if (RangeContents != CurrentLines.slice(...))? I feel like I'm missing something subtle :-)

Also if you replace the slice(a, b) with drop_front(a).take_front(b) then you don't need the CurrentEnd > CurrentLines.size() check

511

if you make Closest the delta rather than the line number this gets a bit easier to follow IMO:
abs(Closest) < abs(Delta) here
R.start.line += Closest; R.end.line += Closest; below

You need to make Closest an optional<int> but that's also clearer than a sentinel -1 for me

562

FWIW if we're willing to pay a copy for a successful patch, I don't think it's worth any hassle to avoid the copy for a failed patch. Up to you though.

This revision is now accepted and ready to land.Feb 17 2023, 4:33 AM
kadircet updated this revision to Diff 498351.Feb 17 2023, 6:33 AM
kadircet marked 4 inline comments as done.
  • Use direct comparison of ArrayRefs
  • optional for closest
  • loop over lookup results rather than through iterator

thanks for the review!

clang-tools-extra/clangd/Preamble.cpp
503

why not just if (RangeContents != CurrentLines.slice(...))? I feel like I'm missing something subtle :-)

no that's absolutely right, i guess i had a brain fart 😂

This revision was automatically updated to reflect the committed changes.

Hello,

I noticed that if I compile the compiler with ubsan, then lots of Clangd tests start failing with this patch:

Failed Tests (165):
  Clangd :: ast-no-range.test
  Clangd :: ast.test
  Clangd :: call-hierarchy.test
  Clangd :: check-lines.test
  Clangd :: check.test
  Clangd :: code-action-request.test
  Clangd :: completion-auto-trigger.test
  Clangd :: completion-snippets.test
  Clangd :: completion.test
  Clangd :: config.test
  Clangd :: crash-parse.test
  Clangd :: dependency-output.test
  Clangd :: diagnostic-category.test
  Clangd :: diagnostics-no-tidy.test
  Clangd :: diagnostics-notes.test
  Clangd :: diagnostics-tidy.test
  Clangd :: did-change-configuration-params.test
  Clangd :: execute-command.test
  Clangd :: filestatus.test
  Clangd :: fixits-codeaction.test
  Clangd :: fixits-command.test
  Clangd :: fixits-embed-in-diagnostic.test
  Clangd :: folding-range.test
  Clangd :: formatting.test
  Clangd :: hover.test
  Clangd :: implementations.test
  Clangd :: inlayHints.test
  Clangd :: memory_tree.test
  Clangd :: metrics.test
  Clangd :: protocol.test
  Clangd :: references-container.test
  Clangd :: references.test
  Clangd :: rename.test
  Clangd :: request-reply.test
  Clangd :: selection-range.test
  Clangd :: semantic-tokens-refresh.test
  Clangd :: semantic-tokens.test
  Clangd :: signature-help-with-offsets.test
  Clangd :: signature-help.test
  Clangd :: symbol-info.test
  Clangd :: symbols.test
  Clangd :: target_info.test
  Clangd :: test-uri-posix.test
  Clangd :: textdocument-didchange-fail.test
  Clangd :: trace.test
  Clangd :: tweaks-format.test
  Clangd :: type-definition.test
  Clangd :: type-hierarchy-ext.test
  Clangd :: type-hierarchy.test
  Clangd :: unsupported-method.test
  Clangd :: utf8.test
  Clangd :: version.test
  Clangd :: xrefs.test
  Clangd Unit Tests :: ./ClangdTests/0/146
  [...]
  Clangd Unit Tests :: ./ClangdTests/99/146


Testing Time: 362.21s
  Skipped          :    47
  Unsupported      :  1094
  Passed           : 89208
  Expectedly Failed:   192
  Failed           :   165

Not sure if all fail the same way but there are a lot of ubsan complaints like this

07:38:09 ../lib/Support/MemoryBuffer.cpp:138:33: runtime error: null pointer passed as argument 2, which is declared to never be null
07:38:09 /usr/include/string.h:43:28: note: nonnull attribute specified here
07:38:09 SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../lib/Support/MemoryBuffer.cpp:138:33 in

thanks for the ping! see the discussion in https://reviews.llvm.org/D142890#4149679