This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Handle PresumedLocations in IncludeCollector
ClosedPublic

Authored by kadircet on Apr 23 2020, 11:43 AM.

Details

Summary

This will enable extraction of correct line locations in preamble patch
for includes.

Diff Detail

Event Timeline

kadircet created this revision.Apr 23 2020, 11:43 AM
sammccall added inline comments.Apr 30 2020, 5:01 PM
clang-tools-extra/clangd/Headers.cpp
40

Is it possible that doing this for every include we process in the preamble might be significantly expensive?

You could consider doing something like:

FileID BuiltinFile;
bool InBuiltin;
void FileChanged(Loc, Reason, _, PrevFID) override {
  if (Reason == EnterFile && !BuiltinFile && SM.isWrittenInBuiltInFile(Loc)) {
    BuiltinFile = SM.getFileID(Loc);
    InBuiltin = true;
  } else if (Reason == ExitFile && InBuiltinFile && BuiltinFile == PrevFID) {
    InBuiltin = false;
  }
}

and then only run this for InBuiltin. (In practice fairly soon during processing, BuiltinFile will be set and InBuiltin will have become false, so all of this will be short-circuited)

41

wouldn't be SM.getFileEntryForID(PreLoc.FID) be more efficient?

43

could use a comment to explain what's going on at this point: we're transforming the parameters so we can process this as if it occurred in the main file.

But actually it might be clearer to extract a function that both cases can call.

45

This part looks a little iffy to me, with all the coordinate transforms.

If we're synthesizing the include, chars don't have to match 1:1 right?
e.g. if the original code was # include /* foo */ "bar.h" // baz and we synthesize #include "bar.h", how is this going to get the coordinates of "bar.h" right?

This seems awkward to resolve. R isn't actually used much though, go-to-definition looks at its line number only, and DocumentLink uses it (but it seems OK to just to do approximate re-lexing there). Maybe we can just drop it?


(Original comment disregarding above problem)

Isn't it the case that the filename expansion location has to be in the same file as the hashloc?
So can we do something like:

FilenameRange = SM.getExpansionRange(FilenameRange);
if (SM.getFileID(FilenameRange.start()) == SM.getFileID(FilenameRange.end()) == SM.getFileID(OrigHashLoc)) {
  // all in the same file
  // compute NewStart = OrigStart - OrigHashLoc + NewHashLoc, etc
} else {
  FilenameRange = CharSourceRange();
}
kadircet added inline comments.May 1 2020, 2:14 PM
clang-tools-extra/clangd/Headers.cpp
41

well, that would've been even easier to just check for PreLoc.FID == SM.getMainFileID(), but unfortunately PresumedLoc.FID is invalid in presence of line directives :/

but we can do something like storing the MainFilePath and then just compare the PreLoc.getFileName with it.

45

This part looks a little iffy to me, with all the coordinate transforms.

If we're synthesizing the include, chars don't have to match 1:1 right?
e.g. if the original code was # include /* foo */ "bar.h" // baz and we synthesize #include "bar.h", how is this going to get the coordinates of "bar.h" right?

well, the patching actually ensures both # and "filename" starts at the correct offset, by padding them with whitespaces ignoring any comments and such.

This seems awkward to resolve. R isn't actually used much though, go-to-definition looks at its line number only, and DocumentLink uses it (but it seems OK to just to do approximate re-lexing there). Maybe we can just drop it?

I am fine with dropping it too, the padding looks really ugly in the patching code :D.

Regarding go-to-def, I suppose we can keep storing the include line, since we calculate it anyway while getting the presumed location for HashLoc.

For DocumentLink, I suppose we can either lex while handling the request or store those separately in parsedast. I would go with the former.

WDYT?

kadircet marked 5 inline comments as done.May 4 2020, 2:52 AM
kadircet added inline comments.
clang-tools-extra/clangd/Headers.cpp
45

sent out D79315.

kadircet updated this revision to Diff 261765.May 4 2020, 2:52 AM
kadircet marked an inline comment as done.
  • Perform line mapping only for includes found through built-in file.
  • Drop filename range mapping.
  • Make use of FileChanged events to track being in built-in file.
sammccall accepted this revision.May 5 2020, 2:16 AM

LG, Just readability stuff.

clang-tools-extra/clangd/Headers.cpp
39

nit: Inc -> Include

39

this structure is not terrible but the reverse-chronological-order hurts readability a bit.

You could try:

// If an include is part of the preamble patch, translate #line directives.
if (BuiltinFileInStack) {
  auto PreLoc = ...;
  if (...) {
    HashLoc = ...; // now we'll hit the case below
  }
}
// Record main-file inclusions (including those mapped from the preamble patch).
if (isInsideMainFile(HashLoc, SM)) {
  // add the include
}
55

can -> might, line -> #line, ... as part of a preamble patch
(just to hint a little more clearly who created the mapping)

57

PreLoc -> Presumed?
(these abbreviations aren't common)

58

For clarity I think we could do this only with an invalid file ID and pull out a function, ending up with:

if (Presumed.getFileID().isInvalid() && isMainFile(Presumed.getFilename()))
  AddMainFileInc(HashLoc);
65

nit: I'd find it slightly clearer to handle this case first in the if/else, as it kind of "motivates" the more complicated one.

69

maybe delimit this with a comment like
// Record include graph (not just for main-file includes)

107

Using the name stack for a boolean was pretty confusing to me. I think InBuiltinFile plus the comment would be enough, otherwise maybe UnderBuiltinFile or so?

clang-tools-extra/clangd/unittests/HeadersTests.cpp
309

I don't think testPath is needed here: it should be on the include path, and FS.Files adds the qualifiers where needed

314

Can we use more aribtrary placeholder here, like 42? :)
We could easily hit 1 by mistake.

324

changing plus to comma here seems clearer

This revision is now accepted and ready to land.May 5 2020, 2:16 AM
kadircet updated this revision to Diff 262057.May 5 2020, 4:01 AM
kadircet marked 13 inline comments as done.
  • Address comments
clang-tools-extra/clangd/Headers.cpp
39

well this was the initial version :D moved away from it since you asked for an explicit function both cases can call, reverting back to it.

clang-tools-extra/clangd/unittests/HeadersTests.cpp
314

unfortunately we can't, at least not without having that line available in the main file. Since we need to get a real offset into it in the end.

changing it to be line 3 as a middle ground :P

kadircet updated this revision to Diff 262324.May 6 2020, 2:41 AM
  • Try to fix windows builds
kadircet marked an inline comment as done.May 6 2020, 2:42 AM
This revision was automatically updated to reflect the committed changes.
kadircet reopened this revision.May 6 2020, 7:27 AM
This revision is now accepted and ready to land.May 6 2020, 7:27 AM
kadircet updated this revision to Diff 262374.May 6 2020, 7:27 AM
  • use basename in tests as having unescaped backslashes breaks windows build bots. going to address that separetely in line directive generation logic.
This revision was automatically updated to reflect the committed changes.