This is an archive of the discontinued LLVM Phabricator instance.

[libTooling] Add function to determine associated text of a declaration.
ClosedPublic

Authored by ymandel on Jan 3 2020, 7:01 AM.

Details

Summary

This patch adds getAssociatedRange which, for a given decl, computes preceding
and trailing text that would conceptually be associated with the decl by the
reader. This includes comments, whitespace, and separators like ';'.

Diff Detail

Event Timeline

ymandel created this revision.Jan 3 2020, 7:01 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 3 2020, 7:01 AM
ymandel updated this revision to Diff 236057.Jan 3 2020, 7:08 AM

clang-format

ymandel updated this revision to Diff 236879.Jan 8 2020, 11:49 AM

Fixed behavior in some corner cases; added tests

ymandel updated this revision to Diff 237044.Jan 9 2020, 5:56 AM

tweaked test organization

ymandel updated this revision to Diff 237051.Jan 9 2020, 6:09 AM

fold initLexer into the callsite, because it's only called once.

gribozavr2 added inline comments.
clang/lib/Tooling/Transformer/SourceCode.cpp
149

Parenthetical is not closed.

190

What if the token is not a terminator? It seems like the loop will keep looking for a terminator, but I don't understand why -- based on the doc comment, it seems like the intent is that this function will only extend the range with extra terminators.

217

I don't understand this case. We can only reach this loop if we already found one terminator. If semicolon is a terminator, fine, we can consume an extra one. However, if comma is a terminator, if we enter this "case tok::comma", it means we found two commas in a row, which is probably a syntax error.

315

Please add tests for templates. (In particular, out-of-line definitions of member templates, which have two template argument lists.)

333

Need tests for cases described by this comment.

373

s/getExpansionRange/makeFileCharRange/? Although I don't think it does that, because its comment says "... the function will fail because the range overlaps with only a part of the macro".

clang/unittests/Tooling/SourceCodeTest.cpp
188

If you feel motivated, it could be beneficial to lift this member into TestVisitor itself (and change RunOver to accept llvm::Annotations).

202

s/newline/semicolon/

249

It assumes that it is invoked with the code that corresponds to this->Code. Therefore the argument is redundant, I think.

269

But there's no whitespace.

296

"Does not include comments when only the decl or the comment come from a macro."

ymandel updated this revision to Diff 241124.Jan 29 2020, 5:35 AM
ymandel marked 4 inline comments as done.

addressed comments, with significant reworking of tests.

ymandel updated this revision to Diff 241127.Jan 29 2020, 5:48 AM
ymandel marked 12 inline comments as done.

tweaks

ymandel marked an inline comment as done.Jan 29 2020, 5:50 AM

Thank you for the detailed review. I've significantly expanded and refactored the tests. I also lifted validateEditRange into its own function and added corresponding tests.

clang/lib/Tooling/Transformer/SourceCode.cpp
190

The intent is also to extend with comments and whitespace. Since the lexer is keeping whitespace (line 116), the whitespace will show up as tokens. I do not believe there is any documentation on the semantics of this mode, fwiw -- I had to read the code. Let me know if you think it would help to spell out more about the operation of the keep-whitespace mode.

217

I've added comments and additional logic (captured by TerminatedByMacro) to explain/refine this case. PTAL.

373

makeFileCharRange is right, even if it will result in an error. Basically, if the attribute is the start of the macro expansion, then makeFileCharRange will expand it to an valid range. Otherwise, it will return an invalid range, so we will as well. I've extended the comments in the header to call out this possibility of returning an invalid range.

clang/unittests/Tooling/SourceCodeTest.cpp
188

Possibly. For now, I've lifted into a class template shared by most of the new tests. I think it would be good to followup with a patch that moves it to TestVisitor but that may take more design work than I want to bundle in this patch.

269

the whitespace trails the *decalaration*.

gribozavr2 accepted this revision.Feb 25 2020, 8:07 AM
gribozavr2 added inline comments.
clang/include/clang/Tooling/Transformer/SourceCode.h
42

s/and/an/

clang/lib/Tooling/Transformer/SourceCode.cpp
152

Indent -2.

156

s/that we know to be deleted/that is being deleted/

164

s/These terminators these/These terminators/

282

I'm not sure what code implements the "if the first character is a newline" logic. The for loop below checks if LocChars starts with horizontal whitespace followed by vertical whitespace.

Also, I don't see where we find an *empty* line. I understand an empty line as zero or more horizontal whitespace characters between two vertical whitespace characters.

This revision is now accepted and ready to land.Feb 25 2020, 8:07 AM
ymandel updated this revision to Diff 246556.Feb 25 2020, 1:30 PM
ymandel marked 8 inline comments as done.

Responded to all comments. No functional changes.

Thanks for the detailed review, especially given the complexity of the code!

clang/lib/Tooling/Transformer/SourceCode.cpp
152

I re-read this comment and I don't think it's helpful -- it's a holdover from the usecase where the code is deleting the decl,. instead of its more general use of just identifying the associated text. Moreover, the comment that follows does a full explanation of the role of this variable. So, I've deleted the comment and moved the declaration to follow the comment and lead the block of code that follows, which seems the appropriate location.

156

Thanks. Here too I rewrote the comment a bit to avoid discussing deletion specifically.

282

Good catch. The comment hadn't been updated since the first version, while the code has changed. There's an implicit invariant on how it's called that guarantees that if the rest of the line is whitespace and newline then the preceding character was a newline.

However, I think that's far too complicated an invariant to rely on implicitly, so I've updated the code to check it explicitly. Now, it only relies on Loc not being the first of the file (which seems reasonable given that this function is used for exclusive end locations -- meaning, there should always be something that precedes them).

This revision was automatically updated to reflect the committed changes.
ymandel reopened this revision.Feb 26 2020, 6:47 AM

Reopening for fix to failing tests that resulted in revert.

This revision is now accepted and ready to land.Feb 26 2020, 6:47 AM
ymandel updated this revision to Diff 246706.Feb 26 2020, 6:47 AM

Fix failing test under msvc compatibility.

Adds -fno-delayed-template-parsing to compilation arguments in tests.

This revision was automatically updated to reflect the committed changes.