This is an archive of the discontinued LLVM Phabricator instance.

[SourceManager] isPointWithin: avoid using isBeforeInTranslationUnit, compare buffer offsets directly for lexical correctness
AbandonedPublic

Authored by arphaman on Aug 14 2018, 2:55 PM.

Details

Summary

The current implementation of isPointWithin uses isBeforeInTranslationUnit to check if a location is within a range.
The problem is that isPointWithin is used to perform is lexical range check, i.e. we want to determine if a location in a particular source file is within a source range in the same source file.
The use of isBeforeInTranslationUnit is not correct for this use-case, as one source file can be included multiple times. Given a source range and a location that's lexically contained within that source range (e.g. file.h:1:2->1:10 and file:h:1:5), isBeforeInTranslationUnit will fail to correctly determine if a source range from the first occurrence of the source file contains a location from the second occurrence of the same source file, as the start and end of the range are both included before the location.
This patch reimplements isPointWithin to ensure that buffer offsets are compared directly for lexical correctness.

This patch is a pre-requisuite to fix a bug in clang-rename where we are unable to rename a structure in a PCH that has a header that's included multiple times (follow-up patch will be posted to update clang-rename to use isPointWithin).

rdar://42746284

Diff Detail

Repository
rC Clang

Event Timeline

arphaman created this revision.Aug 14 2018, 2:55 PM

Hi Alex, nice work!

I am just wondering if it would be beneficial to assert Loc and End are in the same file. It might help to catch bugs.

I also stumbled upon this function but not sure it makes things significantly cleaner here:
https://clang.llvm.org/doxygen/SourceLocation_8h_source.html#l00175

LGTM otherwise.

ioeric added inline comments.Aug 16 2018, 4:34 AM
lib/Basic/SourceManager.cpp
2035

Why do we disallow locations from macro expansions?

2049

nit: the pattern seems repetitive. Maybe pull out a helper/lambda like: isBeforeInSameFile(L1, L2)?

Hi Alex, nice work!

I am just wondering if it would be beneficial to assert Loc and End are in the same file. It might help to catch bugs.

I don't see the value in that unless I'm misunderstanding something. We already check if Loc and End are in the same file, and return false if they're not.

I also stumbled upon this function but not sure it makes things significantly cleaner here:
https://clang.llvm.org/doxygen/SourceLocation_8h_source.html#l00175

LGTM otherwise.

arphaman marked 2 inline comments as done.Aug 16 2018, 3:28 PM
arphaman added inline comments.
lib/Basic/SourceManager.cpp
2035

I changed the patch to allow them by working with spelling locs.

arphaman updated this revision to Diff 161132.Aug 16 2018, 3:29 PM
arphaman marked an inline comment as done.
  • Use lambda
  • Work with spelling locs
rsmith added a subscriber: rsmith.Aug 16 2018, 3:48 PM

Hi Alex, nice work!

I am just wondering if it would be beneficial to assert Loc and End are in the same file. It might help to catch bugs.

I don't see the value in that unless I'm misunderstanding something. We already check if Loc and End are in the same file, and return false if they're not.

I think we should assert that Start and End are in the same file. If not, the query doesn't make any sense; the range is malformed. And maybe we should also assert that Start is before End in that file.

include/clang/Basic/SourceManager.h
1585–1586

Please rename this function to make it clearer what it's doing. The old implementation is the obvious one for a function with this name. Maybe isSpellingLocInFileRange or similar?

lib/Basic/SourceManager.cpp
2038–2050

Does this do the right thing if both points are within (eg) macro expansion ranges in the same file? Eg, given:

// A
#define FOO BAR
// B
FOO
// C

is the macro-expanded BAR token between B and C? Is it between A and B?

I would guess that the intended semantics here are:

  • Start and End are expected to be file locations in the same inclusion of the same file, with Start occuring no later than End.
  • Loc is an arbitrary location.
  • Function returns true if the spelling location of Loc is in the same source file as Start and End, and occurs between them; false otherwise.

Is that right? If so, please write that down somewhere; the current function comment is too vague. (And if not, please write down what the intended rule is.)

Hmm, after more analysis I realized that this is not the right solution for the rename problem. It would be correct to map one file:line:column location to a set of SourceLocations, and to use the old isPointWithin on a set of such locations. I opened https://reviews.llvm.org/D50926 instead. Sorry for the trouble!

arphaman abandoned this revision.Aug 17 2018, 2:18 PM