This is an archive of the discontinued LLVM Phabricator instance.

Fix the bugs in the mapDiagnosticRanges (Still in progress)
ClosedPublic

Authored by zhengkai on Aug 26 2015, 2:04 PM.

Details

Summary

Use a backtracing alogirthm to match all possible source locations to match the FileID of the Caret location.
This can map the ranges back as much as possible.
You can see some new ranges underlined in the diagnostic information In the test cases modified. This can prove that the new method is effective.
(Solved https://llvm.org/bugs/show_bug.cgi?id=24592)

Also change the standard to judge if the macro backtraces should be reduced.
Using raw encodings to specify if the ranges are all in one argument of the macro invocations.
But this part is still questionable due to some bugs inside the preprocessor system. (See: https://llvm.org/bugs/show_bug.cgi?id=24788)

Diff Detail

Repository
rL LLVM

Event Timeline

zhengkai updated this revision to Diff 33244.Aug 26 2015, 2:04 PM
zhengkai retitled this revision from to Fix the bugs in the mapDiagnosticRanges (Still in progress).
zhengkai updated this object.
zhengkai added a reviewer: rtrieu.
zhengkai added a subscriber: cfe-commits.
zhengkai updated this revision to Diff 34588.Sep 11 2015, 2:19 PM
zhengkai updated this object.
rtrieu edited edge metadata.Sep 14 2015, 4:59 PM

Why are you leaking the raw encoding of locations from the SourceManager? This is an internal implementation detail and should not be relied on externally. SourceLocation is the typical way to use locations.

Why are you leaking the raw encoding of locations from the SourceManager? This is an internal implementation detail and should not be relied on externally. SourceLocation is the typical way to use locations.

Because I want to check if all the ranges fit in the same argument of one macro invocation, they can be different Source Locations. In ExpansionInfo, these locations have the same ExpansionLocStart.

Why are you leaking the raw encoding of locations from the SourceManager? This is an internal implementation detail and should not be relied on externally. SourceLocation is the typical way to use locations.

Because I want to check if all the ranges fit in the same argument of one macro invocation, they can be different Source Locations. In ExpansionInfo, these locations have the same ExpansionLocStart.

If you call ExpansionInfo::getExpansionLocStart(), then the SourceLocation will be the same if the ExpansionLocStart is the same.

Why are you leaking the raw encoding of locations from the SourceManager? This is an internal implementation detail and should not be relied on externally. SourceLocation is the typical way to use locations.

Because I want to check if all the ranges fit in the same argument of one macro invocation, they can be different Source Locations. In ExpansionInfo, these locations have the same ExpansionLocStart.

If you call ExpansionInfo::getExpansionLocStart(), then the SourceLocation will be the same if the ExpansionLocStart is the same.

But I still need to change the isMacroArg function to get the result of getExpansionLocStart(), and if I want to compare two SourceLocation, I still need to check the raw encoding. I don't know if it is needed not to leak the raw encoding.

Why are you leaking the raw encoding of locations from the SourceManager? This is an internal implementation detail and should not be relied on externally. SourceLocation is the typical way to use locations.

Because I want to check if all the ranges fit in the same argument of one macro invocation, they can be different Source Locations. In ExpansionInfo, these locations have the same ExpansionLocStart.

If you call ExpansionInfo::getExpansionLocStart(), then the SourceLocation will be the same if the ExpansionLocStart is the same.

But I still need to change the isMacroArg function to get the result of getExpansionLocStart(), and if I want to compare two SourceLocation, I still need to check the raw encoding. I don't know if it is needed not to leak the raw encoding.

I kind of see what you are trying to do here, and we should fix it a different way.

  1. Get rid of changes to ExpansionInfo
  2. Change the function SourceManager::isMacroArgExpansion to take two arguments instead of one. The new argument should be a SourceLocation pointer named MacroBegin.
  3. Change the last line of SourceManager::isMacroArgExpansion from "return Expansion.isMacroArgExpansion();" to "if (!Expansion.isMacroArgExpansion()) return false;" After that, add the code to get the macro begin location.
  4. Update DiagnosticRenderer to use SourceLocation instead of raw encodings.
zhengkai updated this revision to Diff 34826.Sep 15 2015, 12:20 PM
zhengkai edited edge metadata.

Use the rtrieu's method to re write the implementation of SourceManager.

Why are you leaking the raw encoding of locations from the SourceManager? This is an internal implementation detail and should not be relied on externally. SourceLocation is the typical way to use locations.

Because I want to check if all the ranges fit in the same argument of one macro invocation, they can be different Source Locations. In ExpansionInfo, these locations have the same ExpansionLocStart.

If you call ExpansionInfo::getExpansionLocStart(), then the SourceLocation will be the same if the ExpansionLocStart is the same.

But I still need to change the isMacroArg function to get the result of getExpansionLocStart(), and if I want to compare two SourceLocation, I still need to check the raw encoding. I don't know if it is needed not to leak the raw encoding.

I kind of see what you are trying to do here, and we should fix it a different way.

  1. Get rid of changes to ExpansionInfo
  2. Change the function SourceManager::isMacroArgExpansion to take two arguments instead of one. The new argument should be a SourceLocation pointer named MacroBegin.
  3. Change the last line of SourceManager::isMacroArgExpansion from "return Expansion.isMacroArgExpansion();" to "if (!Expansion.isMacroArgExpansion()) return false;" After that, add the code to get the macro begin location.
  4. Update DiagnosticRenderer to use SourceLocation instead of raw encodings.

Have changed the implementation.

rtrieu added inline comments.Sep 15 2015, 6:20 PM
include/clang/Basic/SourceManager.h
333 ↗(On Diff #34826)

Remove unrelated whitespace change.

1155–1164 ↗(On Diff #34826)

Merge the two isMacroArgExpansion functions by making the second argument a SourceLocation pointer with default argument of nullptr. See isAtStartOfImmediateMacroExpansion or isAtEndOfImmediateMacroExpansion functions for how this is done.

lib/Basic/SourceManager.cpp
1015–1024 ↗(On Diff #34826)

Merge with other isMacroArgExpansion function

lib/Frontend/DiagnosticRenderer.cpp
311 ↗(On Diff #34826)

No newline between comment and function. And what is the logic behind in this function that wasn't in the code it is replacing?

312 ↗(On Diff #34826)

Why not return a SourceLocation instead of passing one in by reference?

313 ↗(On Diff #34826)

Why is this passed by reference?

314–350 ↗(On Diff #34826)

Use more vertical whitespace when coding.

323 ↗(On Diff #34826)

In the other function, there is "End = SM->getImmediateExpansionRange(Backup).second;" Why is there not a "Begin = SM->getImmediateExpansionRange(Backup).first;" here?

331 ↗(On Diff #34826)

There is a lot of duplicated code between these two functions. Can they merged together with an option to select whether to check the start or end locations?

352–367 ↗(On Diff #34826)

Function not used; remove it.

417–418 ↗(On Diff #34826)

Can these function be rewritten to return SourceLocation's? Then it would be:
Begin = retrieveBeginLocation(Begin, BeginFileID, CaretLocFileID, SM);
End = retrieveEndLocation(End, EndFileID, CaretLocFileID, SM);

if (Begin.invalid() || End.invalid()) continue;

466 ↗(On Diff #34826)

What is this function doing that is not captured by the old isMacroArgExpansion function?

469 ↗(On Diff #34826)

Drop the "= SourceLocation". Default initialization of SourceLocation does the same thing.

471 ↗(On Diff #34826)

Use "PreLoc.isInvalid()" instead of "PreLoc == SourceLocation()".

504 ↗(On Diff #34826)

Why does the range count matter?

512 ↗(On Diff #34826)

Comment is out of date, no longer holding the raw encoding. Rename the variable to ArgumentLoc to convey what the SourceLocation is holding.

513 ↗(On Diff #34826)

Drop the "= SourceLocation()". Default initialization for SourceLocation does the same thing.

test/Misc/caret-diags-macros.c
210 ↗(On Diff #34826)

Why is Cstrlen changed to strlen_test?

zhengkai updated this revision to Diff 34943.Sep 16 2015, 5:01 PM
zhengkai marked 8 inline comments as done.
zhengkai added inline comments.
lib/Frontend/DiagnosticRenderer.cpp
312 ↗(On Diff #34826)

Because this function has two recursive calls inside.
So if return a SourceLocation here, I will need to use a temporal variable to store the result of the first call.
And this doesn't make it easier.

323 ↗(On Diff #34826)
Begin = SM->getImmediateMacroCallerLoc(Begin);

Use getImmediateMacroCallerLoc function here.

331 ↗(On Diff #34826)

I tried it but the code doesn't seem easier.

352–367 ↗(On Diff #34826)

deleted. But I wonder this function is important for those who want to debug this code.

417–418 ↗(On Diff #34826)

Replied before.

466 ↗(On Diff #34826)

I don't understand what you mean here.

504 ↗(On Diff #34826)

Because some ranges passed in are invalid.
And if the mapDiagnosticRanges function drops some ranges, this means this macro call doesn't contain all the information needed of all the ranges.

test/Misc/caret-diags-macros.c
210 ↗(On Diff #34826)

The preprocessor can't get right when the argument itself is a function-like macro. This bug is not due to the high-level macro case but due to this problem.

zhengkai marked 2 inline comments as done.Sep 17 2015, 3:49 PM
zhengkai added inline comments.
lib/Frontend/DiagnosticRenderer.cpp
455 ↗(On Diff #34943)

Because the function isMacroArgExpansion doesn't check if all locations are in the same argument location,

rtrieu added inline comments.Sep 22 2015, 10:04 PM
lib/Basic/SourceManager.cpp
1008 ↗(On Diff #34943)

Fix alignment so function arguments line up.

1013 ↗(On Diff #34943)

Add line break after this line.

lib/Frontend/DiagnosticRenderer.cpp
323 ↗(On Diff #34943)

You did not add that here.

331 ↗(On Diff #34943)

Using reference parameters to input/output data into functions is confusing and generally not done. SourceLocation is a small enough class that it is commonly used as return values and it has a built-in invalid value. Use the following function:

static SourceLocation retrieveMacroLocation(SourceLocation Loc,
                                            FileID MacroFileID,
                                            FileID CaretFileID,
                                            bool getBeginLoc,
                                            const SourceManager *SM) {
  if (MacroFileID == CaretFileID) return Loc;
  if (!Loc.isMacroID()) return SourceLocation();

  SourceLocation MacroLocation, MacroArgLocation;

  if (SM->isMacroArgExpansion(Loc)) {
    MacroLocation = SM->getImmediateMacroCallerLoc(Loc);
    MacroArgLocation = getBeginLoc ? SM->getImmediateExpansionRange(Loc).first
                                   : SM->getImmediateExpansionRange(Loc).second;
  } else {
    MacroLocation = getBeginLoc ? SM->getImmediateExpansionRange(Loc).first
                                : SM->getImmediateExpansionRange(Loc).second;
    MacroArgLocation = SM->getImmediateSpellingLoc(Loc);
  }

  MacroFileID = SM->getFileID(MacroLocation);
  MacroLocation = retrieveMacroLocation(MacroLocation, MacroFileID, CaretFileID,
                                        getBeginLoc, SM);
  if (MacroLocation.isValid()) return MacroLocation;

  MacroFileID = SM->getFileID(MacroArgLocation);
  return retrieveMacroLocation(MacroArgLocation, MacroFileID, CaretFileID,
                               getBeginLoc, SM);
}
352–367 ↗(On Diff #34943)

It is more important to have good comments than debug functions since that saves the trouble of running a debugger. If someone is debugging this code, this function would be in the wrong place. Instead, the SourceManager would be a better place to put it, and it saves the trouble of passing in a SourceManager object. Besides that, there special considerations for debugging functions, like annotating it properly so it only shows up in debug builds and not increase the size of regular builds. If you still feel it is important, then the work should go into another patch.

406–407 ↗(On Diff #34943)

Use:

Begin = retrieveMacroLocation(Begin, BeginFileID, CaretLocFileID,
                              true /*getBeginLoc*/, SM);
End = retrieveMacroLocation(End, BeginFileID, CaretLocFileID,
                            false /*getBeginLoc*/, SM);
if (Begin.isInvalid() || End.isInvalid()) continue;

with the fixed function above.

455 ↗(On Diff #34943)

I still don't know what you are doing here. You have two SourceLocation's, Loc and PreLoc. Those are not good names for telling them apart or what they do. You pass the second SourceLocation in by reference, both as an input value, and later as something to put an output value in. This is a bad idea. If you need a SourceLocation to be passed back to the caller, make the function return a SourceLocation, or use a SourceLocation pointer as an argument that is only for output. The SourceLocation class is tiny so it is okay to copy around.

460 ↗(On Diff #34943)

Do the same for NewLoc

471 ↗(On Diff #34943)

What is this SourceLocation? Why is it passed by reference?

502 ↗(On Diff #34943)

Not done.

test/Misc/caret-diags-macros.c
210 ↗(On Diff #34943)

Add a FIXME comment that it should be changed back to Cstrlen when the bug is resolved.

zhengkai updated this revision to Diff 35554.Sep 23 2015, 2:07 PM
zhengkai marked 11 inline comments as done.
zhengkai added inline comments.
lib/Frontend/DiagnosticRenderer.cpp
323 ↗(On Diff #34943)
The getImmediateMacroCallerLoc is implemented as following:

SourceLocation getImmediateMacroCallerLoc(SourceLocation Loc) const {
  if (!Loc.isMacroID()) return Loc;
  if (isMacroArgExpansion(Loc))
    return getImmediateSpellingLoc(Loc);

  return getImmediateExpansionRange(Loc).first;
}
471 ↗(On Diff #34943)

Changed

rtrieu added inline comments.Sep 23 2015, 2:51 PM
lib/Frontend/DiagnosticRenderer.cpp
455 ↗(On Diff #34943)

Still not answered: What is PreLoc? Why is it important to this function? What does PreLoc even mean?

471 ↗(On Diff #34943)

What is Loc? Where is it pointing to? Why do you need it?

test/Misc/caret-diags-macros.c
210 ↗(On Diff #34943)

Use:

// FIXME: Change test to use 'Cstrlen' instead of 'strlen_test' when macro printing is fixed.
zhengkai updated this revision to Diff 35556.Sep 23 2015, 2:57 PM
zhengkai marked an inline comment as done.
zhengkai added inline comments.
lib/Frontend/DiagnosticRenderer.cpp
442 ↗(On Diff #35554)

I have stated that I need to check if all the locations checked fit in the same location in the expansionInfo. So I need the preLoc to store the previous checked location to see if they are the same.

454 ↗(On Diff #35554)

See the comment ahead.

rtrieu added inline comments.Sep 23 2015, 3:14 PM
lib/Frontend/DiagnosticRenderer.cpp
442 ↗(On Diff #35554)

That explains nothing. You have a function 'checkLocForMacroArgExpansion' that has three parameters, SourceLocation Loc, SourceManager SM, and SourceLocation PreLoc. From the name of the function, it appears that it checks Loc to see if is a macro argument expansion. The SourceManger is needed for checking information about SourceLocation's. That leaves PreLoc, which does not have a descriptive name ('pre' could be short for lots of different words) nor is it commented.

You mentioned it is the previously checked location. Does this mean PreLoc is one location behind Loc? If so, why not just use the offset to get one behind? Or does it refer to different locations on the macro stack?

Since both 'checkLocForMacroArgExpansion' and 'checkRangeForMacroArgExpansion' now have a third parameter that is not obvious from the function name what they do, add comments to the functions to explain the purpose of the third parameter.

zhengkai updated this revision to Diff 35561.Sep 23 2015, 3:22 PM
zhengkai added inline comments.
lib/Frontend/DiagnosticRenderer.cpp
442 ↗(On Diff #35561)

Explained in the new comments in the updated diff.

zhengkai updated this revision to Diff 35573.Sep 23 2015, 4:47 PM
zhengkai updated this revision to Diff 35574.
zhengkai updated this revision to Diff 35575.Sep 23 2015, 5:06 PM
This revision was automatically updated to reflect the committed changes.