This is an archive of the discontinued LLVM Phabricator instance.

Use `getFileLoc()` instead of `getSpellingLoc()` in the ASTImporter
ClosedPublic

Authored by spyffe on Oct 27 2016, 5:42 PM.

Details

Summary

When the ASTImporterimports a source location, it avoids importing macro expansions by calling getSpellingLoc(). That's great in most cases, but for macros defined in the '<built-in>' source file, the source file is invalid and does not import correctly, causing an assertion failure (the assertion is Invalid SLocOffset or bad function choice).

A more reliable way to avoid this is to use getFileLoc(), which does not return built-in locations. This avoids the crash but still preserves valid source locations.

I've added a testcase that covers the previously crashing scenario.

Diff Detail

Event Timeline

spyffe updated this revision to Diff 76146.Oct 27 2016, 5:42 PM
spyffe retitled this revision from to Use `getFileLoc()` instead of `getSpellingLoc()` in the ASTImporter.
spyffe updated this object.
spyffe added reviewers: bruno, akyrtzi, a.sidorin.
spyffe added a subscriber: cfe-commits.
spyffe updated this revision to Diff 76213.Oct 28 2016, 10:12 AM

Updated the corresponding comment.

a.sidorin accepted this revision.Oct 29 2016, 5:38 AM
a.sidorin edited edge metadata.

Looks good. But I think it will be good to put the files in Input to a separate directory.

This revision is now accepted and ready to land.Oct 29 2016, 5:38 AM
spyffe added a comment.Nov 7 2016, 9:42 AM

Aleksei, thank you for your review.
I don't quite follow what you'd like me to do with the Input files, though. Some of them certainly appear to be input files in the same way that all the other files in Inputs are. Are you suggesting that I move macro.modulemap and macro1.h somewhere else (say, Modules/) or are you making a general comment about the layout of all of the tests?

Hi Sean!
I meant that I'd like to have a layout like this:
macro-loc.m
Inputs/macro-loc/macro.modulemap
Inputs/macro-loc/macro1.h
Inputs/macro-loc/macro1.m
Inputs/macro-loc/macro2.m

By the way, I see that we use another workaround for this issue: we use getDecomposedExpansionLoc() instead of getDecomposedLoc() with removing previous line at all. I wonder if we are incorrect in our suggestion.

That seems reasonable, and would go a long way toward cleaning up the Inputs and making clear exactly which inputs correspond to which test file.
Do you think it would be reasonable to take this diff the way it currently it is, and start a new one that pulls all the input fiels into test-specific subdirectories?
That way the desired layout of the Inputs directory will be clear to future developers touching the source base.

Do you think it would be reasonable to take this diff the way it currently it is, and start a new one that pulls all the input fiels into test-specific subdirectories?
That way the desired layout of the Inputs directory will be clear to future developers touching the source base.

I'm totally OK with a new diff. Nice suggestion. Thank you!

spyffe closed this revision.Nov 7 2016, 12:52 PM
$ svn commit lib test
Sending        lib/AST/ASTImporter.cpp
Adding         test/ASTMerge/Inputs/macro.modulemap
Adding         test/ASTMerge/Inputs/macro1.h
Adding         test/ASTMerge/Inputs/macro1.m
Adding         test/ASTMerge/Inputs/macro2.m
Adding         test/ASTMerge/macro.m
Transmitting file data ......done
Committing transaction...
Committed revision 286144.
spyffe added a comment.Nov 7 2016, 1:21 PM

Fixed the testcase to use _Nullable instead of __nullable, for Linux buildbots

$ svn commit test
Sending        test/ASTMerge/Inputs/macro1.h
Transmitting file data .done
Committing transaction...
Committed revision 286151.
r.stahl added a subscriber: r.stahl.May 2 2018, 9:27 AM

I encountered an issue caused by this change.

In the scenario as shown below the call to getFileLoc causes the startLoc of the BinaryOp to be after the endLoc, because the LHS (s->a) is located in the marco argument while the RHS (0) is located at the beginning of the marco. See below: BinaryOperator 0x12f45b8 <col:13, col:5>

This is an issue because the diagnostics machinery asserts on correctly ordered sourceranges. Namely:

llvm/tools/clang/lib/Frontend/TextDiagnostic.cpp:1015: void highlightRange(const clang::CharSourceRange&, unsigned int, clang::FileID, const {anonymous}::SourceColumnMap&, std::__cxx11::string&, const clang::SourceManager&, const clang::LangOptions&): Assertion 'StartColNo <= EndColNo && "Invalid range!"' failed.

I have tested the solution proposed by @a.sidorin and it works for my tests and the regression tests still pass.

code:

01|#include <stdint.h>
02|
03|
04|#define CLR_BIT(reg)     (reg = 0)
05|
06|
07|struct S
08|{
09|    int a;
10|};
11|
12|int foo()
13|{
14|    struct S *s = 0x80008000;
15|    CLR_BIT(s->a);
16|}

original:

`-FunctionDecl 0xd29850 <line:12:1, line:16:1> line:12:5 foo 'int ()'
  `-CompoundStmt 0xd312d8 <line:13:1, line:16:1>
    |-DeclStmt 0xd311e0 <line:14:5, col:29>
    | `-VarDecl 0xd299e0 <col:5, col:19> col:15 used s 'struct S *' cinit
    |   `-ImplicitCastExpr 0xd29a60 <col:19> 'struct S *' <IntegralToPointer>
    |     `-IntegerLiteral 0xd29a40 <col:19> 'unsigned int' 2147516416
    `-ParenExpr 0xd312b8 <line:4:26, col:34> 'int'
      `-BinaryOperator 0xd31290 <line:15:13, line:4:33> 'int' '='
        |-MemberExpr 0xd31238 <line:15:13, col:16> 'int' lvalue ->a 0xd297b8
        | `-ImplicitCastExpr 0xd31220 <col:13> 'struct S *' <LValueToRValue>
        |   `-DeclRefExpr 0xd311f8 <col:13> 'struct S *' lvalue Var 0xd299e0 's' 'struct S *'
        `-IntegerLiteral 0xd31270 <line:4:33> 'int' 0

imported:

`-FunctionDecl 0x12f4218 prev 0x12f3fa0 <line:12:1, line:16:1> line:12:5 used foo 'int ()'
  `-CompoundStmt 0x12f4600 <line:13:1, line:16:1>
    |-DeclStmt 0x12f4508 <line:14:5, col:29>
    | `-VarDecl 0x12f4470 <col:5, col:19> col:15 used s 'struct S *' cinit
    |   `-ImplicitCastExpr 0x12f44f0 <col:19> 'struct S *' <IntegralToPointer>
    |     `-IntegerLiteral 0x12f44d0 <col:19> 'unsigned int' 2147516416
    `-ParenExpr 0x12f45e0 <line:15:5> 'int'
      `-BinaryOperator 0x12f45b8 <col:13, col:5> 'int' '='
        |-MemberExpr 0x12f4560 <col:13, col:16> 'int' lvalue ->a 0x12f4378
        | `-ImplicitCastExpr 0x12f4548 <col:13> 'struct S *' <LValueToRValue>
        |   `-DeclRefExpr 0x12f4520 <col:13> 'struct S *' lvalue Var 0x12f4470 's' 'struct S *'
        `-IntegerLiteral 0x12f4598 <col:5> 'int' 0

Hi Rafael,

Could you please show the AST we get with getDecomposedExpansionLoc()? This change can be an item for a separate patch.

It puts everything at the start of the marco expansion.

`-FunctionDecl 0x12f5218 prev 0x12f4fa0 <line:12:1, line:16:1> line:12:5 used foo 'int ()'
  `-CompoundStmt 0x12f5600 <line:13:1, line:16:1>
    |-DeclStmt 0x12f5508 <line:14:5, col:29>
    | `-VarDecl 0x12f5470 <col:5, col:19> col:15 used s 'struct S *' cinit
    |   `-ImplicitCastExpr 0x12f54f0 <col:19> 'struct S *' <IntegralToPointer>
    |     `-IntegerLiteral 0x12f54d0 <col:19> 'unsigned int' 2147516416
    `-ParenExpr 0x12f55e0 <line:15:5> 'int'
      `-BinaryOperator 0x12f55b8 <col:5> 'int' '='
        |-MemberExpr 0x12f5560 <col:5> 'int' lvalue ->a 0x12f5378
        | `-ImplicitCastExpr 0x12f5548 <col:5> 'struct S *' <LValueToRValue>
        |   `-DeclRefExpr 0x12f5520 <col:5> 'struct S *' lvalue Var 0x12f5470 's' 'struct S *'
        `-IntegerLiteral 0x12f5598 <col:5> 'int' 0

I see,t hank you. Please feel free to submit a patch - it seems like you already have a nice test case that shows the difference between two import options.