Page MenuHomePhabricator

[Attribute/Diagnostics] Print macro if definition is an attribute declaration
Needs ReviewPublic

Authored by leonardchan on Aug 27 2018, 4:18 PM.

Details

Summary

If an address_space attribute is defined in a macro, print the macro instead when diagnosing a warning or error for incompatible pointers with different address_spaces.

Originally started from http://lists.llvm.org/pipermail/cfe-dev/2018-August/058702.html where the original proposal was a new pragma, but instead just changing the way address_space gets printed in a diagnostic.

We now allow this for all attributes (not just address_space), and for multiple attributes declared in the same macro.

Diff Detail

Repository
rC Clang

Event Timeline

leonardchan created this revision.Aug 27 2018, 4:18 PM

Actually, after posting this I realize I didn't handle the macro expansion very well. I'll fix this first.

rsmith added inline comments.Aug 27 2018, 5:07 PM
include/clang/AST/ASTContext.h
1428 ↗(On Diff #162773)

There is no reason to make this specific to address space attributes. Please keep it general.

lib/AST/TypePrinter.cpp
1370 ↗(On Diff #162773)

This is unnecessary; just print the modified type here. (The modified type by definition does not have the attribute applied to it.)

lib/Sema/SemaType.cpp
5817–5822 ↗(On Diff #162773)

This is not the right way to find the macro.

What you should do is this: look at the source range of the attribute, and find the object-like macro expansion(s) common to both the start and end location. Search through those for the outermost macro expansion whose range is that of the attribute plus the enclosing syntax (__attribute__(( ... )) or [[ ... ]] or similar). That macro name is the name that was used to write this type, and that's the name we should preserve.

Plus, please do this all from getAttributedType itself so that it applies to all type attributes, not just address space attributes.

test/Sema/address_space_print_macro.c
3–4 ↗(On Diff #162773)

I know this is just a test, but please avoid using reserved identifiers here. (Drop the leading _ from these macro names.)

leonardchan marked 3 inline comments as done.
leonardchan added inline comments.
lib/AST/TypePrinter.cpp
1370 ↗(On Diff #162773)

When you say the modified type, do you mean just the type without it's qualifiers? I wasn't sure if removing all the qualifiers would suffice since there were also other non-address_space qualifiers that could be printed.

+@aaron.ballman for attributes-related changes.

It would be easier and more precise to track the macro corresponding to an attribute in the Parser rather than in Sema (and stash it on the ParsedAttr for consumers such as Sema that want it). That way, we still have the information about where the attribute delimiters were, and whether there were multiple attributes in the same set of delimiter brackets. Specifically, in Parser::ParseGNUAttributes, if we find that we parsed exactly one attribute, look through the source location information for the outermost macro expansion that exactly covers the token sequence in question, and that's your corresponding macro name. (Note that we don't need to check what the expansion of the macro is, nor track what attribute-like macros have been defined, to support this approach.)

Sorry for not thinking of this sooner.

lib/AST/TypePrinter.cpp
1370 ↗(On Diff #162773)

I mean T->getModifiedType(), which tracks what the type was before the attribute was applied.

leonardchan added inline comments.
lib/AST/TypePrinter.cpp
1370 ↗(On Diff #162773)

Oh, I understand that you meant T->getModifiedType(). This is a special case when printing the address_space since even though the attribute is applied and printed here, when we reach the qualifiers of the modified type, the address space will still be printed unless we remove it here.

I'm not sure if there's a better way to do this though.

You also need to update the various other places that create AttributedTypes (eg, template instantiation, ASTImporter, deserialization) to pass in the macro name. You can try removing the default argument from ASTContext::getAttributedType to find other places that might need updating.

include/clang/AST/Type.h
4318–4323 ↗(On Diff #163155)

Please keep this general, there's nothing address-space-specific here.

4343–4344 ↗(On Diff #163155)

Likewise.

include/clang/Sema/ParsedAttr.h
538–542 ↗(On Diff #163155)

Please use MacroIdentifier or MacroName in these function names; abbreviations such as II should only be used in entities with a relatively small scope, not in public member functions (here and in `AttributedType).

A documentation comment on these would also be useful.

lib/AST/TypePrinter.cpp
1370 ↗(On Diff #162773)

If the address space qualifiers are present on the modified type, then that's a bug in the creation of the AttributedType. And indeed looking at r340765, I see we are passing the wrong type in as the modified type when creating the AttributedType. Please fix that, and then just use getModifiedType here.

lib/Parse/ParseDecl.cpp
169–170 ↗(On Diff #163155)

This is not right: the loop below parses multiple __attribute__s and you should track each one separately...

173 ↗(On Diff #163155)

... by grabbing the location of the __attribute__ token here.

244–252 ↗(On Diff #163155)

You shouldn't do this if NumParsedAttrs != 1. We're only looking for a macro that exactly covers one attribute.

(Also, your computation of the number of attributes in the attribute list is not correct in the presence of late-parsed attributes.)

246–252 ↗(On Diff #163155)

From my earlier comment: "look through the source location information for the outermost macro expansion that exactly covers the token sequence in question, and that's your corresponding macro name. (Note that we don't need to check what the expansion of the macro is, nor track what attribute-like macros have been defined, to support this approach.)"

We should not be tracking a list of attribute-like macros in the preprocessor. Doing so is unnecessary and problematic for various reasons.

leonardchan marked 5 inline comments as done.Aug 30 2018, 9:33 AM

@rsmith When you clarify what you mean by outermost macro expansion? Say I have the following:

#define ATTR __attribute__
#define AS(i) address_space(i)
#define AS1 ATTR((address_space(1)))
#define AS2 ATTR((AS(2)))

int cmp(AS1 char *x, AS2 char *y) {
  return x < y ? x : y;
}

AS1 char * and AS2 char * would be diagnosed for an error relating to mismatched types, but I think outermost macro expansion implies either ATTR or AS would be the identifier stored as opposed to AS1 and AS2.

Would it be simpler to instead just check:

  • the starting __attribute__ token was the start of a macro expansion
  • the ending right parenthesis token was the end of a macro expansion
  • both start and end tokens had the same location to indicate this attribute was used in a macro

After fulfulling these, we can call Lexer::getSourceText() to get the spelling of the macro from this SourceLocation.

leonardchan marked 6 inline comments as not done.Aug 30 2018, 9:36 AM
leonardchan added inline comments.Aug 30 2018, 10:08 AM
lib/Parse/ParseDecl.cpp
244–252 ↗(On Diff #163155)

One of the things we would like is for this to cover more than one attribute in the attribute list since in sparse, address_space is sometimes used with noderef.

So given # define __user __attribute__((noderef, address_space(1))), __user would be saved into the ParsedAttr made for noderef and address_space.

What would be the appropriate way to track newly added attributes into the ParsedAttributes, including late-parsed attributes?

aaron.ballman added inline comments.Aug 31 2018, 5:32 AM
include/clang/AST/Type.h
4343 ↗(On Diff #163155)

Given that this function is const, I think the returned pointer should be as well.

lib/Lex/PPDirectives.cpp
2584 ↗(On Diff #163155)

This count is specific to GNU spellings. For instance, a C++ spelling might have 5 tokens (two square brackets, attribute-token, two more square brackets) or more and a declspec spelling might have 4 tokens (__declspec keyword, paren, attribute token, paren).

2586–2588 ↗(On Diff #163155)

Why does this need to be specific to GNU-style spellings?

2720–2721 ↗(On Diff #163155)

This should not be specific to address_space, but even if it were, this is wrong as it can be spelled [[clang::address_space(0)]].

lib/Parse/ParseDecl.cpp
116 ↗(On Diff #163155)

attrs doesn't meet the usual naming conventions. Same with i below.

119 ↗(On Diff #163155)

You can use a range-based for loop here instead.

leonardchan marked 8 inline comments as done.
  • Removed default value for getAttributedType and added any macro identifier necessary as a fourth argument to getAttributedType
  • Added tracking for LateAttrs to keep any found macro identifier, although I don't quite know how I can test this in the same fashion that I test with address_space since it doesn't get printed in the type.
lib/AST/TypePrinter.cpp
1370 ↗(On Diff #162773)

Making another patch for this.

lib/Lex/PPDirectives.cpp
2584 ↗(On Diff #163155)

See comment below

2586–2588 ↗(On Diff #163155)

Similar to the noderef patch I sent out, this is initially only meant for GNU-style attributes but can be expanded on later.

The main goal for our side was to change the diagnostics relating to address_spaces to instead print the macro if the attribute was defined in a macro. This was initially discussed in http://lists.llvm.org/pipermail/cfe-dev/2018-August/058702.html, but eventually decided on not using a pragma and instead redirecting diagnostics (http://lists.llvm.org/pipermail/cfe-dev/2018-August/058732.html).

Essentially, this was meant specifically for just the address_space attribute, which I believe only has GNU-style support, but Richard Smith thought that this would be useful to apply to other attributes. For our use case alone, we just need it to work for GNU-style, but we can expand on this afterwards for other attribute spellings.

2720–2721 ↗(On Diff #163155)

Removed code relating to the processor since we don't need it anymore

lib/Parse/ParseDecl.cpp
116 ↗(On Diff #163155)

Removed

119 ↗(On Diff #163155)

The condition should instead be i < attrs.size(), sorry.

I don't know if ParsedAttributesView has a reverse iterator though or I can start with an iterator in the middle of the view.

246–252 ↗(On Diff #163155)

Removed the preprocessor code

rsmith added a comment.Sep 4 2018, 4:51 PM

I think this patch is nearly ready to commit. However... if you want to handle macros that specify a sequence of attributes, we should switch to treating the macro name as a separate type sugar node rather than modeling it as a name for the AttributedType node. Up to you how we proceed from here -- if you want to land this patch with a one-attribute-in-the-macro restriction, that seems fine; if you'd prefer to go straight to a more general approach where we add a new type sugar representation for a single macro describing (potentially) multiple attributes, that's fine too.

lib/AST/ASTDiagnostic.cpp
78 ↗(On Diff #163604)

Use castAs here and below (because it would be a programming error if the type isn't an AttributedType).

lib/AST/ASTImporter.cpp
991 ↗(On Diff #163604)

You need to import the identifier; the source and destination ASTs have different identifier tables. (Importer.Import(T->getMacroIdentifier())).

lib/AST/TypePrinter.cpp
1370 ↗(On Diff #162773)

Incidentally, the last two arguments to the AddressSpaceAttr constructor in that patch (address space and spelling list index) are also reversed.

lib/Parse/ParseDecl.cpp
244–252 ↗(On Diff #163155)

One of the things we would like is for this to cover more than one attribute in the attribute list since in sparse, address_space is sometimes used with noderef.

Hold on, this is a new requirement compared to what we'd previously discussed (giving a name to an address space). How important is this use case to you?

I don't think it's a reasonable AST model to assign a macro identifier to an AttributedType if the macro defines multiple attributes. If you really want to handle that use case, I think that an identifier on the AttributedType is the wrong way to model it, and we should instead be creating a new type sugar node representing a type decoration written as a macro.

Assuming you want to go ahead with the current patch direction (at least in the short term), please add the "only one attribute in the macro" check.

138 ↗(On Diff #163604)

No need to copy the entire Token here, you only use its location before. It's also idiomatic to fold this into consuming the token:

SourceLocation AttrTokLoc = ConsumeToken();

215–220 ↗(On Diff #163604)

You're only looking at the innermost macro expansion; I still think we should take the outermost one. (You can iteratively walk the expansion locations of the start and end locations (SourceManager::getExpansionLocStart/End) to build a list of FileIDs representing macro expansions that each of them is included in, and then take the last common such FileID that represents an object-like macro. You can use SourceManager::getSLocEntry to get the corresponding ExpansionInfo for the macro, isFunctionMacroExpansion to determine if it's a function-like or object-like macro, and once you know it's object-like, take the spelling locations of the ExpansionLocStart and ExpansionLocEnd to find the name of the macro.

mcgrathr added inline comments.Sep 4 2018, 5:16 PM
lib/Parse/ParseDecl.cpp
244–252 ↗(On Diff #163155)

A single macro that uses multiple attributes is the central use case for us.
It would be fine if it's constrained to a single attribute clause (or [[...]] clause) with the attributes comma-separated, as in attribute((foo, bar)) as opposed to two separate attribute((foo)) attribute((bar)) in the macro, if that matters. It's even fine if it's constrained to the macro being nothing but the attribute((foo, bar)) clause (aside from whitespace and comments).

Leo can decide how he wants to proceed as far as incremental implementation.
But we won't have a real-world use for the feature only covering a single attribute.
We'll start using when it can cover the cases like the Linux user and iomem examples (address_space + noderef).

leonardchan marked 7 inline comments as done.Sep 10 2018, 9:31 AM

@rsmith So I chose to go with the type sugar route since we would prefer to have multiple attributes in one attribute list. This type just contains the underlying type and the macro identifier originally held by the AttributedType. Do you have any recommendations on where this type should be created?

Currently, I'm wrapping any AttributedType created under processTypeAttrs with this type since I still only hold the macro identifier in the ParsedAttr. The problem is that I keep incrementally running into failing assertions that I can slowly address, but all these assertions make me think I'm approaching this the wrong way. The failing assertions come from building or finding the type location if this type.

I'm thinking I could either make this type a subclass of AttributedType since we will essentially be using this only for attributes declared in a macro (I think this would allow me to not have to change some of the stuff that happens under GetTypeSourceInfoForDeclarator), or somehow find a way to make this in another part of Sema, although I would still need access to the parsed attributes so I would know what the macro identifiers are.

rsmith added a comment.Oct 3 2018, 5:56 PM

@rsmith So I chose to go with the type sugar route since we would prefer to have multiple attributes in one attribute list. This type just contains the underlying type and the macro identifier originally held by the AttributedType. Do you have any recommendations on where this type should be created?

I think the most natural thing to do would be to create the sugar node in processTypeAttrs after parsing each "chunk" of attributes produced by the same macro. (You'll need to extend ParsedAttributesView so it can tell you which ranges came from which macros.)

I'm thinking I could either make this type a subclass of AttributedType since we will essentially be using this only for attributes declared in a macro (I think this would allow me to not have to change some of the stuff that happens under GetTypeSourceInfoForDeclarator), or somehow find a way to make this in another part of Sema, although I would still need access to the parsed attributes so I would know what the macro identifiers are.

Your new type sugar node should be a new subclass of Type unrelated to AttributedType, maybe MacroQualifiedType. You'll need to teach TypeSpecLocFiller to fill in location information from it, but you can save that into the ParsedAttributesView when you parse the attributes and identify the macro expansions therein.

leonardchan marked an inline comment as done.
  • Added a new type sugar node MacroDefinedType which wraps AttributedTypes for the purpose of determining if an attribute was defined in a macro (this took much longer than I would've expected it to).
leonardchan added inline comments.Nov 7 2018, 4:50 PM
lib/Sema/SemaType.cpp
5817–5822 ↗(On Diff #162773)

Moved the saved identifier into the new type sugar.

leonardchan added inline comments.Nov 7 2018, 4:51 PM
clang/lib/AST/TypePrinter.cpp
976–982

This has not been addressed yet in this patch, but will still make another patch moving the address_space from the qualifier to the AttributedType (in reference to r340765).

leonardchan marked 2 inline comments as done.Jan 23 2019, 8:45 PM
leonardchan added inline comments.
clang/lib/AST/TypePrinter.cpp
976–982

@rsmith Finally fixed in D55447, so I don't think there are any more outstanding prior comments that haven't been addressed in this patch so far.

rsmith added inline comments.Jan 28 2019, 5:49 PM
clang/include/clang/AST/Type.h
4175

This represents a macro qualifier, not a type defined in a macro, so I don't think this name is appropriate. Maybe MacroQualifiedType?

4191

Please also provide an accessor to get the unqualified type here (that is, the type that we would have had if the macro had not been written -- the original type of the AttributedType). Right now, that's (effectively) being computed in the type printer, which doesn't seem like the right place for that logic.

clang/lib/AST/Type.cpp
382–386

This doesn't seem like a sufficiently general operation to warrant being a member of QualType to me, and like IgnoreParens before it, this is wrong with regard to qualifiers (it drops top-level qualifiers on the type if they're outside the macro qualifier). Looking through the uses below, I think there are better ways to write them that avoid the need for this function.

clang/lib/AST/TypePrinter.cpp
972–973

This should apply to all attributes, not only address space attributes.

978–982

This is the code that I mentioned above that should be moved to MacroDefinedType.

clang/lib/Parse/ParseDecl.cpp
263

Typo "attatch"

268–273

I think these checks need to be moved to the last loop of FindLocsWithCommonFileID. Otherwise for a case like:

#define THING \
  int *p = nullptr;
  AS1 int *q = nullptr;

THING

... FindLocsWithCommonFileID will pick out the THING macro and return the range from the int token to the second semicolon, and the checks here will fail. Instead, we should pick out the inner AS1 expansion, because it's the outermost macro expansion that starts with StartLoc and ends with EndLoc.

clang/lib/Sema/SemaExpr.cpp
13667

Please change TypeLoc::getAsAdjusted to skip macro type sugar instead of skipping it here. (getAsAdjusted already skips AttributedType sugar, and your new sugar node is the same kind of thing.)

Please also change Type::getAsAdjusted to likewise remove the new type node.

clang/lib/Sema/SemaStmt.cpp
3388–3392

This duplicates the logic in TypeLoc::getAsAdjusted (and misses a case!). This whole function definition should be replaced by return FD->getTypeSourceInfo()->getTypeLoc().getAsAdjusted<FunctionProtoTypeLoc>().getReturnLoc();.

clang/lib/Sema/SemaType.cpp
6965–6972

Instead of calling IgnoreParens() and IgnoreMacroDefinitions here, I think we should just be ignoring any type sugar by using getAs<AttributedType>():

bool Sema::hasExplicitCallingConv(QualType T) {
  while (const auto *AT = T->getAs<AttributedType>()) {
    if (AT->isCallingConv())
      return true;
    T = AT->getModifiedType();
  }
  return false;
}

(Note also the change from passing T by reference -- and then never storing to it in the function -- to passing it by value.)

7553–7560

The handling (both here and in the parser) for a macro that defines multiple attributes doesn't seem right. Given:

#define ATTRS __attribute__((attr1, attr2))
ATTRS int x;

it looks like you'll end up with a type built as

AttributedType attr1
 `- MacroDefinedType
      `- AttributedType attr2
          `- int

... which seems wrong.

The parser representation also can't distinguish between the above case and

#define ATTRS __attribute__((attr1))
ATTRS
#define ATTRS __attribute__((attr2))
ATTRS
int x;

... since both will be represented as simply a macro identifier ATTR attached to two different attributes (you could solve this by tracking the source location of the expansion loc of the macro in the parsed attribute representation, instead of or in addition to the macro identifier).

And the AST representation can't distinguish between the above and

#define ATTRS __attribute__((attr1))
ATTRS __attribute__((attr2)) int x;

... because it doesn't track what the modified type was, so we have to guess which attributes are covered by the macro.

Perhaps the best thing would be to support only the case of a macro that expands to exactly one attribute for the initial commit, and add support for macros covering multiple attributes as a follow-up commit?

clang/test/Frontend/macro_defined_type.cpp
10

Typo 'wether'

leonardchan marked 15 inline comments as done.
leonardchan added inline comments.
clang/lib/AST/Type.cpp
382–386

Removed this method.

clang/lib/Parse/ParseDecl.cpp
268–273

Moved, although this doesn't appear to fix this case. On closer inspection, when generating the vector of starting locations, the expansion location for AS1 doesn't seem to be returned by getExpansionLocStart. It goes straight from the location of the __attribute__ behind AS1 to THING and skips AS1. I found this out by just dumping StartLoc on every iteration.

The only way I can generate the location for AS1 in THING is by also watching for the spelling location of each expansion, but this SourceLoc is not a macro ID and would not fulfill these last 2 checks. Is this intended? If it's a bug I don't mind addressing it if you point me in the right direction to start.

clang/lib/Sema/SemaType.cpp
6965–6972

When using only getAs, the following 3 tests fail:

Clang :: CodeGenCXX/mangle-ms.cpp
Clang :: SemaCXX/calling-conv-compat.cpp
Clang :: SemaCXX/decl-microsoft-call-conv.cpp

since using getAs would involved removing a TypeDefType for each of these. Looking at the comment above this method's declaration, part of the intent is to retain the typedef sugar. We could still use getAs and not IgnoreMacroDefinitions by additionally checking for a typedef:

const AttributedType *AT;
while ((AT = T->getAs<AttributedType>()) && !isa<TypedefType>(T)) {

Using !T->getAs<TypedefType>() will still fail for types laid out as:

AttributedType
` - TypedefType
7553–7560

If it makes this cleaner and easier to review, I don't mind separating it into 2 patches with the first working for exactly one and the followup for multiple. For us though, we would mainly be using this for multiple attribute macros (especially __attribute__((address_space(n), noderef)). I can split it into 2 if you would like after this revision.

As for the cases you presented, for

#define ATTRS __attribute__((attr1, attr2))
ATTRS int x;

I see that this prevents from adding any more MacroQualifiedTypes for attributes that share the same name. Updated in this revision so that we continue to wrap AttributedTypes, producing:

MacroQualifiedType (ATTRS)
 `- AttributedType attr1
     ` - MacroQualifiedType (ATTRS)
        `- AttributedType attr2
             `- int

In the case of

#define ATTRS __attribute__((attr1))
ATTRS
#define ATTRS __attribute__((attr2))
ATTRS
int x;

I added the expansion loc so we can distinguish between this and the first case now.

As for

#define ATTRS __attribute__((attr1))
ATTRS __attribute__((attr2)) int x;

With this new revision, the AST representation now differs from the previous 2 examples in that only attr1 gets wrapped with a MacroQualifiedType and not attr2 producing:

AttributedType attr2
 ` - MacroQualifiedType (ATTRS)
    `- AttributedType attr1
         `- int

I also added test cases for these.

leonardchan retitled this revision from [Attribute/Diagnostics] Print macro instead of whole attribute for address_space to [Attribute/Diagnostics] Print macro if definition is an attribute declaration.Feb 21 2019, 2:05 PM
leonardchan edited the summary of this revision. (Show Details)