This is an archive of the discontinued LLVM Phabricator instance.

[clang][AST matchers] adding attributedStmt AST matcher
Needs ReviewPublic

Authored by ajohnson-uoregon on Mar 3 2022, 5:05 PM.

Details

Summary

adding support for AttributedStmts to the AST matcher interface; also adding hasSubStmt matcher under attributedStmt so we can actually match the stmt inside the attribute is attached to

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptMar 3 2022, 5:05 PM
ajohnson-uoregon requested review of this revision.Mar 3 2022, 5:05 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptMar 3 2022, 5:05 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript

Fixing to actually have the right commits

ajohnson-uoregon retitled this revision from [clang][AST matchers] adding hasSubStmt matcher under attributedStmt so we can actually match the stmt inside the attribute is attached to to [clang][AST matchers] adding attributedStmt AST matcher.Mar 3 2022, 5:13 PM
ajohnson-uoregon edited the summary of this revision. (Show Details)

Still doesn't have right files/commits

Still trying to get right commits, apologies XD

updated resolution rules to try to get right commits

maybe this will work

had to split a commit because past me messed up

getting the commit range right, finally

Okay this is actually the right commits now, sorry about the spam, Aaron. 🙃 Didn't realize I'd put unrelated things in the first commit.

The names for isAttr and hasSubStmt are up for debate, I just picked things that were close enough and weren't already in use.

making clang-format happy

Okay this is actually the right commits now, sorry about the spam, Aaron. 🙃 Didn't realize I'd put unrelated things in the first commit.

No worries! I usually review AST matcher changes anyway, so there's definitely no harm in tagging me. :-)

You also need to add some test coverage for the changes to clang/unittests/ASTMatchers/ and you also need to regenerate the documentation by running clang/docs/tools/dump_ast_matchers.py.

clang/include/clang/ASTMatchers/ASTMatchers.h
2704–2712
2713–2714

Design-wise, I'm on the fence here. AST matchers match the AST nodes that Clang produces, and from that perspective, this makes a lot of sense. But AttributedStmt is a bit of a hack in some ways, and do we want to expose that hack to AST matcher users? e.g., is there a reason we shouldn't make returnStmt(hasAttr(attr::Likely)) work directly rather than making the user pick their way through the AttributedStmt? This is more in line with how you check for a declaration with a particular attribute and seems like the more natural way to surface this.

For the moment, since this is following the current AST structure in Clang, I think this is fine. But I'm curious if @klimek or perhaps @sammccall has an opinion here.

2722
2732–2737

Probably have to reformat this as well.

2746
2753–2755
ajohnson-uoregon marked 5 inline comments as done.

update with fixes from Aaron, tests, and docs

Actually has the right doc updates now

Somehow got other edits into ASTMatcher.h, dunno how, it's fixed now

jdoerfert added inline comments.Mar 7 2022, 9:53 PM
clang/include/clang/ASTMatchers/ASTMatchers.h
2713–2714

I think a way to find any kind of statement (or expression) that has a specific attribute is very useful. How that should look, idk.

Figuring out git-clang-format

sammccall added inline comments.Mar 9 2022, 1:32 PM
clang/include/clang/ASTMatchers/ASTMatchers.h
2713–2714

TL;DR: I think this matcher fits the current design best. returnStmt(hasAttr()) sounds awesome, but I think it's a significant new concept, and a cross-cutting project like traversal modes.


returnStmt(hasAttr()) is definitely nicer in isolation (and in combination with how decls work).
It's definitely potentially confusing to not match the AST. The AST is unlikely to be fixed because (IIUC) we don't want to burden each stmt with tracking if it has attrs.
So I think the easy answer is this patch gets it right.

The inconsistency with decls is annoying, and maybe worse it doesn't yield a simple way to express "a return statement, maybe with an attribute, I don't care" unless you're searching recursively anyway, and this should almost be the default.
returnStmt(hasAttr()) suggests that it would enable this, maybe by skipping over the AttributedStmt with some fancy traversal mode, and then looking it up again in the hasAttr matcher from the parent map.

I think this would be a less brittle way to handle mostly-transparent nodes that you nevertheless want to be able to match the contents of sometimes. The current options (explicitly unwrap, rely on recursive search, and traversal modes) all seem to have significant limitations.
However this is a pretty general idea (and I guess a pretty large project), and I don't think it's worth introducing just for AttributedStmt.

2733

This definitely seems more like hasAttr than isAttr to me. An AttributedStmt *is* the (sugar) statement, and *has* the attribute(s).

Maybe rather describesAttr so people don't confuse this for one that walks up, though?

aaron.ballman added inline comments.Mar 10 2022, 10:27 AM
clang/include/clang/ASTMatchers/ASTMatchers.h
2713–2714

Thanks for the feedback Sam!

The AST is unlikely to be fixed because (IIUC) we don't want to burden each stmt with tracking if it has attrs.

I'm less convinced of this. We didn't want to do it originally because there were so very few statement attributes. These days, there's quite a few more more statement attributes, so we may very well revisit this. AttributedStmt is a pain that has caused us problems in the past with things like isa<FooStmt>() failing because it didn't expect an attributed FooStmt.

That said, the rest of your points are compelling, so this matcher is fine for me.

2733

Good point on the isAttr name!

I'm not sure describeAttr works (the statement doesn't describe attributes, it... has/contains/uses the attribute). Maybe.. specifiesAttr()?

Or are we making this harder than it needs to be and we should use hasAttr() for parity with other things that have attributes associated with them?

sammccall added inline comments.Mar 10 2022, 10:40 AM
clang/include/clang/ASTMatchers/ASTMatchers.h
2713–2714

We didn't want to do it originally because there were so very few statement attributes.

Ah, i assumed it was some kind of issue with size or the logistics of allocation. Fixing the AST sounds good then! (But I wouldn't block this patch on it unless someone's ready to work on it).

aaron.ballman added inline comments.Mar 10 2022, 10:44 AM
clang/include/clang/ASTMatchers/ASTMatchers.h
2713–2714

Yeah, I'm not ready to work on it and I don't know of anyone else planning to do that work any time soon (it's more of an idle "someday" task than anything we need to do immediately). So definitely agreed, let's not block this patch on it!

clang/include/clang/ASTMatchers/ASTMatchers.h
2713–2714

I think both would be useful. Because sometimes you might want to look for all statements of a particular kind (eg, returnStmts) that have an attributed, and then 'returnStmt(hasAttr())` should work. But there are also cases (like the one I'm working on: https://github.com/ajohnson-uoregon/llvm-project/tree/feature-ajohnson/clang-tools-extra/clang-rewrite) where we want to find all statements with an attribute, but we don't really care what kind of statement it is, in which case looking for AttributedStmts is easier.

2733

Yeah I wasn't sure on the name, I just picked something that wouldn't make me figure out polymorphic matcher declarations because this was the first AST matcher I'd written. :)

I like containsAttr() or specifiesAttr(), since it could have multiple attributes. hasAttr() makes the most sense to me though. If y'all think we should go with that, the new hasAttr() definition would look something like this, right? (To make sure I do in fact understand polymorphic matchers.)

AST_POLYMORPHIC_MATCHER_P(
    hasAttr,
    AST_POLYMORPHIC_SUPPORTED_TYPES(Decl, AttributedStmt),
    attr::Kind, AttrKind) {
  return llvm::any_of(Node.getAttrs(),
                      [&](const Attr *A) { return A->getKind() == AttrKind; });
}

Original hasAttr() for reference:

AST_MATCHER_P(Decl, hasAttr, attr::Kind, AttrKind) {
  for (const auto *Attr : Node.attrs()) {
    if (Attr->getKind() == AttrKind)
      return true;
  }
  return false;
}

(...... I just figured out where the 'submit' button is. My comment on 2713-2714 is from like, 5 days ago. Oops. 🙃 )

aaron.ballman added inline comments.Mar 15 2022, 9:30 AM
clang/include/clang/ASTMatchers/ASTMatchers.h
2733

I like containsAttr() or specifiesAttr(), since it could have multiple attributes.

The same is true for declarations; they can have multiple attributes. Which is an interesting test case for us to add for statements:

int foo();

int main() {
  [[clang::nomerge]] [[likely]] return foo();
}

We should verify we can find both of those attributes on the same attributedStmt() (e.g., you don't have to do something like attributedStmt(attributedStmt(...)) to match.)

As for the name, I think hasAttr() is fine; if it causes awkwardness in the future when we change the AST in Clang, we'll have to fix it up then, but that awkwardness will already be there for attributedStmt() itself.

hasAttr() makes the most sense to me though. If y'all think we should go with that, the new hasAttr() definition would look something like this, right?

That looks about like what I'd expect, yes.

I overloaded hasAttr and had to fix a couple CUDA tests to explicitly use the decl overload (the other tests
for hasAttr did this already, so I think it was just a mistake in those tests).

I also had to remove an assert from Decl::getAttrs in DeclBase.cpp that checked if there were actually attributes
on that decl to make the overload work (when I wrote the overload, I made both use getAttrs because AttributedStmts
don't have attrs() like Decls do). After trawling git blame a bit, it looks like that assert was added way
back in like 2010 when ASTContext::getDeclAttrs just did a straight lookup in the DeclAttrs map, which
presumably would have done bad things if the decl had no attrs. In this commit though
https://github.com/llvm/llvm-project/commit/561eceb4c4a20280d5324c873ddad1940960b891 that got fixed, so
getDeclAttrs will return a new empty AttrVec if the lookup fails. The assert is not necessary since that commit,
but I guess no one realized they could remove the assert.

tl;dr That assert broke lots of the old hasAttr tests for Decls that were now using getAttrs() instead of attrs(),
and removing it fixes them, and since it's no longer needed I think it's okay to get rid of it.

updating commit message of last commit to be more accurate

Can you also add a release note for the changes?

clang/include/clang/ASTMatchers/ASTMatchers.h
7800

Well this is awkward. :-D There's no way to check for whether the node has attributes for both kinds of nodes, but decl nodes requires that check; but there's no range-based for loop (or other iterator interface) that's the same on both nodes either.

I think you're going to need to add a class to ASTMatchersInternal.h to dispatch to the correct implementation for this query. You can look at how HasSizeMatcher is implemented and used to get an idea.

clang/lib/AST/DeclBase.cpp
909 ↗(On Diff #418341)

I don't think it's valid to remove this assert (looking at the git blame for this and ASTContext::getDeclAttrs() shows this has all been the same for the past 12 years).

Okay, I put the assert back in and made a thing similar to HasSizeMatcher that works and all the tests pass.

What do you mean by a release note, in the commit message, in the docs...?

ajohnson-uoregon marked an inline comment as done.Mar 30 2022, 12:45 PM

Okay, I put the assert back in and made a thing similar to HasSizeMatcher that works and all the tests pass.

What do you mean by a release note, in the commit message, in the docs...?

https://github.com/llvm/llvm-project/blob/main/clang/docs/ReleaseNotes.rst has a section for AST Matchers, so you should add the notes there.

Otherwise, the changes here all LGTM.