This is an archive of the discontinued LLVM Phabricator instance.

AttributeList de-listifying:
ClosedPublic

Authored by erichkeane on Jun 29 2018, 1:53 PM.

Details

Summary

Basically, "AttributeList" loses all list-like mechanisms, ParsedAttributes is switched to use a TinyPtrVector (and a NonOwningParsedAttributes is created to have a non-allocating attributes list). DeclaratorChunk gets the later kind, Declarator/DeclSpec keep ParsedAttributes.

Iterators are added to the ParsedAttribute types so that for-loops work.

Now for the bad news;
1- This is an insanely large patch. The refactor itself touches a ton of files.

Feedback/assistance would be greatly appreciated!
-Erich

Diff Detail

Event Timeline

erichkeane created this revision.Jun 29 2018, 1:53 PM

The most interesting changes are in AttributeList.h. I also forgot to mention: I'd love to be more aggressive about the ownership semantics (in fact, my first 2 attempts were much stricter), however we make a lot of assumptions based on the ability to pass attribute lists between functions and have them modify on eachother. I'm hoping the tests that fail aren't too related to being able to modify the list for a different ParsedAttributes collection.

ALSO, I'd love to do the renaming of AttributeList to ParsedAttribute, but I just don't have the heart right now :)

Wow, this is an amazing start to a very large refactoring! Thank you for tackling this. I've made a few review comments but haven't gotten through the entire patch yet. The parts that I've seen all look reasonably sane thus far.

The most interesting changes are in AttributeList.h. I also forgot to mention: I'd love to be more aggressive about the ownership semantics (in fact, my first 2 attempts were much stricter), however we make a lot of assumptions based on the ability to pass attribute lists between functions and have them modify on eachother. I'm hoping the tests that fail aren't too related to being able to modify the list for a different ParsedAttributes collection.

One question that came to my mind is whether we should be attempting to add some better const correctness as part of this refactoring. This part of the code has always been unsatisfactory in that regard and I think the lack of const correctness is part of why the semantics here are as muddled as they are. If we add more const correctness, the patch is likely to grow even larger, but it may also point out the areas where we're doing something questionable when there's now a better way.

Given the massive nature of this patch, I can volunteer to handle the fit and finish comments rather than nag you to take care of them, if you'd like to tag team on the patch.

More review comments to come later. :-)

include/clang/Parse/Parser.h
2286–2287

Unrelated formatting changes?

2331

attrs -> Attrs

2333

isInvalid() instead of !isValid()

You could make the same change above as a drive-by.

include/clang/Sema/AttributeList.h
736–742

Do all of these need to be explicitly defaulted?

758–775

Why a custom iterator class instead of: using iterator = VecTy::iterator;?

778

Should probably spell out PA as ParsedAttribute

804

Can we have two overloads: AttributePool &getPool() and const AttributePool &getPool() const? I'm not keen on a const method exposing a non-const reference to an internal data structure.

806–811

Can be removed.

erichkeane marked 5 inline comments as done.Jul 2 2018, 6:31 AM

One question that came to my mind is whether we should be attempting to add some better const correctness as part of this refactoring. This part of the code has always been unsatisfactory in that regard and I think the lack of const correctness is part of why the semantics here are as muddled as they are. If we add more const correctness, the patch is likely to grow even larger, but it may also point out the areas where we're doing something questionable when there's now a better way.

I had a couple of false-starts where I attempted to be more const-correct, but it ends up breaking quite a bit of stuff, and I was overwhelmed. I suspect that once we have a somewhat functional version of this patch that we an add it piecemeal.

Given the massive nature of this patch, I can volunteer to handle the fit and finish comments rather than nag you to take care of them, if you'd like to tag team on the patch.

That would be wonderful! Last week was quite an adventure attempting this on my own, so any help I could possibly have would be greatly appreciated!

More review comments to come later. :-)

include/clang/Sema/AttributeList.h
736–742

Probably not, I got carried away with rule-of-5 I guess :)

758–775

VecTy has to hold a pointer, since the allocation is handled separately, however I intend for this type to appear to hold AttributeList directly. This iterator makes dereferencing an iterator cause an AttributeList& instead of an AttributeList*.

778

Woops! At one point I'd accidentally dropped a reference from an operator somewhere, and this was part of my diagnosis. I'll just delete it, since it is guaranteed by iterator_adaptor_base.

804

I don't think we can just yet... The pool is a mutable field, since sharing the allocator across a number of places is part of the existing design. For example, a function that handles declarator chunks would have a const-ref to its Declarator, but shouldn't modify it. HOWEVER, in order to create new attributes it needs to allocate on the Declarator's collection.

I'm not a huge fan of this design, but unfortunately, its what we have so far :/ I tried a couple of other ways to make allocation of these more sane than this, but quickly fell into traps with shared ownership.

erichkeane updated this revision to Diff 153703.Jul 2 2018, 6:32 AM
erichkeane marked an inline comment as done.

Running through Aaron's first set of comments.

erichkeane updated this revision to Diff 153725.Jul 2 2018, 8:46 AM
erichkeane added a reviewer: Meinersbur.
erichkeane added a subscriber: Meinersbur.

This patch rebases, since I didn't have @Meinersbur s revert. Additionally, I fix a couple of stupid errors, and get failure count down to 47!

All help is welcome :)

Would it be possible (not in this patch) to consolidate the different lists of attributes: ParsedAttributes, AttributeFactory, AttributeList, AttributePool, NonOwningParsedAttributes, NonOwningParsedAttributesWithRange? Naively, expect to use something like std::vector<Attribute*>. Currently AttributeList can be a single attribute as well as a list of attributes.

include/clang/Sema/AttributeList.h
591–593

Is there another way than using a vector of vectors (of AttributeLists)?

IMHO the small size of 8 for the inner list is quite large: When the outer vector needs to grow, that many elements need to be copied (+ control structures).

[Please ignore me if it makes your work more difficult]

617

Thank you for working on this.

include/clang/Sema/DeclSpec.h
1545

Why does this have an additional AF parameter?

lib/Parse/ParseDeclCXX.cpp
3877–3879

It seems that from here on Attrs is interpreted as a single attribute. Did you consider storing Attrs.begin() into a Attr variable?

lib/Sema/SemaDeclAttr.cpp
5789

What is the reason to make this a method?

erichkeane marked an inline comment as done.

Fix things found by @Meinersbur

Would it be possible (not in this patch) to consolidate the different lists of attributes: ParsedAttributes, AttributeFactory, AttributeList, AttributePool, NonOwningParsedAttributes, NonOwningParsedAttributesWithRange? Naively, expect to use something like std::vector<Attribute*>. Currently AttributeList can be a single attribute as well as a list of attributes.

So, each of those things has a distinct meaning unfortunately. AttributeFactory can very much be renamed "AttributeAllocator", because that is all that it is. "AttributePool" is basically an ownership/lifetime container of attributes, and uses the AttributeFactory to deallcoate them.

ParsedAttributes and NonOwningParsedAttributes I think need to stay separate for a while, the later is intended to replace "AttributeList*" (since it is a vector of AttributeList pointers). AttributeList will definitely be renamed at a point in the future.

The intent of this patch is to make it so that AttributeList is no longer used as a 'list' of attributes, simply a single attribute. I think I've successfully made that conversion in this patch.

include/clang/Sema/AttributeList.h
591–593

A different structure here is definitely possible. In the past, it was a linked list (Yes, AttributeList actually had TWO linked lists involved in it). I chose this to unblock the effort, but something a little more efficient might be nice.

That said, this is only allocated a single time and should almost never be resized. Additionally, there is only one AttributeFactory instance in the entire project.

include/clang/Sema/DeclSpec.h
1545

Woops! I guess I forgot to tear those out! A previous implementation I had was attempting to get lifetime-ownership semantics into DeclaratorChunk, but I realized that was too aggressive for this change. I'll revert these sections next time I have a chance.

lib/Sema/SemaDeclAttr.cpp
5789

ProcessDeclAttributeList was used in at least 1 spot (see change in SemaAttr.cpp) where it was intended to just process a single attribute. I made this a method so that it can be called in those locations.

erichkeane updated this revision to Diff 153805.Jul 2 2018, 2:55 PM
erichkeane retitled this revision from [WIP] First somewhat successful take at AttributeList de-listifying. to AttributeList de-listifying:.
erichkeane edited the summary of this revision. (Show Details)

All tests pass!

All tests pass!

Very cool, thank you!

In D48100, @echristo mentioned that it broke test/Sema/attr-target-mv.c. No tests are changed here (so it's a NFC patch?), i.e. this is not fixed yet? The test case could be adapted by just changing the order of __vectorcall and __attribute__((target to emit the same message as before.

How do you intend to commit it? Just merging with D48100 into a single commit?

include/clang/Sema/AttributeList.h
777–778

According to the LLVM code style, parameter names start with capital letters.

All tests pass!

Very cool, thank you!

In D48100, @echristo mentioned that it broke test/Sema/attr-target-mv.c. No tests are changed here (so it's a NFC patch?), i.e. this is not fixed yet? The test case could be adapted by just changing the order of __vectorcall and __attribute__((target to emit the same message as before.

How do you intend to commit it? Just merging with D48100 into a single commit?

I rebased this after the revert of D48100, so it is 'unbroken' now :) D48100 didn't actually break attr-target-mv.c, it just revealed a nasty side effect of the AttributeList linked-list behavior. Basically:
AttributeList objects were stored as a linked list in 3 places:
1- Declarator
2- DeclSpec
3- DeclaratorChunk

At one point during the process, we go through a bunch of attributes that apply to the function type and remove them from the declspec and put them on the declarator chunk. VectorCall is one of these attribute types. THEN, a RAII container's dtor re-adds them to the DeclSpec.

When they are at the end (pre D48100), this doesn't cause any issue, it ends up being at the end of both linked lists without issue. HOWEVER, when it was put at the FRONT (thanks to D48100), it actually has a 'Next' pointer that gets set as a part of the RAII destructor, resulting in things that are supposed to stay on the declspec appearing 2x, thus being SEMA'ed 2x. By removing the linked-list, we keep it so that only 'type' attributes get redistributed, then restored.

erichkeane updated this revision to Diff 153901.Jul 3 2018, 6:14 AM
erichkeane marked an inline comment as done.

Forgot to answer this before:

How do you intend to commit it? Just merging with D48100 into a single commit?

I intend to commit this on its own, I think this patch stands on its own. I see this as simply unblocking the ability to recommit D48100, since it won't be breaking things that suffer from that failure.

Meinersbur accepted this revision.Jul 3 2018, 9:12 AM

LGTM; however, since this is a relatively core change, maybe seek some more approvals.
Tests pass for me as well.

include/clang/Sema/AttributeList.h
900

[nit] unrelated whitespace change

lib/Sema/SemaType.cpp
605–607

[nit] unnecessary added braced

This revision is now accepted and ready to land.Jul 3 2018, 9:12 AM
erichkeane updated this revision to Diff 153928.Jul 3 2018, 9:17 AM

Thanks for the review! I fixed the two nits. I'm definitely going to wait for @aaron.ballman to give the final approval, he's promised me a slice of his time when he gets a chance :)

aaron.ballman added inline comments.Jul 3 2018, 3:00 PM
include/clang/Sema/AttributeList.h
600

Formatting and naming (don't name the argument Attr).

624

Why the name change here? I read this as somehow creating a new attribute and returning memory that is owned by the caller.

730–731

Would this be a ParsedAttributesView or some such? I'm wondering if "view" or "span" nomenclature would get rid of the awkwardness of "nonowning", but if you prefer the bikeshed with its current color, I'm not going to scream in pain.

741–745

I almost wonder if we should rename this to addAtStart()?

789–790

This signature excludes the ability to add from a const_iterator. Could either use a concrete type of iterator_adapter_base where the interesting type arguments are deduced, or could use a template type and pray/enable_if.

793–800

Ditto.

include/clang/Sema/DeclSpec.h
1152

I think this empty struct can go away entirely now, can't it?

2132–2134

So this is accepting attrs by rvalue reference only because it is expected to steal the resources from attr by taking everything from its pool? If so, that might be a good thing to put into the function comments; it's a bit subtle.

lib/Parse/ParseCXXInlineMethods.cpp
47–48

Spurious formatting change that isn't needed?

lib/Parse/ParseDecl.cpp
1611

How did you come up with 5? I would expect this to be like 1.

lib/Parse/ParseDeclCXX.cpp
2478

Should probably use empty parens for consistency with the rest of the expression.

2764

No longer need to check AccessAttrs.empty().

lib/Parse/ParseObjc.cpp
1202

Is there a benefit to doing this in two passes instead of looping in reverse order? e.g.,

for (const auto &AL : llvm::reverse(from)) {
  if (!AL.isUsedAsTypeAttr()) {
    from.remove(AL);
    attrs.add(AL);
  }
}
1866–1868

No need for the local here, just construct within the call expression.

lib/Parse/ParsePragma.cpp
1432–1433

Instead of calling Attrs.begin() multiple times, I think we should hoist this local variable up and use it when needed.

lib/Parse/ParseStmt.cpp
629–630

No need to check empty().

lib/Sema/DeclSpec.cpp
996–997

writtenBS.ModeAttr = llvm::any_of(getAttributes(), [](const auto &AL) { return AL.getKind() == AttributeList::AT_Mode; });

lib/Sema/SemaDecl.cpp
6205–6206

You can use llvm::any_of() here as well to turn this into a one-liner. Because it looks like this is only used once, it might be worth ditching entirely.

lib/Sema/SemaDeclAttr.cpp
5789

I think this is a lot of churn for one use case -- attributes are generally expected as groups (especially because all of the attribute syntaxes allow you to pass multiple attributes within the designating syntax) and exposing a public method to process a single attribute encourages people to forget that detail. Personally, I think this is better left as a static function, especially because public functions have a habit of having their parameter lists augmented and this is a situation we emphatically do not want that (we want ProcessDeclAttribute() to have the same signature as the handleFooAttr() functions because I have high hopes of scripting more of this away with tablegen magic someday.

6667

I don't believe we need to check empty() here any longer.

lib/Sema/SemaDeclCXX.cpp
2819–2826

Can the return type and argument type be brought into alignment in terms of their constness?

lib/Sema/SemaLambda.cpp
1173

Should probably switch to parens for consistency.

1600

Should probably switch to parens for consistency.

lib/Sema/SemaStmt.cpp
4286

Should probably switch to parens for consistency.

lib/Sema/SemaTemplateInstantiate.cpp
2128

Should probably switch to parens for consistency.

lib/Sema/SemaType.cpp
599

existin glist -> existing list

3997–4001

llvm::any_of?

4537–4538

llvm::any_of?

Honestly, I'm wondering if we want to add some inline helper free functions to AttributeList.h given how often we hit two different use cases. The two common ones seem to be "does this list contain an attribute of this kind?" and "get me the first attribute from this list with this kind".

5026–5028

llvm::any_of()?

5216

attrs -> Attrs

erichkeane updated this revision to Diff 154239.Jul 5 2018, 7:49 AM
erichkeane marked 33 inline comments as done.

I think I got all of @aaron.ballman s comments.

include/clang/Sema/AttributeList.h
624

I thought it better reflected the usage, which is from 'createNew', this takes ownership. Though, I guess it should be pretty obvious since this is on AttributePool (and I can now see the alternate interpretation) so I'll change it back.

789–790

It seems to me just adding a second overload of each is more explicit and about the same amount of typing. I'll add those.

include/clang/Sema/DeclSpec.h
2132–2134

Thats exactly why. It steals ownership of them.

lib/Parse/ParseObjc.cpp
1202

I don't believe so. Likely just a leftover from the iterative conversion process where these were just forward iterators. Presumably SmallVector won't invalidate the iterators on removal in this situation?

@aaron.ballman any more feedback here? Or am I good to go?

aaron.ballman accepted this revision.Jul 12 2018, 12:47 PM

LGTM! Thank you for doing this work, Erich!