This is an archive of the discontinued LLVM Phabricator instance.

[GCC] PR23529 Mangler part of attrbute abi_tag support
ClosedPublic

Authored by DmitryPolukhin on Mar 10 2016, 4:56 AM.

Details

Summary

Original patch by Stefan Bühler http://reviews.llvm.org/D12834

Difference between original and this one:

  • fixed all failing tests
  • fixed mangling for global variable outside namespace
  • emit ABI tags for guards and local names (i.e. GCC ABI 10+)
  • clang-format + other stylistic changes
  • significantly reworked patch according to Richard's suggestions

Sema part is here: http://reviews.llvm.org/D17567

Diff Detail

Repository
rL LLVM

Event Timeline

DmitryPolukhin retitled this revision from to [GCC] PR23529 Mangler part of attrbute abi_tag support.
DmitryPolukhin updated this object.
DmitryPolukhin added a reviewer: majnemer.

Re-base to resolve merge conflicts with r263109.

majnemer added inline comments.
lib/AST/ItaniumMangle.cpp
217–218 ↗(On Diff #50416)

Please add a comment describe this state.

329 ↗(On Diff #50416)

Please use = instead of list initialization.

332 ↗(On Diff #50416)

Variables start with an uppercase letter.

339–340 ↗(On Diff #50416)

const typically goes before the type.

363 ↗(On Diff #50416)

Don't use dyn_cast if you aren't going to use the value, use isa instead.

364–365 ↗(On Diff #50416)

commented out code?

394 ↗(On Diff #50416)

Sentences in LLVM comments start with a capital letter and end with a period.

427 ↗(On Diff #50416)

Is this clang-format'd?

670–671 ↗(On Diff #50416)

Why do you have an assert and check for that the pointer is not null?

824 ↗(On Diff #50416)

Use const auto * if the type is obvious from the RHS.

1447 ↗(On Diff #50416)

Variables start with an upper case letter.

4641–4642 ↗(On Diff #50416)

I think a noun is missing between "for" and "but".

rnk added inline comments.Mar 14 2016, 3:45 PM
lib/AST/ItaniumMangle.cpp
268–290 ↗(On Diff #50416)

I think it would be a valuable exercise to write up something more concrete than this in a .rst file in docs/. I think part of the reason why this patch has languished so long is because there is no specification or documentation to speak of. This new abi_tag is a major departure from the wonderful *standard* Itanium C++ ABI that Unix has enjoyed. On Windows, we suffer greatly from having to follow each and every implementation detail that Microsoft accidentally baked into their ABI. It would be good if we could put together a document that we could send back to someone who works on GCC, so that they can verify that that is how abi_tag is intended to work, even if GCC deviates from it in some places.

test/SemaCXX/attr-abi-tag.cpp
1 ↗(On Diff #50416)

This doesn't seem like a Sema test. Maybe fold it into the mangling test?

DmitryPolukhin marked 14 inline comments as done.

All comments resolved.

PTAL

lib/AST/ItaniumMangle.cpp
268–290 ↗(On Diff #50416)

It was part of previous commit with syntax of the attribute, please see http://clang.llvm.org/docs/ItaniumMangleAbiTags.html

Also as far as I understand, Jason Merrill is going to write some document about it, see https://llvm.org/bugs/show_bug.cgi?id=23529#c58 a

427 ↗(On Diff #50416)

Yes, that is how clang-format formats this in-class {}-initializer. But I removed it and replaced with initialization in c-tor.

olifre added a subscriber: olifre.Mar 15 2016, 1:51 AM
rjmccall edited edge metadata.Mar 15 2016, 11:29 AM

It's languished because the idea itself has unavoidable problems with incomplete types.

rsmith added inline comments.Mar 18 2016, 8:11 AM
lib/AST/ItaniumMangle.cpp
272–299 ↗(On Diff #50699)

Just reference the documentation for most of this, you don't need to duplicate so much of the description here.

275 ↗(On Diff #50699)

build -> built

321–323 ↗(On Diff #50699)

Do we need to? IIUC, the only time we need this list is when determining the set of "available" tags for a function declaration with a tagged return type, and in that case, a tag can only be available from a substitution if it's also available from the target of that substitution (right?).

326–328 ↗(On Diff #50699)

It seems unnecessarily expensive to hold these as std::sets.

348–361 ↗(On Diff #50699)

Why do we need a stack of these? It seems like we only need one set of available tags for the complete mangling process (it should only be used once at the top level).

DmitryPolukhin edited edge metadata.
DmitryPolukhin marked 4 inline comments as done.

Fixed comments.

PTAL

lib/AST/ItaniumMangle.cpp
321–323 ↗(On Diff #50699)

It is not only for functions but also for variables but yes, as far as I understand, substitution itself cannot add tags and target should have all tags. I removed FIXME.

348–361 ↗(On Diff #50699)

Stack is required for mangling local names. In that case we have more than one list of available tags and it can be enclosed.

aroth added a subscriber: aroth.Mar 26 2016, 9:24 AM

Friendly ping, please take a look!

TwoFx added a subscriber: TwoFx.Apr 2 2016, 3:45 PM

Richard, Reid and David,

Friendly ping, please take a look to this patch. If there are no more comments/suggestion, I think it is better to commit this patch iterate on GCC abi_tag support based on users' feedback.

azat added a subscriber: azat.Apr 13 2016, 8:22 AM
sberg added a subscriber: sberg.Apr 13 2016, 8:54 AM

Richard, this is just weekly friendly ping about abi_tag support.

Please let me know if you think if something in this patch needs improvements or you are waiting for something specific before committing this patch.

Weekly friendly ping, PTAL!

rnk edited edge metadata.May 3 2016, 8:47 AM

I think Richard has a counterexample that shows that the "NullOut" approach to computing abi_tag sets isn't the right way to go. I wasn't able to craft it myself, but I figured I should send along the feedback that maybe a separate, up-front pass over the return type with a custom TypeVisitor might be a better way to do this.

Richard, could you please share your counterexample so I could test it on my patch and GCC?

As for separate pass, it was my first reaction on the original patch from Stefan but soon I realized that I'll have to copy parts of magnler to some simplified mangler but with high probability of forgetting something and with code duplication. So now I think approach which runs full mangler in special mode is better and less error prone.

DmitryPolukhin edited edge metadata.

+ rebase
+ added testcase with Richard's example

foutrelis added inline comments.
test/CodeGenCXX/mangle-abi-tag.cpp
115 ↗(On Diff #56116)

I'm seeing a test failure on i686 on Arch Linux (all tests pass on x86_64):

I'm using this patch and a slightly tweaked version of D17567 on top of clang 3.8.0; not sure if that's the reason for the failure.

Thank you for reporting this issue, I'll take a look.
I'm mostly testing on x86_64 so I may not notice the problem.

Just to confirm seeing exactly the same error as @foutrelis on Arch i686 (and everything else pass too), but when building the current 3.9.0svn trunk with only this patch applied unmodified on it. Thus, it's probably not likely to be something connected to Evangelos' specific setup.

cecio added a subscriber: cecio.May 9 2016, 2:37 AM
  • fixed tests for i686 when function may have hidden parameter

@DmitryPolukhin: Dmitry, all tests now pass fine on my setup, both on i686 and x86_64. Thanks for taking care of this issue!

Richard, friendly ping. It seems that your counterexample cause no issue with this patch. I added tests to prove it. Could you please take another look?

One more friendly ping...... :(

Dmitry, for a few days now I'm seeing another regression test failure. It has most likely appeared in the 6 hours frame after r269852, because that's the period of my automatic builds, and r269852 was the last to pass the tests successfully. The failure this time is on both x86_64 and i686:

Fix test/PCH/attrs.c failure due to warning about unsupported abi_tag attribute (committed in rL269869).
This patch implements abi_tag so the warning is not expected with this patch.

Ping?

Can we get some traction on this one, please? The 3.8.1 deadline is tomorrow and I'd really like to see these two patches (plus the related fixes) in it, or some LTS Linux distributions will live without them, and thus with completely broken copies of LLVM.

FWIW, multiple people seem to have tested it and reported good results. Also, Arch Linux has had the patch for several months, without a hiccup, probably others, too.

I think we can get this one in as it is, and maybe apply fixes later if relevant.

cheers,
--renato

majnemer edited edge metadata.May 25 2016, 2:41 PM

One more friendly ping...... :(

I think the best way to make progress on this is to refactor this patch along the lines @rsmith suggested back on May 3.

rsmith edited edge metadata.May 25 2016, 3:20 PM

The 3.8.1 deadline is tomorrow and I'd really like to see these two patches (plus the related fixes) in it,

Realistically, this seems unlikely to make it in time.

or some LTS Linux distributions will live without them, and thus with completely broken copies of LLVM.

No, they'll be shipping with a broken copy of their C++ standard library, and will fail to conform to the Linux Standards Base specifications due to deviating from the Itanium C++ ABI. This patch is working around another popular implementation's failure to follow the relevant specification, nothing more. Please try to appropriately apportion the responsibility here; if your distribution opted into a non-standard ABI for their C++ standard library, you should point out to them that they made a mistake.

ismail added a subscriber: ismail.May 26 2016, 1:15 AM

Richard,

Does this mean this feature will never be accepted? It would help to make a clear statement on this issue. I am not happy how this feature is introduced but it's too late now that every big Linux distro out there switched to the new ABI.

No, they'll be shipping with a broken copy of their C++ standard library, and will fail to conform to the Linux Standards Base specifications due to deviating from the Itanium C++ ABI. This patch is working around another popular implementation's failure to follow the relevant specification, nothing more. Please try to appropriately apportion the responsibility here; if your distribution opted into a non-standard ABI for their C++ standard library, you should point out to them that they made a mistake.

The new ABI is the de facto standard and we have to live with it, or at least support it.
It's too late to convince the major distributions to choose another solution.

Please try to appropriately apportion the responsibility here

It has always been clear who created the issue, GCC ABI 11 are there to stay, with the result of having an unusable clang since Ubuntu 15.10 (at least in my case).
It's responsibility of clang developers to review and accept patches tho, please hit the "Merge" button so clang users can build their applications on recent distros without having to rely on GCC.

Thank you Dmitry for the effort put into this patch and into the weekly friendly pings, your perseverance is admirable.

leezu added a subscriber: leezu.May 28 2016, 11:23 PM
rengolin added a comment.EditedMay 31 2016, 4:07 AM

Please try to appropriately apportion the responsibility here; if your distribution opted into a non-standard ABI for their C++ standard library, you should point out to them that they made a mistake.

I think I have been clear enough on the bug and the list about responsibility... It is clear that the distros screwed that up on their own, and neither GCC not LLVM could do much to work around the complete lack of proper testing before such a big decision.

In no way I think this is something we could have done anything to prevent from happening, nor I'm assuming that this patch is the best way forward for the future.

I only asked this to be reviewed and accepted IFF it's the right (temporary) implementation, so that distros can pull those patches with a bit more confidence.

If it makes to 3.8.1, that'd make things a lot easier for *them* (not us). If it doesn't, it wouldn't make it much harder (they can already pull odd patches). But since this has been tested by many people, including active usage in distros, and it doesn't touch anything outside the ABI tag issue, we could have it in (at least on trunk), so that we can test it better ourselves until 3.9 branches.

I really don't want to pull this one in just before we branch. Nor just after. Regardless of who screwed up, having a fix would be good to LLVM as well, not just the distros.

cheers,
--renato

For the records, this patch implements GCC 6 abi_tag semantic and to the best of my knowledge mangles everything same as GCC 6. To support GCC 5 semantic Clang will need -fabi-version command line flag to support multiple ABI versions. But as far as I can see, Richard is not going to approve this patch so it seems to be dead end.

...

But as far as I can see, Richard is not going to approve this patch so it seems to be dead end.

At the last C++ committee meeting, I observed Richard reaching out to the Redhat folks to ask for a specification so that we could make sure we implemented this feature properly (i.e. as intended). Based on that, I'm fairly certain he's interested in seeing a solution here. We just received the first draft of that specification yesterday. As far as I can tell, Richard also is very busy, like many of us, and it can take time to get to things, even important things. I understand why this is frustrating, but please don't become discouraged.

Richard, can you comment on whether you still want the refactoring you suggested in light of Dmitry's responses?

We just received the first draft of that specification yesterday.

Is this specification confidential? If not, why it hasn't been made available in this code review? Most puzzlingly, why it hasn't been shared with Dmitry, who obviously contributed the most towards making abi_tag supported in clang?

As far as I can tell, Richard also is very busy, like many of us, and it can take time to get to things, even important things. I understand why this is frustrating, but please don't become discouraged.

I understand that Richard is super-busy and apart of doing a lot of stuff himself also reviews *a ton* of other peoples' patches (I wonder how he finds time... probably saves on sleeping :-))

But what's wrong with letting us (Intel) carry on some burden and implement this particular piece? Seeing how this review is developing makes me... yeah, discouraged is the right word. :-(

Yours,

Andrey

Software Engineer
Intel Compiler Team

We just received the first draft of that specification yesterday.

Is this specification confidential?

The link was posted here: http://sourcerytools.com/pipermail/cxx-abi-dev/2016-June/002919.html - which is the mailing list where the C++ ABI specification is discussed.

If not, why it hasn't been made available in this code review?

It was, it was part of my comment.

Most puzzlingly, why it hasn't been shared with Dmitry, who obviously contributed the most towards making abi_tag supported in clang?

If he does not otherwise follow the cxx-abi-dev list, I assume he saw the link when I posted it.

Yours,

Andrey

Software Engineer
Intel Compiler Team

rsmith added a comment.Jun 6 2016, 2:20 PM

Yes, I definitely want us to implement this for GCC compatibility. And now that we have a specification for this feature, we can evaluate whether this is doing the right thing. On that basis:

I still want this refactored so that the normally-mangled part of a function name is only mangled once, rather than being mangled twice (once to find the implicit tag set and once to actually produce the mangled name). As suggested, you can achieve this by first computing the set of tags from the return type, then mangling the encoding to a separate buffer (collecting tags as you go) if the set is non-empty, and finally writing any remaining tags and the buffer contents.

Please factor out a function to mangle the source name and ABI tags for a NamedDecl rather than duplicating that pair of calls throughout the patch.

Current discussion on the ABI list suggests that it is not correct to mangle the return type / variable type to get the implicit tag set. The set of available attributes should be determined by a recursive walk of the original type (prior to any substitution), not by mangling it and seeing what it references. It's not yet clear whether that's the actual design intent or just the emergent behavior of the GCC implementation, however.

The link was posted here: http://sourcerytools.com/pipermail/cxx-abi-dev/2016-June/002919.html - which is the mailing list where the C++ ABI specification is discussed.

Ah, I see... Dmitry told me that this one is not terribly detailed, and you revealed the C++ meeting episode after posting the link -- that's why I thought there is another ("real" :-)) one available. I stand corrected.

Sorry if my message offended you -- knowing you, I just couldn't believe you are not sharing a spec you have. Turns out I was right! :-)

Yours,
Andrey

Yes, I definitely want us to implement this for GCC compatibility. And now that we have a specification for this feature, we can evaluate whether this is doing the right thing. On that basis:

I still want this refactored so that the normally-mangled part of a function name is only mangled once, rather than being mangled twice (once to find the implicit tag set and once to actually produce the mangled name). As suggested, you can achieve this by first computing the set of tags from the return type, then mangling the encoding to a separate buffer (collecting tags as you go) if the set is non-empty, and finally writing any remaining tags and the buffer contents.

Please factor out a function to mangle the source name and ABI tags for a NamedDecl rather than duplicating that pair of calls throughout the patch.

Current discussion on the ABI list suggests that it is not correct to mangle the return type / variable type to get the implicit tag set. The set of available attributes should be determined by a recursive walk of the original type (prior to any substitution), not by mangling it and seeing what it references. It's not yet clear whether that's the actual design intent or just the emergent behavior of the GCC implementation, however.

I'm monitoring the discussion and will try to rewrite this patch according to the guidance and avoid double mangling.

karies added a subscriber: karies.Jun 15 2016, 6:51 AM
DmitryPolukhin edited edge metadata.

This patch set contains serious refactoring to make it closer to what Richard suggested. Unfortunately there is no way to fully implement Richard's idea because we have to mangle twice name of functions and variables. But this patch reuse mangled bare function type generated during tags calculation.

NOTE for people who use this patch for local testing: this patch set was not tested extensively on real world apps so it might have be worse than previous one. If you can, please test it in your environment and report any issues. For now "stable" version of this patch is previous patch set from May 23.

Richard, please take a look and let me know if it looks better for you. IMHO, it becomes smaller and simpler and reuse mangling for function prototype.

  • removed wrong asserts that I used for an experiment
DmitryPolukhin updated this object.

Fixed issue with substitution from function name used in bare type encoding and added test. With this patch Clang passes self build with GCC 5.3.1 headers and C++ libraries so it gives some level of confidence that it works on real apps.

PTAL!

mbaron added a subscriber: mbaron.Jun 23 2016, 11:24 AM

Thank you! I'm happy with this implementation (other than some data structure improvements), and cxx-abi-dev discussion seems to be settling on this approach (mangling the return type / variable type to extract attributes) being the right one.

(Minor nit: please make sure that your comments start with a capital letter and end in a period.)

lib/AST/ItaniumMangle.cpp
352–355 ↗(On Diff #61642)

You're going to sort the result list anyway, so how about adding the tag here without checking if it's already present, then using std::unique to weed out duplicates after you sort?

679–683 ↗(On Diff #61642)

Instead of keeping the UsedAbiTags as a SmallSetVector, you can just keep a SmallVector of them, and sort/unique it here. Since you keep the list of emitted ABI tags sorted, you can then use std::set_difference to extract the additional tags in linear time.

DmitryPolukhin marked 2 inline comments as done.

Use SmallVector instead of SmallSetVector and sort the vector when needed.

rsmith accepted this revision.Jun 29 2016, 11:16 AM
rsmith edited edge metadata.

LGTM, thank you so much for your patience and hard work on this change!

This revision is now accepted and ready to land.Jun 29 2016, 11:16 AM
This revision was automatically updated to reflect the committed changes.

Richard, thank you for the review!

I decided to commit this patch without waiting for GCC response to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71712 (that is last compatibility issues in comparison with GCC6) so more people could test Clang implementation of ABI tags on real apps and report issues if any. All, please let me know (file bug and add me in CC) if you observe any issues with abi_tag implementation in Clang.