This is an archive of the discontinued LLVM Phabricator instance.

[demangler] stricter NestedName parsing
ClosedPublic

Authored by urnathan on Jan 25 2022, 5:14 AM.

Details

Summary

The parsing of nested names is a little lax. This corrects that.

  1. The 'L' local name prefix cannot appear before a NestedName -- only

within it. Let's remove that check from parseName, and then adjust
parseUnscopedName to allow it with or without the 'St' prefix.

  1. In a nested name, a <template-param>, <decltype> or <substitution>

can only appear as the first element. Let's enforce that. Note I do
not remove these from the loop, to make the change easier to follow
(such a change will come later).

  1. Given that, there's no need to special case 'St' outside of the

loop, handle it with the other 'S' elements.

  1. There's no need to reset 'EndsWithTemplateArgs' after each

non-template-arg component. Rather, always clear it and then set it
in the template-args case.

  1. An template-args cannot immediately follow a template-args.
  1. The parsing of a CDtor name with ABITags would attach the tags to

the NestedName node, rather than the CDTor node. This is different to
how ABITags are attached to an unscopedName. Make it consistent.

  1. We remain with only CDTor and UnscopedName requireing construction

of a NestedName, so let's drop the PushComponent lambda.

  1. Add some tests to catch the new rejected manglings.

Diff Detail

Event Timeline

urnathan requested review of this revision.Jan 25 2022, 5:14 AM
urnathan created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptJan 25 2022, 5:14 AM

Thanks for adding me as reviewer. It is good for me to learn about CXXABI. Due to the same reason, maybe I couldn't give a high quality feedback. I read code with your description. I think they are basically consistent with your description. But I couldn't found problem in the description if any.

BTW, I feel the if { continue; } style is more simpler than if {} else if {} else if{} style. The cyclomatic complexity is lower. Although there is a little bit redundant, but I feel it more clear and explicit to read and understand.

libcxxabi/src/demangle/ItaniumDemangle.h
3173

This change looks not necessary.

3187–3188

I am not sure about the style guide of libcxxabi. But I feel it is necessary to add comment here. I get the logic from your description of the patch, but I can't get it from the code simply.

3188–3189

Similarly, I can get this from your description that "In a nested name, a <template-param>, <decltype> or <substitution>
can only appear as the first element." I think it would be helpful to add a comment. Personally, I think it would be more easy to read if we move this outside the loop if we want enforce something that could only be at the start.

3202–3227

This line might require further knowledge. Does it implies that if the kind of SoFar is not Node::KNameWithTemplateArgs, the result of make<NameWithTemplateArgs>(SoFar, TA); wouldn't be nullptr?

3229–3230

At least we need a comment some like:

::= St [L]* <unqualified-name>   # ::std::
urnathan added inline comments.Jan 26 2022, 6:01 AM
libcxxabi/src/demangle/ItaniumDemangle.h
3173

FWIW I've been informed that one must put braces on all the elses or none of them -- not a mixture. I dunno.

urnathan updated this revision to Diff 403255.Jan 26 2022, 7:04 AM

While I would probably agree about 'if () {... continue;}' if all those conditionals had different bodies, that;s not the case here. All but one end with 'if (!SoFar) return; Subs.push_back(SoFar);' I find it much easier to understand when that code is not duplicated. The case were we don't push_back remains a continue.

urnathan marked 4 inline comments as done.Jan 26 2022, 7:07 AM
urnathan added inline comments.
libcxxabi/src/demangle/ItaniumDemangle.h
3188–3189

It's a semantic restriction. I've added a comment, but I'm not totally wedded to this particular change.

3202–3227

There is no such inference. Not sure what you're asking here.

3229–3230

That;s not the grammar being parsed though. This is simply a substitution that parseSubstitution does not handle. I've added comments.

OK, it looks consistency for me now. Due to I don't have experience in demangler before, I think it would be better to not give a LGTM by me immediately. But if there is no other reviewers to review this in the next week, I would try to look at it again and make the status green. I understand that sometime it is hard to find someone to review : )

ChuanqiXu accepted this revision.Feb 6 2022, 6:21 PM

We need to format this before landing.

This revision is now accepted and ready to land.Feb 6 2022, 6:21 PM
urnathan marked 3 inline comments as done.Feb 7 2022, 8:01 AM

We need to format this before landing.

I'm going to decline that.
(a) the comment formatting is breaking the long lines of the grammar -- and making that hard to read. Also, not lines I've changed.
(b) the if(state) ... formatting is idiomatic in this file. Also, not something I've changed.

This revision was landed with ongoing or failed builds.Feb 7 2022, 8:21 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptFeb 7 2022, 8:21 AM
Herald added a reviewer: Restricted Project. · View Herald Transcript