This is an archive of the discontinued LLVM Phabricator instance.

[demangler][NFC] Refactor some parsing
ClosedPublic

Authored by urnathan on Jan 21 2022, 5:06 AM.

Details

Summary

There's some unnecessary code duplication in the demangler's parser. This refactors that, deploying boolean variable technology to avoid the duplication. These also happen to help adding module demangling (with an updated mangling scheme that I am working on).

1a) The grammar requires some lookahead concerning <template-args>. We may discover an <unscoped-name> is actually <unscoped-template-name> <template-args>. (When <unscoped-name> was a substitution, there must be a following <template-args>.) Refactor parseName to only have one code path looking for the 'I' indicating <template-args>.

1b) While there I altered the control flow to hold the result in a variable, rather than tail call. Made it easier to debug (and of course an optimizer will DTRT here anyway).

2a) An <unscoped-name> can have an St or StL prefix. No need for completely separate code paths handling the following unqualified-name though.

2b) Also no need to look for both 'St' and 'StL' separately. Look for 'St' and then conditionally swallow an 'L'.

  1. We get a similar issue as #1a when parsing a typeName. Here I just change the control flow slightly to bring the 'break' out to the end of the 'S' block and embed the early return inside an if. That's more in keeping with the code style.
  1. Although NFC, there's a new testcase as that's not covered by the existing demangler tests and is significant in the #1a case above.

Diff Detail

Event Timeline

urnathan requested review of this revision.Jan 21 2022, 5:06 AM
urnathan created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptJan 21 2022, 5:06 AM
aaron.ballman added inline comments.Jan 21 2022, 1:02 PM
libcxxabi/src/demangle/ItaniumDemangle.h
2596–2599

Personally, I think the early returns are a bit easier to follow along with. WDYT?

2629
4077–4079

Was this change NFC, or did we drop a needed check for \0 because there is nothing to lookahead to?

urnathan added inline comments.Jan 21 2022, 5:35 PM
libcxxabi/src/demangle/ItaniumDemangle.h
2596–2599

I can ramble on a bit here. TL;DR: not particularly wedded to this particular change. This variant has a helpful blank line after the 'if () return ...' sequence.

4077–4079

This is NFC. The previous 'look ()' test would cause us not to enter this block, but then fail later as 'S\0'[*] has no encoding. Removing that test causes us to enter this block and barf sooner.

look returns NUL on both end-of-string and of course an actual NUL. They're both equally ill-formed.

urnathan updated this revision to Diff 402146.Jan 21 2022, 5:36 PM
urnathan marked 3 inline comments as done.
ChuanqiXu added inline comments.Jan 22 2022, 8:23 PM
libcxxabi/src/demangle/ItaniumDemangle.h
2672–2683

This looks equal to:

bool IsStd = consumeIf("St") || consumeIf('L');
llvm/include/llvm/Demangle/StringView.h
45–49 ↗(On Diff #402146)

I thought the original logic is simpler.

urnathan added inline comments.
libcxxabi/src/demangle/ItaniumDemangle.h
2672–2683

It turns out the original logic gets in the way of some later nonNFC changes I want to make.

llvm/include/llvm/Demangle/StringView.h
45–49 ↗(On Diff #402146)

This change surprised me -- I did not intentionally make it. These files in llvm/include/llvm/Demangle are synced from libcxxabi/src/demangle via that director's cp-to-llvm.sh script, which I applied. It seems that commit 1f9e18b6565fd1bb69c4b649b9efd3467b3c7c7d
Author: serge-sans-paille <sguelton@redhat.com>
Date: Thu Jan 20 11:21:47 2022 +0100

[llvm] Remove (some) LLVMDemangle header dependencies

just jumped in between my recent resync and this diff. Paging Serge!?

it looks like porting Serge's patches to libcxxabi is fine. working on that, expect a diff shortly -- thanks for noticing ChuanqiXu!

llvm/include/llvm/Demangle/StringView.h
45–49 ↗(On Diff #402146)

Sorry, I wasn't aware of the sync... (/me checks) and it's written in the header file, my bad.

urnathan added inline comments.Jan 23 2022, 2:36 PM
llvm/include/llvm/Demangle/StringView.h
45–49 ↗(On Diff #402146)

np, you;ll see my later diffs (a)D117990 fixing this and (b) D118008 making it harder to make this mistake. it'd be great if you could review those!

ChuanqiXu accepted this revision.Jan 23 2022, 6:08 PM

I've checked that the change should be NFC.

This revision is now accepted and ready to land.Jan 23 2022, 6:08 PM
This revision was landed with ongoing or failed builds.Jan 24 2022, 5:31 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJan 24 2022, 5:31 AM
Herald added a reviewer: Restricted Project. · View Herald Transcript