This is an archive of the discontinued LLVM Phabricator instance.

[llvm-cxxfilt] Do not consider the prefix dot as part of the demangled symbol name.
ClosedPublic

Authored by DiggerLin on Dec 12 2022, 11:08 AM.

Details

Summary

In AIX OS, function entry label are begin with '.', it can not be decoded currently.
we support to decode the name in this patch for all OS.

Diff Detail

Unit TestsFailed

TimeTest
490 msLinux x64 > Clang.CodeGen/PowerPC::ppc-emmintrin.c
Exit Code: 2 Command Output (stderr):
430 msLinux x64 > Clang.CodeGen/PowerPC::ppc-mmintrin.c
Exit Code: 2 Command Output (stderr):
390 msLinux x64 > Clang.CodeGen/PowerPC::ppc-pmmintrin.c
Exit Code: 2 Command Output (stderr):
500 msLinux x64 > Clang.CodeGen/PowerPC::ppc-smmintrin.c
Exit Code: 2 Command Output (stderr):
390 msLinux x64 > Clang.CodeGen/PowerPC::ppc-tmmintrin.c
Exit Code: 2 Command Output (stderr):
View Full Test Results (8 Failed)

Event Timeline

DiggerLin created this revision.Dec 12 2022, 11:08 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 12 2022, 11:08 AM
DiggerLin requested review of this revision.Dec 12 2022, 11:08 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 12 2022, 11:08 AM
DiggerLin retitled this revision from [AIX] Demangle the name prefix with '.' in AIX OS to [AIX] Demangle the name prefix with '.' in AIX OS for llvm-cxxfilt.Dec 12 2022, 11:09 AM
llvm/tools/llvm-cxxfilt/llvm-cxxfilt.cpp
72

Wouldn't it make more sense to introduce a StripDot variable similar to StripUnderscore?

83

I'm not sure I understand the interaction of all the options and demangling methods, but if nonMicrosoftDemangle() returns false, maybe the potential leading dot should be restored for the rest of this function.

179

Set StripDot here if appropriate

DiggerLin marked 3 inline comments as done.Feb 13 2023, 7:28 AM
DiggerLin added inline comments.Feb 13 2023, 7:28 AM
llvm/tools/llvm-cxxfilt/llvm-cxxfilt.cpp
72

we do not do StripDot, we just demangle the symbol begin with "dot" if there is. (and this is xcoff specific option) I checked AIX clangtana cxxfilt, it do not has an option for "demangling the symbol begin with 'dot') . So I think we also do not need an option here and not need to introduce a new variable "StripDot".

83

if (nonMicrosoftDemangle(DecoratedStr, Result)) return false.
it will continue the following code till

Result = Undecorated ? DotPrefix + Prefix + Undecorated : Mangled;
 free(Undecorated);

not sure whether I answer your comment or not?

179
  1. for '_' , there is an option "--strip-underscore" , and the ""--strip-underscore"(and the option also work for other triple) is set d enabled by default for MachO.
  1. if we want to introduce a new option "--decode-dot", Setting here is reasonable, but it do not think we need to introduce a new option "--decode-dot" , (AIX clangtana cxxfilt do not have any option), and other triple do not has the feature.
DiggerLin marked 3 inline comments as done.

Please remember to upload your diff with full context.

llvm/test/tools/llvm-cxxfilt/delimiters.test
35

I don't think this test should be changing, except possibly to mark it unsupported on AIX (since we don't want the behaviour changing otherwise). You could have an AIX version of the test that shows the behaviour difference, I guess.

llvm/test/tools/llvm-cxxfilt/with-dot-prefix-aix.test
5 ↗(On Diff #498080)
llvm/test/tools/llvm-cxxfilt/with-dot-prefix-non-aix.test
1 ↗(On Diff #498080)
2 ↗(On Diff #498080)

UNSUPPORTED: system-aix is perhaps more idiomatic.

4 ↗(On Diff #498080)
llvm/tools/llvm-cxxfilt/llvm-cxxfilt.cpp
72

I'm ambivalent about whether there's a --strip-dot option (it could be on by default for AIX, and off otherwise). The advantage with having it is that it would be possible to demangle a binary designed for AIX OS on a different platform.

Also "lable" -> "label".

DiggerLin added inline comments.
llvm/tools/llvm-cxxfilt/llvm-cxxfilt.cpp
72

I checked the gnu cxxfilt (in linux OS), it can demangle ._Z3foov to .foo() as you mention in your patch https://reviews.llvm.org/D63722?id=206236#inline-567977

I am not sure whether we can enable to decode the name prefixed with '.' by default without any new option for all OS platform ? @james Henderson

DiggerLin added inline comments.Feb 27 2023, 10:38 AM
llvm/tools/llvm-cxxfilt/llvm-cxxfilt.cpp
72

sorry , above comment should be " I checked the gnu c++filt (in linux OS), it can demangle ._Z3foov to .foo() as you mention in your patch ......." @jhenderson

jhenderson added inline comments.Mar 1 2023, 12:07 AM
llvm/tools/llvm-cxxfilt/llvm-cxxfilt.cpp
72

I am not sure whether we can enable to decode the name prefixed with '.' by default without any new option for all OS platform

I would imagine it would work similarly to how --strip-underscore works, which IIRC is intended to help on Mac. On non-AIX OS, the --strip-dot option would be off by default, so that users don't have the behaviour unless they opt into it. On AIX OS it would be on by default, so users don't have to remember to specify it. Testing would be slightly complicated: you'd need to have testing that shows that the option is off/on by default on the respective OS types, with the rest of the tests having to handle it being in either state. You'd also need to consider how --strip-underscore interacts with it.

llvm/tools/llvm-cxxfilt/llvm-cxxfilt.cpp
72

I disagree. As Digger pointed out, GNU c++filt works on Linux OS to decode .Z3foov to .foo(). --strip-underscore would replace __Z3foov (with two underscores) with foo() and not _foo(). The desired behaviour here is to retain the leading dot in the input. It seems to me that GNU c++filt simply has a default tokenization policy that does not consume leading . characters. I see no reason why llvm-cxxfilt should remain incompatible with GNU c++filt on the treatment of leading .s.

So if default c++filt behaviour is to strip and then re-add a leading dot, on all OSes, I'm fine with llvm-cxxfilt changing to support that behaviour, in which case this patch needs to be made platform agnostic (i.e. tests should work on all platforms).

One thing to look at though is whether a dot in the middle of a name is considered a delimiter or not in c++filt. We should match their behaviour in this case.

So if default c++filt behaviour is to strip and then re-add a leading dot, on all OSes, I'm fine with llvm-cxxfilt changing to support that behaviour, in which case this patch needs to be made platform agnostic (i.e. tests should work on all platforms).

Understood. I agree the patch needs such an update. I would prefer to think of it as processing a leading dot as "normal text" and not part of a symbol name, but the strip+re-add conception of the behaviour leads to the same result.

One thing to look at though is whether a dot in the middle of a name is considered a delimiter or not in c++filt. We should match their behaviour in this case.

I had observed that _Z3f.0v becomes f.0() (including on AIX), hence my characterization that GNU c++filt does not consume leading dots as part of the input symbol name.

I would prefer to think of it as processing a leading dot as "normal text" and not part of a symbol name, but the strip+re-add conception of the behaviour leads to the same result.

Correction: The strip+re-add semantic is correct. .._Z3foov does not demangle.

Given that the demangling code (not the llvm-cxxfilt code though) is used in various places in LLVM, it's important this dot behaviour goes in the right place. I assume therefore that the dot stripping behaviour belongs in LLVMDemangle, on the basis that I expect tools like llvm-objdump should be able to handle the dot too.

jhenderson requested changes to this revision.Apr 27 2023, 12:08 AM
This revision now requires changes to proceed.Apr 27 2023, 12:08 AM

Given that the demangling code (not the llvm-cxxfilt code though) is used in various places in LLVM, it's important this dot behaviour goes in the right place. I assume therefore that the dot stripping behaviour belongs in LLVMDemangle, on the basis that I expect tools like llvm-objdump should be able to handle the dot too.

I agree this should be explicitly addressed. For reference, the GNU implementation of __cxa_demangle does not accept the leading dot (so the dot handling is at a level closer to the tools).

DiggerLin retitled this revision from [AIX] Demangle the name prefix with '.' in AIX OS for llvm-cxxfilt to [AIX] Not consider the prefix dot as part of the demangled symbol name..May 1 2023, 10:36 AM
DiggerLin edited the summary of this revision. (Show Details)
DiggerLin edited the summary of this revision. (Show Details)
DiggerLin retitled this revision from [AIX] Not consider the prefix dot as part of the demangled symbol name. to [llvm-cxxfilt] Not consider the prefix dot as part of the demangled symbol name..
DiggerLin updated this revision to Diff 518451.May 1 2023, 10:59 AM
DiggerLin marked 4 inline comments as done.
DiggerLin added a comment.EditedMay 1 2023, 12:25 PM

Given that the demangling code (not the llvm-cxxfilt code though) is used in various places in LLVM, it's important this dot behaviour goes in the right place. I assume therefore that the dot stripping behaviour belongs in LLVMDemangle, on the basis that I expect tools like llvm-objdump should be able to handle the dot too.

I think it is difficult to implement the functionality in LLVMDemangle, for example, if I implement the functionality in the function llvm::nonMicrosoftDemangle(const char *MangledName, std::string &Result)

it will cause the symbol name "_._Z3f.0v" to be demangled as ".f.0()" in some place.

for example ,in the file Demangle.cpp

std::string llvm::demangle(const std::string &MangledName) {
  std::string Result;
  const char *S = MangledName.c_str();

  if (nonMicrosoftDemangle(S, Result))
    return Result;

  if (S[0] == '_' && nonMicrosoftDemangle(S + 1, Result))
    return Result;
....
}

but gnu c++filt demangle above as

[zhijian@krypton src]$ c++filt _._Z3f.0v
_._Z3f.0v

I think it is difficult to implement the functionality in LLVMDemangle, for example, if I implement the functionality in the function llvm::nonMicrosoftDemangle(const char *MangledName, std::string &Result)

it will cause the symbol name "_._Z3f.0v" to be demangled as ".f.0()" in some place.

for example ,in the file Demangle.cpp

std::string llvm::demangle(const std::string &MangledName) {
  std::string Result;
  const char *S = MangledName.c_str();

  if (nonMicrosoftDemangle(S, Result))
    return Result;

  if (S[0] == '_' && nonMicrosoftDemangle(S + 1, Result))
    return Result;
....
}

Would adding a if (S[0] == '.' && nonMicrosoftDemangle(S + 1, Result)) case there be appropriate?

DiggerLin added a comment.EditedMay 2 2023, 10:55 AM
I think it is difficult to implement the functionality in LLVMDemangle, for example, if I implement the functionality in the function llvm::nonMicrosoftDemangle(const char *MangledName, std::string &Result)

it will cause the symbol name "_._Z3f.0v" to be demangled as ".f.0()" in some place.

for example ,in the file Demangle.cpp

std::string llvm::demangle(const std::string &MangledName) {
  std::string Result;
  const char *S = MangledName.c_str();

  if (nonMicrosoftDemangle(S, Result))
    return Result;

  if (S[0] == '_' && nonMicrosoftDemangle(S + 1, Result))
    return Result;
....
}

Would adding a if (S[0] == '.' && nonMicrosoftDemangle(S + 1, Result)) case there be appropriate?

if I change something like as your suggestion

std::string llvm::demangle(const std::string &MangledName) {
  std::string Result;
  const char *S = MangledName.c_str();

  if (nonMicrosoftDemangle(S, Result))
    return Result;

  if (S[0] == '_' && S[1] != '.' && nonMicrosoftDemangle(S + 1, Result))
    return Result;

  if (char *Demangled = microsoftDemangle(S, nullptr, nullptr)) {
    Result = Demangled;
    std::free(Demangled);
    return Result;
  }

  return MangledName;
}

bool llvm::nonMicrosoftDemangle(const char *MangledName, std::string &Result) {
  char *Demangled = nullptr;

  // Not consider the prefix dot as part of the demangled symbol name.
  if (MangledName[0] == '.') {
    ++MangledName;
    Result = ".";
  }

  if (isItaniumEncoding(MangledName))
    Demangled = itaniumDemangle(MangledName, nullptr, nullptr, nullptr);
  else if (isRustEncoding(MangledName))
    Demangled = rustDemangle(MangledName);
  else if (isDLangEncoding(MangledName))
    Demangled = dlangDemangle(MangledName);

  if (!Demangled)
    return false;

  Result += Demangled;
  std::free(Demangled);
  return true;

it will be have problem in the function demangleMachO llvm-nm.cpp

static std::optional<std::string> demangle(StringRef Name) {
  std::string Demangled;
  if (nonMicrosoftDemangle(Name.str().c_str(), Demangled))
    return Demangled;
  return std::nullopt;
}


static std::optional<std::string> demangleMachO(StringRef Name) {
  if (!Name.empty() && Name[0] == '_')
    Name = Name.drop_front();
  return demangle(Name);
}

the function demangleMachO still will decode
symbol name "_._Z3f.0v" to be demangled as ".f.0()"

so my suggestion is putting the removing and adding prefix 'dot' back in demangle() function of in each llvm tools. (as my current patch implement).

jhenderson added a subscriber: MaskRay.

My opinion hasn't changed that it's a mistake to put the dot demangling in llvm-cxxfilt specifically. That will mean you'll need to make the same change in llvm-nm, llvm-readelf, llvm-objdump and probably other tools too. Code duplication is bad. The next tool to be added might not realise that they can't just use the main demangler library as-is to do their name demangling.

If there is tool code that conflicts with that behaviour (such as the llvm-nm mach-o code), perhaps it's that tool code that should change instead, or perhaps we need to enhance the demangler library a little further, for example by adding an optional "strip underscore" behaviour to it. Adding @MaskRay as a reviewer in case he's got a different opinion.

llvm/test/tools/llvm-cxxfilt/delimiters.test
36

That being said, this isn't the right place for these new test cases. The only thing that should change in this test is related to the ._Z3Foo string - see below.

This test is about testing which characters are treated as delimiters, i.e. characters that separate sequences of names that are to be demangled individually. It is not about testing how things are demangled in the presence of ".". You should add that testing in a new test file.

With the change in behaviour of leading dot prefixes, this test itself doesn't actually test the intended behaviour. The intention was to show that a dot was treated as part of the mangled name, rather than a separator between the previous character sequence (in this case simply the preceding space), and the subsequent mangled name (_Z3Foo). If the dot was treated as a delimiter, the _Z3Foo would have been demangled, resulting in the final part of the output being _Z3Foo$ .Foo. As you can see, this is indistinguishable from the case where ._Z3Foo as a whole is demangled (to .Foo). Instead, this bit of the bit of the check should be modified such that the new code can't demangle it still, but if instead the dot was treated as delimiter, it could be. For example, you could have two valid name manglings separated by a single ".", e.g. _Z3Foo._Z3Bar. If the dot were a delimiter, that would result in an output of Foo.Bar, but since it isn't, the original input should be emitted to the output (i.e. _Z3Foo._Z3Bar).

I hope that makes sense!

llvm/tools/llvm-cxxfilt/llvm-cxxfilt.cpp
71

Nit: "Do not consider"

Also applies to the patch title.

DiggerLin marked an inline comment as done.Sep 21 2023, 11:19 AM

My opinion hasn't changed that it's a mistake to put the dot demangling in llvm-cxxfilt specifically. That will mean you'll need to make the same change in llvm-nm, llvm-readelf, llvm-objdump and probably other tools too. Code duplication is bad. The next tool to be added might not realise that they can't just use the main demangler library as-is to do their name demangling.

If there is tool code that conflicts with that behaviour (such as the llvm-nm mach-o code), perhaps it's that tool code that should change instead, or perhaps we need to enhance the demangler library a little further, for example by adding an optional "strip underscore" behaviour to it. Adding @MaskRay as a reviewer in case he's got a different opinion.

I think it is difficult to llvm::demangle for the llvm-cxxfilt , since there is option `--strip-underscore` , please see the comment in https://reviews.llvm.org/D110664#inline-1063385 , the llvm-nm, llvm-readelf, llvm-objdump can share the same llvm::demangle()

std::string llvm::demangle(std::string_view MangledName) {

std::string Result;

if (nonMicrosoftDemangle(MangledName, Result))
  return Result;

if (starts_with(MangledName, '_') &&
    nonMicrosoftDemangle(MangledName.substr(1), Result))
  return Result;
 
if (char *Demangled = microsoftDemangle(MangledName, nullptr, nullptr)) {
  Result = Demangled;
  std::free(Demangled);
} else {
  Result = MangledName;
}
return Result;

}

llvm/test/tools/llvm-cxxfilt/delimiters.test
36

The intention was to show that a dot was treated as part of the mangled name

._Z3Foo is a symbol name in AIX , but the ._Z3Foo is not a mangle name ,only _Z3Foo is a mangle name, we put the dot in front of the _Z3Foo to generate the entry label symbol of function in AIX.

but if instead the dot was treated as delimiter, it could be. For example, you could have two valid name manglings separated by a single ".", e.g. _Z3Foo._Z3Ba

according to the https://itanium-cxx-abi.github.io/cxx-abi/abi.html#mangling , the dot maybe part of symbol (but in the middle of the mangled name, not the front of the mangle name).

So we have to do some special for the front dot of symbol name for AIX.

DiggerLin marked an inline comment as done.EditedSep 21 2023, 11:49 AM

My opinion hasn't changed that it's a mistake to put the dot demangling in llvm-cxxfilt specifically. That will mean you'll need to make the same change in llvm-nm, llvm-readelf, llvm-objdump and probably other tools too. Code duplication is bad. The next tool to be added might not realise that they can't just use the main demangler library as-is to do their name demangling.

If there is tool code that conflicts with that behaviour (such as the llvm-nm mach-o code), perhaps it's that tool code that should change instead, or perhaps we need to enhance the demangler library a little further, for example by adding an optional "strip underscore" behaviour to it. Adding @MaskRay as a reviewer in case he's got a different opinion.

  1. there is option --strip-underscore in the llvm-cxxfilt, it is difficult to use std::string llvm::demangle(std::string_view MangledName) for the llvm-cxxfilt , please see the https://reviews.llvm.org/D110664#inline-1063385
  1. even I add a new parameter to function std::string llvm::demangle(std::string_view MangledName) as std::string llvm::demangle(std::string_view MangledName,bool StripUnderscore=true)

    But the function std::string llvm::demangle() has the functionailty microsoftDemangle(MangledName, nullptr, nullptr) which do not need by the static std::string demangle(const std::string &Mangled) of llvm-cxxfile.cpp
  1. In the static std::string demangle(const std::string &Mangled) of llvm-cxxfile.cpp ,if demangle do not success, it still to do

if nonMicrosoftDemangle not successful, it will continue to do

if (Types)
    Undecorated = itaniumDemangle(DecoratedStr);

  if (!Undecorated && starts_with(DecoratedStr, "__imp_")) {
    Prefix = "import thunk for ";
    Undecorated = itaniumDemangle(DecoratedStr.substr(6));
  }

but std::string llvm::demangle(std::string_view MangledName) do not return a bool to indicated demangle is successful or not.

  1. llvm-nm, llvm-readelf, llvm-objdump can use the same std::string llvm::demangle(std::string_view MangledName) , I will create as seperate NFC patch for the llvm-nm, llvm-readelf, llvm-objdump. (https://reviews.llvm.org/D159539).

@jhenderson

DiggerLin retitled this revision from [llvm-cxxfilt] Not consider the prefix dot as part of the demangled symbol name. to [llvm-cxxfilt] Do not consider the prefix dot as part of the demangled symbol name..Sep 21 2023, 12:13 PM
DiggerLin updated this revision to Diff 557196.Sep 21 2023, 1:17 PM
DiggerLin marked an inline comment as done.

address comment

DiggerLin updated this revision to Diff 557218.Sep 21 2023, 7:07 PM

rebase the code

Could you not just modify nonMicrosoftDemangle to handle the dot prefix? That would mean this works for all tools, including llvm-cxxfilt, rather than needing this and the separate D159539 patch. I'm assuming that a dot prefix in front of a Microsoft mangled name doesn't make sense...

llvm/test/tools/llvm-cxxfilt/delimiters.test
36

The intention was to show that a dot was treated as part of the mangled name

._Z3Foo is a symbol name in AIX , but the ._Z3Foo is not a mangle name ,only _Z3Foo is a mangle name, we put the dot in front of the _Z3Foo to generate the entry label symbol of function in AIX.

You're arguing about semantics here. As far as I'm concerned, the dot is part of the mangled name. The demangling code removes it, demangles the rest of the name, and then readds it, but it's still part of the mangled name (just not part of the Itanium mangling).

But regardless of the exact terminology, it's clear you didn't understand my comment. We both agree that the dot is part of the name (whether it's part of the mangled name or not is irrelevant in this context). What's important to this test is that it isn't a delimiter (i.e. a character that separates a string into two separate names) and the rest of my comment and explanation still stands: your change in behaviour has rendered this part of the test as no longer testing what it wanted to test and therefore the test needs updating as I already explained (and you need proper testing of the dot demangling in a different test, which I see you've added).

llvm/test/tools/llvm-cxxfilt/prefix-dot.test
1

I suggest we use the phrase "dot prefix" rather than "prefix dot" in the test name and comment as it's more natural. Same goes in other places in this patch.

3–4

I don't think you need this many strings to show the behaviour. One case where it can be successfully demangled and one case where it cannot be demangled I think should be sufficient.

This comment was removed by DiggerLin.
DiggerLin added a comment.EditedSep 22 2023, 6:47 AM

Could you not just modify nonMicrosoftDemangle to handle the dot prefix? That would mean this works for all tools, including llvm-cxxfilt, rather than needing this and the separate D159539 patch.

as I mention before, If I modify in `nonMicrosoftDemangle' as

bool llvm::nonMicrosoftDemangle(const char *MangledName, std::string &Result) {
  char *Demangled = nullptr;

  // Not consider the prefix dot as part of the demangled symbol name.
  if (MangledName[0] == '.') {
    ++MangledName;
    Result = ".";
  }

  if (isItaniumEncoding(MangledName))
    Demangled = itaniumDemangle(MangledName, nullptr, nullptr, nullptr);
  else if (isRustEncoding(MangledName))
    Demangled = rustDemangle(MangledName);
  else if (isDLangEncoding(MangledName))
    Demangled = dlangDemangle(MangledName);

  if (!Demangled)
    return false;

  Result += Demangled;
  std::free(Demangled);
  return true;

The llvm-nm or llvm-cxxfilt for machO will demangle "_._Z3f.0v" as ".f.0()"
but
[zhijian@krypton src]$ /opt/at15.0/bin/c++filt _._Z3f.0v
_._Z3f.0v

I'm assuming that a dot prefix in front of a Microsoft mangled name doesn't make sense...

in llvm-cxxfilt.cpp, the static std::string demangle(const std::string &Mangled) do not invoke microsoftDemangle

DiggerLin marked 2 inline comments as done.

address comment

Could you not just modify nonMicrosoftDemangle to handle the dot prefix? That would mean this works for all tools, including llvm-cxxfilt, rather than needing this and the separate D159539 patch.

as I mention before, If I modify in `nonMicrosoftDemangle' as

bool llvm::nonMicrosoftDemangle(const char *MangledName, std::string &Result) {
  char *Demangled = nullptr;

  // Not consider the prefix dot as part of the demangled symbol name.
  if (MangledName[0] == '.') {
    ++MangledName;
    Result = ".";
  }

  if (isItaniumEncoding(MangledName))
    Demangled = itaniumDemangle(MangledName, nullptr, nullptr, nullptr);
  else if (isRustEncoding(MangledName))
    Demangled = rustDemangle(MangledName);
  else if (isDLangEncoding(MangledName))
    Demangled = dlangDemangle(MangledName);

  if (!Demangled)
    return false;

  Result += Demangled;
  std::free(Demangled);
  return true;

The llvm-nm or llvm-cxxfilt for machO will demangle "_._Z3f.0v" as ".f.0()"
but
[zhijian@krypton src]$ /opt/at15.0/bin/c++filt _._Z3f.0v
_._Z3f.0v

Honestly, I don't think I have an issue with this behaviour. --strip-underscore should literally just drop the leading underscore before demangling the rest of the string, and therefore the same should happen for tools where the underscore stripping behaviour is implicit. Whilst I agree there's no use-case for extra leading underscores (a Mach-O thing) and leading dots (XCOFF), the interaction of the option just seems natural to me. However, I don't really care about it, so if it's easier to do it the other way (i.e. only supporting an extra leading undescore, or a dot), I'm not going to fight about it.

Aside: the example you provided with GNU c++filt is not applicable, because you didn't specify the --strip-underscore option.

llvm/lib/Demangle/Demangle.cpp
27 ↗(On Diff #557248)

Please mark this false with a /*StripDot=*/ label

52 ↗(On Diff #557248)
llvm/test/tools/llvm-cxxfilt/delimiters.test
36

This test still needs updating.

llvm/test/tools/llvm-cxxfilt/dot-prefix.test
2 ↗(On Diff #557248)
llvm/tools/llvm-cxxfilt/llvm-cxxfilt.cpp
70

As we're not actually stripping the dot (it's restored at the end), I recommend naming this CanHaveLeadingDot or something to that effect. Same goes in the nonMicrosoftDemangle function.

DiggerLin marked 4 inline comments as done.Sep 25 2023, 10:39 AM

rebase the code and address James' comment.

Thanks for the update. Doing the following looks good to me.

bool CanHaveLeadingDot = true;
if (StripUnderscore)
  if (DecoratedStr[0].consume_front("_")) // simplified
    CanHaveLeadingDot = false;

std::string Result;
if (nonMicrosoftDemangle(DecoratedStr, Result, CanHaveLeadingDot))

Thanks for the update. Doing the following looks good to me.

bool CanHaveLeadingDot = true;
if (StripUnderscore)
  if (DecoratedStr[0].consume_front("_")) // simplified
    CanHaveLeadingDot = false;

std::string Result;
if (nonMicrosoftDemangle(DecoratedStr, Result, CanHaveLeadingDot))

DecoratedStr is object of class std::string_view, it do not have API consume_front https://en.cppreference.com/w/cpp/string/basic_string_view

DiggerLin updated this revision to Diff 557337.Sep 25 2023, 3:37 PM
MaskRay accepted this revision.Sep 25 2023, 3:54 PM

LGTM

jhenderson accepted this revision.Sep 25 2023, 11:37 PM
jhenderson added a subscriber: shchenz.

One remaining comment, then looks good to me

For reference, somebody else (possibly @shchenz?) put up https://github.com/llvm/llvm-project/pull/67389. They appear to be working on xcoff too, so it may be worth liaising with them/getting them involved with other reviews too.

llvm/test/tools/llvm-cxxfilt/delimiters.test
35

As per my previous comments, I think I'd prefer this to be _Z3Foo._Z3Bar, in keeping with the existing patterns above, which should then be identical in the output (i.e. _Z3Foo._Z3Bar).

This revision is now accepted and ready to land.Sep 25 2023, 11:37 PM

For reference, somebody else (possibly @shchenz?) put up https://github.com/llvm/llvm-project/pull/67389. They appear to be working on xcoff too, so it may be worth liaising with them/getting them involved with other reviews too.

Thanks @jhenderson. The implementation LGTM.

DiggerLin marked an inline comment as done.Sep 26 2023, 7:04 AM
DiggerLin added inline comments.
llvm/test/tools/llvm-cxxfilt/delimiters.test
35

without the patch or with the patch llvm-cxxfilt has the same behavior as

bash> llvm-cxxfilt _Z3Foo._Z3Bar
 Foo (._Z3Bar)

so I do not use the _Z3Foo._Z3Bar as your suggestion.