This is an archive of the discontinued LLVM Phabricator instance.

WIP: Add error handling to demangle API
Needs ReviewPublic

Authored by thieta on Dec 5 2022, 12:12 AM.

Details

Summary

As discussed in https://reviews.llvm.org/D138595 - the current demangler API
can't really report more detailed errors. This is an attempt to implement
that using std::optional and std::error_code

Diff Detail

Event Timeline

thieta created this revision.Dec 5 2022, 12:12 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 5 2022, 12:12 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
thieta requested review of this revision.Dec 5 2022, 12:12 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 5 2022, 12:12 AM

@dblaikie @jhenderson This is an attempt to implement the API discussed in the other Diff using std::error_code and std::optional. Let me know what you think before I convert the rest of the methods to have similar error handling.

I also think the demangle() and nonMicrosoftDemangle() should probably follow the same API if we decide this is what we want.

dblaikie added inline comments.Dec 5 2022, 5:09 PM
llvm/lib/Demangle/Demangle.cpp
75–78

Part of what I was hoping for was that we wouldn't need an explicit isItaniumEncoding check, even inside the implementation - that Demangler::parse could produce that as a separate error result if it parsed something without the desired prefix. Would that be possible?

jhenderson added inline comments.Dec 6 2022, 12:06 AM
llvm/include/llvm/Demangle/Demangle.h
27

I generally support the use of enum class, but the enums are generally being cast to an int, so maybe that's a sign that this should just be a plain enum?

28

Nit: comments should end with "." (here and below).

53

Nit: capitalization here is inconsistent.

FWIW, the LLVM coding standards state for warning and error messages to a) start with a lower case letter, and b) not to end with a period. Whether that applies in this context is debatable.

66

I'd just overload the function name, rather than appending "WithError". The fact that it takes a std::error_code reference indicates that it will do something to that field, and it's then similar to e.g. how std::filesystem overloads function.

llvm/lib/Demangle/Demangle.cpp
75–78

Not sure if this is what @dblaikie is suggesting, but it seems to me that you could have the underlying demangling code set a status of "NotThisFormat" or similar if it encounters something with the wrong prefix type. This would enable calling code to know whether to try the next format, or just stop. That might not work so well in the face of type demangling though.

Oh, and just to mention that this week is my last week working before a 6 week break, so I may not have the time to review/comment much more on this.

thieta added inline comments.Dec 6 2022, 1:54 AM
llvm/include/llvm/Demangle/Demangle.h
27

Ideally we should change the api to not return a int status from the demangling function instead using DemangleError() - as I noted in my other comment, I was trying to preserve the old API because I thought it was wide-spread. But that doesn't seem to be the case, so I might end up just fixing that instead of making this a normal int.

53

Hmm seems like it should apply here, I will make it consistent.

66

Agreed.

llvm/lib/Demangle/Demangle.cpp
75–78

We can do that - but it becomes a little more complex and we probably need to audit all the places where the demangling functions are used now in that case.

itaninumDemangle() calls: AbstractManglingParser::parse() which returns nullptr if there is any errors, but have no status reports here.

So we could either:

  1. change the AbstractManglingParser API to return the error_code from there instead. This seems possible since it's only used internally in the itaniumDemangler - it's not exposed from there (wonder why the implementation lives in a header in that case). The function also seems to be aware when it's not starting with __Z (or rather it tries to find *Z and we can probably return InvalidFormat if it doesn't). We could change this and either keep the current itaniumDemangle() function which will change the error_code to the current status int.
  2. change itaniumDemangle() to do something similar to what isItaniumEncoding does. but this seems to just move the responsibility for it.

I was going to say that it's harder to change the API of itaniumDemangle since it's used "everywhere" - but that doesn't seem to be true - searching for it only seems to happen in 10 places:

https://github.com/search?q=repo%3Allvm%2Fllvm-project+itaniumDemangle%28+language%3AC%2B%2B&type=code&l=C%2B%2B

I am going to give it a try to move the logic to AbstractManglingParser to return a error_code instead and see if it can work.

thieta added a comment.Dec 6 2022, 1:54 AM

Oh, and just to mention that this week is my last week working before a 6 week break, so I may not have the time to review/comment much more on this.

Enjoy your vacation!

thieta updated this revision to Diff 496059.Feb 9 2023, 2:32 AM

rebased and implemented some of the suggestions.

thieta added a comment.Feb 9 2023, 2:35 AM

Hi - I finally got around to picking this one up again. This is rebased, passing all tests and now we pass the std::error_code directly to the API and changed all the callers to itaniumDemangle() and changed most of the places to actually check the error_code instead of the return const char*.

What do you guys think about this API change? If we like this in general, I can start changing the other demangle calls, either as part of this diff or in separate diffs per demangle API.

thieta added inline comments.Feb 9 2023, 2:37 AM
llvm/include/llvm/Demangle/Demangle.h
53

I missed this - but will fix in the next revision of the patch.

llvm/tools/llvm-undname/llvm-undname.cpp
80 ↗(On Diff #496059)

just a temp change until I do the microsoftDemangle api.

Adding std::error_code looks good.

This is an attempt to implement that using std::optional and std::error_code

I don't see std::optional in the patch :)

llvm/include/llvm/Demangle/Demangle.h
53

If there isn't a precedent, as mentioned, no-capitalization is preferred (https://llvm.org/docs/CodingStandards.html#error-and-warning-messages).

Also, no-capitalization makes the message easily composable with a function that appends a prefix.

Could make the error_code out parameter optional (take by pointer, populate if non-null, have a null default parameter value) - then callers who don't benefit from/care about the specific type can continue using the null pointer result signal?

llvm/lib/ProfileData/ItaniumManglingCanonicalizer.cpp
290–296 ↗(On Diff #496059)

This could be changed to parse, check the error and then do the make<NameType>(StringView thing if the parse failed as invalid, maybe? (potentially in a follow-up commit)

jhenderson added inline comments.Feb 20 2023, 1:08 AM
llvm/unittests/Demangle/DemangleTest.cpp
38

Here and below, can you use EXPECT_EQ(E, DemangleError::Success) (or equivalent)? This provides nicer error messages since it gives the actual value if the two aren't the same.

41

The comment seems to be wrapped a bit early too?

Also applies below.