Page MenuHomePhabricator

[Demangle] Add minimal support for D simple basic types
ClosedPublic

Authored by ljmf00 on Oct 8 2021, 8:23 AM.

Details

Summary

This patch implements simple demangling of two basic types to add minimal type functionality. This will be later used in function type parsing. After that being implemented we can add the rest of the types and test the result of the type name.

Diff Detail

Unit TestsFailed

TimeTest
70 msx64 debian > LLVM.Bindings/Go::go.test
Script: -- : 'RUN: at line 1'; /var/lib/buildkite-agent/builds/llvm-project/build/bin/llvm-go go=/usr/bin/go test llvm.org/llvm/bindings/go/llvm

Event Timeline

ljmf00 created this revision.Oct 8 2021, 8:23 AM
ljmf00 requested review of this revision.Oct 8 2021, 8:23 AM

Missing test coverage - all the cases added should be tested.

Missing test coverage - all the cases added should be tested.

Well, I thought about this but currently, the demangler is not prepending the type to the demangled symbol so, there is no way to visually test this until https://reviews.llvm.org/D111421. That is why those tests get introduced there instead (types are shown in function parameters). I can add tests to check if this, at least, doesn't fail (nullptr) although I thought that adding tests for all of the types expecting the same output was useless when https://reviews.llvm.org/D111421 get introduced.

Missing test coverage - all the cases added should be tested.

Well, I thought about this but currently, the demangler is not prepending the type to the demangled symbol so, there is no way to visually test this until https://reviews.llvm.org/D111421. That is why those tests get introduced there instead (types are shown in function parameters). I can add tests to check if this, at least, doesn't fail (nullptr) although I thought that adding tests for all of the types expecting the same output was useless when https://reviews.llvm.org/D111421 get introduced.

Perhaps the patches are out of order then, or split too much - code should be tested when it is introduced.

Missing test coverage - all the cases added should be tested.

Well, I thought about this but currently, the demangler is not prepending the type to the demangled symbol so, there is no way to visually test this until https://reviews.llvm.org/D111421. That is why those tests get introduced there instead (types are shown in function parameters). I can add tests to check if this, at least, doesn't fail (nullptr) although I thought that adding tests for all of the types expecting the same output was useless when https://reviews.llvm.org/D111421 get introduced.

Perhaps the patches are out of order then, or split too much - code should be tested when it is introduced.

They are not out of order, according to the ABI specification grammar for name mangling: https://dlang.org/spec/abi.html#name_mangling
This is a problem on this particular demangler, since it doesn't include return types, or variable types. I can add tests for this, anyway, but as I described earlier, it won't be distinguishable at this point.
This behaviour is tested when function symbols are parsable (https://reviews.llvm.org/D111421). Function symbols are dependent of of TypeFunction node (https://dlang.org/spec/abi.html#TypeFunction) and subsequently dependent of Type node, which is what this patch implements.

There is two possible options I can think of:
A. Add intentional support for return types / variable types to the demangled symbol and add visible tests accordingly.
B. Add invisible tests to do increase coverage and remove them lately when https://reviews.llvm.org/D111421 get introduced.

To clarify what do I mean about invisible tests is:

{"_D8demangle1ai", "demangle.a"},
{"_D8demangle1aOi", "demangle.a"},
{"_D8demangle1ae", "demangle.a"},

Those three tests include different mangled symbols but the same demangled symbol.

What are the implications of A. ? It will introduce some complexity to the demangler and doesn't add much benefit to the end user, since you can't have same symbol name with different types.
I guess another point that can be raised is consistency among the other demanglers on LLVM: do they include the return type of a function symbol or type of a variable? For what I analyzed from Rust, it doesn't, but I'm not too familiar with Rust and can't read/understand its mangled names easily to know if it includes the type of a symbol. If you can help me on giving me examples, we can discuss better on what is better to do here :)

Missing test coverage - all the cases added should be tested.

Well, I thought about this but currently, the demangler is not prepending the type to the demangled symbol so, there is no way to visually test this until https://reviews.llvm.org/D111421. That is why those tests get introduced there instead (types are shown in function parameters). I can add tests to check if this, at least, doesn't fail (nullptr) although I thought that adding tests for all of the types expecting the same output was useless when https://reviews.llvm.org/D111421 get introduced.

Perhaps the patches are out of order then, or split too much - code should be tested when it is introduced.

They are not out of order, according to the ABI specification grammar for name mangling: https://dlang.org/spec/abi.html#name_mangling
This is a problem on this particular demangler, since it doesn't include return types, or variable types. I can add tests for this, anyway, but as I described earlier, it won't be distinguishable at this point.
This behaviour is tested when function symbols are parsable (https://reviews.llvm.org/D111421). Function symbols are dependent of of TypeFunction node (https://dlang.org/spec/abi.html#TypeFunction) and subsequently dependent of Type node, which is what this patch implements.

There is two possible options I can think of:
A. Add intentional support for return types / variable types to the demangled symbol and add visible tests accordingly.
B. Add invisible tests to do increase coverage and remove them lately when https://reviews.llvm.org/D111421 get introduced.

To clarify what do I mean about invisible tests is:

{"_D8demangle1ai", "demangle.a"},
{"_D8demangle1aOi", "demangle.a"},
{"_D8demangle1ae", "demangle.a"},

Those three tests include different mangled symbols but the same demangled symbol.

What are the implications of A. ? It will introduce some complexity to the demangler and doesn't add much benefit to the end user, since you can't have same symbol name with different types.
I guess another point that can be raised is consistency among the other demanglers on LLVM: do they include the return type of a function symbol or type of a variable? For what I analyzed from Rust, it doesn't, but I'm not too familiar with Rust and can't read/understand its mangled names easily to know if it includes the type of a symbol. If you can help me on giving me examples, we can discuss better on what is better to do here :)

I think we're getting into the weeds a bit.

A few things that might help: Whole grammar rules don't have to be implemented atomically - a patch could add support for, say, only a basic integer type if that's accessible/testable right now but other types aren't accessible/testable. Then when function types (basic functionality of that could be tested only with integer types?) are added and other types are accessible via that - other types could be added and tested.

When I said out of order, what I meant was maybe some other functionality could be added first (and tested) that would then ennable these type parsing cases to be tested when they're added on top. For instance, maybe function types can be added without these other types being supported - starting with a void() function.

But, to answer the other question, yes return types are rendered when they are mangled. In C++ templates get mangled return types, and they are rendered in the demangling:

$ cat mangle.cpp
void f1() { }
template<typename T>
void f2() { }
int main() {
  f2<int>();
}
$ clang++-tot mangle.cpp -c && nm mangle.o | llvm-cxxfilt
0000000000000010 T main
0000000000000000 T f1()
0000000000000000 W void f2<int>()

Missing test coverage - all the cases added should be tested.

Well, I thought about this but currently, the demangler is not prepending the type to the demangled symbol so, there is no way to visually test this until https://reviews.llvm.org/D111421. That is why those tests get introduced there instead (types are shown in function parameters). I can add tests to check if this, at least, doesn't fail (nullptr) although I thought that adding tests for all of the types expecting the same output was useless when https://reviews.llvm.org/D111421 get introduced.

Perhaps the patches are out of order then, or split too much - code should be tested when it is introduced.

They are not out of order, according to the ABI specification grammar for name mangling: https://dlang.org/spec/abi.html#name_mangling
This is a problem on this particular demangler, since it doesn't include return types, or variable types. I can add tests for this, anyway, but as I described earlier, it won't be distinguishable at this point.
This behaviour is tested when function symbols are parsable (https://reviews.llvm.org/D111421). Function symbols are dependent of of TypeFunction node (https://dlang.org/spec/abi.html#TypeFunction) and subsequently dependent of Type node, which is what this patch implements.

There is two possible options I can think of:
A. Add intentional support for return types / variable types to the demangled symbol and add visible tests accordingly.
B. Add invisible tests to do increase coverage and remove them lately when https://reviews.llvm.org/D111421 get introduced.

To clarify what do I mean about invisible tests is:

{"_D8demangle1ai", "demangle.a"},
{"_D8demangle1aOi", "demangle.a"},
{"_D8demangle1ae", "demangle.a"},

Those three tests include different mangled symbols but the same demangled symbol.

What are the implications of A. ? It will introduce some complexity to the demangler and doesn't add much benefit to the end user, since you can't have same symbol name with different types.
I guess another point that can be raised is consistency among the other demanglers on LLVM: do they include the return type of a function symbol or type of a variable? For what I analyzed from Rust, it doesn't, but I'm not too familiar with Rust and can't read/understand its mangled names easily to know if it includes the type of a symbol. If you can help me on giving me examples, we can discuss better on what is better to do here :)

I think we're getting into the weeds a bit.

A few things that might help: Whole grammar rules don't have to be implemented atomically - a patch could add support for, say, only a basic integer type if that's accessible/testable right now but other types aren't accessible/testable. Then when function types (basic functionality of that could be tested only with integer types?) are added and other types are accessible via that - other types could be added and tested.

When I said out of order, what I meant was maybe some other functionality could be added first (and tested) that would then ennable these type parsing cases to be tested when they're added on top. For instance, maybe function types can be added without these other types being supported - starting with a void() function.

Ok, that makes sense to me. So adding only a few types here to prove this functionality, adding basic testing for them and introduce the rest of the types after function types get introduced and test them.

But, to answer the other question, yes return types are rendered when they are mangled. In C++ templates get mangled return types, and they are rendered in the demangling:

$ cat mangle.cpp
void f1() { }
template<typename T>
void f2() { }
int main() {
  f2<int>();
}
$ clang++-tot mangle.cpp -c && nm mangle.o | llvm-cxxfilt
0000000000000010 T main
0000000000000000 T f1()
0000000000000000 W void f2<int>()

Ok, didn't know that. This can be introduced in a future patch then and do what you suggested, right now.

MaskRay added inline comments.
llvm/lib/Demangle/DLangDemangle.cpp
286
287
ljmf00 retitled this revision from [Demangle] Add support for D simple basic and compound types to [Demangle] Add support for D types back referencing.Jan 5 2022, 12:18 PM
ljmf00 edited the summary of this revision. (Show Details)
ljmf00 retitled this revision from [Demangle] Add support for D types back referencing to [Demangle] Add minimal support for D simple basic types.
ljmf00 edited the summary of this revision. (Show Details)
ljmf00 updated this revision to Diff 397674.Jan 5 2022, 12:22 PM
ljmf00 added a comment.Jan 5 2022, 1:11 PM

@dblaikie I removed all the uncovered types and added just minimal support just to prove type parsing functionality. I change the name of the patch to reflect the changes. I still have Demangled << "int"; and Demangled << "void"; that are covered but with untested behaviour. I can add a comment explaining why or remove it completely if you prefer.

llvm/lib/Demangle/DLangDemangle.cpp
287

Done

ljmf00 updated this revision to Diff 397752.Jan 5 2022, 5:30 PM

Added comment on nullptr testcases.

@dblaikie I removed all the uncovered types and added just minimal support just to prove type parsing functionality. I change the name of the patch to reflect the changes. I still have Demangled << "int"; and Demangled << "void"; that are covered but with untested behaviour. I can add a comment explaining why or remove it completely if you prefer.

Yeah, I think best to remove them for now - introduce them when they're testable. Not sure what the minimal functionality si for now - if you need one of them, but to do nothing, could just have "case 'v': ++Mangled; return Mangled;" or the like.

ljmf00 updated this revision to Diff 399035.Jan 11 2022, 12:08 PM

@dblaikie I removed all the uncovered types and added just minimal support just to prove type parsing functionality. I change the name of the patch to reflect the changes. I still have Demangled << "int"; and Demangled << "void"; that are covered but with untested behaviour. I can add a comment explaining why or remove it completely if you prefer.

Yeah, I think best to remove them for now - introduce them when they're testable. Not sure what the minimal functionality si for now - if you need one of them, but to do nothing, could just have "case 'v': ++Mangled; return Mangled;" or the like.

I added i (int) instead of v (void), since it makes more sense. A void variable is permitted by the ABI but feels weird, especially if adding support for function types before a function that receives a void parameter is also weird. I removed the untested Demangled. Can you please re-review it?

dblaikie added inline comments.Jan 11 2022, 5:58 PM
llvm/lib/Demangle/DLangDemangle.cpp
282

Looks like this function is currently never called with a null parameter (the one call is inside an if (Mangled != nullptr) block). So this should probably be omitted, or replaced with an assertion.

Is the other condition (== '\0') reachable/tested? (I guess it'd fall out naturally from the switch case below - so maybe that could be done instead and this if test/early return could be removed?)

ljmf00 updated this revision to Diff 399156.Jan 11 2022, 6:16 PM
ljmf00 added inline comments.Jan 11 2022, 6:22 PM
llvm/lib/Demangle/DLangDemangle.cpp
282

Makes sense. I omitted the nullptr check, since this was contrarily tested on line 182 and this should never ever happen and an assert wouldn't benefit much as dereferencing will certainly happen right after anyway.

About the null terminator char check, I added a test for that.

dblaikie accepted this revision.Jan 11 2022, 6:24 PM

Good stuff, thanks!

This revision is now accepted and ready to land.Jan 11 2022, 6:24 PM
ljmf00 updated this revision to Diff 399179.Jan 11 2022, 7:21 PM

Added final dots on unit test comments.

This revision was landed with ongoing or failed builds.Jan 12 2022, 2:01 PM
This revision was automatically updated to reflect the committed changes.