Page MenuHomePhabricator

[Demangle] Add support for D programming language
AbandonedPublic

Authored by ljmf00 on Sep 27 2021, 12:49 PM.

Details

Reviewers
None
Summary

This is part of https://github.com/dlang/projects/issues/81 .

This patch adds support for D programming language demangling on LLVM core by
porting libiberty D demangler to LLVM C++ codebase. This will allow easier
integration on a future LLDB plugin for D either in the upstream tree or
outside of it. It also includes tests and a fuzzer tool.

Signed-off-by: Luís Ferreira <contact@lsferreira.net>

Diff Detail

Event Timeline

ljmf00 created this revision.Sep 27 2021, 12:49 PM
ljmf00 requested review of this revision.Sep 27 2021, 12:49 PM
ljmf00 changed the edit policy from "All Users" to "Administrators".Sep 27 2021, 12:56 PM
ljmf00 edited the summary of this revision. (Show Details)Sep 27 2021, 12:59 PM
ljmf00 edited the summary of this revision. (Show Details)
ljmf00 updated this revision to Diff 375419.Sep 27 2021, 3:23 PM
ljmf00 updated this revision to Diff 375421.Sep 27 2021, 3:26 PM

As with the Rust demangling (check the commit history, it was done earlier this year), I think this'd be easier to review if it were implemented incrementally - it'll be easier to spot the test coverage for each feature as it's added. Otherwise it's a lot of code to try to validate in one piece.

As with the Rust demangling (check the commit history, it was done earlier this year), I think this'd be easier to review if it were implemented incrementally - it'll be easier to spot the test coverage for each feature as it's added. Otherwise, it's a lot of code to try to validate in one piece.

Well, this is actually a port so it is harder to split up and the logic is well refactored since it is split up according to D mangling ABI nodes. I can split this into independent ABI nodes and incrementally add features like template demangling, although this will radically increase work on my side since I need to manually rewrite it to be incrementally independent patches.

About the coverage, I can't find where the coverage is officially reported. https://llvm.org/reports/coverage/ points to a very outdated report.

ljmf00 changed the edit policy from "Administrators" to "All Users".Sep 28 2021, 8:43 AM
Geod24 added a subscriber: Geod24.Sep 28 2021, 8:15 PM

As with the Rust demangling (check the commit history, it was done earlier this year), I think this'd be easier to review if it were implemented incrementally - it'll be easier to spot the test coverage for each feature as it's added. Otherwise it's a lot of code to try to validate in one piece.

I'll try to provide a bit more verbose background into the contribution:

This is a port of libiberty's D demangler as some comments mention, which was written by Iain Buclaw, the maintainer of GDC (GCC-based D compiler).
We have been in frequent communication with him before the project started and he has no issue with this code being used (and re-licensed) by LLVM. If any document / proof is needed on that side, please let us know.
As the code is part of libiberty, it is the backbone used to demangle D symbols in binutils since v2.25 (released 2014-12-23, see https://forum.dlang.org/thread/rvoqllpimfskvlabprot@forum.dlang.org for the D-targeted announcement) and GDB.

As such, we're also pretty confident that it is quite robust and covers most, if not all, of D feature. As Luis mentioned, if breaking it down is unavoidable for the purpose of moving forward, then it will be, but that sounds counter-productive.

As for me, I am Luis' mentor during his Symmetry Autumn of Code time (a program similar to Google Summer of Code). The rough milestones / description of the project is also available here, as mentioned in the OP.

ljmf00 updated this revision to Diff 375892.Sep 29 2021, 7:53 AM

As with the Rust demangling (check the commit history, it was done earlier this year), I think this'd be easier to review if it were implemented incrementally - it'll be easier to spot the test coverage for each feature as it's added. Otherwise it's a lot of code to try to validate in one piece.

I'll try to provide a bit more verbose background into the contribution:

This is a port of libiberty's D demangler as some comments mention, which was written by Iain Buclaw, the maintainer of GDC (GCC-based D compiler).
We have been in frequent communication with him before the project started and he has no issue with this code being used (and re-licensed) by LLVM. If any document / proof is needed on that side, please let us know.
As the code is part of libiberty, it is the backbone used to demangle D symbols in binutils since v2.25 (released 2014-12-23, see https://forum.dlang.org/thread/rvoqllpimfskvlabprot@forum.dlang.org for the D-targeted announcement) and GDB.

As such, we're also pretty confident that it is quite robust and covers most, if not all, of D feature. As Luis mentioned, if breaking it down is unavoidable for the purpose of moving forward, then it will be, but that sounds counter-productive.

As for me, I am Luis' mentor during his Symmetry Autumn of Code time (a program similar to Google Summer of Code). The rough milestones / description of the project is also available here, as mentioned in the OP.

I think this is roughly the same situation as was true with Rust: https://reviews.llvm.org/D99981#2837256 - but I think in that case and in this one, it'd be good to break it down to better validate the code against LLVM's coding conventions and other code review scrutiny.

As with the Rust demangling (check the commit history, it was done earlier this year), I think this'd be easier to review if it were implemented incrementally - it'll be easier to spot the test coverage for each feature as it's added. Otherwise it's a lot of code to try to validate in one piece.

I'll try to provide a bit more verbose background into the contribution:

This is a port of libiberty's D demangler as some comments mention, which was written by Iain Buclaw, the maintainer of GDC (GCC-based D compiler).
We have been in frequent communication with him before the project started and he has no issue with this code being used (and re-licensed) by LLVM. If any document / proof is needed on that side, please let us know.
As the code is part of libiberty, it is the backbone used to demangle D symbols in binutils since v2.25 (released 2014-12-23, see https://forum.dlang.org/thread/rvoqllpimfskvlabprot@forum.dlang.org for the D-targeted announcement) and GDB.

As such, we're also pretty confident that it is quite robust and covers most, if not all, of D feature. As Luis mentioned, if breaking it down is unavoidable for the purpose of moving forward, then it will be, but that sounds counter-productive.

As for me, I am Luis' mentor during his Symmetry Autumn of Code time (a program similar to Google Summer of Code). The rough milestones / description of the project is also available here, as mentioned in the OP.

I think this is roughly the same situation as was true with Rust: https://reviews.llvm.org/D99981#2837256 - but I think in that case and in this one, it'd be good to break it down to better validate the code against LLVM's coding conventions and other code review scrutiny.

Not sure I'd say it's mandatory - other folks might have other opinions/perspectives/framing, but I tend to think it's the right way to go. (@jhenderson @rnk)

Whilst I appreciate that it'd require a fair bit of extra work, I'm inclined to agree with @dblaikie. I don't see any exception in the contribution policy (https://llvm.org/docs/Contributing.html#how-to-submit-a-patch) for larger ports. In particular, I'm pretty sure other even larger-scale contributions from a pre-done area have had to undergo this process, so it's not a new thing.

IANAL, but re. the relicensing, I guess we need some sort of formal agreement that the original author is happy for the relicensing, especially if any breakdown isn't going to end up with a significant rewrite. I suspect you should contact the LLVM board to discuss it.

One other minor point, please make sure to create patches with full context (i.e. add something like -U999999 to the patch-creation command line).

Whilst I appreciate that it'd require a fair bit of extra work, I'm inclined to agree with @dblaikie. I don't see any exception in the contribution policy (https://llvm.org/docs/Contributing.html#how-to-submit-a-patch) for larger ports. In particular, I'm pretty sure other even larger-scale contributions from a pre-done area have had to undergo this process, so it's not a new thing.

IANAL, but re. the relicensing, I guess we need some sort of formal agreement that the original author is happy for the relicensing, especially if any breakdown isn't going to end up with a significant rewrite. I suspect you should contact the LLVM board to discuss it.

Oh, yeah, thanks for pointing that out - ideally the original author would chime in on this thread and state that they're contributing this work to the LLVM project.

ljmf00 added a comment.Oct 1 2021, 5:29 PM

Whilst I appreciate that it'd require a fair bit of extra work, I'm inclined to agree with @dblaikie. I don't see any exception in the contribution policy (https://llvm.org/docs/Contributing.html#how-to-submit-a-patch) for larger ports. In particular, I'm pretty sure other even larger-scale contributions from a pre-done area have had to undergo this process, so it's not a new thing.

IANAL, but re. the relicensing, I guess we need some sort of formal agreement that the original author is happy for the relicensing, especially if any breakdown isn't going to end up with a significant rewrite. I suspect you should contact the LLVM board to discuss it.

One other minor point, please make sure to create patches with full context (i.e. add something like -U999999 to the patch-creation command line).

I'm already partitioning this patch. When everything is arranged I'm going to post some context here. A small question: should I abandon this patch or just override the diff with a minimal implementation and stack the other patches behind?

About the licensing, should I post a thread on https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev mailing list or anywhere else? Should I CC anyone in specific too, besides Mathias and Iain?

I passed that flag on the other patches, on the latest update I forgot to add it tho, although I'm aware of that.

Whilst I appreciate that it'd require a fair bit of extra work, I'm inclined to agree with @dblaikie. I don't see any exception in the contribution policy (https://llvm.org/docs/Contributing.html#how-to-submit-a-patch) for larger ports. In particular, I'm pretty sure other even larger-scale contributions from a pre-done area have had to undergo this process, so it's not a new thing.

IANAL, but re. the relicensing, I guess we need some sort of formal agreement that the original author is happy for the relicensing, especially if any breakdown isn't going to end up with a significant rewrite. I suspect you should contact the LLVM board to discuss it.

One other minor point, please make sure to create patches with full context (i.e. add something like -U999999 to the patch-creation command line).

I'm already partitioning this patch. When everything is arranged I'm going to post some context here. A small question: should I abandon this patch or just override the diff with a minimal implementation and stack the other patches behind?

I think what was done with the Rust thread made sense - keep this one as-is (repurposing patches that change too much can make the overall review confusing, etc) - either abandoning it now, or waiting until all the work is otherwise committed - and then following up on this patch either way, with a list of the reviews/patches that implemented the functionality (& abandoning this patch at that point, if not done previously). Reference this patch in the other patches for context, to give an overview, etc.

About the licensing, should I post a thread on https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev mailing list or anywhere else? Should I CC anyone in specific too, besides Mathias and Iain?

Having the original author reply to this patch with their affirmative consent to contribute this work under LLVM's licensing, should be enough for me.

About the licensing, should I post a thread on https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev mailing list or anywhere else? Should I CC anyone in specific too, besides Mathias and Iain?

I don't think it'd hurt to post on the llvm-dev mailing list anyway, probably CC'ing someone like Tanya Lattner, who can point you in the right direction of anything specific. Posting it there also has the added benefit of getting more attention to your patch series - I'd hold off posting until you've posted the first few patches on Phabricator, and then you can link to them in the email thread. Alternatively, send the email now and just follow up with the links in a later email on the same thread.

About the licensing, should I post a thread on https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev mailing list or anywhere else? Should I CC anyone in specific too, besides Mathias and Iain?

I don't think it'd hurt to post on the llvm-dev mailing list anyway, probably CC'ing someone like Tanya Lattner, who can point you in the right direction of anything specific. Posting it there also has the added benefit of getting more attention to your patch series - I'd hold off posting until you've posted the first few patches on Phabricator, and then you can link to them in the email thread. Alternatively, send the email now and just follow up with the links in a later email on the same thread.

I finished the patch split. I tried to minimize dependent code, and end up with this list:

https://reviews.llvm.org/D111414
https://reviews.llvm.org/D111415
https://reviews.llvm.org/D111416
https://reviews.llvm.org/D111417
https://reviews.llvm.org/D111419
https://reviews.llvm.org/D111420
https://reviews.llvm.org/D111421
https://reviews.llvm.org/D111422
https://reviews.llvm.org/D111423
https://reviews.llvm.org/D111424
https://reviews.llvm.org/D111425
https://reviews.llvm.org/D111426
https://reviews.llvm.org/D111428
https://reviews.llvm.org/D111429
https://reviews.llvm.org/D111430
https://reviews.llvm.org/D111431
https://reviews.llvm.org/D111432

All patches are compiled independently if stacked correctly. I'm going to write to the mailing list CCing Tanya Lattner and referencing this diff for some context.

I'd be happy to help review this stack at some point - I find demangling interesting, but I'd need a spec to refer to, to help confirm correctness, and cannot confirm I'll have masses of time to do it.

Please either subscribe me or add me as a reviewer to the patches in the patch series if they're ready for review, and I'l take a look at some point, if nobody else does.

I'd be happy to help review this stack at some point - I find demangling interesting, but I'd need a spec to refer to, to help confirm correctness, and cannot confirm I'll have masses of time to do it.

Please either subscribe me or add me as a reviewer to the patches in the patch series if they're ready for review, and I'l take a look at some point, if nobody else does.

https://dlang.org/spec/abi.html#name_mangling

ibuclaw added a subscriber: ibuclaw.EditedOct 12 2021, 2:23 PM

About the licensing, should I post a thread on https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev mailing list or anywhere else? Should I CC anyone in specific too, besides Mathias and Iain?

Having the original author reply to this patch with their affirmative consent to contribute this work under LLVM's licensing, should be enough for me.

Hi @dblaikie, I'm the author of libiberty/d-demangle.c. I'm aware of the SAoC project for this work, and am supportive of it. To be sure, I also ran a question past the copyright clerk at FSF as well, who confirmed that I have worldwide rights to use my contributions as I see fit, including relicensing them.

About the licensing, should I post a thread on https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev mailing list or anywhere else? Should I CC anyone in specific too, besides Mathias and Iain?

Having the original author reply to this patch with their affirmative consent to contribute this work under LLVM's licensing, should be enough for me.

Hi @dblaikie, I'm the author of libiberty/d-demangle.c. I'm aware of the SAoC project for this work, and am supportive of it. To be sure, I also ran a question past the copyright clerk at FSF as well, who confirmed that I have worldwide rights to use my contributions as I see fit, including relicensing them.

Thanks!

MaskRay added inline comments.
llvm/lib/Demangle/DLangDemangle.cpp
1862

For a block comment, // is preferred as well. https://llvm.org/docs/CodingStandards.html#comment-formatting

yshui added a subscriber: yshui.Oct 13 2021, 11:23 AM
rsmith added a subscriber: rsmith.Oct 13 2021, 12:59 PM

This is a port of libiberty's D demangler as some comments mention, which was written by Iain Buclaw, the maintainer of GDC (GCC-based D compiler).
We have been in frequent communication with him before the project started and he has no issue with this code being used (and re-licensed) by LLVM. If any document / proof is needed on that side, please let us know.

Is the consent of @ibuclaw sufficient? According to github, there have been seven contributors to this code: https://github.com/gcc-mirror/gcc/blob/master/libiberty/d-demangle.c. In addition to @ljmf00 and @ibuclaw, they are: (github usernames) bobsayshilol, jakubjelinek, rainers, brobecke, amodra

This is a port of libiberty's D demangler as some comments mention, which was written by Iain Buclaw, the maintainer of GDC (GCC-based D compiler).
We have been in frequent communication with him before the project started and he has no issue with this code being used (and re-licensed) by LLVM. If any document / proof is needed on that side, please let us know.

Is the consent of @ibuclaw sufficient? According to github, there have been seven contributors to this code: https://github.com/gcc-mirror/gcc/blob/master/libiberty/d-demangle.c. In addition to @ljmf00 and @ibuclaw, they are: (github usernames) bobsayshilol, jakubjelinek, rainers, brobecke, amodra

A few additions to your comment:

Please note that I'm not saying that small contributions are not eligible for relicensing, just stating that fact, for further appreciation, if eligible.

llvm/lib/Demangle/DLangDemangle.cpp
1862

I'll address that, however, if find anything more like that, please review on the split patches, please, otherwise I can loose track of it more easily.

ljmf00 abandoned this revision.Dec 2 2021, 10:22 AM

I'm going to abandon this revision since it is has superseded by the list of patches I mentioned above.