This is an archive of the discontinued LLVM Phabricator instance.

[lld] Provide a hook to customize missing symbols error handling
ClosedPublic

Authored by serge-sans-paille on Oct 19 2020, 4:38 AM.

Details

Summary

This is a follow up to https://reviews.llvm.org/D87758, implementing the missing symbol part, as done by binutils.

Diff Detail

Event Timeline

serge-sans-paille requested review of this revision.Oct 19 2020, 4:38 AM
MaskRay added inline comments.Oct 19 2020, 11:15 PM
lld/test/ELF/error-handling-script-linux.test
9

You may want to test (default)/--no-demangle with a _Z symbol.

jhenderson accepted this revision.Oct 23 2020, 3:49 AM

Looks reasonable to me, once @MaskRay's comment has been addressed.

lld/Common/ErrorHandler.cpp
243

I take it missing-symbol is already in use in GNU bfd? undefined-symbol otherwise seems like a better name (since the symbol is present, just undefined).

lld/docs/ld.lld.1
189
This revision is now accepted and ready to land.Oct 23 2020, 3:49 AM
MaskRay added inline comments.Oct 23 2020, 3:28 PM
lld/Common/ErrorHandler.cpp
243

Agree that undefined-symbol is more appropriate.

MaskRay requested changes to this revision.Oct 23 2020, 3:33 PM

Requested changes because

(1) There is a point about missing-symbol -> undefined-symbol
(2) The interaction with correctSpelling is important and should be tested. In the presence of a spelling corrector, I am on the fence whether this script should run.

This revision now requires changes to proceed.Oct 23 2020, 3:33 PM

Fixed undefined vs missing symbol

Added demangling test case

serge-sans-paille marked an inline comment as done.Oct 26 2020, 3:33 AM

Requested changes because

(1) There is a point about missing-symbol -> undefined-symbol
(2) The interaction with correctSpelling is important and should be tested. In the presence of a spelling corrector, I am on the fence whether this script should run.

Pointers would be greatly appreciated for (2) :-)

Requested changes because

(1) There is a point about missing-symbol -> undefined-symbol
(2) The interaction with correctSpelling is important and should be tested. In the presence of a spelling corrector, I am on the fence whether this script should run.

Pointers would be greatly appreciated for (2) :-)

I filed https://sourceware.org/bugzilla/show_bug.cgi?id=26781 for (2). It is referenced in https://sourceware.org/bugzilla/show_bug.cgi?id=26626#c18
I have seen that Nick has changed 'missing-symbol' to 'undefined-symbol' in GNU ld. Thanks!

MaskRay added inline comments.Oct 26 2020, 8:30 AM
lld/test/ELF/error-handling-script-linux.test
13

My feeling is that despite --demangle/--no-demangle, the handling script should receive the original symbol name.

jhenderson added inline comments.Oct 27 2020, 3:19 AM
lld/docs/ld.lld.1
189

Not addressed yet?

lld/test/ELF/error-handling-script-linux.test
13

+1. The script might well try looking for symbols in certain files, for example, in which case it would need to have the original symbol name.

A case could be made for an optional second argument that takes the demangled name, so that the script can print the demangled name without needing to actually do demangling itself.

MaskRay added inline comments.Oct 27 2020, 9:57 AM
lld/test/ELF/error-handling-script-linux.test
13

I will go for not taking another argument for the unmangled symbol. The linker can support Itanium C++ mangling rules but probably not others. Assuming the error handling script is more tight to the distribution, it knows about the environment better and can support other mangling schemes.

Update documentation as asked by @jhenderson
Add a test case with symbol suggestion as suggested by @MaskRay
Always pass mangled symbol name to the script

serge-sans-paille marked 2 inline comments as done.Oct 28 2020, 2:26 AM
jhenderson added inline comments.Oct 28 2020, 2:55 AM
lld/test/ELF/error-handling-script-linux.test
15–20

Did you mean to line up the output for all 5 cases? Looks like the first one isn't lined up.

Also, should the 2nd-5th cases be using -NEXT?

Rebase on missing-lib patch now that it landed.
Update test cases and documentation.

jhenderson accepted this revision.Nov 3 2020, 5:56 AM

LGTM, but needs @MaskRay to approve too.

Requested changes because

(1) There is a point about missing-symbol -> undefined-symbol
(2) The interaction with correctSpelling is important and should be tested. In the presence of a spelling corrector, I am on the fence whether this script should run.

Pointers would be greatly appreciated for (2) :-)

I filed https://sourceware.org/bugzilla/show_bug.cgi?id=26781 for (2). It is referenced in https://sourceware.org/bugzilla/show_bug.cgi?id=26626#c18
I have seen that Nick has changed 'missing-symbol' to 'undefined-symbol' in GNU ld. Thanks!

My point here has not been resolved so I am still waiting. What shall we do if the linker has an alternative spelling suggestion (typo correction or suggesting extern "C")?

Requested changes because

(1) There is a point about missing-symbol -> undefined-symbol
(2) The interaction with correctSpelling is important and should be tested. In the presence of a spelling corrector, I am on the fence whether this script should run.

Pointers would be greatly appreciated for (2) :-)

I filed https://sourceware.org/bugzilla/show_bug.cgi?id=26781 for (2). It is referenced in https://sourceware.org/bugzilla/show_bug.cgi?id=26626#c18
I have seen that Nick has changed 'missing-symbol' to 'undefined-symbol' in GNU ld. Thanks!

My point here has not been resolved so I am still waiting. What shall we do if the linker has an alternative spelling suggestion (typo correction or suggesting extern "C")?

The testing highlights that the proposed name correction is printed after the script message, as part of the normal linker error (see the final case). I think this makes sense - we definitely should call the script with the name as described by the actual symbol (not a corrected version), and we also definitely should call the script when there is a spelling suggestion as otherwise it would be inconsistent with the cases where no suggestion could be made. I could maybe see a case for not suggesting the spelling suggestion in the presence of the script, but I personally don't think the extra logic for that is worthwhile - it seems to me like the script could be doing something completely orthoganol to the undefined symbol reporting, so the spelling suggestion would still be useful.

lld/test/ELF/error-handling-script-linux.test
32

Nit: delete extra blank line.

Requested changes because

(1) There is a point about missing-symbol -> undefined-symbol
(2) The interaction with correctSpelling is important and should be tested. In the presence of a spelling corrector, I am on the fence whether this script should run.

Pointers would be greatly appreciated for (2) :-)

I filed https://sourceware.org/bugzilla/show_bug.cgi?id=26781 for (2). It is referenced in https://sourceware.org/bugzilla/show_bug.cgi?id=26626#c18
I have seen that Nick has changed 'missing-symbol' to 'undefined-symbol' in GNU ld. Thanks!

My point here has not been resolved so I am still waiting. What shall we do if the linker has an alternative spelling suggestion (typo correction or suggesting extern "C")?

The testing highlights that the proposed name correction is printed after the script message, as part of the normal linker error (see the final case). I think this makes sense - we definitely should call the script with the name as described by the actual symbol (not a corrected version), and we also definitely should call the script when there is a spelling suggestion as otherwise it would be inconsistent with the cases where no suggestion could be made. I could maybe see a case for not suggesting the spelling suggestion in the presence of the script, but I personally don't think the extra logic for that is worthwhile - it seems to me like the script could be doing something completely orthoganol to the undefined symbol reporting, so the spelling suggestion would still be useful.

In that case, I would worry that the linker may print duplicated information (one for an alternative spelling suggestion and one from the script). The script needs some more information to avoid duplication, which can be confusing to the user.

Requested changes because

(1) There is a point about missing-symbol -> undefined-symbol
(2) The interaction with correctSpelling is important and should be tested. In the presence of a spelling corrector, I am on the fence whether this script should run.

Pointers would be greatly appreciated for (2) :-)

I filed https://sourceware.org/bugzilla/show_bug.cgi?id=26781 for (2). It is referenced in https://sourceware.org/bugzilla/show_bug.cgi?id=26626#c18
I have seen that Nick has changed 'missing-symbol' to 'undefined-symbol' in GNU ld. Thanks!

My point here has not been resolved so I am still waiting. What shall we do if the linker has an alternative spelling suggestion (typo correction or suggesting extern "C")?

The testing highlights that the proposed name correction is printed after the script message, as part of the normal linker error (see the final case). I think this makes sense - we definitely should call the script with the name as described by the actual symbol (not a corrected version), and we also definitely should call the script when there is a spelling suggestion as otherwise it would be inconsistent with the cases where no suggestion could be made. I could maybe see a case for not suggesting the spelling suggestion in the presence of the script, but I personally don't think the extra logic for that is worthwhile - it seems to me like the script could be doing something completely orthoganol to the undefined symbol reporting, so the spelling suggestion would still be useful.

In that case, I would worry that the linker may print duplicated information (one for an alternative spelling suggestion and one from the script). The script needs some more information to avoid duplication, which can be confusing to the user.

I disagree. The two messages are printed at different points. From the test output:

script: info: called with undefined-symbol a # script output - not labelled with "error:" or "ld.lld:"
ld.lld: error: undefined symbol: a           # linker error message.
>>> referenced by [[OBJ]]:                   # linker error additional context, with ">>>" used to highlight it is additional context of the message.
>>> did you mean: a_                         # also additional context, similarly highlighted with the ">>>" output, separated from the script output by the error message and "referenced by" context.

I think it's very unlikely that the script output could cause confusion here. There are plenty of things that distinguish the spelling suggestion produced by the linker from the output of the script. Even if there weren't, I would suggest that the spelling correction functionality should be an opt-out command-line option of the linker, rather than any script arguments. That way if a user wants to use a script and the output of the script could potentially create confusion with the spelling correction functionality, they can specify the option, whilst everyone else still gets the suggestion, even if they use the script, without either side needing to add potentially complicated logic to the script.

As a user, I would find it confusing if I specify an error handler script and it doesn't get called because there's a spelling suggestion.

If the script prints nothing if it cannot find a candidate library, I think having both is fine. If the script prints something, it will be confusing.

If the script prints nothing if it cannot find a candidate library, this patch LGTM.

If the script prints nothing if it cannot find a candidate library, I think having both is fine. If the script prints something, it will be confusing.

If the script prints nothing if it cannot find a candidate library, this patch LGTM.

(I assume you meant "cannot find a symbol") Surely it's up to the user what the script prints? I really don't think LLD should change it's behaviour based on what the script might or might not do, and that includes what information it passes to the script.

MaskRay accepted this revision.Nov 6 2020, 10:06 AM

[lld] Provide a hook to customize missing symbols error handling

missing -> undefined

lld/test/ELF/error-handling-script-linux.test
9

-unknown-linux can be deleted. This is generic ELF behavior applying to other OSes (e.g. FreeBSD)

This revision is now accepted and ready to land.Nov 6 2020, 10:06 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 9 2020, 4:29 AM

Thanks a lot @MaskRay and @jhenderson for the careful reviews. I've learnt a lot in the process \o/