This is a follow up to https://reviews.llvm.org/D87758, implementing the missing symbol part, as done by binutils.
Details
Diff Detail
Event Timeline
lld/test/ELF/error-handling-script-linux.test | ||
---|---|---|
11 | You may want to test (default)/--no-demangle with a _Z symbol. |
lld/Common/ErrorHandler.cpp | ||
---|---|---|
243 | Agree that undefined-symbol is more appropriate. |
Requested changes because
(1) There is a point about missing-symbol -> undefined-symbol
(2) The interaction with correctSpelling is important and should be tested. In the presence of a spelling corrector, I am on the fence whether this script should run.
I filed https://sourceware.org/bugzilla/show_bug.cgi?id=26781 for (2). It is referenced in https://sourceware.org/bugzilla/show_bug.cgi?id=26626#c18
I have seen that Nick has changed 'missing-symbol' to 'undefined-symbol' in GNU ld. Thanks!
lld/test/ELF/error-handling-script-linux.test | ||
---|---|---|
15 | My feeling is that despite --demangle/--no-demangle, the handling script should receive the original symbol name. |
lld/docs/ld.lld.1 | ||
---|---|---|
189–190 | Not addressed yet? | |
lld/test/ELF/error-handling-script-linux.test | ||
15 | +1. The script might well try looking for symbols in certain files, for example, in which case it would need to have the original symbol name. A case could be made for an optional second argument that takes the demangled name, so that the script can print the demangled name without needing to actually do demangling itself. |
lld/test/ELF/error-handling-script-linux.test | ||
---|---|---|
15 | I will go for not taking another argument for the unmangled symbol. The linker can support Itanium C++ mangling rules but probably not others. Assuming the error handling script is more tight to the distribution, it knows about the environment better and can support other mangling schemes. |
Update documentation as asked by @jhenderson
Add a test case with symbol suggestion as suggested by @MaskRay
Always pass mangled symbol name to the script
lld/test/ELF/error-handling-script-linux.test | ||
---|---|---|
28–46 | Did you mean to line up the output for all 5 cases? Looks like the first one isn't lined up. Also, should the 2nd-5th cases be using -NEXT? |
My point here has not been resolved so I am still waiting. What shall we do if the linker has an alternative spelling suggestion (typo correction or suggesting extern "C")?
The testing highlights that the proposed name correction is printed after the script message, as part of the normal linker error (see the final case). I think this makes sense - we definitely should call the script with the name as described by the actual symbol (not a corrected version), and we also definitely should call the script when there is a spelling suggestion as otherwise it would be inconsistent with the cases where no suggestion could be made. I could maybe see a case for not suggesting the spelling suggestion in the presence of the script, but I personally don't think the extra logic for that is worthwhile - it seems to me like the script could be doing something completely orthoganol to the undefined symbol reporting, so the spelling suggestion would still be useful.
lld/test/ELF/error-handling-script-linux.test | ||
---|---|---|
45 | Nit: delete extra blank line. |
In that case, I would worry that the linker may print duplicated information (one for an alternative spelling suggestion and one from the script). The script needs some more information to avoid duplication, which can be confusing to the user.
I disagree. The two messages are printed at different points. From the test output:
script: info: called with undefined-symbol a # script output - not labelled with "error:" or "ld.lld:" ld.lld: error: undefined symbol: a # linker error message. >>> referenced by [[OBJ]]: # linker error additional context, with ">>>" used to highlight it is additional context of the message. >>> did you mean: a_ # also additional context, similarly highlighted with the ">>>" output, separated from the script output by the error message and "referenced by" context.
I think it's very unlikely that the script output could cause confusion here. There are plenty of things that distinguish the spelling suggestion produced by the linker from the output of the script. Even if there weren't, I would suggest that the spelling correction functionality should be an opt-out command-line option of the linker, rather than any script arguments. That way if a user wants to use a script and the output of the script could potentially create confusion with the spelling correction functionality, they can specify the option, whilst everyone else still gets the suggestion, even if they use the script, without either side needing to add potentially complicated logic to the script.
As a user, I would find it confusing if I specify an error handler script and it doesn't get called because there's a spelling suggestion.
If the script prints nothing if it cannot find a candidate library, I think having both is fine. If the script prints something, it will be confusing.
If the script prints nothing if it cannot find a candidate library, this patch LGTM.
(I assume you meant "cannot find a symbol") Surely it's up to the user what the script prints? I really don't think LLD should change it's behaviour based on what the script might or might not do, and that includes what information it passes to the script.
[lld] Provide a hook to customize missing symbols error handling
missing -> undefined
lld/test/ELF/error-handling-script-linux.test | ||
---|---|---|
11 | -unknown-linux can be deleted. This is generic ELF behavior applying to other OSes (e.g. FreeBSD) |
Thanks a lot @MaskRay and @jhenderson for the careful reviews. I've learnt a lot in the process \o/
I take it missing-symbol is already in use in GNU bfd? undefined-symbol otherwise seems like a better name (since the symbol is present, just undefined).