This is an archive of the discontinued LLVM Phabricator instance.

Provide a hook to customize missing library error handling
ClosedPublic

Authored by serge-sans-paille on Sep 16 2020, 6:09 AM.

Details

Summary

It would be nice to be able to provide a custom script that would help users to find missing libraries. A possible scenario could be:

% clang /tmp/a.c -fuse-ld=lld -loauth -Wl,--error-handling-script=/tmp/addLibrary.py
unable to find library -loauth
looking for relevant packages to provides that library

  • liboauth-0.9.7-4.el7.i686
  • liboauth-devel-0.9.7-4.el7.i686
  • liboauth-0.9.7-4.el7.x86_64
  • liboauth-devel-0.9.7-4.el7.x86_64
  • pix-1.6.1-3.el7.x86_64

Where addLibrary would be called with the missing library name as first argument (in that case addLibrary.py oauth)

Diff Detail

Event Timeline

Add an extra argument to the script call, to allow for future extensions, as suggested in https://sourceware.org/bugzilla/show_bug.cgi?id=26626

I am sure you are planning to do it but please don't forget the release notes and the documentation.

This is a great idea, I would probably enable that by default in Debian.

The suggestion from https://sourceware.org/bugzilla/show_bug.cgi?id=26626#c2 looks good to me:)

lld/ELF/Driver.cpp
297

This returns too early. We handle all input options before exit.

lld/ELF/Options.td
182

For newer options, we can use EEq to enforce the double dash forms. The latest GNU ld options have adopted this convention as well.

MaskRay added a reviewer: jhenderson.

This sounds like a useful thing too. It should be fairly straightforward to test, as it's possible to e.g. have a python script in the test, which gets executed. Could you add test cases, please?

I think the script should take 1+ arguments, where the first argument is the error "ID", and the optional remainder allow for different amounts of info for different error types.

lld/ELF/Driver.cpp
295

Sounds like this should be "missing-library" here? After all, that's what the situation is.

I also think we should still print the original error message. There are a few reasons for this:

  1. Since not all error messages (currently) get passed to the script, the inconsistency in this patch could get a bit confusing at times, especially if the script itself didn't produce the message (and it wouldn't be required to).
  2. Printing the error message within the code means that other behaviour in error isn't skipped. For example, VS diagnostics style would style mean messages get adjusted etc.
  3. This would allow scripts to implement handlers for only specific errors - any that it doesn't know how to handle, it simply skips, and the link continues as normal (having printed the error). This also means that scripts are future-proof against new LLD changes adding more cases use the script for error handling.

Also, also, maybe we should consider an additional overlaod to the error function, which takes the ID string. The error function would then be responsible for handling the error by calling the script. This will prevent this same code being needed for other usages in the future - a future caller would simply need to call error("the normal error message", "some ID", {"arg1", "arg2"}) (or something to that effect), and the ID would be automatically be passed back to the script.

299

Should this be failed to call... (i.e. lower-case)? I'm not sure what the LLD style actually is, but elsewhere, the general consensus is that new messages should be start with a lower-case letter (see also the style guide).

Also, is there any ability to get more error information in this message about why it failed to call the script? I imagine the value of Res might be some error code (I haven't looked...)?

912

Is this deliberately early in the order? The list is mostly in alphabetical order currently. It might well make sense for it to be early, but in that case, we should probably add a comment highlighting this fact.

Should the option name be more specific? --error-handling-script sounds like all errors can be handled with a script,
though this is only intended to be used for missing libraries errors, right?

It can be --missing-lib-err-handler, --missing-lib-ext-script, or something better probably.

Address reviews, taking the « extend error function » path.

serge-sans-paille marked 5 inline comments as done.

Minor review update

Should the option name be more specific? --error-handling-script sounds like all errors can be handled with a script,
though this is only intended to be used for missing libraries errors, right?

It can be --missing-lib-err-handler, --missing-lib-ext-script, or something better probably.

@serge-sans-paille seems to have missed this comment, but I think this was suggested on the GNU feature request (see https://sourceware.org/bugzilla/show_bug.cgi?id=26626). That way, the functionality can be extended without needing new command line options.

lld/Common/ErrorHandler.cpp
230

LLD uses lower-case variable names.

It occurs to me that we could just fold this and the existing error method together, and just default args to an empty list. Then, if args is empty, don't do the script launching bit.

@MaskRay/@grimar, what do you think about extending the ErrorHandler to actually store the path to the script somewhere? That way, it doesn't need specifying every time, and the only argument needs to be the tag and any missing payload. It'll keep the error calls that use this cleaner.

233–238

I'm not sure we need this? This just makes it harder to extend the functionality going forward, because then a user has to remember to update both the error() call site (to pass in the tag and payload) and this list here. I'd just delete this.

239–240

This sounds like it should be an assertion! Or a regular error (it probably wants validating in the Driver).

lld/ELF/Options.td
185

Nit: delete extra blank line.

Still needs a test case.

lld/Common/ErrorHandler.cpp
230

@MaskRay/@grimar, what do you think about extending the ErrorHandler to actually store the path to the script somewhere?

I am fine with this.

Then I wonder if the API would look cleaner if it was:

error(const Twine &msg, const Twine &tag, ArrayRef<StringRef> payload)
lld/Common/ErrorHandler.cpp
233–238

The goal was to sanitize the input. Using the approach proposed by @grimar would make this obsolete if we use an enumeration as Tag instead of a Twine.

239–240

Again, if the config script becomes a member of the ErrorHandler, we'll be able to assert on it.

Take most reviews into account. I still kept the two error functions, because I like the logic split.

Test case added + fix implementation accordingly :-)

jhenderson added inline comments.Sep 23 2020, 12:15 AM
lld/Common/ErrorHandler.cpp
232

Watch out - this could exit LLD potentially before calling the script. I think we want to make sure the script bit is executed, even if this is the error that pushes us over the error limit.

235–244

I think you can slightly simplify this code by using push_back/emplace_back/insert to add to scriptArgs. You'd start with an empty vector, and then just add the elements you need in order.

lld/test/ELF/error-handling-script.s
2–3 ↗(On Diff #293544)
  1. Will this work on Windows? I'm not sure Windows knows how to execute shell scripts.
  2. Is it possible to just inline the script in the test file, rather than having to write it out at run time?
8 ↗(On Diff #293544)

Nit: delete extra blank line at EOF.

grimar added inline comments.Sep 23 2020, 2:21 AM
lld/Common/ErrorHandler.cpp
233–238

I also don't think we need a ErrorTag enum, so I'd go with const Twine &tag and remove this.

serge-sans-paille marked 2 inline comments as done.

Take most reviews into account

Did you upload the right diff? This looks like going back to one of your earlier versions, or at least very similar to it.

Upload the right diff (!)

jhenderson added inline comments.Sep 29 2020, 12:53 AM
lld/Common/ErrorHandler.cpp
236

As there aren't going to be that many arguments, you probably don't need to call reserve. The cost of this is minimal.

247–248

This behaviour needs testing. Probably the easiest cross-platform way is for the script to not exist.

What about if the script returns a non-zero exit code? Do we really want to throw away that information?

We may want some sort of convention where some specific error code represents an unsupported tag (and therefore should be ignored), and other error codes cause an error to be reported by LLD, indicating the non-zero exit code.

lld/ELF/Driver.cpp
289–290

No need for braces here.

lld/test/ELF/error-handling-script.test
2 ↗(On Diff #294652)

Is it possible to have a Windows version of this test? I imagine it would just be a batch file that called echo, but otherwise would be pretty much identical.

3 ↗(On Diff #294652)

Does this work?

serge-sans-paille marked 5 inline comments as done.

Better error handling, more testing, take review into account.

lld/Common/ErrorHandler.cpp
233–238

I'm wondering why you prefer the string approach? I like the enum approach because it allows for compile-time checking, while nothing prevents someone from using an unsupported string.

235–244

But using push_back /emplace_back / insert implies a more complex machinery under the hood, while presizing the array + std::copy is easier on the generated code. But it implies some index/size book keeping so I understand your point and will change the code.

lld/test/ELF/error-handling-script.test
2 ↗(On Diff #294652)

I had to duplicate the test case, tried to forge a polyglot file... but that was a mess. I could also just echo "message" but I'd rather test the actual arguments, what do you think?

3 ↗(On Diff #294652)

It does, but is that pore portable?

Realease note + manpage update

jhenderson added inline comments.Oct 1 2020, 1:03 AM
lld/Common/ErrorHandler.cpp
232

Ping this. Have you considered the case where the error limit is hit and LLD exits? In that case the error handling script will not be called, which I don't think we want (we should call the script for each time an error is hit, not just each time except the last).

233–238

I don't have a strong argument either way, but one advantage of strings is precisely because it means there'll be no compile-time checking, making it easier to add new usages. However, the risk is that a call site will have a typo or similar - at least in the enum case, this point of failure is all in one place.

235–244

I'm not sure it indicates that to me at all. To me push/emplace_back is a standard way of creating a list of things at run time.

247–248

Sorry, maybe I caused confusion with my discussion of unsupported tags. I actually think unsupported tags should be allowed without an error being reported. Having thought more about it, I think that would work best if it's handled by the script. Thus a script can opt into or out of ignoring unsupported tags by returning non-zero for them or not.

The end result is that you can probably get away with deleting case 1.

262

How some variation of this for the default case:
"'" + errorHandlingScript + "': error handling script exited with code " + Twine(Res)

It's quite common for error messages to be prefixed with the file where the problem is, so perhaps the script could be the same. Also, I think it's useful to record the script's exit code in the error message.

You also need a test case for the default label, i.e. where the script exits with some non-zero exit code.

lld/test/ELF/error-handling-script.bat
1 ↗(On Diff #295324)

Does this work with an extension other than .bat? It's fine either way, it's just that when I double-click the test in Windows when it has .bat as an extension, it runs the script, which isn't what I want!

Also, it might be worth naming this error-handling-script-windows.bat to make clear the distinction.

lld/test/ELF/error-handling-script.test
2 ↗(On Diff #294652)

Duplicate test case is the best way forward in this case, I think. We do similar things in other cases too.

3 ↗(On Diff #294652)

I believe lit special cases /dev/null to be the equivalent of an empty file stream. Certainly it's worked for me in the past.

serge-sans-paille marked 5 inline comments as done.Oct 6 2020, 5:55 AM
MaskRay added inline comments.Oct 6 2020, 2:40 PM
lld/docs/ReleaseNotes.rst
24

This needs to be the ELF Improvements section.

lld/include/lld/Common/ErrorHandler.h
99

StringRef defaults to an empty string. Delete = ""

https://sourceware.org/bugzilla/show_bug.cgi?id=26626 @serge-sans-paille Perhaps create a thread on binutils@sourceware.org because not many people subscribe bugzilla..

You need a test case that shows what happens when the script returns a non-zero exit code.

lld/test/ELF/error-handling-script-linux.test
7

Use FileCheck -D to make sure you get the script name right in the error message.

9
12

It's helpful to indent the spaces to make the checked-for text line up.

13
lld/test/ELF/error-handling-script-windows.bat
10–11

This looks like it's missing the script name?

Take review into account + add a document specifying the interface expected for the error handling script, so that binutils devs can use it as reference.

Looks like you missed context from the latest diff. It also looks like you didn't address my previous set of comments.

lld/Common/ErrorHandler.cpp
249

This needs to happen after the switch statement, or the original message won't be reported in the event of a non-zero exit code.

lld/docs/error_handling_script.rst
6–8

(please reflow the suggestion)

17–18

I don't think this paragraph is useful. I'd just delete it. The important bits of information in it are already mentioned elsewhere.

23
30–31
37
lld/docs/ld.lld.1
185–192

This description looks stale to me.

serge-sans-paille marked 6 inline comments as done.

Update documentation and answer reviewers.

Development have started on ld side using the documentation bundled in that patch as reference.

lld/Common/ErrorHandler.cpp
249

The way I (tried to) address your comment issue was through aggregation of msg in the event of a non-zero exit code. That way there's always a single call to error whatever the error code.

lld/docs/error_handling_script.rst
23

According to https://docutils.sourceforge.io/docs/user/rst/quickref.html#literal-blocks that's the way one introduces literal blocks in restructured text.

jhenderson added inline comments.Oct 13 2020, 5:35 AM
lld/Common/ErrorHandler.cpp
249

Oh, I missed that. Won't that result in only a single "error:" prefix though? The later lines will look a bit odd then, right?

For example:

error: some file: some error message
script.sh: error handling script failed to execute
MaskRay accepted this revision.Oct 13 2020, 10:06 AM

LGTM, but please wait for @jhenderson and Nick on binutils side:)

lld/Common/ErrorHandler.cpp
246

lowercase res

lld/docs/error_handling_script.rst
1

If the SPHINX_EXECUTABLE cmake variable is set and SPHINX_OUTPUT_HTML is on, ninja docs-lld-html can build the doc. You can also use rst2html5.py or similar tool too check whether the .rst renders nicely.

This revision is now accepted and ready to land.Oct 13 2020, 10:06 AM

Synchronize with binutils which also implements missing-symbol.

binutils implementation got commited recently https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=23ae20f5e3afeeab9aa616c63ab3b357668476d5

Minor doc and format update

It looks like you've uploaded the wrong diff? Also, I'd recommend adding the missing symbol path in a separate patch, to keep this patch smaller.

Upload correct diff, with only the missing-lib part

jhenderson requested changes to this revision.Oct 20 2020, 12:44 AM

None of my most recent comments have been addressed. They need to be before I'm happy for this to land.

This revision now requires changes to proceed.Oct 20 2020, 12:44 AM
serge-sans-paille marked 10 inline comments as done.

Tackle (most of ?) @jhenderson comment

  • Change the way errors messages are formatted and aggregated, as pointed out
  • Update documentation
  • Use placeholder in testcase

Nearly there.

For some reason this patch doesn't seem to be triggering the pre-merge checks. @MaskRay, any idea why?

lld/test/ELF/error-handling-script-linux.test
7–8

You might want to split these lines are they're getting quite long.

lld/test/ELF/error-handling-script-windows.bat
4–5
7

The message checked for here doesn't match what the script actually produces.

Split long test lines and update windows test

jhenderson accepted this revision.Nov 2 2020, 12:35 AM

Have you run this on Windows to confirm the Windows test works? If not, I can probably arrange to check that. LGTM once it's been checked.

lld/test/ELF/error-handling-script-linux.test
7–8

FWIW, I prefer the format as indicated in my suggested edit, with the | on the first line, and the second line indented with a total of 3 spaces. This means that a user who is reading the first line doesn't need to check the next to see whether the command continues on the next line, whilst a user reading the second line can see from the indentation that it is a continuation of a sequence of commands from the previous line.

This revision is now accepted and ready to land.Nov 2 2020, 12:35 AM

Have you run this on Windows to confirm the Windows test works? If not, I can probably arrange to check that. LGTM once it's been checked.

Yeah, please do!

Update multi-line test formatting

jhenderson requested changes to this revision.Nov 2 2020, 5:26 AM

Requesting changes, since the test is failing on Windows. The problem is that Windows batch files print the command being run before actually running it, so the output is something like this:

C:\llvm\build\tools\lld\test\ELF>echo "script: info: called with missing-lib idontexist"
"script: info: called with missing-lib idontexist"
ld.lld: error: unable to find library -lidontexist

Note that this means the script message appears twice in the output, whilst the LLD error message appears after the second one, which means the CHECK-NEXT for the latter fails. To fix, you probably just want to add the line @echo off immediately before the current echo statement in the windows script.

lld/test/ELF/error-handling-script-windows.bat
7
This revision now requires changes to proceed.Nov 2 2020, 5:26 AM

@echo off in windows test.

jhenderson added inline comments.Nov 3 2020, 12:34 AM
lld/test/ELF/error-handling-script-windows.bat
7

This hasn't been addressed?

14

Nit: I'd add a blank line before this to make the script contents stand out from the lit and FileCheck stuff.

Update windows test file

jhenderson accepted this revision.Nov 3 2020, 1:37 AM

LGTM, thanks!

This revision is now accepted and ready to land.Nov 3 2020, 1:37 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 3 2020, 2:06 AM