Page MenuHomePhabricator

[ELF] Rework debug line parsing to use llvm::Error and callbacks (LLD-side)
ClosedPublic

Authored by jhenderson on Mar 16 2018, 6:54 AM.

Details

Summary

Feedback on D44382 suggested to change the interface to use a callback instead of the structures used. As I didn't want to effectively overwrite that diff, I created D44560 as an alternative implementation using the suggested approach. This is the corresponding patch required for LLD assuming D44560 were to be applied.

As with D44382, D44560 changes the debug line parser interface to report LLVM errors in an interface that different executables can use, rather than always being printed directly as warnings to stderr. This change allows LLD to make use of the new interface and call its own warning methods to report problems.

To test this, I have extended the bad-debug undefined symbol message case to show that a corresponding warning is printed, if the debug line cannot be parsed. In addition, I have also added tests for LLD attempting to parse a non-existent/empty debug line section, showing that the new warning for attempting to do this is not emitted.

Diff Detail

Repository
rL LLVM

Event Timeline

grimar added inline comments.Mar 19 2018, 2:46 AM
ELF/InputFiles.cpp
128 ↗(On Diff #138691)

I would move this comment above isValidOffset(0) and update it as you are
doing something with offset 0 right there already.

136 ↗(On Diff #138691)

Is it useful to warn here? I think we only call this method on linkage error,
so anyways going to terminate the link.
With that, it seems easier to always error out here to simplify code/logic.

jhenderson added inline comments.Mar 19 2018, 3:28 AM
ELF/InputFiles.cpp
128 ↗(On Diff #138691)

Actually, the comment (and code) is technically wrong - it is possible for there to be multiple CUs in a single object input. I'm thinking objects built via LTO or -r links. I'll add a FIXME and file a bug, along with updating the comment.

136 ↗(On Diff #138691)

I'm currently preserving behaviour (or fixing intended behaviour, at least), though I certainly see where you're coming from. In general we try to emit as many errors/warnings as we can before stopping, so I still think it's useful - it might help the user identify some other problem, for example.

I'm not bothered either way though, so if the consensus is to not emit the warning, I can easily change that.

grimar added inline comments.Mar 19 2018, 3:33 AM
ELF/InputFiles.cpp
136 ↗(On Diff #138691)

I meant we always can error here instead of warn for all cases I think.

grimar added inline comments.Mar 19 2018, 3:34 AM
ELF/InputFiles.cpp
136 ↗(On Diff #138691)

As at this point, we are already in error state.

jhenderson marked an inline comment as done.

Update comment.

ELF/InputFiles.cpp
136 ↗(On Diff #138691)

Right, but that would suggest to the user that fixing the malformed debug line is a requirement to get a working link, which it isn't. Perhaps another way to think about it is if we wanted to introduce a new warning that could use source information. We'd want to use this function too, but we shouldn't stop the link succeeding if we can't parse the line table.

I could actually make a case for making the second one a warning too, but I made that an error, because we don't actually ever expect it to be reported currently.

espindola added inline comments.Mar 19 2018, 2:24 PM
test/ELF/undef.s
37 ↗(On Diff #138892)

I think I agree with George about producing an error on corrupted output. In practice we expect to never see it, so we may as well make the code simpler.

espindola added inline comments.Mar 19 2018, 2:25 PM
ELF/InputFiles.cpp
130 ↗(On Diff #138892)

When do we expect to have LineData that doesn't include the current CU?

David Blaikie via llvm-commits <llvm-commits@lists.llvm.org> writes:

@rafael on the mailing list:

Now that I think of it, the fact that lld could not parse the debug info to provide a better error message should always be a note, not an error, regardless of how broken the section happens to be.

To be clear, are you saying you're happy with the warnings for now, or would you prefer to drop the severity to just use "message()"? Also, what do you think about the generic error type which currently emits an error? I'd be happy to change the severity of that one too. It's currently an error because we don't expect an Error of that type ever to be reported.

ELF/InputFiles.cpp
130 ↗(On Diff #138892)

LineData contains the contents of the .debug_line section. It is possible to have an empty section, or a missing section, in which case there will be no valid contents in LineData. Previously, when LLD requested parsing of this section, the parser would return false immediately, because the offset was invalid. With the changes in D44560, an Error is now returned saying the offset is invalid.

In addition, since an object file can contain multiple CUs (e.g. via LTO or -r links), some of which might be missing debug data and others not, we can't know for certain using this method that offset 0 is the offset of the CU for the corresponding symbol (see the reproducible in PR36793).

@rafael on the mailing list:

Now that I think of it, the fact that lld could not parse the debug info to provide a better error message should always be a note, not an error, regardless of how broken the section happens to be.

To be clear, are you saying you're happy with the warnings for now, or would you prefer to drop the severity to just use "message()"? Also, what do you think about the generic error type which currently emits an error? I'd be happy to change the severity of that one too. It's currently an error because we don't expect an Error of that type ever to be reported.

Using message() in all cases seems better. We know the link is failing for unrelated reasons and we are just letting the user know why we are not printing better error messages (could not parse the debug info).

Using message() in all cases seems better. We know the link is failing for unrelated reasons and we are just letting the user know why we are not printing better error messages (could not parse the debug info).

message() prints to stdout, which I hadn't realised earlier. This clearly should be printed to stderr, for which there is no mechanism currently, if we don't want a warning or error. Thinking about it more, the rest of LLVM has a concept of "remarks" which are "lower" severity messages than errors or warnings (and can't be promoted to errors, as I understand it). How about I add this to LLD? Alternatively, I'll need to add a note() function, or just print to errs() directly.

Add a "remark" method to the LLD ErrorHandler, and use that instead of warnings to report problems with the debug line. Also added another case to the testing, to demonstrate failures via the callback in contrast to failures via the return value.

espindola accepted this revision.Mar 26 2018, 6:52 PM

Tentative LGTM pending the llvm change.

This revision is now accepted and ready to land.Mar 26 2018, 6:52 PM
jhenderson planned changes to this revision.Mar 27 2018, 1:23 AM

Thanks @espindola. I'll still need to rebase this once I've figured out what to do following rLLD328284 (see also my comments in D44560). I don't anticipate this changing from using remark() though. It's just how that function gets called.

Rebase following rLLD328284. I was able to drop one of the test inputs, now that LLD can handle multiple CUs in the same input file. Also added a comment to the test explaining the purpose of the zed6/zed7 test cases.

This revision is now accepted and ready to land.Mar 28 2018, 7:01 AM
jhenderson requested review of this revision.Mar 28 2018, 7:01 AM

@espindola/@ruiu, are you happy with the latest update?

ruiu added inline comments.Mar 28 2018, 1:54 PM
Common/ErrorHandler.cpp
86–91 ↗(On Diff #140069)

Honestly I don't think we want to add a new level of error messages, as I think error() for errors, warning() for warnings and message() for non-error messages is enough. If something needs to be fixed, it should be an error or a warning. If something doesn't have to be fixed, lld shouldn't print out anything. A situation like "verbose messages are printed out but you can ignore them" isn't actionable and thus not desirable.

So, can you choose either (1) show it as a warning or (2) don't show anything?

jhenderson added inline comments.Mar 29 2018, 1:42 AM
Common/ErrorHandler.cpp
86–91 ↗(On Diff #140069)

Okay, I have no problem making it a warning. I think this might be a safer course than not printing anything. If undefined symbol messages are incomplete, due to a problem in the input, it might be possible for users to fix it, depending on the nature of the problem (e.g. bogus hand-written debug_line assembly), and therefore allowing them to see the useful error information, allowing them to more quickly identify the problem. I could also see us potentially using this function in the future to produce other informational messages, or implement other features, without always emitting an error, so a warning would be appropriate here too.

Change back to using warnings.

espindola accepted this revision.Mar 29 2018, 3:38 PM

LGTM. I will review the LLVM side too.

This revision is now accepted and ready to land.Mar 29 2018, 3:38 PM
jhenderson edited the summary of this revision. (Show Details)May 10 2018, 3:52 AM
jhenderson removed a subscriber: rafael.
This revision was automatically updated to reflect the committed changes.