The .debug_line parser previously reported errors by printing to stderr and returning false. This is not particularly helpful for clients of the library code, as it prevents them from handling the errors in a manner based on the calling context. This change switches to using llvm::Error to indicate what problems were detected during parsing, and updates clients to handle the errors in a location-specific manner. In general, this means that clients continue to behave the same way as far as external users are concerned. Below, I have outlined the known behaviour changes relating to this change.
There are three levels of "errors" in the new error mechanism, to broadly distinguish between different fail states of the parser, since not every failure will prevent parsing of the unit, or of subsequent units. Fatal errors represent errors that prevent reading the unit length field. If this happens, it will be impossible to know where the next unit starts, so this will prevent further parsing of both the current unit and any subsequent ones. Major errors represent errors that prevent reading the remainder of the table, e.g. because the version is unsupported. Minor errors represent problems with parsing that do not prevent attempting to continue reading the table. The only example of this currently is when the last sequence of a unit is unterminated. However, I think it would be good to change the handling of unrecognised opcodes to report them as minor errors as well, rather than just printing to the stream if --verbose is used (this would be a subsequent change, however).
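As a rough illustration of how a client might act on the three levels, here is a minimal standalone sketch. It is not the code in the diff: `Severity`, `NextStep`, `nextStepFor` and `unitsDumped` are hypothetical names, and the real patch carries this information through llvm::Error rather than a plain enum.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical severity levels mirroring the description above.
enum class Severity { None, Minor, Major, Fatal };

// What a caller walking the units can still do after one parse attempt.
struct NextStep {
  bool UseCurrentTable; // the rows parsed for this unit are usable
  bool ReadNextUnit;    // the start of the next unit is known
};

NextStep nextStepFor(Severity S) {
  switch (S) {
  case Severity::None:
  case Severity::Minor:
    // The table was read (possibly with a problem noted); carry on.
    return {true, true};
  case Severity::Major:
    // The table is unusable, but the unit length is assumed valid.
    return {false, true};
  case Severity::Fatal:
    // The unit length could not be read, so the next unit cannot be found.
    return {false, false};
  }
  return {false, false};
}

// Returns the indices of the units whose tables a dump loop would print,
// given the severity encountered while parsing each unit in order.
std::vector<std::size_t> unitsDumped(const std::vector<Severity> &Units) {
  std::vector<std::size_t> Dumped;
  for (std::size_t I = 0; I < Units.size(); ++I) {
    NextStep Step = nextStepFor(Units[I]);
    if (Step.UseCurrentTable)
      Dumped.push_back(I);
    if (!Step.ReadNextUnit)
      break; // a fatal error stops all further parsing
  }
  return Dumped;
}
```

In this sketch, a unit with a major error is skipped while later units are still read, and nothing after a fatal error is attempted.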
I have substantially extended the DwarfGenerator to be able to handle custom-crafted .debug_line sections, allowing for comprehensive unit-testing of the parser code. For now, I am just adding unit tests to cover the basic error reporting and positive cases, and do not currently intend to test every part of the parser, although the framework should be sufficient to do so at a later point. Note that the unit tests in the current diff are still a work in progress: I am happy with most of what is there, but I intend to add further tests for basic DWARF 5 positive cases, and for each of the different possible errors that can be reported. I will update the diff once these are complete, but I wanted to put the current state up so that people can start commenting on it.
Known behaviour changes:
- The dump function in DWARFContext no longer attempts to read subsequent tables when searching for a specific offset, if there was a fatal error.
- The dump function now uses the severity of the error to determine if subsequent units should be read or not. If a major error is detected, the table is skipped, but subsequent ones are read, making the assumption that the unit length field is valid.
- getOrParseLineTable now returns a useful Error if an invalid offset is encountered, rather than simply a nullptr.
- getOrParseLineTable now returns both the Error returned from LineTable::parse and the line table, if only Minor-level errors are encountered.
- The parse functions no longer use fprintf(stderr,...) to report errors, meaning that LLD will now print errors correctly, rather than sometimes failing to flush them or interleaving them with other errors (see also the forthcoming LLD review).
- The existing parse error messages have been updated to not specifically include "warning" in their message, allowing consumers to determine what severity the problem is.
- If the line table version field appears to have a value less than 2, an informative error is returned, instead of just false.
- If the line table unit length field uses a reserved value, an informative error is returned, instead of just false.
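To make the last two items concrete, here is a small standalone sketch of the two header checks. It is an illustration only: `HeaderCheck` and `checkPrologueStart` are hypothetical names, the message strings are made up, and the severities attached to each check are my reading of the levels described earlier rather than something taken from the diff. The reserved range comes from the DWARF initial length encoding, where 0xfffffff0 to 0xffffffff are reserved and 0xffffffff introduces the 64-bit format (which this sketch does not parse further).

```cpp
#include <cstdint>
#include <string>

// Hypothetical severity levels, as in the description of the mechanism.
enum class Severity { None, Minor, Major, Fatal };

// Result of validating the first two header fields. Note that the message
// does not contain the word "warning"; the severity is carried separately
// so that the consumer decides how to present it.
struct HeaderCheck {
  Severity Level;
  std::string Message;
};

HeaderCheck checkPrologueStart(std::uint32_t UnitLength,
                               std::uint16_t Version) {
  // Reserved initial length values other than the 64-bit escape: the real
  // length is unknown, so the next unit cannot be located (assumed Fatal).
  if (UnitLength >= 0xfffffff0u && UnitLength != 0xffffffffu)
    return {Severity::Fatal, "unit length uses a reserved value"};
  // Versions below 2 are rejected; the unit length is still trusted, so
  // later units can be read (assumed Major).
  if (Version < 2)
    return {Severity::Major,
            "unsupported version " + std::to_string(Version)};
  return {Severity::None, ""};
}
```

A caller would then report `Message` however suits its context, for example prefixed with "warning:" in a dump tool or "error:" in a linker.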
As a helper for the generator code, I have re-added EmitInt64 to the AsmPrinter code. This previously existed, but was removed way back in r100296, presumably because it was dead at the time. A quick review of LLVM code suggests that there are several places that could do with this function, instead of using EmitIntValue(..., 8).
This change also requires a change to LLD. I will post a separate review for this shortly, since I don't know how to reliably create a review for both repositories simultaneously.
I'm conscious that this is quite a large change, so if anybody has any suggestions on how to usefully break it up, please let me know.
Drop the &&. This makes it clear that the function will clear the error (whereas previously it could just ignore them and leave the caller with an uncleared error).
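A minimal standalone sketch of the difference, using a hypothetical move-only `SketchError` type as a stand-in for llvm::Error (the real class has a richer interface; only the "moved-from means no longer pending" behaviour is modelled here):

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for a must-be-handled, move-only error type.
class SketchError {
  std::string Msg;
  bool Pending;

public:
  explicit SketchError(std::string M) : Msg(std::move(M)), Pending(true) {}
  // Moving transfers the obligation to handle the error.
  SketchError(SketchError &&Other)
      : Msg(std::move(Other.Msg)), Pending(Other.Pending) {
    Other.Pending = false;
  }
  SketchError(const SketchError &) = delete;
  SketchError &operator=(const SketchError &) = delete;

  bool pending() const { return Pending; }
  void handle() { Pending = false; }
};

// Taking an rvalue reference only binds to the caller's object. If the
// body does nothing, the caller still owns an unhandled error.
void ignoreByRvalueRef(SketchError &&E) { (void)E; }

// Taking the parameter by value forces a move at the call site, so the
// caller's object is always left cleared and this function owns the error.
void consumeByValue(SketchError E) { E.handle(); }
```

With this sketch, after `ignoreByRvalueRef(std::move(A))` the caller's `A` is still pending, whereas after `consumeByValue(std::move(B))` the caller's `B` is not.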