This is an archive of the discontinued LLVM Phabricator instance.

[llvm-objcopy][NFC] refactor error handling. part 3.
ClosedPublic

Authored by avl on Sep 20 2020, 4:55 AM.

Details

Summary

Remove usages of special error reporting functions(error(),
reportError()). Errors are reported as Expected<>/Error returning
values. This part is for ELF subfolder of llvm-objcopy.

Diff Detail

Event Timeline

avl created this revision.Sep 20 2020, 4:55 AM
Herald added a project: Restricted Project. · View Herald Transcript
avl requested review of this revision.Sep 20 2020, 4:55 AM
avl updated this revision to Diff 293032.Sep 20 2020, 2:14 PM

fixed several minor mistakes.

Do we actually have any non-fatal errors in llvm-objcopy etc currently? If not, I think a better approach would be to simply bubble up the errors via Expected and Error usage, and handle the failures in the client code directly. The client code could then choose to either report them as errors (or warnings) and/or abort what it was doing. I imagine in the case of dwarfutil, you'd be best off aborting optimizing a given CU's debug data if it looked dodgy somehow, in which case, you don't need to distinguish between different classes of errors. This is the approach we take in our downstream port for our debug line rewriting that we've implemented.

avl added a comment.Sep 21 2020, 4:01 AM

Do we actually have any non-fatal errors in llvm-objcopy etc currently?

yes. there are some. f.e. CopyConfig.cpp::parseObjcopyOptions uses ErrorCallback to decide whether the error would be fatal or non-fatal.
I can leave this callback for places where it is already used and convert interfaces into Expected and Error usage in other places.

If not, I think a better approach would be to simply bubble up the errors via Expected and Error usage, and handle the failures in the client code directly.

Ok. Currently the error is reported inplace and the program just dies(exit(1)). To change this behavior it would be necessary to update many interfaces(f.e. writeSectionData()->SectionBase::accept()->SectionVisitor::visitor()).
Will do that.

avl updated this revision to Diff 294609.Sep 27 2020, 11:55 PM

addressed comments(reworked to use Expected and Error as returning values).

avl retitled this revision from [llvm-objcopy][NFC] allow to redefine error handling. to [llvm-objcopy][NFC] refactor error handling. part 3..Sep 27 2020, 11:56 PM
avl edited the summary of this revision. (Show Details)

Before you go too much further with this, I'd like some comments from others on the approach. The fact that this code is far more extensive and has far more error paths suggests to me that we might be on the wrong approach here, and perhaps we should actually be using the error handler callback approach after all. @grimar/@rupprecht/@MaskRay - any thoughts?

If we had an error handler callback, we'd probably want it added to the objcopy config class, to avoid having to pass the callbacks themselves everywhere (only as far as the point where the config is created).

llvm/tools/llvm-objcopy/ELF/ELFObjcopy.cpp
332–333

I'd find it more readable now that this is a non-trivial for loop, if it were to have braces.monospaced text

llvm/tools/llvm-objcopy/ELF/Object.cpp
55

Unrelated clang-formatting?

568–582

This isn't the way to report the error back. The message will have lost fidelity. You should probably change this to use the fallible constructor idiom - see https://llvm.org/docs/ProgrammersManual.html#fallible-constructors. That way, the error can be propagated back to the caller in the way it always was.

657–658

See https://llvm.org/docs/CodingStandards.html#don-t-use-braces-on-simple-single-statement-bodies-of-if-else-loop-statements: "readability is also harmed if an if/else chain does not use braced bodies for either all or none of its members".

But actually, this is one of several examples in this patch where you'd be better off doing something like this:

Expected<SymbolTableSection *> Sec = SecTable.getSectionOfType<SymbolTableSection>(
              Link,
              "Link field value " + Twine(Link) + " in section " + Name +
                  " is invalid",
              "Link field value " + Twine(Link) + " in section " + Name +
                  " is not a symbol table"));
if (!Sec)
  return Sec.takeError();

setSymTab(*Sec);
Symbols->setShndxTable(this);
return Error::success();

In this case, your code for what to do in the "good" case is kept all next to each other, rather than partially split between the if body and the post if/else bit.

800–811

Same as above.

1129–1134

Same as above.

1413

I'd avoid the auto here. It's not clear to me what sort of thing program_headers returns (in this case an Expected).

1513–1523

Same as above. This is also different to the style used elsewhere in this function.

1633–1636

Avoid unnecessary else.

1752–1757

Don't use auto. The type isn't obvious.

1756

As you are changing this line, I'd get rid of this auto too. Same goes elsewhere.

1844–1867

Here and throughout this function, don't use auto if the type isn't obvious from the RHS, unless it's an iterator.

llvm/tools/llvm-objcopy/ELF/Object.h
114–125

No need to specify virtual when a function is marked as override.

llvm/tools/llvm-objcopy/llvm-objcopy.cpp
59–63

This might be an indication that you should use an error handler after all. You don't really want your library code printing warnings directly itself.

avl added inline comments.Sep 28 2020, 2:45 AM
llvm/tools/llvm-objcopy/ELF/Object.h
114–125

I mark them as virtual so that they look the same as following definitions - will remove them.

llvm/tools/llvm-objcopy/llvm-objcopy.cpp
59–63

This reportWarning() function is exactly implementation(used by only llvm-objcopy.cpp) of the error handler. It is not seen by other parts of llvm-objcopy. It is passed as "llvm::function_ref<Error(Error)> ErrorCallback" to other parts. Thus, It could/should not be used by library.

jhenderson added inline comments.Sep 28 2020, 2:52 AM
llvm/tools/llvm-objcopy/ELF/Object.h
114–125

Thanks - the key is that they AREN'T the same so we should be highlighting this by making them look different!

llvm/tools/llvm-objcopy/llvm-objcopy.cpp
59–63

Okay, thanks. That's good to know.

avl updated this revision to Diff 294646.Sep 28 2020, 4:12 AM
avl edited the summary of this revision. (Show Details)

addressed comments.

grimar added a comment.EditedSep 28 2020, 11:49 PM

Before you go too much further with this, I'd like some comments from others on the approach. The fact that this code is far more extensive and has far more error paths suggests to me that we might be on the wrong approach here, and perhaps we should actually be using the error handler callback approach after all. @grimar/@rupprecht/@MaskRay - any thoughts?

I have 2 general comments/thoughts atm:

  1. I've noticed that the code tries to address 2 a bit different situations currently:
    1. Cases that fix the unwrapOrError: like unwrapOrError(HeadersFile.program_headers()) which seems are trying to fix places related to broken inputs which were never tested before I think.
    2. Cases that are tested and report errors, like: error("program header with offset 0x" + Twine::utohexstr(Phdr.p_offset) + .....

      I think for start we should probably focus on (B) and probably can keep unwrapOrError untouched until we convert them and add tests.
  1. Regarding the error handler callback suggested. I am thinking about something that is implemented in LLD. I.e. a error state is remembered and a error is reported via callback. Then we are trying to stop execution and exit at a convenient place. @jhenderson, did you mean something like this?

Before you go too much further with this, I'd like some comments from others on the approach. The fact that this code is far more extensive and has far more error paths suggests to me that we might be on the wrong approach here, and perhaps we should actually be using the error handler callback approach after all. @grimar/@rupprecht/@MaskRay - any thoughts?

  1. Regarding the error handler callback suggested. I am thinking about something that is implemented in LLD. I.e. a error state is remembered and a error is reported via callback. Then we are trying to stop execution and exit at a convenient place. @jhenderson, did you mean something like this?

I was actually referring more to the style that this patch series initially looked at, which is based on something similar to what I did in the DebugLine parser, which in turn inspired my lightning talk on error handling in libraries three(?) years ago. In other words, the client would provide one or more callbacks for different degrees of severity of problems, which llvm-objcopy's code could then call when an error occurred.

I'm not convinced there are many points in the llvm-objcopy code where continuing after failure makes much sense, personally, so just storing an error state for reporting later seems unnecessary?

jhenderson added inline comments.Sep 29 2020, 12:26 AM
llvm/tools/llvm-objcopy/ELF/ELFObjcopy.cpp
321–322

I would either replace both these names or neither of them, since these are the variable names already. I think we want to avoid an inconsistent style between shouldReplace and AddSection.

avl added a comment.Sep 29 2020, 1:41 AM

I was actually referring more to the style that this patch series initially looked at, which is based on something similar to what I did in the DebugLine parser, which in turn inspired my lightning talk on error handling in libraries three(?) years ago. In other words, the client would provide one or more callbacks for different degrees of severity of problems, which llvm-objcopy's code could then call when an error occurred.

I'm not convinced there are many points in the llvm-objcopy code where continuing after failure makes much sense, personally, so just storing an error state for reporting later seems unnecessary?

Right, that patch initially was done in that manner - the client provide one or more callbacks for different degrees of severity of problems. The argument against such a solution was that there are only fatal_errors currently, which could be reported using Expected<>/Error returning values. It looks to me that It is OK to use that(Expected<>/Error) solution until we need to report other errors(warning/recoverable error). i.e. Let`s create handlers when we would need to report other kinds of errors.

llvm/tools/llvm-objcopy/ELF/ELFObjcopy.cpp
321–322

My own preference is that it is OK to fix small style issues if they were encountered during making some patch. But it is often rejected with the reason - "It should be separate patch". Thus I am trying to change only lines affected by patch. In this concrete case clang-tidy reported style issue - So I fixed it. If it is OK - I would change shouldReplace also.

jhenderson added inline comments.Sep 30 2020, 12:23 AM
llvm/tools/llvm-objcopy/ELF/ELFObjcopy.cpp
321–322

I think it's okay to ignore clang-tidy issues where you aren't modifying the specific bit it's complaining about. However, it's probably just better to fix the style issue for both function_ref instances in a separate patch, thinking about it.

What I don't think is okay is to have inconsistent style (I don't mind if it is a precursor patch to fix the style, or just ignoring the clang-tidy warning here).

avl added a comment.Sep 30 2020, 2:56 AM

So, what about current solution in this patch to report fatal errors through Expected<>/Error in llvm-objcopy? Is it OK to do things that way? My understanding is that currently only fatal errors are reported and then no need to create handlers at the moment.

llvm/tools/llvm-objcopy/ELF/ELFObjcopy.cpp
321–322

OK, I will return AddSection back and fix the style issue in a separate commit.

In D87987#2302840, @avl wrote:

So, what about current solution in this patch to report fatal errors through Expected<>/Error in llvm-objcopy? Is it OK to do things that way? My understanding is that currently only fatal errors are reported and then no need to create handlers at the moment.

I'm waiting on others feedback to see what they think. The advantage of the handler is that it would allow us to simplify the error handling a little probably (less of a need to repeatedly check for error/takeError patterns etc). The downside is the need for the handler, and the fact that there'd need to be some way of bailing out once an error is hit (the handler could be noreturn or something like that, I suppose).

avl added a comment.Sep 30 2020, 4:05 AM

and the fact that there'd need to be some way of bailing out once an error is hit (the handler could be noreturn or something like that, I suppose).

yes, and that way(bailing out once an error is hit) could be not flexible enough. The Expected<>/Error solution allows to handle errors at any level, while handler would need to create some agreement on that. f.e. current agreement(before this patch) is that the program just dies with exit(1) at the point where error is occured. From that point of view, Expected<>/Error solution looks more flexible: it could be caught at some level and the execution continiues(kind of recoverable error).

I'm really happy to see this. I tried removing the error() methods from llvm-objcopy.h a year or two ago and didn't have the patience to work through all the plumbing seen here.

This approach LGTM to me. I just have a couple nits.

llvm/tools/llvm-objcopy/ELF/Object.cpp
561–569

Nothing's happening to this variable, so I think return CompressedSection(...) looks better here

575–582

Once we detect an error, we probably don't want to go on and process the rest of this function, e.g. use of CompressedData may not be valid.

You can't return early from the constructor IIUC, but maybe something like:

if (Error Err = zlib::compress(...)) {
  OutErr = ...
} { else {
  // Rest of constructor
}
1749

Is this actually necessary? I think the above switch statement handles all cases (including a default case), and if any of the types break instead of returning, you should get a -Wreturn error trying to build this.

1902–1905

nit: this pattern can be reduced:

if (Error E = foo()) return E;
return Error::success();

->

return foo();
jhenderson added inline comments.Oct 1 2020, 1:19 AM
llvm/tools/llvm-objcopy/ELF/Object.cpp
575–582

+1 to bailing out early here. You can use a return; (no value) in a constructor. See https://stackoverflow.com/questions/5255777/what-if-i-write-return-statement-in-constructor.

avl updated this revision to Diff 295525.Oct 1 2020, 4:18 AM

addressed comments.

avl updated this revision to Diff 295528.EditedOct 1 2020, 4:22 AM

fixed a small typo.

jhenderson accepted this revision.Oct 1 2020, 4:36 AM

LGTM, but please wait for others.

This revision is now accepted and ready to land.Oct 1 2020, 4:36 AM
avl added a comment.Oct 1 2020, 6:15 AM

@grimar

I've noticed that the code tries to address 2 a bit different situations currently:

Cases that fix the unwrapOrError: like unwrapOrError(HeadersFile.program_headers()) which seems are trying to fix places related to >broken inputs which were never tested before I think.
Cases that are tested and report errors, like: error("program header with offset 0x" + Twine::utohexstr(Phdr.p_offset) + .....

I think for start we should probably focus on (B) and probably can keep unwrapOrError untouched until we convert them and add tests.

Probably, we could leave unwrapOrError expanded by this patch?
This patch does not change the logic existed with unwrapOrError.
It just removes the function and put the code inplace.
Before this patch:

for (const auto &Phdr : unwrapOrError(HeadersFile.program_headers()))

after this patch:

Expected<typename ELFFile<ELFT>::Elf_Phdr_Range> Headers =
    HeadersFile.program_headers();
if (!Headers)
  return Headers.takeError();

for (const typename ELFFile<ELFT>::Elf_Phdr &Phdr : *Headers)

The unwrapOrError was defined as :

template <class T> T unwrapOrError(Expected<T> EO) {
  if (EO)
    return *EO;
  std::string Buf;
  raw_string_ostream OS(Buf);
  logAllUnhandledErrors(EO.takeError(), OS);
  OS.flush();
  error(Buf);
}

i.e. it either continues execution, either report the error and die.
After unwrapOrError() is expanded the behavior is not changed: it either
continues execution, either report Error which is handled in llvm-objcopy.cpp and stop.

WDYT, Would it be OK to remove unwrapOrError() by this patch?

rupprecht accepted this revision.Oct 1 2020, 2:22 PM

LGTM, thanks again!

grimar accepted this revision.Oct 2 2020, 12:38 AM
In D87987#2305883, @avl wrote:

@grimar
....
WDYT, Would it be OK to remove unwrapOrError() by this patch?

Sorry, I've not noticed this question yesterday. It is probably fine.

This revision was landed with ongoing or failed builds.Oct 2 2020, 1:56 PM
This revision was automatically updated to reflect the committed changes.
llvm/tools/llvm-objcopy/MachO/MachOReader.cpp