This is an archive of the discontinued LLVM Phabricator instance.

[llvm-readelf/obj] - Allow dumping of ELF header even if some elements are corrupt.
ClosedPublic

Authored by grimar on Nov 5 2020, 10:47 PM.

Details

Summary

For creating ELFObjectFile instances we have the factory method
ELFObjectFile<ELFT>::create(MemoryBufferRef Object).

The problem of this method is that it scans the section header to locate some sections.
When a file is truncated or has broken fields in the ELF header, this approach does
not allow us to create the ELFObjectFile and dump the ELF header.

This is https://bugs.llvm.org/show_bug.cgi?id=40804

This patch suggests a solution - it allows to delay scaning sections in the
ELFObjectFile<ELFT>::create. It now allows user code to call an object
initialization (initContent()) later. With that it is possible,
for example, for dumpers just to dump the file header and exit.
By default initialization is still performed as before, what helps to keep
the logic of existent callers untouched.

I've experimented with different approaches when worked on this patch.
I think this approach is better than doing initialization of sections (i.e. scan of them)
on demand, because normally users of ELFObjectFile API expect to work with a valid object.
In most cases when a section header table can't be read (because of an error), we don't
have to continue to work with object. So we probably don't need to implement a more complex API.

Diff Detail

Event Timeline

grimar created this revision.Nov 5 2020, 10:47 PM
Herald added a project: Restricted Project. · View Herald Transcript
grimar requested review of this revision.Nov 5 2020, 10:47 PM
jhenderson accepted this revision.Nov 6 2020, 2:52 AM

LGTM. About the only other viable alternative I could think of would be to have the sections that are initialised early stored in some sort of container that does the lazy initialization itself, and then being a transparent object thereafter (like the Proxy design pattern). This would delay the error reporting until later. However, that would probably require call sites to be updated to handle errors where they don't currently, which I can understand might be a pain.

llvm/include/llvm/Object/ELFObjectFile.h
249
This revision is now accepted and ready to land.Nov 6 2020, 2:52 AM
grimar added a comment.Nov 6 2020, 3:06 AM

LGTM. About the only other viable alternative I could think of would be to have the sections that are initialised early stored in some sort of container that does the lazy initialization itself, and then being a transparent object thereafter (like the Proxy design pattern). This would delay the error reporting until later. However, that would probably require call sites to be updated to handle errors where they don't currently, which I can understand might be a pain.

I've tried to delay the initialization when experimented with possible solutions. In that experiment I had 3 additional functions: getDynSymSec(), getSymtabSec() and getSymtabShndxSec() which lazily initialized the
pointers to corresponding sections on call. It was a pain, because some tools, like llvm-ar still want to see the old behavior, i.e. to report an error early when they are unable to
initialize an object because of broken section header table. For them I had to add the new validate()/initialize() method, but this did not look nice:
it is easy to forget to call such method after creating an object and it also required too many call sites to be updated.