This is an archive of the discontinued LLVM Phabricator instance.

[llvm-readobj][ELF] Move ELF note parsing into lib/Object
ClosedPublic

Authored by scott.linder on Mar 1 2018, 10:21 AM.

Diff Detail

Event Timeline

scott.linder created this revision.Mar 1 2018, 10:21 AM

The first patch does not check the bounds on the first note before the user can call getName/getDesc. This patch performs the bounds check ahead of time, when the iterator is first created, and immediately after every increment.

t-tye added inline comments.Mar 2 2018, 3:03 PM
include/llvm/Object/ELFTypes.h
623

To be consistent with operator++ suggest:

if (From == To)
  Nhdr = nullptr;
else
  checkBounds();

This supports iterating over any empty notes list by ensuring the returned iterator is the end iterator (represented as Nhdr == nullptr).

Technically setting Nhdr to pointing outside an object or one past the end of an object is not defined in C++. So really should be checking the sizes before creating the pointers.

627

Suggest add an assert for Nhdr being nullptr. That ensures are not incrementing an end iterator. This also avoids checkBounds having to check Err is nullptr as the only way for that to happen is by constructing an end iterator.

643–644

Suggest this return a reference type not a pointer, and asserts if Nhdr is nullptr indicating the end iterator.

This patch should make the interfaces safe in any context. Bounds checks are done using sizes, and are completed before pointers are created. The checks are done only as necessary: the size of the container (section or program header) is checked before creating the iterator, the size of each Nhdr is checked before a pointer to it is created, and the sizes of the name/descriptor are checked before the StringRef/ArrayRef are created.

Programmer errors, like continuing to iterate after reaching the end iterator or providing a non-success Error when creating the start iterator, are caught with asserts.

The constructors of of the Elf_Note_Impl and Elf_Note_Iterator_Impl are still private with friend declarations. Would we prefer to make these public? I cannot think of a use-case, as all calling code should be using the functions in ELFFile to access notes, and this makes it harder for a caller to get an iterator or note in a bad state.

t-tye added inline comments.Mar 5 2018, 9:56 PM
include/llvm/Object/ELFTypes.h
643

Is this constructor needed?

647

Since this iterator constructor always requires an Err should it be passed as Error&?

651–654

An alternative approach is that all checking for a note being legal is done in the iterator constructor and operator++. The check could be factored into a common helper function called by both. That would include ensuring that the RemainingSize is > the *Nhrd size, getting the pointer to the Nhdr, followed by checking that Nhdr->getSize <= RemainingSize (which is the entire size of the note record). That would avoid needing to do any checking in the access functions for the note name or description which avoids the need for MaxSize, and avoids them needing to return an Expected as they will always return valid values.

scott.linder added inline comments.Mar 6 2018, 8:19 AM
include/llvm/Object/ELFTypes.h
643

Not strictly, but for code in ELF.h it makes the intent of "construct a failing end iterator" clearer, like in return Elf_Note_Iterator(&Err);

647

This was a typo, I was also intending to check the bool of the Error, not just the Error *. Thank you for the catch, I will update the patch.

There are cases (such as constructing the end iterator) where there is no chance for error, and so having a non-nullable reference is not appropriate (and complicates the calling code for no benefit). The alternative is an Optional<Error &>, which I can update the patch to use. I could also do the same for the pointer to the Nhdr, and in both cases it would make the fact that they are optional clearer.

651–654

Yes, I would also be fine with this approach, and the checks being in Note currently is in part left over from starting with a public constructor for the type.

The reason I did not hoist the checks up into the iterator is that I am not confident we want the constructor to be private. If we are OK with it, I can move the checks up to avoid the extra size_t and duplication of bound checks.

scott.linder added inline comments.Mar 6 2018, 8:25 AM
include/llvm/Object/ELFTypes.h
647

I think I understand what you mean after re-reading your comment; I can certainly make the constructor accept the reference and take the pointer itself, so I will make that change at least.

Address feedback from Rafael and Tony.

  • ProcessNote now accepts a reference, thank you for the catch.
  • Removed the assert on Err in the iterator constructor; the check actually marks the Error as "checked", which is incorrect.
  • Moved the "whole note" bound checks up into the iterator, unifying the bound checks in once place.

All of the public interfaces are still safe, and there is a bit less waste (not copying around MaxSize, not checking the "whole note" bounds multiple times per iteration).

As for Rafael's comment on checking errors in each iteration, there is no need as the iterator is careful to always stop iteration when an error is encountered, so one check after iteration ends should suffice.

Scott Linder via Phabricator <reviews@reviews.llvm.org> writes:

scott.linder updated this revision to Diff 137080.
scott.linder added a comment.

This patch should make the interfaces safe in any context. Bounds checks are done using sizes, and are completed before pointers are created. The checks are done only as necessary: the size of the container (section or program header) is checked before creating the iterator, the size of each Nhdr is checked before a pointer to it is created, and the sizes of the name/descriptor are checked before the StringRef/ArrayRef are created.

Programmer errors, like continuing to iterate after reaching the end iterator or providing a non-success Error when creating the start iterator, are caught with asserts.

Thanks, that is exactly what we should aim for.

+ auto ProcessNote = [&](const Elf_Note Note) {

Don't you want a reference?

+ PrintHeader(P.p_offset, P.p_filesz);
+ Error Err = Error::success();
+ for (const auto &Note : Obj->notes(P, Err))
+ ProcessNote(Note);
+ if (Err)
+ error(std::move(Err));

Don't you need to check Err on every loop iteration?

+ Elf_Note_Iterator_Impl &operator++() {
+ assert(Nhdr && "incremented ELF note end iterator");
+ size_t NhdrSize = Nhdr->getSize();
+ if (NhdrSize >= RemainingSize) {

The idea seems to be that we can create an iterator to a note that is
truncated, but we will detect that if we try to iterate past it or
access the truncated part with getDesc/getName.

That is fine and error handling at its most precise. On the other hand,
it might be simpler to error on an attempt at creating an iterator that
points to the truncated note. I am OK with both.

LGTM with the above nits.

Cheers,
Rafael


llvm-commits mailing list
llvm-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits

LGTM

Thanks!

Scott Linder via Phabricator <reviews@reviews.llvm.org> writes:

scott.linder updated this revision to Diff 137241.
scott.linder added a comment.

Address feedback from Rafael and Tony.

  • ProcessNote now accepts a reference, thank you for the catch.
  • Removed the assert on Err in the iterator constructor; the check actually marks the Error as "checked", which is incorrect.
  • Moved the "whole note" bound checks up into the iterator, unifying the bound checks in once place.

All of the public interfaces are still safe, and there is a bit less waste (not copying around MaxSize, not checking the "whole note" bounds multiple times per iteration).

As for Rafael's comment on checking errors in each iteration, there is no need as the iterator is careful to always stop iteration when an error is encountered, so one check after iteration ends should suffice.

https://reviews.llvm.org/D43958

Files:

include/llvm/BinaryFormat/ELF.h
include/llvm/Object/ELF.h
include/llvm/Object/ELFTypes.h
tools/llvm-readobj/ELFDumper.cpp

Index: tools/llvm-readobj/ELFDumper.cpp

  • tools/llvm-readobj/ELFDumper.cpp

+++ tools/llvm-readobj/ELFDumper.cpp
@@ -92,6 +92,7 @@

using Elf_Word = typename ELFT::Word;                                        \
using Elf_Hash = typename ELFT::Hash;                                        \
using Elf_GnuHash = typename ELFT::GnuHash;                                  \

+ using Elf_Note = typename ELFT::Note; \

using Elf_Sym_Range = typename ELFT::SymRange;                               \
using Elf_Versym = typename ELFT::Versym;                                    \
using Elf_Verneed = typename ELFT::Verneed;                                  \

@@ -3534,62 +3535,57 @@

const Elf_Ehdr *e = Obj->getHeader();
bool IsCore = e->e_type == ELF::ET_CORE;
  • auto process = [&](const typename ELFT::Off Offset,
  • const typename ELFT::Addr Size) {
  • if (Size <= 0)
  • return; -
  • const auto *P = static_cast<const uint8_t *>(Obj->base() + Offset);
  • const auto *E = P + Size; -

+ auto PrintHeader = [&](const typename ELFT::Off Offset,
+ const typename ELFT::Addr Size) {

OS << "Displaying notes found at file offset " << format_hex(Offset, 10)
   << " with length " << format_hex(Size, 10) << ":\n"
   << "  Owner                 Data size\tDescription\n";

+ };

  • while (P < E) {
  • const Elf_Word *Words = reinterpret_cast<const Elf_Word *>(&P[0]); -
  • uint32_t NameSize = Words[0];
  • uint32_t DescriptorSize = Words[1];
  • uint32_t Type = Words[2]; -
  • ArrayRef<Elf_Word> Descriptor(&Words[3 + (alignTo<4>(NameSize) / 4)],
  • alignTo<4>(DescriptorSize) / 4); -
  • StringRef Name;
  • if (NameSize)
  • Name =
  • StringRef(reinterpret_cast<const char *>(&Words[3]), NameSize - 1); -
  • OS << " " << Name << std::string(22 - NameSize, ' ')
  • << format_hex(DescriptorSize, 10) << '\t'; -
  • if (Name == "GNU") {
  • OS << getGNUNoteTypeName(Type) << '\n';
  • printGNUNote<ELFT>(OS, Type, Descriptor, DescriptorSize);
  • } else if (Name == "FreeBSD") {
  • OS << getFreeBSDNoteTypeName(Type) << '\n';
  • } else if (Name == "AMD") {
  • OS << getAMDGPUNoteTypeName(Type) << '\n';
  • printAMDGPUNote<ELFT>(OS, Type, Descriptor, DescriptorSize);
  • } else {
  • OS << "Unknown note type: (" << format_hex(Type, 10) << ')';
  • }
  • OS << '\n'; -
  • P = P + 3 * sizeof(Elf_Word) + alignTo<4>(NameSize) +
  • alignTo<4>(DescriptorSize);

+ auto ProcessNote = [&](const Elf_Note &Note) {
+ StringRef Name = Note.getName();
+ ArrayRef<Elf_Word> Descriptor = Note.getDesc();
+ Elf_Word Type = Note.getType();
+
+ OS << " " << Name << std::string(22 - Name.size(), ' ')
+ << format_hex(Descriptor.size(), 10) << '\t';
+
+ if (Name == "GNU") {
+ OS << getGNUNoteTypeName(Type) << '\n';
+ printGNUNote<ELFT>(OS, Type, Descriptor, Descriptor.size());
+ } else if (Name == "FreeBSD") {
+ OS << getFreeBSDNoteTypeName(Type) << '\n';
+ } else if (Name == "AMD") {
+ OS << getAMDGPUNoteTypeName(Type) << '\n';
+ printAMDGPUNote<ELFT>(OS, Type, Descriptor, Descriptor.size());
+ } else {
+ OS << "Unknown note type: (" << format_hex(Type, 10) << ')';

}

+ OS << '\n';

};
 
if (IsCore) {
  • for (const auto &P : unwrapOrError(Obj->program_headers()))
  • if (P.p_type == PT_NOTE)
  • process(P.p_offset, P.p_filesz);

+ for (const auto &P : unwrapOrError(Obj->program_headers())) {
+ if (P.p_type != PT_NOTE)
+ continue;
+ PrintHeader(P.p_offset, P.p_filesz);
+ Error Err = Error::success();
+ for (const auto &Note : Obj->notes(P, Err))
+ ProcessNote(Note);
+ if (Err)
+ error(std::move(Err));
+ }

} else {
  • for (const auto &S : unwrapOrError(Obj->sections()))
  • if (S.sh_type == SHT_NOTE)
  • process(S.sh_offset, S.sh_size);

+ for (const auto &S : unwrapOrError(Obj->sections())) {
+ if (S.sh_type != SHT_NOTE)
+ continue;
+ PrintHeader(S.sh_offset, S.sh_size);
+ Error Err = Error::success();
+ for (const auto &Note : Obj->notes(S, Err))
+ ProcessNote(Note);
+ if (Err)
+ error(std::move(Err));
+ }

}

}

Index: include/llvm/Object/ELFTypes.h

  • include/llvm/Object/ELFTypes.h

+++ include/llvm/Object/ELFTypes.h
@@ -40,6 +40,9 @@
template <class ELFT> struct Elf_Hash_Impl;
template <class ELFT> struct Elf_GnuHash_Impl;
template <class ELFT> struct Elf_Chdr_Impl;
+template <class ELFT> struct Elf_Nhdr_Impl;
+template <class ELFT> class Elf_Note_Impl;
+template <class ELFT> class Elf_Note_Iterator_Impl;

template <endianness E, bool Is64> struct ELFType {
private:
@@ -66,6 +69,9 @@

using Hash = Elf_Hash_Impl<ELFType<E, Is64>>;
using GnuHash = Elf_GnuHash_Impl<ELFType<E, Is64>>;
using Chdr = Elf_Chdr_Impl<ELFType<E, Is64>>;

+ using Nhdr = Elf_Nhdr_Impl<ELFType<E, Is64>>;
+ using Note = Elf_Note_Impl<ELFType<E, Is64>>;
+ using NoteIterator = Elf_Note_Iterator_Impl<ELFType<E, Is64>>;

using DynRange = ArrayRef<Dyn>;
using ShdrRange = ArrayRef<Shdr>;
using SymRange = ArrayRef<Sym>;

@@ -551,6 +557,127 @@

Elf_Xword ch_addralign;

};

+/ Note header
+template <class ELFT>
+struct Elf_Nhdr_Impl {
+ LLVM_ELF_IMPORT_TYPES_ELFT(ELFT)
+ Elf_Word n_namesz;
+ Elf_Word n_descsz;
+ Elf_Word n_type;
+
+
/ The alignment of the name and descriptor.
+ /
+
/ Implementations differ from the specification here: in practice all
+ / variants align both the name and descriptor to 4-bytes.
+ static const unsigned int Align = 4;
+
+
/ Get the size of the note, including name, descriptor, and padding.
+ size_t getSize() const {
+ return sizeof(*this) + alignTo<Align>(n_namesz) + alignTo<Align>(n_descsz);
+ }
+};
+
+/ An ELF note.
+
/
+/ Wraps a note header, providing methods for accessing the name and
+
/ descriptor safely.
+template <class ELFT>
+class Elf_Note_Impl {
+ LLVM_ELF_IMPORT_TYPES_ELFT(ELFT)
+
+ const Elf_Nhdr_Impl<ELFT> &Nhdr;
+
+ template <class NoteIteratorELFT> friend class Elf_Note_Iterator_Impl;
+
+ Elf_Note_Impl(const Elf_Nhdr_Impl<ELFT> &Nhdr) : Nhdr(Nhdr) {}
+
+public:
+ / Get the note's name, excluding the terminating null byte.
+ StringRef getName() const {
+ if (!Nhdr.n_namesz)
+ return StringRef();
+ return StringRef(reinterpret_cast<const char *>(&Nhdr) + sizeof(Nhdr),
+ Nhdr.n_namesz - 1);
+ }
+
+
/ Get the note's descriptor.
+ ArrayRef<Elf_Word> getDesc() const {
+ if (!Nhdr.n_descsz)
+ return ArrayRef<Elf_Word>();
+ return ArrayRef<Elf_Word>(
+ reinterpret_cast<const Elf_Word *>(
+ reinterpret_cast<const uint8_t *>(&Nhdr) + sizeof(Nhdr) +
+ alignTo<Elf_Nhdr_Impl<ELFT>::Align>(Nhdr.n_namesz)),
+ Nhdr.n_descsz);
+ }
+
+ / Get the note's type.
+ Elf_Word getType() const { return Nhdr.n_type; }
+};
+
+template <class ELFT>
+class Elf_Note_Iterator_Impl
+ : std::iterator<std::forward_iterator_tag, Elf_Note_Impl<ELFT>> {
+
Nhdr being a nullptr marks the end of iteration.
+ const Elf_Nhdr_Impl<ELFT> *Nhdr = nullptr;
+ size_t RemainingSize = 0u;
+ Error *Err = nullptr;
+
+ template <class ELFFileELFT> friend class ELFFile;
+
+ Stop iteration and indicate an overflow.
+ void stopWithOverflowError() {
+ Nhdr = nullptr;
+ *Err = make_error<StringError>("ELF note overflows container",
+ object_error::parse_failed);
+ }
+
+
Advance Nhdr by NoteSize bytes, starting from NhdrPos.
+
+
Assumes NoteSize <= RemainingSize. Ensures Nhdr->getSize() <= RemainingSize
+ upon returning. Handles stopping iteration when reaching the end of the
+
container, either cleanly or with an overflow error.
+ void advanceNhdr(const uint8_t *NhdrPos, size_t NoteSize) {
+ RemainingSize -= NoteSize;
+ if (RemainingSize == 0u)
+ Nhdr = nullptr;
+ else if (sizeof(*Nhdr) > RemainingSize)
+ stopWithOverflowError();
+ else {
+ Nhdr = reinterpret_cast<const Elf_Nhdr_Impl<ELFT> *>(NhdrPos + NoteSize);
+ if (Nhdr->getSize() > RemainingSize)
+ stopWithOverflowError();
+ }
+ }
+
+ Elf_Note_Iterator_Impl() {}
+ explicit Elf_Note_Iterator_Impl(Error &Err) : Err(&Err) {}
+ Elf_Note_Iterator_Impl(const uint8_t *Start, size_t Size, Error &Err)
+ : RemainingSize(Size), Err(&Err) {
+ assert(Start && "ELF note iterator starting at NULL");
+ advanceNhdr(Start, 0u);
+ }
+
+public:
+ Elf_Note_Iterator_Impl &operator++() {
+ assert(Nhdr && "incremented ELF note end iterator");
+ const uint8_t *NhdrPos = reinterpret_cast<const uint8_t *>(Nhdr);
+ size_t NoteSize = Nhdr->getSize();
+ advanceNhdr(NhdrPos, NoteSize);
+ return *this;
+ }
+ bool operator==(Elf_Note_Iterator_Impl Other) const {
+ return Nhdr == Other.Nhdr;
+ }
+ bool operator!=(Elf_Note_Iterator_Impl Other) const {
+ return !(*this == Other);
+ }
+ Elf_Note_Impl<ELFT> operator*() const {
+ assert(Nhdr && "dereferenced ELF note end iterator");
+ return Elf_Note_Impl<ELFT>(*Nhdr);
+ }
+};
+
// MIPS .reginfo section
template <class ELFT>
struct Elf_Mips_RegInfo;

Index: include/llvm/Object/ELF.h

  • include/llvm/Object/ELF.h

+++ include/llvm/Object/ELF.h
@@ -67,6 +67,9 @@

using Elf_Versym = typename ELFT::Versym;
using Elf_Hash = typename ELFT::Hash;
using Elf_GnuHash = typename ELFT::GnuHash;

+ using Elf_Nhdr = typename ELFT::Nhdr;
+ using Elf_Note = typename ELFT::Note;
+ using Elf_Note_Iterator = typename ELFT::NoteIterator;

using Elf_Dyn_Range = typename ELFT::DynRange;
using Elf_Shdr_Range = typename ELFT::ShdrRange;
using Elf_Sym_Range = typename ELFT::SymRange;

@@ -155,6 +158,73 @@

  return makeArrayRef(Begin, Begin + getHeader()->e_phnum);
}

+ / Get an iterator over notes in a program header.
+
/
+ / The program header must be of type \c PT_NOTE.
+
/
+ / \param Phdr the program header to iterate over.
+
/ \param Err [out] an error to support fallible iteration, which should
+ / be checked after iteration ends.
+ Elf_Note_Iterator notes_begin(const Elf_Phdr &Phdr, Error &Err) const {
+ if (Phdr.p_type != ELF::PT_NOTE) {
+ Err = createError("attempt to iterate notes of non-note program header");
+ return Elf_Note_Iterator(Err);
+ }
+ if (Phdr.p_offset + Phdr.p_filesz > getBufSize()) {
+ Err = createError("invalid program header offset/size");
+ return Elf_Note_Iterator(Err);
+ }
+ return Elf_Note_Iterator(base() + Phdr.p_offset, Phdr.p_filesz, Err);
+ }
+
+
/ Get an iterator over notes in a section.
+ /
+
/ The section must be of type \c SHT_NOTE.
+ /
+
/ \param Shdr the section to iterate over.
+ / \param Err [out] an error to support fallible iteration, which should
+
/ be checked after iteration ends.
+ Elf_Note_Iterator notes_begin(const Elf_Shdr &Shdr, Error &Err) const {
+ if (Shdr.sh_type != ELF::SHT_NOTE) {
+ Err = createError("attempt to iterate notes of non-note section");
+ return Elf_Note_Iterator(Err);
+ }
+ if (Shdr.sh_offset + Shdr.sh_size > getBufSize()) {
+ Err = createError("invalid section offset/size");
+ return Elf_Note_Iterator(Err);
+ }
+ return Elf_Note_Iterator(base() + Shdr.sh_offset, Shdr.sh_size, Err);
+ }
+
+ / Get the end iterator for notes.
+ Elf_Note_Iterator notes_end() const {
+ return Elf_Note_Iterator();
+ }
+
+
/ Get an iterator range over notes of a program header.
+ /
+
/ The program header must be of type \c PT_NOTE.
+ /
+
/ \param Phdr the program header to iterate over.
+ / \param Err [out] an error to support fallible iteration, which should
+
/ be checked after iteration ends.
+ iterator_range<Elf_Note_Iterator> notes(const Elf_Phdr &Phdr,
+ Error &Err) const {
+ return make_range(notes_begin(Phdr, Err), notes_end());
+ }
+
+ / Get an iterator range over notes of a section.
+
/
+ / The section must be of type \c SHT_NOTE.
+
/
+ / \param Shdr the section to iterate over.
+
/ \param Err [out] an error to support fallible iteration, which should
+ /// be checked after iteration ends.
+ iterator_range<Elf_Note_Iterator> notes(const Elf_Shdr &Shdr,
+ Error &Err) const {
+ return make_range(notes_begin(Shdr, Err), notes_end());
+ }
+

Expected<StringRef> getSectionStringTable(Elf_Shdr_Range Sections) const;
Expected<uint32_t> getSectionIndex(const Elf_Sym *Sym, Elf_Sym_Range Syms,
                                   ArrayRef<Elf_Word> ShndxTable) const;

Index: include/llvm/BinaryFormat/ELF.h

  • include/llvm/BinaryFormat/ELF.h

+++ include/llvm/BinaryFormat/ELF.h
@@ -1483,6 +1483,20 @@

Elf64_Xword ch_addralign;

};

+ Node header for ELF32.
+struct Elf32_Nhdr {
+ Elf32_Word n_namesz;
+ Elf32_Word n_descsz;
+ Elf32_Word n_type;
+};
+
+
Node header for ELF64.
+struct Elf64_Nhdr {
+ Elf64_Word n_namesz;
+ Elf64_Word n_descsz;
+ Elf64_Word n_type;
+};
+
// Legal values for ch_type field of compressed section header.
enum {

ELFCOMPRESS_ZLIB = 1,            // ZLIB/DEFLATE algorithm.
This revision was not accepted when it landed; it landed in state Needs Review.Mar 12 2018, 12:31 PM
This revision was automatically updated to reflect the committed changes.
t-tye added inline comments.Mar 12 2018, 12:35 PM
include/llvm/Object/ELF.h
173

Should this return the end iterator if the note section size is 0?

If not the end_iterator should this check that the total size of the note record (including the data of the name and desc) fits in the section? The operator++ does that check.

scott.linder added inline comments.Mar 12 2018, 12:54 PM
include/llvm/Object/ELF.h
173

If the size (p_filesz) is 0 here, the advanceNhdr call in the Iterator constructor will immediately see that RemainingSize == 0 and end iteration cleanly. The check is done in the Iterator because it would otherwise be duplicated here: once for program headers, and once for sections.

t-tye added a comment.Mar 12 2018, 1:05 PM

LGTM

include/llvm/Object/ELF.h
173

Sorry, I noticed that and was going to retract my comment but you beat me to it:-)