Page MenuHomePhabricator

[llvm-readelf] Implement note parsing for NT_FILE and unknown descriptors

Authored by rupprecht on Aug 6 2019, 3:38 PM.



This patch implements two note parsers; one for NT_FILE coredumps, e.g.:

CORE                  0x00000080      NT_FILE (mapped files)
  Page size: 4096
               Start                 End         Page Offset
  0x0000000000001000  0x0000000000002000  0x0000000000003000
  0x0000000000004000  0x0000000000005000  0x0000000000006000
  0x0000000000007000  0x0000000000008000  0x0000000000009000

(A more realistic example can be tested locally by creating a crashing program and running llvm-readelf -n core)

And also implements a raw hex dump for unknown descriptor data for unhandled descriptor types.

Diff Detail


Event Timeline

rupprecht created this revision.Aug 6 2019, 3:38 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 6 2019, 3:38 PM
MaskRay added inline comments.Aug 6 2019, 10:08 PM
4362 ↗(On Diff #213737)

StringRef Filename(reinterpret_cast<const char *>(;

4363 ↗(On Diff #213737)

(I think it is NUL, not null.)

4369 ↗(On Diff #213737)

Or for (CoreFileMapping<ELFT> &Mapping : Ret.Mappings) {

4442 ↗(On Diff #213737)

|| Obj->getHeader()->e_type == ELF::ET_CORE can be deleted.

readelf -n doesn't seem to perform the test, either.

5601 ↗(On Diff #213737)

|| Obj->getHeader()->e_type == ELF::ET_CORE can be deleted.

grimar added inline comments.Aug 7 2019, 1:25 AM
62 ↗(On Diff #213737)

Perhaps worth to check there is no more output here:

# GNU-NOT: {{.}}
4339 ↗(On Diff #213737)

Other errors in this file are lowercase, I think the new errors should follow.

5556 ↗(On Diff #213737)

This part is common and can be reused for both LLVM and GNU styles:

if (NoteType != ELF::NT_FILE)

Expected<CoreNote<ELFT>> Note = readCoreNote<ELFT>(Desc);
if (!Note) {

I.e. I would try either add a one more helper or try to combine printCoreNoteLLVMStyle and printCoreNote.

rupprecht updated this revision to Diff 213955.Aug 7 2019, 11:16 AM
rupprecht marked 12 inline comments as done.
  • Remove ET_CORE check
  • null -> NUL
  • Use more consistent error messages
  • Check for no more lines in test output
  • Other misc cleanup
4339 ↗(On Diff #213737)

These are just lifted directly from GNU readelf, but we probably don't need to match them -- reformatted/reworded all of them to be more consistent. I'm open to different wording for any of them.

(especially "too short for supplied file count", I've rewritten it as "too short for number of files", but it still sounds a little awkward)

4363 ↗(On Diff #213737)

Wow, TIL about the difference between NUL and null

4442 ↗(On Diff #213737)

Removed -- but I should elaborate, readelf does perform this check. Note that GNU readelf actually prints notes in two stages:

  • First it tries to print the note type. It starts by looking at the name, although the fallback (if the name is empty, doesn't match any of the known names, or in some (but not all) cases matches the name but doesn't match any known constants) will look at the file type (see process_note and get_note_type).
    • Note: none of the "names" it checks is "CORE", it _only_ prints out the NT_* core descriptions if the actual elf type is ET_CORE
  • Second, it prints note data, this time only looking at the name, i.e. if name is "CORE" (and it ignores the elf type) -- this is what you're talking about, but not the first.

I think these points don't matter that much. I think the only thing this would break is if someone is expecting a ET_REL/ET_EXEC file with CORE notes to print non-core NT_* fields. But still, I think we should figure out if/how we want to match GNU readelf for printing notes.

5556 ↗(On Diff #213737)

I moved some of this into the GNU/LLVMStyle::printNotes methods and it's a bit more succinct now, PTAL -- I'm not sure if it's what you're going for.

rupprecht marked an inline comment as done.Aug 7 2019, 12:25 PM
rupprecht added inline comments.
4442 ↗(On Diff #213737)

Actually, I do notice a change after testing this locally:
When I run a dummy crashing program and then run llvm-readelf -n core, there's a note with type "LINUX" that says "NT_X86_XSTATE (x86 XSAVE extended state)" in GNU readelf that now shows up as "Unknown note type: (0x00000202)"

So, I'll have to undo this part of the change and update the diff in a bit. We should consider separating the note type vs description printing similarly.
(Aside: there is very little llvm-readobj test coverage for the FreeBSD/AMD/AMDGPU note types above, I'll add some in a separate commit)

rupprecht updated this revision to Diff 213987.Aug 7 2019, 1:53 PM
  • Restore ET_CORE check
MaskRay added inline comments.Aug 8 2019, 12:37 AM
7 ↗(On Diff #213987)

llvm-mc doesn't support generating ET_CORE files; the following section data was generated with the following steps:

May be worth mentioning that the Content field of` was derived from the following assembly.

The comment make me think that other fields of the file are derived from llvm-mc output, but they aren't according to "following steps".

(Not necessarily mentioned in the comment: llvm-readobj doesn't give hex pairs you can immediately use, but llvm-objcopy>(xxd -p) tmp.o /dev/null (>(..) is zsh syntax) does.)

14 ↗(On Diff #213987)

Nit: /* */ -> #

# is probably a bit easier to read.

4442 ↗(On Diff #213737)

Verified. || Obj->getHeader()->e_type == ELF::ET_CORE dumps "LINUX". This makes me think if we should use more specific Name == "CORE" || Name == "LINUX".

I think this should be sufficient. See Linux/fs/binfmt_elf{,_fdpic}.c, fill_note() is only called on "CORE" and "LINUX".

grimar added inline comments.Aug 8 2019, 2:33 AM
5556 ↗(On Diff #213737)

Looks fine, thanks.

rupprecht updated this revision to Diff 214440.Aug 9 2019, 1:25 PM
rupprecht marked 4 inline comments as done.
  • Split up type/description printing
  • Add printing for unknown note types
  • Clarify comment about generating section data
rupprecht added inline comments.Aug 9 2019, 1:25 PM
14 ↗(On Diff #213987)

I don't really have a preference, but this is consistent with all the other tests like this in this package.

4442 ↗(On Diff #213737)

I think it's safer to check for the ET_CORE since the name seems vendor/kernel specific.

Actually, there's a bigger issue... the conditional logic for printing types vs description data seems to be orthogonal, so I've gone ahead and split it up. I've included my next (unmailed) patch for printing raw description bytes to show a stronger motivation, e.g. FreeBSD has custom note types defined (and so has a method to get those types), but goes through the fallback logic for printing raw bytes for the description. That's difficult/error prone to do with just one loop.

rupprecht updated this revision to Diff 214485.Aug 9 2019, 5:02 PM
  • Use DataExtractor instead of custom logic + remove some now unnecessary template use
alexshap added inline comments.Aug 11 2019, 7:39 AM
4420 ↗(On Diff #214485)

maybe const CoreNote &Note ?

rupprecht updated this revision to Diff 214642.Aug 12 2019, 8:43 AM
rupprecht marked an inline comment as done.
  • Use const for CoreNote param
rupprecht retitled this revision from [llvm-readelf] Implement NT_FILE core file parsing to [llvm-readelf] Implement note parsing for NT_FILE and unknown descriptors.Aug 12 2019, 12:08 PM
rupprecht edited the summary of this revision. (Show Details)
grimar accepted this revision.Aug 13 2019, 2:11 AM

LGTM. Please wait for other opinion(s) too.

30 ↗(On Diff #214642)

Please align:

Class:   ELFCLASS64
Type:    ET_CORE
Machine: EM_X86_64
34 ↗(On Diff #214642)

and here (and below).

This revision is now accepted and ready to land.Aug 13 2019, 2:11 AM
MaskRay accepted this revision.Aug 13 2019, 3:48 AM
This revision was automatically updated to reflect the committed changes.