This is an archive of the discontinued LLVM Phabricator instance.

[llvm-readobj] - Simplify conditions used for printing segment mappings. NFCI.
ClosedPublic

Authored by grimar on Apr 23 2020, 6:19 AM.

Details

Summary

This patch is a NFC refactoring.

Currently the logic is overcomplicated, contains dead conditions and is very hard to read.
This patch performs a very straightforward simplification. Probably it can be
simplified and improved more, but we need to land test cases documenting/testing
all the current functionality first.

Diff Detail

Event Timeline

grimar created this revision.Apr 23 2020, 6:19 AM
grimar marked an inline comment as done.Apr 23 2020, 6:19 AM
grimar added inline comments.
llvm/tools/llvm-readobj/ELFDumper.cpp
4154

This now lives in checkTLSSections

jhenderson added inline comments.Apr 24 2020, 1:44 AM
llvm/tools/llvm-readobj/ELFDumper.cpp
4022–4036

Unrelated to this commit, but I'm not at all convinced by this function.

Firstly, I wonder if this function really should be specifically abpout TLS segments, or if there's some equivalent behaviour for all segment types + NOBITS sections? For example, what if there were a .bss.rel.ro segment at the end of the PT_GNU_RELRO segment? Should it appear in the parent data segment or not (I'm guessing not)?

Example linker script snippet:

.data.rel.ro : { *(.data.rel.ro .data.rel.ro.*) } : ph_relro : ph_data
.bss.rel.ro : { *(.bss.rel.ro .bss.rel.ro.*) } : ph_relro
.data : { *(.data .data.*) } : ph_data

Semantically, it doesn't sound much different to me to TLS nobits stuff.

Secondly, why should we omit non TLS sections from the TLS segment printing and why should we omit TLS sections from non-TLS/LOAD/RELRO segments? It seems like the logic is unnecessarily complicated here.

4025

I'd get rid of "The"

4042

I'd rephrase this comment to "SHT_NOBITS sections don't need to have an offset inside the segment."

Aside: is that true compared to what GNU readelf does?

4049

non-zero section -> non-empty sections

4051

It might be good for the RHS of this expression to match the order in the below equivalent line.

4052

Nit: too many spaces before Sec.sh_size. Did you clang-format?

4071

non-zero section -> non-empty sections

4074

Might be worth using Sec.sh_size instead of SectionSize here?

4076

I think this comment made sense. Please reinstate it.

4078–4080

Similar to the TLS one, this function seems bonkers. Why does it exist at all? Why is the dynamic segment special compared to other segments?

4083

This comment belongs with the above if, I think.

4161–4162

Related to my above comments re. TLS/Dynamic segments, I wonder how much of this actually happens in GNU readelf in practice, and how much was incidental for other reasons? Could you check if you are considering looking more at my other comments?

I might even consider some of the conditions that GNU readelf follows to be non-sensical and just ignore them, though that might be a bit controversial.

grimar marked 3 inline comments as done.Apr 24 2020, 3:08 AM

I've read all the comments and I have no answers for many of them atm.
(I am at the begining of investigating/understanding the logic we have. It looks messy.
This patch is the first step that reveals the conditions we have to document/test them).

Currently I am working on a patch to document the existent behavior.
I am going to investigate the conditions, GNU readelf behavior and probably it's code.
Hopefully it will help to understand and answer the questions that were asked.
It will take some time, I'll return here after posting it.

After landing this and the patch mentioned we will be able to clean-up/improve the logic.
(I am in fear of making any non-trivial changes/cleanups while we have no solid test coverage).

llvm/tools/llvm-readobj/ELFDumper.cpp
4083

I think no. It seems to be about

"(Sec.sh_offset > Phdr.p_offset && Sec.sh_offset < Phdr.p_offset + Phdr.p_filesz)"

One of conditions to get here is to have "Sec.sh_size == 0".

jhenderson added inline comments.Apr 24 2020, 3:51 AM
llvm/tools/llvm-readobj/ELFDumper.cpp
4083

This comment is to do with sections with zero size and PT_DYNAMIC, so the comment belongs with the zero size/PT_DYNAMIC check surely?

grimar marked an inline comment as done.Apr 24 2020, 4:27 AM
grimar added inline comments.
llvm/tools/llvm-readobj/ELFDumper.cpp
4083

See:

if (Phdr.p_type != ELF::PT_DYNAMIC || Phdr.p_memsz == 0 || Sec.sh_size != 0) above means that
we can get here only when a segment is PT_DYNAMIC and section has zero size.

The comment says "No section with zero size must be at the start or at the end of PT_DYNAMIC"

It is related to the following condition:

(Sec.sh_offset > Phdr.p_offset && Sec.sh_offset < Phdr.p_offset + Phdr.p_filesz)

Note that > and < are used here instead of >= and <=. I.e. it checks that the empty section is not at the
start or at the end. Just like comment says.

Right?

grimar marked an inline comment as done.Apr 24 2020, 4:55 AM
grimar added inline comments.
llvm/tools/llvm-readobj/ELFDumper.cpp
4083

I think I'll be able to improve comments in this method to avoid confusion.

grimar updated this revision to Diff 260608.Apr 28 2020, 6:16 AM
grimar marked 17 inline comments as done.
  • Addressed review comments.
Herald added a project: Restricted Project. · View Herald TranscriptApr 28 2020, 6:16 AM
grimar added inline comments.Apr 28 2020, 6:16 AM
llvm/tools/llvm-readobj/ELFDumper.cpp
4022–4036

This and other conditions seems to be based on the GNU readelf code.
They are either similar or very close. See:

https://github.com/bminor/binutils-gdb/blob/master/binutils/readelf.c#L5399
https://github.com/bminor/binutils-gdb/blob/master/include/elf/internal.h#L323
https://github.com/bminor/binutils-gdb/blob/master/include/elf/internal.h#L309

Perhaps, it worth keeping the checkTLSSections because it implements first conditions
of the ELF_SECTION_IN_SEGMENT_1 (https://github.com/bminor/binutils-gdb/blob/master/include/elf/internal.h#L323)
(that are related to TLS) what makes easier to compare the code (since we are trying to match GNU's behavior).

We have checkTLSSections, checkOffsets, checkVMA and checkPTDynamic currently.
Seems they just implement the checks mentioned. I.e. we have a split of ELF_SECTION_IN_SEGMENT_1 here.

4042

Yes, it seems to be true:

   /* Any section besides one of type SHT_NOBITS must have file		\
      offsets within the segment.  */					\
   && ((sec_hdr)->sh_type == SHT_NOBITS					\
       || ((bfd_vma) (sec_hdr)->sh_offset >= (segment)->p_offset	\
	   && (!(strict)						\
	       || ((sec_hdr)->sh_offset - (segment)->p_offset		\
		   <= (segment)->p_filesz - 1))				\
	   && (((sec_hdr)->sh_offset - (segment)->p_offset		\
		+ ELF_SECTION_SIZE(sec_hdr, segment))			\
	       <= (segment)->p_filesz)))

https://github.com/bminor/binutils-gdb/blob/master/include/elf/internal.h#L345

4078–4080

I believe it is based on the GNU readelf logic:
https://github.com/bminor/binutils-gdb/blob/master/include/elf/internal.h#L366

It says "No zero size sections at start or end of PT_DYNAMIC nor PT_NOTE.".
Here only the PT_DYNAMIC is handled, seems we want to handle PT_NOTE too.

4161–4162

I wonder how much of this actually happens in GNU readelf in practice, and how much was incidental for other reasons?

It is what they implement explicitly:
https://github.com/bminor/binutils-gdb/blob/master/include/elf/internal.h#L366

I might even consider some of the conditions that GNU readelf follows to be non-sensical and just ignore them

Maybe. We should land this and add tests for the current behavior first though I think.

grimar edited the summary of this revision. (Show Details)Apr 28 2020, 6:22 AM
jhenderson added inline comments.Apr 30 2020, 12:54 AM
llvm/tools/llvm-readobj/ELFDumper.cpp
4052

This hasn't been addressed. There are still two spaces between + and Sec.sh_size.

4069

I wonder if it now makes sense to make this IsEmpty instead? In other words:

bool IsEmpty = (IsTbss && Phdr.p_type != ELF::PT_TLS) || Sec.sh_size == 0;
if (IsEmpty)
  ...

Or even:

bool IsTbssInNonTLS = IsTbss && Phdr.p_type != ELF::PT_TLS;
if (Sec.sh_size == 0 || IsTbssInNonTLS)
  ...
4078–4080

Not dug into it, but I guess the intent here is to avoid having non SHT_DYNAMIC/SHT_NOTE sections appear in PT_DYNAMIC/PT_NOTE segments, which does make some sense. Perhaps worth writing/expanding a comment to explain this? What do you think?

4086–4090

Perhaps worth breaking this into a few separate booleans, as it gets quite hard to follow the logic.

grimar updated this revision to Diff 261188.Apr 30 2020, 4:36 AM
grimar marked 6 inline comments as done.
  • Addressed review comments.
llvm/tools/llvm-readobj/ELFDumper.cpp
4052

Sorry. Fixed.

4069

OK.

4078–4080

but I guess the intent here is to avoid having non SHT_DYNAMIC/SHT_NOTE sections appear in PT_DYNAMIC/PT_NOTE segment. Perhaps worth writing/expanding a comment to explain this? What do you think?

The code does not check the section type, i.e. if it is "SHT_DYNAMIC" or not. We might have (in a mailformed object) 2 SHT_DYNAMIC sections:
first is empty, second is not and will never see the first one in the output with the current code.

I guess the best comment could be: "This is just how GNU does, we follow". We can return to it later when will add support for "PT_NOTE" here.

jhenderson added inline comments.May 1 2020, 2:07 AM
llvm/tools/llvm-readobj/ELFDumper.cpp
4078–4080

I actually am coming around to GNU's process for this case here. Yes, it doesn't check the section type, but the chances of an empty dynamic section are minimal anyway. Maybe we could special case SHT_DYNAMIC though at a future point. If we did that, I'd want the SHT_DYNAMIC exception for PT_DYNAMIC only and SHT_NOTE for PT_NOTE in a similar manner.

4094

Shouldn't this be an || to match the old behaviour?

grimar updated this revision to Diff 261446.May 1 2020, 2:23 AM
grimar marked 2 inline comments as done.
  • Addressed review comment.
llvm/tools/llvm-readobj/ELFDumper.cpp
4094

Right.

jhenderson accepted this revision.May 1 2020, 2:25 AM

LGTM.

llvm/tools/llvm-readobj/ELFDumper.cpp
4094

I'm nervous that (presumably) no test failed because of this mistake, but I guess that the situation isn't any worse than before.

This revision is now accepted and ready to land.May 1 2020, 2:25 AM
grimar marked an inline comment as done.May 1 2020, 2:27 AM
grimar added inline comments.
llvm/tools/llvm-readobj/ELFDumper.cpp
4094

Me too. And this is what I've mentioned in the description: "Probably it can be
simplified and improved more, but we need to land test cases documenting/testing
all the current functionality first."

grimar closed this revision.May 1 2020, 11:50 AM

Was committed as e4ba3ff35942da35c65315d1a590c2709dc99f07, phab did not close it for some reason.

https://discord.gg/xS7Z362 #infrastructure

Was committed as e4ba3ff35942da35c65315d1a590c2709dc99f07, phab did not close it for some reason.

This was discussed in https://discord.gg/xS7Z362 #infrastructure. I noticed a "disk is full" issue for reviews.llvm.org about 17 hours ago. It seems not everything is restored.

grimar added a comment.May 2 2020, 3:57 AM

https://discord.gg/xS7Z362 #infrastructure

Was committed as e4ba3ff35942da35c65315d1a590c2709dc99f07, phab did not close it for some reason.

This was discussed in https://discord.gg/xS7Z362 #infrastructure. I noticed a "disk is full" issue for reviews.llvm.org about 17 hours ago. It seems not everything is restored.

I also had no mails from mailing lists (llvm-comits, llvm-dev) for about ~2.5 days atm. Only have mails for reviews where I am a subscriber/author/reviewer.
Do you know anything about it? I just wonder if it is our local mail filter issue again or something more global.

https://discord.gg/xS7Z362 #infrastructure

Was committed as e4ba3ff35942da35c65315d1a590c2709dc99f07, phab did not close it for some reason.

This was discussed in https://discord.gg/xS7Z362 #infrastructure. I noticed a "disk is full" issue for reviews.llvm.org about 17 hours ago. It seems not everything is restored.

I also had no mails from mailing lists (llvm-comits, llvm-dev) for about ~2.5 days atm. Only have mails for reviews where I am a subscriber/author/reviewer.
Do you know anything about it? I just wonder if it is our local mail filter issue again or something more global.

This was a server side issue. The server had been down for some time http://lists.llvm.org/pipermail/llvm-dev/2020-May/141343.html
After it came back, I've observed for at least 8 hours some daemon was not running: (1) Differential Revisions were not closed automatically (2) Harbomaster (pre-merge testing) was not working (3) "Recent Activity" did not update. I did not subscribe llvm-commits. llvm-dev is fine (so people could discuss that reviews.llvm.org was down...)

! In D78709#2016336, @MaskRay wrote:
This was a server side issue. The server had been down for some time http://lists.llvm.org/pipermail/llvm-dev/2020-May/141343.html
After it came back, I've observed for at least 8 hours some daemon was not running: (1) Differential Revisions were not closed automatically (2) Harbomaster (pre-merge testing) was not working (3) "Recent Activity" did not update. I did not subscribe llvm-commits. llvm-dev is fine (so people could discuss that reviews.llvm.org was down...)

Thanks. That means my issue is something different, it started ~3 days earlier than that discussion happened.