Added support for segments
Details
Diff Detail
Event Timeline
How about some testing for changing program header properties, e.g. size or address?
llvm/include/llvm/Object/MutableELFObject.h | ||
---|---|---|
212–213 | Any particular reason you've moved this line? | |
244 | Did you mean to remove this? | |
249 | This should be in one of the earlier changes probably, if it can be. | |
331 | What is the point of this and the less-than operator? | |
332 | Just do: return Object == Other.Object && Index == Other.Index; | |
334 | Blank line before this. Also, is it clang-formatted? | |
345 | What's the point of this function? | |
347–353 | I'm not convinced this patch should attempt to add sections to segments. The reason for this is because they're not an intrinsic part of segments, but rather an artefact of the layout of an ELF object. Relatedly, this doesn't handle segments within other segments, so feels incomplete. I'd defer it to a patch that handles laying out an ELF. | |
369 | Did you mean to remove this? Looks like a bad rebase to me. | |
377 | Did you mean to remove this? Looks like a bad rebase to me. | |
415–421 | Did you mean to remove this? Looks like a bad rebase to me. | |
455 | Comment? Also Seg -> Segment | |
llvm/unittests/Object/MutableELFObjectTest.cpp | ||
437 | Seems unrelated? | |
479 | Seems unrelated? | |
498 | Seems unrelated? | |
603 | I'd have at least 2 different segments here and test that both can be read in. |
llvm/include/llvm/Object/MutableELFObject.h | ||
---|---|---|
585 | Can this be inlined into getMutableSegment? | |
586 | Is there a particular reason this implementation requires getIfNew, whereas the symbol and section versions don't? | |
llvm/unittests/Object/MutableELFObjectTest.cpp | ||
660 | Changing the address of a segment is very unlikely to ever actually happen, I'd think. I'd change the p_offset field instead. |
llvm/include/llvm/Object/MutableELFObject.h | ||
---|---|---|
73 | It eventually gets called by an emplace_back | |
239 | I had this in an earlier patch but then moved it out because it wasn't needed until now. I don't have a preference either way which patch it goes in. | |
328 | Hmm am I missing something on how to construct objects? C++ doesn't have the nice C syntax like SegmentRef S = {.Object = nullptr, .Index = 0}; Is there a way to leave out these boiler plate constructors that I don't know of? | |
586 | Yes, although I'm not sure if you'll like it :) Creating a MutableELFSegment is more expensive than the others because it takes a list of all sections that reside in it. So using getIfNew allows to not have to calculate that if it is already been made mutable then it just returns a mutable reference to a MutableELFSegment. I say you might not like that because what I described doesn't happen in this patch but only will in Part 7. |
Posting this here since I've not seen a patch for Part 7 yet. I may have said this elsewhere already, but I don't think it's a good idea to have sections as members of segments. I think you'll find the layout algorithm easier to work with when using a parent relationship, rather than a child one. Note that this is how llvm-objcopy is already written. That should allow you to handle multiple levels of nesting/overlapping etc without too much difficulty. That would also, I believe, make the Segment and Section implementations much closer to each other.
llvm/include/llvm/Object/MutableELFObject.h | ||
---|---|---|
328 | Do you ever need the default value of Index? It seems to me to be unnecessary. |
Added parent segment to both section and segment types and methods to find parent segments
That should allow you to handle multiple levels of nesting/overlapping etc without too much difficulty.
Hmm if a segment had a list of its child segments, then couldn't layout be done recursively on each child segment? Would this not be more straight forward?
I think what I currently have should move more in the direction you are suggesting. I have a concern with my current implementation, its if I should use the original program headers or the updated ones. It's possible to change a segments vaddr, although you have mentioned unlikely. This would be problematic to use sectionInSegment after changing the program headers because then it would be obviously wrong. So right now I use the original program headers. Also something to look out for, I use parent segments as indexes not pointers and those are ssize_t not size_t because the alternative of using one past the last segment like an end iterator was too ugly than using -1 to denote no parent. I could also use size_t's max I suppose.
What happens if a segment is inside another segment? You only want to lay it out if it hasn't been laid out already, so you'd have to find a way to track that. On the other hand, if you only lay out things that have no parent (and then lay out the things with parents based on their parent's layout), you don't have this issue. Keeping track of children is not necessarily bad, as it allows you to lay the children out at the same time as the parent, but it's not as important (you can always do two passes for example).
I think what I currently have should move more in the direction you are suggesting. I have a concern with my current implementation, its if I should use the original program headers or the updated ones. It's possible to change a segments vaddr, although you have mentioned unlikely. This would be problematic to use sectionInSegment after changing the program headers because then it would be obviously wrong. So right now I use the original program headers.
Using the original program headers is probably the right thing to do, as that ensures the original relative layout is preserved.
Also something to look out for, I use parent segments as indexes not pointers and those are ssize_t not size_t because the alternative of using one past the last segment like an end iterator was too ugly than using -1 to denote no parent. I could also use size_t's max I suppose.
See my inline suggestion about using Optional. size_t's max would be my next suggestion.
llvm/include/llvm/Object/MutableELFObject.h | ||
---|---|---|
289–290 | How about using Optional<size_t> instead of -1? I think that would more clearly convey the meaning and would avoid the weirdness of using ssize_t everywhere. | |
294 | Could you not just move SegmentRef's definition earlier? | |
297 | I'd keep this and findSectionsParentSegment() next to each other, and I'd seriously consider just calling them both findParentSegment(). | |
609–610 | Something you may wish to consider is effectively flattening the nesting, so that only top-level segments are parents, i.e. if a segment has a parent, then the parent cannot have a parent itself. This may make the layout algorithm simpler, but could probably be left until you're at that point. Same comment applies to sections. | |
616 | Is this identical to the llvm-objcopy version? If not, could you highlight how it's different, please. | |
640 | Delete the blank line here. You need to be careful. It looks to me like a segment could end up as its own parent here, if I'm not misreading the code. Also, you should consider taking alignment requirements into account, in the event that two segments have the same size and offset. See https://bugs.llvm.org/show_bug.cgi?id=42963. Also, consider the following bug in your sectionWithinSegment code: https://bugs.llvm.org/show_bug.cgi?id=41126. |
llvm/include/llvm/Object/MutableELFObject.h | ||
---|---|---|
289–290 | Thats much better :) | |
609–610 | Something like if (segmentWithinSegment(Child, Phdr) && findParentSegment(Phdr) != None)?
I think I am misunderstanding something, do sections have anything to do whether a segment can have a parent? There is no overlapping sections I thought so I don't think this relates to that. Sorry about the confusion! | |
640 |
Indeed, thanks fixed that.
I have added this it happens on line 941 of the unit test, is that what you mean?
I think I have this: |-Segment-| |-Section-| tested and it properly says that the section is in the segment. There was no change needed for that. |-Segment-| |-Section-| |-Segment-| |---Section---| but I know for sure they will not be reported as being in the segment. |
Could you write some comments in the test code that show what the cases are that you are testing, similar to how you've written it in your inline comments. This will make it easier to understand what is going on.
llvm/include/llvm/Object/MutableELFObject.h | ||
---|---|---|
609–610 | I'm thinking that if the segment layout looks something like the following: | Section1 | Section2 | Section3 | Section4 | | Segment1 | Segment2 | | Segment3 | Everything should have Segment3 as their parent (except Segment3 itself of course). If you don't flatten the tree, then sections 1, 2, and 3 will have parents of Segment 1 or 2 which will make it harder to perform layout, as you effectively have to chase the parents pointer to find the top-level element to determine where to move them.
Yes, I think you misunderstood. I was just referring to the fact that the parent of a section should be the top-most parent in the tree. | |
615 | except it uses | |
616 | and not | |
640 |
I think you've missed an issue - what happens if two segments have the same offset, size and alignment? In that case, neither will be treated as a parent of the other, and the algorithm falls apart.
I think there's a ShOffset field available to set the claimed offset of a section (the actual data will still be written where it would be). The alternative to that is to effectively force the section's offset by setting the AddressAlign and Address fields, so that yaml2obj has to place it at a certain position. You'll see quite a few examples of this within tests. Once you have forced a section to a specific position, you can then just set the ProgramHeader fields accordingly. | |
641 | I'd rename Parent to PotentialParent and Child to Segment, because the Parent isn't necessarily a Parent yet (and therefore the Child isn't necessarily a Child of that segment). |
llvm/include/llvm/Object/MutableELFObject.h | ||
---|---|---|
640 |
Ok, now it is finally fixed. It isn't very pretty because I need to pass the indexes of each segment but I just choose whichever segment is first to be the parent. |
I'm out of time for today on this review. I'll come back to it tomorrow.
llvm/include/llvm/Object/MutableELFObject.h | ||
---|---|---|
587–588 | A comment briefly explaining this would be helpful. | |
593–596 | Some code comments explaining what is going on here would be useful. | |
605–606 | An explanation with some comments as to why you can have this early out would make this clearer. | |
645 | ChildOffset -> SegmentIndex Write a comment at the start of this function outlining the behaviour, please. | |
llvm/unittests/Object/MutableELFObjectTest.cpp | ||
752–753 | doesnt -> doesn't But really, I'm a little confused what this comment is trying to say. I think you want to say something about offsets, because it's confusing what you're talking about in this comment. It appears you're saying 4 doesn't come before 5, but it's earlier in the program header list, which seems to contradict what you're saying. |
The ctor is not really necessary.