- User Since
- May 16 2017, 11:34 AM (65 w, 5 d)
Tue, Aug 14
Mon, Aug 13
Thu, Aug 9
I don't think I want this option to do anything more than literally add a prefix to all section names. It isn't clear that those precise semantics are being relied on anywhere and the behavior of making relocation names and not renaming symtab etc... is a result of how bfd reconstructs the output binary without consideration for what the user initially put in which is opposite the stated goal of llvm-objcopy.
Sorry, I'm not ready to land this yet.
Awesome! I'd like to have someone else look over this but this pretty well LGTM.
Wed, Aug 8
Nice. We need the std::string change for names for several other changes. LGTM.
oh god I apologize. I was wondering why this wasn't already submitted.
Mon, Aug 6
So I think I want to remove -split-dwo as an option all together so that HandleArgs can just not worry about all of this.
Ah so I think I'm against BinaryReader being templated against ELFT. That is required to implement getElfType() however. getElfType drives which writer we create. That shouldn't be how things work however. I'll go and add my related comments to this to the previous patch. Sorry if I didn't see what was going on in the other patch and lead you astray. This patch makes things a bit more clear.
Fri, Aug 3
Thu, Aug 2
Make sure the commit message reflects this change but other than that LGTM
Speaking of that conflict I mentioned, it won't happen! So we're good to go. This change more or less LGTM. Please resolve Alex's comments however.
This LGTM other than one issue
Wed, Aug 1
Due to the soon encroaching change where -I binary will be implemented I'm very much in favor of the OriginalData field being added to SectionBase. Really seems to handle everything very cleanly IMO.
I think I'd like to make a recommendation that we implement compression first and then decompression next.
I forsee a slight conflict between this and another change but it isn't clear how that will work out and I'm not sure what the best solution is. This change overall LGTM so maybe we'll just let things shake out and deal with that conflict as we have more concrete plans.
Compressed sections are a general kind of section. You can compress
literally any kind of section. All relocations are in terms of the
decompressed section. We can't use ELFSectionWriter on say symbol tables
because the actual data there isn't finalized until finalization. We don't
have to concern ourselves with how compression and modification interact
for debug sections because we don't modify those.
Tue, Jul 31
Yeah I think (to within the margin of the details we've worked out) we seem to roughly agree @alexshap
I'm going to outline a proposal for how I think this stuff should work. It turns out I support creating a new section type after all (needed to support a few things like constructing the section name from a Twine and moving the compressed data into the Section's ownership in a clean way). I think this is a faithful version of what James (James please say so if this is not what you had in mind) and I have in mind for how to handle this (well I had to change my stance slightly). If the name ownership problem is solved and we figure out a nice way to plumb the 32-bit vs 64-bit information though that would be even better!
So here's an idea. It is ELF specific but it extends the SectionBase interface in a way I've considered for a while and would make both this and another change quite easy. If we add an OriginalData field (ArrayRef<uint8_t>) to SectionBase we can dump that inside of HandleArgs.
Can we add test? Otherwise this LGTM.
Mon, Jul 30
Can you file a bug against me on the segfault? That sounds like a serious issue that needs to be addressed.
Thanks! I've been meaning to implement this forever. For instance this is the only way to dump non-allocated sections. I've run into cases debugging things where that was useful and had to use objcopy.
Fri, Jul 27
Forgot to accept
LGTM, I had the same confusion James did. I have independently confirmed this is the correct behavior. As much as I dislike this behavior I think we need to match what GNU objcopy does here because this might result in observable differences in behavior (like a segfault if this is run on a ,o file and then linked)
Thu, Jul 26
Also I wanted to say I'm pro this. This should be achieved regardless of which idea we settle on.
Yeah I really really think this should be handled by constructing new OwnedDataSections. The compression code should also be templated over an ELFT as well to enable using ELFT::Chdr.
Wed, Jul 25
Also also, why do you need this? I ask because from talking with some people about what this is used for there is almost certainly a better way to accomplish any gain you get out of it. The main reason I can still see to do this is to achieve command line compatibility; if that is the goal we can probably noop or just noop with a section rename.
Also I think you can build llvm with or without zlib. I think that means you're going to need to #ifdef a lot of this code.
Mon, Jul 23
Jul 20 2018
I want to keep the public interface of Object as small as possible but other than that LGTM
Jul 19 2018
The reason for it was just that this all seemed too complicated to me for "something so simple" but now that I've looked into the problem I think what you've done makes perfect sense.
Jul 18 2018
What would the consequence be of just converting the bytes to a string and writing that out? I have neither a windows box nor python 3 installed.
Jul 17 2018
Uh...I mean.."unobjectionable"...don't judge me.
Seems pretty non-objectionable to me. LGTM
Jul 16 2018
Nope, I was just about to land it. I wasn't in office Friday and I was conducting an interview this morning. It will be in shortly.
Jul 13 2018
Jul 12 2018
Updated shndx name for this test.
Forgot to update error message for one of James' comments
Now uses python to uncompress.
Responded to comments.
Jul 11 2018
Jun 28 2018
LGTM, sorry for the back and forth!
Jun 27 2018
Jun 26 2018
So in considering the case you pointed out about -dyn-relocations I discovered that the current behavior for -relocations is to not print out dynamic relocations at all. This is hidden by the fact that dynamic relocations tend to be duplicated with non-dynamic relocations by default. So we should only print these out on -dyn-relocations unless they're duplicated. As part of my proposal to a solution to this I'm proposing an additional flag so that the raw form that Roland requested can be printed out. This way debugging of the underlying format is possible but for most purposes the decoded form can be used.
I really feel that non-decoded entries shouldn't have an addend or faked relocation type. It's really confusing to look at those and see relocations. I think non-decoded entries should just print the relr entry.
Jun 22 2018
I think I'm more or less happy with the llvm-readobj part of this.
Jun 11 2018
I'll have more comments tomorrow. I'm still grokking my way though llvm-readobj as I've never looked at that code before.
Jun 7 2018
Ah yes. Thanks!
Jun 6 2018
Hey I'm away but I got a bug report requesting -x as an alias in
llvm-strip. I haven't looked at the code for this patch so it might already
be solved here but I'd appreciate it being added. No worries either way.
I'll just add it when aight get back of it isn't there.
May 30 2018
Quick drop in. What I was thinking about before was to just never pass the
null symbol to remove or anything else. It can still be there it just
shouldn't be allowed to be updated or removed.
May 25 2018
This is causing breaks in fuchsia,
May 24 2018
nit: Add comment above loop but otherwise LGTM. Feel free to submit after that.
As always the right solution winds up 10000 times simpler than anything I considered in the interim. This looks good to me after fixing Alex's and my comments.
Landed in rL333136
May 23 2018
Switched to use index. Fixed to print out full buffer in case no newline exists.
May 22 2018
Modified to group multiple lines into a single flush but still flush at least once if a newline is found.
Yeah I agree with Alex and James here. I think I'd prefer some sort of interface extension. Something as simple as 'markSymbols' that's done as a loop though the sections which then just sets a Boolean like you had originally. That should a) give the desired time complexity and b) not be super complicated. That loop can also only be triggered if --strip-uneeded is used which is nice.
Just a heads up I'm not convinced my shared_ptr idea is ok. I want to
consider what Alex and James are saying. My hope has always been to keep
the symbol table as the owner so if Alex has a proposal I want to read it
May 21 2018
I didn't think about that. I think the key thing here then is to use shared_ptr to store all symbols and then use use_count to keep track of the references.
LGTM but please wait on james
LGTM with a couple additions.
Yeah I haven't heard back from Lexan nor have I tried reproducing such a build myself and if we're that close to not relaying on this sort of hack I'd much rather wait on this than risk breaking someone for the sake of a stopgap for such a short period of time. @echristo Can you link the review to that?