- User Since
- May 16 2017, 11:34 AM (118 w, 8 h)
Fri, Aug 2
This looks fine to me
LGTM other than a couple atomicity issues.
This looks awesome!
I think everything but the implementation of genIndex being confusing looks good to me. I haven't actually groked how that code works yet other than the fact that it generates the index tree that I expect and all the surrounding code looks good to me. I understand that some of the trickiness here comes from the fact that you're building from a list of values but trying to generate the tree structure from that list which is hard. I think we can structure that code better; lets see if we can't device a better algorithm.
The code looks fine to me but I don't understand the global details of this. Hopefully Julie can still review things from there but if not we don't get timely reviews then we can still land this.
What's the use case for this and how do we handle return codes when an error occurs but we continue on?
Tue, Jul 23
Sorry there's a lot for me to catch up on it would seem. Would it be possible to get an of how having a mutable subclass of ELFObjectFile works us towards our goal of having a library?
Jul 18 2019
I'm blindly accepting the CSS but all other code looks fine. I still have that chrome bug where I can't LGTM sometimes but consider this reviewed and accepted by me as well
Jul 16 2019
Ok so I thought about endianness more and I think that's not an issue because computations are generally done in native endianness. I do however think you should definitely have either the 64 bit version be the default or have both a 32 and 64 bit version. This type is most useful in cases where tools like lld, llvm-objcopy, yaml2obj, etc... would be using an internal structure or temporary variable and generally you want to handle both 32-bit and 64-bit operations in these cases. The solution is always to default to the highest bit size which is 64-bits because it will handle the 32-bit case just fine. Unless there's a compelling reason to not use 64-bits I think you should just make it your default. Certainly before lld or llvm-objcopy can use this we need to have a 64-bit version.
Jul 12 2019
I think teaching more raw_ostreams about seeking is probably right. I started to propose a solution but it boiled down to make our own abstractions over existing ostreams and what amounts to raw_string_ostream. I think raw_string_ostream and raw_os_ostream should both support seeking
Jul 8 2019
(sorry I just realized this was already landed, I was out when it was bumped).
Sorry looking now.
The code looks good to me. I'll let Julie give the final architectural approval.
As in I'm not quite ready for this to land but It looks preety solid. I think if you respond to those comments I'll take one last look and then give an actual accept
This looks good to me!
I can't verify the windows parts today. If you can wait until tomorrow I can run it on a windows machine to test it.
Jul 2 2019
LGTM (sorry a chrome bug is preventing me from using drop down menus...sigh)
Overall I think this looks good. Others have been following the details far more closely that I've been so I'll defer to them.
Speaking with Roland on this it seems to be that the established protocol for how you match up vaddrs is to use the program headers. Roland invoked arcane scary references about prelinking and other sharp corners about other possible methods but I don't know or understand the details. The guidance I'm getting is that we need to keep the program headers virtual address space correct. He has stated the program headers are used by many existing tools and that the consumers are more varied than just the two major debuggers. If you need more than a "here be dragons" from Roland I'll probe for more information but I'm content enough with just knowing that the existing protocol requires using program headers.
My feeling is that segments are not important. The offsets of those NOBITS sections are not important. If segments are laid out weirdly after objcopy --only-keep-debug, I think it doesn't matter if llvm-objcopy also fails to layout the segments in a sensible way. The important thing here is that debuggers can correctly load debug info from such --only-keep-debug processed files. So I asked you test the workflow:
I think this change as is should not go in. I think I'd like to explain some details that I've learned about this flag and how it functions in practice since I've had to consume binaries produced by this and it's equivalent in eu-strip.
Jul 1 2019
How important is using ErrSaver. Isn't it possible to just use std::string in these error cases?
This is *awesome*. This is much more clear and precise that the GNU objcopy documentation!
Jun 28 2019
Can you upload git diff -U100000 <commit before genHTML> functions so that I can see the code without it being fragmented? It's hard to parse some of the more fragmented stuff here
This all LGTM now. Please wait long enough for others to get their say but overall I'm happy now.
I'm getting confused in all the different patches. I reviewed the HTML generator one like yesterday and was awaiting changes. Doesn't this do the same thing?
Jun 27 2019
Overall looks better. One of my comments causes a sweeping change to occur so I'll respond back after thats done.
Jun 26 2019
How invasive would it be to change this to not use an ostream but to instead use an in memory buffer? It would be useful to know where the use cases come from. Do we intend to use this inside of MC? Do we intend to output raw files? When outputting to MC is the space preallocated? Is there a reason that using an ostream would let us avoid a copy? If we can't avoid a copy or theres not an easy way to know the total size in advance then I'm pro using an std::vector<uint8_t> as the underlying storage mechanism instead of an ostream.
Jun 25 2019
Sorry about the lull here. This LGTM
Jun 24 2019
I got to the 'genHTML' functions and I decided that this general approach is not great.
Jun 20 2019
General idea looks good to me!
Jun 17 2019
Jun 14 2019
A bunch of nits but this looks good to me.
Jun 13 2019
Jun 12 2019
Sorry I wasn't very quick to get to this. This looks really good to me. I'm going to go over some of the this with a fine tooth comb tomorrow but It basically looks good as at high level glance.
Jun 10 2019
Addressed Armando's comments
Jun 6 2019
Actually if we can make Description a hash set that would be best. I think using std::set would require an awkward < operator to be defined. There should be standard ways of getting hashes out of all the string and vector types and you can use hash_combine to combine all of these into a single hash.
Jun 5 2019
This is just where I got tired today but I think I can recommend how to split this up so I could move faster and provide more useful high level review. Prior to splitting I'll keep chugging away for at least a bit each day.
"goes past the end of the file" is the phrasing that I've seen GNU tools use and gets more to the point so I agree with James. Other than that this looks good to me!
Jun 4 2019
I'm happy with this as is but I'd like @jhenderson, @rupprecht, and @alexshap to take a look since this represents a fairly major feature addition even if the code is pretty small. I don't believe this imposes any extra assumptions or requirements on the rest of the code however.
Neat! I didn't even know this was possible TBH
May 31 2019
Might label as NFC.
This looks good to me but please have someone else accept first as well.
Ok I spoke offline about some of my questions with Peter. Here are some useful facts:
May 30 2019
Alternatively we could only sort program headers in the layout algorithm and then just layout sections in whatever order we please. It will just mean that our layout won't mirror the input but it will be valid.
Alternatively we can sort by Index to allow ourselves to make mistakes in maintaining the order along the way but I think I'd rather not do that since that's just a hack.
Can we just sort in the layout code instead of sorting everything? Then we don't have to sort at all. I think we've largely been good about making operations stable so order should be maintained if just don't sort.
May 29 2019
A few questions
If it isn't too much work I'd prefer these be split up as well. Otherwise this looks like a strict improvement to me!
May 23 2019
I think this is good. The current behavior treats sectionWithinSegment as not mattering in this case because this predicate is used for getting file offsets right and ignores the SHT_NOBITS case. This is a strict improvement that just sets the predicate in such a way that llvm-objcopy will output the conventional offset now.
May 22 2019
Solid improvement. LGTM.
May 21 2019
Thanks for fixing all of my mistakes!
May 20 2019
Awesome. LGTM. Tell me if you want me or someone else to land it. You can also email Chris Latner to get access yourself.
May 16 2019
I'm still confused. If the behavior intended only has to do with the existence of relocations why are we making this change? A clear difference of behavior needs to be pointed out and that hasn't been made clear.
I have the same question as James with respect to requiring PT_DYNAMIC but otherwise the code looks good to me. I'm still looking at the tests.
Also what does GNU objdump do in this case?
When uploading changes make sure to upload the whole context. If using git diff use git diff -U100000 or some other large number to get all of the surrounding context.
link to the bug for the lazy: https://bugs.llvm.org/show_bug.cgi?id=41862
May 13 2019
I'm still a bit unclear. Does it strip symbols that are duplicates of symbols in .dynsym? Does it just get rid of the the whole symbol table no matter what?
May 10 2019
This shouldn't emit the .tbe format at all, the .tbe format is meant to be in near 1-1 correspondence with .so files, not with .o files. It has things like DT_NEEDED, and DT_SONAME related information that don't make sense for .o files for instance.
May 9 2019
The following is the first split. Subsequent splits will be much smaller (adding support for program headers, adding support for soname and needed, adding support for symbols, etc...): https://reviews.llvm.org/D61767#change-LCioCf1j7osU
I'm confused as to what this is trying to accomplish. This really just marks every symbol as unneeded in a DSO and keeps the behavior for other cases. In that case why not strip the symbol table all togethor? This is a very round about way to accomplish that.
May 6 2019
Can you explain how this occurred a bit more and how this is solving the issue?
May 3 2019
- No more TLS stuff but STT_TLS symbol types are preserved which appears to be sufficient for all known use cases
- Fixed all the tests, accidental extra tests, minor issues, and made all tests pass.
@MuskRay Have you checked other linkers or tools?
Ok so I went digging on the PT_TLS and tls section issues. Seems like removing both the segment and the section is the right way to go.
May 2 2019
Ok so I'm good with this. We would like to land this ASAP.
Jake, I am still not sure what you would prefer for moving forward. The current patch can output the yaml format I originally proposed or the one that looks similar to the one you proposed. What I need to know is:
Do you want the merging if the "ifo" files to happen inside of llvm-elfabi?
Do you care if we upstream the yaml we proposed as a first step (which is really a distilled version of that yaml2obj can consume anyways. this right now functions actually) ???
Or, would you rather the ifo files be merged by a different separate tool (something maybe called llvm-ifsogen)???
May 1 2019
Ok we're getting closer
I can't upload a new patch right now but I plan on making all recommended changes, even the ones I didn't reply to.
This code looks good to me and the use case seems valid. I'll give my LGTM but please wait for others.
Good idea! The code looks good as well as testing.