- User Since
- Jan 18 2017, 2:49 AM (82 w, 2 d)
I've just noticed that your .debug_str section appears to have SHF_MERGE and SHF_STRINGS set in the input, but the output does not have these flags set. Is this correct? Does GNU objcopy strip the flags? I doubt that it should...
This LGTM. But maybe worth giving it a day or two to see if anybody responds to the email thread. Or you could commit it as is, and then fix things if people complain in the thread, I don't mind.
If I get a chance, I'll verify this on Windows. Otherwise, this LGTM.
Wed, Aug 15
LGTM, with nit.
Hi @vleschuk, I just wanted to quickly check-in on this to see if you need any help with making further improvements as suggested?
Oops, hang on, didn't see your change as I wrote my review...
LGTM, with nit. I guess if somebody wants decimal support later, they can add it then.
Is it normal for yaml2obj to only parse hex values and not decimal for other fields?
Could I make one request going forwards. It's helpful if you post a quick summary of the changes you made each time you update the diff (a bullet-point list would be fine). Due to the scale of the change, I'd prefer not to re-review everything each time I look, and sometimes it's a little hard to remember whether a bit has changed.
I can't remember whether we deliberately decided to deviate from GNU on the --globalize behaviour. To me, our behaviour makes sense: if a user deliberately requests to globalize a weak symbol, why shouldn't we? I could even see a use-case for it: a member, in an archive, with a weak symbol definition, won't normally be used, unless other symbols from the archive are needed, so --globalize could force its inclusion. I'd go back and dig out the change that introduced the behaviour and check if it was discussed previously, and if not, file a bug or send an email on the mailing list to discuss it further.
LGTM, but there is one thing still outstanding, I think, which @jakehehrlich mentioned in a comment regarding the CopyConfig options:
Tue, Aug 14
Please make sure to rebase this patch when you get a chance, because at the moment, it's a little hard to apply the patch on trunk.
No null symbol is a blocker, so I've marked this as requesting changes. Sorry!
I think I wasn't explicit enough with my wording to explain what I was thinking: what happens if you specify --globalize-symbol a --keep-global-symbol b? It looks like the code will turn a into STB_LOCAL, and b will remain as it currently is (similar points about --localize|weaken-symbol also apply). Does this match GNU's behaviour? I'd expect a to be STB_GLOBAL and b to remain unchanged.
Mon, Aug 13
What happens if you specify a mixture of --globalize-symbol and --keep-global-symbol?
Fri, Aug 10
I've run out of time of review the rest of this change today, but I will get back to it next week. Please could you try to fix the issues I've highlighted in the meantime.
I can't think of any semantic reason why names should matter for the cases discussed, with the exception of a tiny number of sections (usually to do with debug information or similar such things), names of sections shouldn't matter (and indeed, this has been discussed on the ELF gABI mailing list as a stated goal of ELF). This option WILL break DWARF sections though, which rely on the section naming. I'm not sure if this is something we should care about though?
Thu, Aug 9
It turns out that this is a wider bug in llvm-objcopy that is related to, but not specific to this issue: if you do --rename-section in GNU objcopy, you also implicitly rename relocation sections. Meanwhile, if you do --rename-section on relocation sections or strtab or symtab sections, nothing happens (if less concerned about this, because you requested something specific for those sections, but still, it is a behaviour difference).
I've got some comments that I'll send through a bit later today, but as things stand, this doesn't match GNU objcopy's behaviour fro relocation sections at the very least - the "prefix" is placed between the end of ".rela" and the start of the section name so that they have the name ".rela<prefixed relocated section name>"
Wed, Aug 8
Tue, Aug 7
Mon, Aug 6
LGTM with one nit.
LGTM, with the suggested changes, but please get @jakehehrlich to confirm he's happy with the test changes.
Fri, Aug 3
I just noticed that llvmObject has an object_error equivalent to std::errc, with a make_error_code method allowing us to turn them into std::error_code for use with the new function. In particular, there are unexpected_eof and parser_failed errors, which might be more appropriate for certain situations in this file. What do you think?
Would you mind when you mark things as done, to upload the latest diff, as it makes it a little confusing for me seeing something marked as "done" when it doesn't appear to have been yet (might be best to then hold off marking off things as done until you are ready to upload).
Thu, Aug 2
Okay, go for it. I have nerves around lifetime issues with references like that, but I think the structure means that the Reader (and therefore the file it reads) will be around long enough for it to not matter.
Wed, Aug 1
I think @jakehehrlich's proposal more or less matches my thoughts. But I'm wondering why we need the OriginalData field? Do we have anything else that can modify the contents of a debug section? If not, why do we need a different field here? I may be missing something obviously...
I think this is a reasonable approach for now. I can't think of any situation where converting between TLS and non-TLS (and vice versa) makes sense, since the relocations needed are different.
Tue, Jul 31
Related aside to my comments regarding version mismatches: is it likely that future versions of DWARF (e.g. DWARF 6) will update the version number of the .debug_addr tables without changing anything in them? I know in previous versions of DWARF, some sections didn't increase in version number in line with the overall standard version. If not, then the version mismatch check may not make sense anyway.
Okay, I think I'm happy with this, but please let the others give their feedback and thumbs up too.
I'm not seeing anything that checks that the requested compression style is a supported style. I'd expect an error to be generated if a nonsensical version is specified.
The code looks correct to me (barring a couple of nits) for what you are trying to do, but I do have a couple of outstanding concerns, that I'd like to discuss further:
Mon, Jul 30
Marking as "Request Changes" to clear the LGTM until my above points about section flags are addressed.
My main concern is still the interaction between these switches and existing flags. Specifically, I'm worried about what will happen if this is used on a section with lots of flags that cannot be controlled via this switch. A quick look at the spec shows that this will clear the following flags unconditionally: SHF_INFO_LINK, SHF_GROUP, SHF_TLS, SHF_COMPRESSED, and any OS/processor-specific flags. I haven't checked what GNU objcopy does, but in basically all of those, changing the flags could be bad, and have a negative impact on the ability to link the resultant object. Could you investigate what GNU does in each of these cases, and write tests to show the behaviour, please.
Fri, Jul 27
What is the behaviour of --rename-section when it touches a section with existing flags that don't conflict with the specified value? For example alloc when the section has SHF_EXECINSTR already? Does it preserve those flags? Can multiple modes be specified for the same section (e.g. code + alloc)? I don't think the tests really show these.
Thu, Jul 26
As an additional point, it would be good to minimise the number of distinct areas where we have to add these #ifdefs. I'd recommend trying to hide as much as possible of this behind an interface of some kind, which turns into a no-op or warning/error if --compress-debug-sections is used without zlib support.
Wed, Jul 25
Tue, Jul 24
Not specific to your change, but I noticed that a lot of the DWARFContext code uses 32-bit offsets. Would it not make sense where we are writing new stuff, to use 64-bit offsets, since DWARF does support this?
Mon, Jul 23
Okay. Please make sure in the commit comment to explain that this is a copy of the llvm-readobj code, and that a subsequent refactor is still to come.
Thu, Jul 19
Jul 18 2018
Hmm... I don't like using the dynamic symbol table to lookup the dynamic string table. There is a reason for a specific DT_STRTAB tag. We should use that. Also, this latest version won't work if the section headers have been stripped, but we should be able to dump the dynamic table in this instance, by looking at the DT_* tags in the PT_DYNAMIC segment (if present). Try adding a test for this case (i.e. section headers stripped), and you'll see an error, I suspect... Also, please make sure you have a test that has no section headers or PT_DYNAMIC segment.
Good catch! I agree with a simple swap from one to the other at some point, but actually, maybe we should just abandon this change, and use array_lengthof, until we are ready to use C++17 directly? It would make the find/replace much easier: I took a quick look, and it looks like array_lengthof is solely used for this particular function, and not some other invocation. But "size" on the other hand will be much harder to isolate.
Jul 17 2018
LGTM, with a comment change in the test.
LGTM. But I'd like somebody else to chip in on this as well before you commit. If nobody else does in the next couple of days, go ahead and commit.
LGTM, with a couple of minor improvements to the test.
Apologies, I messed up that last part - it should probably say "(perhaps referencing the implementation you are following)"
You might want to canvas opinions from others who have developed in this area recently, such as @vsk, before putting this in, but aside from the missing comment and couple of nits, this looks good.
Jul 16 2018
I think this would be a reasonable way forward.
LGTM, with the caveat that I don't know for certain if that was the original issue, without looking at the failing job and what was going on.
Jul 13 2018
Take a look at the documentation for it here:
LGTM, with the nits, assuming the tests aren't realistic without spending hours at it.
Simple functions are also on my wish-list! The main ones we care about are converting to and from hex numbers, and being able to do additions and subtractions. But I think just the basic integer and hex reading and additions would be a good start.
Jul 12 2018
I might be going overboard with the comments around "auto", and I didn't highlight all of the ones I spotted. But here's the link to the corresponding style point: https://llvm.org/docs/CodingStandards.html#use-auto-type-deduction-to-make-code-more-readable. Feel free to ignore what I said if you think they're valid uses.
Jul 11 2018
(moving this comment out of line, because the comments got completely out of sync with the code):
Regarding the VirtBase calculation, yeah, you're right, there was a mistake on my part - I made an incorrect assumption regarding program layout.
Jul 10 2018
LGTM, with a few minor points.
First up, I want to say thank you for doing this! This is one of the major missing features from FileCheck over the test harness I am used to using, and leads to fragile testing where people have hard-coded in values in tests that are subject to change.