User Details
- User Since
- Aug 17 2017, 6:34 AM (178 w, 4 d)
Fri, Jan 15
rebased. changed enumerator to the container.
Tue, Jan 12
LGTM with small comment inside.
Mon, Jan 11
ping
ping
Fri, Jan 8
If I understood correctly, It looks like this change is not enough to work properly for DWARF5.
Currently, relocation is checked using offsets to the attribute:
Sun, Jan 3
Wed, Dec 23
@avl - it might be this and the related patch/usage in llvm-objcopy deserve an llvm-dev design thread about the overall issues you're trying to solve and the design directions you've explored and the ones you are proposing.
Tue, Dec 22
Thank you! Will address comments.
Dec 18 2020
Aren't many MemoryBuffers backed by memory mapped files, which aren't necessarily causing real memory usage? Could that approach be used here?
Dec 17 2020
Dec 16 2020
Colleagues, so what is your opinion on this refactoring? Is it OK to use streams as output data for llvm-objcopy?
Dec 14 2020
ping.
Dec 11 2020
Dec 10 2020
Dec 1 2020
Honestly, I don't have the time to look at this process in detail, and refactoring things to support an objcopy library is not high on my priority list. I'm not convinced that your suggestions are actually going to be workable/useful in practice, if I'm honest, and have tried to outline my concerns already.
@jhenderson James, what do you think of using streams as suggested by D91028 and D91693? Would it be useful? It seems it would reduce amount of copied data.
Nov 27 2020
It's not immediately clear to me which data you're concerned about loading into memory, but it's worth remembering that objcopy will need to have all the data (or at least its size) in its internal representation before it can write most bits, at least for ELF objcopy, because the section header offset field is part of the ELF header, but the section header table itself is at the end.
Nov 25 2020
@jhenderson Summary of proposed approach:
Nov 23 2020
The change for DWARFLinker is fine. I also think it is better to split the patch up into separate patches.
update for Note 3. It was considered to be not useful to implement reserve() method as fixed-size memory-mapped file(D91693). Instead it was suggested to implement resizable raw_mmap_ostream.
Thus, if we need to have memory mapped file for objcopy then we have this alternative - implement resizable raw_mmap_ostream.
Nov 20 2020
Yeah, I suppose that's reasonable. Though, if the scope of this is small (e.g.: it only needs to write to files _or_ memory buffers, and it's not going go have a lot of callers/large surface area), I would not completely dismiss some custom solution either...
Not really. :( For resize, I would expect that it has a physical effect on the underlying object (changing file size), which is good, I guess. But I would still expect that it is possible to write more data than that size, and have it be appended, as that's what our other streams do. I would also expect this operation to also have some effect on raw_string_ostream (changing the underlying string size) and such. And I am still worried about using mmap(2)/write(2) interchangeably. The two apis have different characteristics, and I don't think that can be conveyed by a single method call. E.g., the mmap approach will not respect O_APPEND, it will cause SIGBUS if the file is concurrently truncated (that's why our mmap APIs like MemoryBuffer have IsVolatile arguments), etc.
Nov 19 2020
I second the separate class idea. It seems like it could be much cleaner. The normal treatment of reserve-like methods is that of a hint -- one that the implementation could ignore or adjust in some circumstances -- I wouldn't expect that calling this method would completely change the way in which a file is accessed, nor that writing "beyond" the reserved storage would result in an assertion...
addressed minor comments.
Nov 18 2020
For raw_fd_ostream, I wonder if we should actually just have a separate class e.g. raw_mmap_ostream, rather than trying to make raw_fd_ostream represent two possible different things. For other streams, I'm not sure the reserving behaviour is useful - strings and vectors could be reserved from outside the wrapping stream, for example.
Nov 16 2020
addressed comments.
ping.
ping.
Nov 13 2020
Well, this is a bit different from my original idea but is an overall good heuristic for many of the debug sections. It works for .debug_info, which is one of the biggest sections; it does not work for .debug_line, though, which is not that big as .debug_info, but potentially might become problematic in the (distant) future; it also does not work for .debug_abbrev, .debug_addr, .debug_ranges, and some others, which are usually not that big. However, the patch should be extended to support .debug_str, which can be even larger than .debug_info.
Nov 8 2020
Nov 7 2020
LGTM. with small inline comments.
Nov 6 2020
Nov 2 2020
Using a stream approach sounds reasonable. I don't really know what the benefits are of using a memory mapped file versus other options (I vaguely recall from some older work that they improve performance, but am not sure if that is still the case or not). The one concern I'd have with a stream for writing output is if we ever need to jump back and forth within the object for some reason. Without looking at the existing objcopy code, I don't know if there are any instances where this happens though.
Nov 1 2020
Folks, Before creating a patch I would like to consult what would be the best option to refactor the Buffer class.
One of the alternatives is described here https://reviews.llvm.org/D88827#2321871 ("Adapter" approach).
addressed comments(changed text of commentary).
Oct 23 2020
changed patch so that all temporarily variables created first,
and then copy these variables into incoming parameters.
Oct 22 2020
To summarize comments - I am going to:
corrected byvalue operand`s alignment calculation.
Oct 21 2020
addressed comments: renamed files, added doxygen comments into library headers(except CopyConfig.h).
Oct 20 2020
Oct 14 2020
rebased, added command-line option --gc-debuginfo-no-odr.
Oct 12 2020
Implemented that algorithm: First, create a temporary alloca
corresponding to each byval argument. Then, copy each byval
argument to the call to its temporary alloca. Finally, copy
each byval argument to the call from the temporaries to the
memory originally allocated for the function's arguments.
Oct 9 2020
Oct 8 2020
use incoming function arguments area for storing byvalue operand of the recursive call.
1, 2 - perhaps, this can be refactored / reorganized / better documented, but it would be great to have it done before the introduction of the library.
The functions in the library header need doxygen style comments to explain the interface. Possibly these could be added in a follow-up patch.
Oct 7 2020
addressed comment. implicit copy replaced with explicit one.
Oct 6 2020
apply clang-tidy comments.
This refactoring tries to reuse this part of the functionality:
addressed comments.
Oct 5 2020
Oct 2 2020
Oct 1 2020
I've noticed that the code tries to address 2 a bit different situations currently:
Cases that fix the unwrapOrError: like unwrapOrError(HeadersFile.program_headers()) which seems are trying to fix places related to >broken inputs which were never tested before I think. Cases that are tested and report errors, like: error("program header with offset 0x" + Twine::utohexstr(Phdr.p_offset) + ..... I think for start we should probably focus on (B) and probably can keep unwrapOrError untouched until we convert them and add tests.
fixed a small typo.
addressed comments.
Sep 30 2020
and the fact that there'd need to be some way of bailing out once an error is hit (the handler could be noreturn or something like that, I suppose).
So, what about current solution in this patch to report fatal errors through Expected<>/Error in llvm-objcopy? Is it OK to do things that way? My understanding is that currently only fatal errors are reported and then no need to create handlers at the moment.
Sep 29 2020
I was actually referring more to the style that this patch series initially looked at, which is based on something similar to what I did in the DebugLine parser, which in turn inspired my lightning talk on error handling in libraries three(?) years ago. In other words, the client would provide one or more callbacks for different degrees of severity of problems, which llvm-objcopy's code could then call when an error occurred.
Sep 28 2020
addressed comments.
Sep 27 2020
addressed comments(reworked to use Expected and Error as returning values).
Sep 25 2020
addressed comments(removed variable renaming, removed std::move() usage, used std::remove_if() implementation.