This is an archive of the discontinued LLVM Phabricator instance.

[Support] Add reserve() method to the raw_ostream.
ClosedPublic

Authored by avl on Nov 18 2020, 2:41 AM.

Details

Summary

If the resulting size of the output stream is already known,
then in some cases the space for the stream data can be
allocated up front. For example, raw_string_ostream could
preallocate the space for the target string, which avoids
reallocations while writing into the stream.
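
A minimal usage sketch of what the summary describes, assuming the method proposed here (reserve(), later renamed reserveExtraSpace() during this review) is available on raw_string_ostream; this is illustrative only, not the committed diff:

#include "llvm/Support/raw_ostream.h"
#include <string>
using namespace llvm;

void writeKnownSizeOutput(size_t KnownSize) {
  std::string Storage;
  raw_string_ostream OS(Storage);
  OS.reserve(KnownSize); // preallocate Storage once
  for (size_t I = 0; I < KnownSize; ++I)
    OS << 'x';           // many small writes, no reallocation of Storage
}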

Diff Detail

Event Timeline

avl created this revision.Nov 18 2020, 2:41 AM
avl requested review of this revision.Nov 18 2020, 2:41 AM

For raw_fd_ostream, I wonder if we should actually just have a separate class e.g. raw_mmap_ostream, rather than trying to make raw_fd_ostream represent two possible different things. For other streams, I'm not sure the reserving behaviour is useful - strings and vectors could be reserved from outside the wrapping stream, for example.

llvm/include/llvm/Support/raw_ostream.h
138
139–140

What's the reason for setting the stream to be unbuffered on reserving space? It seems like a potentially unexpected side effect that isn't strictly needed.

144
575–578

Same comments as above. I'm not convinced duplicating the comment is worthwhile, but I suppose these are separate classes, so it's probably acceptable.

582

No need for virtual when the base method is virtual and the subclass method has override.

656–660

Ditto.

698–702

Ditto.

llvm/lib/Support/raw_ostream.cpp
742–743

It's definitely a programmer error if this case is hit, but I think it might be worth also bailing out of the function in that case, to avoid writing past the end of the file, which could cause all sorts of horridness. What do you think?

819

Same as above. Perhaps seek to EOF here in that case?

920–923

This should report some sort of error up the tree if this happens, right? Maybe reserve should return Error.

I second the separate class idea. It seems like it could be much cleaner. The normal treatment of reserve-like methods is that of a hint -- one that the implementation could ignore or adjust in some circumstances -- I wouldn't expect that calling this method would completely change the way in which a file is accessed, nor that writing "beyond" the reserved storage would result in an assertion...

avl added a comment.Nov 18 2020, 5:06 AM

For raw_fd_ostream, I wonder if we should actually just have a separate class e.g. raw_mmap_ostream, rather than trying to make raw_fd_ostream represent two possible different things. For other streams, I'm not sure the reserving behaviour is useful - strings and vectors could be reserved from outside the wrapping stream, for example.

The idea behind using a reserve() method is that the size which should be reserved may be unknown at creation time.
Thus we do not know how much to reserve for strings and vectors at their creation point. The same applies to a memory-mapped file: it may be unknown which class to create, raw_fd_ostream or raw_mmap_ostream:

void WriteToStreamLibrayFunc(raw_ostream &Out) {
   if (HasVariableData()) {
      while (HasData()) {
         Out.Write();
      }
   } else {
      size_t Size = calculateSize();

      Out.reserve(Size);
      Out.Write();
   }
}

Usage 1:

std::string string;
string.reserve(?????);
raw_string_ostream Out(string);
WriteToStreamLibrayFunc (Out);

Usage 2:

raw_fd_ostream?raw_mmap_ostream? Out();
WriteToStreamLibrayFunc (Out);

So, using the reserve() method allows us to set up efficient storage when the resulting size can be calculated under some conditions.

llvm/include/llvm/Support/raw_ostream.h
139–140

The reason is that when the space is allocated by the reserve() method, the other buffers become useless in most cases. We do not need to copy data through internal buffers - we can write it directly into the memory-mapped file.

llvm/lib/Support/raw_ostream.cpp
742–743

It seems to me that we do not need to bail out. That is a programmer error, as you said. But we need to stay correct, i.e. I think I need to add a size check (to avoid writing past the end of the file), like in raw_fd_stream::read().

819

Yes, I think seeking to EOF is a good idea. But I think it is better to have an assertion here rather than bailing out with an error.

920–923

The idea of that method is that when it cannot use a memory-mapped file, we continue with the usual processing. For example, if the file is opened without read access, we will get a "permission denied" error. But we would not like to report this error; we would like to continue without the memory-mapped file. So the idea is that if the reserve() method is not able to reserve space, we continue the same as if reserve() had not been called.
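
A rough sketch of that fallback behaviour for the mmap-based variant under discussion at this point; tryMapOutputFile, MappedRegion and the Mapped member are hypothetical names used only for illustration, not real LLVM APIs:

// Hypothetical mmap-backed reserve() on raw_fd_ostream:
void raw_fd_ostream::reserve(uint64_t Size) {
  // Try to memory-map the output file. On any error (e.g. "permission
  // denied" because the file lacks read access), swallow the error and
  // keep using the normal buffered write path.
  if (Expected<MappedRegion> MapOrErr = tryMapOutputFile(FD, Size))
    Mapped = std::move(*MapOrErr);
  else
    consumeError(MapOrErr.takeError());
}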

avl updated this revision to Diff 306431.Nov 19 2020, 9:04 AM

Addressed minor comments.

The question of creating a separate raw_mmap_ostream class
is still under discussion.

avl added a comment.Nov 19 2020, 9:25 AM

I second the separate class idea. It seems like it could be much cleaner. The normal treatment of reserve-like methods is that of a hint -- one that the implementation could ignore or adjust in some circumstances -- I wouldn't expect that calling this method would completely change the way in which a file is accessed, nor that writing "beyond" the reserved storage would result in an assertion...

What do you think of the WriteToStreamLibrayFunc use case? In that context reserve() is not just a hint; it limits the size of the resulting stream. Probably the method name is not good. Would resize() be better?

In D91693#2405908, @avl wrote:

I second the separate class idea. It seems like it could be much cleaner. The normal treatment of reserve-like methods is that of a hint -- one that the implementation could ignore or adjust in some circumstances -- I wouldn't expect that calling this method would completely change the way in which a file is accessed, nor that writing "beyond" the reserved storage would result in an assertion...

What do you think of the WriteToStreamLibrayFunc use case? In that context reserve() is not just a hint; it limits the size of the resulting stream. Probably the method name is not good. Would resize() be better?

Not really. :( For resize, I would expect that it has a physical effect on the underlying object (changing file size), which is good, I guess. But I would still expect that it is possible to write more data than that size, and have it be appended, as that's what our other streams do. I would also expect this operation to also have some effect on raw_string_ostream (changing the underlying string size) and such. And I am still worried about using mmap(2)/write(2) interchangeably. The two apis have different characteristics, and I don't think that can be conveyed by a single method call. E.g., the mmap approach will not respect O_APPEND, it will cause SIGBUS if the file is concurrently truncated (that's why our mmap APIs like MemoryBuffer have IsVolatile arguments), etc.

TBH, I am not sure that raw_ostream is the best class for this use case. How much of its functionality do you need? If you don't need the formatted output capabilities, then maybe a different interface might be better suited. (I realize you're trying to remove one.)

In D91693#2402455, @avl wrote:

Usage 2:

raw_fd_ostream?raw_mmap_ostream? Out();
WriteToStreamLibrayFunc (Out);

One way to get around that would be to make raw_mmap_ostream usable even without an explicit reserve call. So, if the user does not call reserve (or if he goes past the buffer he has already reserved), the stream would automatically mmap successive chunks of the file, and write to them.

That would make the mmap approach harder to implement, but otoh, it would at least behave roughly as a normal stream. And the usage of a distinct class would make it clear that something funky is happening. Also, this approach might give you a performance boost (reduced copying) even for the cases where the size is not known ahead of time. We may not even need to add a new reserve method to the class, as we could have the class mmap buffer-sized chunks, and the code could adjust the buffer size when the size is known.....

avl added a comment.Nov 20 2020, 2:09 AM

Not really. :( For resize, I would expect that it has a physical effect on the underlying object (changing file size), which is good, I guess. But I would still expect that it is possible to write more data than that size, and have it be appended, as that's what our other streams do. I would also expect this operation to also have some effect on raw_string_ostream (changing the underlying string size) and such. And I am still worried about using mmap(2)/write(2) interchangeably. The two apis have different characteristics, and I don't think that can be conveyed by a single method call. E.g., the mmap approach will not respect O_APPEND, it will cause SIGBUS if the file is concurrently truncated (that's why our mmap APIs like MemoryBuffer have IsVolatile arguments), etc.

I see. So the better solution would be a resizable raw_mmap_ostream, which could be used in that scenario (when the size is not known at creation time):

raw_mmap_ostream Out();
WriteToStreamLibrayFunc (Out);

TBH, I am not sure that raw_ostream is the best class for this use case. How much of its functionality do you need? If you don't need the formatted output capabilities, then maybe a different interface might be better suited. (I realize you're trying to remove one.)

I think I do not need the formatted output capabilities. They could be nice to have in some cases. But the main functionality which is useful is the possibility to write data as a sequence of pieces, i.e. not as one huge chunk, but as a series of smaller chunks.

Speaking of the reserve() method: do you think it would be useful to implement it in the sense of a hint? The only practical implementation at the moment would be reserving the underlying string in raw_string_ostream. It might be useful for the future raw_mmap_ostream, but that would depend on the implementation.

// Declared override in the raw_string_ostream class definition:
void raw_string_ostream::reserve(uint64_t Size) {
  OS.reserve(Size);
}
In D91693#2407604, @avl wrote:

Not really. :( For resize, I would expect that it has a physical effect on the underlying object (changing file size), which is good, I guess. But I would still expect that it is possible to write more data than that size, and have it be appended, as that's what our other streams do. I would also expect this operation to also have some effect on raw_string_ostream (changing the underlying string size) and such. And I am still worried about using mmap(2)/write(2) interchangeably. The two apis have different characteristics, and I don't think that can be conveyed by a single method call. E.g., the mmap approach will not respect O_APPEND, it will cause SIGBUS if the file is concurrently truncated (that's why our mmap APIs like MemoryBuffer have IsVolatile arguments), etc.

I see. So the better solution would be a resizable raw_mmap_ostream, which could be used in that scenario (when the size is not known at creation time):

Yes, *I* would think so. You may want to gather additional opinions before spending too much time on that, though...

TBH, I am not sure that raw_ostream is the best class for this use case. How much of its functionality do you need? If you don't need the formatted output capabilities, then maybe a different interface might be better suited. (I realize you're trying to remove one.)

I think I do not need the formatted output capabilities. They could be nice to have in some cases. But the main functionality which is useful is the possibility to write data as a sequence of pieces, i.e. not as one huge chunk, but as a series of smaller chunks.

Yeah, I suppose that's reasonable. Though, if the scope of this is small (e.g.: it only needs to write to files _or_ memory buffers, and it's not going to have a lot of callers/large surface area), I would not completely dismiss some custom solution either...

Speaking of the reserve() method: do you think it would be useful to implement it in the sense of a hint? The only practical implementation at the moment would be reserving the underlying string in raw_string_ostream. It might be useful for the future raw_mmap_ostream, but that would depend on the implementation.

I don't think that would be *un*reasonable, but I'd wait until a use case for it shows up.

avl added a comment.Nov 20 2020, 5:37 AM

Yeah, I suppose that's reasonable. Though, if the scope of this is small (e.g.: it only needs to write to files _or_ memory buffers, and it's not going to have a lot of callers/large surface area), I would not completely dismiss some custom solution either...

This work is being done exactly because the scope for a custom solution has become wider. There is review D88827, which tries to move the core implementation of llvm-objcopy into the Object library. So there is a request to avoid using a custom solution in favor of a more standard one. That is why I am trying to replace the custom llvm-objcopy solution (D91028).

I don't think that would be *un*reasonable, but I'd wait until a use case for it shows up.

There is such a use case in D91028

SmallVector<char, 0> Buffer;
raw_svector_ostream MemStream(Buffer);

if (Error E = executeObjcopyOnBinary(Config, **ObjOrErr, MemStream))
  return E;
In D91693#2407967, @avl wrote:

Yeah, I suppose that's reasonable. Though, if the scope of this is small (e.g.: it only needs to write to files _or_ memory buffers, and it's not going to have a lot of callers/large surface area), I would not completely dismiss some custom solution either...

This work is being done exactly because the scope for a custom solution has become wider. There is review D88827, which tries to move the core implementation of llvm-objcopy into the Object library. So there is a request to avoid using a custom solution in favor of a more standard one. That is why I am trying to replace the custom llvm-objcopy solution (D91028).

Ok, fair enough.

I don't think that would be *un*reasonable, but I'd wait until a use case for it shows up.

There is such a use case in D91028

SmallVector<char, 0> Buffer;
raw_svector_ostream MemStream(Buffer);

if (Error E = executeObjcopyOnBinary(Config, **ObjOrErr, MemStream))
  return E;

Seems reasonable, then. I would like to get a second opinion though..

In D91693#2407967, @avl wrote:

Yeah, I suppose that's reasonable. Though, if the scope of this is small (e.g.: it only needs to write to files _or_ memory buffers, and it's not going to have a lot of callers/large surface area), I would not completely dismiss some custom solution either...

This work is being done exactly because the scope for a custom solution has become wider. There is review D88827, which tries to move the core implementation of llvm-objcopy into the Object library. So there is a request to avoid using a custom solution in favor of a more standard one. That is why I am trying to replace the custom llvm-objcopy solution (D91028).

Ok, fair enough.

I don't think that would be *un*reasonable, but I'd wait until a use case for it shows up.

There is such a use case in D91028

SmallVector<char, 0> Buffer;
raw_svector_ostream MemStream(Buffer);

if (Error E = executeObjcopyOnBinary(Config, **ObjOrErr, MemStream))
  return E;

Seems reasonable, then. I would like to get a second opinion though..

@jhenderson mind chiming in here? If you and @labath find agreement with the direction, that seems good to me.

I slightly lost track of what people are suggesting, so instead, I'm deliberately taking a step back from my knowledge of the existing code here, to outline an idealised theoretical design, so this might not work quite as planned. To me, it would make more sense for executeObjcopyOnBinary to take a filename and do the writing internally, rather than receiving some kind of input buffer that it writes to. This then means you should know the size up-front before writing to the buffer, if I'm not mistaken, allowing you to set the size of your buffer up front. That in turn removes the need for any resize/reserve method - you just specify the size needed at the construction of the buffer. You might therefore have two different stream classes being used, one for in-memory writes and the other for file writing. The two of them share a common base class, so only a small part of the code requires knowing what buffer kind to create. The in-memory version would just take a size (and possibly some backing storage), whilst the other takes a size and file name and creates a file, possibly via memory mapping, that ultimately is that size. It would be an assertion in these cases to write past the end of the file.

Ultimately, I think the only new code needed would be a new constructor overload for raw_svector_ostream taking the eventual output size, and a new raw_ostream subclass which does the same with a file behind it.
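
A hypothetical in-class sketch of the raw_svector_ostream constructor overload described above; no such overload exists in LLVM, this only illustrates the idea:

// Hypothetical overload: take the expected output size and reserve the
// backing SmallVector up front, delegating to the existing constructor.
explicit raw_svector_ostream(SmallVectorImpl<char> &O, size_t ExpectedSize)
    : raw_svector_ostream(O) {
  OS.reserve(ExpectedSize);
}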

avl added a comment.Nov 25 2020, 6:53 AM

@jhenderson Summary of proposed approach:

  1. The current solution, which uses Buffer/MemoryBuffer/FileBuffer, assumes that the buffer for the whole file is pre-allocated. This means the whole file has to be loaded into memory. For a library, that might be an inconvenient requirement; some tools might want the possibility to reduce memory usage.

    Using streams allows us to reduce memory usage. There is no need to load all data into memory - the data can be streamed through a smaller buffer.
  2. Passing a file name to executeObjcopyOnBinary might be inconvenient if we would like to be able to replace the destination:

executeObjcopyOnBinary(CopyConfig &Config, object::Binary &In, StringRef OutName);

i.e. when we would like to write the data not to a file but to memory, or just discard the output,
or calculate a hash of the output, we might want to replace the destination. That would be hard
if only a file name can be specified. In contrast, the following design:

executeObjcopyOnBinary(CopyConfig &Config, object::Binary &In, raw_ostream &Out);

allows various kinds of destinations:

SmallVector<char, 0> Buffer;
raw_svector_ostream MemStream(Buffer);
executeObjcopyOnBinary(Config, In, MemStream);

raw_fd_ostream Out(OutputFilename);
executeObjcopyOnBinary(Config, In, Out);

raw_null_ostream Out;
executeObjcopyOnBinary(Config, In, Out);

raw_sha1_ostream Out;
executeObjcopyOnBinary(Config, In, Out);
  3. The current objcopy implementation assumes that the data is already pre-allocated. Thus, to get the advantage of using streams, it should be rewritten so that streams are used directly (without creating intermediate buffers).
  4. If llvm-objcopy needs to use memory-mapped files, then we might implement a resizable raw_mmap_ostream. Though I do not know at the moment whether an efficient implementation of a resizable memory-mapped file is possible.
  5. The fact that llvm-objcopy knows the size of the resulting data could be used to make the implementation a bit more efficient. We could call reserve() on the streams, and more efficient buffers could be used:
SmallVector<char, 0> Buffer;
raw_svector_ostream MemStream(Buffer);
executeObjcopyOnBinary(Config, In, MemStream);

executeObjcopyOnBinary(Config, In, Out) {
  Out.reserve(XX);   /// <<< calls Buffer.reserve(XX);
}

Finally, we can support all of the current functionality and have the advantages of using streams.
So my current suggestion is:

  1. use raw_ostream for the output parameter of the executeObjcopyOnBinary interface;
  2. add a reserve() method to raw_ostream, so that streams can optimize their internal buffers when possible.

What do you think?

In D91693#2416252, @avl wrote:

@jhenderson Summary of proposed approach:

  1. The current solution, which uses Buffer/MemoryBuffer/FileBuffer, assumes that the buffer for the whole file is pre-allocated. This means the whole file has to be loaded into memory. For a library, that might be an inconvenient requirement; some tools might want the possibility to reduce memory usage.

    Using streams allows us to reduce memory usage. There is no need to load all data into memory - the data can be streamed through a smaller buffer.

It's not immediately clear to me which data you're concerned about loading into memory, but it's worth remembering that objcopy will need to have all the data (or at least its size) in its internal representation before it can write most bits, at least for ELF objcopy, because the section header offset field is part of the ELF header, but the section header table itself is at the end.

  2. Passing a file name to executeObjcopyOnBinary might be inconvenient if we would like to be able to replace the destination:

This can be solved by overloading:

// In all examples, the types of the first two parameters are arbitrarily chosen; the third parameter, when present, is the important one.

// Execute objcopy then write to file Out.
Error executeObjcopyOnBinary(const CopyConfig &Config, const Binary &In, StringRef Out) {
  // Do some objcopy stuff to finalise the objcopy layout.
  Object Obj = executeObjcopy(Config, In);
  size_t Size = Obj.getOutputSize();
  raw_mmap_ostream OS(Out, Size);
  writeOutput(OS, Obj);
}

// Execute objcopy then write to raw_ostream Out. raw_mmap_ostream won't be usable here, since the output size isn't known up front.
Error executeObjcopyOnBinary(const CopyConfig &Config, const Binary &In, raw_ostream &Out) {
  Object Obj = executeObjcopy(Config, In);
  writeOutput(Out, Obj);
}

// If a user wanted to write the output to an existing raw_mmap_ostream, they'd need to call the lower-level API directly:
void aFunction(...) {
  ...
  Object Obj = executeObjcopy(Config, In);
  size_t Size = Obj.getOutputSize();
  ...
  raw_mmap_ostream Out(FileName, Size + ...);
  ...
  writeOutput(Out, Obj);
}

It's worth noting that we cannot have Expected<raw_ostream &> executeObjcopyOnBinary(const CopyConfig &Config, const Binary &In); where the function itself creates the stream and returns it, because something needs to own the backing storage.

  3. The current objcopy implementation assumes that the data is already pre-allocated. Thus, to get the advantage of using streams, it should be rewritten so that streams are used directly (without creating intermediate buffers).

I don't think it's possible to start writing to a stream before the size of the data is known (see above). Does that mean streams are still justified?

  4. If llvm-objcopy needs to use memory-mapped files, then we might implement a resizable raw_mmap_ostream. Though I do not know at the moment whether an efficient implementation of a resizable memory-mapped file is possible.

I don't think resizeable memory-mapped files are useful in this context, because the size needs to be known before writing anyway. The exception is if we think calling the objcopy guts is too complicated for users that want to write to an existing memory mapped file, but I think it is possible to keep it simple, like in my example above.

  5. The fact that llvm-objcopy knows the size of the resulting data could be used to make the implementation a bit more efficient. We could call reserve() on the streams, and more efficient buffers could be used:

This is merely a nice-to-have, and not itself directly related to the issue at hand. It can be implemented later. People who want the efficiency bonus of reserving could use the same approach as I did in the above example with the adding to an existing memory mapped file, by calling the lower-level API directly. They will usually have access to their underlying backing storage anyway (e.g. the std::vector) and so can just reserve it directly.

avl added a comment.Nov 27 2020, 5:47 AM

It's not immediately clear to me which data you're concerned about loading into memory, but it's worth remembering that objcopy will need to have all the data (or at least its size) in its internal representation before it can write most bits, at least for ELF objcopy, because the section header offset field is part of the ELF header, but the section header table itself is at the end.

I mean the bits of the resulting object, e.g. the output ELF file data.

My understanding is that this (having all the data in the internal representation) might not be
necessary for other formats, e.g. MachO/Wasm. The following is the existing code for MachO:

Expected<std::unique_ptr<MemoryBuffer>>
object::writeUniversalBinaryToBuffer(ArrayRef<Slice> Slices) {
  SmallVector<char, 0> Buffer;
  raw_svector_ostream Out(Buffer);

  if (Error E = writeUniversalBinaryToStream(Slices, Out))  <<< data is written to the memory stream
    return std::move(E);
    
  return std::make_unique<SmallVectorMemoryBuffer>(std::move(Buffer));
}    
......    
  Expected<std::unique_ptr<MemoryBuffer>> B =
      writeUniversalBinaryToBuffer(Slices);
  if (!B)
    return B.takeError();
  if (Error E = Out.allocate((*B)->getBufferSize()))
    return E;
  memcpy(Out.getBufferStart(), (*B)->getBufferStart(), (*B)->getBufferSize());  
  ^^^^^^^^^
  data is copied into the output buffer

Using a stream as the destination would allow us to avoid the intermediate memory object,
i.e. if it is not necessary to update the file header after the end of the file is written, or if that size
can be precalculated at the start, then we do not need to load all the file data into
internal buffers.

For the ELF case, there would be three alternatives in the streams solution (*):

  1. If the section header offset could somehow be precalculated, then we are lucky and do not need to update it after the file is written.
  2. We could store the file data in a memory buffer, update the field, and then copy this buffer into the destination stream.
  3. Use raw_pwrite_stream as the destination. That would allow us to update the section header offset after the file is written.

So we would still be able to handle the ELF case successfully, and we would not require loading the whole file into memory for the other cases.

This can be solved by overloading:

// In all examples, the types of the first two parameters are arbitrarily chosen; the third parameter, when present, is the important one.

// Execute objcopy then write to file Out.
Error executeObjcopyOnBinary(const CopyConfig &Config, const Binary &In, StringRef Out) {
  // Do some objcopy stuff to finalise the objcopy layout.
  Object Obj = executeObjcopy(Config, In);
  size_t Size = Obj.getOutputSize();
  raw_mmap_ostream OS(Out, Size);
  writeOutput(OS, Obj);
}

// Execute objcopy then write to raw_ostream Out. raw_mmap_ostream won't be usable here, since the output size isn't known up front.
Error executeObjcopyOnBinary(const CopyConfig &Config, const Binary &In, raw_ostream &Out) {
  Object Obj = executeObjcopy(Config, In);
  writeOutput(Out, Obj);
}

// If a user wanted to write the output to an existing raw_mmap_ostream, they'd need to call the lower-level API directly:
void aFunction(...) {
  ...
  Object Obj = executeObjcopy(Config, In);
  size_t Size = Obj.getOutputSize();
  ...
  raw_mmap_ostream Out(FileName, Size + ...);
  ...
  writeOutput(Out, Obj);
}

It's worth noting that we cannot have Expected<raw_ostream &> executeObjcopyOnBinary(const CopyConfig &Config, const Binary &In); where the function itself creates the stream and returns it, because something needs to own the backing storage.

In this design we have a new entity "Object" in the interface:

Object Obj = executeObjcopy(Config, In);

This new "Object" would force additional copying, e.g. the following usage would become pointless:

void aFunction(...) {
  ...
  Object Obj = executeObjcopy(Config, In);
  size_t Size = Obj.getOutputSize();
  ...
  raw_mmap_ostream Out(FileName, Size + ...);
  ...
  writeOutput(Out, Obj); <<< data should be copied from Object Obj into the raw_mmap_ostream Out;
}

The advantage of using a memory-mapped file is that we could write directly to the memory owned by the memory-mapped file,
and that memory would be written back by the system. Using an intermediate Object would require us to do extra copying from that Object into the memory-mapped file. In the streams scenario we do not need that extra copy:

executeObjcopyOnBinary(CopyConfig &Config, object::Binary &In, raw_ostream &Out) {
  Writer->finalize(Out); <<< calls Out.reserve(XXX);

  Writer->write(Out);  <<< writes directly to the memory mapped file data
}

void aFunction(...) {
  ...
  raw_mmap_ostream Out(FileName);
  executeObjcopy(Config, In, Out);
}

I don't think it's possible to start writing to a stream before the size of the data is known (see above). Does that mean streams are still justified?

My understanding is yes, streams are still justified; see (*) above.

I don't think resizeable memory-mapped files are useful in this context, because the size needs to be known before writing anyway. The exception is if we think calling the objcopy guts is too complicated for users that want to write to an existing memory mapped file, but I think it is possible to keep it simple, like in my example above.

But we already know the size inside objcopy before writing.
Currently we call Buf.allocate(totalSize()) before starting to write.
With the streams solution, we would call reserve(totalSize()) in that place.

This is merely a nice-to-have, and not itself directly related to the issue at hand. It can be implemented later. People who want the efficiency bonus of reserving could use the same approach as I did in the above example with the adding to an existing memory mapped file, by calling the lower-level API directly. They will usually have access to their underlying backing storage anyway (e.g. the std::vector) and so can just reserve it directly.

Yes, that could be implemented later.
But it would be required for the streams solution to match the performance of the current implementation.

avl added a comment.Dec 1 2020, 4:52 AM

@jhenderson James, what do you think of using streams as suggested by D91028 and D91693? Would it be useful? It seems it would reduce the amount of copied data.

In D91693#2425369, @avl wrote:

@jhenderson James, what do you think of using streams as suggested by D91028 and D91693? Would it be useful? It seems it would reduce the amount of copied data.

Hi @avl,

Honestly, I don't have the time to look at this process in detail, and refactoring things to support an objcopy library is not high on my priority list. I'm not convinced that your suggestions are actually going to be workable/useful in practice, if I'm honest, and have tried to outline my concerns already. My biggest concern is how do you stream an ELF header without already knowing where your section header table will live. If you know where your section header table will live, you have all the information you need for presizing your output buffer, so being able to reserve post stream creation becomes pointless. There's no need to read the entire input object file into memory either - llvm-objcopy doesn't do this already (note that generic sections that require no manipulation just use an ArrayRef to refer to the section contents).

I will give it more thought, but I cannot guarantee when I'll be able to come back to this.

avl added a comment.Dec 1 2020, 2:36 PM

Honestly, I don't have the time to look at this process in detail, and refactoring things to support an objcopy library is not high on my priority list. I'm not convinced that your suggestions are actually going to be workable/useful in practice, if I'm honest, and have tried to outline my concerns already.

Yes, thank you for your comments. I tried to explain why I think streams could still be used. Please check whether the argument is correct.

My biggest concern is how do you stream an ELF header without already knowing where your section header table will live.

Please check these three ways of doing it:

  1. Use a preliminary memory buffer (the ELF header can be updated after the sections are written):
executeObjcopyOnBinary(CopyConfig &Config, object::Binary &In, raw_ostream &Out) {
  Buf = WritableMemoryBuffer::getNewMemBuffer(totalSize()); <<< allocate preliminary memory buffer
  
  // Following code stores result in the buffer Buf(that is how the current implementation works).
  // Segment data must be written first, so that the ELF header and program
  // header tables can overwrite it, if covered by a segment.
  writeSegmentData();
  writeEhdr();
  writePhdrs();
  if (Error E = writeSectionData())
    return E;
  if (WriteSectionHeaders)
    writeShdrs();

  Out.write(Buf->getBufferStart(), Buf->getBufferSize()); <<<<< copy data from internal buffer into output stream
  Out.flush();
  return Error::success();  
}

That scheme looks equivalent (in the sense of copied output data) to the solution you proposed earlier:

void aFunction(...) {
  ...
  Object Obj = executeObjcopy(Config, In);  <<<< allocate memory buffer and return result in it
  size_t Size = Obj.getOutputSize();
  ...
  raw_fd_ostream Out(FileName);
  ...
  writeOutput(Out, Obj);  <<< copy data from the memory buffer into output file
}
  2. Write directly into the stream:

Before writing starts, the overall output file size is already known.
It is calculated by ELFWriter<ELFT>::finalize(). Perhaps the place where the section header table will live could also be determined inside ELFWriter<ELFT>::finalize()? In that case the ELF header
could be written immediately in its correct form.

  3. If the above two variants are not good, then we could consider using raw_pwrite_stream for the destination:
executeObjcopyOnBinary(CopyConfig &Config, object::Binary &In, raw_pwrite_stream &Out) {

  // Following code stores result in the raw_pwrite_stream &Out.
  // Segment data must be written first, so that the ELF header and program
  // header tables can overwrite it, if covered by a segment.
  writeSegmentData();
  writeEhdr();
  writePhdrs();
  if (Error E = writeSectionData())
    return E;
  if (WriteSectionHeaders)
    writeShdrs();

  // seek to the ELF header and update it
  Out.seek(to ELF Header);
  Out.write(updated ELF header);

  return Error::success();  
}

if you know where your section header table will live, you have all the information you need for presizing your output buffer, so being able to reserve post stream creation becomes pointless.

Why does it become pointless?

There's no need to read the entire input object file into memory either - llvm-objcopy doesn't do this already (note that generic sections that require no manipulation just use an ArrayRef to refer to the section contents).

Right. I am not proposing to optimize the memory used for the input object file.
This proposal is about using streams, both to use standard classes and to optimize the memory used for the output file.

The following is an example where the entire output file is loaded into memory:

Expected<std::unique_ptr<MemoryBuffer>>
object::writeUniversalBinaryToBuffer(ArrayRef<Slice> Slices) {
  SmallVector<char, 0> Buffer;
  raw_svector_ostream Out(Buffer);

  if (Error E = writeUniversalBinaryToStream(Slices, Out))  <<< entire output file is loaded into the memory
    return std::move(E);
    
  return std::make_unique<SmallVectorMemoryBuffer>(std::move(Buffer));
}    
......    
  Expected<std::unique_ptr<MemoryBuffer>> B =
      writeUniversalBinaryToBuffer(Slices);
  if (!B)
    return B.takeError();
  if (Error E = Out.allocate((*B)->getBufferSize()))
    return E;
  memcpy(Out.getBufferStart(), (*B)->getBufferStart(), (*B)->getBufferSize());  
  ^^^^^^^^^
  data is copied from memory into the output buffer
In D91693#2425369, @avl wrote:

@jhenderson James, what do you think of using streams as suggested by D91028 and D91693? Would it be useful? It seems it would reduce the amount of copied data.

Hi @avl,

Honestly, I don't have the time to look at this process in detail, and refactoring things to support an objcopy library is not high on my priority list. I'm not convinced that your suggestions are actually going to be workable/useful in practice, if I'm honest, and have tried to outline my concerns already. My biggest concern is how do you stream an ELF header without already knowing where your section header table will live. If you know where your section header table will live, you have all the information you need for presizing your output buffer, so being able to reserve post stream creation becomes pointless.

I guess that depends on the API - if the API is "write this object to this stream" then that API implementation has a "Oh, I know how big my output is, I can pass that hint to the stream and, if it has use for that (such as presizing a memory mapped output file to back storage) it can do that" - if the API is defined more in terms of "write this object to the file specified by this file name", then, yeah, you can change the logic about how the file is opened in the first place - presize, open memory mapped, and use a MemoryBuffer instead of a stream, potentially.

But a stream API is more general - allows for streaming out to stdout or other places that can't be identified by a file name, for instance.

@avl - it might be this and the related patch/usage in llvm-objcopy deserve an llvm-dev design thread about the overall issues you're trying to solve and the design directions you've explored and the ones you are proposing.

avl added a comment.Dec 23 2020, 9:58 AM

@avl - it might be this and the related patch/usage in llvm-objcopy deserve an llvm-dev design thread about the overall issues you're trying to solve and the design directions you've explored and the ones you are proposing.

Thanks! Will start that thread right after holidays...

avl added a comment.Jan 19 2021, 10:31 AM

@avl - it might be this and the related patch/usage in llvm-objcopy deserve an llvm-dev design thread about the overall issues you're trying to solve and the design directions you've explored and the ones you are proposing.

https://lists.llvm.org/pipermail/llvm-dev/2021-January/147892.html

avl updated this revision to Diff 322122.Feb 8 2021, 8:30 AM

Removed usages of the memory-mapped file. Made the reserve() method
pre-allocate internal buffers only.

avl edited the summary of this revision. (Show Details)Feb 8 2021, 9:11 AM
dblaikie added inline comments.Feb 18 2021, 3:27 PM
llvm/include/llvm/Support/raw_ostream.h
656

Would it make sense for the reserve operation to be relative to where the stream has reached so far?

ie: for it to be implemented as:

void reserve(size_t ExtraCapacity) override { OS.reserve(OS.size() + ExtraCapacity); }

(similarly for other implementations)
Because the average caller probably shouldn't be thinking about how many bytes have already been written to the stream when deciding how much capacity they want for future write operations?
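
A small usage sketch of the relative form suggested here; appendRecord, the 8-byte header, and the relative reserve() itself are hypothetical, for illustration only:

void appendRecord(raw_ostream &OS, StringRef Payload) {
  // The caller only knows how much more it is about to write; it does not
  // need to ask how much has already been written to the stream.
  OS.reserve(Payload.size() + 8); // relative hint, per the suggestion above
  OS << "HEADER00";               // hypothetical 8-byte record header
  OS << Payload;
}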

avl added inline comments.Feb 19 2021, 7:00 AM
llvm/include/llvm/Support/raw_ostream.h
656

I think yes, it makes sense. In that case, it would probably be better to name the method reserveExtraSpace() (so that it looks different from std::vector::reserve()).

Also, I think it should not be calculated against OS.size(). It looks better to calculate it against current_pos(), because the callee can receive a stream whose current position is in the middle of the stream.

I will update the review accordingly (if there are no objections).

dblaikie added inline comments.Feb 19 2021, 12:22 PM
llvm/include/llvm/Support/raw_ostream.h
656

Happy to use current_pos, though it looks like it's == size anyway, right?

uint64_t raw_svector_ostream::current_pos() const { return OS.size(); }

& raw_string_ostream's member:

uint64_t current_pos() const override { return OS.size(); }
avl added inline comments.Feb 19 2021, 1:20 PM
llvm/include/llvm/Support/raw_ostream.h
656

Happy to use current_pos, though it looks like it's == size anyway, right?

Right. I was thinking of the general case where we might seek from the current position (like in raw_fd_ostream), so using current_pos() might be more general. But since we have specialized implementations in raw_svector_ostream/raw_string_ostream and there are no seek methods there, it is fine to use OS.size().

avl updated this revision to Diff 325488.Feb 22 2021, 10:34 AM
avl edited the summary of this revision. (Show Details)

Changed reserve() to reserveExtraSpace(). That allows pre-allocating
the internal stream buffers without knowing their current size.
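
The rough shape of the API after this update, paraphrased from the discussion above (not copied verbatim from the committed diff):

// In raw_ostream (base class): a hint that ExtraSize more bytes are about
// to be written; the default implementation does nothing.
virtual void reserveExtraSpace(uint64_t ExtraSize) {}

// In raw_string_ostream (and similarly raw_svector_ostream): forward the
// hint to the underlying container, relative to the current write position.
void reserveExtraSpace(uint64_t ExtraSize) override {
  OS.reserve(tell() + ExtraSize);
}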

dblaikie accepted this revision.Feb 22 2021, 1:19 PM
dblaikie added inline comments.
llvm/unittests/Support/raw_ostream_test.cpp
462 ↗(On Diff #325488)

maybe make this a bit more interesting by having a non-zero value for tell() here, otherwise there's no difference between reserve using an absolute size, and reserve using a tell-relative value.
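
A sketch of the kind of test being suggested, assuming the final reserveExtraSpace() name; this is illustrative, not the committed test:

TEST(raw_ostreamTest, ReserveExtraSpace) {
  std::string Str;
  raw_string_ostream OS(Str);
  OS << "abcd";             // make tell() non-zero first
  OS.reserveExtraSpace(10); // the hint is relative to the current position
  OS << "0123456789";
  EXPECT_EQ("abcd0123456789", OS.str());
}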

This revision is now accepted and ready to land.Feb 22 2021, 1:19 PM
This revision was automatically updated to reflect the committed changes.
avl added a comment.Feb 23 2021, 3:11 AM

Thanks! Updated the test accordingly.