This is an archive of the discontinued LLVM Phabricator instance.

[objcopy] Refactor CommonConfig to add posibility to specify added/updated sections as MemoryBuffer.
ClosedPublic

Authored by avl on Feb 24 2022, 7:22 AM.

Details

Summary

Current objcopy implementation has a possibility to add or update sections.
The incoming section is specified as a pair: section name and name of the file
containing section data. The interface does not allow to specify incoming
section as a memory buffer. This patch adds possibility to specify incoming
section as a memory buffer.

Diff Detail

Event Timeline

avl created this revision.Feb 24 2022, 7:22 AM
avl requested review of this revision.Feb 24 2022, 7:22 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 24 2022, 7:22 AM
jhenderson requested changes to this revision.Feb 25 2022, 12:52 AM
jhenderson added inline comments.
llvm/include/llvm/ObjCopy/CommonConfig.h
193

Take the Buffer by R-value reference, since you are releasing its contents.

194

Use an initialiser list.

199

Why shared_ptr?

llvm/test/tools/llvm-objcopy/COFF/add-section.test
62–64

Here and throughout your test changes, please remove the unused FILE1 variable, and rename FILE2 to FILE (or make equivalent changes).

llvm/tools/llvm-objcopy/ObjcopyOptions.cpp
876–880

It would be great if you could pull all this logic into a function for sharing with the update section case. I think the only differences are the error messages and where the final data gets put.

llvm/unittests/ObjCopy/ObjCopyTest.cpp
120

Split this into separate Add and Update test cases, to avoid confusing the two functionalities.

As the fact that the logic is independent of whether it is a debug section or not, remove reference to "debug" and use arbitrary section contents below (e.g. ".foo" with content "1234" or whatever), rather than a real debug section.

Why only an ELF case?

This revision now requires changes to proceed.Feb 25 2022, 12:52 AM
avl added inline comments.Feb 25 2022, 2:38 AM
llvm/include/llvm/ObjCopy/CommonConfig.h
199

There are many cases where CommonConfig and ConfigManager are copied using "copy" semantic. unique_ptr should be copied with "move" semantic. if unique_ptr would be used here then we need to write strange copy operator:

NewSectionInfo& operator=(const NewSectionInfo& Other) {
   SectionData = std::move(const_cast<NewSectionInfo*>(&Other)->SectionData);
}
jhenderson added inline comments.Feb 25 2022, 5:43 AM
llvm/include/llvm/ObjCopy/CommonConfig.h
199

There are many cases where CommonConfig and ConfigManager are copied using "copy" semantic

Do they need to copy though? It seems like a reference should be enough.

avl added inline comments.Feb 25 2022, 5:59 AM
llvm/include/llvm/ObjCopy/CommonConfig.h
199

There are many cases where CommonConfig and ConfigManager are copied using "copy" semantic

Do they need to copy though? It seems like a reference should be enough.

parseStripOptions fills DC.CopyConfigs by copying of initial Config :

    for (StringRef Filename : Positional) {
...
      Config.InputFilename = Filename;
      Config.OutputFilename = Filename;
      DC.CopyConfigs.push_back(ConfigMgr);

"move" semantic could not be used here.

jhenderson added inline comments.Feb 25 2022, 6:37 AM
llvm/include/llvm/ObjCopy/CommonConfig.h
199

Okay. I'm wondering whether we could change the structure of how that works to allow a core config that has everything except the filenames in, and then that can be taken by reference, or some similar design.

However, actually, I think it would just be simpler to implement a copy constructor that makes a copy of the memory buffer - you already have code that does this in this patch as it is.

avl updated this revision to Diff 411449.Feb 25 2022, 10:11 AM

addressed comments.

jhenderson added inline comments.Feb 28 2022, 1:09 AM
llvm/tools/llvm-objcopy/ObjcopyOptions.cpp
563
llvm/unittests/ObjCopy/ObjCopyTest.cpp
123

I have a marginal prefernce to make Add an enum, as it helps give a bit more meaning to the parameter.

163
170

Add a break at the end of the if.

174–179

These can all be EXPECT rather than ASSERT.

avl updated this revision to Diff 411782.Feb 28 2022, 3:51 AM

addressed comments.

jhenderson added inline comments.Feb 28 2022, 3:59 AM
llvm/include/llvm/ObjCopy/CommonConfig.h
199

However, actually, I think it would just be simpler to implement a copy constructor that makes a copy of the memory buffer - you already have code that does this in this patch as it is.

Are you planning on looking at this? I still think you shouldn't have a shared_ptr.

llvm/unittests/ObjCopy/ObjCopyTest.cpp
132–133

These two should be ASSERT, since the test will crash if the object isn't valid.

162–163

Ditto.

avl added inline comments.Feb 28 2022, 4:08 AM
llvm/include/llvm/ObjCopy/CommonConfig.h
199
However, actually, I think it would just be simpler to implement a copy constructor that makes a copy of the memory buffer - you already have code that does this in this patch as it is.

Are you planning on looking at this? I still think you shouldn't have a shared_ptr.

Oh, sorry, I misunderstood the comment.

With shared_ptr we do not need to copy of the memory buffer. For unique_ptr we need to copy of the memory buffer. But then we might have some performance degradation. Would it be OK?

Why using shared_ptr is a problem?

jhenderson added inline comments.Feb 28 2022, 4:16 AM
llvm/include/llvm/ObjCopy/CommonConfig.h
199

shared_ptr is almost always a design smell in my experience. The majority of the CopyConfig isn't shared (it's copied), so why should one part of it be? Furthermore, if somebody wanted to update the SectionData for one particular config in their library, but use the original for all the others, they'd end up changing it for all of them unintentionally.

avl added inline comments.Feb 28 2022, 4:58 AM
llvm/include/llvm/ObjCopy/CommonConfig.h
199

shared_ptr is almost always a design smell in my experience. The majority of the CopyConfig isn't shared (it's copied), so why should one part of it be?

There are other parts of CopyConfig which are shared:

class NameOrPattern {
  StringRef Name;
  // Regex is shared between multiple CommonConfig instances.
  std::shared_ptr<Regex> R;
  std::shared_ptr<GlobPattern> G;

The reason to not copy part of CommonConfig related to Add/Update sections might be intention to avoid copying big piece of data. All other parts of CommonConfig do not require a big storage.

Furthermore, if somebody wanted to update the SectionData for one particular config in their library, but use the original for all the others, they'd end up changing it for all of them unintentionally.

My understanding is that shared part of data is not supposed to be changed. It Is MemoryBuffer which keeps "const char*". If there would be changed particular SectionData member in the particular config - then it would be perfectly fine and it would not affect others.

If someone would cast away const-ness of MemoryBuffer then it seems to be his responsibility to understand the details.

jhenderson added inline comments.Mar 1 2022, 12:33 AM
llvm/include/llvm/ObjCopy/CommonConfig.h
199

Okay, thanks for the explanation, your comments make sense.

avl updated this revision to Diff 412016.Mar 1 2022, 2:10 AM

addressed comments(returned ASSERT_TRUE back)

This revision is now accepted and ready to land.Mar 1 2022, 3:06 AM
avl added a comment.Mar 1 2022, 3:10 AM

Thank you for the review!

This revision was landed with ongoing or failed builds.Mar 1 2022, 3:50 AM
This revision was automatically updated to reflect the committed changes.

So it looks like this commit regressed some functionality we were using in Android's Linux kernel builds.

It looks like with GNU objcopy, can you specify both --dump-section= then --add-section= in one command line invocation with the same file in order to move a section.
https://android.googlesource.com/kernel/common/+/refs/heads/android14-5.15/arch/arm64/Makefile.postlink#21

$ echo 'void x(void) {}` > foo.c
$ clang -c foo.c
$ rm -f foo.text && llvm-objcopy --dump-section=.text=foo.text --add-section=.hello=foo.text foo.o
llvm-objcopy: error: 'foo.text': No such file or directory

I think I can move us to using --rename-section instead of dump then add, but just a heads up in case this regresses others' builds, too.

Herald added a project: Restricted Project. · View Herald TranscriptSep 26 2022, 3:47 PM

It seems that the change was for llvm-dwarfutil which needs the MemoryBuffer functionality.
The patch changed --add-section to run in twp steps, one before --dump-section (load the file content) and one after (add a section).

In general I wish that users do not expect a particular order between two operations. The order is often unclear/undocumented.