This is an archive of the discontinued LLVM Phabricator instance.

[llvm-objcopy] Refactor ELF-specific config out to ELFCopyConfig. NFC.
ClosedPublic

Authored by seiya on Sep 3 2019, 7:22 PM.

Details

Summary

This patch splits the command-line parsing into two phases:

First, parse cross-platform options and leave ELF-specific options unparsed.

Second, in the ELF implementation, parse ELF-specific options and construct ELFCopyConfig.

Diff Detail

Repository
rL LLVM

Event Timeline

seiya created this revision.Sep 3 2019, 7:22 PM

just want to mention a slightly different approach:

CopyConfig.h:

// Forward declarations only.
struct COFFCopyConfig;
struct ELFCopyConfig;
struct MachOCopyConfig;

struct CopyConfig {
    ...
   Optional<StringRef> NewSymbolVisibility;
   std::vector<StringRef> SymbolsToAdd;
  
   std::unique_ptr<COFFCopyConfig> COFF;
   std::unique_ptr<ELFCopyConfig> ELF;
   std::unique_ptr<MachOCopyConfig> MachO;
};

ELFObjcopy.h:

struct ELFCopyConfig {
   Optional<uint8_t> NewSymbolVisibility;
   std::vector<NewSymbolInfo> SymbolsToAdd
};

Error parseELFCopyConfig(CopyConfig &C);

Error executeObjcopyOnBinary(
     CopyConfig &C,
     object::ELFObjectFileBase &In, Buffer &Out);

What do you think ?

I've got another suggestion: have separate COFFConfig/ELFConfig etc classes that only contain the interpreted versions and lazily construct them prior to calling the corresponding code path. CopyConfig would contain the raw unparsed strings (perhaps rename them to "RawSymbolsToAdd" etc). Perhaps a higher-level Config class could exist just to wrap them and pass them to executeObjcopyOnBinary etc.

Rough outline:

CopyConfig.h:

class CopyConfig {
  std::vector<StringRef> RawSymbolsToAdd;
  ...
};

class ELFConfig {
  std::vector<NewSymbolInfo> SymbolsToAdd;
  ...
};

// etc for COFF, Mach-O...

llvm-objcopy.cpp:

struct Configurations {
  CopyConfig Common;
  Optional<ELFConfig> ELF;
  ...
}

...

static Error executeObjcopyOnBinary(Configurations &Cfgs, ...) {
  if (/*isElf*/...) {
    if (!Cfgs.ELF)
     Cfgs.ELF.emplace(Cfgs.Common);
    return elf::executeObjcopyOnBinary(Cfgs.Common, Cfgs.ELF, ...)
  }
  ...
}

On a related note, I think the ELF-specific etc parsing code belongs in a file other than ELFObjcopy.cpp. I feel like that file should be confined to doing the actual objcopy processing, whereas the command-line option parsing etc should be separate, e.g. in a new file called ELFConfig.cpp etc. What do you think?

P.S. i'm ok with @jhenderson's suggestion.

seiya added a comment.Sep 5 2019, 8:01 PM

I've got another suggestion: have separate COFFConfig/ELFConfig etc classes that only contain the interpreted versions and lazily construct them prior to calling the corresponding code path. CopyConfig would contain the raw unparsed strings (perhaps rename them to "RawSymbolsToAdd" etc). Perhaps a higher-level Config class could exist just to wrap them and pass them to executeObjcopyOnBinary etc.

Rough outline:

CopyConfig.h:

class CopyConfig {
  std::vector<StringRef> RawSymbolsToAdd;
  ...
};

class ELFConfig {
  std::vector<NewSymbolInfo> SymbolsToAdd;
  ...
};

// etc for COFF, Mach-O...

llvm-objcopy.cpp:

struct Configurations {
  CopyConfig Common;
  Optional<ELFConfig> ELF;
  ...
}

...

static Error executeObjcopyOnBinary(Configurations &Cfgs, ...) {
  if (/*isElf*/...) {
    if (!Cfgs.ELF)
     Cfgs.ELF.emplace(Cfgs.Common);
    return elf::executeObjcopyOnBinary(Cfgs.Common, Cfgs.ELF, ...)
  }
  ...
}

LGTM. I'll implement the idea.

On a related note, I think the ELF-specific etc parsing code belongs in a file other than ELFObjcopy.cpp. I feel like that file should be confined to doing the actual objcopy processing, whereas the command-line option parsing etc should be separate, e.g. in a new file called ELFConfig.cpp etc. What do you think?

I agree with this. I'll add ELFConfig.cpp to separate the command-line parsing.

seiya updated this revision to Diff 219870.Sep 12 2019, 3:14 AM
seiya edited the summary of this revision. (Show Details)
jhenderson added inline comments.Sep 12 2019, 3:40 AM
llvm/tools/llvm-objcopy/ELF/ELFObjcopy.cpp
135 ↗(On Diff #219870)

For some warning levels, this will produce a warning as it is unused. Probably, you should comment out the name: const ELFCopyConfig &/*ELFConfig*/.

Of course, this becomes moot if you follow my suggestion of only passing in ELFConfig.

172–173 ↗(On Diff #219870)

This and several other functions don't look like they need to change their signatures, since you don't use the ELF-specific parameters. Please revert them, or switch to passing in only ELFConfig and NOT Config.

llvm/tools/llvm-objcopy/ELF/ELFObjcopy.h
29–30 ↗(On Diff #219870)

Since ELFConfig has Config as a member, you could just pass in ELFConfig in this and the other functions here, and use that.

llvm/tools/llvm-objcopy/llvm-objcopy.cpp
271 ↗(On Diff #219870)

Here and elsewhere, I'd keep the name as Config.

seiya updated this revision to Diff 220054.Sep 13 2019, 2:24 AM

Pass only ELFCopyConfig instead of passing both CopyConfig and ELFCopyConfig.

seiya marked 4 inline comments as done.Sep 13 2019, 2:25 AM
jhenderson accepted this revision.Sep 13 2019, 2:28 AM

LGTM. Might want to give a chance for somebody else to approve too.

This revision is now accepted and ready to land.Sep 13 2019, 2:28 AM
llvm/tools/llvm-objcopy/llvm-objcopy.cpp
93 ↗(On Diff #220054)

frankly speaking I'm not a big fan of this design again, i kinda read James' suggestion differently.
Here we have this duplicated reference structure (one reference to Common here, another one is inside ELFConfig) and the responsibilities / concerns of these classes are kinda intricated.

If we want to separate Common from everything else - I'm totally fine with that, but would prefer to have a simpler structure here:

struct DriverConfig {
    SmallVector<Configuration, 1> Configurations; 
    BumpPtrAllocator Alloc;
};

// Parsed command-line arguments
struct CopyConfig {
     ....
};

// COFF specific details
struct MachOCopyConfig {
};

// ELF-specific details
struct ELFCopyConfig {
};

// MachO specific details
struct MachOCopyConfig {
};

struct Configuration {
    CopyConfig Common;
    Optional<ELFCopyConfig> ELF;
    Optional<MachOCopyConfig> MachO;
    Optional<COFFCopyConfig> COFF;
};

Somehow I find it a tiny bit more straightforward / simpler.

And have

Error executeObjcopyOnObjcopy(const Configuration &Config);

etc.

CC: @jhenderson

I don't have a strong feeling about this either way, if I'm honest. I'm happy for alternative approaches, if @alexshap prefers.

seiya marked an inline comment as done.Sep 13 2019, 4:04 AM
seiya added inline comments.
llvm/tools/llvm-objcopy/llvm-objcopy.cpp
93 ↗(On Diff #220054)

Eliminating duplicated references to CopyConfig sounds better to me. I'll adopt this design if there are no objections.

MaskRay added inline comments.Sep 13 2019, 4:11 AM
llvm/tools/llvm-objcopy/llvm-objcopy.cpp
93 ↗(On Diff #220054)

I don't have strong opinions on the interface (or I probably should say this really depends on the implementation)

No matter what design this patch ends up with, I hope we can avoid the verbose`Config.Common.Option` and use the simpler Config.Option, Common.Option or ELF.Option.

llvm/tools/llvm-objcopy/llvm-objcopy.cpp
93 ↗(On Diff #220054)
struct DriverConfig {
  SmallVector<Configuration, 1> Configurations; 
  BumpPtrAllocator Alloc;
};

// COFF specific details
struct MachOCopyConfig {
};

// ELF-specific details
struct ELFCopyConfig {
};

// MachO specific details
struct MachOCopyConfig {
};

struct CopyConfig {
   ...
   Optional<ELFCopyConfig> ELF;
   Optional<MachOCopyConfig> MachO;
   Optional<COFFCopyConfig> COFF;
};

What about having ELFcopyConfig inherit from CopyConfig?

seiya added a comment.Sep 15 2019, 6:07 PM

What about having ELFcopyConfig inherit from CopyConfig?

How about this design? I believe moving the CopyConfig is not a big cost.

/// Parses CopyConfig and creates a ELFCopyConfig, or returns Error on parse errors.
class ELFCopyConfigBuilder {
  CopyConfig Common;
public:
  ELFCopyConfig(CopyConfig&& Common) : Common(std::move(Common)){};
  Expected<ELFCopyConfig> build();
}

class ELFCopyConfig : public ELFConfig {
public:
  Optional<uint8_t> NewSymbolVisibility;
  std::vector<NewSymbolInfo> SymbolsToAdd;

  ELFCopyConfig(CopyConfig&& Common, Optional<uint8_t> NewSymbolVisibility, /* ... */)
                : CopyConfig(std::move(Common)), NewSymbolVisibility(NewSymbolVisibility) /* ... */ {}
};

What about having ELFcopyConfig inherit from CopyConfig?

I thought of that, but if I'm not mistaken, that won't help us when we have a mixture of file formats, e.g. ELF and COFF. That could be a problem in some corner cases (e.g. archives containing objects of different formats). I think Seiya's one has the same problem - we can only use the common config once, not multiple times. If it's not a big deal to create copies of CopyConfig, then I don't mind doing it using inheritance: construct an ELFCopyConfig from the CopyConfig, but not std::move it. That would allow the original CopyConfig to continue to be used later.

for what it's worth - i was thinking about it for some time,
it seems to me that using inheritance here would look more complicated + would require a bit more code.
Somehow it feels better to me when there is only one instance of CopyConfig (and it's not moved around).
(yeah, some fields would be populated only when the format of the input is known, but overall this looks kinda simpler to me)
(but i don't want to insist)

seiya added a comment.Sep 17 2019, 2:46 AM

Thanks you for feedbacks. Putting it all together, the @alexshap's suggestion (D67139#inline-607235) looks desirable to me because:

I'll update the patch but comments are still welcome of course.

seiya updated this revision to Diff 220638.Sep 18 2019, 2:46 AM
seiya marked 3 inline comments as done.Sep 18 2019, 2:48 AM
seiya updated this revision to Diff 220640.Sep 18 2019, 2:53 AM
  • Added a comment for parseELFConfig.
jakehehrlich accepted this revision.Sep 18 2019, 4:13 PM
MaskRay accepted this revision.Sep 19 2019, 7:30 PM
MaskRay added inline comments.
llvm/tools/llvm-objcopy/llvm-objcopy.cpp
265 ↗(On Diff #220640)

typedef -> using

seiya updated this revision to Diff 220952.Sep 19 2019, 11:23 PM
  • Addressed a review comment.
seiya marked an inline comment as done.Sep 19 2019, 11:23 PM
jhenderson accepted this revision.Sep 23 2019, 8:34 AM

LGTM, thanks!

This revision was automatically updated to reflect the committed changes.