This is an archive of the discontinued LLVM Phabricator instance.

[llvm-objcopy] Switch over to using TableGen for parsing arguments
ClosedPublic

Authored by jakehehrlich on Mar 7 2018, 4:44 PM.

Details

Summary

So I'm starting back on the push to create llvm-strip which requires switching how arguments are being parsed. This change does the following

  1. Deletes the old argument parsing code
  2. Dactors out arguments into a Config class (will later be used repeatedly by llvm-strip)
  3. Removes all uses of "-R=" and "-j=" because they're non-standard and hard to support in TableGen. I was wrong to use them in the first place
  4. Adds in Opts.td (which will later be split into Common.td and Objcopy.td when Strip.td exists)
  5. Adds in objcopy driver using Opts.td (later there will be a StripDriver and symlink)

Diff Detail

Repository
rL LLVM

Event Timeline

jakehehrlich created this revision.Mar 7 2018, 4:44 PM
llvm/tools/llvm-objcopy/llvm-objcopy.cpp
136

for what it's worth, it seems to me this is kind of strange that a struct Config has these methods / logic (i mean CreateWriter, CreateReader, SplitDWOToFile). I've just looked at LLD which also has some kind of Config internally, but there the Config is not loaded with this business logic.

jakehehrlich added inline comments.Mar 7 2018, 7:46 PM
llvm/tools/llvm-objcopy/llvm-objcopy.cpp
136

LLD uses a global Config object instead of warping it in a class like I have. I also don't like what I've done that much. The alternative that I see is to pass the Config in as an argument. For HandleArgs that would be pretty painful. I can do that for the others though, what do you think? Alternatively I can go though the burden of Maybe I should just use a different name? The idea is that you want to be able to set and configure all these different values and then make the result happen. Each driver should set one of these up for each copy operation and it should execute it. Any other ideas?

compnerd added inline comments.Mar 7 2018, 9:39 PM
llvm/tools/llvm-objcopy/Opts.td
13

LLVM style is to name this O, so that you can find it as OPT_O.

23

LLVM style would be to name this remove_section.

26

LLVM style would be to name this R.

34

LLVM style would be to name this j.

llvm/tools/llvm-objcopy/llvm-objcopy.cpp
136

I agree with @alexshap here, I don't think that the Config structure here should have anything in it other than the options. It can be consumed by the tool to drive the tool as necessary. I can understand the desire to share code between the tool, but, I think that standalone functions or a different class would make for a better home for this.

Like the others who have commented, it seems to me like the execute() function should be a member of a driver class (or stand-alone, since we don't really need a class here), rather than a configuration class. A configuration describes how to do something, but I don't think doing it itself feels right to me (for a slightly abstract example, think configuration files for an application - the executable does the work, the config file just allows controlling what is done).

Is there a particular reason you don't want to follow LLD's approach of one global configuration variable? Alternatively making it a member of an "ObjcopyDriver" class?

llvm/tools/llvm-objcopy/llvm-objcopy.cpp
10โ€“12

These headers are in the wrong place in the list.

54

Is there a particular reason you've gone with "OBJCOPY_<OPTION_NAME>" here? I think it's more common to do OPT_<OPTION_NAME>.

136

The alternative that I see is to pass the Config in as an argument. For HandleArgs that would be pretty painful.

Why would it be painful? I don't see how it would be hard to extend this signature?

184โ€“186

clang-format and elsewhere.

384

I think I'd prefer the argument parsing split into a separate function from the driver function, since it's a distinct block of work, and will make the different stages of the program somewhat clearer. Ideally, I'd also just hoist the body of Config::Execute into this function then.

386

Please don't abbreviate these two variables - I have no idea what "MAI" and "MAC" are.

466

I don't think return 0 is necessary in main?

Like the others who have commented, it seems to me like the execute() function should be a member of a driver class (or stand-alone, since we don't really need a class here), rather than a configuration class. A configuration describes how to do something, but I don't think doing it itself feels right to me (for a slightly abstract example, think configuration files for an application - the executable does the work, the config file just allows controlling what is done).

Is there a particular reason you don't want to follow LLD's approach of one global configuration variable? Alternatively making it a member of an "ObjcopyDriver" class?

Making an ObjcopyDriver class is probably fine. Eventually I want the Config to be copyable so that multiple different configs can be used at the same time.

llvm/tools/llvm-objcopy/llvm-objcopy.cpp
54

Eventually these will clash with the llvm-strip symbols for the same thing.

136

It would just mean typing "Config." a ton of times. I'm fine making it a parameter though. It also means *everything* needs to take Config as a parameter.

384

After I put Config::Execute in its own function (per the discussion above) that code will (literally in the next diff) be shared between the StripDriver and the ObjcopyDriver

386

Yeah fair, that was cargo-culted from other tools. They stand for "missing argument count" and "missing argument index" which is 100% not clear. I'll fix that.

I split the "Config" up into "Config" and "CopyAction". The plan is for "CopyAction" to handle multiple formats and for "Config" to determine which sort of CopyAction is constructed. When COFF support rolls around we should be able to use CopyAction for those as well. Archives support is also something I plan on adding eventually and that will be a copy action composed of many copy actions.

so maybe the following approach can also be considered: (deliberately presented in the reverse order to see how different pieces would fit together):

int main(int argc, char **argv) {
    StringRef ToolName = path::filename(argv[0]);
    Config Conf;
    if (ToolName) == "objcopy")
       Conf = parseObjcopyArgs(argc, argv);
    else if (ToolName = "strip") 
       Conf = parseStripArgs(argc, argv);
    ExecuteObjcopy(Conf);
    return 0;
}

void ExecuteObjcopy(const Config &Conf) {
     auto Reader = createReader(Conf.InputName);
     if (Reader.isELF())
         return ExecuteElfObjcopy(Conf, Reader);
     if (Reader.isCoff())
         return ExecuteCoffObjcopy(Conf, Reader);
}

void ExecuteElfObjcopy(const Config &C, Reader &R) {
      Object O;  // intermediate representation which we discussed with Jake and James
      O.read(R);   
      /* modify the object according to the config, similarly to the existing handleArgs */
      .....
      .....  
      .....
      auto W = createWriter(R.elftype(), C.OutputFormat, C.OutputFilename);
      O.write(W);
 }

}

(seems to be a bit simpler, less inheritance, all the classes are "stateful")

compnerd added inline comments.Apr 5 2018, 8:39 AM
llvm/tools/llvm-objcopy/llvm-objcopy.cpp
136

I'm not sure I understand why there is still so much "configuration" in the CopyAction here. The flag state should be hoisted into the Config as that is not part of the "copy" operation.

Could you please share what your plan is with the additional machinery here? It should be possible to do the work more simply (as @alexshap mentions).

This looks good to me. Thanks!

In D44236#1058119, @alexshap wrote:

so maybe the following approach can also be considered: (deliberately presented in the reverse order to see how different pieces would fit together):

int main(int argc, char **argv) {
    StringRef ToolName = path::filename(argv[0]);
    Config Conf;
    if (ToolName) == "objcopy")
       Conf = parseObjcopyArgs(argc, argv);
    else if (ToolName = "strip") 
       Conf = parseStripArgs(argc, argv);
    ExecuteObjcopy(Conf);
    return 0;
}

void ExecuteObjcopy(const Config &Conf) {
     auto Reader = createReader(Conf.InputName);
     if (Reader.isELF())
         return ExecuteElfObjcopy(Conf, Reader);
     if (Reader.isCoff())
         return ExecuteCoffObjcopy(Conf, Reader);
}

void ExecuteElfObjcopy(const Config &C, Reader &R) {
      Object O;  // intermediate representation which we discussed with Jake and James
      O.read(R);   
      /* modify the object according to the config, similarly to the existing handleArgs */
      .....
      .....  
      .....
      auto W = createWriter(R.elftype(), C.OutputFormat, C.OutputFilename);
      O.write(W);
 }

}

(seems to be a bit simpler, less inheritance, all the classes are "stateful")

llvm/tools/llvm-objcopy/llvm-objcopy.cpp
136

These aren't necessarily in direct correspondence to flags; they are for objcopy but not for strip. For instance strip will set StripAll unless another strip option is used. The idea is that this is information shared by every action that could occur on any input type or via any driver (objcopy or strip). I like Alex's idea though so I'm going to implement that.

sorry about the disturbance, i'm wondering if i can help push this diff forward - switching to TableGen looks like the right thing to do.
I have (locally, have not sent them for code review yet) a couple of other patches to llvm-objcopy (with various features),
but I'd like to apply them on top of the refactored codebase (in particular, with this change), although it's not a blocker nor is it critical.

In D44236#1062012, @alexshap wrote:

sorry about the disturbance, i'm wondering if i can help push this diff forward - switching to TableGen looks like the right thing to do.
I have (locally, have not sent them for code review yet) a couple of other patches to llvm-objcopy (with various features),
but I'd like to apply them on top of the refactored codebase (in particular, with this change), although it's not a blocker nor is it critical.

I can have something up by the end of tomorrow for you to review. I'm also fine with you submitting something for me to review. You can copy pasta any part that happens to be useful.

Updated to (in the future) match Alex's plan for how we can expand this.

great, many thanks,
I've added a few comments (some minor nits (like names of the variables + formatting)), if you don't mind - let's clang-format -style=llvm these files and update those names,
otherwise this looks good to me.

llvm/tools/llvm-objcopy/llvm-objcopy.cpp
221โ€“222

nit: const CopyConfig &Config

374

const CopyConfig &Config

388

nit: this blank line is probably unnecessary, maybe let's clang-format -style=llvm this diff or the entire file (if not yet).

391

nit: https://llvm.org/docs/CodingStandards.html#use-auto-type-deduction-to-make-code-more-readable
I would use the explicit type InputArgList instead of auto because unlike some other case (i.e. auto *VarDecl = dyn_cast<VarDecl>(Decl)) here it's not mentioned anywhere else. (so, probably, being explicit would improve readability).

406

very minor nit: Positional.empty() (here and above where we check if size() equals 0)

412

out -> Out,
but, probably, Config would be a better name

ran format with style=llvm (git-clang-format seems to not have a way to do this?) and fixed a couple other nits

alexander-shaposhnikov added inline comments.
llvm/tools/llvm-objcopy/llvm-objcopy.cpp
394

empty()

This revision is now accepted and ready to land.Apr 11 2018, 2:59 PM
This revision was automatically updated to reflect the committed changes.