Page MenuHomePhabricator

[llvm-objcopy] Allow using regex in name comparison
ClosedPublic

Authored by evgeny777 on Jan 31 2019, 8:26 AM.

Diff Detail

Repository
rL LLVM

Event Timeline

evgeny777 created this revision.Jan 31 2019, 8:26 AM

In general, I think regex is probably a better choice than wildcards, but I'd be wary of pretending to be GNU compatible with -w when it's a different format. Maybe -r would be better, or maybe we don't even need a short arg for this. If we're going to implement it as -w, I think it would be better to actually be wildcards.
i.e. if someone drops llvm-objcopy into their toolchain and somewhere they use -w -G foo*, that means it now only matches things like fooooooo but not foobar (and additionally will now match fo too!)... but it may be a error that takes a while to be exposed. (In general, people assume tools work just fine, and don't always feel the need to write tests).

If we use a different flag, the error will be obvious: it will be "-w is not a valid flag" and exit 1.

Also a question, for others as well: is this a feature we actually want to carry forward? Are there any users of this feature/is it useful? This isn't something I've come across as actually being used, though I'm happy to review it if is.

tools/llvm-objcopy/CopyConfig.h
100 ↗(On Diff #184512)

Technically, the section flags among these always accept wildcards. -w only affects symbol name matching. I'm fine with this though.

Are there any users of this feature/is it useful?

One of such users is myself. I'm using it to strip specific classes (with all methods and data) from relocatable objects and then execute those object files on memory tight embedded platform using OrcJIT. This is pretty specific use case though.

Are there any users of this feature/is it useful?

One of such users is myself. I'm using it to strip specific classes (with all methods and data) from relocatable objects and then execute those object files on memory tight embedded platform using OrcJIT. This is pretty specific use case though.

Still sounds like a reasonable use-case anyway. I could see it being useful for library providers too, who want to hide some things in the external interface, though I don't know of any concrete examples.

If we choose to support this (and I'm happy to +1 it), then I think it should only be a long-form switch, with no short-form equivalent. That's because there's a much higher chance of something being added to GNU objcopy in the future with a short alias that causes a clash than for a long-form version.

evgeny777 updated this revision to Diff 184726.Feb 1 2019, 5:12 AM

Removed -w

rupprecht marked an inline comment as done.Feb 1 2019, 8:04 AM

Are there any users of this feature/is it useful?

One of such users is myself. I'm using it to strip specific classes (with all methods and data) from relocatable objects and then execute those object files on memory tight embedded platform using OrcJIT. This is pretty specific use case though.

Still sounds like a reasonable use-case anyway. I could see it being useful for library providers too, who want to hide some things in the external interface, though I don't know of any concrete examples.

If we choose to support this (and I'm happy to +1 it), then I think it should only be a long-form switch, with no short-form equivalent. That's because there's a much higher chance of something being added to GNU objcopy in the future with a short alias that causes a clash than for a long-form version.

OK, thanks -- just wanted to double check it's useful, and sounds like it is. I'm on board too :)

test/tools/llvm-objcopy/ELF/globalize.test
7 ↗(On Diff #184726)

nit: please use --regex instead of -regex in these test cases

8 ↗(On Diff #184726)

In cases where this should be identical to the output of another another command, you can use cmp instead of another readobj | FileCheck, e.g. cmp %t2 %t3

tools/llvm-objcopy/CopyConfig.h
52 ↗(On Diff #184726)

Can you add a test that just demonstrates regex matching itself?

64 ↗(On Diff #184726)

It looks like this does regex matching anywhere within the string. I think it might be less error prone if we make it a full string regex match, e.g. special.* -- currently matches not-special. I think we should force the user to write .*special.* if that's what they want.

Python has a similar distinction, re.match vs re.search. I don't think llvm's regex has anything similar builtin, so you may need to manually wrap the expression in ^$.

101 ↗(On Diff #184726)

It might be worth using StringSaver or something for this, so they can all be StringRef and avoid templating NameOrRegex. I'll take a look since I introduced that nonsense here.

alexshap added inline comments.Feb 3 2019, 9:42 PM
tools/llvm-objcopy/CopyConfig.h
53 ↗(On Diff #184726)

tbh I have mixed feelings regarding this,

  1. so basically string comparison is just a particular case of regex - maybe we can simply use Regex and don't need this class ?
  2. khm, where is CopyConfig copied ?
alexshap added inline comments.Feb 3 2019, 9:58 PM
tools/llvm-objcopy/CopyConfig.h
53 ↗(On Diff #184726)

although my second thought is that probably it might make sense to have these two cases separate (performance, simplicity of the most popular use case).
However, maybe we can get rid of shared_ptr here ? (movable but non-copyable classes work just fine with the standard containers) i.e. use optional or (probably even better) - Regex has a default "empty" state (i.e. default constructed Regex is in this state), maybe we can just add a method "empty" (or safe bool conversion operator) to Regex

evgeny777 marked an inline comment as done.Feb 3 2019, 11:50 PM
evgeny777 added inline comments.
tools/llvm-objcopy/CopyConfig.h
53 ↗(On Diff #184726)

llvm-strip may use multiple CopyConfig instances which differ only in input/output file names.

for (const char *Filename : Positional) {
   Config.InputFilename = Filename;
   Config.OutputFilename = Filename;
   DC.CopyConfigs.push_back(Config);
 }
alexshap added inline comments.Feb 4 2019, 12:02 AM
tools/llvm-objcopy/CopyConfig.h
53 ↗(On Diff #184726)

oh, i see, this is kind of unfortunate, but i see what you are saying.

evgeny777 updated this revision to Diff 185042.Feb 4 2019, 6:49 AM

Addressed. I temporarily merged it with D57617

rupprecht accepted this revision.Feb 5 2019, 9:38 AM
This revision is now accepted and ready to land.Feb 5 2019, 9:38 AM
jhenderson accepted this revision.Feb 6 2019, 1:49 AM

LGTM, with nit fix.

tools/llvm-objcopy/CopyConfig.h
54 ↗(On Diff #185250)

Nit: mumltiple -> multiple

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptFeb 6 2019, 3:01 AM