This is an archive of the discontinued LLVM Phabricator instance.

[ELF] Support expressions with -defsym option
ClosedPublic

Authored by phosek on Nov 1 2017, 2:40 PM.

Diff Detail

Repository
rL LLVM

Event Timeline

phosek created this revision.Nov 1 2017, 2:40 PM
phosek updated this revision to Diff 121218.Nov 1 2017, 5:47 PM
grimar added a subscriber: grimar.Nov 2 2017, 2:14 AM

I think I like this. Few comments:

ELF/ScriptLexer.cpp
206 ↗(On Diff #121218)

What is this code for ? I tried following and no testcases failed for me:

if (E == StringRef::npos) {
  assert(false);
  Ret.push_back(S);
  return Ret;
}
229 ↗(On Diff #121218)

I think instead of passing function as argument it could probably be better to always call maybeSplit
which would do nothing for case when '!InExpr' or '!InDefsym' and it could call either tokenizeDefsym or tokenizeExpr
inside.

ELF/ScriptLexer.h
41 ↗(On Diff #121218)

Should this be single new enum variable ?

ruiu added inline comments.Nov 2 2017, 1:37 PM
ELF/ScriptLexer.cpp
200 ↗(On Diff #121218)

Please expand the comment to explain what form of input is expected.

ELF/ScriptParser.cpp
274 ↗(On Diff #121218)

You create a new instance of ScriptParser for each -defsym string, so Orig is always the default value which is false.

phosek updated this revision to Diff 121563.Nov 3 2017, 4:19 PM
phosek marked 5 inline comments as done.
ruiu added inline comments.Nov 3 2017, 4:32 PM
ELF/ScriptParser.cpp
1342 ↗(On Diff #121563)

Actually, I think a better way of doing this is to pass two StringRefs, one for symbol name and the other for expression to this function. Introducing a different tokenization scheme only to separate "<sym>=<expr>" into <sym> and <expr> seems a bit too much. You probably should split a -defsym argument into two StringRefs in the driver.

phosek added inline comments.Nov 3 2017, 5:04 PM
ELF/ScriptLexer.cpp
206 ↗(On Diff #121218)

It's the -defsym=foo case, I've added test case for this.

ELF/ScriptParser.cpp
1342 ↗(On Diff #121563)

That's what I've done in the first version: https://reviews.llvm.org/D39511?id=121190. Would you prefer that approach? The current approach is more akin to what gold and BFD.ld does. The main advantage of this approach is that the error messages that come out of script parser have the right offset, i.e. for -defsym=foo=bar, the error at offset 1 points to f while with the original approach it'd be b.

274 ↗(On Diff #121218)

I've changed this to use enum to represent the context.

ruiu added inline comments.Nov 3 2017, 5:13 PM
ELF/ScriptParser.cpp
1342 ↗(On Diff #121563)

Yes, I think I prefer that over this patch. (A reason why you added defsym'ed symbols to the symbol table using addSymbolAlias in that patch isn't clear to me though.)

phosek updated this revision to Diff 121570.Nov 3 2017, 5:43 PM
phosek added inline comments.
ELF/ScriptParser.cpp
1342 ↗(On Diff #121563)

That was just a leftover code, that should be fixed now.

ruiu accepted this revision.Nov 3 2017, 5:47 PM

LGTM

I think this is much better than the previous version.

ELF/Driver.cpp
1047–1048 ↗(On Diff #121570)

nit: you can eliminate by MB by readDefsym(From, MemoryBufferRef(To, "-defsym")).

ELF/ScriptParser.cpp
274 ↗(On Diff #121570)

After this, you should call atEOF() to make sure that you've read all characters. Otherwise, I think you'll miss an error in `-defsym="(1+2)++++" for example, as readExpr will return at the closing parenthesis.

This revision is now accepted and ready to land.Nov 3 2017, 5:47 PM
phosek updated this revision to Diff 121573.Nov 3 2017, 5:58 PM
phosek marked 2 inline comments as done.
This revision was automatically updated to reflect the committed changes.