This is an archive of the discontinued LLVM Phabricator instance.

[ELF] - Implemented --retain-symbols-file option
ClosedPublic

Authored by grimar on Dec 13 2016, 9:19 AM.

Details

Summary

--retain-symbols-file=filename
Retain only the symbols listed in the file filename, discarding all others. filename is simply a flat file, with one symbol name per line. This option is especially useful in environments (such as VxWorks) where a large global symbol table is accumulated gradually, to conserve run-time memory.

Note: though documentation says "--retain-symbols-file does not discard undefined symbols, or symbols needed for relocations.", both bfd and gold do that, and this patch too, like testcase show.

Diff Detail

Repository
rL LLVM

Event Timeline

grimar updated this revision to Diff 81238.Dec 13 2016, 9:19 AM
grimar retitled this revision from to [ELF] - Implemented --retain-symbols-file option.
grimar updated this object.
grimar added reviewers: ruiu, rafael.
grimar added subscribers: llvm-commits, grimar, evgeny777.

This is PR31334.

emaste added a subscriber: emaste.Dec 13 2016, 9:24 AM

How do we handle combinations of related options?

ELF/Config.h
39 ↗(On Diff #81238)

DiscardPolicy File seems confusing to me - in that it seems File would list the symbols to discard. Not sure of a good word to encompass Discard/Retain though.

ruiu added inline comments.Dec 13 2016, 9:32 AM
ELF/Config.h
76 ↗(On Diff #81238)

Using CachedHashStringRef doesn't make sense here. You still compute string hash values before looking up strings, so it doesn't really cache anything (unless you have a gigantic --retain-symbols-file which is unrealistic). Just use a string set.

ELF/Driver.cpp
648 ↗(On Diff #81238)

If --retain-symbol-file is used, we'll retail only the symbols listed in the file and discard all others.

grimar updated this revision to Diff 81356.Dec 14 2016, 3:15 AM
  • Addressed review comments.
ELF/Config.h
39 ↗(On Diff #81238)

I agree it was confusing. What about RetainFile ?

76 ↗(On Diff #81238)

Ok. Currently we know 2 users of this functionality. One of them uses /dev/null as file. That was done I think as workaround for gold which in case of empty file thinks it was not used, what is wrong and inconsistent with bfd.
So I would probably agree now with statement about "unrealistic", though I implemented that initially assuming it is possible to have such file.

ELF/Driver.cpp
648 ↗(On Diff #81238)

Done. Thanks.

ruiu added inline comments.Dec 14 2016, 2:02 PM
ELF/Config.h
76 ↗(On Diff #81238)

Sort.

ELF/Driver.cpp
477 ↗(On Diff #81356)

Can you return std::vector<StringRef> instead of SmallVector? We don't use SmallVector that much in our code, so I'd want to avoid using it as part of a function interface.

480 ↗(On Diff #81356)

To return a std::vector, I believe you can just do return {Arr.begin(), Arr.end()}.

498–499 ↗(On Diff #81356)

I think that you are inserting an empty string to the set because usually a text file ends with "\n", so the last line is empty. I'd do trim() and remove blank lines in getLines().

652 ↗(On Diff #81356)

This flag should be set to true whether we readFile succeded or not.

grimar updated this revision to Diff 81537.Dec 15 2016, 12:07 AM
  • Addressed review comments.
ELF/Config.h
76 ↗(On Diff #81238)

Not sure what is expected place then. I think we just put variable here in sorted groups, since there is no other StrringSets, I assumed it is sorted.
I moved that line after last llvm::StringRef then.

ELF/Driver.cpp
477 ↗(On Diff #81356)

Done.

498–499 ↗(On Diff #81356)

No I am not, last argument of split() does all job:

MB.getBuffer().split(Arr, '\n', INT32_MAX, /*KeepEmpty*/ false);

Also set would assert that:

std::pair<typename base::iterator, bool> insert(StringRef Key) {
  assert(!Key.empty());
  return base::insert(std::make_pair(Key, '\0'));
}
652 ↗(On Diff #81356)

I think we error out if something is wrong inside readFile() anyways. Though I did that change to
make code shorter.

ruiu added inline comments.Dec 15 2016, 9:24 AM
ELF/Driver.cpp
498–499 ↗(On Diff #81356)

What if the file ends with "\n " (newline and an space character)?

grimar updated this revision to Diff 81717.Dec 16 2016, 12:02 AM
  • Addressed review comments.
ELF/Driver.cpp
498–499 ↗(On Diff #81356)

It would assert, you right. Fixed, thanks !

ruiu added inline comments.Dec 16 2016, 12:08 AM
ELF/Driver.cpp
480–483 ↗(On Diff #81717)

I think plain, boring, non-higher-order code is easier to read.

for (StringRef S : Arr) {
  S = S.trim();
  if (!S.empty())
    Res.push_back(S);
}
grimar updated this revision to Diff 81718.Dec 16 2016, 12:11 AM
  • Addressed review comment.
ELF/Driver.cpp
480–483 ↗(On Diff #81717)

Looks so..

ruiu added inline comments.Dec 16 2016, 12:18 AM
ELF/Options.td
179–180 ↗(On Diff #81718)

I guess you also want to add S<"retain-symbols-file">? Please check GNU ld if it accepts that.
.

ELF/Writer.cpp
425 ↗(On Diff #81718)
// If --retain-symbols-file is given, we'll keep only symbols listed in that file.
427 ↗(On Diff #81718)

Does this compile? You are still using CachedHashString.

grimar added inline comments.Dec 16 2016, 12:26 AM
ELF/Writer.cpp
427 ↗(On Diff #81718)

Unusual, but it was. I think because of

operator StringRef() const { return val(); }

I'll fix.

grimar updated this revision to Diff 81719.Dec 16 2016, 12:40 AM
  • Addressed review comments.
ELF/Options.td
179–180 ↗(On Diff #81718)

Yes, bfd accepts S too, with single or double "-":

umb@ubuntu:~/LLVM/build$ ld.bfd --retain-symbols-file xx
ld.bfd: xx: No such file or directory
ld.bfd: no input files
umb@ubuntu:~/LLVM/build$ ld.bfd -retain-symbols-file xx
ld.bfd: xx: No such file or directory
ld.bfd: no input files
umb@ubuntu:~/LLVM/build$ ld.bfd -retain-symbols-file=xx
ld.bfd: xx: No such file or directory
ld.bfd: no input files
umb@ubuntu:~/LLVM/build$ ld.bfd --retain-symbols-file=xx
ld.bfd: xx: No such file or directory
ld.bfd: no input files

I updated testcase to check both forms.

ELF/Writer.cpp
425 ↗(On Diff #81718)

Done.

grimar updated this revision to Diff 81720.Dec 16 2016, 12:48 AM
  • clang-formatted comment.
ruiu accepted this revision.Dec 16 2016, 8:45 AM
ruiu edited edge metadata.

LGTM with a nit.

ELF/Writer.cpp
427–428 ↗(On Diff #81720)

We usually use && instead of two ifs. I'd think I've pointed out too many small errors in this review. Maybe you want to take a little bit more time to read your entire patch before requesting another round of code review? I usually read my patch many times from top to bottom until it looks perfect to me before sending it to review.

This revision is now accepted and ready to land.Dec 16 2016, 8:45 AM
This revision was automatically updated to reflect the committed changes.