This is an archive of the discontinued LLVM Phabricator instance.

[lld-link] Replace LazyObjFile with lazy ObjFile/BitcodeFile
ClosedPublic

Authored by MaskRay on Dec 31 2021, 12:19 AM.

Details

Summary

Similar to ELF 3a5fb57393c3bc77be9e7afc2ec9d4ec3c9bbf70.

  • previously when a LazyObjFile was extracted, a new ObjFile/BitcodeFile was created; now the file is reused, just with lazy cleared
  • avoid the confusing transfer of symbols from LazyObjFile to the new file
  • simpler code, smaller executable (5200+ bytes smaller on x86-64)
  • make eager parsing feasible (for parallel section/symbol table initialization)

Diff Detail

Event Timeline

MaskRay created this revision.Dec 31 2021, 12:19 AM
MaskRay requested review of this revision.Dec 31 2021, 12:19 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 31 2021, 12:19 AM
aganea added inline comments.Dec 31 2021, 8:12 AM
lld/COFF/Driver.cpp
211

Add , bool lazy) to constructor and then this code should read: ctx.symtab.addFile(make<BitcodeFile>(ctx, mbref, "", 0, lazy));.
Same below for ObjFile.

lld/COFF/InputFiles.cpp
138

Same comment as BitcodeFile, move code into ObjFile::parse()

1082

Can we move this to BitcodeFile::parse() and do something like:

void BitcodeFile::parse() {
  if (lazy) {
    std::unique_ptr<lto::InputFile> obj = CHECK(lto::InputFile::create(mb), this);
    for (const lto::InputFile::Symbol &sym : obj->symbols())
      if (!sym.isUndefined())
        ctx.symtab.addLazyObject(this, sym.getName());
    return;
  }
  ...
lld/COFF/InputFiles.h
98

Can you somehow move this below after fileKind to prevent excessive padding? Otherwise the bool will take 64-bits :-)
See my other comments, this doesn't need to be public.

371

Add , bool lazy) to constructor.

lld/COFF/SymbolTable.cpp
40–54

I wonder if it's not better to leave types to do their own testing of lazy internally, in Type::parse(), and act accordingly (instead of doing it outside like here).
Same thing applies to pushing to the instances, shouldn't parse() do it internally? (but that could be done later)
We should be ending with just file->parse(); here.

585–587

You shouldn't be needing this test & assignment here (or it could be an assert?).

MaskRay added inline comments.Jan 1 2022, 5:21 PM
lld/COFF/SymbolTable.cpp
40–54

The reason parse calling parseLazy may feel better is because the function is in SymbolTable (while it can be placed into InputFile, too), so lazy has to be public to be accessed here.

If the function is in InputFile, this function dispatching to parse and parseLazy will be superior in my view as we can avoid the slightly unnecessary abstraction that parse returns parseLazy...


If PE/COFF needs the lazyBitcodeFiles as ELF does (Bazel distributed ThinLTO requirement), handling lazy in this function may be better, too.

MaskRay updated this revision to Diff 397145.Jan 3 2022, 2:25 PM

Add lazy parameter to InputFile/ObjFile/BitcodeFile

MaskRay updated this revision to Diff 397152.Jan 3 2022, 3:06 PM
MaskRay marked 4 inline comments as done.

address some comments

lld/COFF/InputFiles.cpp
138

Sam reply - the function will likely become more complex. A separate function will justify in the future.

aganea accepted this revision.Jan 3 2022, 7:45 PM

LGTM, a few minor comments:

lld/COFF/SymbolTable.cpp
40–54

The reason parse calling parseLazy may feel better is because the function is in SymbolTable (while it can be placed into InputFile, too), so lazy has to be public to be accessed here.

If the function is in InputFile, this function dispatching to parse and parseLazy will be superior in my view as we can avoid the slightly unnecessary abstraction that parse returns parseLazy...

Fair enough, let's keep them apart.

If PE/COFF needs the lazyBitcodeFiles as ELF does (Bazel distributed ThinLTO requirement),

Do you need that (ie.thinLTOCreateEmptyIndexFiles()) because you have to emit exactly the same number of index files as there are input objects? If LLD exits with no error, emits less index files than inputs, could the build system assume that only those files are to be distributed?

585

Is it more appropriate to move this assert at the top?

587

You can just write addFile(f);

This revision is now accepted and ready to land.Jan 3 2022, 7:45 PM
MaskRay updated this revision to Diff 397334.Jan 4 2022, 10:27 AM
MaskRay marked 3 inline comments as done.

address comments

MaskRay marked 3 inline comments as done.Jan 4 2022, 10:27 AM
MaskRay added inline comments.
lld/COFF/SymbolTable.cpp
40–54

lld/ELF needs thinLTOCreateEmptyIndexFiles because Bazel's distributed ThinLTO requires it (--thinlto-index-only). Bazel needs it because it's the build system that requires all specified input and output and the output needs to be unconditional.

OK, it seems that lld/COFF uses a different strategy (D64461) which avoids lazyBitcodeFiles...

rnk accepted this revision.Jan 4 2022, 2:06 PM

Looks good to me

lld/COFF/SymbolTable.cpp
42

If parse is virtual, should parseLazy be virtual as well? Doing O(objects) virtual calls shouldn't matter for performance, it's the O(symbols) actions that matter.

MaskRay added inline comments.Jan 4 2022, 2:09 PM
lld/COFF/SymbolTable.cpp
42

The virtual dispatch cost doesn't matter.

If parseLazy is made virtual, virtual void InputFile::parseLazy() = 0; will be needed as many file kinds don't have reasonable parseLazy semantics. That may not be clearer than open coding the two possible file kinds here.

MaskRay edited the summary of this revision. (Show Details)Jan 4 2022, 3:11 PM