This is an archive of the discontinued LLVM Phabricator instance.

[COFF] Move type merging to TpiSource::mergeDebugT virtual method
ClosedPublic

Authored by rnk on May 9 2020, 7:21 AM.

Details

Summary

This paves the way to doing more things in parallel, and allows us to
order type sources in dependency order. PDBs and PCH objects have to be
loaded before object files which use them.

This is a rebase of the unapplied remaining changes in
https://reviews.llvm.org/D59226. I found it very challenging to rebase
this across the LLD variable name style change. I recall there was a
tool for that, but I didn't take the time to use it.

Diff Detail

Event Timeline

rnk created this revision.May 9 2020, 7:21 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 9 2020, 7:21 AM
aganea added inline comments.
lld/COFF/DebugTypes.cpp
89

We will need tests for these I think. Same for the other fatal below.

173

I recall @santagada saying that linking .debug$H .OBJs with MSVC link.exe would fail. While browsing the microsoft-pdb repo, I was under the impression at one point that this 'Magic' field is actually a size of some sort to read more data. If link.exe treats it as a size, it would certainly fail in some way or another. This is unrelated to this patch, but it'd be good to clarify that situation.

199–249

I started re-basing this patch at some point, and one thing that I wanted to do was to move all this .debug$H calculation into llvm/lib/DebugInfo/CodeView/. This was to be able to re-create .debug$H streams from llvm-objcopy, see D59025. The end-goal was to be able to run this step (llvm-objcopy) on a remote host, after MSVC cl.exe compilation. That way, you'll speed-up LLD linking when building with MSVC, and your cached .OBJs would already contain .debug$H in all cases. Again, this could be done later, I simply wanted to raise it.

rnk marked 4 inline comments as done.May 12 2020, 12:21 PM

This next update fixes the variable naming issues, test failures, and adds a PCH test. It should be ready for review.

lld/COFF/DebugTypes.cpp
89

I did this with obj2yaml -> sed -> yaml2obj. It works, but it's fragile.

173

I have some vague recollection that things went like this:

  • LLVM added .debug$H. I wanted to ensure that it was compatible with link.exe, but I cannot confirm that this was tested thoroughly.
  • After switching Chromium to LLD, we enabled .debug$H in Chromium
  • Somebody was using link.exe to link Chromium, ran into a problem, they reported it to Microsoft, Yong Kang fixed it on the MSVC side by ignoring .debug$H

So I think the current state is that everything works, but @santagada probably did see some transient incompatibility.

199–249

Got it, that makes sense.

rnk updated this revision to Diff 263492.May 12 2020, 12:23 PM
rnk marked an inline comment as done.
  • Fix tests, add tests, fix naming
rnk edited the summary of this revision. (Show Details)May 12 2020, 12:24 PM
aganea marked an inline comment as done.May 13 2020, 9:12 AM

Thanks a lot for rebasing this!

lld/COFF/DebugTypes.cpp
89

There's also Inputs/precomp-invalid.obj which is a copy of precomp.obj (I think) but with a different signature. So maybe this was already covered?

94

Could you please ensure this fatal codepath is covered? It should be, just to be sure.

124

@santagada reported a 0.5 sec delay at shutdown, for a 800 MB .PDB, just for releasing allocated memory.

Do you think we can attach gc to the PDBLinker instead instead of it being static?

The goal would be after to integrate https://github.com/santagada/llvm-project/commit/5c8a17bbd2d7d7717f9c017e082cd37b0abbe52d

352

I'm not sure we need to check the 'Age' here. In a incremental build, the .OBJ pointing to the /Zi .PDB might not change, but the .PDB will change, incrementing its 'Age'. I think the only this that matters is the Guid.

lld/COFF/PDB.cpp
950

I was wondering if we could pre-link some libraries to speed-up linking of rarely modified projects. We're building lots of external/third-party libraries compiled from source, because we want to enforce the same settings as the rest of the gamecode (for options like /MT or -flto=thin). However they are cached and very rarely compiled. But we'd still pay the cost to merge symbols and types at link-time.
Maybe each library (or each group of libraries) could come with a /Zi-style .PDB which LLD would use instead of the library's .OBJs. If that ever happens, we'd need to merge symbols here as well.

rnk marked 5 inline comments as done.May 13 2020, 1:23 PM
rnk added a subscriber: blackhole12.
rnk added inline comments.
lld/COFF/DebugTypes.cpp
89

That one seems to cover a slightly different edge case. It has a valid S_OBJNAME record and signature, so we don't come to this fatal call.

94

Yep, it's already covered. The test just cps the PCH object and links it twice.

124

Hm, this reminds me, I seem to recall that there is a user who complains every time we add these global lists. Oh yeah: D70378 @blackhole12

In any case, I can confirm I saw @santagada's issues, I made this change recently:
rG2868ee5b3273e203d3b5d170531ecba764b2d610
We were destroying the BumpPtrAllocator used for PDB linking, instead of using the global one, which is not destroyed.

352

OK, I'll remove it. I noticed this was a functional change, I had to update one of the tests.

lld/COFF/PDB.cpp
950

True.

rnk updated this revision to Diff 263854.May 13 2020, 2:18 PM
  • Add logic to clear global tpi source lists
  • Remove check for type server PDB age

It looks like this time around, the global list that was added has been properly cleaned up in link(), so this should be fine. Of course, the fixes that were mentioned still haven't been merged, so as of right now LLVM 10's LLD linker crashes on windows when used multiple times, but apparently nobody cares about that right now. I can attempt to test this modification on my fork to see if it actually works, if people are willing to commit to actually fixing the memory cleanup errors.

santagada added inline comments.May 13 2020, 2:58 PM
lld/COFF/DebugTypes.cpp
173

We were using until very recently the 2015 link.exe because all the newer ones generate problems to debug when fast linking. Ms fixed those on 2019 .4 release so I can try again.

Still we are so close to moving away from link.exe completely. But thanks for the info

lld/COFF/PDB.cpp
950

What I would like is to generate static libraries that are more than just a tar of obj, but actually have a merged type and possibly symbol tables. That would make linking much faster as you can distribute those libs and reuse most of them. Ar files have a place for these, maybe we could put a empty .obj with a specific name on a .lib and read from there when available?

aganea accepted this revision.EditedMay 14 2020, 8:14 AM
aganea marked an inline comment as done.

LGTM, thanks again! (minor comment below)

It looks like this time around, the global list that was added has been properly cleaned up in link(), so this should be fine. Of course, the fixes that were mentioned still haven't been merged, so as of right now LLVM 10's LLD linker crashes on windows when used multiple times, but apparently nobody cares about that right now. I can attempt to test this modification on my fork to see if it actually works, if people are willing to commit to actually fixing the memory cleanup errors.

I do care. I fixed a few similar issues on our end. We have a similar scenario where the LLD excutable is loaded as a DLL and repeatly called from multiple threads by the build system (see this for more info, especially the last part). However in addition to the issues you're seeing, in our case the linker can run with different inputs, in several threads. Any static is a no-go. I concur with your suggestion of making everything as part of a context, and not have statics.

lld/COFF/DebugTypes.cpp
364

@rnk : If we're using sys::path::Style::windows below, we should do the same here?

lld/COFF/PDB.cpp
950

@santagada : Yeah that's what cl /Zi does, it stamps a .PDB redirection into the .OBJ type stream:

0x1000 : Length = 58, Leaf = 0x1515 LF_TYPESERVER2
                GUID={2760BCA6-C486-455B-833A-13B67CA4FA9A}, age = 0x00000001, PDB name = 'F:\llvm-project\__test\vc140.pdb

Except that the .PDB only contains the type stream, not the symbol stream, which remains in the .OBJ. The type stream is self-standing, whereas symbols need references to other sections in the .OBJ. MSVC does this with an external application (mspdbsrv.exe), so it would be a bit complicated to merge the symbols on the fly, multithreaded.

In our case, this would be a deferred step, not a 'live' step like MSVC. We could call lld-link in a special pre-link mode. At least that's the idea.

This revision is now accepted and ready to land.May 14 2020, 8:14 AM
rnk marked 2 inline comments as done.May 14 2020, 9:27 AM
rnk added inline comments.
lld/COFF/PDB.cpp
950

Hm, a special pre-link step sounds a lot like the Unix ld -r mode to produce a relocatable object. So, the build system's responsibility is to partition the objects into objects-under-development and the rest, and then ld -r the rest into a semi-static pre-linked blob. Interesting.

rnk marked an inline comment as done.May 14 2020, 9:44 AM

Thanks! I measured time and memory usage on a particular binary, and this change had no measurable effect, which is good.

lld/COFF/PDB.cpp
990

Before committing, I discovered that this results in warnings on any object that lacks debug info, such as the ones in the CRT. I don't think this warning is necessary, so I'll remove it and commit.

aganea added inline comments.May 14 2020, 10:06 AM
lld/COFF/PDB.cpp
950

Yes something along those lines. We could further partition into the 'rarely changed .LIBs' and the 'unmodified locally .LIBs'. The 'rarely changed' would be in the network cache and will not be built most of the time. The 'unmodified locally' would end up as a large blob, linked in every time the user modifies a file in a given project. On one hand you would have the pre-linked blob, and on the other hand only the .LIB that you're modifying currently. This would keep things deterministic, ie. the same link order. Most of the users rarely change files across projects, and if they do, it's only a few projects.

Another alternative to this would be an incremental link mode, where we would save all internal structures raw to disk. But that's a bit more complicated, as we need to add extra information to the runtime. For example for tracking down relationships between which .OBJ inserted which record. This would allow removing records for "stale" .OBJs and inserting records from the "new" .OBJs. But I think there's higher ROI for entirely parallelizing the COFF linker first.

This revision was automatically updated to reflect the committed changes.