This is an archive of the discontinued LLVM Phabricator instance.

[PDB] Fix linking of function symbols and local variables
ClosedPublic

Authored by zturner on Aug 7 2017, 1:30 PM.

Details

Summary
[PDB] Fix linking of function symbols and local variables.

The compiler outputs PROC32_ID symbols into the object files
for functions, and these symbols have an embedded type index
which, when copied to the PDB, refer to the IPI stream.  However,
the symbols themselves are also converted into regular symbols
(e.g. S_GPROC32_ID -> S_GPROC32), and type indices in the regular
symbol records refer to the TPI stream.  So this patch applies
two fixes to function records.
  1. It converts ID symbols to the proper non-ID record type.
  2. After remapping the type index from the object file's index
     space to the PDB file/IPI stream's index space, it then
     remaps that index to the TPI stream's index space by.

Besides functions, during the remapping process we were also
discarding symbol record types which we did not recognize.  In
particular, we were discarding S_BPREL32 records, which is what
MSVC uses to describe local variables on the stack.  So this
patch fixes that as well by copying them to the PDB.

Diff Detail

Repository
rL LLVM

Event Timeline

zturner created this revision.Aug 7 2017, 1:30 PM
rnk added inline comments.Aug 7 2017, 2:06 PM
lld/COFF/PDB.cpp
334 ↗(On Diff #110078)

This thing isn't really canonicalizing the symbol kind anymore, it's rewriting the symbol record. Maybe name it something like rewriteSymbolForPDB?

361 ↗(On Diff #110078)

We need an error check for OOB IPI index

362–365 ↗(On Diff #110078)

In theory, we could do the same discoverTypeIndices trick that we do above to find the type index reference. Alternatively, if we know it's an LF_FUNC_ID, we should load the appropriate offset. Are these ever LF_MFUNC_IDs, though?

488 ↗(On Diff #110078)

Let's keep this as part of the copySymbolForPdb.

llvm/lib/DebugInfo/CodeView/TypeIndexDiscovery.cpp
401 ↗(On Diff #110078)

Maybe add S_BPREL32 before this? They're related, BPRELSYM32 vs REGREL32.

llvm/tools/llvm-pdbutil/MinimalSymbolDumper.cpp
740 ↗(On Diff #110078)

Remove the FIXME, it's done now. :)

zturner added inline comments.Aug 7 2017, 4:00 PM
lld/COFF/PDB.cpp
361 ↗(On Diff #110078)

That should have already happened when we merged the type streams. Wouldn't we have discarded any records that have invalid type index references, rather than emitting them with bad indices?

362–365 ↗(On Diff #110078)

We could just load the appropriate offset, but that would require hardcoding the offset, which is duplication of magic numbers that I'd like to avoid. I chose to do it this way because it makes the code more readable and is ultimately just copying some fields.

zturner added inline comments.Aug 7 2017, 6:01 PM
lld/COFF/PDB.cpp
488 ↗(On Diff #110078)

I actually think it should be here. I tried moving it into copySymbolForPdb and the code becomes really ugly. You end up having to duplicate a lot of the logic from remapTypesInSymbolRecord but forcing it to remap into the TPI stream's index space. By doing it this way, each step is logically independent. copy symbol literally just copies some memory like the comment says. remap types does exactly what it says, and so does canonicalize. I can call it remapSymbolKind or something like that if you prefer, but I kind of like the idea of keeping it as a sequence of 3 independent steps like this.

rnk added inline comments.Aug 7 2017, 6:48 PM
lld/COFF/PDB.cpp
334 ↗(On Diff #110078)

I still like rewriteSymbolForPDB

361 ↗(On Diff #110078)

Yeah, that makes sense. Any invalid item index would've become the simple "not translated" type index.

488 ↗(On Diff #110078)

Right, I agree, it should happen after remapping. Let's call it something else, though. We're doing a lot more than canonicalizing the symbol kind. Ultimately, we're going to need logic here for filtering out S_UDT records.

365 ↗(On Diff #110121)

These can be LF_MFUNC_ID records, we need to handle that.

489 ↗(On Diff #110121)

Rather than having this assert, let's remove the return type from canonicalizeSymbolKind and do SymbolKind NewKind = symbolKind(NewData);. It's one load.

490–492 ↗(On Diff #110121)

These != assertions are trivial, IMO. They also don't represent internal errors, we emitted those records for a while, and cvdump still accepted our PDBs.

llvm/lib/DebugInfo/CodeView/TypeIndexDiscovery.cpp
401 ↗(On Diff #110078)

ping?

zturner updated this revision to Diff 110127.Aug 7 2017, 7:39 PM

I think I uploaded the wrong diff last time. Hopefully this one is right.

rnk edited edge metadata.Aug 8 2017, 10:24 AM

Only one comment, but it does need to be addressed before committing. I'm surprised we don't hit this case anywhere in the LLD test suite. I guess we mostly have C test case inputs, not C++.

lld/COFF/PDB.cpp
365 ↗(On Diff #110127)

This would fail if it were an LF_MFUNC_ID

This revision was automatically updated to reflect the committed changes.