This is an archive of the discontinued LLVM Phabricator instance.

[PDB] Fix bug when using multiple PCH header objects with the same name.
ClosedPublic

Authored by zturner on Aug 19 2019, 11:03 AM.

Details

Summary

A common pattern in Windows is to have all your precompiled headers use an object named stdafx.obj. If you've got a project with many different static libs, you might use a separate PCH for each one of these.

During the final link step, a file from A might reference the PCH object from A, but it will have the same name (stdafx.obj) as any other PCH from another project. The only difference will be the path. For example, A might be A/stdafx.obj while B is B/stdafx.obj.

The existing algorithm checks only the filename that was passed on the command line (or stored in archive), but this is insufficient in the case where relative paths are used, because depending on the command line object file / library order, it might find the wrong PCH object first resulting in a signature mismatch.

The fix here is to simply check whether the absolute path of the PCH object (which is stored in the input obj file for the file that references the PCH) *ends with* the full relative path of whatever is specified on the command line (or is in the archive).

Diff Detail

Event Timeline

zturner created this revision.Aug 19 2019, 11:03 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 19 2019, 11:03 AM
rnk added a comment.Aug 19 2019, 2:07 PM

Looks fine to me, but let @aganea comment since he's looked at this more than I have.

lld/COFF/PDB.cpp
520–523

Erg, this is obviously quadratic in input objects. Whatever, it's not your problem, either it shows up in the profile or it doesn't.

aganea accepted this revision.Aug 19 2019, 3:20 PM

Thanks for fixing this Zachary! Our precomp files are named after their project, ie. Engine.pch.obj

LGTM, a few comments:

lld/COFF/PDB.cpp
512

These two lines do not compile on my PC. I had to change to: SmallString<128> path1_norm{StringRef(sys::path::convert_to_slash(path1))};

520–523

This won't be an issue once the remaining steps in D59226 are done, I need to find time to send the reviews. We separately parse PCHs and (input) PDBs before any other OBJ files, and later use a hashtable for looking up the corresponding PCH.

557

You could probably remove this now, and replace usage below in createFileError with precomp.getPrecompFilePath().

This revision is now accepted and ready to land.Aug 19 2019, 3:20 PM
zturner marked 2 inline comments as done.Aug 19 2019, 3:30 PM

I know it's super lame to request this, but would one of you mind submitting on my behalf? I don't have SVN installed and it seems like git llvm push still requires it (unless there is a more modern workflow that doesn't require SVN anymore). If I find myself submitting more and more patches then I'll bite the bullet.

lld/COFF/PDB.cpp
512

Actually I'll just make them std::string instead of SmallString. That's what convert_to_slash returns anyway.

lld/test/COFF/precomp-link-samename.test
2

Note the order of object files here. It's important.

b/useprecomp -> a/precomp.

This means that when searching the list of input files for b/useprecomp, it will find a/precomp first, and since the filename 'precomp.obj' matches, the existing code would have found it, whereas the new code actually checks matches against precompa/precomp.obj , which won't match.

Ugh nevermind, you can't because of the binary files. Fineeeeee.

Ugh nevermind, you can't because of the binary files. Fineeeeee.

Send the binaries to my email.

Ugh nevermind, you can't because of the binary files. Fineeeeee.

Send the binaries to my email.

It's fine, i'll do it. I actually have another clang patch that I want to get in in the coming days, and then I might have found another bug involving manifest files, so... I can't avoid it :)

aganea added inline comments.Aug 19 2019, 5:35 PM
lld/test/COFF/precomp-link-samename.test
2

Could you please add this as a comment at the top of the test file?

thakis added a subscriber: thakis.Aug 23 2019, 5:27 AM

Is this only for cl.exe-created pch files? Those create special obj files iirc. Or does it affect clang-cl-built pch obj files as well? (If it's cl.exe-only, please mention that in the patch description.)

lld/test/COFF/precomp-link-samename.test
6

Could you add a comment that describes how the checked-in Input files were created?

Is this only for cl.exe-created pch files? Those create special obj files iirc. Or does it affect clang-cl-built pch obj files as well? (If it's cl.exe-only, please mention that in the patch description.)

Yes, this code is only for MSVC PCH.OBJ files. Clang does not generate LF_PRECOMP records.

FYI both the PCH.OBJ and referring OBJs are "special": OBJs contain only a partial .debug$T type stream, which cannot be parsed as-is because it is incomplete (the records backreference to other external records in the PCH.OBJ). The PCH.OBJ contains a .debug$P type stream, which is back-referred by the OBJ type stream. At runtime, we simply collate the .debug$P and the .debug$T for each OBJ, which yields a valid type stream ready to be merged into the PDB.

lld/COFF/PDB.cpp
512

Also, indentation.

Finally got around to trying to commit this.

Creating svn staging directory: (.git\llvm-upstream-svn)
svn staging area ready in '.git\llvm-upstream-svn'
Pushing 1 monorepo commit:
  29ff148f662 [PDB] Fix bug when using multiple PCH header objects with the same name.
`svn update --depth=files lld lld/trunk 'lld/trunk\test' 'lld/trunk\COFF' 'lld/trunk\test/COFF' 'lld/trunk\test/COFF/Inputs' 'lld/trunk\test/COFF/Inputs/precompb' 'lld/trunk\test/COFF/Inputs/precompa'` returned 1
svn: E170013: Unable to connect to a repository at URL 'https://llvm.org/svn/llvm-project'
svn: E175012: Connection timed out

Is git llvm push a thing of the past? Should I be using a new workflow?

Finally got around to trying to commit this.

Creating svn staging directory: (.git\llvm-upstream-svn)
svn staging area ready in '.git\llvm-upstream-svn'
Pushing 1 monorepo commit:
  29ff148f662 [PDB] Fix bug when using multiple PCH header objects with the same name.
`svn update --depth=files lld lld/trunk 'lld/trunk\test' 'lld/trunk\COFF' 'lld/trunk\test/COFF' 'lld/trunk\test/COFF/Inputs' 'lld/trunk\test/COFF/Inputs/precompb' 'lld/trunk\test/COFF/Inputs/precompa'` returned 1
svn: E170013: Unable to connect to a repository at URL 'https://llvm.org/svn/llvm-project'
svn: E175012: Connection timed out

Is git llvm push a thing of the past? Should I be using a new workflow?

SVN seems down, same thing for me, all requests time out.

zturner updated this revision to Diff 224164.Oct 9 2019, 2:24 PM

Rebased this onto tip of trunk. I'm *finally* going to submit this.

I had to make a few slight changes because it broke one existing test. I don't remember it breaking this test before, but anyway... The behavioral change is that now we don't distinguish between "mismatched signature" and "missing pch object". We also don't use the relative path comparison logic that I first implemented, because that breaks absolute paths. Instead, we use the previous fileNameOnly logic, but we couple that with a signature check. In other words, we continue searching until we have a match on *both* the signature and the file name. If nothing matches, we just return a generic "nothing matched" error. This changes some error text, but otherwise it should be more robust and handle every possible case with relative and absolute paths.

Planning to commit this this afternoon unless someone has objections.

rnk accepted this revision.Oct 10 2019, 1:12 PM

Go for it.

This revision was automatically updated to reflect the committed changes.