This is an archive of the discontinued LLVM Phabricator instance.

Change TPI Bucket size for PDBs from minimum to maximum
Closed, Public

Authored by CJHebert on Jan 18 2019, 2:50 PM.

Details

Summary

This patch changes the bucket count for the TPI and IPI streams in PDBs generated via LLVM from the minimum value to the maximum value. Changing this value improves symbol lookup performance for PDBs with large numbers of entries in the TPI and IPI streams.

In the microsoft-pdb repo published to support LLVM's implementation of PDB support, the provided code initializes the bucket count for the TPI and IPI streams to the maximum size. This occurs in tpi.cpp L33 and tpi.cpp L398. In the LLVM code for generating PDBs, these streams are created with the minimum number of buckets. This difference makes LLVM-generated PDBs slower when used for debugging.
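To make the performance intuition concrete, here is a minimal, self-contained C++ sketch (not part of the patch; the record count is made up for illustration, and the two constants mirror the minimum/maximum bucket values discussed in this review):

```cpp
#include <cstdint>
#include <cstdio>

int main() {
  // Bucket-count limits as described in the review; treat the exact
  // values as illustrative assumptions here.
  const uint32_t MinTpiHashBuckets = 0x1000;  // 4,096
  const uint32_t MaxTpiHashBuckets = 0x40000; // 262,144
  const uint32_t NumTypeRecords = 1000000;    // a large PDB such as chrome.pdb

  // With chained hashing, the expected number of probes per lookup grows
  // with the average chain length, i.e. records / buckets.
  printf("avg chain, min buckets: %.1f\n",
         (double)NumTypeRecords / MinTpiHashBuckets);       // ~244.1
  printf("avg chain, max buckets: %.1f\n",
         (double)NumTypeRecords / (MaxTpiHashBuckets - 1)); // ~3.8
  return 0;
}
```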

Diff Detail

Repository
rLLD LLVM Linker

Event Timeline

CJHebert created this revision. Jan 18 2019, 2:50 PM

For context, this change reduced the time for a cold symbol lookup on chrome.pdb from 10 minutes to 2 minutes.

zturner accepted this revision. Jan 18 2019, 3:23 PM
zturner added subscribers: aganea, rnk.

Excellent find! Alexandre Ganea (cc'ed) reported to me offline that he had seen degraded watch-window performance when using clang-cl generated code, but we only had some guesses as to what was causing it (none of which was this). However, after reading your explanation, I'm pretty certain it had to be this. I don't think we would have discovered this without your patch, so thanks!

Do you have commit access?

This revision is now accepted and ready to land. Jan 18 2019, 3:23 PM

I don't have commit access.

aganea added a comment (edited). Jan 19 2019, 10:34 AM

Thank you for this @CJHebert !

I was wondering:

  1. Can we do an adaptive scheme, based on TypeHashes.size()? Some of us have lots of DLLs loaded in the process; this might needlessly increase the debugger's memory usage.
  2. Is MaxTpiHashBuckets - 1 (why not MaxTpiHashBuckets itself?) really a hard limit, or would VS support higher values?

For context, this change reduced the time for a cold symbol lookup on chrome.pdb from 10 minutes to 2 minutes.

  1. What takes 2 min? For the VS debugger to look up a symbol when you debug break?
  1. Can we do an adaptive scheme, based on TypeHashes.size()? Some of us have lots of DLLs loaded in the process; this might needlessly increase the debugger's memory usage.

I don't know the answer to this one directly. This is the value that the MSVC toolchain appears to use for all PDBs, and it is the default used in the microsoft-pdb GitHub repo when making new PDBs. I have no particular knowledge here, but from the repo I suspect that support for other counts exists only for compatibility with old PDBs that may have used an adaptive algorithm like the one you mention.
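For illustration only, an adaptive scheme along the lines aganea suggests might look like the following sketch. This is NOT what the patch does (the patch uses the fixed maximum); chooseBucketCount is an invented name, and it assumes the number of type hashes is known at layout time:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Hypothetical: pick a bucket count proportional to the number of type
// hashes, clamped to the documented [min, max) range.
uint32_t chooseBucketCount(size_t NumTypeHashes) {
  const uint32_t MinTpiHashBuckets = 0x1000;
  const uint32_t MaxTpiHashBuckets = 0x40000;
  // Double the bucket count until the average chain length would be
  // about one entry per bucket, then clamp to the legal range.
  uint64_t Buckets = MinTpiHashBuckets;
  while (Buckets < NumTypeHashes && Buckets < MaxTpiHashBuckets - 1)
    Buckets *= 2;
  return (uint32_t)std::min<uint64_t>(Buckets, MaxTpiHashBuckets - 1);
}
```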

  2. Is MaxTpiHashBuckets - 1 (why not MaxTpiHashBuckets itself?) really a hard limit, or would VS support higher values?

This constant comes from the microsoft-pdb GitHub repo and appears to be the cap for these streams. The bound is enforced via count < MaxTpiHashBuckets in LLVM and via cchnMax in microsoft-pdb, so subtracting one produces the largest count that does not violate the check. I don't know how PDBs with higher values would work with existing tools.
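As a hedged paraphrase of the bound being described (bucketCountInRange is an invented name; the real check lives in the LLVM PDB reader, and only the strict upper bound is quoted in this review):

```cpp
#include <cstdint>

const uint32_t MaxTpiHashBuckets = 0x40000;

// A value equal to MaxTpiHashBuckets is rejected by the strict
// comparison, so MaxTpiHashBuckets - 1 is the largest count that passes.
bool bucketCountInRange(uint32_t Count) {
  return Count < MaxTpiHashBuckets;
}
```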

For context, this change reduced the time for a cold symbol lookup on chrome.pdb from 10 minutes to 2 minutes.

  1. What takes 2 min? For the VS debugger to look up a symbol when you debug break?

The specific scenario that improved was cold lookup of symbols when using the dx command in WinDbg. This command and the Visual Studio watch window use Natvis, which appears to cause recursive symbol lookups. The mentioned chrome.pdb is massive at 1.4 GB, but the behavior for smaller PDBs also improved.

When I apply this change locally, I get some test failures in LLD. Essentially, if you run llvm-pdbutil dump -types -type-extras foo.pdb, we do a crude hash verification: we take the hash that is in the PDB, recompute what we think the hash should be, and compare the two; if they're not equal, we indicate that with a message. This message is triggered in the LLD tests, which makes me think that either something is wrong with the test or something is wrong with the way we do the hash verification.

I'll have more time to see which one of these it is tomorrow.
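The verification described in the previous comment has roughly the following shape (a sketch; TypeRecord, hashRecord, and verifyTpiHashes are stand-ins rather than the real llvm-pdbutil internals, and the real code uses a PDB-specific hash, not the toy hash below):

```cpp
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

struct TypeRecord {
  std::string Bytes;   // serialized record contents (stand-in)
  uint32_t StoredHash; // bucket value read from the TPI hash stream
};

// Toy stand-in for the real record hash; only the shape of the check
// matters here, not the hash function itself.
uint32_t hashRecord(const std::string &Bytes, uint32_t NumHashBuckets) {
  uint32_t H = 0;
  for (unsigned char C : Bytes)
    H = H * 31 + C;
  return H % NumHashBuckets;
}

// Recompute each record's hash and compare it with the stored value,
// printing a message on mismatch; this is the kind of message the LLD
// tests trip over.
void verifyTpiHashes(const std::vector<TypeRecord> &Records,
                     uint32_t NumHashBuckets) {
  for (size_t I = 0; I < Records.size(); ++I) {
    uint32_t Expected = hashRecord(Records[I].Bytes, NumHashBuckets);
    if (Expected != Records[I].StoredHash)
      printf("record %zu: invalid hash (stored %u, expected %u)\n", I,
             Records[I].StoredHash, Expected);
  }
}
```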

This revision was automatically updated to reflect the committed changes.