This is an archive of the discontinued LLVM Phabricator instance.

[PDB] Print the most redundant type record indices with /summary
ClosedPublic

Authored by rnk on Dec 12 2019, 2:31 PM.

Details

Summary

I used this information to motivate splitting up the Intrinsic::ID enum
(5d986953c8b917bacfaa1f800fc1e242559f76be) and adding a key method to
clang::Sema (586f65d31f32ca6bc8cfdb8a4f61bee5057bf6c8) which saved a
fair amount of object file size.

Example output for clang.pdb:

Top 10 types responsible for the most TPI input bytes:
       index     total bytes   count     size
      0x3890:      8,671,220 = 1,805 *  4,804
     0xE13BE:      5,634,720 =   252 * 22,360
     0x6874C:      5,181,600 =   408 * 12,700
      0x2A1F:      4,520,528 = 1,574 *  2,872
     0x64BFF:      4,024,020 =   469 *  8,580
      0x1123:      4,012,020 = 2,157 *  1,860
      0x6952:      3,753,792 =   912 *  4,116
      0xC16F:      3,630,888 =   633 *  5,736
      0x69DD:      3,601,160 =   985 *  3,656
      0x678D:      3,577,904 =   319 * 11,216

In this case, we can see that record 0x3890 is responsible for ~8MB of
total object file size for objects in clang.

The user can then use llvm-pdbutil to find out what the record is:

$ llvm-pdbutil dump -types -type-index 0x3890
                     Types (TPI Stream)
============================================================
  Showing 1 records.
     0x3890 | LF_FIELDLIST [size = 4804]
              - LF_STMEMBER [name = `WORDTYPE_MAX`, type = 0x1001, attrs = public]
              - LF_MEMBER [name = `U`, Type = 0x37F0, offset = 0, attrs = private]
              - LF_MEMBER [name = `BitWidth`, Type = 0x0075 (unsigned), offset = 8, attrs = private]
              - LF_METHOD [name = `APInt`, # overloads = 8, overload list = 0x3805]
...

In this case, we can see that these are members of the APInt class,
which is emitted in 1805 object files.

The next largest type is ASTContext:

$ llvm-pdbutil dump -types -type-index 0xE13BE bin/clang.pdb
    0xE13BE | LF_FIELDLIST [size = 22360]
              - LF_BCLASS
                type = 0x653EA, offset = 0, attrs = public
              - LF_MEMBER [name = `Types`, Type = 0x653EB, offset = 8, attrs = private]
              - LF_MEMBER [name = `ExtQualNodes`, Type = 0x653EC, offset = 24, attrs = private]
              - LF_MEMBER [name = `ComplexTypes`, Type = 0x653ED, offset = 48, attrs = private]
              - LF_MEMBER [name = `PointerTypes`, Type = 0x653EE, offset = 72, attrs = private]
...

ASTContext only appears 252 times, but the list of members is long, and
must be repeated everywhere it is used.

This was the output before I split Intrinsic::ID:

Top 10 types responsible for the most TPI input:
      0x686C:     69,823,920 = 1,070 * 65,256
      0x686D:     69,819,640 = 1,070 * 65,252
      0x686E:     69,819,640 = 1,070 * 65,252
      0x686B:     16,371,000 = 1,070 * 15,300
      ...

These records were all lists of intrinsic enums.

Diff Detail

Event Timeline

rnk created this revision.Dec 12 2019, 2:31 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 12 2019, 2:31 PM
Herald added a subscriber: mgrang. · View Herald Transcript
rnk updated this revision to Diff 233686.Dec 12 2019, 2:48 PM
  • share isIdRecord
ruiu added a comment.Dec 12 2019, 10:04 PM

Looks like this is an interesting feature, but the output might be a bit too spartan? It feels to me that it is not easy to make this feature's output into an actionable item unless you are not familiar with the debug record format. I wonder if there's an way to print out some "recommendation" to the user, say "<some class> is repeated xxx times. consider moving it out of a header file if it is written in a header.".

rnk added a comment.Dec 13 2019, 1:23 PM

Do you think I could just print a note that says "run llvm-pdbutil dump -types -type-index 0xNN path/to/foo.pdb to view a type record"? I could add more functionality to textually dump the type as more than a hex number to try to get closer to something actionable for the user, but I hesitate to duplicate functionality already available in llvm-pdbutil. And I wouldn't want LLD to depend on all the llvm-pdbutil dumping code, either.

ruiu added a comment.Dec 16 2019, 9:03 PM

Yeah, I guess that showing that tip line would help users.

rnk updated this revision to Diff 234616.Dec 18 2019, 2:03 PM
  • print llvm-pdbutil command
  • fix variable names

Unit tests: pass. 61012 tests passed, 0 failed and 728 were skipped.

clang-tidy: fail. Please fix clang-tidy findings.

clang-format: pass.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

The PDB subject made me worry that I don't have enough knowledge to review it. Now I see it is a new feature and is related to debug info, which I really should learn more. I'll take some time tomorrow to understand it...

MaskRay added inline comments.Dec 21 2019, 5:26 PM
lld/COFF/PDB.cpp
431

Can tpiCounts be a non-empty vector before resize()?

If yes, assign(tMerger.getTypeTable().size(), 0);

437

Excuse my ignorance. What does "Untranslated or other" mean? I cannot find "untranslated" at other call sites of isSimple.

519

auto -> TypeIndex

1394

I've seen SmallVector<..., 0> in some code. Does that has any advantage over std::vector<...>?

rnk marked 4 inline comments as done.Dec 30 2019, 2:38 PM
rnk added a subscriber: bkramer.
rnk added inline comments.
lld/COFF/PDB.cpp
431

Yes, it can be non-empty, but why is assign better? resize() does zero initialization, it is not reserve.

437

Type merging can fail for some records, and rather than failing the whole link, the the record is discarded. Any indexes referring to it are replaced with SimpleTypeKind::NotTranslated, which is a simple type. The programmer could reasonably assume that complex types always map to complex types, but this is the common case where that assumption won't hold. The toArrayIndex call below will assert in that case.

1394
rnk updated this revision to Diff 235649.Dec 30 2019, 2:39 PM
  • some changes

Unit tests: pass. 61147 tests passed, 0 failed and 728 were skipped.

clang-tidy: fail. Please fix clang-tidy findings.

clang-format: pass.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

MaskRay accepted this revision.Dec 30 2019, 3:57 PM
MaskRay added inline comments.
lld/COFF/PDB.cpp
431

If it was non-empty, existing non-zero values would not be zeroed.

If the current size is less than count,

1) additional default-inserted elements are appended
This revision is now accepted and ready to land.Dec 30 2019, 3:57 PM
rnk marked an inline comment as done.Jan 2 2020, 4:08 PM
rnk added inline comments.
lld/COFF/PDB.cpp
431

If it was non-empty, existing non-zero values would not be zeroed.

That is the intended behavior. I can't think of another good way to express this: grow the array, fill it with zeros, do it in a way that will trigger quadratic growth. Maybe a longer .insert call? .insert(tpiCounts.end(), size, 0)?

Well, I will commit it this way for now.

This revision was automatically updated to reflect the committed changes.
MaskRay added inline comments.Jan 2 2020, 5:14 PM
lld/COFF/PDB.cpp
431

Sorry I thought you wanted to zero the whole array. resize is the best to insert zeros while keeping existing values unchanged.