This is an archive of the discontinued LLVM Phabricator instance.

[llvm-pdbutil] Improve output of llvm-pdbutil diff mode
ClosedPublic

Authored by zturner on Jul 5 2017, 4:53 PM.

Details

Summary

We're getting to the point that some MS tools (e.g. DIA) can recognize our PDBs but others (e.g. link.exe) cannot. I think the way forward is to improve our tooling to help us find differences more easily. For example, if we can compile the same program with clang-cl and cl and have a tool tell us all the places where the PDBs differ, this could tell us what we're doing wrong. It's tricky though, because there are a lot of "benign" differences in a PDB. For example, if the string table in one PDB consists of "foo" followed by "bar" and in the other PDB it consists of "bar" followed by "foo", this is not necessarily a critical difference, as long as the uses of these strings also refer to the correct location. On the other hand, if the second PDB doesn't even contain the string "foo" at all, this is a critical difference.

diff mode has been in llvm-pdbutil for quite a while, but because of the above challenge along with some others, it's been hard to make it useful. I think this patch addresses that. It looks for all the same things, but it now prints the output in tabular format (carefully formatted and aligned into tables and fields), and it highlights critical differences in red, non-critical differences in yellow, and identical fields in green. Example:

This makes it easy to spot the places we differ, and the general concept of outputting arbitrary fields in tabular format can be extended to provide analysis into many of the different types of information that show up in a PDB.

Diff Detail

Repository
rL LLVM

Event Timeline

zturner created this revision.Jul 5 2017, 4:53 PM
rnk added inline comments.Jul 7 2017, 10:03 AM
llvm/tools/llvm-pdbutil/DiffPrinter.cpp
1 ↗(On Diff #105359)

Add the annoying LLVM banner for consistency.

llvm/tools/llvm-pdbutil/DiffPrinter.h
43 ↗(On Diff #105359)

I don't quite get this. Basically we have a comparator which says "if they're different, no big deal, they're actually equivalent". Is that more or less the idea? Maybe instead of asking the caller to pass a comparator we should have a different printEquivalent method? It seems shorter and less magical.

50–51 ↗(On Diff #105359)

Having a different entry point for "equivalent" number comparisons would let us hoist these format calls up to the next level, since they don't depend on the comparator.

zturner added inline comments.Jul 7 2017, 10:15 AM
llvm/tools/llvm-pdbutil/DiffPrinter.h
43 ↗(On Diff #105359)

Yea. Like, In pdb 1 the stream for module foo has stream index 12, but in pdb 2 it has stream index 15. These are equivalent because a stream exists in both pdbs. The number doesn't matter. If both of them had kInvalidStreamIndex, they would be identical. If one had kInvalidStreamIndex and another had a valid stream index, these would be different.

I think we need some notion of a comparator because sometimes the comparisons get rather obtuse, and we want to wrap that up in an abstraction that can be re-used over and over. For example, the stream number comparison described above. But it even gets worse. If it's a name index, then we actually have to go look in the PDB file, get the corresponding string out, and compare the strings themselves rather than the numbers.

I have some enhancements to this model in a forthcoming patch (not up for review yet, but depends on all 3 of these changes) where the class has both a format method and a compare method. Consider the case where you have two module names foo.obj and bar.obj. You want to print Module "foo.obj" and Module "bar.obj" but you only want to compare the actual file paths, but you can't compare the strings directly because for really long paths you *also* need to truncate part of the module name so that it displays like Module "...o.obj" rather than ...dule "foo.obj". So yea, it's kinda gross.

Note that the caller doesn't have to pass anything, it gets a default template argument of Identical. In this patch, the caller does have to instantiate a default-constructed instance and pass that through, but in the forthcoming patch I mentioned, all you have to do is write:

print<Equivalent>("Foo", A, B);

or

print<CustomComparator>("Foo", A, B);

and it just works.

This revision was automatically updated to reflect the committed changes.