This is an archive of the discontinued LLVM Phabricator instance.

[ThinLTO] Use MD5 hash in function index.
ClosedPublic

Authored by tejohnson on Feb 9 2016, 7:06 AM.

Details

Summary

This patch uses the lower 64-bits of the MD5 hash of a function name as
a GUID in the function index, instead of storing function names. Any
local functions are first given a global name by prepending the original
source file name. This is the same naming scheme and GUID used by PGO in
the indexed profile format.

This change has a couple of benefits. The primary benefit is size
reduction in the combined index file, for example 483.xalancbmk's
combined index file was reduced by around 70%. It should also result in
memory savings for the index file in memory, as the in-memory map is
also indexed by the hash instead of the string.

Second, this enables integration with indirect call promotion, since the
indirect call profile targets are recorded using the same global naming
convention and hash. This will enable the function importer to easily
locate function summaries for indirect call profile targets to enable
their import and subsequent promotion.

The original source file name is recorded in the bitcode in a new
module-level record for use in the ThinLTO backend pipeline.

Diff Detail

Repository
rL LLVM

Event Timeline

tejohnson updated this revision to Diff 47317.Feb 9 2016, 7:06 AM
tejohnson retitled this revision from to [ThinLTO] Use MD5 hash in function index..
tejohnson updated this object.
tejohnson added reviewers: davidxl, mehdi_amini.
tejohnson added a subscriber: llvm-commits.
davidxl added inline comments.Feb 9 2016, 9:18 AM
lib/Bitcode/Reader/BitcodeReader.cpp
5466 ↗(On Diff #47317)

Is the assert unrelated change?

lib/Bitcode/Writer/BitcodeWriter.cpp
823 ↗(On Diff #47317)

This code looks like a common utility (string emission) -- is there any code reuse opportunity?

tejohnson added inline comments.Feb 9 2016, 9:30 AM
lib/Bitcode/Reader/BitcodeReader.cpp
5466 ↗(On Diff #47317)

It is related as we can't support lazy function summary reading for the per-module index since we need the function summary to get the global identifier below. Nor do we need to support lazy function summary reading here, that was meant for the combined index reading and the support was probably inadvertently cloned here.

lib/Bitcode/Writer/BitcodeWriter.cpp
823 ↗(On Diff #47317)

In other places where we set up string encoding abbrevs are structured a bit differently since we are emitting multiple strings (e.g. the VST). Thus they emit all 3 abbrevs unconditionally, then for each string to be emitted invoke getStringEncoding, and use the appropriate abbrev.

davidxl edited edge metadata.Feb 9 2016, 4:41 PM

Any test on the recorded source file name?

include/llvm/IR/FunctionInfo.h
151 ↗(On Diff #47317)

I am a little concerned with using DenseMap data structure here.

Both DenseMap and StringMap are implemented as open hashtab with quadratic probing. The load factor of both are guaranteed to < 0.75 -- that means there are lots of empty buckets in a large table. The main differences are:

  1. StringMap uses indirection -- each bucket contains pointer to the key-value pair object, so the bucket size is small. The copy is also cheaper when rehashing happens. However DenseMap is key-value pair is embedded inside the bucket. That means if the value type has large size, it will cause lots of waste in memory. The copy of dense map can also cause large overhead.
  2. When resizing, DenseMap will round up the new size to the next power of 2 value -- this further increases the memory overhead.

There is another bad side effect is that if elements are inserted into map one by one, it will incur lots of reallocation operation (just like vector) unless the size of the map is known before hand and properly resized at the beginning.

I suggest using std::map if the size of the map is not known priori. If it is known, and if the map lookup happens after the map is created, it might be better to use a vector (pushback) followed by a sort after the map is populated.

include/llvm/IR/Module.h
200 ↗(On Diff #47317)

This comment is not quite clear in its meaning.

test/tools/gold/X86/thinlto.ll
27 ↗(On Diff #47317)

test [offset, guid] pattern here?

mehdi_amini added inline comments.Feb 9 2016, 4:59 PM
lib/Bitcode/Reader/BitcodeReader.cpp
461 ↗(On Diff #47317)

doxygen, at least pointing to the Module one.

5461 ↗(On Diff #47317)

Typo in the comment: VST_FNENTRY instead of VST_CODE_FNENTRY

5488 ↗(On Diff #47317)

Typo in the comment: VST_FNENTRY instead of VST_CODE_COMBINED_FNENTRY

lib/IR/Module.cpp
50 ↗(On Diff #47317)

Can you initialize SourceFileName to empty and test in getSourceFileName()?

lib/Transforms/IPO/FunctionImport.cpp
206 ↗(On Diff #47317)

I was not very happy with that originally, and it may be a good time to revisit it.
When does it happen? Why don't we always rename SrcModule ?

tools/llvm-bcanalyzer/llvm-bcanalyzer.cpp
176 ↗(On Diff #47317)

Test?

tejohnson added inline comments.Feb 9 2016, 9:37 PM
include/llvm/IR/FunctionInfo.h
151 ↗(On Diff #47317)

Good point, I hadn't compared the relative overheads of these data structures.

Unfortunately, we don't know the size of this map ahead of time. E.g. when merging multiple indexes to create the combined index. And we do lookups while adding (e.g. when creating the combined index we may have multiple comdat with the same name/GUID).

So it sounds like the best bet for now is to use std::map, and we can investigate alternatives that might involve restructuring the accesses if this turns out not to be efficient enough.

include/llvm/IR/Module.h
200 ↗(On Diff #47317)

The point I'm trying to make is that if we are compiling this from bitcode, it comes from the new bitcode record, otherwise it is the name of the input file (which is the same as the ModuleID). E.g.:

clang -c foo.c -flto=thin ModuleID = SourceFileName = foo.c
(foo.o is bitcode with new MODULE_SOURCE_FILENAME record = "foo.c")
clang foo.o -flto=thin
ModuleID = foo.o, SourceFileName = foo.c

test/tools/gold/X86/thinlto.ll
27 ↗(On Diff #47317)

I'm afraid testing the specific offsets will be brittle, unless I just check for any integer in that field. I could test for the GUIDs, but would have to handle either ordering as the order in the combined index is not currently guaranteed. Let me go ahead and check for the GUIDs as that maintains the same level of checking as we had with the function names.

tejohnson added inline comments.Feb 9 2016, 9:55 PM
lib/Bitcode/Reader/BitcodeReader.cpp
461 ↗(On Diff #47317)

Will do.

5461 ↗(On Diff #47317)

Will fix.

5488 ↗(On Diff #47317)

Will fix.

lib/IR/Module.cpp
50 ↗(On Diff #47317)

Actually, this is by design. When we are compiling from source (say clang -c foo.c -flto=thin), we want the SourceFileName to be foo.c. Only when we later compile the foo.o bitcode file do we then want this set to foo.c from the new bitcode record.

lib/Transforms/IPO/FunctionImport.cpp
206 ↗(On Diff #47317)

The old comment was stale. We aren't actually doing any renaming in SrcModule until the importing happens later on. At this point we haven't even materialized everything in SrcModule that we want to import, so I think it is premature to do any renaming.

The CalledFunctionName however comes out of the Worklist which was populated by findExternalCalls. We had to get the global identifier in order to access the correct function summary out of the index here, and to disambiguate from same named locals we are going to import from other modules.

(Looking at findExternalCalls, I just realized that we should still be checking if this is already in the DestModule using the suffix form, however, since that is the name the local will get when it is promoted. However, with the current bulk importing mechanism we would never already have a promoted local from another module imported into DestModule yet, but I will fix this in case that ever changes.)

tools/llvm-bcanalyzer/llvm-bcanalyzer.cpp
176 ↗(On Diff #47317)

Will add one.

tejohnson updated this revision to Diff 47454.Feb 10 2016, 7:49 AM
tejohnson edited edge metadata.
  • Address David and Mehdi's review comments and rebase.
mehdi_amini accepted this revision.Feb 10 2016, 9:11 AM
mehdi_amini edited edge metadata.

LGTM. You may wait a little bit to give a change to David to chime in.

lib/IR/Module.cpp
50 ↗(On Diff #47454)

Not if I was clear because I think what I suggested matches your behavior (unless I missed something).

I'd have left the SourceFileName member empty and change the getter to:

const std::string &getSourceFileName() const { 
  if (SourceFileName.empty()) return getModuleIdentifier();
  return SourceFileName;
}

But it is not important so feel free to keep as you did.

lib/Transforms/IPO/FunctionImport.cpp
221 ↗(On Diff #47454)

We can't rename before materializing?

test/tools/gold/X86/thinlto.ll
27–28 ↗(On Diff #47454)

What is the op1 here? Is it the 64 low-bits of the MD5 as an int?
Can we have a better pretty-print from llvm-bcanalyzer? (don't hold this patch for this though)

This revision is now accepted and ready to land.Feb 10 2016, 9:11 AM
tejohnson added inline comments.Feb 10 2016, 9:38 AM
lib/IR/Module.cpp
50 ↗(On Diff #47454)

Oh got it, I misunderstood. It will be more efficient to initialize SourceFileName once and not have to do the empty check every time we call this, so I think I will leave it as-is.

lib/Transforms/IPO/FunctionImport.cpp
221 ↗(On Diff #47454)

I was worried about doing additional parsing after we have done some renaming. But thinking through this some more it should be ok as all GV references should be through the centralized ValueList which would contain value handles to the renamed GVs.

In any case, there isn't a big need to do the renaming in the SrcModule before the actual importing.

test/tools/gold/X86/thinlto.ll
27–28 ↗(On Diff #47454)

Yes, op0 is the bitcode offset of the summary, and op0 is the MD5 as an int. I will add a comment to these tests to enumerate the format before committing.

Would you rather see the values printed in hex? The bcanalyzer just prints all record values as ints, it doesn't look at the record type to determine the best format. That would require more plumbing in the bcanalyzer than I'm thinking this is worth. Probably the easiest thing would be to add an option to llvm-bcanalyzer to dump all record values in hex.

tejohnson updated this revision to Diff 47474.Feb 10 2016, 9:46 AM
tejohnson edited edge metadata.

Fix missed incorrect VST_CODE_COMBINED_FNENTRY comment, and add comments
to test describing expected format of this record type.

davidxl accepted this revision.Feb 10 2016, 9:49 AM
davidxl edited edge metadata.

lgtm

test/tools/gold/X86/thinlto.ll
27–28 ↗(On Diff #47454)

nit -- for better reading, probably just need to keep one line with actual MD5 value, the rest can be replaced with [0-9]+

tejohnson added inline comments.Feb 10 2016, 9:53 AM
test/tools/gold/X86/thinlto.ll
27–31 ↗(On Diff #47474)

I think that's what I have here already? I could shorten this a bit by replacing the intermediate parts with {{.*}}. E.g.:

; COMBINED-NEXT: <COMBINED_FNENTRY {{.*}} {{-3706093650706652785|-5300342847281564238}}

Is that preferable?

I meant the op1 part can be simplified -- but not necessariliy need to.

David

This revision was automatically updated to reflect the committed changes.