This is an archive of the discontinued LLVM Phabricator instance.

[ThinLTO] YAML traits for module summaries
Needs ReviewPublic

Authored by ncharlie on Jun 9 2017, 12:56 PM.

Details

Summary

[ThinLTO] Dump YAML for module summaries

Adds YAML traits for module summaries so dumps can include summary information for functions, global values, and aliases.

Diff Detail

Event Timeline

ncharlie created this revision.Jun 9 2017, 12:56 PM
ncharlie updated this revision to Diff 102068.Jun 9 2017, 1:02 PM

Oops - got rid of old llvm-dis code.

pcc edited edge metadata.Jun 9 2017, 1:05 PM

Can you split this into two patches (both with their own tests), one which adds the subcommand and the other which adds the new YAML output?

tools/llvm-dis/llvm-dis.cpp
155 ↗(On Diff #102067)

This seems like an unrelated change.

ncharlie added inline comments.Jun 9 2017, 1:06 PM
include/llvm/IR/ModuleSummaryIndexYAML.h
277

Some of these traits are duplicated. I could create a second struct called SharedSummaryYaml or something to deduplicate these fields, if anyone thinks they're worth centralizing.

302

I couldn't figure out how to thread a pointer to the Index down to the code that needs it (i.e. structs that are nested deeper in and have their own MappingTraits), so I left it here for now. If anyone has recommendations for a better place to put this, please let me know.

In D34063#777137, @pcc wrote:

Can you split this into two patches (both with their own tests), one which adds the subcommand and the other which adds the new YAML output?

Sorry - the bit you commented on wasn't supposed to be there - that was old code I forgot to remove. Do you want the new YAML code and llvm-lto2 subcommand to be in separate patches?

pcc added a comment.Jun 9 2017, 1:15 PM

Do you want the new YAML code and llvm-lto2 subcommand to be in separate patches?

Yes, please.

mehdi_amini added inline comments.Jun 9 2017, 1:37 PM
include/llvm/IR/ModuleSummaryIndex.h
538

Never seen "Oid" used here before.

Also s/dissassembly/disassembly and s/stringified//

539

Do we want this on all the time? Is this a fundamental property of the Index?
It isn't clear to me if it isn't just a debugging tool that map to the IR when available right now?

654

Return a const ref, and don't name it find... but get...OrEmpty if you want this API contract.

ncharlie updated this revision to Diff 102090.Jun 9 2017, 4:00 PM

Addressed some updates from the comments.

ncharlie marked 2 inline comments as done.Jun 9 2017, 4:02 PM
ncharlie added inline comments.
include/llvm/IR/ModuleSummaryIndex.h
539

How would you recommend making this an optional debugging option? Would a constructor argument work?

mehdi_amini added inline comments.Jun 9 2017, 7:12 PM
include/llvm/IR/ModuleSummaryIndex.h
539

Is this information in the index in the bitcode?
If not then it can be reconstructed at dump time.

ncharlie added inline comments.Jun 10 2017, 6:04 AM
include/llvm/IR/ModuleSummaryIndex.h
539

The reason I'm storing the map in the index is because I only have access to the ValueName when the bitcode is being parsed in ModuleSummaryIndexBitcodeReader (see BitcodeReader.cpp:4715 in this patch). Also, when the YAML is being generated, I'm only given a pointer to a ModuleSummaryIndex, which is why the map is in that class.

tejohnson added inline comments.Jun 10 2017, 6:42 AM
include/llvm/IR/ModuleSummaryIndex.h
539

The value name is available on the GlobalValue (Value::getName()), as is the GUID (getGUID()). So you can just walk the Module's list of global_objects() (which gives you all the functions and global variables) and the list of aliases(), and compute the mapping.

That being said, I am not familiar enough with the YAML support to know how to get an additional object (either the Module, or a mapping pre-computed by the client or a wrapper function), passed down to where you need it. pcc may have some insight.

mehdi_amini added inline comments.Jun 10 2017, 9:49 AM
include/llvm/IR/ModuleSummaryIndex.h
539

Also, when the YAML is being generated, I'm only given a pointer to a ModuleSummaryIndex, which is why the map is in that class.

Then just modify the YAML to get an optional Module * as well? If you get it then you have names available, otherwise not.
Since it wouldn't roundtrip the YAML, I'd display it as a comment next to the GUID.

ncharlie added inline comments.Jun 12 2017, 1:13 PM
include/llvm/IR/ModuleSummaryIndex.h
539

The value name is available on the GlobalValue (Value::getName()), as is the GUID (getGUID()). So you can just walk the Module's list of global_objects()

Then just modify the YAML to get an optional Module * as well?

OK, I'll look into to this. I didn't realize there was already a map that stored the names in the Module class, so now I really want to get rid of my duplicate.

tejohnson added inline comments.Jun 12 2017, 1:22 PM
include/llvm/IR/ModuleSummaryIndex.h
539

It isn't really a map, but rather the global values are available from the Module, and as I mentioned in my comment (just above Mehdi's), the names and guids are available on the global values.

ncharlie retitled this revision from [ThinLTO][llvm-lto2] Dump YAML for module summaries to [ThinLTO] YAML traits for module summaries.Jun 13 2017, 7:22 AM
ncharlie edited the summary of this revision. (Show Details)
ncharlie added inline comments.Jun 13 2017, 1:01 PM
include/llvm/IR/ModuleSummaryIndex.h
539

Then just modify the YAML to get an optional Module * as well?

After looking into this, I can see two options for doing this.

  1. Create a struct that holds both the Module * and the ModuleSummaryIndex, and output the new struct. I went part way down this path today, and I definitely think it's possible to do this, but it might result in some slightly odd code in some places - especially when reading in YAML (since it may not always be entirely obvious why a ModuleSummaryIndex and Module * have to be tracked together).
  2. Add a Module * or getModule function to the ModuleSummaryIndex - I think this option is the cleanest and simplest implementation, despite adding a bit more information to the ModuleSummaryIndex class.

If there's some way to get a Module from a ModuleID or ModuleHash, that could work as well, but I'm not sure if it's possible.

Let me know which one you guys think is the best.

Prazek removed a subscriber: Prazek.Jun 13 2017, 1:32 PM
mehdi_amini added inline comments.Jun 14 2017, 2:35 PM
include/llvm/IR/ModuleSummaryIndex.h
539

I'm not sure we're all on the same track.

Names are never part of the summaries IIRC. So when you only work to/from YAML you don't have any names. Which is why I mentioned that the names in the YAML can only be comments.

I don't expect any change to include/llvm/IR/ModuleSummaryIndex.h ; it should be fully handled in the routine that dumps to YAML: it should take an optional Module. The first it can do is build a map from GUID -> Name for every symbol in the module. The YAML dump takes the summary index and this map and process from here.

ncharlie added inline comments.Jun 15 2017, 6:19 AM
include/llvm/IR/ModuleSummaryIndex.h
539

Got it - I'll pull my changes from the ModuleSummaryIndex and only worry about the names at dump time.

ncharlie updated this revision to Diff 102826.Jun 16 2017, 7:38 AM

Fixes according to comments.

ncharlie marked 2 inline comments as done.Jun 16 2017, 7:40 AM
tejohnson edited edge metadata.Jun 19 2017, 9:17 PM

Can you add some test cases?

include/llvm/IR/ModuleSummaryIndexYAML.h
277

Yes I think that would be good. How about GlobalValueSummaryYaml, to mirror the class structure of the summaries themselves.

377

Also need to iterate over the refs() here and for alias and global var summaries (refs() is in the GlobalValueSummary base class)

424

Should there be an assert here that this is never invoked then?

438

Needs clang format. I believe you will also need to iterate over the aliases doing the same thing (global objects only includes functions and globalvars).

ncharlie added a comment.EditedJun 20 2017, 6:10 AM

Can you add some test cases?

Sure. How should I be testing the output?

Can you add some test cases?

Sure. How should I be testing the output?

With the llvm-lto2 support you are adding in D34080 - it's already approved so can go in.

ncharlie updated this revision to Diff 103218.Jun 20 2017, 9:15 AM

Added test, further cleanup.

ncharlie updated this revision to Diff 103219.Jun 20 2017, 9:18 AM
ncharlie marked 4 inline comments as done.

formatting

fhahn removed a subscriber: fhahn.Jun 20 2017, 9:19 AM
ncharlie added inline comments.Jun 20 2017, 9:19 AM
include/llvm/IR/ModuleSummaryIndexYAML.h
424

I realized that mapping wasn't necessary so I removed it.

mehdi_amini added inline comments.Jun 20 2017, 11:25 AM
include/llvm/IR/ModuleSummaryIndexYAML.h
438

There is a helper for this: Module::global_values()

(also: prefer for-range loops, and variable name needs to be uppercase).

ncharlie updated this revision to Diff 103259.Jun 20 2017, 1:35 PM

Updated GlobalValue loop.

sfertile added inline comments.
include/llvm/IR/ModuleSummaryIndexYAML.h
400

GVSum is unused, which will introduce a warning.

ncharlie updated this revision to Diff 103952.Jun 26 2017, 7:52 AM
ncharlie marked an inline comment as done.

Looks really close, just a few minor suggestions/nits left.

include/llvm/IR/ModuleSummaryIndexYAML.h
400

You can use "isa" instead of "dyn_cast" here now

426

Minor nit: prefer no braces around single statement for and if bodies.

include/llvm/Support/YAMLTraits.h
659

Ditto on no braces preferred.

test/tools/llvm-lto2/X86/dump-summary-yaml.ll
15

For the lines that have names, can you check for the names as well? You could either put in the expected GUID value, or use a regex to accept any GUID #. The FileCheck regex format for matching anything is: {{.*}}

ncharlie updated this revision to Diff 104173.Jun 27 2017, 8:10 AM
ncharlie marked 5 inline comments as done.Jun 27 2017, 8:14 AM
ncharlie added inline comments.
test/tools/llvm-lto2/X86/dump-summary-yaml.ll
15

Added the regex to match the GUIDs now.

I didn't check for names, because the current llvm-lto2 pass won't print them out, because it needs to pass in a special struct with that includes a Module pointer for the name. Since the llvm-lto2 dump summary patch doesn't support the struct yet (as it's in this patch), I didn't enable the names in the other patch yet. I'll need to create another patch once both these patches are merged in so llvm-lto2 prints the names in the summary.

tejohnson added inline comments.Jun 27 2017, 8:18 AM
test/tools/llvm-lto2/X86/dump-summary-yaml.ll
15

I think the other patch is approved, and can go in now. Then since you are adding the name mapping functionality in this patch, I would say go ahead and add the llvm-lto2 name dumping in this patch.

ncharlie updated this revision to Diff 104180.Jun 27 2017, 8:38 AM

Added names to dump-summary in llvm-lto2.

ncharlie marked 3 inline comments as done.Jun 27 2017, 8:38 AM

Can you commit D34080 and update the patch so it only shows the differences relative to that?

ncharlie updated this revision to Diff 104194.Jun 27 2017, 9:30 AM

Applied D34080

removed test/tools/llvm-lto2/X86/dump-summary.ll from D34080 since it was more of a temporary placeholder test before test/tools/llvm-lto2/X86/dump-summary-yaml.ll could be added.

test/tools/llvm-lto2/X86/dump-summary-yaml.ll