This is an archive of the discontinued LLVM Phabricator instance.

[ThinLTO] Parse module summary index from assembly
ClosedPublic

Authored by tejohnson on Jun 7 2018, 1:33 PM.

Details

Summary

Adds assembly parsing support for the module summary index (follow on
to r333335 which added the assembly writing support).

I added support to llvm-as to invoke the index parsing, so that it can
create either a bitcode file with a Module and a per-module index, or
a combined index without a Module.

I will send follow on patches soon to do the following:

  • add support to tools such as llvm-lto2 to parse the per-module indexes

from assembly instead of bitcode when testing the thin link.

  • verification support.

I wanted to get the guts of the parsing out for review now though.

Depends on D47844 and D47842.

Diff Detail

Repository
rL LLVM

Event Timeline

tejohnson created this revision.Jun 7 2018, 1:33 PM

FYI I will be on vacation tomorrow morning through next week, so my responses to any comments will likely come June 18 or later. I wanted to get this out before I left so that hopefully I get some comments by the time I return. =)

I've just had time to skim a few files; feel free to ping me privately if I don't continue the review later today.

include/llvm/AsmParser/Parser.h
87 ↗(On Diff #150401)

I wonder if it's worth adding a simple named struct for this.

136 ↗(On Diff #150401)

I feel like this comment should come after you describe what the function does. I think the next line should be first (and have an empty line after, so that it's the only thing showing up in doxygen "brief").

include/llvm/IR/ModuleSummaryIndex.h
775–776 ↗(On Diff #150401)

The comment suggests these aren't always used. Should they be in an optional?

lib/AsmParser/LLLexer.cpp
722 ↗(On Diff #150401)

Missing period.

dexonsmith requested changes to this revision.Jun 20 2018, 9:38 AM

I just have a couple of additional (minor) comments on the code, although I admit it was hard to be thorough as a it's a monolithic patch. I might have preferred if this were committed incrementally, adding support for 1-2 constructs at a time. Because the lexing/parsing code is fairly mechanical, I think splitting it up now would be quite likely to introduce errors so I'm not going to ask for that.

In terms of testing, I don't see any positive assembly parsing lit tests (aside from implicit ones when round-tripping bitcode through llvm-as), and I think that would be valuable. I'm not sure if it's pragmatic to write targeted tests for each construct, or if a couple of monolithic tests would be better. What do you think?

test/Assembler/thinlto-bad-summary1.ll
3 ↗(On Diff #150401)

Why did you switch from llvm-as to opt in these tests?

tools/llvm-as/llvm-as.cpp
97 ↗(On Diff #150401)

I think putting IndexToWrite first would make the condition easier for me to parse. It might even design away the need for the assertion that follows in the else block.

100–102 ↗(On Diff #150401)

I personally don't love the style of dangling the else after a comment that's outdented from the "then" statement. Reading the code sequentially, I get tricked into thinking the if is complete, and then the else is a big surprise when I see it. I'd prefer:

if (cond) {
  // We have a non-empty module: write it plus the ...
  ...
} else {
  // We have an empty module but non-empty index: write a ...
  ...
}

However, if you think it's clearer/better as is, that's fine.

This revision now requires changes to proceed.Jun 20 2018, 9:38 AM

I just have a couple of additional (minor) comments on the code, although I admit it was hard to be thorough as a it's a monolithic patch. I might have preferred if this were committed incrementally, adding support for 1-2 constructs at a time. Because the lexing/parsing code is fairly mechanical, I think splitting it up now would be quite likely to introduce errors so I'm not going to ask for that.

In terms of testing, I don't see any positive assembly parsing lit tests (aside from implicit ones when round-tripping bitcode through llvm-as), and I think that would be valuable. I'm not sure if it's pragmatic to write targeted tests for each construct, or if a couple of monolithic tests would be better. What do you think?

Thanks for the review! Agree it is a rather large patch - sorry. I wanted to implement it fully before sending for review because I was concerned there would be too much churn as I encountered issues while building various parts of the index from assembly.

As a follow on I was planning on adding support to tools like llvm-lto/llvm-lto2 to parse the index from assembly for simulating the thin link and will add more tests reading from assembly then. But I agree this patch should have some for completeness. How about for now a monolithic test that tries to cover all the various summary constructs in assembly, then runs through llvm-as | llvm-dis to ensure we get the input back out as expected?

include/llvm/AsmParser/Parser.h
87 ↗(On Diff #150401)

Sure I can do that.

136 ↗(On Diff #150401)

Sure, will fix and the other one I added like this above (this was a clone/modify of the comment for parseAssembly)

include/llvm/IR/ModuleSummaryIndex.h
775–776 ↗(On Diff #150401)

Looks like I forgot to create the patch as a diff based on a couple other patches I pulled out of this big one. I'll see if I can fix the patch to rebase on top of those. Note this code in particular I pulled out into a separate patch D47842.

Regarding your suggestion, I'm not sure we can use Optional here since StringSaver is non-copyable.

(I also pulled out some other functionality into D47844).

test/Assembler/thinlto-bad-summary1.ll
3 ↗(On Diff #150401)

Because these tests are specifically testing the functionality added in the assembly writing patch to skip module entries (LLParser::SkipModuleSummaryEntry). With this patch, that is no longer invoked by llvm-as, but it will continue to be invoked by opt, i.e. via the existing parseAssembly handling (in the RFC we agreed it is better to ignore the summary when optimizing the IR, and only use it in limited situations where we will be testing e.g. the thin link).

tools/llvm-as/llvm-as.cpp
97 ↗(On Diff #150401)

Makes sense, will change (and agree the assertion can be removed).

100–102 ↗(On Diff #150401)

Sure, I will fix that.

tejohnson marked 4 inline comments as done.Jun 20 2018, 2:54 PM

I've address the comments except for adding a new test, which I will work on and that will probably get added tomorrow. I'm about to upload a new patch which is properly stacked on top of the 2 patches I had sent separately.

I've address the comments except for adding a new test, which I will work on and that will probably get added tomorrow. I'm about to upload a new patch which is properly stacked on top of the 2 patches I had sent separately.

Adding a test right now that includes all possible summary types and fields in various combinations. This is essentially a combined index, which made testing all combinations significantly easier (no need to come up with matching IR), and also allowed me to test entries that aren't in the per module summaries (the TypeId entries introduced for CFI during the thin link), and force both forward and backwards references to various summary entry types.

This exposed a few issues that I am also uploading the fixes for (some CFI types and a few optional flags weren't being parsed properly, and backwards aliasee references weren't working correctly).

PTAL.

tejohnson updated this revision to Diff 152484.Jun 22 2018, 8:59 AM

Add new test with more complete parsing testing, and some fixes it triggered.

dexonsmith accepted this revision.Jun 22 2018, 7:46 PM

LGTM. Thanks!

I still think you might be able to use an Optional (see my inline comment) but I don't need to see this again.

include/llvm/IR/ModuleSummaryIndex.h
775–776 ↗(On Diff #150401)

I'm not sure we can use Optional here since StringSaver is non-copyable.

Optional doesn't require copyable or moveable. You can construct in place using the Optional::emplace member.

This revision is now accepted and ready to land.Jun 22 2018, 7:46 PM
tejohnson added inline comments.Jun 24 2018, 8:55 PM
include/llvm/IR/ModuleSummaryIndex.h
775–776 ↗(On Diff #150401)

Thanks, I missed that. The emplace solves another issue for me with the string saver change as well. I'm fixing this on D47842 (the patch with just the string saver change), and will add you to the review there. I'm going to add you to D47844 as well (the other patch this one depends on).

This revision was automatically updated to reflect the committed changes.