This is an archive of the discontinued LLVM Phabricator instance.

IR: Module summary representation for type identifiers; summary test scaffolding for lowertypetests.
ClosedPublic

Authored by pcc on Dec 21 2016, 5:22 PM.

Details

Summary

Set up basic YAML I/O support for module summaries, plumb the summary into
the pass and add a few command line flags to test YAML I/O support. Bitcode
support to come separately, as will the code in LowerTypeTests that actually
uses the summary. Also add a couple of tests that pass by virtue of the pass
doing nothing with the summary (which happens to be the correct thing to do
for those tests).

Depends on D28014

Diff Detail

Repository
rL LLVM

Event Timeline

pcc updated this revision to Diff 82293.Dec 21 2016, 5:22 PM
pcc retitled this revision from to IR: Module summary representation for type identifiers; summary test scaffolding for lowertypetests..
pcc updated this object.
pcc added reviewers: mehdi_amini, tejohnson.
pcc added a subscriber: llvm-commits.
mehdi_amini edited edge metadata.Dec 21 2016, 10:29 PM

I don't have any strong feeling against YAML, but we should take this decision for the full summary and not just for the "type test", so I'd like to hear from Teresa about that (in the past we were considering a LLVM "metadata" serialization instead, or similar, so that it can round-trip in the IR itself).

(I'm tired of the llvm-bcanalyzer tests for the summary, so any kind of textual serialization of the summaries for testing and debugging purpose will be welcome at this point)

llvm/include/llvm/IR/ModuleSummaryIndex.h
401 ↗(On Diff #82293)

Is this supposed to be "per module" or should it end up global to the index? (My understanding was the latter but you implemented the former AFAICT).

646 ↗(On Diff #82293)

Could these all go in an implementation file with proper forward declaration here?

llvm/test/Transforms/LowerTypeTests/import-unsat.ll
23 ↗(On Diff #82293)

Can you add a one line comment about that this test is checking?

mehdi_amini added inline comments.Dec 21 2016, 10:31 PM
llvm/include/llvm/IR/ModuleSummaryIndex.h
646 ↗(On Diff #82293)

Alternatively, what about extracting this into a separate header llvm/include/llvm/IR/ModuleSummaryIndexSerialization.h that would include llvm/include/llvm/IR/ModuleSummaryIndex.h (if necessary).

tejohnson edited edge metadata.Dec 22 2016, 8:57 AM

I don't have any strong feeling against YAML, but we should take this decision for the full summary and not just for the "type test", so I'd like to hear from Teresa about that (in the past we were considering a LLVM "metadata" serialization instead, or similar, so that it can round-trip in the IR itself).

It would be optimal to have a serialization in the LLVM assembly format (I started to look into doing that some time ago but got side tracked - I was thinking not as metadata since we don't carry it internally as metadata, but a new format e.g. similar to the way attributes are emitted as a separate format not metadata). Rather than taking the trouble to emit as YAML, why not emit into the LLVM assembly from the start?

(I'm tired of the llvm-bcanalyzer tests for the summary, so any kind of textual serialization of the summaries for testing and debugging purpose will be welcome at this point)

llvm/include/llvm/IR/ModuleSummaryIndex.h
401 ↗(On Diff #82293)

Mehdi: What makes this per-module vs global? The same ModuleSummaryIndex data structure is used for both the permodule index and the combined index.

pcc: Why no accessors for this map?

mehdi_amini added inline comments.Dec 22 2016, 10:00 AM
llvm/include/llvm/IR/ModuleSummaryIndex.h
401 ↗(On Diff #82293)

Oh forget my comment, I mis-identified the enclosing class.

pcc added a comment.Dec 22 2016, 10:07 AM

I don't have any strong feeling against YAML, but we should take this decision for the full summary and not just for the "type test", so I'd like to hear from Teresa about that (in the past we were considering a LLVM "metadata" serialization instead, or similar, so that it can round-trip in the IR itself).

It would be optimal to have a serialization in the LLVM assembly format (I started to look into doing that some time ago but got side tracked - I was thinking not as metadata since we don't carry it internally as metadata, but a new format e.g. similar to the way attributes are emitted as a separate format not metadata). Rather than taking the trouble to emit as YAML, why not emit into the LLVM assembly from the start?

It's far from clear to me that in assembly is the right way to go. It might be for the individual summaries but not necessarily for the combined summary as it isn't really a property of any specific module.

Given that, and also given that this format is for debugging/testing purposes only, we might as well do the easiest thing we can while still making our tests readable. For me that meant leveraging the existing YAML support in LLVM. If we ultimately decide that some other format is better, I would imagine that converting the tests to the new format would be easier if we start from YAML rather than bcanalyzer output.

I was thinking not as metadata since we don't carry it internally as metadata, but a new format e.g. similar to the way attributes are emitted as a separate format not metadata). Rather than taking the trouble to emit as YAML, why not emit into the LLVM assembly from the start?

(I'm tired of the llvm-bcanalyzer tests for the summary, so any kind of textual serialization of the summaries for testing and debugging purpose will be welcome at this point)

In D28041#629835, @pcc wrote:

It's far from clear to me that in assembly is the right way to go. It might be for the individual summaries but not necessarily for the combined summary as it isn't really a property of any specific module.

I agree, but we wouldn't want two different serializations for each of the cases (knowing that the in-memory datastructure is shared).

Given that, and also given that this format is for debugging/testing purposes only,

Isn't it true of the textual IR as well?

In D28041#629835, @pcc wrote:

It's far from clear to me that in assembly is the right way to go. It might be for the individual summaries but not necessarily for the combined summary as it isn't really a property of any specific module.

I agree, but we wouldn't want two different serializations for each of the cases (knowing that the in-memory datastructure is shared).

I was envisioning that in the combined index case it would be serialized to a separate file (similar to how we can save the combined index to a distinct bitcode file in bitcode format now), as LLVM assembly just containing the combined index (aka textual IR as Mehdi calls it below, I think we are talking about the same thing).

Given that, and also given that this format is for debugging/testing purposes only,

Isn't it true of the textual IR as well?

pcc added a comment.Dec 22 2016, 10:36 AM

I was thinking not as metadata since we don't carry it internally as metadata, but a new format e.g. similar to the way attributes are emitted as a separate format not metadata). Rather than taking the trouble to emit as YAML, why not emit into the LLVM assembly from the start?

(I'm tired of the llvm-bcanalyzer tests for the summary, so any kind of textual serialization of the summaries for testing and debugging purpose will be welcome at this point)

In D28041#629835, @pcc wrote:

It's far from clear to me that in assembly is the right way to go. It might be for the individual summaries but not necessarily for the combined summary as it isn't really a property of any specific module.

I agree, but we wouldn't want two different serializations for each of the cases (knowing that the in-memory datastructure is shared).

We could just have some way to embed a YAML "blob" in the asm. Yes it would be a little ugly, but it wouldn't really matter because debugging/testing. And again: this is all predicated on assembly being the "right" format.

Given that, and also given that this format is for debugging/testing purposes only,

Isn't it true of the textual IR as well?

Yes, but IR in YAML format would likely be unreadable, so if you follow my point to its conclusion, YAML would be a bad choice for the IR. On the other hand, summaries are more straightforwardly representable as YAML.

pcc updated this revision to Diff 83113.Jan 4 2017, 1:16 PM
pcc marked an inline comment as done.
pcc edited edge metadata.
  • Address review comments
llvm/include/llvm/IR/ModuleSummaryIndex.h
401 ↗(On Diff #82293)

pcc: Why no accessors for this map?

I don't need them yet. That will come in a later change.

646 ↗(On Diff #82293)

Separate header sounds better; done.

mehdi_amini accepted this revision.Jan 4 2017, 4:31 PM
mehdi_amini edited edge metadata.

LGTM.

This revision is now accepted and ready to land.Jan 4 2017, 4:31 PM
This revision was automatically updated to reflect the committed changes.