This is an archive of the discontinued LLVM Phabricator instance.

[llvm-mca][NFC] Separate calculation of display data from its display in the summary and instruction info views
ClosedPublic

Authored by wolfgangp on Aug 18 2020, 4:08 PM.

Details

Summary

This is a preparatory patch for PR47227. We want to separate the calculation of display data from its display for the MCA views with the goal to then alternatively generate JSON (serialized) output.

We move the calculations to methods by the name of collectData() before displaying the data.

Diff Detail

Event Timeline

wolfgangp created this revision.Aug 18 2020, 4:08 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 18 2020, 4:08 PM
Herald added a subscriber: gbedwell. · View Herald Transcript
wolfgangp requested review of this revision.Aug 18 2020, 4:08 PM

Before going with JSON, i think it will be important to establish the stability guarantees:
there shouldn't be any stability guarantees, neither on the data nor on the structure.

llvm/tools/llvm-mca/Views/InstructionInfoView.cpp
29

These should be default member-initializers

29

This is the only place with IIVDVec, let's just inline it?

43–45

auto I : zip(IIVD, Source)
const InstructionInfoViewData& IIVDEntry = std::get<0>(I)
const MCInst &Inst = std::get<1>(I)

99

Just take MutableArrayRef.

101

auto I : zip(Source, IIVD)
const MCInst &Inst = std::get<0>(I);
InstructionInfoViewData& IIVDEntry = std::get<1>(I);

Nice!

Thanks Wolfgang! I always wanted to add a structured output to llvm-mca.
Thanks for working on this.

I also agree with the idea of splitting the process in two stages: a preliminar stage to collect the data from a view, a second (and final) stage where the data collected during the first stage is properly structured and printed out.

I agree with Roman in that we should not guarantee stability of data and/or structure.
Also, changes to the output structure should always be advertised (for example, by adding a line in the release notes), so that people are always aware of it.

That being said, it is not impossible for most (if not all) default views to guarantee stability of data (but not structure). After all, default views rarely (if ever) change in practice.
However, to guarantee data stability, we need some form of versioning and ideally an "auto-upgrade" functionality too (to convert/map data from an older structure to elements of the new structure). The way how I see it is that there is no need to implement this now. We can always add more guarantees in the future if we really think it is worth it.

About the patch:
Roman has already pointed out what can be improved in the patch (foreach loops / default initializer / etc.). I have nothing else to add on it.
I like your design and the general direction. I will be happy to accept this patch once comments from Roman are addressed.

Thanks,
Andrea

wolfgangp updated this revision to Diff 286654.Aug 19 2020, 1:51 PM
wolfgangp marked 4 inline comments as done.

Addressed review comments:

  • Using MutableArrayRef
  • Using range-based for loops with zip
  • Using default initializers with InstructionInfoViewData
lebedev.ri added inline comments.Aug 19 2020, 1:56 PM
llvm/tools/llvm-mca/Views/InstructionInfoView.cpp
43–45

Ok, then

auto I : enumerate(zip(IIVD, Source))
const InstructionInfoViewData &IIVDEntry = std::get<0>(I.value());
CE.getEncoding(I.index())

Nice!

Thank you Andrea!

I agree with Roman in that we should not guarantee stability of data and/or structure.
Also, changes to the output structure should always be advertised (for example, by adding a line in the release notes), so that people are always aware of it.

I must confess I am not very informed about the issues regarding stability. Doesn't data stability depend on the scheduling model? I don't see how llvm-mca could guarantee it if that changes.
Or maybe I'm just confused about this. If you or Roman could briefly outline what's implied by data stability I'd appreciate it. Regarding structure stability a migration strategy would imply that llvm-mca would have to be able to read an older version JSON and update it to a later version, no?

I'll submit the follow-on patch to get started and you can educate me further on stability.

That being said, it is not impossible for most (if not all) default views to guarantee stability of data (but not structure). After all, default views rarely (if ever) change in practice.
However, to guarantee data stability, we need some form of versioning and ideally an "auto-upgrade" functionality too (to convert/map data from an older structure to elements of the new structure). The way how I see it is that there is no need to implement this now. We can always add more guarantees in the future if we really think it is worth it.

<snip>

  • wolfgang
llvm/tools/llvm-mca/Views/InstructionInfoView.cpp
29

I'm planning one more use. Would you still prefer using the explicit type?

Nice!

Thank you Andrea!

I agree with Roman in that we should not guarantee stability of data and/or structure.
Also, changes to the output structure should always be advertised (for example, by adding a line in the release notes), so that people are always aware of it.

I must confess I am not very informed about the issues regarding stability. Doesn't data stability depend on the scheduling model? I don't see how llvm-mca could guarantee it if that changes.

What I was trying to say is that the structure of the json report may change over time.

Future versions of mca may decide to structure data differently. Since we don't guarantee a stable structure for the json file, this may introduce incompatibilites in downstream tools which instead expect/assume a fixed/unchanging layout.

One way to solve this issue is by adding a version string to the json file, with the idea that each version string corresponds to a different structure of the json report.
It doesn't solve the incompatibility issue, but it allow tools to quickly identify incompatible files.

My other point was about the "stability of the data" (specifically: data generated by the default llvm-mca views).
Default mca views are not likely to change in future. So we can assume that the data generated by those views will always be the same. What could change is how data is layout in the final json report; Since the layout is not fixed, data may be moved to different objects/used by different name-value pairs etc.

If data doesn't change, and the only thing that changes is how data is mapped to the json structure, then it is possible to write an automated tool (e.g. an auto-upgrade script) which maps the data from an older report to the new json report structure. This again would probably require the presence of a string version.

In conclusion: if we really want to, in future we can always introduce a version string and write a simple auto-upgrade with simple rewriting rules for a few basic views.
However, yy personal opinion is that we should not worry about this (at least, not now).

wolfgangp updated this revision to Diff 287048.Aug 21 2020, 8:56 AM

Addressed review comment:

  • using enumerate() instead of an explicit index variable
wolfgangp marked an inline comment as done.Aug 21 2020, 8:57 AM
This revision is now accepted and ready to land.Aug 21 2020, 8:59 AM
andreadb closed this revision.Feb 22 2021, 2:57 AM

Not sure why this review wasn't automatically closed.
This was committed back in August as cf6adecd6a8718ee2737ca55e4cd938364b984cc