This is an archive of the discontinued LLVM Phabricator instance.

[llvm-mca] Report the number of dispatched micro opcodes in the DispatchStatistics view.
ClosedPublic

Authored by andreadb on Aug 29 2018, 8:33 AM.

Details

Summary

This patch introduces the following changes to the DispatchStatistics view:

  1. DispatchStatistics now reports the number of dispatched micro opcodes instead of the number of dispatched instructions.
  2. The "Dynamic Dispatch Stall Cycles" table now also reports the percentage of stall cycles against the total simulated cycles.

This change allows users to easily compare dispatch group sizes with the processor DispatchWidth.
Before this change, it was difficult to correlate the two numbers, since the DispatchStatistics view reported the number of instructions (instead of micro opcodes). DispatchWidth defines the maximum size of a dispatch group in terms of micro opcodes.
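
As an illustration of the two numbers the view reports, here is a minimal sketch of a dispatch-group histogram and of a percentage against the total simulated cycles. The function names and the per-cycle input vector are hypothetical and are not taken from the patch; the real view collects its data from simulation events.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical helper: counts how many cycles dispatched each number of
// micro opcodes. The input holds one entry per simulated cycle.
std::map<unsigned, unsigned>
dispatchGroupHistogram(const std::vector<unsigned> &UOpsPerCycle) {
  std::map<unsigned, unsigned> Histogram;
  for (unsigned UOps : UOpsPerCycle)
    ++Histogram[UOps];
  return Histogram;
}

// Hypothetical helper: expresses a cycle count (for example, stall cycles)
// as a percentage of the total simulated cycles.
double percentOfCycles(unsigned Count, unsigned TotalCycles) {
  assert(TotalCycles > 0 && "Expected at least one simulated cycle");
  return 100.0 * Count / TotalCycles;
}
```

Because the histogram keys are micro opcode counts, the largest key can be compared directly with DispatchWidth.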

The other change introduced by this patch is related to how DispatchStage generates "instruction dispatch" events.
In particular:

  1. there can be multiple dispatch events associated with the same instruction
  2. each dispatch event now encapsulates the number of dispatched micro opcodes.

The number of micro opcodes declared by an instruction may exceed the processor DispatchWidth. Therefore, we cannot assume that instructions are always fully dispatched in a single cycle.
DispatchStage already knows how to handle instructions declaring a number of micro opcodes bigger than DispatchWidth. However, DispatchStage always emitted a single instruction dispatch event (during the first simulated dispatch cycle) for each dispatched instruction.

With this patch, DispatchStage now correctly emits multiple dispatch events for instructions that cannot be dispatched in a single cycle.
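
The idea of one dispatch event per cycle can be sketched as follows. This is a simplified model and not the DispatchStage code: the `DispatchEvent` struct and `splitDispatch` function are hypothetical, and the sketch assumes the full DispatchWidth is available to the instruction in every cycle.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical event record; not llvm-mca's actual event class.
struct DispatchEvent {
  unsigned Cycle;        // Cycle in which the micro opcodes are dispatched.
  unsigned MicroOpcodes; // Number of micro opcodes dispatched in that cycle.
};

// Splits an instruction declaring NumMicroOpcodes into one event per cycle,
// assuming every cycle can dispatch up to DispatchWidth micro opcodes.
std::vector<DispatchEvent> splitDispatch(unsigned NumMicroOpcodes,
                                         unsigned DispatchWidth,
                                         unsigned FirstCycle) {
  assert(DispatchWidth > 0 && "DispatchWidth must be positive");
  std::vector<DispatchEvent> Events;
  unsigned Remaining = NumMicroOpcodes;
  for (unsigned Cycle = FirstCycle; Remaining > 0; ++Cycle) {
    unsigned Dispatched = std::min(Remaining, DispatchWidth);
    Events.push_back({Cycle, Dispatched});
    Remaining -= Dispatched;
  }
  return Events;
}
```

Under these assumptions, an instruction with 5 micro opcodes on a target with a DispatchWidth of 2 yields three events carrying 2, 2, and 1 micro opcodes in consecutive cycles, which is why a view cannot expect exactly one event per instruction.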

A few views had to be modified. Views can no longer assume that there can only be one dispatch event per instruction.

Tests (and docs) have been updated.

Please let me know if this is okay to commit.

-Andrea

Diff Detail

Repository
rL LLVM

Event Timeline

andreadb created this revision. Aug 29 2018, 8:33 AM
mattd accepted this revision. Aug 29 2018, 10:39 AM

LGTM.

tools/llvm-mca/Views/DispatchStatistics.cpp
69 ↗(On Diff #163089)

It looks like this line is missing a single space to put it in line with the rest of the output below.

tools/llvm-mca/Views/TimelineView.h
129 ↗(On Diff #163089)

This makes sense, but just for the sake of mentioning it: if we made all of the other counters signed, then all counters would be uniform (in signedness) and we would not need to do any casting when comparing CycleDispatched against the other counters.
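
To illustrate the casting concern, here is a minimal sketch of comparing a signed counter with an unsigned one. The struct is hypothetical: only the name CycleDispatched comes from the comment above, while `CycleReady` and the use of a negative value as a "not dispatched yet" marker are assumptions made for this example.

```cpp
#include <cassert>

// Hypothetical subset of a timeline entry (not the actual TimelineView type).
struct TimelineEntry {
  int CycleDispatched = -1; // Assumed: negative means "not dispatched yet".
  unsigned CycleReady = 0;  // Assumed unsigned counter.
};

// Returns true if the entry was dispatched strictly before it became ready.
// The negative case is handled first, so the cast below never converts a
// negative value into a large unsigned one.
bool dispatchedBeforeReady(const TimelineEntry &E) {
  if (E.CycleDispatched < 0)
    return false;
  return static_cast<unsigned>(E.CycleDispatched) < E.CycleReady;
}
```

If every counter were signed, the explicit check and cast could be replaced by a plain comparison, at the cost of changing the type of the other fields.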

This revision is now accepted and ready to land. Aug 29 2018, 10:39 AM
This revision was automatically updated to reflect the committed changes.