This is an archive of the discontinued LLVM Phabricator instance.

[llvm-mca][MC] Add the ability to declare which processor resources model load/store queues (PR36666).
ClosedPublic

Authored by andreadb on Nov 27 2018, 9:51 AM.

Details

Summary

This patch adds the ability to specify via tablegen which processor resources are load/store queue resources.

A new tablegen class named MemoryQueue can be optionally used to mark resources that model load/store queues.
Information about the load/store queue is collected at CodeGenSchedule stage, and analyzed by the SubtargetEmitter to initialize two new fields in struct MCExtraProcessorInfo named LoadQueueID and StoreQueueID.
Those two fields are identifiers for buffered resources used to describe the load queue and the store queue.
Field BufferSize is interpreted as the number of entries in the queue, while the number of units is a throughput indicator (i.e. number of available pickers for loads/stores).

At construction time, LSUnit in llvm-mca checks for the presence of extra processor information (i.e. MCExtraProcessorInfo) in the scheduling model.
If that information is available, and fields LoadQueueID and StoreQueueID are set to a value different than zero (i.e. the invalid processor resource index), then LSUnit initializes its LoadQueue/StoreQueue based on the BufferSize value declared by the two processor resources.

With this patch, we more accurately track dynamic dispatch stalls caused by the lack of LS tokens (i.e. load/store queue full). This is also shown by the differences in two BdVer2 tests. Stalls that were previously classified as generic SCHEDULER FULL stalls, are not correctly classified either as "load queue full" or "store queue full".

About the minor differences in the -scheduler-stats view:
those differences are expected, because entries in the load/store queue are not released when at instruction issue stage. Those are only released at instruction executed stage. This is the main reason why in some tests the load/store queues become full before PdEx is full.

Let me know if okay to commit.

-Andrea

Diff Detail

Repository
rL LLVM

Event Timeline

andreadb created this revision.Nov 27 2018, 9:51 AM

LGTM, I'll let others weigh in. For the most part I think this patch is fine, as long as my questions do not point out any bugs.

include/llvm/Target/TargetSchedule.td
567 ↗(On Diff #175509)

I'd suggest renaming the parameter something different, perhaps <ProcResource PR>. Reading <ProcResource ProcResource> caused my brain to hiccup a bit.

tools/llvm-mca/Views/SchedulerStatistics.cpp
49 ↗(On Diff #175509)

I'm curious. What if the target does not have a LoadQueueID or StoreQueueID defined? It seems like SchedulerStats would not account for the uses in Usage vector in that case. Should we emit a warning if that is the case, or is it reasonable for SchedulerStats to ignore targets that do not define the Load/Store IDs?

87 ↗(On Diff #175509)

If a target does not define a LQ/SQ resource ID, then SchedulerStats defaults to an ID of '0'. Is it possible for there to be a valid buffer with a BufferID of 0, or is that some special case? Our resource Buffer IDs come from an index defined in the processor model:

`/// An index to the MCProcResourceDesc entry in the processor model.`
tools/llvm-mca/include/HardwareUnits/LSUnit.h
108 ↗(On Diff #175509)

nit: Remove the wild '.' character in the middle of the sentence.

120 ↗(On Diff #175509)

It feels weird to mention 'official hardware documentation' here, when the LSUnit should be somewhat target agnostic. Perhaps clarify what hardware documentation you are referring to.

tools/llvm-mca/lib/HardwareUnits/LSUnit.cpp
29 ↗(On Diff #175509)

nit: In SchedulerStatistics.cpp you named the instance of a MCExtraProcessorInfo as 'EPI' and here you are calling it 'MEP'.

andreadb marked 6 inline comments as done.Nov 27 2018, 10:44 AM
andreadb added inline comments.
include/llvm/Target/TargetSchedule.td
567 ↗(On Diff #175509)

Will do.

tools/llvm-mca/Views/SchedulerStatistics.cpp
49 ↗(On Diff #175509)

On targets that don't specify a load/store queue, resource usage is normally updated by onReservedBuffers and onReleasedBuffers.
Basically, this patch is not a visibile change for processors that don't declare a load/store queue.

87 ↗(On Diff #175509)

ID zero is for the invalid resource. So, it cannot be a buffered resource.

tools/llvm-mca/include/HardwareUnits/LSUnit.h
108 ↗(On Diff #175509)

Will do.

120 ↗(On Diff #175509)

No documentation in particular... Processor vendors often describe it in their official documents. Maybe it is my English... if it is confusing, I can remove it.

tools/llvm-mca/lib/HardwareUnits/LSUnit.cpp
29 ↗(On Diff #175509)

I will change it.

mattd added inline comments.Nov 27 2018, 10:48 AM
tools/llvm-mca/include/HardwareUnits/LSUnit.h
120 ↗(On Diff #175509)

Ah, I see. I read that statement as being specific to one vendor/implementation, and not something more general.

mattd added inline comments.Nov 27 2018, 10:49 AM
tools/llvm-mca/Views/SchedulerStatistics.cpp
87 ↗(On Diff #175509)

Ok, that's good to know. Thanks for the clarification.

andreadb updated this revision to Diff 175663.Nov 28 2018, 4:19 AM
andreadb marked 6 inline comments as done.

Patch updated.

Addressed review comments.

Please let me know if okay to commit.
-Andrea

Thank you, looks good in general!
Some very contrived nits:

tools/llvm-mca/Views/SchedulerStatistics.cpp
35–45 ↗(On Diff #175663)

Is there a tracking bug for this? Could you file one?
And reference it here in the comment please.

tools/llvm-mca/include/HardwareUnits/LSUnit.h
130–139 ↗(On Diff #175663)

Is there a bug tracking this?

196–198 ↗(On Diff #175663)

Is there a bug tracking this?

andreadb marked 3 inline comments as done.Nov 28 2018, 10:18 AM

Thank you, looks good in general!
Some very contrived nits:

Thanks for the review Roman!

tools/llvm-mca/Views/SchedulerStatistics.cpp
35–45 ↗(On Diff #175663)

Sure. I am going to create a bug for it.

tools/llvm-mca/include/HardwareUnits/LSUnit.h
130–139 ↗(On Diff #175663)

I definitely plan to raise a bug for the usage of field LoadLatency.

I am not entirely sure about whether there is a good/reasonable way to address the last FIXME paragraph. If you want, I can raise a low-priority investigation bug.

196–198 ↗(On Diff #175663)

There isn't. I am not sure if we want to revisit this in future. I suspect, itt require a bit more investigation. That being said, if you want, I can raise an investigation bugzilla for it.

andreadb updated this revision to Diff 175728.Nov 28 2018, 11:14 AM

Address review comments.

Raised PR39828, PR39829 and PR39830.
Added references to those upstream bugzillas in code comments as suggested by Roman.

mattd accepted this revision.Nov 28 2018, 11:17 AM

Thanks for answering my questions.

This revision is now accepted and ready to land.Nov 28 2018, 11:17 AM
This revision was automatically updated to reflect the committed changes.