[MDL] First full integration of MDL with LLVM
Needs Review · Public

Authored by reidtatge on Aug 24 2023, 4:05 PM.



This is the first full integration with the mainline LLVM Repo. This will be (shortly) broken into smaller commits which focus on individual parts for review. The components roughly break down as:

  • Deltas to Target library code: ~1000
  • Deltas to CodeGen and MC libraries: ~2800
  • New utilities: ~20000
  • Documentation: ~5000

The MDL language provides an alternative to TableGen for describing a target micro-architecture (i.e. an alternative to Schedules and Itineraries) to the CodeGen and MC libraries. It provides all the capability of TableGen Schedules and Itineraries, but can efficiently describe much more complex processors.

The RFC (from Nov 2022) has more detailed descriptions of this work. Detailed documentation about the design and the MDL language (and the RFC) can be found in the repo at llvm/docs/Mdl/.

This is the first integration of work that we plan to expand going forward, and we'd like to land this to avoid bitrot and make the work available to others. The integration is complete end to end: it can extract microarchitecture information from TableGen, create equivalent MDL descriptions, and compile and use that information in the CodeGen and MC libraries. This works for any target that has Schedules or Itineraries (AArch64, AMDGPU, AMDGPU/R600, ARM, Hexagon, Lanai, Mips, PowerPC, RISCV, Sparc, SystemZ, and X86).

MDL support is enabled using the LLVM_ENABLE_MDL CMake parameter. When enabled, we build the MDL compiler and a tool to scrape the necessary information from TableGen files, then use this information in backend libraries instead of the TableGen-generated information. When not enabled (the default), all the MDL-specific code is guarded by runtime flags that disable it.

We welcome comments and suggestions, and look forward to your feedback.

Event Timeline

reidtatge created this revision. Aug 24 2023, 4:05 PM
Herald added a project: Restricted Project. Aug 24 2023, 4:05 PM
reidtatge requested review of this revision. Aug 24 2023, 4:05 PM

It looks like I rebased with a version that includes the commit you
reverted. This is messy...

Matt added a subscriber: Matt. Aug 25 2023, 11:48 AM
reidtatge updated this revision to Diff 553640. Aug 25 2023, 2:52 PM

Fixed build problems

reidtatge edited the summary of this revision. Aug 28 2023, 2:53 PM
reidtatge edited the summary of this revision.
reidtatge edited the summary of this revision. Aug 29 2023, 10:52 AM
reidtatge added reviewers: aartbik, lattner.

This is exciting! Please start a thread to discuss integrating this on the forum, this is a significant enough change and addition that I expect many folks to have opinions.

lkail added a subscriber: lkail. Aug 29 2023, 6:58 PM

I am super pumped to hear more about MDL on our call later today. The thought that you've given to being able to model complex behavior is exciting!

I've given some comments and asked some questions on the documentation I've read so far. The question I keep asking myself is why we need MDL. It sounds like you've done a really thorough job of understanding the limitations of the TableGen-based scheduling constructs, and you've presented a set of alternative constructs to address these issues. For example, you point out that forwarding is modeled between instructions and that it may be more effective to model it between resources. Why don't we accomplish this using a TableGen approach?


Could this combination be represented by adjusting the ReadAdvance to accommodate both characteristics? For example, if a Read happens 3 cycles later in the pipeline and forwarding delivers its input 2 cycles sooner, can't we do this:

def : ReadAdvance<MyRead, !add(2, 3), [WriteThatGetsForwarded]>;
def : ReadAdvance<MyRead, 3, [ /* All other Writes that do not get forwarded */ ]>;

Maybe it would be annoying to enumerate All other Writes, but I think we could modify ReadAdvance to support an exclude argument that might look something like this:

def : ReadAdvance<MyRead, !add(2, 3), [WriteThatGetsForwarded]>;
def : ReadAdvance<MyRead, 3, excludeWrites=[WriteThatGetsForwarded]>;
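
The arithmetic behind the two ReadAdvance entries above can be sketched in a few lines of Python. This is only an illustration, assuming the usual scheduling-model rule that the read advance is subtracted from the producer's write latency and the result is clamped at zero. The name WriteOther, the helper operand_latency, and the latency values are hypothetical.

```python
# Hypothetical model of the two ReadAdvance entries for MyRead.
LATE_READ_CYCLES = 3    # MyRead samples its operand 3 cycles later in the pipeline
FORWARDING_CYCLES = 2   # forwarding delivers the value 2 cycles sooner

# Writes that are forwarded get both effects, mirroring !add(2, 3).
READ_ADVANCE = {
    "WriteThatGetsForwarded": FORWARDING_CYCLES + LATE_READ_CYCLES,
}
# Every other write only benefits from the late read (the "exclude" case).
DEFAULT_ADVANCE = LATE_READ_CYCLES


def operand_latency(write_name: str, write_latency: int) -> int:
    """Effective producer-to-consumer latency seen by MyRead, clamped at 0."""
    advance = READ_ADVANCE.get(write_name, DEFAULT_ADVANCE)
    return max(0, write_latency - advance)


# Assumed write latency of 7 cycles for both producers.
forwarded = operand_latency("WriteThatGetsForwarded", 7)   # 7 - 5
not_forwarded = operand_latency("WriteOther", 7)           # 7 - 3
```

The default branch plays the role of the proposed exclude argument: any write not listed as forwarded falls back to the plain late-read advance.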

I am currently working on a RISCV patch that will model forwarding using ReadAdvances. I think I am able to represent a forwarded read and a late operand read for the same instruction without any modification to the ReadAdvance class.


Are you sure that forwarding cannot be modeled for these instructions by adding ReadAdvance entries for the input operands?


The reason we didn't model any ReadAdvances wasn't because it was tedious, but because we weren't modeling our writes correctly in the first place. Now that we've had some time to do a better job at representing writes, we plan on implementing forwarding networks on the SiFive7 scheduler model.


We have found that modeling latency for vector instructions more accurately can lead to a 10% speedup on some benchmarks. I'm not sure I agree that modeling latency is not critical in this case.


I think this is a really nice idea. In hardware, forwarding is often a property of functional units, not of the instructions that run on those units.

However, it may be the case that functional unit X only forwards vector operands or only forwards scalar operands. Can we capture this behavior if we only discuss resource -> resource?

You say below:

A relatively common case would be instruction operands that are _not_ connected to the forwarding network.  We need a reasonable way to model these exceptions.

It's likely we'd use this to model the idea I present above. Can you add an example of how to model exceptions in this documentation?


Is there a reason why we need a new language to model the forwarding as between resources instead of between reads and writes?

This is exciting! Please start a thread to discuss integrating this on the forum, this is a significant enough change and addition that I expect many folks to have opinions.

Hey Chris, there is the RFC (from last year) I mentioned in the notes above. Were you suggesting something else?

Hi Michael, responses to your questions and comments!


Yes, I'd agree that this is reasonably doable in TableGen using something like what you proposed. I think the drawback of this approach is that the nodes in your forwarding network are instructions, rather than functional units, so it's a very large graph. The MDL approach is to model the functional-unit-based graph, then deal with relatively rare exceptions on an instruction-by-instruction basis. So you can describe a full forwarding network with just a few lines of description.


Yeah, I agree you could do that, as you suggested in the earlier comment.


I should clarify that statement - I meant to say that tdscan can't extract forwarding information for those CPUs. Certainly you can add ReadAdvance records to any CPU. But since you have to do it for every single input operand (which usually aren't explicitly modeled at all) and annotate all the SchedWrites, it's a lot of work. Forwarding networks are usually relatively simple, since they're typically based on FUs (of which there are tens), not instructions (of which there are thousands).
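
To put rough numbers on the size argument, here is a small Python sketch. It takes the framing above at face value (graph nodes are either functional units or instructions) and counts the worst case of one entry per producer/consumer pair. The counts of 20 functional units and 1000 instructions are hypothetical stand-ins for "tens" and "thousands".

```python
def pairwise_entries(node_count: int) -> int:
    """Upper bound on producer->consumer forwarding entries (self-pairs included)."""
    return node_count * node_count


FUNCTIONAL_UNITS = 20   # hypothetical: "tens" of functional units
INSTRUCTIONS = 1000     # hypothetical: "thousands" of instructions

fu_entries = pairwise_entries(FUNCTIONAL_UNITS)
insn_entries = pairwise_entries(INSTRUCTIONS)

# Ratio between the two worst-case graph sizes.
print(fu_entries, insn_entries, insn_entries // fu_entries)
```

Real models are far sparser than this bound on both sides, so the sketch only shows how the two descriptions scale, not how large either one is in practice.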


Cool. You might try modeling it in the MDL language too. :-)


Agreed. I should amend that comment to clarify the thought.


To answer the first question: (example vector vs scalar forwarding)
Yes, there's an easy way to model this. Functional unit X can be modeled as a "derived" or "compound" functional unit:

func_unit VECTOR:SCALAR X(...);
func_unit VECTOR Y();
func_unit SCALAR Z();

This declares that X contains both a VECTOR unit and a SCALAR unit, and the forwarding network can specify them separately. So you could say:

forward VECTOR -> VECTOR        and/or
forward SCALAR -> SCALAR         and/or
forward X -> Y
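
One way to picture how such rules could resolve against compound units is the Python sketch below. The matching rule is an assumption made for illustration, not documented MDL semantics: a name in a forward statement is taken to match a unit if it is that unit's own name or one of the base units it contains. The helpers matches and forwards are hypothetical, and the sketch tracks whole units only, not individual operands.

```python
# Base units contained in each functional unit, taken from the declarations above.
UNIT_BASES = {
    "X": {"VECTOR", "SCALAR"},   # func_unit VECTOR:SCALAR X(...);
    "Y": {"VECTOR"},             # func_unit VECTOR Y();
    "Z": {"SCALAR"},             # func_unit SCALAR Z();
}


def matches(rule_name: str, unit: str) -> bool:
    """Assumed rule: a name matches the unit itself or any base unit it contains."""
    return rule_name == unit or rule_name in UNIT_BASES[unit]


def forwards(rules, producer: str, consumer: str) -> bool:
    """True if any (source, destination) rule connects producer to consumer."""
    return any(matches(src, producer) and matches(dst, consumer)
               for src, dst in rules)


vector_only = [("VECTOR", "VECTOR")]   # forward VECTOR -> VECTOR
x_to_y_only = [("X", "Y")]             # forward X -> Y

x_feeds_y = forwards(vector_only, "X", "Y")   # X contains VECTOR, Y is VECTOR
z_feeds_y = forwards(vector_only, "Z", "Y")   # Z is SCALAR only
```

Under this assumption a base-unit rule covers every unit that contains that base, while a rule naming X and Y directly covers only that one pair.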


To answer the second question about exceptions:
The language currently doesn't have a method for specifying exceptions on an instruction-by-instruction basis. Our thesis is that this doesn't happen much, and can be done on a functional unit (or sub-functional-unit) basis. But we could certainly add something to the language without much trouble.

An issue that we don't address today is that it's possible that a functional unit might forward just a single output value for instructions, because it only has a single connection to the forwarding network. I'm not aware of any processors that do this, but we might want to be able to manage that possibility.


No, this was just explaining the rationale behind the MDL language decisions.