This is an archive of the discontinued LLVM Phabricator instance.

Add a generic "convert-to-llvm" pass delegating to an interface
ClosedPublic

Authored by mehdi_amini on Aug 5 2023, 12:50 AM.

Details

Summary

The multiple -convert-XXX-to-llvm passes are really nice testing tools for
individual dialects, but the expectation is that a proper conversion should
assemble the conversion patterns using `populateXXXToLLVMConversionPatterns()
APIs. However most customers just chain the conversion passes by convenience.

This pass makes it composable more transparently to assemble the required
patterns for conversion to LLVM dialect by using an interface.
The Pass will scan the input and collect all the dialect present, and for
those who implement the ConvertToLLVMPatternInterface it will use it to
populate the conversion pattern, and possible the conversion target.

Since these conversions can involve intermediate dialects, or target other
dialects than LLVM (for example AVX or NVVM), this pass can't statically
declare the required getDependentDialects() before the pass pipeline
begins. This is worked around by using an extension in the dialectRegistry
that will be invoked for every new loaded dialects in the context. This
allows to lookup the interface ahead of time and use it to query the
dependent dialects.

This patch demonstrate it with the NVVM dialect, but we should then
apply the same pattern for all the dialect that have conversions to LLVM.

Diff Detail

Event Timeline

mehdi_amini created this revision.Aug 5 2023, 12:50 AM
Herald added a project: Restricted Project. · View Herald Transcript
mehdi_amini requested review of this revision.Aug 5 2023, 12:50 AM
mehdi_amini edited the summary of this revision. (Show Details)Aug 5 2023, 12:51 AM
Mogball added inline comments.Aug 5 2023, 1:07 AM
mlir/lib/Conversion/ConvertToLLVM/ConvertToLLVMPass.cpp
34

nice

mlir/lib/Conversion/ConvertToLLVM/ToLLVMInterface.cpp
20

Ouch. Would it be better to just iterate the loaded dialects in the context?

mlir/lib/Conversion/NVVMToLLVM/NVVMToLLVM.cpp
212

leftover debugging?

220

\n

mehdi_amini marked 3 inline comments as done.
mehdi_amini edited the summary of this revision. (Show Details)

Initialize patterns from the dialects loaded in the context

mlir/lib/Conversion/ConvertToLLVM/ToLLVMInterface.cpp
20

I was wondering about this: if we have many functions with only a subset of the dialects, how costly is it to load all the possible patterns that we don't need?

I implemented this alternative now, PTAL :)

ftynse added a comment.Aug 5 2023, 4:49 AM

Nice!

mlir/include/mlir/Conversion/ConvertToLLVM/ToLLVMPass.h
2

Nit: spurious empty line.

mlir/include/mlir/Conversion/NVVMToLLVM/NVVMToLLVM.h
11

Nit: doesn't look necessary given the forward declaration below.

mlir/include/mlir/Conversion/Passes.td
22
mlir/lib/Conversion/ConvertToLLVM/ConvertToLLVMPass.cpp
31

Nit: a short documentation comment is welcome, here and below.

56–57

Could you comment on unique/shared ptr difference here? Can't the frozen pattern set be shared between copies of the pass?

63

Nit: registry.addExtensions<LoadDependentDialectExtension>() reads better.

74

Should this be also exposed to the interface? Patterns may be producing NVVM dialect and friends, which should be declared legal.

mehdi_amini marked 5 inline comments as done.

Address Alex's comments

mehdi_amini added inline comments.
mlir/lib/Conversion/ConvertToLLVM/ConvertToLLVMPass.cpp
56–57

I should be able to use shared everywhere now, the mutable state was making me nervous, I added a dependent revision and I'm using of shared_ptr<const T> now

63

I don't think it works? It requires a callback as argument and build an extension class around it? The template arguments are for the dialects to filter on.

74

It is actually :)
populateConvertToLLVMConversionPatterns gets the target for this purpose.

One thing I'm wondering: the populate methods allow one to combine it in rather arbitrary ways & hops. But the interface for me signals more a fixed/preferred way. For the lower level dialects there isn't much choice and I don't think we have any multipaths here already, so not a practical concern today, more question about positioning.

mlir/include/mlir/Conversion/ConvertToLLVM/ToLLVMInterface.h
21

Let's keep these sorted

mlir/lib/Conversion/ConvertToLLVM/ConvertToLLVMPass.cpp
36

This feels like something more general ... Especially as this is something dynamic pipelines commonly run into.

mehdi_amini added inline comments.Aug 5 2023, 5:33 PM
mlir/lib/Conversion/ConvertToLLVM/ConvertToLLVMPass.cpp
36

Here the anchor point is that it delegate some control to the dialects, but I'm not sure how to connect to dynamic pipeline?

The kind of problem I've seen with dynamic pipelines is when people build a complete pipeline where they want for example to delegate the dynamic pipeline to an Op or Attribute interface for example. The dynamic pipeline isn't known until the pass execution starts, since it is queried from the IR.
I'm not sure what's the solution for this though, I think it needs to split the pass pipeline instead and use other means.

This will be super helpful, thank you!

Could you add some tests?

mlir/lib/Conversion/ConvertToLLVM/ConvertToLLVMPass.cpp
46

Why not continue?

mehdi_amini marked 2 inline comments as done.

Address comments and enable the test

mlir/lib/Conversion/ConvertToLLVM/ConvertToLLVMPass.cpp
46

Ah, I just refactored this out of a lambda, where return was right, we need continue here indeed.

Mogball accepted this revision.Aug 7 2023, 1:08 AM
This revision is now accepted and ready to land.Aug 7 2023, 1:08 AM
ftynse accepted this revision.Aug 7 2023, 2:54 AM
ftynse added inline comments.
mlir/lib/Conversion/ConvertToLLVM/ConvertToLLVMPass.cpp
63

Address ftynse's comment

This revision was landed with ongoing or failed builds.Aug 7 2023, 6:46 PM
This revision was automatically updated to reflect the committed changes.
springerm added inline comments.
mlir/lib/Conversion/NVVMToLLVM/NVVMToLLVM.cpp
204

Can we change this into a type converter? ConversionTarget is an argument of applyPartialConversion etc. and I have not seen it as an argument to populate... methods. ConversionPattern has a type converter field and many populateLLVM... methods require a type converter.

springerm added inline comments.Aug 8 2023, 3:07 AM
mlir/lib/Conversion/NVVMToLLVM/NVVMToLLVM.cpp
204

Just saw the comment about NVVM above. How about having both?

Generally looks good but this seems like a very limited use case for now: the NVVMToLLVM actually does not do conversions, it just applies a single rewrite pattern to lower the ops with PTX builder to inline asm ops.

How do you envision this being extended?
I can see how to compose this with some of @springerm 's work on applying conversions but we still need customizable TypeConverter.
Also some conversions use/require flags (e.g. fp32 reassoc for fmas), I am unclear what the story is there?

Maybe the overarching question is whether you have this connected in a bigger context and have good confidence that you're able to solve the tricky conversion pass compositions we've been discussing?
I'd love to be able to throw some of my local hacks away :)

How do you envision this being extended?

As needed :)
Right now I hope to move the individual passes to be "test-passes", and avoid the issue of unrealized_cast as much as possible.

Also some conversions use/require flags (e.g. fp32 reassoc for fmas), I am unclear what the story is there?

I think we should operate with other means than options: this does not scale well and a pass isn't the best granularity for this in general: data layout and other option may better be in the IR (where you could attach them as attribute to functions or module? Allowing different options for different functions?)

Maybe the overarching question is whether you have this connected in a bigger context and have good confidence that you're able to solve the tricky conversion pass compositions we've been discussing?
I'd love to be able to throw some of my local hacks away :)

mlir/lib/Conversion/ConvertToLLVM/ConvertToLLVMPass.cpp
63

Somehow I missed this! Maybe it is because it has a plural.. :)
Thanks!

So, I have some minor thoughts here:

  1. It sounds like this'll help with target-specific overrides of type conversion, am I right here?
  2. Similarly, it'd be nice if this general pass would take options like triple and chip so you could look up the LLVM data layout etc., since that's logic that'd be a pain to set up everywhere
  3. One interesting idea would be to generalize D158287 . I'd like, for example, the notion of "target-specific patterns". That way, for example, when we have these vector-to-llvm patterns, the avx patterns would live in a separate library and be pulled in only if I'm targetting x86. This would allow me to not even build the target-specific dialects (for instance, why does a library targetting AMDGPU need ArmSVE?) and to have that be a general pattern. (As another example, I'm having to pull in NVVM because of some memref patterns.)
  1. It sounds like this'll help with target-specific overrides of type conversion, am I right here?

I don't understand the question?

  1. Similarly, it'd be nice if this general pass would take options like triple and chip so you could look up the LLVM data layout etc., since that's logic that'd be a pain to set up everywhere

Should this be handled by adding a data layout and other target information as a module attribute?

  1. One interesting idea would be to generalize D158287 . I'd like, for example, the notion of "target-specific patterns". That way, for example, when we have these vector-to-llvm patterns, the avx patterns would live in a separate library and be pulled in only if I'm targetting x86. This would allow me to not even build the target-specific dialects (for instance, why does a library targetting AMDGPU need ArmSVE?) and to have that be a general pattern. (As another example, I'm having to pull in NVVM because of some memref patterns.)

Right now with this pass this would work as long as the pipeline introduces a dependency on a dialect (that implements the interface).

I don't think it is satisfactory for D158287 though: because there can be multiple options to lower to LLVM when multiple vector ISA dialects are available. I'm not sure how to model it, but seems like it comes back to some sort of attributes (even function granularity maybe).

  1. I'm asking in relation to https://discourse.llvm.org/t/rfc-almost-all-uses-of-typeconverter-should-be-const/72689/9 and the existence of the need to create target-specific overrides of the LLVM type conversion (for instance, AMDGPU lowers bf16 to i16 to match the expectations of the backend). I'm not sure the new pass has much to do with this

Similarly, it'd be nice if this general pass would take options like triple and chip so you could look up the LLVM data layout etc., since that's logic that'd be a pain to set up everywhere

Should this be handled by adding a data layout and other target information as a module attribute?

Yeah, that could probably be attributes. That might actually make for a reasonably clean design overall and make sure thing stay consistent between the lowering to LLVM and any compilation/JIT.

Right now with this pass this would work as long as the pipeline introduces a dependency on a dialect (that implements the interface).

I don't think it is satisfactory for D158287 though: because there can be multiple options to lower to LLVM when multiple vector ISA dialects are available. I'm not sure how to model it, but seems like it comes back to some sort of attributes (even function granularity maybe).

The high-level context on all this is that a (potentially somewhat fossilized) "one big static library" build of our downstream project was tripping up the Windows 4 GB debug symbol limit. That's all worked around, but it sent me looking for ways to slim down our build of MLIR to the dialects we reasonably need (for example, "Wait, why is ArmSVE in here - we're not targeting ARM!").

Currently, the way you'd enable these optional target-specific extensions is either calling a pass with an argument set, or by manually building up a pattern set.

Since this pass is trying to replace (?) building up pattern sets by hand, some way to signal "I'm targeting X, please load all the X-specific patterns", even if you don't have any operations from the X dialect in your code (unless we want some sort of no-op that'll trigger the existing interface search) would be neat.

The high-level context on all this is that a (potentially somewhat fossilized) "one big static library" build of our downstream project was tripping up the Windows 4 GB debug symbol limit. That's all worked around, but it sent me looking for ways to slim down our build of MLIR to the dialects we reasonably need (for example, "Wait, why is ArmSVE in here - we're not targeting ARM!").
Currently, the way you'd enable these optional target-specific extensions is either calling a pass with an argument set, or by manually building up a pattern set.
Since this pass is trying to replace (?) building up pattern sets by hand, some way to signal "I'm targeting X, please load all the X-specific patterns", even if you don't have any operations from the X dialect in your code (unless we want some sort of no-op that'll trigger the existing interface search) would be neat.

Right, that's why I was thinking about an attribute, on the model of what we did for the GPU pipeline: the attribute could have an externally registered interface that load these patterns. Since it's all opt-in registered that breaks the build dependencies.
It's not obvious to me how to plumb everything together just now though.