This is an archive of the discontinued LLVM Phabricator instance.

[mlir][irdl] Introduce a way to define regions
ClosedPublic

Authored by unterumarmung on Jul 12 2023, 1:08 PM.

Details

Summary

This patch introduces new operations:
irdl.region and irdl.regions.
The former lets us to specify characteristics of a region,
such as the arguments for the entry block and the number of blocks.
The latter accepts all results of the former operations
to define the set of the regions for the operation.

Example:

irdl.dialect @example {
  irdl.operation @op_with_regions {
      %r0 = irdl.region
      %r1 = irdl.region()
      %v0 = irdl.is i32
      %v1 = irdl.is i64
      %r2 = irdl.region(%v0, %v1)
      %r3 = irdl.region with size 3

      irdl.regions(%r0, %r1, %r2, %r3)
  }
}

The above snippet demonstrates an operation named @op_with_regions,
which is constrained to have four regions.

  • Region %r0 doesn't have any constraints on the arguments or the number of blocks.
  • Region %r1 should have an empty set of arguments.
  • Region %r2 should have two arguments of types i32 and i64.
  • Region %r3 should contain exactly three blocks.

In the future the block count constraint may be expanded to support range of possible number of blocks.

Diff Detail

Event Timeline

unterumarmung created this revision.Jul 12 2023, 1:08 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 12 2023, 1:08 PM
unterumarmung requested review of this revision.Jul 12 2023, 1:08 PM
unterumarmung edited the summary of this revision. (Show Details)

That looks like a nice first step to me!
I think the things I would change are:

  • Adding an !irdl.region type
  • Making that by default, irdl.region do not constrain the number of blocks
  • Maybe changing the format of irdl.region to make it explicit that we are constraining the number of arguments to 0 if no arguments are provided
mlir/include/mlir/Dialect/IRDL/IR/IRDLOps.td
357

Can you create a new IRDL_RegionType for that?

434–435

By default, it should not constrain anything.

443

I feel that the syntax here should look like irdl.region() with size 3.
I think it makes it more obvious that we are constraining the number of arguments to be zero. Otherwise, I would think that this means a region with an arbitrary number of arguments.

456

Same here, should be an IRDL_RegionType

mlir/include/mlir/Dialect/IRDL/IRDLVerifiers.h
183

This should not inherit from Constraint.
It should be its own kind of constraint.
However, we do not need to "save" the region characteristics in the ConstraintVerifier I believe.

mlir/lib/Dialect/IRDL/IR/IRDL.cpp
234

We should not by default constraint it to be 1.
I believe that by default, we should not check the region block size.

239–250

This is not needed if you have an IRDL_RegionType

mlir/test/Dialect/IRDL/regions.irdl.mlir
1–2 ↗(On Diff #539698)

Can you maybe rename the file regions-invalid.irdl.mlir, or something similar?

mlir/test/Dialect/IRDL/testd.mlir
597

Can you add a newline here?

unterumarmung added inline comments.Jul 14 2023, 7:38 AM
mlir/include/mlir/Dialect/IRDL/IR/IRDLOps.td
434–435

Makes sense

443

So, to clarify:

  • irdl.region() should constraint that there are no arguments
  • irdl.region should mean that there an arbitrary number of arguments

Right?

The question to me is how to code with with the declarative assemblyFormat? Because either way the entryBlockArgs are gonna be empty
Maybe it should be Optional<Variadic<IRDL_RegionType>>? Or there is a need for a custom parsing to achieve this?

math-fehr added inline comments.Jul 14 2023, 8:53 AM
mlir/include/mlir/Dialect/IRDL/IR/IRDLOps.td
443

Hmm, I think we would have to have a Variadic<IRDL_RegionType>, and an UnitAttr for that.
I guess that for now we could just write irdl.region() instead, and disallow irdl.region to make things more simple and less ambiguous (and have the code be simpler).

Rebase, fixed some mistakes (probably added new :D)

Hey @math-fehr!

I've just posted the latest work-in-progress version. I've made the changes you requested by implementing the empty parenthesis. It didn't add much complexity with the UnitAttr, and I believe it's essential to allow users the freedom to avoid constraining something they don't want to.

Here's what's still missing:

  • The documentation needs to be updated.
  • A lot of tests to be written.
  • The actual region verification is pending.

To ensure the correctness of the code, I intentionally emitted errors in the RegionConstraint::verify function. The tests produced the expected errors, which is a good sign.

However, I'm currently facing an issue when running ConstraintVerifier on the region arguments. The BlockArgument type is mlir::Value, not mlir::Attribute, so I'm unable to pass it to the verify method. Do you have any suggestions on how to overcome this?

Please, keep in mind that the code is still a work in progress, so I'm open to any further suggestions you might have.

Nice! For the mlir::Value, the solution is to wrap the value type in a TypeAttr. This is exactly how operands are handled for instance.

This is a way to handle both attributes and types using the same system, so we reduce a bit the complexity of the dialect!

mlir/include/mlir/Dialect/IRDL/IR/IRDLInterfaces.td
42–48

I guess you don't need the cppNamespace here, and you could add the description to the base class (and probably add an argument to VerifyInterface to make it templated).
Something like let description = "Interface to get an IRDL " # elementName # " constraint verifier from an operation"

Mogball accepted this revision.Jul 17 2023, 9:41 AM

LGTM but please wait for @math-fehr

mlir/include/mlir/Dialect/IRDL/IR/IRDLInterfaces.td
18–19

Please make this line less than 80 characters wide

50

What is the purpose of these interfaces? They have no methods and are only used on 1 op. Is there a plan to introduce more ops that implement these interfaces?

mlir/include/mlir/Dialect/IRDL/IR/IRDLTypes.td
57

no description?

mlir/lib/Dialect/IRDL/IR/IRDLOps.cpp
17

Please add a short doc for this method

mlir/lib/Dialect/IRDL/IRDLLoading.cpp
271

drop trivial braces

This revision is now accepted and ready to land.Jul 17 2023, 9:41 AM
unterumarmung marked 5 inline comments as done.Jul 17 2023, 11:13 AM
unterumarmung added inline comments.
mlir/include/mlir/Dialect/IRDL/IR/IRDLInterfaces.td
50

Both VerifyConstraintInterface and VerifyRegionInterface each have a method defined in the base class. While VerifyRegionInterface is currently only used by one operation, VerifyConstraintInterface is used by all constraint operations.

In the initial version of this patch, @math-fehr suggested that RegionConstraint should be treated as its own type of constraint, separate from the primary concepts. This is because it operates on the !irdl.region type rather than the !irdl.attribute type. While working on the patch, my goal was to maintain the close resemblance of the region verification functionality to the original implementation. This approach minimizes confusion and adheres to the interface segregation pattern.

Although it's uncertain whether VerifyRegionInterface will be utilized by multiple operations in the future, I believe it's beneficial to retain it in order to reuse patterns and code between the primary constraint verification and the region verification processes.

mlir/include/mlir/Dialect/IRDL/IR/IRDLTypes.td
57

I'm going to fix it before shipping

Mogball added inline comments.Jul 17 2023, 12:09 PM
mlir/include/mlir/Dialect/IRDL/IR/IRDLInterfaces.td
50

Makes sense to me.

All implemented and fixed

unterumarmung edited the summary of this revision. (Show Details)Jul 18 2023, 2:04 PM
unterumarmung marked 5 inline comments as done.

Stylistic changes

@math-fehr
I have implemented the constraint checking, written tests, updated the documentation, and addressed the review comments.

Please take a look :)

Regarding the naming of the tests file: we already have the attributes-op test that checks for invalid inputs for the operation. So, I followed this naming convention and renamed the file to regions-ops to better align with our existing practice. Currently, there are only two tests in the file. I can update them both to the style of your choice. However, I believe we should not limit ourselves to just "invalid" tests in these files, as we might include "valid" tests in the future for various purposes.

unterumarmung edited the summary of this revision. (Show Details)Jul 21 2023, 1:06 PM

Nice!
I added more comment, notably in the implementation of the region verifier. I think there is currently a bug at least!
For the testing files, we can have valid files directly in the regions-ops.irdl.mlir, it's just that if we do, we cannot load them to test their implementation.
This is why I put all valid files in testd.irdl.mlir, and the one failing in a separate file!

mlir/include/mlir/Dialect/IRDL/IR/IRDLOps.td
383–384

Nice!

mlir/include/mlir/Dialect/IRDL/IRDLVerifiers.h
183

Maybe adding a simple documentation for the class?

184

You can inline the constructor here directly. And You don't need explicit here, as you have more than one argument.

188

Can you add documentation for this function?
At least explaining that constraintContext will be used for the region arguments.

193

Can we maybe make this an optional<SmallVector<unsigned>>, and remove the constrainArguments?

mlir/lib/Dialect/IRDL/IR/IRDL.cpp
228–229

Can you drop the trivial braces here?

mlir/lib/Dialect/IRDL/IR/IRDLOps.cpp
18

Nice!

mlir/lib/Dialect/IRDL/IRDLLoading.cpp
260–263
267–268

You can also drop these braces

385

Should we put this in a shared pointer instead of storing this twice (and recomputing this twice as well)?

403–406

This is a bit hacky, but we need to call again irdlOpVerifier here, with the same arguments.
The reason is that when we are verifying the regions here, some constraints might have already been set by the operands/results.
For instance, if we have

%0 = irdl.any
irdl.operands(%0)
%1 = irdl.region(%0)
irdl.regions(%1)

then we have the constraint that the type of the first argument of the region is equal to the first operand.
However, this requires that the operand is verified in the same context as the region.
Here, you are verifying the operand in the normal verifier, and then verifying the region in the region verifier.

I guess one solution would be to check this again in the region verifier. This is slower, yes, but at the same time the alternative
would be to somehow pass a value between the call of the verifier and the region verifier.

Also, maybe a better solution would be to have this directly in verifier. My understanding is that you can verify things
about the number of blocks in a region or the block arguments in verifier, you just cannot verify things about the operations
contained in it. Am I right @Mogball?

mlir/lib/Dialect/IRDL/IRDLVerifiers.cpp
209–210

Is there a reason you are using formatv here and not previously?
I'm asking because I'm usually just using << for formatting, but I'm not sure what is the expected way to do that.
@Mogball, you probably know more about this?

217

This might fail if there are no arguments, right?
So I guess the error should also be printed at the operation level?

mlir/test/Dialect/IRDL/testd.irdl.mlir
140

Can you add a test like this one:

%0 = irdl.any
irdl.operands(%0)
%1 = irdl.region(%0)
irdl.regions(%1)

And then two tests:

my_op(%0) {
  ^bb(%1: i32):
} : (i32) -> ()

and

my_op(%0) {
  ^bb(%1: i64):
} : (i32) -> ()

To check that we are ensuring that the argument and the operand have the same type?
I feel that this will actually not fail in the current implementation (see my comment about`regionVerifier`)

unterumarmung marked 8 inline comments as done.

Address review comments and rebase to main

I would like to clarify your position on test filenames. Are you okay with the current naming, or do you think the filename should be changed?

mlir/lib/Dialect/IRDL/IRDLLoading.cpp
385

In the current implementation, using std::shared_ptr is not necessary, but I'd still like to share some thoughts.

Firstly, the reason I avoided using std::shared_ptr here is that I was initially advised against it due to its atomic reference counter overhead. However, upon further consideration, I realize that it is a legitimate use case, and the actual overhead is likely fairly negligible compared to the entire constraints computation process, especially when it's run more than once. So, I'd be willing to refactor it if needed.

Secondly, I did attempt to refactor to the shared pointer, but I encountered numerous obstacles in doing so. The entire API is tightly coupled to the ArrayRef<std::unique_ptr<Constraint>> type, and introducing changes to support shared_ptr would affect not only this patch but also various other parts of the codebase unrelated to it. Therefore, I believe the API should not expose unique_ptr or shared_ptr directly. Instead, I propose having something like ArrayRef<llvm::AnyRef<Constraint>>, where llvm::AnyRef behaves similarly to llvm::ArrayRef, but instead of encapsulating containers, it encapsulates C++ references, raw pointers, and smart pointers. Is there anything like this already available in the wild? I think it would be beneficial to refactor the API this way because we don't actually care about the ownership semantics at the API level. And if we need shared ownership in the future, we can easily add it without having to change the entire IRDL codebase.

403–406

Running the irdlOpVerifier for the second time, aside from the computation complexity, would also necessitate more changes to the code.

Following your suggestion that it might be possible to verify regions alongside operation verification, I investigated further, and it turned out to be a viable approach. All tests produced the expected results. Additionally, this approach eliminates the need for using shared pointers or computing the constraints objects twice. Therefore, I believe it's a better solution that not only improves efficiency but also reduces boilerplate in the code, making it more maintainable.

mlir/lib/Dialect/IRDL/IRDLVerifiers.cpp
209–210

This is solely my personal preference, but I find creating diagnostics with llvm::formatv much more readable than using operator<<.

There's also llvm::format, which has a printf-like syntax, but I don't like it, and it has some known bugs: Link to the bug report.

Another potential advantage of using llvm::formatv is that it makes it easier to migrate to std::format when LLVM or MLIR decide to adopt C++20 in the future. However, since that's not expected to happen in the near future, I don't consider it a strong argument at this moment.

If it truly matters, I have no problem rewriting the code to follow the operator<< style.

In case there is agreement on using llvm::formatv, I can refactor the entire IRDL codebase to adopt it in the follow-up PR.

217

You're absolutely right; I stumbled upon this mistake when testing the solution. Unfortunately, I overlooked writing the necessary code to address it while preparing the patch for review, and I completely forgot about it afterward.

Thank you for catching this issue. I have now fixed it and also created a test to cover it. Additionally, I refactored emitError to eliminate the need for the bool attachFunc = true argument.

mlir/test/Dialect/IRDL/testd.irdl.mlir
140

They pass with the implementation you've reviewed
But now I've rewritten it and the second one does not pass, so I added expected diagnostics to the test

Return missed WalkResult::interrupt() back

I'm fine with the current naming!
In any way, we need separate files for the invalid definitions, so we can't really merge them!

@Mogball, just for confirmation, is it fine to verify region block arguments and number of blocks in verifyInvariants, rather than in verifyRegions? I feel this should be equivalent, as this doesn't depend on operations verifying in verifyRegions.
Also, do you know what is preffered between << and llvm::formatv ?

mlir/lib/Dialect/IRDL/IRDLLoading.cpp
385

If we can keep everything in the "normal" verifier, then that's perfect!

mlir/lib/Dialect/IRDL/IRDLVerifiers.cpp
209–210

I personally am fine for both ;) It's just that I don't know what the guidelines are here!
I'll try to ask on the MLIR discord and see what people say.

@Mogball please take a look, thanks

Thanks for the ping.

@Mogball, just for confirmation, is it fine to verify region block arguments and number of blocks in verifyInvariants, rather than in verifyRegions? I feel this should be equivalent, as this doesn't depend on operations verifying in verifyRegions.

Yes. verifyRegions runs after the operations in the region have been verified, in case the verifier requires some invariant to be met there. If you just need to check the number of blocks or inspect block arguments, check them in verifyInvariants is fine.

Also, do you know what is preffered between << and llvm::formatv ?

<< is typically preferred.

Mogball accepted this revision.Aug 3 2023, 2:35 PM

Rebased to main and changed llvm::formatv to operator<<. @math-fehr please take a look :)

unterumarmung edited the summary of this revision. (Show Details)Aug 4 2023, 11:46 AM
math-fehr accepted this revision.Aug 16 2023, 6:41 AM

Sorry for the delay!
This looks good to me!

mlir/test/Dialect/IRDL/testd.mlir
244–304

These two tests do not have any // CHECK, we should maybe add one like // CHECK: func.func @function_name?

Rebase to main

mlir/test/Dialect/IRDL/testd.mlir
244–304

Agree, added

This revision was automatically updated to reflect the committed changes.