This is an archive of the discontinued LLVM Phabricator instance.

[mlir] Simplify DestinationStyleOpInterface.
ClosedPublic

Authored by pifon2a on Oct 6 2022, 12:55 AM.

Details

Summary

Prepare the DPS interface for moving it out of the Linalg dialect. Remove some
of the methods. Express the structure of the op using the number of outputs only.

Diff Detail

Event Timeline

pifon2a created this revision.Oct 6 2022, 12:55 AM
Herald added a reviewer: aartbik.
Herald added a project: Restricted Project.
pifon2a requested review of this revision.Oct 6 2022, 12:55 AM
pifon2a edited reviewers, added: akuegel; removed: aartbik, dcaballe.Oct 6 2022, 12:56 AM
springerm added inline comments.Oct 6 2022, 1:33 AM
mlir/include/mlir/Dialect/Linalg/IR/LinalgInterfaces.td
886–887

Add: All non-output operands are inputs.

886–887

Any particular reason why inputs are limited to these three? Can we support arbitrary types?

887

Add:

It is assumed that the inputs of the op are the operands at position [0; num_operands - getNumOutputs). The outputs of the op are the operands at position [num_operands - getNumOutputs; num_operands). In other words, all input operands come first.
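
For illustration, this is roughly what that convention could look like as a default implementation (a sketch only; the method names and the OpOperandVector return type are assumptions for the example, not necessarily what the patch uses):

InterfaceMethod<
  /*desc=*/[{
    Return the input operands, i.e. all operands that are not outputs.
  }],
  /*retTy=*/"OpOperandVector",
  /*methodName=*/"getInputOperands",
  /*args=*/(ins),
  /*methodBody=*/"",
  /*defaultImplementation=*/[{
    // Inputs occupy positions [0, getNumOperands() - getNumOutputs()).
    int64_t numInputs = $_op->getNumOperands() - $_op.getNumOutputs();
    OpOperandVector result;
    result.reserve(numInputs);
    for (OpOperand &opOperand : $_op->getOpOperands().take_front(numInputs))
      result.push_back(&opOperand);
    return result;
  }]
>,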

891

Add:

The i-th output tensor is tied to the i-th OpResult. The op may not have any additional OpResults.

895

nit: Furthermore

924–925

Why is this the case?

924–925

I would make these unsigned. There are various arithmetic computations with getNumOperands etc., and I think these return unsigned, so you may get compiler warnings.

938–939

How does this work? Is the method calling itself?

How about returning $_op.getNumResults() if the op has tensor semantics, and otherwise llvm_unreachable("must be implemented")? There must be as many outputs as there are OpResults (in the case of tensor semantics).
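
Something along these lines, as a rough sketch (it assumes the op provides hasTensorSemantics(), which Linalg structured ops do):

InterfaceMethod<
  /*desc=*/[{ Return the number of outputs. }],
  /*retTy=*/"int64_t",
  /*methodName=*/"getNumOutputs",
  /*args=*/(ins),
  /*methodBody=*/"",
  /*defaultImplementation=*/[{
    // With tensor semantics, every output is tied to exactly one OpResult.
    if ($_op.hasTensorSemantics())
      return $_op->getNumResults();
    llvm_unreachable("getNumOutputs must be implemented");
  }]
>,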

mlir/lib/Dialect/Linalg/IR/LinalgInterfaces.cpp
794–795

Can we move this limitation to the LinalgOp verifier?

796–797

I think this check does not do anything because numInputs is computed as num_operands - num_outputs.

805–812

nit: I would move this check before the previous check for better error messages.

akuegel added inline comments.Oct 6 2022, 2:22 AM
mlir/include/mlir/Dialect/Linalg/IR/LinalgInterfaces.td
898

nit: transformation -> transformations

924–925

It was already int64_t before.

938–939

It is the same as with getLibraryCallName. It would indeed call itself if the op does not implement getNumOutputs(), and unfortunately that does not result in a compiler error in all builds. I had to revert a change because of that, since the failure only showed up on the Windows build.
I like Matthias' suggestion.

mlir/test/Dialect/Linalg/roundtrip.mlir
124

Maybe now we should have a test case in invalid.mlir instead?
Or is it not actually invalid?

pifon2a updated this revision to Diff 465730.Oct 6 2022, 7:39 AM
pifon2a marked 12 inline comments as done.

Address the comments.

pifon2a added a subscriber: ftynse.Oct 6 2022, 7:40 AM
pifon2a added inline comments.
mlir/include/mlir/Dialect/Linalg/IR/LinalgInterfaces.td
886–887

Right, it can be arbitrary. Fixed the comment.

938–939

Checking for tensor semantics is quite expensive. And we also have to make it work for buffers. So, I am not sure what's best here.

938–939

@ftynse is there some way not to shoot ourselves in the foot with this? I also don't really understand the difference between methodBody and defaultImplementation.

mlir/lib/Dialect/Linalg/IR/LinalgInterfaces.cpp
794–795

We can, but why? Shouldn't there be at least one destination for DPS ops?

mlir/test/Dialect/Linalg/roundtrip.mlir
124

It has a mixed form with tensors and buffers. Should we still support mixed tensor-memref inputs?

Just curious why you

"pifon2a edited reviewers, added: akuegel; removed: aartbik, dcaballe."

when you clearly touch sparse code? Luckily for me, Herald put me back :-)
Having said that, changes in sparse LGTM since they are all mechanical.

Yes, because the changes were mechanical, I removed you and Diego. :-)

Please still keep me in the loop so I can tell the sparse compiler team that changes are coming.
(as a heads up on rebasing mainly in this case :-)

mravishankar requested changes to this revision.Oct 6 2022, 10:40 AM
mravishankar added inline comments.
mlir/include/mlir/Dialect/Linalg/IR/LinalgInterfaces.td
889

I think it would be better to not use getNumOperands directly. Some operations might have other operands that are neither inputs nor outputs, for example an optional padding value for a fictional pad operation (not the tensor.pad that exists). It might be better to have an interface method that publishes the range of operands to be used as inputs and the range of operands to be used as outputs.

This revision now requires changes to proceed.Oct 6 2022, 10:40 AM
pifon2a added inline comments.Oct 6 2022, 1:09 PM
mlir/include/mlir/Dialect/Linalg/IR/LinalgInterfaces.td
889

@mravishankar could you elaborate a bit more on what is neither input nor output? I thought that everything that is not an output is an input, even if it is not a tensor/memref. For example, linalg.fill has one scalar input, and insert_slice has a list of offsets-sizes-strides inputs.

That's a pretty big assumption though. As an interface, it would be better to make as few assumptions about the operation as possible. The simplest here might be that each operation is expected to return a "list of input operands" and a "list of output operands" (ideally it doesn't even need the list of input operands for destination passing style, since that is irrelevant to the interface).

I think that the list of inputs is relevant, because DPS ops are allowed to have either tensor semantics or buffer semantics, not the mixed case. I don't think that the assumption is big. I see a bigger problem if we start targeting a case that we have never seen in practice.

Maybe I am over-indexing on the wording:

/// It is assumed that the inputs of the op are the operands at position [0;
/// getNumOperands() - getNumOutputs()). The outputs of the op are the operands
/// at position [getNumOperands() - getNumOutputs(); getNumOperands()). In other
/// words, all input operands come first.

Why is that needed? Why can't you have an interface method that is just

OpOperandVector getOutputOperands()

that each op that implements the interface overrides? You don't need to make assumptions on the specific ordering of the operands. Is there a reason for the interface to make such a strong assumption? For example, we are prototyping a "pack + pad" operation that has the following argument list:

let arguments = (ins Variadic<AnyShaped>:$inputs,
    Variadic<AnyShaped>:$outputs,
    DefaultValuedAttr<I64ArrayAttr, "{}">:$dims_pos,
    Variadic<Index>:$inner_tiles,
    I64ArrayAttr:$static_inner_tiles,
    Optional<AnyType>:$padding_value);

Here the "getOutputOperands" returns the $outputs operands. The assumption listed above doesnt hold. Why does the interface need to enforce ordering of operands, and how many operands exist explicitly. If this is an intermediate state before relaxing that requirement, thats fine, but if that is a hard requirement, then thats strange....

springerm added inline comments.Oct 6 2022, 10:36 PM
mlir/include/mlir/Dialect/Linalg/IR/LinalgInterfaces.td
938–939

Then I would just put an llvm_unreachable(...) there. A defaultImplementation should be provided only if overriding the InterfaceMethod is optional.

I could be wrong but I think methodBody should be used if the InterfaceMethod is not supposed to be defined/overridden by ops (https://mlir.llvm.org/docs/Interfaces/). That would actually be the case for the vast majority of the InterfaceMethods here. But before changing anything, try it for one InterfaceMethod first and see if it still compiles...
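
To make the distinction concrete, my reading of the docs is roughly the following (a sketch only; the method names are just for illustration):

// methodBody: the body lives in the interface itself; ops cannot (and do
// not need to) override it.
InterfaceMethod<
  /*desc=*/[{ Return the number of inputs. }],
  /*retTy=*/"int64_t",
  /*methodName=*/"getNumInputs",
  /*args=*/(ins),
  /*methodBody=*/[{
    return $_op->getNumOperands() - $_op.getNumOutputs();
  }]
>,

// defaultImplementation: used only when the op does not define the method
// itself, so overriding it is optional.
InterfaceMethod<
  /*desc=*/[{ Return the number of outputs. }],
  /*retTy=*/"int64_t",
  /*methodName=*/"getNumOutputs",
  /*args=*/(ins),
  /*methodBody=*/"",
  /*defaultImplementation=*/[{
    llvm_unreachable("getNumOutputs must be implemented");
  }]
>,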

springerm added a comment (edited).Oct 6 2022, 10:54 PM

Why is that needed? Why can't you have an interface method that is just

OpOperandVector getOutputOperands()

that each op that implements the interface overrides? You don't need to make assumptions on the specific ordering of the operands. Is there a reason for the interface to make such a strong assumption?

Just to add another data point, we have a somewhat similar interface: ViewLikeOpInterface

We could follow a similar implementation strategy. The offsets are defined via the offsets InterfaceMethod (not via operand index ranges):

InterfaceMethod<
  /*desc=*/[{
    Return the dynamic offset operands.
  }],
  /*retTy=*/"::mlir::OperandRange",
  /*methodName=*/"offsets",
  /*args=*/(ins),
  /*methodBody=*/"",
  /*defaultImplementation=*/[{
    return $_op.getOffsets();
  }]
>,

Not sure what happens if the op has neither an offsets operand nor an offsets method (infinite loop?). We could force an implementation by putting an llvm_unreachable.

(This implementation approach still assumes that outputs is a consecutive block of operands.)

Why is that needed? Why can't you have an interface method that is just OpOperandVector getOutputOperands()?

I was trying to avoid copying the OpOperands* all the time. In the pack/unpack ops, we can also move the destination args to the end of the arg list. On the other hand, the same would need to be done for the already existing tensor.insert_slice, which is also possible, but probably painful.

If we have $inits not at a fixed position, we have to construct a SmallVector<OpOperand*> every time we call getOutputOperands. Also, does it mean that getNumOutputs would need to be expressed as getOutputOperands().size()? Would it make sense to have an interface method std::pair<unsigned, unsigned> getOutputsPositionRange or something like that, which would return the interval of positions in the operand list that correspond to outputs? In that case, the users would need to override only this method.

@mravishankar

OperandRange does not own the data, does it? I was trying to avoid constructing SmallVector every time.

Yeah, having a getOutputsPositionRange would work. AFAIK OperandRange doesn't own its data. The OpOperands are owned by the operation.

Yep, OperandRange does not, but OpOperandVector does, and it is what the DPS interface uses right now.

Ok, then I will add a getOutputsPositionRange func.
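
Roughly along these lines, as a sketch (the exact name and signature may still change before landing):

InterfaceMethod<
  /*desc=*/[{
    Return the position interval [start; end) of the output operands within
    the operand list.
  }],
  /*retTy=*/"std::pair<int64_t, int64_t>",
  /*methodName=*/"getOutputsPositionRange",
  /*args=*/(ins),
  /*methodBody=*/"",
  /*defaultImplementation=*/[{
    // Default: the outputs are the trailing block of operands; ops whose
    // outputs sit elsewhere override this single method.
    int64_t numOperands = $_op->getNumOperands();
    return {numOperands - $_op.getNumOutputs(), numOperands};
  }]
>,

Accessors such as getOutputOperands or getOutputOperand(i) could then index into this range instead of hard-coding the trailing-block assumption.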

mlir/include/mlir/Dialect/Linalg/IR/LinalgInterfaces.td
889

@mravishankar I think I see the option to relax this in the future (in particular to support list(tensor, shape) that will also be useful for tensor.expand_shape and tensor.pack/unpack) but I would keep this out of the current commit.

The current behavior has the getNumOperands-based assumption, let's first evolve to an interface and separately revisit that assumption if you don't mind.

924–925

Almost never use unsigned in C++; the exception is when doing bit manipulation.

E.g. https://stackoverflow.com/questions/10168079/why-is-size-t-unsigned

952

I see we now have MutableOperandRange, use that instead?

972–974

Can this use getOutputOperands()[i] and avoid distributing the underlying this->getOperation() order assumption to more methods?

984–986

Can this use getOutputOperands()[i] and avoid distributing the underlying this->getOperation() order assumption to more methods?

1028

Can this use getInputOperands()[i] and avoid distributing the underlying this->getOperation() order assumption to more methods?

mlir/include/mlir/Dialect/Linalg/IR/LinalgInterfaces.td
892

Does this mean all ops that implement this interface must be of the form:

results_range = OP(input_operands_range, unqualified_operands_range, output_operands_range)

?

Or is there no unqualified_operands_range?

Also, is this true for all ops in perpetuity or only for the ops that use the default DPS interface impl?
Can the DPS interface be configured differently for other ops?

Could you please add these considerations to the doc?

pifon2a added inline comments.Oct 12 2022, 2:07 AM
mlir/include/mlir/Dialect/Linalg/IR/LinalgInterfaces.td
972–974

It can. We should be careful though. At the moment, creating MutableArrayRef<OpOperand> is cheap and doing getOutputOperands()[i] is ok-ish, but when/if we make the interface more "flexible", getOutputOperands would construct a SmallVector only to use one element of it.

pifon2a updated this revision to Diff 467841.Oct 14 2022, 10:58 AM
pifon2a marked 12 inline comments as done.

Address the comments.

pifon2a updated this revision to Diff 467843.Oct 14 2022, 11:00 AM

Update the doc.

mlir/include/mlir/Dialect/Linalg/IR/LinalgInterfaces.td
938–939

I don't have a default implementation anymore.

952

It allows you to iterate over Values only, unfortunately.

springerm accepted this revision.Oct 17 2022, 2:27 AM

Thx for improving the interface!

mlir/include/mlir/Dialect/Linalg/IR/LinalgInterfaces.td
945

Can we use this C++ feature (given that we also have to support some pretty old compiler versions)?

Thank you!

mlir/include/mlir/Dialect/Linalg/IR/LinalgInterfaces.td
945

It is used in SCF.cpp a lot, for example.

Even though there are requested changes by Mahesh, I addressed them and discussed this with him directly last week. I will push it.

This revision was not accepted when it landed; it landed in state Needs Review.Oct 17 2022, 3:45 AM
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.

Just leaving a note here that the design of this interface is causing a real headache for downstream users (i.e. IREE). I don't understand why this interface adds anything w.r.t. getInputs() or getNumInputs(). That seems completely orthogonal to what DestinationStyleOpInterface was supposed to target, i.e. which operands of the op can be used to alias the result of the operation. The assumption that all operands that are not outputs are "inputs" is a really strong and unnecessary assumption. In theory they might be "inputs", but they might not be called as such in the op definition of downstream operations.
FTR: I did mark this as changes requested.... I am not sure why changes repeatedly get submitted when I have marked them as changes requested. It means I am not done with the review. Submitting it when marked changes requested drops it off my dashboard, and I lost track of this. Integrating this into IREE is causing a massive headache, which is why I marked this as changes requested.

Follow-up to this: I see that the inputs were part of the initial design proposal here: https://discourse.llvm.org/t/rfc-interface-for-destination-style-ops/64056 . So I missed the issues this can cause as well; my bad on that. I will post back on the RFC after I navigate things downstream, maybe with changes to the DestinationStyleOpInterface that restrict the scope of the interface from what it is doing today....

With https://reviews.llvm.org/D136943 I was able to handle all the name conflicts. Thanks @pifon2a! Apologies for the knee-jerk reaction above.
I still think we need to take a further look at the methods added under DestinationStyleOpInterface. I will post a summary of what I found on the relevant Discourse thread.