
[MLIR] Support function signature conversions with tuples of result types
Abandoned · Public

Authored by bondhugula on Jul 14 2020, 11:22 AM.

Details

Summary

Add support for function signature conversions when return values are
tuples of tensor types. This is a typical case with the mlir-hlo dialect,
where a TF-legalized function returns a tuple of tensor types. For
testing purposes, use a test dialect op whose operands are simply
propagated to the return op that uses its result.
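As an illustration of the intended rewrite (a hypothetical sketch, not code from this patch), the signature conversion can be modeled in Python with types represented as plain strings: tensor types convert to memref types, and a tuple result is flattened before being appended as output arguments.

```python
# Hypothetical sketch of the signature conversion this patch enables.
# Types are modeled as strings: "tensor<...>" becomes "memref<...>",
# and a "tuple<a, b>" result is flattened into its element types.

def flatten_tuple(ty):
    """Return the element types of a tuple type, or [ty] for a non-tuple."""
    if ty.startswith("tuple<") and ty.endswith(">"):
        return [t.strip() for t in ty[len("tuple<"):-1].split(",")]
    return [ty]

def tensor_to_memref(ty):
    """Convert a tensor type string to the corresponding memref type."""
    return ty.replace("tensor", "memref", 1)

def convert_signature(arg_types, result_types):
    """Convert tensor args to memrefs and append (flattened) results
    as output memref arguments, leaving the function with no results."""
    new_args = [tensor_to_memref(t) for t in arg_types]
    for res in result_types:
        new_args.extend(tensor_to_memref(t) for t in flatten_tuple(res))
    return new_args, []

# A function (tensor<2xf32>) -> tuple<tensor<2xf32>, tensor<4xf32>>
# becomes (memref<2xf32>, memref<2xf32>, memref<4xf32>) -> ().
args, results = convert_signature(
    ["tensor<2xf32>"], ["tuple<tensor<2xf32>, tensor<4xf32>>"])
```

This only models the type-level effect of the rewrite; the actual patch performs it via the conversion infrastructure on FuncOp and the return op.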

Event Timeline

bondhugula created this revision. · Jul 14 2020, 11:22 AM
Herald added a project: Restricted Project. · Jul 14 2020, 11:22 AM

Is there a running document anywhere detailing the design of the buffer assignment pass? In general it wouldn't be clear to me why we would treat tuples specially this way, aside from the usages in XLA.

mlir/test/lib/Transforms/TestBufferPlacement.cpp

Line 126: nit: Drop trivial braces.

Line 137: nit: all_of(TypeRange(operands), [](Type type) {?

> Is there a running document anywhere detailing the design of the buffer assignment pass? In general it wouldn't be clear to me why we would treat tuples specially this way, aside from the usages in XLA.

I don't think there is a document; I can update one if it exists. Waiting for @dfki-ehna to comment on where. The buffer conversion pass takes all returned *tensor values* of a function and, while converting them to memrefs, appends them to the function argument list (as output arguments). Now, when the return type is a tuple of tensors, we just maintain the same canonical form, so that you get a function whose trailing arguments are just "output memref arguments". That way clients don't have to unpack tuples: the inputs are already just a list of memrefs appearing as arguments, so having the same for outputs is consistent. (I didn't see a tuple of memrefs as an argument being a useful or consistent abstraction.) Other function/argument attribute information propagated from the originating program carries the necessary information on which are inputs/outputs, output aliases, or resource vars (like in TensorFlow), and you get a consistent function signature of "all memref arguments" after buffer conversion for what were previously input and output tensors.

bondhugula marked 3 inline comments as done.

Address review comments.

mlir/test/lib/Transforms/TestBufferPlacement.cpp

Line 137: Thanks.

> Is there a running document anywhere detailing the design of the buffer assignment pass?

+1: I think it is really needed to have a doc we can refer to and update along the way. This is not a trivial piece of infrastructure to interact with, and having a doc that summarizes the high-level view and the approach, and then gets into the details of how it is hooked together and how it is intended to be reused by MLIR users, seems like a very valuable thing to have.

herhut added subscribers: dfki-jugr, dfki-mako, herhut.

> I don't think there is a document; I can update one if it exists. Waiting for @dfki-ehna to comment on where.

@dfki-ehna is OOO until next week, as are the other two authors of buffer assignment @dfki-jugr and @dfki-mako.

> The buffer conversion pass takes all returned *tensor values* of a function and, while converting them to memrefs, appends them to the function argument list (as output arguments). Now, when the return type is a tuple of tensors, we just maintain the same canonical form, so that you get a function whose trailing arguments are just "output memref arguments". That way clients don't have to unpack tuples: the inputs are already just a list of memrefs appearing as arguments, so having the same for outputs is consistent. (I didn't see a tuple of memrefs as an argument being a useful or consistent abstraction.)

Could we just get rid of tuples completely and instead use multiple return values from the start? Do we actually need tuples as a data type?

> Could we just get rid of tuples completely and instead use multiple return values from the start? Do we actually need tuples as a data type?

Are you referring to TF? We are in MLIR here, and this is client-agnostic. The tuples were there to model HLO ops, many of which pack into and extract from tuples.

>> Could we just get rid of tuples completely and instead use multiple return values from the start? Do we actually need tuples as a data type?
>
> Are you referring to TF? We are in MLIR here, and this is client-agnostic. The tuples were there to model HLO ops, many of which pack into and extract from tuples.

No, I meant just in HLO. What we do here is essentially removing tuples as results when going to a buffer representation. My question is whether it would be better to remove tuples earlier, by destructuring them at the tensor level, and then have buffer allocation apply in the usual way. If there are cases where we cannot destructure tuples at the tensor level, we need to be able to allocate a representation for them. I would prefer if buffer assignment did not make this choice implicitly, but rather that it be part of an explicit rewrite.

bondhugula added a comment. · Edited · Jul 16 2020, 10:25 AM

> No, I meant just in HLO. What we do here is essentially removing tuples as results when going to a buffer representation. My question is whether it would be better to remove tuples earlier, by destructuring them at the tensor level, and then have buffer allocation apply in the usual way. If there are cases where we cannot destructure tuples at the tensor level, we need to be able to allocate a representation for them. I would prefer if buffer assignment did not make this choice implicitly, but rather that it be part of an explicit rewrite.

What you say is completely doable and meaningful as a rewrite in HLO, but I think you are assuming that eliminating such tuples is always desirable or possible before applying buffer conversion. On a side note, the real source of such return-value tuples in the TF/MLIR context is -tf-promote-resources-to-args, which creates a single tuple value for all TF outputs and resource-var output aliases. The handling of the tuples itself is about 6-7 lines of code here, but there is a mix of concerns (as you suggest) if one assumes the tuples could always be eliminated and that it would be desirable to do so. That is a big assumption, because such tuples may only ultimately go away while we do the tensor-to-buffer conversion (where the return-value-to-output-argument rewrite happens). I have a conversion pattern on the HLO side that deals with xla_hlo.tuple (now mhlo.tuple), doing pretty much what the test dialect's tensor tuple op does here. A tuple of tensors being returned could, for example, also be an operand of another intermediate op. How do we remove tuples in the tensor op form without breaking op semantics?

> What you say is completely doable and meaningful as a rewrite in HLO, but I think you are assuming that eliminating such tuples is always desirable or possible before applying buffer conversion. On a side note, the real source of such return-value tuples in the TF/MLIR context is -tf-promote-resources-to-args, which creates a single tuple value for all TF outputs and resource-var output aliases. The handling of the tuples itself is about 6-7 lines of code here, but there is a mix of concerns (as you suggest) if one assumes the tuples could always be eliminated and that it would be desirable to do so. That is a big assumption, because such tuples may only ultimately go away while we do the tensor-to-buffer conversion (where the return-value-to-output-argument rewrite happens). I have a conversion pattern on the HLO side that deals with xla_hlo.tuple (now mhlo.tuple), doing pretty much what the test dialect's tensor tuple op does here. A tuple of tensors being returned could, for example, also be an operand of another intermediate op. How do we remove tuples in the tensor op form without breaking op semantics?

I discussed this with @dfki-ehna today, and we concluded that we should make the lowering strategy configurable by the user of buffer assignment, similar to how this is done with the LLVM type converter. The idea is to be able to register the conversion of results with the BufferAssignmentTypeConverter. You would get something like addResultConversion, which registers a callback that is queried to translate a result of an operation into either a new argument or a result of a different type. That would also allow us to move the patterns back into the .cc implementation file and get rid of the templates. We could then add this handling of tuples to the HLO-to-LHLO lowering without mandating it for all users of bufferization.
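A rough sketch of such a pluggable registry in Python (the names below, like add_result_conversion, are hypothetical illustrations of the proposed addResultConversion hook, not an actual MLIR API):

```python
# Hypothetical sketch of a pluggable result-conversion registry,
# loosely mirroring the proposed BufferAssignmentTypeConverter hook.
# Each callback decides how one result type is lowered: either as an
# appended output argument or as a result of (possibly) another type.

class ResultConversionRegistry:
    def __init__(self):
        self._callbacks = []

    def add_result_conversion(self, callback):
        """callback(result_type) -> ("append_arg", types) |
        ("result", types) | None (None means: not handled)."""
        self._callbacks.append(callback)

    def convert_result(self, result_type):
        # Later registrations take precedence in this sketch.
        for cb in reversed(self._callbacks):
            decision = cb(result_type)
            if decision is not None:
                return decision
        # Default: keep the value as a (converted) result.
        return ("result", [result_type])

registry = ResultConversionRegistry()
# A user of bufferization opts in to tuple handling explicitly:
registry.add_result_conversion(
    lambda ty: ("append_arg", ty[len("tuple<"):-1].split(", "))
    if ty.startswith("tuple<") else None)

kind, types = registry.convert_result("tuple<memref<2xf32>, memref<4xf32>>")
```

With this shape, the tuple-flattening policy lives with the HLO-to-LHLO lowering that registers the callback, rather than being hardwired into buffer assignment itself.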

> I discussed this with @dfki-ehna today, and we concluded that we should make the lowering strategy configurable by the user of buffer assignment, similar to how this is done with the LLVM type converter. The idea is to be able to register the conversion of results with the BufferAssignmentTypeConverter. You would get something like addResultConversion, which registers a callback that is queried to translate a result of an operation into either a new argument or a result of a different type. That would also allow us to move the patterns back into the .cc implementation file and get rid of the templates. We could then add this handling of tuples to the HLO-to-LHLO lowering without mandating it for all users of bufferization.

Thanks @herhut - this sounds good to me. Please keep me in the loop as and when those changes happen. Is someone working on those?

dfki-ehna added a comment. · Edited · Jul 21 2020, 12:45 AM

@bondhugula Yes, I am working on it.
@rriddle @mehdi_amini We have started working on a detailed document for buffer assignment (BA).

mravishankar resigned from this revision.Aug 3 2020, 11:40 PM
bondhugula abandoned this revision.Aug 19 2020, 3:41 PM

This is subsumed by a PR by @dfki-ehna.