Page MenuHomePhabricator

[mlir] use unpacked memref descriptors at function boundaries
ClosedPublic

Authored by ftynse on Fri, Feb 7, 3:34 AM.

Details

Summary

The existing (default) calling convention for memrefs in standard-to-LLVM
conversion was motivated by interfacing with LLVM IR produced from C sources.
In particular, it passes a pointer to the memref descriptor structure when
calling the function. Therefore, the descriptor is allocated on stack before
the call. This convention leads to several problems. PR44644 indicates a
problem with stack exhaustion when calling functions with memref-typed
arguments in a loop. Allocating outside of the loop may lead to concurrent
access problems in case the loop is parallel. When targeting GPUs, the contents
of the stack-allocated memory for the descriptor (passed by pointer) needs to
be explicitly copied to the device. Using an aggregate type makes it impossible
to attach pointer-specific argument attributes pertaining to alignment and
aliasing in the LLVM dialect.

Change the default calling convention for memrefs in standard-to-LLVM
conversion to transform a memref into a list of arguments, each of primitive
type, that are comprised in the memref descriptor. This avoids stack allocation
for ranked memrefs (and thus stack exhaustion and potential concurrent access
problems) and simplifies the device function invocation on GPUs.

Provide an option in the standard-to-LLVM conversion to generate auxiliary
wrapper function with the same interface as the previous calling convention,
compatible with LLVM IR porduced from C sources. These auxiliary functions
pack the individual values into a descriptor structure or unpack it. They also
handle descriptor stack allocation if necessary, serving as an allocation
scope: the memory reserved by alloca will be freed on exiting the auxiliary
function.

The effect of this change on MLIR-generated only LLVM IR is minimal. When
interfacing MLIR-generated LLVM IR with C-generated LLVM IR, the integration
only needs to require auxiliary functions and change the function name to call
the wrapper function instead of the original function.

This also opens the door to forwarding aliasing and alignment information from
memrefs to LLVM IR pointers in the standrd-to-LLVM conversion.

Diff Detail

Event Timeline

ftynse created this revision.Fri, Feb 7, 3:34 AM

Thanks Alex for digging into this and explaining in detail the tradeoffs I also had to go through.
Most importantly thanks for putting enough sweat to getting to what I think is the best think we can hope for in the absence of handling the C ABI ourselves.
Looking now, with the objective of pushing this through with hi-pri.

nicolasvasilache accepted this revision.Fri, Feb 7, 5:49 AM

Minor (ultra)-nit.
Looks great to me, this should also help with @dcaballe 's issues with forcing the memref descriptor ABI on everything.

Thanks for this great patch!

mlir/docs/ConversionToLLVMDialect.md
415

typo original

mlir/include/mlir/Conversion/StandardToLLVM/ConvertStandardToLLVM.h
162

ultra-nit: extra space before each line to make the list pop out better.
Or use a proper list and backticks as such:

  • !llvm...
mlir/lib/Conversion/StandardToLLVM/ConvertStandardToLLVM.cpp
284

typo individual

607

call getNumUnpackedValues for a single source of truth.

mlir/test/Conversion/StandardToLLVM/convert-funcs.mlir
21–22

typo argument

There is a typo in the description: porduced

mlir/docs/ConversionToLLVMDialect.md
497

nit: Introducing

498

typo: minize -> minimize

mlir/include/mlir/Conversion/StandardToLLVM/ConvertStandardToLLVM.h
65

nit: ///

mlir/lib/Conversion/GPUToCUDA/ConvertLaunchFuncToCudaCalls.cpp
243

nit: /// for the comments.

mlir/lib/Conversion/StandardToLLVM/ConvertStandardToLLVM.cpp
199

///

220

///

768

I thought std.varargs was removed? Also, you can use op->getDialectAttrs() to filter out the ones without a prefix.

2732

getModule().emitError ?

mlir/lib/Transforms/DialectConversion.cpp
404

This is relaxed because you are inserting multiple operations to perform the conversion?

Thanks Alex! LGTM. As Nicolas mentioned, I think this should help with the aliasing issue. Once it settles down I should be able to retire the bare ptr calling convention if nobody else is interested in it.

mlir/docs/ConversionToLLVMDialect.md
339

Stide -> Stride

mlir/include/mlir/Conversion/StandardToLLVM/ConvertStandardToLLVM.h
121

constituting?

mlir/lib/Conversion/StandardToLLVM/ConvertStandardToLLVM.cpp
224

Remove type if it's not needed?

284

constituting

883

drop {}

931–939

drop {}

nmostafa added inline comments.
mlir/docs/ConversionToLLVMDialect.md
274

Can you add an example for how UnrankedMemRef gets unpacked as well ?

Also, IIUC, if we are passing UnrankedMemRefDescriptor arguments in a loop, we still might exhaust the stack, since we still alloca, copy the MemRefDescriptor and pack the rank and alloca ptr into the UnrankedMemRef struct. Correct ?

ftynse updated this revision to Diff 243282.Fri, Feb 7, 1:58 PM
ftynse marked 18 inline comments as done.

addressed most comments

mlir/docs/ConversionToLLVMDialect.md
274

Will do.

Yes, unranked memref still allocates once (instead of twice). I don't see an easy way around because it requires a pointer in order to erase the actual type. There are two things we can explore: outlining the memref_cast from ranked to unranked to scope the allocation, using vararg functions and argument reordering.

mlir/lib/Conversion/StandardToLLVM/ConvertStandardToLLVM.cpp
768

It was still there in the code, so I decided not to touch it in this patch. Will take a look in a follow-up.

883

I prefer not to becase if has braces.

mlir/lib/Transforms/DialectConversion.cpp
404

Exactly.

rriddle accepted this revision.Fri, Feb 7, 2:14 PM
rriddle added inline comments.
mlir/lib/Transforms/DialectConversion.cpp
404

We erase use_empty cast operations during applyRewrites. How does this interact with generating multiple operations when casting? (This doesn't have to block this revision, but just curious on your thoughts there)

This revision is now accepted and ready to land.Fri, Feb 7, 2:14 PM
ftynse updated this revision to Diff 243518.Mon, Feb 10, 5:13 AM
ftynse marked an inline comment as done.

Added example with unranked memref to the doc.
Added a helper view class.

ftynse marked 5 inline comments as done.Mon, Feb 10, 5:36 AM
ftynse added inline comments.
mlir/docs/ConversionToLLVMDialect.md
274

Can you add an example for how UnrankedMemRef gets unpacked as well ?

Done.

mlir/lib/Transforms/DialectConversion.cpp
404

It erases the last operation, but keeps the rest, which are all equally dead and have no side effects.

It's a variant of the problem we face a lot in rewrites: do we clean up immediately or do we expect the canonicalizer to clean up later. I don't have a good answer here, but I wouldn't go out of my way for cleaning.

In general, I considered the following:

  • since we use DialectConversionRewriter, we can use its undo stack to remove all operations introduced in the cast materialization;
  • with multiple operations generated, I'm not convinced that we can rely only on one of them (e.g., the last one) to consider the entire conversion dead; we can have stores and other side-effecting operations;
  • the canonicalizer is better aware of deadness, so it sounds reasonable to rely on it instead to remove dead casts; (this is similar to the decision not to implement valuesToRemoveIfDead in replaceAllUsesWith IMO)
  • if we want the cleanup, maybe we can try and call into the canonicalization directly.

This is great! Thanks a lot for fixing this, and for such a detailed and helpful commit message! Once this is committed, https://github.com/tensorflow/mlir/issues/210 can also be closed (I think it was the first report filed on this issue.)

mehdi_amini added inline comments.Mon, Feb 10, 8:38 PM
mlir/lib/Conversion/StandardToLLVM/ConvertStandardToLLVM.cpp
50

Seems like you didn't provide tests for this flag?

ftynse marked an inline comment as done.Tue, Feb 11, 5:03 AM
ftynse added inline comments.
mlir/lib/Conversion/StandardToLLVM/ConvertStandardToLLVM.cpp
50

Indeed, forgot to git add :/ Thanks for noticing. Pushed in ea3a25e4f5166ccd5e523f0165f5270b24d71f46.

Unfortunately, this commit is breaking the lowering of the noalias attribute to LLVM when the bare pointer calling convention is used. @check_noalias test in convert-static-memref-ops.mlir passes successfully but it seems that the problem happens only when more than one argument carries the attribute. For example, this test will fail if you modify it as follows:

--- a/mlir/test/Conversion/StandardToLLVM/convert-static-memref-ops.mlir
+++ b/mlir/test/Conversion/StandardToLLVM/convert-static-memref-ops.mlir
@@ -3,8 +3,9 @@
 // RUN: mlir-opt -convert-std-to-llvm='use-bare-ptr-memref-call-conv=1' -split-input-file %s | FileCheck %s --check-prefix=BAREPTR

 // BAREPTR-LABEL: func @check_noalias
+// BAREPTR-SAME: %{{.*}}: !llvm<"float*"> {llvm.noalias = true},
 // BAREPTR-SAME: %{{.*}}: !llvm<"float*"> {llvm.noalias = true}
-func @check_noalias(%static : memref<2xf32> {llvm.noalias = true}) {
+func @check_noalias(%static1 : memref<2xf32> {llvm.noalias = true}, %static2 : memref<2xf32> {llvm.noalias = true}) {
     return
 }

It would be great if someone could have a look at this quickly or revert the commit if this is not blocking for you.
I'll try to take a look tomorrow if nobody hasn't.

Thanks!

@dcaballe I don't think reverting the commit will be a reasonable path here, a bunch of other things and integrates have switched to this.

The problem seems to be that the existing tests were not sufficiently covering the behavior you are interested in enforcing?
It seems the natural solution would be to fix those cases and add the proper tests to cover those behaviors?

@dcaballe I don't think reverting the commit will be a reasonable path here, a bunch of other things and integrates have switched to this.

The problem seems to be that the existing tests were not sufficiently covering the behavior you are interested in enforcing?
It seems the natural solution would be to fix those cases and add the proper tests to cover those behaviors?

If a fix can be quickly provided by the commit author, sure. Otherwise a few days after a breakage is introduced, reverting is absolutely the correct path.
(you need to revert 696f80736b861 along the way).

Should be fixed by 39cb2a8fc79976171b20369ff756f7fa43232b50.

Reverting or fixing it yourself is a complexity trade-off. Patches that are bigger and/or stayed in the repo longer have higher chance of transitively affecting other patches up to a point that you'd need to revert dozens of commits and resolve conflicts, which may take you more time than finding a trivial problem....

Thanks for addressing this so quickly and for the feedback! I'm giving it a try. Re reverting patches, this is a very common practice. Sometimes patches are reverted without giving an opportunity to provide a fix. It shouldn't be a problem. LLVM has a no regression policy and testing goes beyond in-tree tests since many vendors have private forks. In any case, I just provided some options so that you could decide. Thanks again for addressing this so quickly!

@ayzhuang found another case that is breaking our project using the default calling convention. The noalias attribute is now added to all the flattened arguments, including those that are not pointers. LLVM complaints about it:

<stdin>:4:3: error: llvm.noalias attribute attached to LLVM non-pointer argument
  llvm.func @check_noalias(%arg0: !llvm<"float*"> {llvm.noalias = true}, %arg1: !llvm<"float*"> {llvm.noalias = true}, %arg2: !llvm.i64 {llvm.noalias = true}, %arg3: !llvm.i64 {llvm.noalias
= true}, %arg4: !llvm.i64 {llvm.noalias = true}, %arg5: !llvm<"float*"> {llvm.noalias = true}, %arg6: !llvm<"float*"> {llvm.noalias = true}, %arg7: !llvm.i64 {llvm.noalias = true}, %arg8: !l
lvm.i64 {llvm.noalias = true}, %arg9: !llvm.i64 {llvm.noalias = true}) {
  ^

This is extending the test case to cover this new scenario:

diff --git a/mlir/test/Conversion/StandardToLLVM/convert-static-memref-ops.mlir b/mlir/test/Conversion/StandardToLLVM/convert-static-memref-ops.mlir
index 105a80dec7a..da3ae1341f0 100644
--- a/mlir/test/Conversion/StandardToLLVM/convert-static-memref-ops.mlir
+++ b/mlir/test/Conversion/StandardToLLVM/convert-static-memref-ops.mlir
@@ -2,6 +2,9 @@
 // RUN: mlir-opt -convert-std-to-llvm='use-alloca=1' %s | FileCheck %s --check-prefix=ALLOCA
 // RUN: mlir-opt -convert-std-to-llvm='use-bare-ptr-memref-call-conv=1' -split-input-file %s | FileCheck %s --check-prefix=BAREPTR

+// CHECK-LABEL: func @check_noalias
+// CHECK-SAME: %{{.*}}: !llvm<"float*"> {llvm.noalias = true}, %{{.*}}: !llvm<"float*"> {llvm.noalias = true}, %{{.*}}: !llvm.i64, %{{.*}}: !llvm.i64, %{{.*}}: !llvm.i64,
+// CHECK-SAME: %{{.*}}: !llvm<"float*"> {llvm.noalias = true}, %{{.*}}: !llvm<"float*"> {llvm.noalias = true}, %{{.*}}: !llvm.i64, %{{.*}}: !llvm.i64, %{{.*}}: !llvm.i64)
 // BAREPTR-LABEL: func @check_noalias
 // BAREPTR-SAME: %{{.*}}: !llvm<"float*"> {llvm.noalias = true}, %{{.*}}: !llvm<"float*"> {llvm.noalias = true}
 func @check_noalias(%static : memref<2xf32> {llvm.noalias = true}, %other : memref<2xf32> {llvm.noalias = true}) {

Thanks for addressing this so quickly and for the feedback! I'm giving it a try. Re reverting patches, this is a very common practice. Sometimes patches are reverted without giving an opportunity to provide a fix. It shouldn't be a problem. LLVM has a no regression policy and testing goes beyond in-tree tests since many vendors have private forks. In any case, I just provided some options so that you could decide. Thanks again for addressing this so quickly!

I know it's common. In this specific situation, you will get in a weird situation where reverting this patch could fix your project, but break other projects, so somebody would revert the revert :)

@ayzhuang found another case that is breaking our project using the default calling convention. The noalias attribute is now added to all the flattened arguments, including those that are not pointers. LLVM complaints about it:

You should not be using llvm.noalias with the default calling convention. It has not worked before, since you cannot attach noalias to structures either. So I will be surprised if you actually had a test exercising the default calling convention with llvm.noalias that used to pass and that was broken by this change.

This is extending the test case to cover this new scenario:

diff --git a/mlir/test/Conversion/StandardToLLVM/convert-static-memref-ops.mlir b/mlir/test/Conversion/StandardToLLVM/convert-static-memref-ops.mlir
index 105a80dec7a..da3ae1341f0 100644
--- a/mlir/test/Conversion/StandardToLLVM/convert-static-memref-ops.mlir
+++ b/mlir/test/Conversion/StandardToLLVM/convert-static-memref-ops.mlir
@@ -2,6 +2,9 @@
 // RUN: mlir-opt -convert-std-to-llvm='use-alloca=1' %s | FileCheck %s --check-prefix=ALLOCA
 // RUN: mlir-opt -convert-std-to-llvm='use-bare-ptr-memref-call-conv=1' -split-input-file %s | FileCheck %s --check-prefix=BAREPTR

+// CHECK-LABEL: func @check_noalias
+// CHECK-SAME: %{{.*}}: !llvm<"float*"> {llvm.noalias = true}, %{{.*}}: !llvm<"float*"> {llvm.noalias = true}, %{{.*}}: !llvm.i64, %{{.*}}: !llvm.i64, %{{.*}}: !llvm.i64,
+// CHECK-SAME: %{{.*}}: !llvm<"float*"> {llvm.noalias = true}, %{{.*}}: !llvm<"float*"> {llvm.noalias = true}, %{{.*}}: !llvm.i64, %{{.*}}: !llvm.i64, %{{.*}}: !llvm.i64)
 // BAREPTR-LABEL: func @check_noalias
 // BAREPTR-SAME: %{{.*}}: !llvm<"float*"> {llvm.noalias = true}, %{{.*}}: !llvm<"float*"> {llvm.noalias = true}
 func @check_noalias(%static : memref<2xf32> {llvm.noalias = true}, %other : memref<2xf32> {llvm.noalias = true}) {

This is wrong. The allocated and aligned pointer _do_ alias, so attaching "noalias" to them is incorrect. There is no aliasing model on memrefs. When we have one, we will see how to lower them to the LLVM dialect.

Thanks for addressing this so quickly and for the feedback! I'm giving it a try. Re reverting patches, this is a very common practice. Sometimes patches are reverted without giving an opportunity to provide a fix. It shouldn't be a problem. LLVM has a no regression policy and testing goes beyond in-tree tests since many vendors have private forks. In any case, I just provided some options so that you could decide. Thanks again for addressing this so quickly!

I know it's common. In this specific situation, you will get in a weird situation where reverting this patch could fix your project, but break other projects, so somebody would revert the revert :)

I am not sure I agree with this: if other project already depended on a *feature* they don't get to revert a revert that was fixing a *bug* introduced with this new feature. They have to deal with this out-of-tree.
Of course the delay matters, in this case it was just a few days later: the revert is a no-brainer for me (unless the author can provide a quick fix, as I mentioned before, and which is what happened in this case).

dcaballe added a comment.EditedTue, Feb 18, 9:24 AM

So I will be surprised if you actually had a test exercising the default calling convention with llvm.noalias that used to pass and that was broken by this change.

We actually have been testing this since before memrefs were lowered to a pointer to struct. It became useless but it worked since the noalias attribute was attached to the pointer to struct. However, I think I can take care of this on our side. No problem then.
Thanks!