This is an archive of the discontinued LLVM Phabricator instance.

[MLIR] Introduce std.alloca op
ClosedPublic

Authored by bondhugula on Mar 23 2020, 5:21 AM.

Details

Summary

Introduce the alloca op for stack memory allocation. When converting to the
LLVM dialect, this is lowered to an llvm.alloca.

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Diff Detail

Event Timeline

bondhugula created this revision.Mar 23 2020, 5:21 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 23 2020, 5:21 AM
nicolasvasilache requested changes to this revision.Mar 23 2020, 5:57 AM

The amount of copy-pasta is uncanny, is there a way to factor out the 90%+ common part?
Also, please use the assemblyFormat for parsing and printing,

This revision now requires changes to proceed.Mar 23 2020, 5:57 AM

Also, please use the assemblyFormat for parsing and printing,

Can the assembly format support its parsing and printing? Both affine.apply and alloc don't use the assembly format and have identical operand syntax (affine.apply was recently migrated to ODS but not auto print / parse). All these three require two variadic operand lists. Let me know.

The amount of copy-pasta is uncanny, is there a way to factor out the 90%+ common part?
Also, please use the assemblyFormat for parsing and printing,

I was thinking the same. The duplication is unfortunate and it's all in StandardToLLVM - it's possible to factor it all out if we get rid of useAlloca because the AllocOp lowering is conditional on that. Should I remove useAlloca in this patch itself and mostly merge AllocOpLowering and AllocaOpLowering? Merging the parse/print methods is straightforward.

Share print/parse/verify b/w alloc/alloca.

Refactor ODS for alloc/alloca

Register canonicalization pattern as well

Harbormaster failed remote builds in B50259: Diff 252331!

The amount of copy-pasta is uncanny, is there a way to factor out the 90%+ common part?

I was thinking the same. The duplication is unfortunate and it's all in StandardToLLVM - it's possible to factor it all out if we get rid of useAlloca because the AllocOp lowering is conditional on that. Should I remove useAlloca in this patch itself and mostly merge AllocOpLowering and AllocaOpLowering? Merging the parse/print methods is straightforward.

@nicolasvasilache Done with the refactoring here (except for the llvm lowering - see question above).

rriddle added inline comments.Mar 24 2020, 5:33 PM
mlir/include/mlir/Dialect/StandardOps/IR/Ops.td
144

Use ValueRange instead of ArrayRef<Value> in builder mehtods.

252

nit: "alloca" -> alloca

256

Please use mlir code blocks for any inline code.

mlir/lib/Dialect/StandardOps/IR/Ops.cpp
250

llvm::is_one_of

bondhugula marked 5 inline comments as done.

Address comments from @rriddle

Thanks for the review!

mlir/lib/Dialect/StandardOps/IR/Ops.cpp
250

Thanks!

Thanks for adding the alloca op! Really needed.
Not sure if you discussed this already but just a nit about the name: any plans on renaming alloca and alloc so that it's a bit clearer what they model? I find it a bit confusing right now. Some options that came to mind:
alloca -> salloc, salloca
alloc -> malloc, malloca, alloc

Thanks for adding the alloca op! Really needed.
Not sure if you discussed this already but just a nit about the name: any plans on renaming alloca and alloc so that it's a bit clearer what they model? I find it a bit confusing right now. Some options that came to mind:
alloca -> salloc, salloca
alloc -> malloc, malloca, alloc

alloca -> salloca sounds good to me! Note that the 'a' suffix in 'alloca' is for 'automatic' freeing (originating from early Unix's and BSDs and it has always meant allocating from the caller's stack). So, alloc -> malloca, alloca -> salloc would be inconsistent. Since 'alloc' currently doesn't specify where it's from the stack/heap and specifies it's explicitly freed via dealloc, we can leave it like that.

What is the s in salloc? Stack? Can we make it explicit then: stack_alloc?

Note that the 'a' suffix in 'alloca' is for 'automatic' freeing

Thanks for clarifying! I thought it was just a short for 'allocate' :)

Can we make it explicit then: stack_alloc?

stack_alloc sounds better to me, thanks.

Since 'alloc' currently doesn't specify where it's from the stack/heap and specifies it's explicitly freed via dealloc, we can leave it like that.

It sounds good. I think we are mapping 'alloc' to static allocation by using a flag in the llvm lowering. Maybe can create a simple pass to do a more proper alloc->static_alloc conversion in the future and leave 'alloc' only for heap allocation.

@bondhugula
Thanks for the refactoring!

Dropping the LLVM flag and refactoring the impl further sounds like the right thing to do IMO.
The unit test that uses the flag can easily be updated (or deprecated in favor of your test).
The internal use case we have for this will be easy to update.

@bondhugula
Thanks for the refactoring!

Dropping the LLVM flag and refactoring the impl further sounds like the right thing to do IMO.
The unit test that uses the flag can easily be updated (or deprecated in favor of your test).
The internal use case we have for this will be easy to update.

This sounds the best to me too. I'll do it in this patch.

Refactored std to llvm lowering for alloc op to reuse for alloca. The lowering
rewrite was really long. Broken down with multiple helpers now: this is also
ready to now use aligned_alloc in place of malloc for heap allocations.

@bondhugula
Thanks for the refactoring!

Dropping the LLVM flag and refactoring the impl further sounds like the right thing to do IMO.
The unit test that uses the flag can easily be updated (or deprecated in favor of your test).
The internal use case we have for this will be easy to update.

This sounds the best to me too. I'll do it in this patch.

Done @nicolasvasilache PTAL.

Update comments.

Harbormaster completed remote builds in B51887: Diff 255239.
ftynse added inline comments.Apr 6 2020, 5:49 AM
mlir/include/mlir/Dialect/StandardOps/IR/Ops.td
252

Nit: I think the intention of the comment above was to ask you to use backticks instead of double quotes.

253

Could you please elaborate what is a stack frame in MLIR? We don't seem to have this concept defined anywhere. In particular, is it only related to std.func, or can one register other ops that create stack frames?

mlir/lib/Conversion/StandardToLLVM/StandardToLLVM.cpp
1568

Nit: can we rather define one at the call site and pass it here (and to another call) ?

bondhugula marked 5 inline comments as done.Apr 6 2020, 7:02 AM
bondhugula added inline comments.
mlir/include/mlir/Dialect/StandardOps/IR/Ops.td
253

A stack frame here refers to the standard stack frame concept in CS that we know of! It's up to the conversion out of MLIR to realize this correctly.

mlir/lib/Conversion/StandardToLLVM/StandardToLLVM.cpp
1568

The reason I didn't do that is that the order of the instructions in that case wouldn't be natural - there would be other stuff (the size constant definitions) between the def of 'one' and its first use here. So I left it this way.

bondhugula updated this revision to Diff 255319.Apr 6 2020, 7:02 AM
bondhugula marked 2 inline comments as done.

Address review comments.

ftynse added inline comments.Apr 6 2020, 7:15 AM
mlir/include/mlir/Dialect/StandardOps/IR/Ops.td
253

MLIR makes us rethink a lot of things. Like MLIR does not have functions as a first-class concept :) I could obviously guess your intention with stack frames, but I would still insist you think it through in context of MLIR. Similarly to functions, I don't think we have a built-in memory model that has a stack... Or that std.func semantics says something about stack frames. IIRC, this was one of the conceptual problems of having alloca in the first place.

What should happen if I do the following?

my.func @foo() {
  alloca ...
}

Or

std.func @foo() {
  %0 = my.func_that_may_or_may_not_be_a_lambda @inner() {
    alloca ...
  }
}

"Conversion out of MLIR" does not mean anything to me either. Do you mean the translation of LLVM dialect to LLVM IR? There are other passes that may be using the standard dialect that are completely unaware of that.

bondhugula marked an inline comment as done.Apr 6 2020, 7:56 AM
bondhugula added inline comments.
mlir/include/mlir/Dialect/StandardOps/IR/Ops.td
253

Irrespective of which case it is, it depends on the ops around it and for the folks defining those ops to realize it. What you need to keep in mind is that an alloca's memory is automatically freed (i.e., you won't find a dealloc) and it disappears at the time the stack frame goes away whenever such a concept exists. Now, one is of course free to transform it and implement the auto freeing in another way within MLIR itself. Heap allocations for eg. get promoted to stack ones (eg. in LLVM's passes), and for whatever reason, one could decide to switch a stack allocation to a heap one. That won't be an incorrect transform. So it ultimately may not even be realized on the stack (let alone your question of when it should be freed). It's the intent of the op when you actually see in the IR that is of an allocation that is freed automatically when the stack frame goes away.

nicolasvasilache accepted this revision.Apr 6 2020, 9:06 AM

Thanks Uday, looks good!

Accepting conditioned on the LLVM options struct change.

mlir/include/mlir/Dialect/StandardOps/IR/Ops.td
144

Note: I tend to prefer extra builders that take all PODs instead of attributes.
They are more natural to use and just work nicely once wrapped into EDSCs.

231

typo %d1

mlir/lib/Conversion/StandardToLLVM/StandardToLLVM.cpp
1513

I think the change in indentation (?) here and everywhere is unfortunate, it does not properly track what code moved (materialized with a yellow phab horizontal bar on the left) and make it harder to see what changed vs what moved.

I am mostly eyeballing the code after realizing most of this is just moved.

Please flag specific things need deeper review if appropriate.

3178

I am concerned by the silent behavior breaking changes here and in the followup revision.
I am expecting this will be painful for integrations and will repeat itself.

Can we turn this into a:

struct LowerToLLVMOptions {
  ... 
}
This revision is now accepted and ready to land.Apr 6 2020, 9:06 AM
ftynse requested changes to this revision.Apr 7 2020, 2:42 AM
ftynse added inline comments.
mlir/include/mlir/Dialect/StandardOps/IR/Ops.td
253

I don't disagree with anything you say. I am merely pointing out that it is absolutely unclear from the op definition _when_ the allocated memory will be automatically freed. And "when the stack frame goes away" is not a satisfactory definition, because it is meaningless under the current MLIR semantics as it is written. I could literally add the stack_frame.go_away() operation tomorrow and expect it to free the allocations... If you could instead tie it to something like "std.func" returning or the region of an operation with FunctionLike trait transferring control flow back to its enclosing op, it would be more MLIR-compatible.

This revision now requires changes to proceed.Apr 7 2020, 2:42 AM
mehdi_amini added inline comments.Apr 7 2020, 2:47 AM
mlir/include/mlir/Dialect/StandardOps/IR/Ops.td
253

I agree with Alex that this could deserve a more careful way of describing this.
In particular the "stack" isn't very important here, we really need a scope. Could we use a trait like we have for "IsolatedFromAbove" to control this scoping?
Unless we just consider every region as a new scope for these allocations?

bondhugula updated this revision to Diff 255622.Apr 7 2020, 3:08 AM
bondhugula marked 8 inline comments as done.

Address @nicolasvasilache review comments.

bondhugula added inline comments.Apr 7 2020, 3:15 AM
mlir/include/mlir/Dialect/StandardOps/IR/Ops.td
144

Hmm... I didn't change the existing alloc op builder here; just using the same for Alloca op. Yes, this can be changed to just take int64_t in another revision.

mlir/lib/Conversion/StandardToLLVM/StandardToLLVM.cpp
1513

All of this is just moved. Nothing in the alloc op lowering actually changed with this - I'm just reusing the common parts for alloca.

3178

That's a concern - there are going to be silent breaking effects. I've changed it to struct here. But even with a struct, you'll have the same issue with list initialization (some of the fields will remain uninitialized / wrongly initialized depending on where the field was removed from or where the new field was added). If the new fields are always added at the end, we'll just have uninitialized fields. And with explicit field-wise init, it'll just lead to uninitialized stuff.

BTW, none of the three options are documented at the declaration (but only in Passes.td).

bondhugula marked an inline comment as done.Apr 7 2020, 3:29 AM
bondhugula added inline comments.
mlir/include/mlir/Dialect/StandardOps/IR/Ops.td
253

Looks like these messages were in flight while I updated and committed. The rephrasing can be addressed in a future revision while we continue discussion here.

the current MLIR semantics as it is written. I could literally add the
stack_frame.go_away() operation tomorrow and expect it to
free the allocations.

The 'a' suffix in alloca is for automatic freeing - it can't be explicitly freed using another op. If you are imagining a low level op that exists that manipulates the stack frame, this would be a pathological case that it has freed it by circumventing things.

In particular the "stack" isn't very important here, we really need
a scope. Could we use a trait like we have for IsolatedFromAbove" to

Tying it to a scope like that of a closest surrounding op with a function like trait (or isolatedFromAbove) is almost fine, but ultimately not perfect since imperative function like ops like std.execute_region don't have function like traits, and we may want to imply freeing at that scope.

This revision was not accepted when it landed; it landed in state Needs Review.Apr 7 2020, 3:45 AM
This revision was automatically updated to reflect the committed changes.
ftynse added inline comments.Apr 7 2020, 4:55 AM
mlir/include/mlir/Dialect/StandardOps/IR/Ops.td
253

The 'a' suffix in alloca is for automatic freeing - it can't be explicitly freed using another op.

Why not? I would not constraint the behavior of every possible op based on a one-letter suffix of another op... "Automatic" does not mean it cannot be connected to another op. In fact, in MLIR, it will likely be connected to another op because functions are ops.

Tying it to a scope like that of a closest surrounding op with a function like trait (or isolatedFromAbove) is almost fine, but ultimately not perfect since imperative function like ops like std.execute_region don't have function like traits, and we may want to imply freeing at that scope.

IIUC, the idea is to introduce a new trait AutomaticAllocationScope that is orthogonal to FunctionLike. We can then make, e.g., std.func, llvm.func and std.execute_region have this trait, and let other ops opt-in to the automatic allocation/deallocation behavior. This would be my ideal solution, but I hesitated to push it since it may involve larger changes to this patch, trying to find a simpler solution like attaching to the function trait first. Now that the patch has landed, I think introducing a separate trait is the right thing to do.

bondhugula marked 4 inline comments as done.Apr 7 2020, 6:26 AM
bondhugula added inline comments.
mlir/include/mlir/Dialect/StandardOps/IR/Ops.td
253

mean it cannot be connected to another op. In fact, in MLIR, it will
likely be connected to another op because functions are ops.

Automatically there implies it has to be freed automatically, not by another op. The moment you want to explicitly free that, you'll have to rewrite it to something else - for eg. alloc/dealloc pair.

like attaching to the function trait first. Now that the patch has
landed, I think introducing a separate trait is the right thing to do.

Strictly speaking, I also feel adding another trait is the right option. But for now and to keep it lightweight until we have such use cases, 'freed when the closest surrounding op with FunctionLike trait returns' is a good approximation for 'stack frame being discarded'.

ftynse added inline comments.Apr 7 2020, 6:47 AM
mlir/include/mlir/Dialect/StandardOps/IR/Ops.td
253

Strictly speaking, I also feel adding another trait is the right option. But for now and to keep it lightweight until we have such use cases, 'freed when the closest surrounding op with FunctionLike trait returns' is a good approximation for 'stack frame being discarded'.

Let's have it as "freed when the closest surrounding op with FunctionLike trait has the control transferred back from its body", this avoids the potential interpretation that FunctionLike should also terminate with "std.return" and a weird-sounding "op <..> returns". We may also want to add a verifier check that such an op exists.

mehdi_amini added inline comments.Apr 7 2020, 11:27 AM
mlir/include/mlir/Dialect/StandardOps/IR/Ops.td
253

I am not convinced that the "FunctionLike" trait is the right one here, an op like gpu.launch for example would create a new scope in my expectations. What do you think about the trait Uday proposed above AutomaticAllocationScope ?

ftynse added a subscriber: herhut.Apr 7 2020, 11:40 AM
ftynse added inline comments.
mlir/include/mlir/Dialect/StandardOps/IR/Ops.td
253

I proposed it :) So I am obviously in favor of it.

I think we can start by associating the deallocation with FunctionLike, it's a simple documentation+verifier change that we can land fast and avoid having contentious code upstream as well as rolling it back. Then we can implement AutomaticAllocationScope and let ops to opt-in (e.g., I'd expect @herhut to decide whether we want allocas in GPU at all).

mehdi_amini added inline comments.Apr 7 2020, 7:20 PM
mlir/include/mlir/Dialect/StandardOps/IR/Ops.td
253

I think we can start by associating the deallocation with FunctionLike,

I'm afraid this will lead to wrong assumption, I rather have code written with the right trait checked from the beginning.

bondhugula marked 6 inline comments as done.Apr 7 2020, 9:57 PM
bondhugula added inline comments.
mlir/include/mlir/Dialect/StandardOps/IR/Ops.td
253

It is clear that using a new trait for scoped allocation will be accurate here. FunctionLike will always imply a new scope for stack allocation. Introducing the trait should be straightforward - okay to update the description when we introduce the trait.

mehdi_amini added inline comments.Apr 8 2020, 8:54 PM
mlir/include/mlir/Dialect/StandardOps/IR/Ops.td
253

Are you adding the trait? I think this should be done now: I wouldn't want code that start checking for "FunctionLike" where it should check for the "AutomaticAllocationScope" trait.

bondhugula marked 2 inline comments as done.Apr 8 2020, 9:27 PM
bondhugula added inline comments.
mlir/include/mlir/Dialect/StandardOps/IR/Ops.td
253

Alex, are you interested in / are you able to add this trait? If not, I can do it right away.

ftynse added inline comments.Apr 9 2020, 1:23 AM
mlir/include/mlir/Dialect/StandardOps/IR/Ops.td
253

I have quite a long todo list...

bondhugula marked 3 inline comments as done.Apr 9 2020, 3:16 AM
bondhugula added inline comments.
mlir/include/mlir/Dialect/StandardOps/IR/Ops.td
253