The terminator of this op is special: it does not just yield a value,
but bufferizes to a memcpy. This requires special treatment to make sure
that deallocs are placed after the memcpy. (By default, deallocs are
placed right before the terminator.)
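For illustration, a hedged sketch of the situation (op names follow upstream MLIR; the surrounding IR is simplified and the exact types are assumptions):

```
// Before bufferization: the scf.forall terminator inserts a slice into the
// shared output tensor.
%r = scf.forall (%i) in (4) shared_outs(%o = %t) -> (tensor<4xf32>) {
  %slice = "test.produce"() : () -> tensor<1xf32>
  scf.forall.in_parallel {
    tensor.parallel_insert_slice %slice into %o[%i] [1] [1]
        : tensor<1xf32> into tensor<4xf32>
  }
}

// After bufferization: the terminator has turned into a copy, so the dealloc
// of the temporary buffer must be placed after that copy -- not right before
// the terminator, as the default strategy would do.
scf.forall (%i) in (4) {
  %buf = memref.alloc() : memref<1xf32>
  "test.produce_into"(%buf) : (memref<1xf32>) -> ()
  %subview = memref.subview %dest[%i] [1] [1]
      : memref<4xf32> to memref<1xf32, strided<[1], offset: ?>>
  memref.copy %buf, %subview
      : memref<1xf32> to memref<1xf32, strided<[1], offset: ?>>
  memref.dealloc %buf : memref<1xf32>  // must come after the copy
}
```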
Diff Detail
- Repository
- rG LLVM Github Monorepo
This patch fixes the dealloc placement error, but I still have some questions:
- 'memref.dealloc' is generated during bufferization of the 'scf.forall' operation. It seems to me that the placement error is caused by an illegal insertion in that process, yet we expect other code to fix it up afterwards. Isn't this over-coupling?
- If we accept this approach, any operation contained in a terminator region would have to do the same thing as tensor.parallel_insert_slice during bufferization. Should we instead abstract this behavior into an interface or something similar?
I am just trying to contribute to the LLVM community, so maybe I missed some important rationale. Thank you in advance for any help and explanation.
A bit of background: This bug was caused by our buffer deallocation strategy, which is currently too simple: it always places deallocs at the end of a block, right before the terminator. This worked great until we added scf.forall, which was the first op with a terminator that does more than just yield.
You are right, this is not an ideal solution, because the root problem is the original placement of the dealloc. We plan to revamp the buffer deallocation logic this summer: One-Shot Bufferize would no longer insert any deallocations; that would be done by a separate pass. The change here is the smallest fix for the issue right now, without writing a bunch of code that will be deleted in a month.
Contributions are always welcome, so please keep them coming :)
Thank you for your patient explanation; inserting deallocations in a separate pass is indeed a better way.
By the way, how can I get involved and contribute to that refactoring?