For the synchronous case, destroy the stream after synchronization.
Sneak in a unrelated change to report why the gpu.wait conversion pattern didn't match.
Paths
| Differential D89933
[mlir][gpu] Handle async in gpu.launch_func lowering. ClosedPublic Authored by csigg on Oct 21 2020, 11:40 PM.
Details Summary For the synchronous case, destroy the stream after synchronization. Sneak in a unrelated change to report why the gpu.wait conversion pattern didn't match.
Diff Detail
Event TimelineHerald added subscribers: stephenneuendorffer, nicolasvasilache. · View Herald TranscriptOct 21 2020, 11:40 PM csigg added a parent revision: D89324: [mlir][gpu] Allow gpu.launch_func to be async..Oct 21 2020, 11:41 PM Comment Actions The lifetime management of streams is not correct, I think. Even with the assumption that "forks" are explicit, we might leak streams. For instance if we have %t3 = gpu.wait aysnc [%t1, %t2] // last use of %t1 and %t2 ... the streams would never be freed. I wonder whether it would be easier to not deallocate streams at all when lowering and then have a fix-up pass that inserts the stream destruction where needed.
This revision now requires changes to proceed.Oct 22 2020, 6:26 AM Comment Actions
Yes, you are right. The gpu.wait async lowering doesn't currently destroy the streams it depends on. The gpu.wait async ops we create in D89937 have no dependencies. My plan is that gpu.wait async with dependencies will be inserted for tokens that have multiple uses (the explicit "forks"). There, the streams we synchronize on should not be destroyed. We should verify that we don't leak, but hopefully that will be a little easier than a fix-up pass. The change here primarily fixes the existing leaks of streams that we encountered benchmark when lowering non-async gpu.launch_func.
This revision is now accepted and ready to land.Oct 29 2020, 8:57 AM
This revision was landed with ongoing or failed builds.Oct 29 2020, 2:16 PM Closed by commit rGfce99e5f739e: [mlir][gpu] Handle async in gpu.launch_func lowering. (authored by csigg). · Explain Why This revision was automatically updated to reflect the committed changes.
Revision Contents
Diff 301668 mlir/lib/Conversion/GPUCommon/ConvertLaunchFuncToRuntimeCalls.cpp
|
clang-format suggested style edits found: