This is an archive of the discontinued LLVM Phabricator instance.

Pass GPU events instead of streams across async regions.
ClosedPublic

Authored by csigg on Feb 18 2021, 8:29 AM.

Details

Summary

Lower !gpu.async.tokens returned from async.execute regions to events instead of streams.

Make !gpu.async.token returned from !async.execute single-use.
This allows creating one event per use and destroying them without leaking or ref-counting.
Technically we only need this for stream/event-based lowering. I kept the code separate
from the rest of the gpu-async-region pass so that we can make this optional or move
to a separate pass as needed.

Diff Detail

Event Timeline

csigg created this revision.Feb 18 2021, 8:29 AM
csigg requested review of this revision.Feb 18 2021, 8:29 AM
csigg updated this revision to Diff 324736.Feb 18 2021, 12:29 PM

Fix tests.

csigg updated this revision to Diff 324770.Feb 18 2021, 2:09 PM

Remove debug printfs ;-)

csigg updated this revision to Diff 324885.Feb 18 2021, 10:36 PM

Fix identifier naming.

herhut added inline comments.Feb 24 2021, 12:40 AM
mlir/lib/Conversion/GPUCommon/ConvertLaunchFuncToRuntimeCalls.cpp
483

The name is misleading. This is not done via types but rather by matching the defining op. Maybe reflect that in the name.

486

nit: Can you get that string from somewhere, so it is only defined in one place?

This will fall apart with control flow, but we do not support this currently, anyway.

mlir/lib/Dialect/GPU/Transforms/AsyncRegionRewriter.cpp
300

This is a bit confusing. Here results is the result operands and further down it is the actual result values of the updated op.

301

Can this be done without rewriting the op multiple times?

csigg updated this revision to Diff 326071.Feb 24 2021, 6:06 AM
csigg marked 3 inline comments as done.

Address reviewer comments.

mlir/lib/Conversion/GPUCommon/ConvertLaunchFuncToRuntimeCalls.cpp
486

The idea is that we lower control flow to events, at least at the beginning.

mlir/lib/Dialect/GPU/Transforms/AsyncRegionRewriter.cpp
301

It's not worth the (quite significant) extra complexity, because in most cases there will be just one.

herhut accepted this revision.Feb 25 2021, 1:23 AM

Thanks!

This revision is now accepted and ready to land.Feb 25 2021, 1:23 AM
This revision was automatically updated to reflect the committed changes.