
bondhugula (Uday Bondhugula)
User

Projects

User does not belong to any projects.

User Details

User Since
Dec 14 2019, 8:54 PM (132 w, 4 d)

Recent Activity

Sun, Jun 26

bondhugula added reviewers for D127089: [ExecEngine] Add scanMemref* API and an `input-dir` option in JIT runner: mehdi_amini, rriddle.
Sun, Jun 26, 3:41 PM · Restricted Project, Restricted Project
bondhugula added a comment to D128575: Adding a new variant of DepthwiseConv2D.

Please add a test case as well for the new op.

Sun, Jun 26, 3:37 PM · Restricted Project, Restricted Project
bondhugula added a comment to D128575: Adding a new variant of DepthwiseConv2D.

Please add a test case as well for the new op.

Sun, Jun 26, 3:35 PM · Restricted Project, Restricted Project
bondhugula committed rGdab6c11f83b5: [MLIR] NFC. Fix doc comment for AliasResult::isNo (authored by bondhugula).
[MLIR] NFC. Fix doc comment for AliasResult::isNo
Sun, Jun 26, 3:28 PM · Restricted Project, Restricted Project
bondhugula closed D128594: [MLIR] NFC. Fix doc comment for AliasResult::isNo.
Sun, Jun 26, 3:28 PM · Restricted Project, Restricted Project

Sat, Jun 25

bondhugula added inline comments to D128581: [mlir][SCF][bufferize] Bufferize scf.if/execute_region terminators separately.
Sat, Jun 25, 7:36 PM · Restricted Project, Restricted Project
bondhugula requested review of D128594: [MLIR] NFC. Fix doc comment for AliasResult::isNo.
Sat, Jun 25, 7:24 PM · Restricted Project, Restricted Project

Wed, Jun 22

bondhugula added a comment to D127089: [ExecEngine] Add scanMemref* API and an `input-dir` option in JIT runner.

@herhut - would you be able to also review this?

Wed, Jun 22, 8:47 AM · Restricted Project, Restricted Project

Mon, Jun 13

bondhugula added reviewers for D127089: [ExecEngine] Add scanMemref* API and an `input-dir` option in JIT runner: ftynse, herhut.
Mon, Jun 13, 8:22 AM · Restricted Project, Restricted Project
bondhugula accepted D127076: [MLIR][Parser] Fix AffineParser colliding bare identifiers with primitive types.

While adding tests, I noticed affine_maps like:

#map = affine_map<(d0, d1)[mod] -> (d0 + d1 + mod)>

wouldn't be parsed, even though they are correct according to the grammar. I added support to parse these affine_maps too.
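
(For illustration only -- a hypothetical usage sketch, assuming the parser change in this revision; the function and value names below are made up. Once such a map parses, the bare identifier behaves like any other symbol when the map is bound:)

#map = affine_map<(d0, d1)[mod] -> (d0 + d1 + mod)>
func.func @use_map(%i: index, %j: index, %m: index) -> index {
  %r = affine.apply #map(%i, %j)[%m]
  return %r : index
}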

Mon, Jun 13, 4:56 AM · Restricted Project, Restricted Project

Sun, Jun 5

bondhugula added inline comments to D127076: [MLIR][Parser] Fix AffineParser colliding bare identifiers with primitive types.
Sun, Jun 5, 8:58 PM · Restricted Project, Restricted Project
bondhugula requested changes to D127076: [MLIR][Parser] Fix AffineParser colliding bare identifiers with primitive types.
Sun, Jun 5, 6:55 PM · Restricted Project, Restricted Project

Thu, Jun 2

bondhugula added inline comments to D126753: [mlir][scf] Add option to loop pipelining to not peel the epilogue.
Thu, Jun 2, 8:09 PM · Restricted Project, Restricted Project

May 27 2022

bondhugula accepted D126585: [mlir] Tunnel LLVM_USE_LINKER through to the standalone example build..

This is a great improvement - thanks!

May 27 2022, 10:22 PM · Restricted Project, Restricted Project

May 25 2022

bondhugula added a comment to D102799: [mlir][SCF] Canonicalize nested ParallelOps's.

Isn't the whole point of canonicalization to convert semantically equivalent code to a single form? So we should either always merge nested loops or always split them (which would make nested-loop support in scf.parallel useless).

May 25 2022, 5:43 PM · Restricted Project, Restricted Project
bondhugula added inline comments to D126199: [MLIR][GPU] Expose GpuParallelLoopMapping as non-test pass..
May 25 2022, 5:32 PM · Restricted Project, Restricted Project
bondhugula accepted D126199: [MLIR][GPU] Expose GpuParallelLoopMapping as non-test pass..

LGTM - this should definitely have been a regular pass. Some minor comments.

May 25 2022, 10:02 AM · Restricted Project, Restricted Project

May 18 2022

bondhugula added inline comments to D125854: [mlir][normalize-memrefs] Non-normalizable operations with identity map layouts do not block normalization of the entire function.
May 18 2022, 9:55 AM · Restricted Project, Restricted Project

May 17 2022

bondhugula accepted D125613: [MLIR][Presburger] Attach values only to non-local identifiers in FAVC.

LGTM - thanks. Indeed, there aren't any use cases for attaching Values to local identifiers. Some minor comments.

May 17 2022, 6:55 PM · Restricted Project, Restricted Project

May 14 2022

bondhugula committed rG16219f8c94a2: [MLIR][GPU] Add canonicalizer for gpu.memcpy (authored by arnab-oss).
[MLIR][GPU] Add canonicalizer for gpu.memcpy
May 14 2022, 6:32 AM · Restricted Project, Restricted Project
bondhugula closed D124257: [MLIR][GPU] Add canonicalizer for gpu.memcpy.
May 14 2022, 6:31 AM · Restricted Project, Restricted Project

May 13 2022

bondhugula added inline comments to D125587: [ParseResult] Fix warning in flang build, incorporate feedback from River..
May 13 2022, 3:31 PM · Restricted Project, Restricted Project, Restricted Project

May 12 2022

bondhugula accepted D124257: [MLIR][GPU] Add canonicalizer for gpu.memcpy.

LGTM

May 12 2022, 7:58 PM · Restricted Project, Restricted Project

May 8 2022

bondhugula added a comment to D124257: [MLIR][GPU] Add canonicalizer for gpu.memcpy.

Looking mostly good. Some minor comments and a question regarding the async-dependency-related guard.

May 8 2022, 10:12 PM · Restricted Project, Restricted Project

Apr 29 2022

bondhugula added inline comments to D124257: [MLIR][GPU] Add canonicalizer for gpu.memcpy.
Apr 29 2022, 5:09 PM · Restricted Project, Restricted Project

Apr 22 2022

bondhugula added inline comments to D124257: [MLIR][GPU] Add canonicalizer for gpu.memcpy.
Apr 22 2022, 6:58 AM · Restricted Project, Restricted Project
bondhugula added inline comments to D121279: [MLIR][GPU] Add canonicalizer for gpu.memcpy.
Apr 22 2022, 1:42 AM · Restricted Project, Restricted Project
bondhugula added a comment to D121279: [MLIR][GPU] Add canonicalizer for gpu.memcpy.

Here is a repro:

func @copy(%arg0: memref<1xi8>, %arg1: memref<i1>) {
  %0 = arith.constant 0 : index
  %1 = memref.view %arg0[%0][] : memref<1xi8> to memref<i1>
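  // %1 is a view aliasing %arg0's buffer, so the copy below writes to caller-visible memory.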
  gpu.memcpy  %1, %arg1 : memref<i1>, memref<i1>
  func.return
}

mlir-opt --canonicalize removes the memcpy when it really shouldn't.

Apr 22 2022, 1:32 AM · Restricted Project, Restricted Project

Apr 21 2022

bondhugula committed rGf47a38f51724: Add async dependencies support for gpu.launch op (authored by bondhugula).
Add async dependencies support for gpu.launch op
Apr 21 2022, 3:56 AM · Restricted Project, Restricted Project
bondhugula closed D123499: Add async dependencies support for gpu.launch op.
Apr 21 2022, 3:56 AM · Restricted Project, Restricted Project
bondhugula added inline comments to D123499: Add async dependencies support for gpu.launch op.
Apr 21 2022, 3:55 AM · Restricted Project, Restricted Project
bondhugula updated the diff for D123499: Add async dependencies support for gpu.launch op.

Adjust error message.

Apr 21 2022, 3:25 AM · Restricted Project, Restricted Project

Apr 20 2022

bondhugula updated the diff for D123499: Add async dependencies support for gpu.launch op.

Rebase.

Apr 20 2022, 10:21 AM · Restricted Project, Restricted Project
bondhugula committed rGd7565de6cc6b: [MLIR] NFC. Drop trailing white space in GPU async ops print (authored by bondhugula).
[MLIR] NFC. Drop trailing white space in GPU async ops print
Apr 20 2022, 5:29 AM · Restricted Project, Restricted Project
bondhugula closed D123754: [MLIR] NFC. Drop trailing white space in GPU async ops print.
Apr 20 2022, 5:29 AM · Restricted Project, Restricted Project
bondhugula committed rGd423fc372466: Add RegionBranchOpInterface on affine.for op (authored by bondhugula).
Add RegionBranchOpInterface on affine.for op
Apr 20 2022, 5:20 AM · Restricted Project, Restricted Project
bondhugula closed D123568: Add RegionBranchOpInterface on affine.for op.
Apr 20 2022, 5:19 AM · Restricted Project, Restricted Project

Apr 19 2022

bondhugula committed rG12f55cac69d8: [MLIR][GPU] Add canonicalizer for gpu.memcpy (authored by arnab-oss).
[MLIR][GPU] Add canonicalizer for gpu.memcpy
Apr 19 2022, 5:25 AM · Restricted Project, Restricted Project
bondhugula closed D121279: [MLIR][GPU] Add canonicalizer for gpu.memcpy.
Apr 19 2022, 5:25 AM · Restricted Project, Restricted Project
bondhugula added a comment to D123893: [MLIR] Provide a way to print ops in custom form on pass failure.

The revert seemed totally warranted here: I believe the real breach of policy is that the patch was committed over a clearly stated objection, without an attempt to resolve the discussion; this is just not OK.

Apr 19 2022, 1:38 AM · Restricted Project, Restricted Project
bondhugula added a comment to D123893: [MLIR] Provide a way to print ops in custom form on pass failure.

Hey folks, I'm a bit dismayed by how this patch/discussion/revert unfolded. It seems to me that we are reaching for perfection on a debugging option that could have added utility and mileage on the way to an ultimately better state. For a lot of such things, on-the-ground experience of using something for debugging in varied circumstances is how we find the best outcome. I personally find merit in the points that both Chris and Mehdi made for the future, but I don't see a need to resolve them or get there in one step.

In any case, merits and opinions aside, this was not a shining example of either the development process or the meta-goal of having an open, pragmatic community. I'm not sure what to do about that, but I feel it needs stating. I don't have much of a stake in this issue, and if Mehdi/Uday would like to discuss it, I'd be open to facilitating.

Apr 19 2022, 1:13 AM · Restricted Project, Restricted Project
bondhugula added a comment to D123915: Print custom assembly on pass failure by default.

Nice, good point about the change to dump for invalid IR; this seems to provide the functionality of the previous change without needing a flag/API change, and it produces the prettier output in more cases. Of course the discussion is still ongoing, but this looks like a general improvement.

Not completely. It doesn't give you custom assembly when the IR is invalid -- you *will* need an extra flag to say you want custom assembly even when the IR is invalid.

Your patch didn't handle invalid IR, as I showed with the examples in one of my last answers there.

Apr 19 2022, 1:00 AM · Restricted Project, Restricted Project
bondhugula added a comment to D123893: [MLIR] Provide a way to print ops in custom form on pass failure.

Added a test case in https://reviews.llvm.org/D123916 to double-check my understanding and illustrate my point. I added a pass that creates an invalid op and has an option to trigger an optional call to "signalPassFailure()".
Below is the trace for all the possible configurations (the only constant is that the IR is invalid) with the current patch applied:

  • The pass signals pass failure, the option -mlir-print-custom-assembly-after-failure is not set:
$ bin/mlir-opt ../mlir/test/Pass/invalid-ir.mlir -pass-pipeline='func.func(test-pass-create-invalid-ir{signal-pass-failure=true})' -mlir-print-ir-after-failure 
// -----// IR Dump After (anonymous namespace)::TestInvalidIRPass Failed //----- //
"func.func"() ({
  "test.any_attr_of_i32_str"() : () -> ()
  "func.return"() : () -> ()
}) {function_type = () -> (), sym_name = "TestCreateInvalidCallInPass"} : () -> ()
  • The pass does not signal pass failure (so the verifier will fail instead), the option -mlir-print-custom-assembly-after-failure is not set:
$ bin/mlir-opt ../mlir/test/Pass/invalid-ir.mlir -pass-pipeline='func.func(test-pass-create-invalid-ir{signal-pass-failure=false})' -mlir-print-ir-after-failure 
<unknown>:0: error: 'test.any_attr_of_i32_str' op requires attribute 'attr'
<unknown>:0: note: see current operation: "test.any_attr_of_i32_str"() : () -> ()
// -----// IR Dump After (anonymous namespace)::TestInvalidIRPass Failed //----- //
"func.func"() ({
  "test.any_attr_of_i32_str"() : () -> ()
  "func.return"() : () -> ()
}) {function_type = () -> (), sym_name = "TestCreateInvalidCallInPass"} : () -> ()
  • The pass signals pass failure, the option -mlir-print-custom-assembly-after-failure is set:
$ bin/mlir-opt ../mlir/test/Pass/invalid-ir.mlir -pass-pipeline='func.func(test-pass-create-invalid-ir{signal-pass-failure=true})' -mlir-print-ir-after-failure  -mlir-print-custom-assembly-after-failure
// -----// IR Dump After (anonymous namespace)::TestInvalidIRPass Failed //----- //
"func.func"() ({
  "test.any_attr_of_i32_str"() : () -> ()
  "func.return"() : () -> ()
}) {function_type = () -> (), sym_name = "TestCreateInvalidCallInPass"} : () -> ()
  • The pass does not signal pass failure (so the verifier will fail instead), the option -mlir-print-custom-assembly-after-failure is set:
$ bin/mlir-opt ../mlir/test/Pass/invalid-ir.mlir -pass-pipeline='func.func(test-pass-create-invalid-ir{signal-pass-failure=false})' -mlir-print-ir-after-failure -mlir-print-custom-assembly-after-failure
<unknown>:0: error: 'test.any_attr_of_i32_str' op requires attribute 'attr'
<unknown>:0: note: see current operation: "test.any_attr_of_i32_str"() : () -> ()
// -----// IR Dump After (anonymous namespace)::TestInvalidIRPass Failed //----- //
"func.func"() ({
  "test.any_attr_of_i32_str"() : () -> ()
  "func.return"() : () -> ()
}) {function_type = () -> (), sym_name = "TestCreateInvalidCallInPass"} : () -> ()

We print generically in all cases: the current patch has no effect when there is invalid IR involved.

Apr 19 2022, 12:58 AM · Restricted Project, Restricted Project
bondhugula added a comment to D123499: Add async dependencies support for gpu.launch op.

Any more comments here @csigg ?

Apr 19 2022, 12:49 AM · Restricted Project, Restricted Project
bondhugula added a comment to D123915: Print custom assembly on pass failure by default.

Why is this safe? Does the printer already check that the IR passes verify() and fall back to the generic form automatically?

If so, this is awesome. Does Operation::dump() have the same behavior?

Yes: this changed ~ last month I think.

That is what I was trying to explain in Uday's patch and why I was objecting to adding a flag.

Apr 19 2022, 12:47 AM · Restricted Project, Restricted Project
bondhugula accepted D123915: Print custom assembly on pass failure by default.

Nice, good point about the change to dump for invalid IR; this seems to provide the functionality of the previous change without needing a flag/API change, and it produces the prettier output in more cases. Of course the discussion is still ongoing, but this looks like a general improvement.

Apr 19 2022, 12:45 AM · Restricted Project, Restricted Project
bondhugula edited reviewers for D123754: [MLIR] NFC. Drop trailing white space in GPU async ops print, added: ftynse; removed: ThomasRaoux.

Any other comments here: @rriddle?

Apr 19 2022, 12:33 AM · Restricted Project, Restricted Project
bondhugula added a comment to D123568: Add RegionBranchOpInterface on affine.for op.

@ftynse @herhut -- any other comments here? Everything's addressed.

Apr 19 2022, 12:32 AM · Restricted Project, Restricted Project

Apr 17 2022

bondhugula committed rGdaabcf5f04bb: [MLIR] Provide a way to print ops in custom form on pass failure (authored by bondhugula).
[MLIR] Provide a way to print ops in custom form on pass failure
Apr 17 2022, 7:45 AM · Restricted Project, Restricted Project
bondhugula closed D123893: [MLIR] Provide a way to print ops in custom form on pass failure.
Apr 17 2022, 7:44 AM · Restricted Project, Restricted Project
bondhugula added a comment to D123893: [MLIR] Provide a way to print ops in custom form on pass failure.

What is the concern here? I don't think that Uday is arguing this should be on-by-default.

Uday doesn't: I am.

To expand, my proposal is:

  1. Change the default for a pass failure (which is what this patch is about) to print the custom form, that is, remove the printGenericOpForm() call from mlir/lib/Pass/IRPrinting.cpp. This has nothing to do with invalid IR: printing the custom form already runs the verifier (by default) and falls back to the generic form anyway on failure.
Apr 17 2022, 7:39 AM · Restricted Project, Restricted Project
bondhugula updated the diff for D123893: [MLIR] Provide a way to print ops in custom form on pass failure.

Update the command-line flag name and add a test case.

Apr 17 2022, 7:38 AM · Restricted Project, Restricted Project

Apr 16 2022

bondhugula added a comment to D123893: [MLIR] Provide a way to print ops in custom form on pass failure.

The cost of getting custom-format output from a generic-form dump is rerunning it through the opt tool; the cost of a crash is rerunning the program (which can take multiple hours and a lot of compute resources). I'd much rather it be uniform and safe and avoid needing a flag plumbed through.

I have another take on this: this is printed after a pass has failed, but we don't know that the IR is invalid.
We were printing generically to be conservative; however, our printer is now "safe" in that it runs the verifier and prints generically anyway (even with the new option added here...) if verification fails.

Apr 16 2022, 6:14 PM · Restricted Project, Restricted Project
bondhugula added a comment to D123893: [MLIR] Provide a way to print ops in custom form on pass failure.

The cost of getting custom format output from generic output dump is rerunning through opt tool,

Apr 16 2022, 9:37 AM · Restricted Project, Restricted Project

Apr 15 2022

bondhugula added a comment to D123886: Improve terminator doc in MLIR LangRef.

Title: spec -> doc

Apr 15 2022, 10:58 PM · Restricted Project, Restricted Project
bondhugula accepted D123886: Improve terminator doc in MLIR LangRef.
Apr 15 2022, 10:57 PM · Restricted Project, Restricted Project
bondhugula added reviewers for D123893: [MLIR] Provide a way to print ops in custom form on pass failure: jpienaar, stellaraccident.
Apr 15 2022, 9:20 PM · Restricted Project, Restricted Project
bondhugula retitled D123893: [MLIR] Provide a way to print ops in custom form on pass failure from Provide a way to print ops in custom form on pass failure to [MLIR] Provide a way to print ops in custom form on pass failure.
Apr 15 2022, 9:17 PM · Restricted Project, Restricted Project
bondhugula updated the diff for D123893: [MLIR] Provide a way to print ops in custom form on pass failure.

Fix typos.

Apr 15 2022, 9:17 PM · Restricted Project, Restricted Project
bondhugula requested review of D123893: [MLIR] Provide a way to print ops in custom form on pass failure.
Apr 15 2022, 9:13 PM · Restricted Project, Restricted Project
bondhugula requested changes to D123855: [MLIR] Move values from FlatAffineValueConstraints to PresburgerSpace.
Apr 15 2022, 5:39 PM · Restricted Project, Restricted Project
bondhugula added a comment to D123266: [mlir][nvgpu] Add NVGPU dialect (architectural specific gpu dialect).

Looks good to me. We really need to figure out a way to group dialects :)

Please also wait for @bondhugula, who had concerns.

Thanks @herhut. @bondhugula, do you still have any concerns?

Apr 15 2022, 4:29 AM · Restricted Project, Restricted Project
bondhugula updated the diff for D123568: Add RegionBranchOpInterface on affine.for op.

Thanks for the review, @ftynse, @herhut. Fixed and enhanced the check. Tested via SCCP as well.

Apr 15 2022, 2:32 AM · Restricted Project, Restricted Project
bondhugula added inline comments to D123754: [MLIR] NFC. Drop trailing white space in GPU async ops print.
Apr 15 2022, 1:29 AM · Restricted Project, Restricted Project

Apr 14 2022

bondhugula committed rG3766ca75f874: [MLIR] Fix missing return statement warning in PatternMatch.h (authored by bondhugula).
[MLIR] Fix missing return statement warning in PatternMatch.h
Apr 14 2022, 12:36 AM · Restricted Project, Restricted Project
bondhugula closed D123756: [MLIR] Fix missing return statement warning in PatternMatch.h.
Apr 14 2022, 12:36 AM · Restricted Project, Restricted Project
bondhugula added inline comments to D123754: [MLIR] NFC. Drop trailing white space in GPU async ops print.
Apr 14 2022, 12:36 AM · Restricted Project, Restricted Project
bondhugula added a comment to D123756: [MLIR] Fix missing return statement warning in PatternMatch.h.

I wouldn't have expected this given the static assert, but weird.

Apr 14 2022, 12:32 AM · Restricted Project, Restricted Project
bondhugula updated the diff for D123756: [MLIR] Fix missing return statement warning in PatternMatch.h.

Switch to {}.

Apr 14 2022, 12:31 AM · Restricted Project, Restricted Project
bondhugula committed rG392d55c1e2d7: [MLIR][GPU] Add canonicalization patterns for folding simple gpu.wait ops. (authored by arnab-oss).
[MLIR][GPU] Add canonicalization patterns for folding simple gpu.wait ops.
Apr 14 2022, 12:02 AM · Restricted Project, Restricted Project
bondhugula closed D121878: [MLIR][GPU] Add canonicalization patterns for folding simple gpu.wait ops..
Apr 14 2022, 12:02 AM · Restricted Project, Restricted Project

Apr 13 2022

bondhugula requested review of D123756: [MLIR] Fix missing return statement warning in PatternMatch.h.
Apr 13 2022, 7:42 PM · Restricted Project, Restricted Project
bondhugula requested review of D123754: [MLIR] NFC. Drop trailing white space in GPU async ops print.
Apr 13 2022, 7:28 PM · Restricted Project, Restricted Project
bondhugula updated the diff for D123499: Add async dependencies support for gpu.launch op.

Add a couple more test cases for the outlining pass.

Apr 13 2022, 7:01 PM · Restricted Project, Restricted Project
bondhugula updated the diff for D123499: Add async dependencies support for gpu.launch op.

Missed updates for gpu.launch -> gpu.launch_func.

Apr 13 2022, 6:59 PM · Restricted Project, Restricted Project
bondhugula added a comment to D123499: Add async dependencies support for gpu.launch op.

For the support in this PR, I'm not adding a result token to the op if no async deps have been specified (even if the async keyword is specified, it's dropped during the print). Should it perhaps be a parse error, and similarly for the launch_func op?

I don't think this is a good idea. The 'async' keyword stands for the !gpu.async.token return type and should really be independent of the token operands inside []. I would not infer the former from the latter, but fail verification (for all but gpu.wait) if we want that.

This makes sense to me too -- follows from the previous para. I'll update the revision to make sure there is a meaning to the "async" keyword (regardless of the tokens) in that it returns a token indicating async execution.
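
(A minimal sketch of the intended form, for illustration -- the exact printed syntax is whatever the final diff settles on, and the value names below are made up: the async keyword yields a !gpu.async.token result, and the dependency list in [] may be empty or omitted.)

%c1 = arith.constant 1 : index
%t0 = gpu.wait async
%t1 = gpu.launch async [%t0]
    blocks(%bx, %by, %bz) in (%gx = %c1, %gy = %c1, %gz = %c1)
    threads(%tx, %ty, %tz) in (%sx = %c1, %sy = %c1, %sz = %c1) {
  gpu.terminator
}
gpu.wait [%t1]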

Apr 13 2022, 6:47 PM · Restricted Project, Restricted Project
bondhugula updated the diff for D123499: Add async dependencies support for gpu.launch op.

Fix semantics and syntax to allow async without any deps.

Apr 13 2022, 6:36 PM · Restricted Project, Restricted Project
bondhugula added a comment to D123647: [MLIR][GPU] Add GPU ops nvvm.mma.sync, nvvm.mma.ldmatrix, lane_id.

This is looking good to me. Thank you for contributing these. Mostly minor comments/requests for documentation.

Apr 13 2022, 5:44 AM · Restricted Project, Restricted Project
bondhugula accepted D121878: [MLIR][GPU] Add canonicalization patterns for folding simple gpu.wait ops..
Apr 13 2022, 5:38 AM · Restricted Project, Restricted Project
bondhugula added a comment to D123499: Add async dependencies support for gpu.launch op.

For the support in this PR, I'm not adding a result token to the op if no async deps have been specified (even if the async keyword is specified, it's dropped during the print).

I don't understand what you're describing: you don't get to control the number of outputs of the operation; this is parsed by MLIR before your custom parser, and if you don't provide a result type matching the number of results, MLIR will fatal_error() anyway.

Apr 13 2022, 2:35 AM · Restricted Project, Restricted Project
bondhugula added a comment to D123499: Add async dependencies support for gpu.launch op.

async with 0 async dep tokens wouldn't appear to be a meaningful configuration for the op. The lowering does check for it and fails, but should it just be disallowed?

I'm not sure. I would say the semantics are clear if an op uses async (the host does not wait for the op to complete) but no dependencies (it can run immediately without waiting for anything else), and it's OK for the current lowering to be limited in what it can handle and rely on gpu-async-region to bring it into lowering-compatible form. I kind of like the symmetry of these ops (including gpu.wait, where gpu.wait async [] needs to be valid).

Apr 13 2022, 2:31 AM · Restricted Project, Restricted Project
bondhugula added inline comments to D121878: [MLIR][GPU] Add canonicalization patterns for folding simple gpu.wait ops..
Apr 13 2022, 2:27 AM · Restricted Project, Restricted Project
bondhugula added inline comments to D121878: [MLIR][GPU] Add canonicalization patterns for folding simple gpu.wait ops..
Apr 13 2022, 2:25 AM · Restricted Project, Restricted Project

Apr 12 2022

bondhugula added inline comments to D121878: [MLIR][GPU] Add canonicalization patterns for folding simple gpu.wait ops..
Apr 12 2022, 4:45 PM · Restricted Project, Restricted Project
bondhugula added inline comments to D121878: [MLIR][GPU] Add canonicalization patterns for folding simple gpu.wait ops..
Apr 12 2022, 6:55 AM · Restricted Project, Restricted Project
bondhugula added inline comments to D121878: [MLIR][GPU] Add canonicalization patterns for folding simple gpu.wait ops..
Apr 12 2022, 3:57 AM · Restricted Project, Restricted Project
bondhugula added a comment to D123482: Fix CUDA runtime wrapper for GPU mem alloc/free to async.

You broke the bot apparently: https://lab.llvm.org/buildbot/#/builders/61/builds/24891 ; can you look into this?

/vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-cpu-runner: symbol lookup error: /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/lib/libmlir_cuda_runtime.so: undefined symbol: cuMemAllocAsync

The bot has an old version of CUDA that doesn't support async alloc. We already guarded it with >= 11.2 above, and I double-checked that an 11.2-or-higher version should have that method. Looking at the logs, the bot has CUDA_VERSION=10.2.89 in its env (not sure what the preprocessor sees). Either the format is different, which is tripping the macro check, or the build is being compiled with newer headers but linked against older libraries. Someone with access to the bot will have to check what CUDA_VERSION the build preprocessor actually sees.

Apr 12 2022, 12:15 AM · Restricted Project, Restricted Project
bondhugula added a comment to D123482: Fix CUDA runtime wrapper for GPU mem alloc/free to async.

You broke the bot apparently: https://lab.llvm.org/buildbot/#/builders/61/builds/24891 ; can you look into this?

Apr 12 2022, 12:07 AM · Restricted Project, Restricted Project

Apr 11 2022

bondhugula committed rG56245cc18c67: [MLIR] NFC. Address clang-tidy warning in AffineOps.cpp (authored by bondhugula).
[MLIR] NFC. Address clang-tidy warning in AffineOps.cpp
Apr 11 2022, 9:29 PM · Restricted Project, Restricted Project
bondhugula updated the diff for D123568: Add RegionBranchOpInterface on affine.for op.

Base it on the right commit.

Apr 11 2022, 9:25 PM · Restricted Project, Restricted Project
bondhugula requested review of D123568: Add RegionBranchOpInterface on affine.for op.
Apr 11 2022, 9:20 PM · Restricted Project, Restricted Project
bondhugula committed rGb4117fede20b: Fix CUDA runtime wrapper for GPU mem alloc/free to async (authored by bondhugula).
Fix CUDA runtime wrapper for GPU mem alloc/free to async
Apr 11 2022, 9:04 PM · Restricted Project, Restricted Project
bondhugula closed D123482: Fix CUDA runtime wrapper for GPU mem alloc/free to async.
Apr 11 2022, 9:04 PM · Restricted Project, Restricted Project
bondhugula updated the diff for D123482: Fix CUDA runtime wrapper for GPU mem alloc/free to async.

Rebase.

Apr 11 2022, 8:40 PM · Restricted Project, Restricted Project
bondhugula updated the diff for D123499: Add async dependencies support for gpu.launch op.

Sorted order.

Apr 11 2022, 6:18 PM · Restricted Project, Restricted Project
bondhugula updated the diff for D123499: Add async dependencies support for gpu.launch op.

Add AsyncOpInterface to gpu.launch and move common methods in GPUDialect.cpp up.

Apr 11 2022, 6:06 PM · Restricted Project, Restricted Project
bondhugula added a comment to D123499: Add async dependencies support for gpu.launch op.

A gpu.launch_func without async implies that it is synchronous (same for gpu.memset, gpu.memcpy, etc.). If the lowering to the GPU runtime didn't reject it, it would need to insert an mgpuStreamSynchronize call.

Apr 11 2022, 9:43 AM · Restricted Project, Restricted Project
bondhugula added a reviewer for D123499: Add async dependencies support for gpu.launch op: mehdi_amini.
Apr 11 2022, 6:52 AM · Restricted Project, Restricted Project
bondhugula added a comment to D123499: Add async dependencies support for gpu.launch op.

Not opposed to this change at all, but what's the motivation for allowing gpu-async-region to run before gpu-kernel-outlining?

Apr 11 2022, 6:52 AM · Restricted Project, Restricted Project
bondhugula updated the diff for D123499: Add async dependencies support for gpu.launch op.

Update commit summary.

Apr 11 2022, 6:36 AM · Restricted Project, Restricted Project