
herhut (Stephan Herhut)
Animal

Projects

User does not belong to any projects.

User Details

User Since
Jan 7 2020, 7:38 AM (130 w, 1 d)

Recent Activity

Today

herhut accepted D129228: [mlir][AMDGPU] Add --chipset option to AMDGPUToROCDL.
Thu, Jul 7, 3:13 AM · Restricted Project, Restricted Project

Mon, Jul 4

herhut added a comment to D129036: [mlir] Add InferIntRangeInterface to gpu.launch.

As a follow-up, how about adding an attribute next to gpu.kernel that encodes the launch bounds?
They can be used to provide ranges to gpu.block_id etc and can be lowered to e.g.:
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#performance-tuning-directives

Mon, Jul 4, 5:44 AM · Restricted Project, Restricted Project

May 31 2022

herhut added a comment to D126310: Do not destroy attrs in MergeNestedParallelLoops.

I don't think there are any guarantees that MLIR preserves unknown attributes during canonicalization. So this would be the first precedent in that direction.

May 31 2022, 3:52 AM · Restricted Project, Restricted Project
herhut accepted D126158: [MLIR][GPU] Replace fdiv on fp16 with promoted (fp32) multiplication with reciprocal plus one (conditional) Newton iteration..

Separate pass works for me.

May 31 2022, 1:42 AM · Restricted Project, Restricted Project, Restricted Project, Restricted Project

May 25 2022

herhut added inline comments to D126158: [MLIR][GPU] Replace fdiv on fp16 with promoted (fp32) multiplication with reciprocal plus one (conditional) Newton iteration..
May 25 2022, 8:26 AM · Restricted Project, Restricted Project, Restricted Project, Restricted Project
herhut accepted D126199: [MLIR][GPU] Expose GpuParallelLoopMapping as non-test pass..

If it is useful beyond testing, I don't see why it should not be a regular pass.

May 25 2022, 8:18 AM · Restricted Project, Restricted Project
Herald added a project to D102799: [mlir][SCF] Canonicalize nested ParallelOps's: Restricted Project.

What was the motivation behind this canonicalization? In particular, why would a combined parallel loop be considered the canonical form?

May 25 2022, 7:15 AM · Restricted Project, Restricted Project

Apr 28 2022

herhut accepted D124577: [mlir] Don't iterate mutable user list.

Thanks!

Apr 28 2022, 2:51 AM · Restricted Project, Restricted Project
herhut accepted D124533: [mlir][NvGpu] Fix nvgpu.mma.sync lowering to NVVM for f32, tf32 types.

Thanks. Adding @csigg as an FYI.

Apr 28 2022, 2:50 AM · Restricted Project, Restricted Project

Apr 27 2022

herhut committed rGc10bbc20bc4d: [mlir][bazel] Add suport for PDLL tests. (authored by herhut).
[mlir][bazel] Add suport for PDLL tests.
Apr 27 2022, 3:35 AM · Restricted Project
herhut closed D124515: [mlir][bazel] Add suport for PDLL tests..
Apr 27 2022, 3:35 AM · Restricted Project, Restricted Project
herhut added a reviewer for D124515: [mlir][bazel] Add suport for PDLL tests.: akuegel.
Apr 27 2022, 3:30 AM · Restricted Project, Restricted Project
herhut requested review of D124515: [mlir][bazel] Add suport for PDLL tests..
Apr 27 2022, 3:12 AM · Restricted Project, Restricted Project

Apr 25 2022

herhut added a comment to D124366: [mlir][vector] insert `alloca`s outside of loops.

What is the longer term plan here? Do all places that introduce alloca have to perform this local optimization? Should we instead have a pass that hoists alloca out of loops? Or rethink the introduction of allocation scope for loops?

Apr 25 2022, 5:17 AM · Restricted Project, Restricted Project

Apr 19 2022

herhut accepted D123568: Add RegionBranchOpInterface on affine.for op.

Thanks!

Apr 19 2022, 4:25 AM · Restricted Project, Restricted Project

Apr 13 2022

herhut accepted D123266: [mlir][nvgpu] Add NVGPU dialect (architectural specific gpu dialect).

Looks good to me. We really need to figure out a way to group dialects :)

Apr 13 2022, 8:32 AM · Restricted Project, Restricted Project

Apr 12 2022

herhut added a comment to D123568: Add RegionBranchOpInterface on affine.for op.

Thanks for adding this.

Apr 12 2022, 2:21 AM · Restricted Project, Restricted Project

Mar 21 2022

herhut accepted D122066: [mlir] Add a function to print C-strings to RunnerUtils.cpp..
Mar 21 2022, 2:26 AM · Restricted Project, Restricted Project

Feb 18 2022

herhut committed rGa43f7d6d7698: [mlir][tensor] Extend reshape utils. (authored by herhut).
[mlir][tensor] Extend reshape utils.
Feb 18 2022, 12:58 AM
herhut closed D119730: [mlir][tensor] Extend reshape utils..
Feb 18 2022, 12:58 AM · Restricted Project

Feb 17 2022

herhut added a comment to D120046: [buildbot][cuda] Update used CUDA version to 11.6..

Found the testing. I need to get my docker setup in order before I can test, so this will take a little to land.

Feb 17 2022, 7:47 AM
herhut accepted D120021: Rename PatternRewriteSet::insert to add.

Thanks!

Feb 17 2022, 7:45 AM · Restricted Project
herhut added a reviewer for D120046: [buildbot][cuda] Update used CUDA version to 11.6.: tra.

Is there a way to test these configurations before submitting them?

Feb 17 2022, 5:20 AM
herhut requested review of D120046: [buildbot][cuda] Update used CUDA version to 11.6..
Feb 17 2022, 5:15 AM
herhut accepted D120044: [BufferDeallocation] Don't assume successor operands are unique.

Thanks!

Feb 17 2022, 5:15 AM · Restricted Project
herhut added a comment to D119730: [mlir][tensor] Extend reshape utils..

I have always felt this method to be a bit ad-hoc. I wanted to replace this with the logic in this patch https://reviews.llvm.org/D119904 (at line 792 in LinalgOps.cpp). That logic is better defined and easier to follow I think. If you could adapt this to use that approach, that'd be great.

Feb 17 2022, 4:37 AM · Restricted Project

Feb 16 2022

herhut accepted D119938: [mlir][MemRef] Lower memref.copy with an offset to memcpy.

Thanks!

Feb 16 2022, 7:40 AM · Restricted Project
herhut added a comment to D119938: [mlir][MemRef] Lower memref.copy with an offset to memcpy.

Consider also adding an integration test. I think we have one for the other cases.

Feb 16 2022, 6:35 AM · Restricted Project
herhut accepted D119932: [mlir][gpu] Split ops sinking from gpu-kernel-outlining pass into separate pass.

Thanks for splitting this out.

Feb 16 2022, 6:00 AM · Restricted Project

Feb 15 2022

herhut added a comment to D119743: [MLIR][OpenMP][SCF] Mark parallel regions as allocation scopes.

Thanks for your reference to the scoping operation @ftynse. I had the feeling we had something like it but could not find it.

Feb 15 2022, 8:48 AM · Restricted Project
herhut added a comment to D119743: [MLIR][OpenMP][SCF] Mark parallel regions as allocation scopes.

My concern with only doing the scf.parallel operation is that we do lower it to scf.for loops in some cases and we would have to model this behavior there somehow. How would we do this? Do we have a suitable high-level abstraction that allows us to insert an allocation scope? Alternatively, we could mark the scf.for loop the same. That seems a reasonable choice but scf.for supports returning values, so we might actually have escaping stack allocations currently. Is this a model we want to support?

Feb 15 2022, 3:26 AM · Restricted Project
herhut accepted D119632: [mlir][gpu] sinkOperationsIntoLaunchOp: Add user hook for isSinkingBeneficiary.

Thank you. Just a naming nit.

Feb 15 2022, 3:04 AM · Restricted Project

Feb 14 2022

herhut added a comment to D117427: Add verifier for gpu.alloc op.

Thanks for adding this. Could this rather reuse the logic from verifyAllocLike from the memref operations? It is the same logic but it would need to be exposed to other dialects.

Thanks for the suggestion. Is it allowed to expose this function to other dialects?

Hello @herhut,

Could you please reply to this?

Thanks,
Akshay

Feb 14 2022, 11:21 AM · Restricted Project
herhut added a comment to D119632: [mlir][gpu] sinkOperationsIntoLaunchOp: Add user hook for isSinkingBeneficiary.

Thank you. The default is fairly arbitrary (well, it's good enough for some index computations) and we really should have some cost model here. +1 to making it configurable at least.

Feb 14 2022, 9:31 AM · Restricted Project
herhut added reviewers for D119730: [mlir][tensor] Extend reshape utils.: mravishankar, ftynse.
Feb 14 2022, 9:06 AM · Restricted Project
herhut requested review of D119730: [mlir][tensor] Extend reshape utils..
Feb 14 2022, 8:55 AM · Restricted Project

Jan 31 2022

herhut accepted D118543: Remove OpTrait, AttrTrait and TypeTrait.
Jan 31 2022, 3:06 AM · Restricted Project, Restricted Project

Jan 26 2022

herhut added inline comments to D118241: [mlir] Move SCF utils implementations to SCF/Utils..
Jan 26 2022, 4:39 AM · Restricted Project

Jan 20 2022

herhut committed rG6d45284618f0: [mlir][memref] Add better support for identity layouts in memref.collapse_shape… (authored by herhut).
[mlir][memref] Add better support for identity layouts in memref.collapse_shape…
Jan 20 2022, 6:32 AM
herhut closed D117772: [mlir][memref] Add better support for identity layouts in memref.collapse_shape canonicalizer.
Jan 20 2022, 6:32 AM · Restricted Project
herhut added a comment to D117519: [MLIR][GPU] Add debug output to enable dumping GPU assembly.

Please fix the commit message though. I just noticed this now.

Jan 20 2022, 5:24 AM · Restricted Project
herhut accepted D117519: [MLIR][GPU] Add debug output to enable dumping GPU assembly.

Thanks.

Jan 20 2022, 5:24 AM · Restricted Project
herhut added a comment to D117772: [mlir][memref] Add better support for identity layouts in memref.collapse_shape canonicalizer.

PTAL

Jan 20 2022, 5:05 AM · Restricted Project
herhut updated the diff for D117772: [mlir][memref] Add better support for identity layouts in memref.collapse_shape canonicalizer.

Pull size checking out of helper.

Jan 20 2022, 5:03 AM · Restricted Project
herhut added reviewers for D117772: [mlir][memref] Add better support for identity layouts in memref.collapse_shape canonicalizer: bkramer, ftynse, nicolasvasilache.
Jan 20 2022, 4:23 AM · Restricted Project
herhut requested review of D117772: [mlir][memref] Add better support for identity layouts in memref.collapse_shape canonicalizer.
Jan 20 2022, 4:22 AM · Restricted Project

Jan 18 2022

herhut added a comment to D117519: [MLIR][GPU] Add debug output to enable dumping GPU assembly.

Thanks for adding this. I have wanted this a couple of times, too, but never went as far as creating a diff for it.

Jan 18 2022, 10:52 AM · Restricted Project

Jan 17 2022

herhut added a comment to D117427: Add verifier for gpu.alloc op.

Thanks for adding this. Could this rather reuse the logic from verifyAllocLike from the memref operations? It is the same logic but it would need to be exposed to other dialects.

Jan 17 2022, 4:41 AM · Restricted Project

Jan 14 2022

herhut committed rGaa3cabe3cbe8: [mlir][memref] Fix memref.copy of scalar memref (authored by herhut).
[mlir][memref] Fix memref.copy of scalar memref
Jan 14 2022, 7:16 AM
herhut closed D117314: [mlir][memref] Fix memref.copy of scalar memref.
Jan 14 2022, 7:15 AM · Restricted Project
herhut requested review of D117314: [mlir][memref] Fix memref.copy of scalar memref.
Jan 14 2022, 7:02 AM · Restricted Project
herhut committed rGab95ba704da4: [mlir][memref] Implement fast lowering of memref.copy (authored by herhut).
[mlir][memref] Implement fast lowering of memref.copy
Jan 14 2022, 5:22 AM
herhut closed D116099: [mlir][memref] Implement fast lowering of memref.copy.
Jan 14 2022, 5:22 AM · Restricted Project

Jan 13 2022

herhut updated the diff for D116099: [mlir][memref] Implement fast lowering of memref.copy.

fix test

Jan 13 2022, 3:06 AM · Restricted Project

Jan 12 2022

herhut accepted D117101: Fold arith.cmpf when at least one operand is known to be NaN..
Jan 12 2022, 5:40 AM · Restricted Project

Jan 11 2022

herhut added a comment to D117010: Mark arith.minf, arith.maxf as commutative..

The arith::minf operation is defined as

Jan 11 2022, 9:23 AM · Restricted Project
herhut accepted D117011: Remove NaN constant from arith.minf, arith.maxf expansion.

Thanks!

Jan 11 2022, 9:06 AM · Restricted Project
herhut added a comment to D116600: [linalg][fusion] Disallow fusion when it would create an invalid expand_shape.

I think Mahesh's suggestion is better because there is no guarantee in shape reification that you'd return 1 and you may well end up with propagating dynamic information further down.

There could well be a separate pass that inserts these casts to avoid the behavior in shape reification if that is critical to your use cases. I think it would still be better to fix shape reification to return a static value where possible. After all, if this is valid linalg IR, reification should do as well as it can with it.

Jan 11 2022, 8:48 AM · Restricted Project
herhut updated the diff for D116099: [mlir][memref] Implement fast lowering of memref.copy.

rebase

Jan 11 2022, 7:31 AM · Restricted Project
herhut accepted D117010: Mark arith.minf, arith.maxf as commutative..

Thanks!

Jan 11 2022, 6:07 AM · Restricted Project
herhut added inline comments to D117011: Remove NaN constant from arith.minf, arith.maxf expansion.
Jan 11 2022, 6:06 AM · Restricted Project

Jan 10 2022

herhut added a comment to D116889: Add inliner interface for GPU dialect.

Is it legal though? Should we have something like gpu.thread_id outside of a GPU function/module? I would rather think it is legal to inline GPU functions into GPU functions.

Jan 10 2022, 3:02 AM · Restricted Project
herhut committed rG33cec20dbd3b: [mlir][memref] Tighten verification of memref.reinterpret_cast (authored by herhut).
[mlir][memref] Tighten verification of memref.reinterpret_cast
Jan 10 2022, 2:56 AM
herhut closed D116601: [mlir][memref] Tighten verification of memref.reinterpret_cast.
Jan 10 2022, 2:56 AM · Restricted Project

Jan 7 2022

herhut added inline comments to D116600: [linalg][fusion] Disallow fusion when it would create an invalid expand_shape.
Jan 7 2022, 6:46 AM · Restricted Project
herhut added a comment to D116601: [mlir][memref] Tighten verification of memref.reinterpret_cast.

Thanks!

Jan 7 2022, 4:36 AM · Restricted Project
herhut added a comment to D116099: [mlir][memref] Implement fast lowering of memref.copy.

You are just using isIdentity() -- the API doesn't name map outside. In the inside, it's still a map.

I was not clear. I was wondering about the case where the static type of the memref does not have a map but the descriptor at runtime still does not have identity strides. Currently, one can create this setting using memref.reinterpret_cast by omitting the map from the target type. Maybe that should be illegal.

This would be an invalid op and should have failed the verifier in the first place. (Note that the memref always has a map -- it's not printed if it's identity as you know. Also, identity strides (all ones) don't correspond to an identity map -- N^2, N, 1, for example, would correspond to an identity map for a 3-d memref.)

Jan 7 2022, 1:04 AM · Restricted Project
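
The stride remark in the D116099 thread above can be illustrated with a small sketch. It assumes a static shape and strides counted in elements; `identity_strides` and `has_identity_strides` are hypothetical helper names chosen for illustration, not MLIR functions.

```python
def identity_strides(shape):
    """Row-major strides (in elements) implied by an identity layout."""
    strides = [1] * len(shape)
    # Walk from the innermost dimension outwards, accumulating extents.
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    return strides


def has_identity_strides(shape, strides):
    """True if the given strides match the identity layout for this shape."""
    return list(strides) == identity_strides(shape)


n = 4
print(identity_strides([n, n, n]))                 # N^2, N, 1 for a 3-d memref
print(has_identity_strides([n, n, n], [1, 1, 1]))  # all-ones strides do not match
```

For an N x N x N shape this yields N^2, N, 1, which is why all-ones strides are not an identity layout unless every extent is 1.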
herhut updated the diff for D116099: [mlir][memref] Implement fast lowering of memref.copy.

Combined the two patterns.

Jan 7 2022, 1:01 AM · Restricted Project
herhut added inline comments to D116601: [mlir][memref] Tighten verification of memref.reinterpret_cast.
Jan 7 2022, 12:58 AM · Restricted Project
herhut updated the diff for D116601: [mlir][memref] Tighten verification of memref.reinterpret_cast.

Simplified as suggested

Jan 7 2022, 12:56 AM · Restricted Project

Jan 5 2022

herhut accepted D116657: [mlir] Fix missing check on nested op values in LICM.

Great find. Thanks for the fix.

Jan 5 2022, 5:52 AM · Restricted Project

Jan 4 2022

herhut requested review of D116601: [mlir][memref] Tighten verification of memref.reinterpret_cast.
Jan 4 2022, 8:19 AM · Restricted Project

Dec 22 2021

herhut added a comment to D116099: [mlir][memref] Implement fast lowering of memref.copy.

Do you want to instead create one pattern with two functions in it? It'll lead to less overhead in the greedy rewrite driver and perhaps also easier to later choose between the two.

I thought about that, too. Another way would be to give the memcpy based pattern higher benefit and actually make them non-exclusive. That way one could optionally blend in the memcpy based pattern.

All of this still leads to two patterns.

Dec 22 2021, 8:48 AM · Restricted Project
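
The "higher benefit, non-exclusive" idea from the D116099 comment above can be modelled with a toy driver. This is a sketch only: `Pattern`, `apply_first_match`, and the two rewrite functions are hypothetical names, and the driver is a simplification that tries patterns in descending benefit order rather than a reproduction of MLIR's rewrite infrastructure.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Pattern:
    name: str
    benefit: int
    # Returns a description of the lowering, or None if the pattern does not apply.
    rewrite: Callable[[dict], Optional[str]]


def apply_first_match(patterns, op):
    """Try patterns from highest to lowest benefit; the first success wins."""
    for pattern in sorted(patterns, key=lambda p: p.benefit, reverse=True):
        result = pattern.rewrite(op)
        if result is not None:
            return result
    return None


def memcpy_rewrite(op):
    # Assumed restriction: only contiguous copies qualify for the fast path.
    return "memcpy" if op["contiguous"] else None


def generic_rewrite(op):
    # Fallback that accepts every copy.
    return "runtime copy call"


generic = Pattern("generic", benefit=1, rewrite=generic_rewrite)
fast = Pattern("memcpy", benefit=2, rewrite=memcpy_rewrite)

print(apply_first_match([generic], {"contiguous": True}))
print(apply_first_match([generic, fast], {"contiguous": True}))
print(apply_first_match([generic, fast], {"contiguous": False}))
```

Because the fallback still matches everything, adding or omitting the higher-benefit pattern changes only which lowering is chosen, never whether one exists.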
herhut added a comment to D116151: [mlir][arith] Fix CmpIOP folding for vector types..

Can you change the subject to have an additional tag, e.g., [mlir][arith]? That provides a little more context. You could also state that it did fail for vectors before but now it folds them correctly.

Dec 22 2021, 6:40 AM · Restricted Project
herhut accepted D116151: [mlir][arith] Fix CmpIOP folding for vector types..

Thanks!

Dec 22 2021, 6:39 AM · Restricted Project
herhut added a comment to D116099: [mlir][memref] Implement fast lowering of memref.copy.

Do you want to instead create one pattern with two functions in it? It'll lead to less overhead in the greedy rewrite driver and perhaps also easier to later choose between the two.

Dec 22 2021, 5:24 AM · Restricted Project
herhut updated the diff for D116099: [mlir][memref] Implement fast lowering of memref.copy.

Add comments and failure reason.

Dec 22 2021, 5:21 AM · Restricted Project

Dec 21 2021

herhut requested review of D116099: [mlir][memref] Implement fast lowering of memref.copy.
Dec 21 2021, 4:00 AM · Restricted Project
herhut committed rG8761f5ebf754: [mlir][Support] Avoid multiplication in floorDiv / ceilDiv (authored by herhut).
[mlir][Support] Avoid multiplication in floorDiv / ceilDiv
Dec 21 2021, 2:51 AM
herhut closed D116096: [mlir][Support] Avoid multiplication in floorDiv / ceilDiv.
Dec 21 2021, 2:51 AM · Restricted Project
herhut accepted D114678: [mlir][memref] ReinterpretCast: allow static sizes/strides/offset where affine map expects dynamic.

Thanks. I don't see any reason to disallow this either and was surprised by the error message.

Dec 21 2021, 2:36 AM · Restricted Project
herhut updated the diff for D116096: [mlir][Support] Avoid multiplication in floorDiv / ceilDiv.

Add tests and fix behaviour for zero case.

Dec 21 2021, 2:32 AM · Restricted Project
herhut requested review of D116096: [mlir][Support] Avoid multiplication in floorDiv / ceilDiv.
Dec 21 2021, 1:58 AM · Restricted Project
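
For readers unfamiliar with what "[mlir][Support] Avoid multiplication in floorDiv / ceilDiv" is about, here is one way to derive floor and ceiling division from a truncating division without multiplying the quotient back. It is a sketch under assumptions, not the code from D116096: `trunc_div` emulates C/C++ integer division, and all three function names are hypothetical.

```python
def trunc_div(a, b):
    """Integer division rounding toward zero, as C and C++ do."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def floor_div(lhs, rhs):
    assert rhs != 0
    q = trunc_div(lhs, rhs)
    # Truncation rounds a negative inexact quotient up, so step one down.
    if lhs % rhs != 0 and (lhs < 0) != (rhs < 0):
        return q - 1
    return q


def ceil_div(lhs, rhs):
    assert rhs != 0
    q = trunc_div(lhs, rhs)
    # Truncation rounds a positive inexact quotient down, so step one up.
    if lhs % rhs != 0 and (lhs < 0) == (rhs < 0):
        return q + 1
    return q


print(floor_div(-7, 2), ceil_div(-7, 2))
print(floor_div(7, 2), ceil_div(7, 2))
print(floor_div(0, 5), ceil_div(0, 5))
```

A zero numerator never triggers an adjustment because the division is exact, which is the kind of edge case the "fix behaviour for zero case" update refers to.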

Dec 16 2021

herhut accepted D115821: [mlir] Extend `tensor.from_elements` to support N-D case..
Dec 16 2021, 5:51 AM · Restricted Project
herhut added inline comments to D115821: [mlir] Extend `tensor.from_elements` to support N-D case..
Dec 16 2021, 4:54 AM · Restricted Project

Dec 14 2021

herhut accepted D115722: [mlir][GPU] Extend GPU kernel outlining to generate DL specification.

It would be nice if we could just pass the data layout as a string.

Dec 14 2021, 7:27 AM · Restricted Project

Dec 8 2021

herhut accepted D115326: [mlir] Added documentation for bufferization to memref conversion pass..
Dec 8 2021, 5:11 AM · Restricted Project

Dec 1 2021

herhut committed rG9fce961d2f47: [mlir][linalg] Disable tensor-matmul test under asan (authored by herhut).
[mlir][linalg] Disable tensor-matmul test under asan
Dec 1 2021, 7:26 AM
herhut closed D114857: [mlir][linalg] Disable tensor-matmul test under asan.
Dec 1 2021, 7:25 AM · Restricted Project
herhut added a comment to D114857: [mlir][linalg] Disable tensor-matmul test under asan.

I assume that would be @nicolasvasilache as I suspect this is a matmul bufferization issue.

Dec 1 2021, 7:25 AM · Restricted Project
herhut added a comment to D114857: [mlir][linalg] Disable tensor-matmul test under asan.

Thanks @mehdi_amini ! That is much easier.

Dec 1 2021, 4:15 AM · Restricted Project
herhut updated the diff for D114857: [mlir][linalg] Disable tensor-matmul test under asan.

Use in-test tag as suggested.

Dec 1 2021, 4:14 AM · Restricted Project
herhut requested review of D114857: [mlir][linalg] Disable tensor-matmul test under asan.
Dec 1 2021, 1:57 AM · Restricted Project

Nov 30 2021

herhut accepted D114384: Fix expand folder to avoid folding memref cast.
Nov 30 2021, 9:25 AM · Restricted Project, Restricted Project
herhut added a comment to D114384: Fix expand folder to avoid folding memref cast.

This was recently fixed in https://reviews.llvm.org/D114391. If you rebase, you should pick up that change.

Nov 30 2021, 9:24 AM · Restricted Project, Restricted Project

Nov 29 2021

herhut accepted D114233: [mlir] Decompose Bufferization Clone operation into Memref Alloc and Copy..

Please sort the headers, otherwise looks good. Thanks!

Nov 29 2021, 6:50 AM · Restricted Project
herhut committed rG95f34e318c46: [mlir][memref] Fix bug in verification of memref.collapse_shape (authored by herhut).
[mlir][memref] Fix bug in verification of memref.collapse_shape
Nov 29 2021, 6:48 AM
herhut closed D114702: [mlir][memref] Fix bug in verification of memref.collapse_shape.
Nov 29 2021, 6:47 AM · Restricted Project
herhut added reviewers for D114702: [mlir][memref] Fix bug in verification of memref.collapse_shape: bkramer, pifon2a, nicolasvasilache.
Nov 29 2021, 5:41 AM · Restricted Project
herhut requested review of D114702: [mlir][memref] Fix bug in verification of memref.collapse_shape.
Nov 29 2021, 5:40 AM · Restricted Project