
Fri, Jul 30

ThomasRaoux accepted D107205: [mlir][sparse] add sparse tensor type conversion operation.
Fri, Jul 30, 6:19 PM · Restricted Project
ThomasRaoux added inline comments to D107205: [mlir][sparse] add sparse tensor type conversion operation.
Fri, Jul 30, 6:18 PM · Restricted Project

Wed, Jul 28

ThomasRaoux abandoned D103253: [mlir][vector] Fold inbound attribute for transfer op with permutations.

Do you still plan on landing this?

Wed, Jul 28, 7:32 AM · Restricted Project

Fri, Jul 23

ThomasRaoux committed rG73a9d6d0e200: [mlir][linalg] Fix bug in contraction op vectorization with output perm (authored by ThomasRaoux).
[mlir][linalg] Fix bug in contraction op vectorization with output perm
Fri, Jul 23, 8:40 AM
ThomasRaoux closed D106469: [mlir][linalg] Fix bug in contraction op vectorization with output perm.
Fri, Jul 23, 8:40 AM · Restricted Project

Wed, Jul 21

ThomasRaoux committed rG45cb4140eb13: [mlir] Extend scf pipelining to support loop carried dependencies (authored by ThomasRaoux).
[mlir] Extend scf pipelining to support loop carried dependencies
Wed, Jul 21, 6:33 PM
ThomasRaoux closed D106325: [mlir] Extend scf pipelining to support loop carried dependencies.
Wed, Jul 21, 6:33 PM · Restricted Project
ThomasRaoux updated the summary of D106469: [mlir][linalg] Fix bug in contraction op vectorization with output perm.
Wed, Jul 21, 11:08 AM · Restricted Project
ThomasRaoux requested review of D106469: [mlir][linalg] Fix bug in contraction op vectorization with output perm.
Wed, Jul 21, 11:08 AM · Restricted Project

Mon, Jul 19

ThomasRaoux requested review of D106325: [mlir] Extend scf pipelining to support loop carried dependencies.
Mon, Jul 19, 5:10 PM · Restricted Project
ThomasRaoux committed rG73f1d6edc069: [mlir] Fix bazel build (authored by ThomasRaoux).
[mlir] Fix bazel build
Mon, Jul 19, 2:07 PM
ThomasRaoux closed D106311: [mlir] Fix bazel build.
Mon, Jul 19, 2:06 PM · Restricted Project
ThomasRaoux requested review of D106311: [mlir] Fix bazel build.
Mon, Jul 19, 2:04 PM · Restricted Project
ThomasRaoux committed rGf6f88e66cedc: [mlir] Add software pipelining transformation for scf.For op (authored by ThomasRaoux).
[mlir] Add software pipelining transformation for scf.For op
Mon, Jul 19, 1:44 PM
ThomasRaoux closed D105868: [mlir] Add software pipelining transformation for scf.for op.
Mon, Jul 19, 1:44 PM · Restricted Project
ThomasRaoux added a comment to D105868: [mlir] Add software pipelining transformation for scf.for op.

Did a pass. Overall looks fine. Just one main comment, and a nit below. I will "accept" to remove my blocker, but it would probably be better to get someone else to review as well. I am on the fence about the structure here.

Mon, Jul 19, 11:07 AM · Restricted Project

Thu, Jul 15

ThomasRaoux added a comment to D105868: [mlir] Add software pipelining transformation for scf.for op.

You can always make things work :), but to me this falls into the "hero optimization" category, which can almost always be solved by better abstractions. In this case, for example, you could create a different operation, scf.pipelined_for, where the op has multiple scf.pipelined_stage operations that explicitly specify each stage, and use standard SSA values to represent dependencies between the different stages. Trying to recover the stages is what is traditionally done, but only because, until MLIR, there wasn't a way to represent a "pipelined loop". You can easily lower such an operation to an scf.for in the exact form this patch produces. You then have to put in the work of lowering to scf.pipelined_for, but I think that composes better.

Thu, Jul 15, 10:45 AM · Restricted Project
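The scf.pipelined_for structure proposed in the comment above could look roughly like this in MLIR textual IR. Neither scf.pipelined_for nor scf.pipelined_stage exists in the upstream scf dialect; the op names, stage numbering, and memref shapes below are all hypothetical, purely to illustrate stages as explicit regions with standard SSA values carrying cross-stage dependencies:

```mlir
scf.pipelined_for %i = %c0 to %c128 step %c1 {
  // Stage 0: the loads for iteration %i (hypothetical op and syntax).
  %v = scf.pipelined_stage 0 -> f32 {
    %l = memref.load %A[%i] : memref<128xf32>
    scf.yield %l : f32
  }
  // Stage 1: compute and store, consuming the stage-0 result through
  // an ordinary SSA use rather than a recovered dependence.
  scf.pipelined_stage 1 {
    %r = arith.addf %v, %v : f32
    memref.store %r, %B[%i] : memref<128xf32>
  }
}
```

Lowering such an op to the prologue/kernel/epilogue form an scf.for expander produces would then be mechanical, since the stage boundaries are explicit in the IR.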
ThomasRaoux added inline comments to D105868: [mlir] Add software pipelining transformation for scf.for op.
Thu, Jul 15, 9:51 AM · Restricted Project
ThomasRaoux updated the diff for D105868: [mlir] Add software pipelining transformation for scf.for op.
Thu, Jul 15, 9:50 AM · Restricted Project
ThomasRaoux updated the diff for D105868: [mlir] Add software pipelining transformation for scf.for op.

Update the example doc

Thu, Jul 15, 9:50 AM · Restricted Project
ThomasRaoux added a comment to D105868: [mlir] Add software pipelining transformation for scf.for op.

Did a preliminary pass. I understand the core transformation mostly generates the code, and the actual dependence analysis is left to the caller through the callback. So in that way it mostly seems OK. Some questions, though, to get a better idea of where this is heading:

  1. If you want to do pipelining automatically, what analysis would you need? Also, it is not clear to me what the test_pipelining_cycle marker is in the test pass.
  2. It almost seems like it might just be easier to create a build method for scf.for that creates the "pipelined" implementation while lowering into loops. I can see how you would lower linalg ops to this, etc. Creating the pipelined loops while lowering seems more straightforward than creating stages after the fact.

For now I will "request revision" to indicate that I will come back for a further review.

Correct, the scheduling itself is left outside, as it will require a lot more heuristics. Pipeliners are usually split into two pieces: the scheduler, which picks a schedule based on latencies, dependencies, encoding, etc., and the expander, which is the mechanical part that generates the loops.

If you want to do pipelining automatically, what analysis would you need?

For fully automatic pipelining we would need some information about the latencies of the different operations. In general, when done at a high level, pipelining mostly comes down to deciding to overlap different parts of the loop. This could be analyzed, but it is most likely to be hardcoded for some given ops, as doing accurate latency analysis at this stage is going to be hard. Usually pipelining is done late in the compilation flow and requires exact latency information.
Since the scheduling part will be heavy on heuristics, I don't think it can live in the core transformation for a while.

I suspect the documentation in the patch itself could be nicely improved with some of the description you provided here :)

Thu, Jul 15, 9:36 AM · Restricted Project
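The scheduler/expander split described in the comment above can be illustrated by what the expander emits for a simple two-stage schedule (stage 0 = load, stage 1 = add and store). This is a hand-written sketch, not output of the patch: the prologue peels stage 0 of the first iteration, the kernel overlaps stage 1 of iteration i with stage 0 of iteration i+1 through an iter_args value, and the epilogue drains stage 1 of the last iteration. All value names, constants, and bounds are illustrative:

```mlir
// Original loop, conceptually: for i in [0, 128): B[i] = A[i] + cst.
%l0 = memref.load %A[%c0] : memref<128xf32>          // prologue: stage 0, iteration 0
%last = scf.for %i = %c0 to %c127 step %c1
    iter_args(%carried = %l0) -> (f32) {
  %r = arith.addf %carried, %cst : f32               // stage 1, iteration i
  memref.store %r, %B[%i] : memref<128xf32>
  %inext = arith.addi %i, %c1 : index
  %lnext = memref.load %A[%inext] : memref<128xf32>  // stage 0, iteration i+1
  scf.yield %lnext : f32                             // loop-carried value between stages
}
%rlast = arith.addf %last, %cst : f32                // epilogue: stage 1, iteration 127
memref.store %rlast, %B[%c127] : memref<128xf32>
```

The expander only needs the stage assignment and op order; which ops land in which stage is exactly the decision the scheduler (or, in this patch, the caller's callback) makes.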
ThomasRaoux updated the diff for D105868: [mlir] Add software pipelining transformation for scf.for op.

Add more detailed doc to the pattern.

Thu, Jul 15, 9:34 AM · Restricted Project
ThomasRaoux added a comment to D105868: [mlir] Add software pipelining transformation for scf.for op.

Did a preliminary pass. I understand the core transformation mostly generates the code, and the actual dependence analysis is left to the caller through the callback. So in that way it mostly seems OK. Some questions, though, to get a better idea of where this is heading:

  1. If you want to do pipelining automatically, what analysis would you need? Also, it is not clear to me what the test_pipelining_cycle marker is in the test pass.
  2. It almost seems like it might just be easier to create a build method for scf.for that creates the "pipelined" implementation while lowering into loops. I can see how you would lower linalg ops to this, etc. Creating the pipelined loops while lowering seems more straightforward than creating stages after the fact.

For now I will "request revision" to indicate that I will come back for a further review.

Correct, the scheduling itself is left outside, as it will require a lot more heuristics. Pipeliners are usually split into two pieces: the scheduler, which picks a schedule based on latencies, dependencies, encoding, etc., and the expander, which is the mechanical part that generates the loops.

If you want to do pipelining automatically, what analysis would you need?

For fully automatic pipelining we would need some information about the latencies of the different operations. In general, when done at a high level, pipelining mostly comes down to deciding to overlap different parts of the loop. This could be analyzed, but it is most likely to be hardcoded for some given ops, as doing accurate latency analysis at this stage is going to be hard. Usually pipelining is done late in the compilation flow and requires exact latency information.
Since the scheduling part will be heavy on heuristics, I don't think it can live in the core transformation for a while.

It almost seems like it might just be easier to create a build method for scf.for that creates the "pipelined" implementation while lowering into loops. I can see how you would lower linalg ops to this, etc. Creating the pipelined loops while lowering seems more straightforward than creating stages after the fact.

I don't understand this comment. In general, pipelining can be applied at many different levels (it could be applied on linalg on tensors, at the vector level, etc.), so having a generic solution seems better. Overall this is just a first draft, and in my experience pipelining can become quite complex, so I think separating this logic from other lowerings is the way to go.

Coming from a caller's perspective, you need to look at the sequence of operations within the scf.for, carefully partition it into different stages, and then specify those stages to the transformation. Recovering the different stages after lowering to loops (especially if, say, we run canonicalizations before pipelining) is going to be very tricky and cumbersome. What is easier is to reason about the different stages while lowering to the loop: it is easier to build the pipelined implementation directly while creating the scf.for. If the build method takes multiple lambdas, each of which represents the IR to be used for one stage of the loop, then I can generate the pipelined implementation directly. I can't see a robust way of discovering these stages after the fact. It will invariably be very specific to the exact sequence of operations in the loop being pipelined and would be hard to maintain. I don't think the fact that pipelining can be applied at different stages (tensors, vectors, etc.) matters here. You start with a higher-level representation (like linalg, etc.), and when you lower to loops you explicitly reason about the different stages and pipeline them. During lowering you have enough information about the loop body (since you are creating it) to reason about the stages you want to use.
If you do want a mechanism to take an scf.for that is already lowered and convert it to a "pipelined" version, then you can use the same mechanism in this change as a pattern rewrite from the old scf.for to a pipelined implementation. Though I think such cases should be really limited.
To clarify, I think the transformation itself is sound AFAICS. I am just not sure the mechanism to specify stages is very usable.

Thu, Jul 15, 9:12 AM · Restricted Project
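The stage-specification mechanism debated above can be sketched as attributes on the loop-body ops, in the style of the __test_pipelining_stage__ / __test_pipelining_op_order__ test markers mentioned elsewhere in this feed: the caller's callback reads the markers back to tell the transformation which stage each op belongs to and in what relative order it should appear. The exact attribute names, types, and values below are illustrative assumptions, not the patch's actual syntax:

```mlir
scf.for %i = %c0 to %c4 step %c1 {
  // Stage 0, emitted last in the kernel (it feeds the next iteration).
  %l = memref.load %A[%i]
      {__test_pipelining_stage__ = 0 : i32,
       __test_pipelining_op_order__ = 2 : i32} : memref<4xf32>
  // Stage 1: compute then store, in that relative order.
  %a = arith.addf %l, %cst
      {__test_pipelining_stage__ = 1 : i32,
       __test_pipelining_op_order__ = 0 : i32} : f32
  memref.store %a, %B[%i]
      {__test_pipelining_stage__ = 1 : i32,
       __test_pipelining_op_order__ = 1 : i32} : memref<4xf32>
}
```

A builder-with-lambdas API, as suggested in the comment, would supply the same stage partition programmatically at loop-creation time instead of annotating and rediscovering it after the fact.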
ThomasRaoux added inline comments to D105868: [mlir] Add software pipelining transformation for scf.for op.
Thu, Jul 15, 8:25 AM · Restricted Project
ThomasRaoux updated the diff for D105868: [mlir] Add software pipelining transformation for scf.for op.

Rename the test marker to __test_pipelining_op_order__

Thu, Jul 15, 8:25 AM · Restricted Project

Wed, Jul 14

ThomasRaoux added a comment to D105868: [mlir] Add software pipelining transformation for scf.for op.

Did a preliminary pass. I understand the core transformation mostly generates the code, and the actual dependence analysis is left to the caller through the callback. So in that way it mostly seems OK. Some questions, though, to get a better idea of where this is heading:

  1. If you want to do pipelining automatically, what analysis would you need? Also, it is not clear to me what the test_pipelining_cycle marker is in the test pass.
  2. It almost seems like it might just be easier to create a build method for scf.for that creates the "pipelined" implementation while lowering into loops. I can see how you would lower linalg ops to this, etc. Creating the pipelined loops while lowering seems more straightforward than creating stages after the fact.

For now I will "request revision" to indicate that I will come back for a further review.

Wed, Jul 14, 11:53 PM · Restricted Project
ThomasRaoux accepted D106014: [mlir][affine] Add single result affine.min/max -> affine.apply canonicalization..
Wed, Jul 14, 1:35 PM · Restricted Project

Tue, Jul 13

ThomasRaoux added inline comments to D105868: [mlir] Add software pipelining transformation for scf.for op.
Tue, Jul 13, 10:55 PM · Restricted Project
ThomasRaoux updated the diff for D105868: [mlir] Add software pipelining transformation for scf.for op.

fix PipeliningOption name

Tue, Jul 13, 10:55 PM · Restricted Project
ThomasRaoux committed rG6296e109728d: [mlir][Vector] Remove Vector TupleOp as it is unused (authored by ThomasRaoux).
[mlir][Vector] Remove Vector TupleOp as it is unused
Tue, Jul 13, 12:39 PM
ThomasRaoux closed D105924: [mlir][Vector] Remove Vector TupleOp as it is unused.
Tue, Jul 13, 12:39 PM · Restricted Project
ThomasRaoux requested review of D105924: [mlir][Vector] Remove Vector TupleOp as it is unused.
Tue, Jul 13, 11:50 AM · Restricted Project
ThomasRaoux committed rGae4cea38f18e: [mlir] Add support for tensor.extract to comprehensive bufferization (authored by ThomasRaoux).
[mlir] Add support for tensor.extract to comprehensive bufferization
Tue, Jul 13, 9:55 AM
ThomasRaoux closed D105870: [mlir] Add support for tensor.extract to comprehensive bufferization.
Tue, Jul 13, 9:55 AM · Restricted Project
ThomasRaoux accepted D105897: [mlir][Linalg] Properly specify Linalg attribute..
Tue, Jul 13, 9:27 AM · Restricted Project
ThomasRaoux updated the diff for D105868: [mlir] Add software pipelining transformation for scf.for op.

Fix clang-tidy warnings

Tue, Jul 13, 12:47 AM · Restricted Project

Mon, Jul 12

ThomasRaoux updated the summary of D105870: [mlir] Add support for tensor.extract to comprehensive bufferization.
Mon, Jul 12, 10:13 PM · Restricted Project
ThomasRaoux requested review of D105870: [mlir] Add support for tensor.extract to comprehensive bufferization.
Mon, Jul 12, 10:12 PM · Restricted Project
ThomasRaoux added inline comments to D105857: [mlir][Linalg] Better support for bufferizing non-tensor results..
Mon, Jul 12, 10:04 PM · Restricted Project
ThomasRaoux accepted D105859: [mlir][Linalg] Add layout specification support to bufferization..
Mon, Jul 12, 9:43 PM · Restricted Project
ThomasRaoux accepted D105857: [mlir][Linalg] Better support for bufferizing non-tensor results..
Mon, Jul 12, 9:40 PM · Restricted Project
ThomasRaoux added a reviewer for D105868: [mlir] Add software pipelining transformation for scf.for op: hgreving.
Mon, Jul 12, 9:37 PM · Restricted Project
ThomasRaoux requested review of D105868: [mlir] Add software pipelining transformation for scf.for op.
Mon, Jul 12, 9:36 PM · Restricted Project

Fri, Jul 9

ThomasRaoux accepted D105678: [MLIR][GPU][NFC] Fix documentation for wmma matrix load/store ops.

Thanks!

Fri, Jul 9, 8:59 AM · Restricted Project

Wed, Jul 7

ThomasRaoux accepted D105459: [mlir][linalg] Add optional output operand to PadTensorOp.
Wed, Jul 7, 5:56 PM · Restricted Project
ThomasRaoux accepted D105458: [mlir][linalg][NFC] Factor out tile generation in makeTiledShapes.
Wed, Jul 7, 1:00 PM · Restricted Project
ThomasRaoux committed rG291025389c2c: [mlir][vector] Refactor Vector Unrolling and remove Tuple ops (authored by ThomasRaoux).
[mlir][vector] Refactor Vector Unrolling and remove Tuple ops
Wed, Jul 7, 11:12 AM
ThomasRaoux closed D105381: [mlir][vector] Refactor Vector Unrolling and remove Tuple ops.
Wed, Jul 7, 11:11 AM · Restricted Project
ThomasRaoux updated the diff for D105381: [mlir][vector] Refactor Vector Unrolling and remove Tuple ops.
Wed, Jul 7, 11:10 AM · Restricted Project
ThomasRaoux added inline comments to D105381: [mlir][vector] Refactor Vector Unrolling and remove Tuple ops.
Wed, Jul 7, 11:02 AM · Restricted Project
ThomasRaoux updated the diff for D105381: [mlir][vector] Refactor Vector Unrolling and remove Tuple ops.

Address review comments.

Wed, Jul 7, 10:59 AM · Restricted Project

Sat, Jul 3

ThomasRaoux changed the edit policy for D103868: [mlir][gpu][NFC] Simplify conversion of MMA type to NVVM.
Sat, Jul 3, 8:26 PM · Restricted Project

Jul 2 2021

ThomasRaoux updated the diff for D105381: [mlir][vector] Refactor Vector Unrolling and remove Tuple ops.

Fix formatting

Jul 2 2021, 4:32 PM · Restricted Project
ThomasRaoux requested review of D105381: [mlir][vector] Refactor Vector Unrolling and remove Tuple ops.
Jul 2 2021, 4:05 PM · Restricted Project
ThomasRaoux accepted D104970: [mlir][Linalg] Fix incorrect logic in deciding when to fuse reshapes by linearization..
Jul 2 2021, 9:41 AM · Restricted Project
ThomasRaoux accepted D105359: [mlir][Vector] NFC - Compress vector to outerproduct lowering..
Jul 2 2021, 9:38 AM · Restricted Project

Jun 30 2021

ThomasRaoux committed rG627733b5f045: [mlir][vector] Extend vector distribution to all elementwise and contract (authored by ThomasRaoux).
[mlir][vector] Extend vector distribution to all elementwise and contract
Jun 30 2021, 4:36 PM
ThomasRaoux closed D104343: [mlir][vector] Extend vector distribution to all elementwise and contract.
Jun 30 2021, 4:36 PM · Restricted Project
ThomasRaoux added inline comments to D104343: [mlir][vector] Extend vector distribution to all elementwise and contract.
Jun 30 2021, 4:35 PM · Restricted Project
ThomasRaoux updated the diff for D104343: [mlir][vector] Extend vector distribution to all elementwise and contract.

Fix comment and format.

Jun 30 2021, 4:27 PM · Restricted Project
ThomasRaoux committed rG0298f2cfb1df: [mlir] Fix wrong type in WmmaConstantOpToNVVMLowering (authored by ThomasRaoux).
[mlir] Fix wrong type in WmmaConstantOpToNVVMLowering
Jun 30 2021, 9:11 AM
ThomasRaoux closed D105174: [mlir] Fix wrong type in WmmaConstantOpToNVVMLowering.
Jun 30 2021, 9:11 AM · Restricted Project
ThomasRaoux committed rG439284194959: [mlir][VectorToGPU] Support converting vector.broadcast to MMA op (authored by ThomasRaoux).
[mlir][VectorToGPU] Support converting vector.broadcast to MMA op
Jun 30 2021, 9:09 AM
ThomasRaoux closed D105175: [mlir][VectorToGPU] Support converting vector.broadcast to MMA op.
Jun 30 2021, 9:09 AM · Restricted Project
ThomasRaoux updated the diff for D105175: [mlir][VectorToGPU] Support converting vetor.broadcast to MMA op.

clang-format

Jun 30 2021, 8:40 AM · Restricted Project
ThomasRaoux requested review of D105175: [mlir][VectorToGPU] Support converting vector.broadcast to MMA op.
Jun 30 2021, 12:07 AM · Restricted Project
ThomasRaoux requested review of D105174: [mlir] Fix wrong type in WmmaConstantOpToNVVMLowering.
Jun 30 2021, 12:05 AM · Restricted Project

Jun 29 2021

ThomasRaoux added inline comments to D104970: [mlir][Linalg] Fix incorrect logic in deciding when to fuse reshapes by linearization..
Jun 29 2021, 11:51 AM · Restricted Project

Jun 28 2021

ThomasRaoux committed rG0d6e4199e32a: [mlir][vector] Order parallel indices before transposing the input in… (authored by harsh).
[mlir][vector] Order parallel indices before transposing the input in…
Jun 28 2021, 6:47 PM
ThomasRaoux added a comment to D104884: Order parallel indices before transposing the input in multireductions.

Thanks, and feel free to merge because I don't have merge privileges.

Jun 28 2021, 6:47 PM · Restricted Project
ThomasRaoux closed D104884: Order parallel indices before transposing the input in multireductions.
Jun 28 2021, 6:47 PM · Restricted Project
ThomasRaoux accepted D104884: Order parallel indices before transposing the input in multireductions.
Jun 28 2021, 5:07 PM · Restricted Project
ThomasRaoux accepted D104884: Order parallel indices before transposing the input in multireductions.

LGTM

Jun 28 2021, 3:12 PM · Restricted Project
ThomasRaoux added a comment to D104884: Order parallel indices before transposing the input in multireductions.

Thanks @ThomasRaoux, @asaadaldien, @nicolasvasilache for the comments. @nicolasvasilache, I have modified the patch as per your changes but kept the current patch to just handle moving reductions to the innermost dimensions. I will put up another patch to handle moving reductions to the outermost dimensions, and based on the performance of inner vs. outer, we can decide which path we want to take.

Jun 28 2021, 2:59 PM · Restricted Project
ThomasRaoux accepted D105059: [mlir] Skip scalar operands when tiling to linalg.tiled_loop..

Thanks!

Jun 28 2021, 1:29 PM · Restricted Project

Jun 27 2021

ThomasRaoux added inline comments to D104884: Order parallel indices before transposing the input in multireductions.
Jun 27 2021, 11:19 PM · Restricted Project
ThomasRaoux added a reviewer for D104884: Order parallel indices before transposing the input in multireductions: ThomasRaoux.
Jun 27 2021, 11:19 PM · Restricted Project
ThomasRaoux added inline comments to D104884: Order parallel indices before transposing the input in multireductions.
Jun 27 2021, 11:11 PM · Restricted Project

Jun 24 2021

ThomasRaoux committed rG1a8655927641: [mlir][VectorToGPU] Add conversion for scf::For op with Matrix operands (authored by ThomasRaoux).
[mlir][VectorToGPU] Add conversion for scf::For op with Matrix operands
Jun 24 2021, 3:44 PM
ThomasRaoux closed D104134: [mlir][VectorToGPU] Add conversion for scf::For op with Matrix operands.
Jun 24 2021, 3:44 PM · Restricted Project
ThomasRaoux committed rG6413226dce06: [mlir][VectorToGPU] Add conversion for splat constant to MMA const matrix (authored by ThomasRaoux).
[mlir][VectorToGPU] Add conversion for splat constant to MMA const matrix
Jun 24 2021, 3:38 PM
ThomasRaoux closed D104133: [mlir][VectorToGPU] Add conversion for splat constant to MMA const matrix.
Jun 24 2021, 3:38 PM · Restricted Project

Jun 21 2021

ThomasRaoux updated the diff for D104343: [mlir][vector] Extend vector distribution to all elementwise and contract.

Rebase

Jun 21 2021, 11:20 PM · Restricted Project
ThomasRaoux accepted D104683: [mlir][linalg] Fusion of PadTensorOp.
Jun 21 2021, 7:46 PM · Restricted Project
ThomasRaoux committed rG1244bca53fb2: [mlir][vector] Support distributing transfer op with permutation map (authored by ThomasRaoux).
[mlir][vector] Support distributing transfer op with permutation map
Jun 21 2021, 12:56 PM
ThomasRaoux closed D104263: [mlir][vector] Support distributing transfer op with permutation map.
Jun 21 2021, 12:56 PM · Restricted Project
ThomasRaoux added inline comments to D104263: [mlir][vector] Support distributing transfer op with permutation map.
Jun 21 2021, 12:50 PM · Restricted Project
ThomasRaoux updated the diff for D104263: [mlir][vector] Support distributing transfer op with permutation map.

Address comments

Jun 21 2021, 12:49 PM · Restricted Project

Jun 18 2021

ThomasRaoux added a comment to D104134: [mlir][VectorToGPU] Add conversion for scf::For op with Matrix operands.

ping :)

Jun 18 2021, 10:07 AM · Restricted Project
ThomasRaoux added a comment to D104133: [mlir][VectorToGPU] Add conversion for splat constant to MMA const matrix.

ping :)

Jun 18 2021, 10:07 AM · Restricted Project
ThomasRaoux accepted D104357: [mlir][linalg] Lower subtensor(pad_tensor) to pad_tensor(subtensor).
Jun 18 2021, 8:29 AM · Restricted Project

Jun 17 2021

ThomasRaoux added inline comments to D104357: [mlir][linalg] Lower subtensor(pad_tensor) to pad_tensor(subtensor).
Jun 17 2021, 7:44 PM · Restricted Project
ThomasRaoux added inline comments to D104357: [mlir][linalg] Lower subtensor(pad_tensor) to pad_tensor(subtensor).
Jun 17 2021, 8:57 AM · Restricted Project
ThomasRaoux accepted D104278: [mlir][linalg] Canonicalize PadTensorOp with zero source dimension.
Jun 17 2021, 8:31 AM · Restricted Project

Jun 15 2021

ThomasRaoux requested review of D104343: [mlir][vector] Extend vector distribution to all elementwise and contract.
Jun 15 2021, 5:42 PM · Restricted Project

Jun 14 2021

ThomasRaoux requested review of D104263: [mlir][vector] Support distributing transfer op with permutation map.
Jun 14 2021, 1:29 PM · Restricted Project

Jun 11 2021

ThomasRaoux updated the diff for D104133: [mlir][VectorToGPU] Add conversion for splat constant to MMA const matrix.

Address review comments.

Jun 11 2021, 5:07 PM · Restricted Project
ThomasRaoux retitled D104134: [mlir][VectorToGPU] Add conversion for scf::For op with Matrix operands from [mlir][ConvertVectorToGPU] Add support for converting scf::For op with Matrix operands to [mlir][VectorToGPU] Add conversion for scf::For op with Matrix operands.
Jun 11 2021, 3:58 PM · Restricted Project
ThomasRaoux requested review of D104134: [mlir][VectorToGPU] Add conversion for scf::For op with Matrix operands.
Jun 11 2021, 11:01 AM · Restricted Project
ThomasRaoux requested review of D104133: [mlir][VectorToGPU] Add conversion for splat constant to MMA const matrix.
Jun 11 2021, 10:39 AM · Restricted Project