This is an archive of the discontinued LLVM Phabricator instance.

Differential D115008

[mlir][sparse] Factoring out Transforms/CodegenUtils.{cpp,h}
ClosedPublic

Authored by wrengr on Dec 2 2021, 4:34 PM.

Details

Reviewers

aartbik
bixia
penpornk
rriddle

Commits

rG85b8d03e12bb: [mlir][sparse] Factoring out Transforms/CodegenUtils.{cpp,h}

Summary

This moves a bunch of helper functions from Transforms/SparseTensorConversion.cpp into Transforms/CodegenUtils.{cpp,h} so that they can be reused by Transforms/Sparsification.cpp, etc.

See also the dependent D115010 which cleans up some corner cases in this change.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

wrengr created this revision. Dec 2 2021, 4:34 PM

Herald added subscribers: sdasgup3, wenzhicui, Chia-hungDuan and 19 others. Dec 2 2021, 4:35 PM

wrengr requested review of this revision. Dec 2 2021, 4:35 PM

Herald added a project: Restricted Project. Dec 2 2021, 4:35 PM

Herald added subscribers: stephenneuendorffer, nicolasvasilache.

wrengr added a child revision: D115010: [mlir][sparse] adding OverheadType::kIndex. Dec 2 2021, 4:43 PM

Harbormaster completed remote builds in B137254: Diff 391492. Dec 2 2021, 4:52 PM

Updating the two workaround comments with the differential ID that removes them

Harbormaster completed remote builds in B137272: Diff 391510. Dec 2 2021, 6:01 PM

wrengr mentioned this in D115005: [mlir][sparse] Requiring emitCInterface parameter to be explicit. Dec 3 2021, 3:25 PM

wrengr edited the summary of this revision. Dec 3 2021, 4:15 PM

aartbik added inline comments. Dec 6 2021, 1:47 PM

mlir/include/mlir/Dialect/SparseTensor/Utils/CodegenUtils.h
22–24 ↗	(On Diff #391510)	This is a good explanation, but sort of a very well known technique. I think a simple // Forward references. .... suffices ;-)
mlir/lib/Dialect/SparseTensor/Utils/CodegenUtils.cpp
16–18	// Forward.
utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
1810–1815	Is this autogen? Shouldn't we just add the right deps, instead of adding outside headers to this lib?

addressing nits

wrengr added inline comments. Dec 6 2021, 2:34 PM

utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
1810–1815	I tried using `build_cleaner` (which made the changes to the deps below), but it couldn't figure out how to fulfill these headers without pulling in a huge plethora of other stuff. I'll try taking another whack at it.

Harbormaster completed remote builds in B137748: Diff 392186. Dec 6 2021, 2:43 PM

cleaning up BUILD.bazel

utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
1810–1815	Okay, I managed to get a decently clean version. The new version still has `"include/mlir/ExecutionEngine/SparseTensorUtils.h"` in the hdrs (which should be fine(?) since that's what `:SparseTensorTransforms` does as well). Alternatively we could add `":mlir_c_runner_utils"` to the deps (which will also bring that header in, though at the cost of building all of `:mlir_c_runner_utils`).

Harbormaster completed remote builds in B137764: Diff 392203. Dec 6 2021, 3:33 PM

aartbik added inline comments. Dec 6 2021, 8:16 PM

utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
1810–1815	Yeah, this looks okay to me, since now we only include what we "own" in a sense. Let's see if others chime in, but I am okay with this.

rriddle requested changes to this revision. Dec 6 2021, 8:18 PM

rriddle added inline comments.

mlir/lib/Dialect/SparseTensor/Utils/CodegenUtils.cpp
14–18	namespaces should only really have classes, please remove these and use static methods and full namespace resolution instead.

This revision now requires changes to proceed. Dec 6 2021, 8:18 PM

rriddle added inline comments. Dec 6 2021, 8:21 PM

mlir/lib/Dialect/SparseTensor/Utils/CodegenUtils.cpp
14–18	Some relevant docs: https://llvm.org/docs/CodingStandards.html#use-namespace-qualifiers-to-implement-previously-declared-functions https://llvm.org/docs/CodingStandards.html#anonymous-namespaces
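A minimal sketch of what the two linked Coding Standards sections ask for, as I read them: declare the function in the header inside its namespace, define it in the .cpp file with a namespace-qualified name rather than by reopening the namespace, and give file-local helpers `static` linkage instead of wrapping them in an anonymous namespace. All names here (`sketch::sparse_tensor`, `overheadWidthOrDefault`, `isSupportedWidth`) are hypothetical and are not taken from the patch.

```cpp
// --- What would live in a header (hypothetical names throughout) ---
namespace sketch {
namespace sparse_tensor {
/// Returns `width` if it is a supported overhead bit width, else 64.
unsigned overheadWidthOrDefault(unsigned width);
} // namespace sparse_tensor
} // namespace sketch

// --- What would live in the matching .cpp file ---

// File-local helper: `static` gives it internal linkage without an
// anonymous namespace.
static bool isSupportedWidth(unsigned width) {
  return width == 8 || width == 16 || width == 32 || width == 64;
}

// The definition names its namespace explicitly instead of reopening it, so
// a signature that does not match the declaration fails to compile.
unsigned sketch::sparse_tensor::overheadWidthOrDefault(unsigned width) {
  return isSupportedWidth(width) ? width : 64;
}
```

With the qualified form, a typo in the parameter list is a compile error rather than a silently added overload, which is the motivation the Coding Standards give for this style.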

wrengr marked 2 inline comments as done. Dec 7 2021, 11:24 AM

adjusting namespaces

wrengr marked 2 inline comments as done. Dec 7 2021, 1:02 PM

Harbormaster completed remote builds in B137977: Diff 392508. Dec 7 2021, 1:15 PM

rriddle added inline comments. Dec 7 2021, 1:42 PM

mlir/include/mlir/Dialect/SparseTensor/Utils/CodegenUtils.h
1–11 ↗	(On Diff #392508)	It isn't clear to me that we should be exposing this kind of API in the external API, can this just be local to the lib/ directory?
96–113 ↗	(On Diff #392508)	I'm not really sure about this kind of API honestly, how much is it really saving? This comment isn't entirely blocking assuming this is local to lib/, but just a general comment.
100–103 ↗	(On Diff #392508)	This looks unused.
22–24 ↗	(On Diff #391510)	I wouldn't even add a comment, we don't do that anywhere else.

wrengr mentioned this in D115012: [mlir][sparse] Factoring out type-based function-name suffixes. Dec 7 2021, 1:44 PM

addressing nit

mlir/include/mlir/Dialect/SparseTensor/Utils/CodegenUtils.h
1–11 ↗	(On Diff #392508)	I have no idea where the external API is specified. This code is intended to be internal to the `sparse_tensor` dialect, but for general reuse by that dialect (i.e., across both `lib/Dialect/SparseTensor/Utils/*.cpp` and `lib/Dialect/SparseTensor/Transforms/*.cpp`).
96–113 ↗	(On Diff #392508)	It saves quite a lot actually. Saving 16–20 characters (46–54%) every single time we want to emit a constant adds up very quickly. Not to mention ensuring uniformity re which of the many different constant ops gets used. Let alone if we want to change that choice (as we did just recently). Not to mention the costs of onboarding new developers. It may have been some time since you saw MLIR with fresh eyes, but I assure you that for any newcomer to the codebase the constant drag from wading through the wall of text without these macros greatly impedes their velocity. And mere velocity is not the issue, but rather the fact that having such low initial velocity generally causes new developers to lose heart and decide the project isn't for them. If you want to discuss this further, we can set up a meeting to do so.
100–103 ↗	(On Diff #392508)	At present, yes; though I have no reason to believe it will remain so, and it strikes me as unnecessarily baroque to punch a hole in the regularity of the API just to eliminate a one-liner.
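To make the 96–113 argument concrete, here is a self-contained sketch of the helper pattern under discussion: thin wrappers that fix, in one place, which constant op a call site emits. `MockBuilder` is a stand-in invented for this illustration and is not MLIR's `OpBuilder`; the op-name strings are placeholders, and the wrapper signatures only loosely mirror the `constantIndex`/`constantI32` helpers in the diff.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Stand-in for an IR builder; NOT MLIR's OpBuilder. It only records a
// textual form of each "op" it is asked to create.
struct MockBuilder {
  std::vector<std::string> ops;
  int createConstant(const std::string &opName, int64_t value) {
    ops.push_back(opName + " " + std::to_string(value));
    return static_cast<int>(ops.size()) - 1; // handle of the new op
  }
};

// Hypothetical wrappers: the choice of constant op is made once, here,
// rather than being repeated at every call site.
static int constantIndex(MockBuilder &b, int64_t i) {
  return b.createConstant("index-constant", i);
}
static int constantI32(MockBuilder &b, int32_t i) {
  return b.createConstant("i32-constant", i);
}

// A call site no longer names the op it emits.
static std::vector<std::string> emitExample() {
  MockBuilder b;
  constantIndex(b, 0);
  constantI32(b, 42);
  return b.ops;
}
```

If the preferred constant op changes, only the two wrappers need editing. The reviewers' counterpoint, raised further down the thread, is that wrappers of this kind are not specific to one dialect and so belong in shared infrastructure once there is consensus.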

rriddle requested changes to this revision. Dec 8 2021, 12:44 PM

rriddle added inline comments.

mlir/include/mlir/Dialect/SparseTensor/Utils/CodegenUtils.h
1–11 ↗	(On Diff #392508)	Anything in include/mlir is considered an "external" API. This should be moved to the lib/ folder.
96–113 ↗	(On Diff #392508)	Re "It may have been some time since you saw MLIR with fresh eyes": I'll start off by saying that I really don't appreciate the tone of this message. Re "Saving 16–20 characters (46–54%) every single time we want to emit a constant adds up very quickly": Some of these functions are unused, and some are used only a handful of times. If you want to save code, writing local utilities like this can be fine, but I am somewhat strongly opposed to exposing stuff like this in any kind of external API.
100–103 ↗	(On Diff #392508)	I don't see a reason to add code that is untested, seems better to add it when it is actually necessary.

This revision now requires changes to proceed. Dec 8 2021, 12:44 PM

mehdi_amini added inline comments. Dec 8 2021, 12:56 PM

mlir/include/mlir/Dialect/SparseTensor/Utils/CodegenUtils.h
96–113 ↗	(On Diff #392508)	Re "It saves quite a lot actually. Saving 16–20 characters (46–54%) every single time we want to emit a constant adds up very quickly. [...]": Thanks for noticing this. The usual process in MLIR is to raise this on Discourse, write an RFC, and gain consensus. We should definitely look into these kinds of areas for improvement, and each newcomer to the project has indeed a "fresh pair of eyes" that is unique and very valuable! (After a few months, most people get "used to things" even when they are suboptimal.) That said, one thing we've been consistent about is keeping all the common infrastructure shared and unified in the project: if you have a good case for helper methods to build various constants, it's unlikely that these have anything specific to the SparseTensor domain, and so they really don't belong here.

aartbik added inline comments. Dec 8 2021, 1:34 PM

mlir/include/mlir/Dialect/SparseTensor/Utils/CodegenUtils.h
96–113 ↗	(On Diff #392508)	Re "It may have been some time since you saw MLIR with fresh eyes" and the reply "I'll start off by saying that I really don't appreciate the tone of this message": Wren, please don't say stuff like this. River is our top contributor with a great passion for keeping our source and organization clean and consistent. River, I probably set the wrong example here by using include/mlir/Dialect/SparseTensor/Utils/ as the place to define the headers of utilities that are used in lib/mlir/Dialect/SparseTensor/* files, with the current (and only) example of Merger.h. I got that inspiration from several other dialects, e.g. include/mlir/Dialect/Linalg/Utils/Utils.h. I was not aware that defining headers in the lib directory is preferred in such places. Just for my understanding, do you think Merger.h was the right decision (since it is more elaborate and has its own testing) and is the objection purely based on having very tiny utils in the header, or do you prefer we move in that direction for everything?

rebase (over D115004 and D115005)

rriddle added inline comments. Dec 8 2021, 1:45 PM

mlir/include/mlir/Dialect/SparseTensor/Utils/CodegenUtils.h
96–113 ↗	(On Diff #392508)	The distinction that has generally been drawn between include/ vs lib/ is what is considered an "implementation detail" of a library, and what is meant to be exposed as a general utility to external users of this library. If we just want to have a place to share code between different passes within the same library then lib/ is generally a better choice, given that the stipulations and expectations of internal API are much more relaxed than external API. Merger could have also been defined in lib/ if it's not meant to be used by external clients, not sure what the expectation there is. Some of the GPU conversion utils are like this: https://github.com/llvm/llvm-project/tree/main/mlir/lib/Conversion/GPUCommon I will note that you can still write targeted tests for things only defined in lib/, the test directory can use relative include for these types of things: https://github.com/llvm/llvm-project/blob/f638c4d6e4a28fbd5508f853a8715cc92bb66d48/mlir/unittests/Conversion/PDLToPDLInterp/RootOrderingTest.cpp#L9

aartbik added inline comments. Dec 8 2021, 1:51 PM

mlir/include/mlir/Dialect/SparseTensor/Utils/CodegenUtils.h
96–113 ↗	(On Diff #392508)	Re "Merger could have also been defined in lib/ if it's not meant to be used by external clients, not sure what the expectation there is": Thanks River for the background. That makes me feel better, since the expectation indeed was that other clients could start using this (since lattice operations are central to the TACO way of generating code), for example, when external contributors start writing improvements on sparse codegen but want to use the same lattice theory. Wren, I think we should do the following next: (1) take a look at our utils, and see which ones could be beneficial for all (a) and which ones are specific to sparse (b); (2a) move (a) to a general place, with River's blessing; (2b) move (b) to a header inside the SparseTensor dialect's lib.

rriddle added inline comments. Dec 8 2021, 1:57 PM

mlir/include/mlir/Dialect/SparseTensor/Utils/CodegenUtils.h
96–113 ↗	(On Diff #392508)	Right, exactly what Mehdi says here. The "external API" of MLIR (and to a lesser extent the "internal API") need to be consistent. As newcomers engage with the project there can be a tendency to claim "ahah, this is definitely the way it should be because it's easy for me". In many cases this can help the project grow, but it has to be something agreed upon by the community lest we develop pockets of consistency (as different parts of the project develop differently). For this specific API, we've had discussions in the past that never really reached maturity (https://llvm.discourse.group/t/evolving-builder-apis-based-on-lessons-learned-from-edsc/879), and maybe there can be some renewed interest in starting a discussion there. Until then though, we need to ensure the API of the project is consistent.

Harbormaster completed remote builds in B138277: Diff 392922. Dec 8 2021, 5:06 PM

wrengr added inline comments. Dec 9 2021, 2:49 PM

mlir/include/mlir/Dialect/SparseTensor/Utils/CodegenUtils.h
96–113 ↗	(On Diff #392508)	Aha! I'd thought the include/ vs lib/ split was just for .h vs .cpp file types. I'll move this over to lib/ then, since the intention wasn't to make it public API. Sorry for the confusion
96–113 ↗	(On Diff #392508)	Yes of course, consensus is crucial for any major style/API changes. Again, my intent with this differential was never to change anything outside of the SparseTensor codebase, hence why I did not expect such pushback against this internal cleanup. Both `getOneAttr` and `genIsNonzero` are of general utility, though the latter needs renaming (afaik the "gen" prefix is our own thing rather than mlir standard). I can write up RFCs for both of these, but I'd rather not have the current differential (or rather D115010, D115012) depend on those being resolved. The constant generators are also of general utility, but they're far trickier to upstream (especially to do so without losing their brevity). There are several different versions of constant ops for different dialects, so we can't just make them into `OpBuilder` methods. I suppose we could make an `ArithOpBuilder` subclass, but since that's not an established pattern it strikes me as a terrible precedent. So there's considerable design work to be done just to determine what it would even mean to make these public, let alone reaching consensus. It would be great to do that work, but that's a considerably larger undertaking than what this differential aims to accomplish. That said, I'm getting the impression that I'm forbidden to do any sort of code cleanup that could be construed as general purpose. I fully appreciate the importance of avoiding design islands, believe me. Yet every other open-source project I have worked on has had it as part of their RFC process to demonstrate the utility of a change, namely by implementing it and using it on a small scale. If such small-scale implementations are forbidden, then I have difficulty seeing how any progress can be made.

Moved CodegenUtils.h into lib/. Dunno if the "../Utils/CodegenUtils.h" is acceptable style, or if there's a better way. Also rebased.

Harbormaster completed remote builds in B138544: Diff 393308. Dec 9 2021, 4:04 PM

added todos for filing RFCs for moving things upstream

mehdi_amini added inline comments. Dec 9 2021, 4:30 PM

mlir/include/mlir/Dialect/SparseTensor/Utils/CodegenUtils.h
96–113 ↗	(On Diff #392508)	The difficulty is how to manage these "islands" and keep them from creeping in. So in general I'm fairly conservative in trying to prevent them from existing in the first place (there are enough out-of-tree users to find such patterns through experiment) and instead be more proactive in refactoring our core infra. We saw with EDSC that a small island quickly becomes its own continent and is really hard to reconcile with the rest of the codebase later; I think it took more than a year of cleanups to purge it (I don't even know if we succeeded entirely). Another aspect is that letting islands emerge removes the incentive for folks to take the time to abstract away and generalize the concepts, so pushing back on a new "island" early is a way to keep a stronger incentive to create motion around discussions like the one at https://llvm.discourse.group/t/evolving-builder-apis-based-on-lessons-learned-from-edsc/879. In the end I'm not sure there is a perfect answer, but it ends up being about various tradeoffs to make...

Harbormaster completed remote builds in B138557: Diff 393328. Dec 9 2021, 4:41 PM

aartbik added inline comments. Dec 10 2021, 9:52 AM

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp
17	This looks strange. Also, because CodeGenUtils.* is now a private-facing API that supports only our codegen in Transforms, it seems to me all the files should simply live in lib/Dialect/SparseTensor/Transforms, i.e. local to our codegen, since Utils should be reserved for implementation of our more public-facing support. @rriddle @mehdi_amini can you please advise?

mehdi_amini added inline comments. Dec 10 2021, 12:28 PM

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp
17	Either way is fine with me.

aartbik added inline comments. Dec 10 2021, 12:31 PM

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp
17	If either is fine with the masters, then I have a slight preference moving all the CodeGenUtils.* into Transforms, Wren. For now, I think you can even keep the "general" constant getters there, and go through the RFC of making them more widely available if you feel like going that route. But having these, now private, convenience methods inside the sparse codegen for now seems totally acceptable for the time being.

Moving CodegenUtils.{h,cpp} into Transforms/ instead of Utils/.
Also rebasing

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp
17	The only issue I can think of re moving CodegenUtils into Transforms/ is that Utils/Merger.cpp also uses it. It's only a very minor issue since, iirc, Merger.cpp only uses one of the simpler functions once; so I'm totally cool with undoing that change for the sake of cleaning up the structure of things in Transforms/. But it is something to bear in mind in the future depending on how Merger.* and CodegenUtils.* evolve.

wrengr retitled this revision from [mlir][sparse] Factoring out Utils/CodegenUtils.{cpp,h} to [mlir][sparse] Factoring out Transforms/CodegenUtils.{cpp,h}. Dec 15 2021, 5:22 PM

wrengr edited the summary of this revision.

Harbormaster completed remote builds in B139539: Diff 394691. Dec 15 2021, 5:39 PM

aartbik added inline comments. Dec 15 2021, 9:13 PM

mlir/lib/Dialect/SparseTensor/Transforms/CMakeLists.txt
2 ↗	(On Diff #394691)	I believe you have to list the header file here too now (others please chime in if that is wrong)

wrengr added inline comments. Dec 16 2021, 1:27 PM

mlir/lib/Dialect/SparseTensor/Transforms/CMakeLists.txt
2 ↗	(On Diff #394691)	It does build without listing the header... And none of the other `lib/Dialect/**/CMakeLists.txt` list their internal headers (albeit, all their internal headers are named `PassDetail.h` except for two instances of `TypeDetail.h`, so I don't know if those are considered special)... So I don't think it should be listed, but then I know very little about cmake so I'd rather get input from someone who does

wrengr mentioned this in D115909: [mlir][sparse] adding OverheadType::kIndex. Dec 16 2021, 2:10 PM

aartbik added inline comments. Dec 20 2021, 10:18 AM

utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
1821	Why do the sparse tensor utils depend on sparse tensors now, and not the other way around like they used to? Is this still required with the new structure? It feels a bit the wrong way to me.

wrengr added inline comments. Dec 20 2021, 11:48 AM

utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
1821	Imo, it makes the most sense to have the definition of a language be a priori to and independent of any utilities/transformations on that language. The naturalness of this layering is evidenced by the fact that the IR library doesn't actually depend on the utils at all (cf., line 1807 above), as well as the utils needing to pull in the IncGen dependencies if it can't get them from the IR library (which strikes me as a failure/violation of cohesion/encapsulation). But I can revert the change if you feel strongly

aartbik added inline comments. Dec 20 2021, 12:12 PM

utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
1821	I suppose that is true. I was thinking mainly of the new utils that support IR building, but there are also utils that indeed use the sparse encoding. Note that part of this is also because the sparse tensor build rule is too large, I really would like to see the sparse tensor attribute being independent of the dialect, especially since this may be used outside the sparse tensor dialect without having to pull in everything (see issue https://github.com/llvm/llvm-project/issues/52748 I filed recently).

aartbik accepted this revision. Dec 20 2021, 12:12 PM

wrengr added inline comments. Dec 20 2021, 12:29 PM

utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
1821	Making the attributes independent of the rest of the dialect makes sense to me. Nicely separates the type definition from the operations, avoiding the analogue of the forward class declaration hack (and opening the way to split up the sparse_tensor dialect or define alternatives to it, if we needed either for some reason). So, should I revert the BUILD.bazel change or no?

I am okay with BUILD file as is right now. I think we may bikeshed about this later when we extract encoding out ;-)

I don't have any concerns after moving to lib/, deferring the approval for this dir to Aart.

This revision is now accepted and ready to land. Jan 4 2022, 1:41 PM

rebase

Harbormaster completed remote builds in B141574: Diff 397411. Jan 4 2022, 3:25 PM

Closed by commit rG85b8d03e12bb: [mlir][sparse] Factoring out Transforms/CodegenUtils.{cpp,h} (authored by wrengr). Jan 4 2022, 4:11 PM

This revision was automatically updated to reflect the committed changes.

wrengr added a commit: rG85b8d03e12bb: [mlir][sparse] Factoring out Transforms/CodegenUtils.{cpp,h}.

Revision Contents

Path

Size

mlir/

lib/

Dialect/

SparseTensor/

Transforms/

SparseTensorConversion.cpp

124 lines

Sparsification.cpp

96 lines

Utils/

1 line

166 lines

128 lines

6 lines

utils/

bazel/

llvm-project-overlay/

mlir/

BUILD.bazel

23 lines

Diff 393328

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp

 //===- SparseTensorConversion.cpp - Sparse tensor primitives conversion ---===//
 //
 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
 // See https://llvm.org/LICENSE.txt for license information.
 // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
 //
 //===----------------------------------------------------------------------===//
 //
 // Convert sparse tensor primitives to calls into a runtime support library.
 // Note that this is a current implementation choice to keep the conversion
 // simple. In principle, these primitives could also be converted to actual
 // elaborate IR code that implements the primitives on the selected sparse
 // tensor storage schemes.
 //
 //===----------------------------------------------------------------------===//

+#include "../Utils/CodegenUtils.h"
 #include "mlir/Dialect/Bufferization/IR/Bufferization.h"
 #include "mlir/Dialect/LLVMIR/LLVMDialect.h"
 #include "mlir/Dialect/Linalg/Utils/Utils.h"
 #include "mlir/Dialect/MemRef/IR/MemRef.h"
 #include "mlir/Dialect/SCF/SCF.h"
 #include "mlir/Dialect/SparseTensor/IR/SparseTensor.h"
 #include "mlir/Dialect/SparseTensor/Transforms/Passes.h"
 #include "mlir/Dialect/StandardOps/IR/Ops.h"
[9 lines not shown]
 /// Shorthand aliases for the `emitCInterface` argument to `getFunc()`,
 /// `createFuncCall()`, and `replaceOpWithFuncCall()`.
 enum class EmitCInterface : bool { Off = false, On = true };

 //===----------------------------------------------------------------------===//
 // Helper methods.
 //===----------------------------------------------------------------------===//

-/// Generates a constant zero of the given type.
-inline static Value constantZero(ConversionPatternRewriter &rewriter,
-                                 Location loc, Type t) {
-  return rewriter.create<arith::ConstantOp>(loc, t, rewriter.getZeroAttr(t));
-}
-
-/// Generates a constant of `index` type.
-inline static Value constantIndex(ConversionPatternRewriter &rewriter,
-                                  Location loc, int64_t i) {
-  return rewriter.create<arith::ConstantIndexOp>(loc, i);
-}
-
-/// Generates a constant of `i32` type.
-inline static Value constantI32(ConversionPatternRewriter &rewriter,
-                                Location loc, int32_t i) {
-  return rewriter.create<arith::ConstantIntOp>(loc, i, 32);
-}
-
-/// Generates a constant of `i8` type.
-inline static Value constantI8(ConversionPatternRewriter &rewriter,
-                               Location loc, int8_t i) {
-  return rewriter.create<arith::ConstantIntOp>(loc, i, 8);
-}
-
-/// Generates a constant of the given `Action`.
-static Value constantAction(ConversionPatternRewriter &rewriter, Location loc,
-                            Action action) {
-  return constantI32(rewriter, loc, static_cast<uint32_t>(action));
-}
-
-/// Generates a constant of the internal type encoding for overhead storage.
-static Value constantOverheadTypeEncoding(ConversionPatternRewriter &rewriter,
-                                          Location loc, unsigned width) {
-  OverheadType sec;
-  switch (width) {
-  default:
-    sec = OverheadType::kU64;
-    break;
-  case 32:
-    sec = OverheadType::kU32;
-    break;
-  case 16:
-    sec = OverheadType::kU16;
-    break;
-  case 8:
-    sec = OverheadType::kU8;
-    break;
-  }
-  return constantI32(rewriter, loc, static_cast<uint32_t>(sec));
-}
-
-/// Generates a constant of the internal type encoding for pointer
-/// overhead storage.
-static Value constantPointerTypeEncoding(ConversionPatternRewriter &rewriter,
-                                         Location loc,
-                                         SparseTensorEncodingAttr &enc) {
-  return constantOverheadTypeEncoding(rewriter, loc, enc.getPointerBitWidth());
-}
-
-/// Generates a constant of the internal type encoding for index overhead
-/// storage.
-static Value constantIndexTypeEncoding(ConversionPatternRewriter &rewriter,
-                                       Location loc,
-                                       SparseTensorEncodingAttr &enc) {
-  return constantOverheadTypeEncoding(rewriter, loc, enc.getIndexBitWidth());
-}
-
-/// Generates a constant of the internal type encoding for primary storage.
-static Value constantPrimaryTypeEncoding(ConversionPatternRewriter &rewriter,
-                                         Location loc, Type tp) {
-  PrimaryType primary;
-  if (tp.isF64())
-    primary = PrimaryType::kF64;
-  else if (tp.isF32())
-    primary = PrimaryType::kF32;
-  else if (tp.isInteger(64))
-    primary = PrimaryType::kI64;
-  else if (tp.isInteger(32))
-    primary = PrimaryType::kI32;
-  else if (tp.isInteger(16))
-    primary = PrimaryType::kI16;
-  else if (tp.isInteger(8))
-    primary = PrimaryType::kI8;
-  else
-    llvm_unreachable("Unknown element type");
-  return constantI32(rewriter, loc, static_cast<uint32_t>(primary));
-}
-
-/// Generates a constant of the internal dimension level type encoding.
-static Value
-constantDimLevelTypeEncoding(ConversionPatternRewriter &rewriter, Location loc,
-                             SparseTensorEncodingAttr::DimLevelType dlt) {
-  DimLevelType dlt2;
-  switch (dlt) {
-  case SparseTensorEncodingAttr::DimLevelType::Dense:
-    dlt2 = DimLevelType::kDense;
-    break;
-  case SparseTensorEncodingAttr::DimLevelType::Compressed:
-    dlt2 = DimLevelType::kCompressed;
-    break;
-  case SparseTensorEncodingAttr::DimLevelType::Singleton:
-    dlt2 = DimLevelType::kSingleton;
-    break;
-  }
-  return constantI8(rewriter, loc, static_cast<uint8_t>(dlt2));
-}
-
 /// Returns the equivalent of `void*` for opaque arguments to the
 /// execution engine.
 static Type getOpaquePointerType(PatternRewriter &rewriter) {
   return LLVM::LLVMPointerType::get(rewriter.getI8Type());
 }

 /// Returns a function reference (first hit also inserts into module). Sets
 /// the "_emit_c_interface" on the function declaration when requested,
[174 lines not shown]	static void newParams(ConversionPatternRewriter &rewriter,
// User action.		// User action.
params.push_back(constantAction(rewriter, loc, action));		params.push_back(constantAction(rewriter, loc, action));
// Payload pointer.		// Payload pointer.
if (!ptr)		if (!ptr)
ptr = rewriter.create<LLVM::NullOp>(loc, getOpaquePointerType(rewriter));		ptr = rewriter.create<LLVM::NullOp>(loc, getOpaquePointerType(rewriter));
params.push_back(ptr);		params.push_back(ptr);
}		}

/// Generates the comparison `v != 0` where `v` is of numeric type `t`.
/// For floating types, we use the "unordered" comparator (i.e., returns
/// true if `v` is NaN).
static Value genIsNonzero(ConversionPatternRewriter &rewriter, Location loc,
Value v) {
Type t = v.getType();
Value zero = constantZero(rewriter, loc, t);
if (t.isa<FloatType>())
return rewriter.create<arith::CmpFOp>(loc, arith::CmpFPredicate::UNE, v,
zero);
if (t.isIntOrIndex())
return rewriter.create<arith::CmpIOp>(loc, arith::CmpIPredicate::ne, v,
zero);
llvm_unreachable("Unknown element type");
}
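
As a standalone illustration of the "unordered" comparison described in the comment above, the following plain C++ sketch mimics an "unordered or not equal" test against zero. It is not the MLIR code itself; the helper names `isNonzeroUnordered` and `isNonzeroOrdered` are hypothetical, and the sketch assumes IEEE-754 semantics, where NaN compares unequal to every value.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper mirroring an "unordered or not equal" test against 0:
// the result is true when v is NaN or when v differs from zero.
inline bool isNonzeroUnordered(double v) {
  return std::isnan(v) || v != 0.0;
}

// The "ordered" variant, shown only for contrast: NaN yields false here.
inline bool isNonzeroOrdered(double v) {
  return !std::isnan(v) && v != 0.0;
}
```

With the unordered form a NaN element is treated as nonzero, so it is kept rather than silently dropped from the sparse result.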

/// Generates the code to read the value from tensor[ivs], and conditionally		/// Generates the code to read the value from tensor[ivs], and conditionally
/// stores the indices ivs to the memory in ind. The generated code looks like		/// stores the indices ivs to the memory in ind. The generated code looks like
/// the following and the insertion point after this routine is inside the		/// the following and the insertion point after this routine is inside the
/// if-then branch behind the assignment to ind. This is to ensure that the		/// if-then branch behind the assignment to ind. This is to ensure that the
/// addEltX call generated after is inside the if-then branch.		/// addEltX call generated after is inside the if-then branch.
/// if (tensor[ivs]!=0) {		/// if (tensor[ivs]!=0) {
/// ind = ivs		/// ind = ivs
static Value genIndexAndValueForDense(ConversionPatternRewriter &rewriter,		static Value genIndexAndValueForDense(ConversionPatternRewriter &rewriter,
[... 648 lines not shown ...]

mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp

//===- Sparsification.cpp - Implementation of sparsification --------------===//		//===- Sparsification.cpp - Implementation of sparsification --------------===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
// This file implements converting sparse tensor types to actual sparse code.		// This file implements converting sparse tensor types to actual sparse code.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

		#include "../Utils/CodegenUtils.h"
#include "mlir/Dialect/Affine/IR/AffineOps.h"		#include "mlir/Dialect/Affine/IR/AffineOps.h"
#include "mlir/Dialect/Arithmetic/IR/Arithmetic.h"		#include "mlir/Dialect/Arithmetic/IR/Arithmetic.h"
#include "mlir/Dialect/Bufferization/IR/Bufferization.h"		#include "mlir/Dialect/Bufferization/IR/Bufferization.h"
#include "mlir/Dialect/Linalg/ComprehensiveBufferize/BufferizableOpInterface.h"		#include "mlir/Dialect/Linalg/ComprehensiveBufferize/BufferizableOpInterface.h"
#include "mlir/Dialect/Linalg/IR/LinalgOps.h"		#include "mlir/Dialect/Linalg/IR/LinalgOps.h"
#include "mlir/Dialect/Linalg/Utils/Utils.h"		#include "mlir/Dialect/Linalg/Utils/Utils.h"
#include "mlir/Dialect/MemRef/IR/MemRef.h"		#include "mlir/Dialect/MemRef/IR/MemRef.h"
#include "mlir/Dialect/SCF/SCF.h"		#include "mlir/Dialect/SCF/SCF.h"
[... 380 lines not shown ...]
/// and a straightforward horizontal reduction will complete the operation.		/// and a straightforward horizontal reduction will complete the operation.
static Value genVectorReducInit(CodeGen &codegen, PatternRewriter &rewriter,		static Value genVectorReducInit(CodeGen &codegen, PatternRewriter &rewriter,
Location loc, VectorType vtp) {		Location loc, VectorType vtp) {
Value r = codegen.redVal;		Value r = codegen.redVal;
switch (codegen.redKind) {		switch (codegen.redKind) {
case kNoReduc:		case kNoReduc:
break;		break;
case kSum:		case kSum:
case kXor: {		case kXor:
// Initialize reduction vector to: | 0 | .. | 0 | r |		// Initialize reduction vector to: | 0 | .. | 0 | r |
Attribute zero = rewriter.getZeroAttr(vtp);
Value vec = rewriter.create<arith::ConstantOp>(loc, vtp, zero);
return rewriter.create<vector::InsertElementOp>(		return rewriter.create<vector::InsertElementOp>(
loc, r, vec, rewriter.create<arith::ConstantIndexOp>(loc, 0));		loc, r, constantZero(rewriter, loc, vtp),
}		constantIndex(rewriter, loc, 0));
case kProduct: {		case kProduct:
// Initialize reduction vector to: | 1 | .. | 1 | r |		// Initialize reduction vector to: | 1 | .. | 1 | r |
Type etp = vtp.getElementType();
Attribute one;
if (etp.isa<FloatType>())
one = rewriter.getFloatAttr(etp, 1.0);
else
one = rewriter.getIntegerAttr(etp, 1);
Value vec = rewriter.create<arith::ConstantOp>(
loc, vtp, DenseElementsAttr::get(vtp, one));
return rewriter.create<vector::InsertElementOp>(		return rewriter.create<vector::InsertElementOp>(
loc, r, vec, rewriter.create<arith::ConstantIndexOp>(loc, 0));		loc, r, constantOne(rewriter, loc, vtp),
}		constantIndex(rewriter, loc, 0));
case kAnd:		case kAnd:
case kOr:		case kOr:
// Initialize reduction vector to: | r | .. | r | r |		// Initialize reduction vector to: | r | .. | r | r |
return rewriter.create<vector::BroadcastOp>(loc, vtp, r);		return rewriter.create<vector::BroadcastOp>(loc, vtp, r);
}		}
llvm_unreachable("unknown reduction kind");		llvm_unreachable("unknown reduction kind");
}		}
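
The initialization scheme in `genVectorReducInit` can be sketched in plain C++, outside of MLIR. The sketch below is only an illustration under stated assumptions: the names `Reduc`, `reducInit`, and `horizontalReduce` are hypothetical, only three reduction kinds are modeled, and `long` stands in for the element type. The idea it shows is that every lane except one starts at the neutral element of the reduction (or at `r` itself for idempotent operations), so a horizontal reduction of the initial vector gives back `r`.

```cpp
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical subset of the reduction kinds handled in the diff.
enum class Reduc { Sum, Product, And };

// Builds the initial accumulator vector: neutral element in every lane
// except lane 0, which holds the scalar reduction value r. For And, the
// value r is broadcast to all lanes, since r & r == r.
inline std::vector<long> reducInit(Reduc kind, long r, std::size_t lanes) {
  long fill = 0;                       // neutral element for Sum
  if (kind == Reduc::Product) fill = 1; // neutral element for Product
  if (kind == Reduc::And) fill = r;     // broadcast for And
  std::vector<long> v(lanes, fill);
  if (!v.empty()) v[0] = r;
  return v;
}

// Collapses the accumulator vector back into a scalar.
inline long horizontalReduce(Reduc kind, const std::vector<long> &v) {
  switch (kind) {
  case Reduc::Sum:
    return std::accumulate(v.begin(), v.end(), 0L);
  case Reduc::Product:
    return std::accumulate(v.begin(), v.end(), 1L, std::multiplies<long>());
  case Reduc::And:
    return std::accumulate(v.begin(), v.end(), ~0L, std::bit_and<long>());
  }
  return 0; // not reached for the kinds above
}
```

The same reasoning covers the remaining kinds in the diff: `kXor` shares the zero fill with `kSum`, and `kOr` shares the broadcast with `kAnd`.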

[... 11 lines not shown ...]	static void updateReduc(Merger &merger, CodeGen &codegen, Value reduc) {
assert(codegen.redKind != kNoReduc);		assert(codegen.redKind != kNoReduc);
codegen.redVal = merger.exp(codegen.redExp).val = reduc;		codegen.redVal = merger.exp(codegen.redExp).val = reduc;
}		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Sparse compiler synthesis methods (statements and expressions).		// Sparse compiler synthesis methods (statements and expressions).
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

/// Maps sparse integer option to actual integral storage type.
static Type genIntType(PatternRewriter &rewriter, unsigned width) {
if (width == 0)
return rewriter.getIndexType();
return rewriter.getIntegerType(width);
}

/// Generates buffer for the output tensor. Note that all sparse kernels		/// Generates buffer for the output tensor. Note that all sparse kernels
/// assume that when all elements are written to (viz. x(i) = y(i) * z(i)),		/// assume that when all elements are written to (viz. x(i) = y(i) * z(i)),
/// the output buffer is already initialized to all zeroes and only nonzero		/// the output buffer is already initialized to all zeroes and only nonzero
/// values are computed and written out. For updates (viz. x(i) += y(i) * z(i)),		/// values are computed and written out. For updates (viz. x(i) += y(i) * z(i)),
/// only nonzero values are used for the updates and no assumption on the		/// only nonzero values are used for the updates and no assumption on the
/// original contents of the output buffer is necessary.		/// original contents of the output buffer is necessary.
static Value genOutputBuffer(CodeGen &codegen, PatternRewriter &rewriter,		static Value genOutputBuffer(CodeGen &codegen, PatternRewriter &rewriter,
linalg::GenericOp op, MemRefType denseTp,		linalg::GenericOp op, MemRefType denseTp,
ArrayRef<Value> args) {		ArrayRef<Value> args) {
Location loc = op.getLoc();		Location loc = op.getLoc();
Value tensor = op.getOutputOperand(0)->get();		Value tensor = op.getOutputOperand(0)->get();
// The output tensor simply could materialize from the buffer that will		// The output tensor simply could materialize from the buffer that will
// be generated for the tensor present in the outs() clause. This has		// be generated for the tensor present in the outs() clause. This has
// the major advantage that the sparse kernel only updates the nonzero		// the major advantage that the sparse kernel only updates the nonzero
// positions for the output tensor.		// positions for the output tensor.
if (isInPlace(tensor))		if (isInPlace(tensor))
return rewriter.create<bufferization::ToMemrefOp>(loc, denseTp, tensor);		return rewriter.create<bufferization::ToMemrefOp>(loc, denseTp, tensor);
// By default, a new buffer is allocated which is initialized to the		// By default, a new buffer is allocated which is initialized to the
// tensor defined in the outs() clause. This is always correct but		// tensor defined in the outs() clause. This is always correct but
// introduces a dense initialization component that may negatively		// introduces a dense initialization component that may negatively
// impact the running complexity of the sparse kernel. If the tensor		// impact the running complexity of the sparse kernel. If the tensor
// materializes into the computation, we need to preserve the zero		// materializes into the computation, we need to preserve the zero
// initialization assumption of all sparse output buffers.		// initialization assumption of all sparse output buffers.
if (isMaterializing(tensor)) {		if (isMaterializing(tensor)) {
Type tp = denseTp.getElementType();
Value alloc = rewriter.create<memref::AllocOp>(loc, denseTp, args);		Value alloc = rewriter.create<memref::AllocOp>(loc, denseTp, args);
Value zero =		Value zero = constantZero(rewriter, loc, denseTp.getElementType());
rewriter.create<arith::ConstantOp>(loc, tp, rewriter.getZeroAttr(tp));
rewriter.create<linalg::FillOp>(loc, zero, alloc);		rewriter.create<linalg::FillOp>(loc, zero, alloc);
return alloc;		return alloc;
}		}
Value init = rewriter.create<bufferization::ToMemrefOp>(loc, denseTp, tensor);		Value init = rewriter.create<bufferization::ToMemrefOp>(loc, denseTp, tensor);
Value alloc = rewriter.create<memref::AllocOp>(loc, denseTp, args);		Value alloc = rewriter.create<memref::AllocOp>(loc, denseTp, args);
rewriter.create<memref::CopyOp>(loc, init, alloc);		rewriter.create<memref::CopyOp>(loc, init, alloc);
return alloc;		return alloc;
}		}
[... 18 lines not shown ...]	for (OpOperand *t : op.getInputAndOutputOperands()) {
for (unsigned d = 0, rank = map.getNumResults(); d < rank; d++) {		for (unsigned d = 0, rank = map.getNumResults(); d < rank; d++) {
AffineExpr a = map.getResult(perm(enc, d));		AffineExpr a = map.getResult(perm(enc, d));
if (a.getKind() != AffineExprKind::DimId)		if (a.getKind() != AffineExprKind::DimId)
continue; // compound		continue; // compound
unsigned idx = a.cast<AffineDimExpr>().getPosition();		unsigned idx = a.cast<AffineDimExpr>().getPosition();
// Handle sparse storage schemes.		// Handle sparse storage schemes.
if (merger.isDim(tensor, idx, Dim::kSparse)) {		if (merger.isDim(tensor, idx, Dim::kSparse)) {
auto dynShape = {ShapedType::kDynamicSize};		auto dynShape = {ShapedType::kDynamicSize};
auto ptrTp = MemRefType::get(		auto ptrTp =
dynShape, genIntType(rewriter, enc.getPointerBitWidth()));		MemRefType::get(dynShape, getPointerOverheadType(rewriter, enc));
auto indTp = MemRefType::get(		auto indTp =
dynShape, genIntType(rewriter, enc.getIndexBitWidth()));		MemRefType::get(dynShape, getIndexOverheadType(rewriter, enc));
Value dim = rewriter.create<arith::ConstantIndexOp>(loc, d);		Value dim = constantIndex(rewriter, loc, d);
// Generate sparse primitives to obtain pointers and indices.		// Generate sparse primitives to obtain pointers and indices.
codegen.pointers[tensor][idx] =		codegen.pointers[tensor][idx] =
rewriter.create<ToPointersOp>(loc, ptrTp, t->get(), dim);		rewriter.create<ToPointersOp>(loc, ptrTp, t->get(), dim);
codegen.indices[tensor][idx] =		codegen.indices[tensor][idx] =
rewriter.create<ToIndicesOp>(loc, indTp, t->get(), dim);		rewriter.create<ToIndicesOp>(loc, indTp, t->get(), dim);
}		}
// Find upper bound in current dimension.		// Find upper bound in current dimension.
unsigned p = perm(enc, d);		unsigned p = perm(enc, d);
[... 14 lines not shown ...]	if (!enc) {
if (tensor < op.getNumInputs())		if (tensor < op.getNumInputs())
codegen.buffers[tensor] =		codegen.buffers[tensor] =
rewriter.create<bufferization::ToMemrefOp>(loc, denseTp, t->get());		rewriter.create<bufferization::ToMemrefOp>(loc, denseTp, t->get());
else		else
codegen.buffers[tensor] =		codegen.buffers[tensor] =
genOutputBuffer(codegen, rewriter, op, denseTp, args);		genOutputBuffer(codegen, rewriter, op, denseTp, args);
} else if (t == codegen.sparseOut) {		} else if (t == codegen.sparseOut) {
// True sparse output needs a lexIdx array.		// True sparse output needs a lexIdx array.
Value rank = rewriter.create<arith::ConstantIndexOp>(loc, op.getRank(t));		Value rank = constantIndex(rewriter, loc, op.getRank(t));
auto dynShape = {ShapedType::kDynamicSize};		auto dynShape = {ShapedType::kDynamicSize};
auto memTp = MemRefType::get(dynShape, rewriter.getIndexType());		auto memTp = MemRefType::get(dynShape, rewriter.getIndexType());
codegen.lexIdx = rewriter.create<memref::AllocaOp>(loc, memTp, rank);		codegen.lexIdx = rewriter.create<memref::AllocaOp>(loc, memTp, rank);
} else {		} else {
// Annotated sparse tensors.		// Annotated sparse tensors.
auto dynShape = {ShapedType::kDynamicSize};		auto dynShape = {ShapedType::kDynamicSize};
auto sparseTp = MemRefType::get(dynShape, elementType);		auto sparseTp = MemRefType::get(dynShape, elementType);
codegen.buffers[tensor] =		codegen.buffers[tensor] =
[... 11 lines not shown ...]
static VectorType vectorType(CodeGen &codegen, Value ptr) {		static VectorType vectorType(CodeGen &codegen, Value ptr) {
return vectorType(codegen, ptr.getType().cast<MemRefType>().getElementType());		return vectorType(codegen, ptr.getType().cast<MemRefType>().getElementType());
}		}

/// Constructs vector iteration mask.		/// Constructs vector iteration mask.
static Value genVectorMask(CodeGen &codegen, PatternRewriter &rewriter,		static Value genVectorMask(CodeGen &codegen, PatternRewriter &rewriter,
Value iv, Value lo, Value hi, Value step) {		Value iv, Value lo, Value hi, Value step) {
Location loc = iv.getLoc();		Location loc = iv.getLoc();
VectorType mtp = vectorType(codegen, genIntType(rewriter, 1));		VectorType mtp = vectorType(codegen, rewriter.getI1Type());
// Special case if the vector length evenly divides the trip count (for		// Special case if the vector length evenly divides the trip count (for
// example, "for i = 0, 128, 16"). A constant all-true mask is generated		// example, "for i = 0, 128, 16"). A constant all-true mask is generated
// so that all subsequent masked memory operations are immediately folded		// so that all subsequent masked memory operations are immediately folded
// into unconditional memory operations.		// into unconditional memory operations.
IntegerAttr loInt, hiInt, stepInt;		IntegerAttr loInt, hiInt, stepInt;
if (matchPattern(lo, m_Constant(&loInt)) &&		if (matchPattern(lo, m_Constant(&loInt)) &&
matchPattern(hi, m_Constant(&hiInt)) &&		matchPattern(hi, m_Constant(&hiInt)) &&
matchPattern(step, m_Constant(&stepInt))) {		matchPattern(step, m_Constant(&stepInt))) {
if (((hiInt.getInt() - loInt.getInt()) % stepInt.getInt()) == 0)		if (((hiInt.getInt() - loInt.getInt()) % stepInt.getInt()) == 0)
return rewriter.create<vector::BroadcastOp>(		return rewriter.create<vector::BroadcastOp>(
loc, mtp, rewriter.create<arith::ConstantIntOp>(loc, 1, 1));		loc, mtp, constantI1(rewriter, loc, true));
}		}
// Otherwise, generate a vector mask that avoids overrunning the upperbound		// Otherwise, generate a vector mask that avoids overrunning the upperbound
// during vector execution. Here we rely on subsequent loop optimizations to		// during vector execution. Here we rely on subsequent loop optimizations to
// avoid executing the mask in all iterations, for example, by splitting the		// avoid executing the mask in all iterations, for example, by splitting the
// loop into an unconditional vector loop and a scalar cleanup loop.		// loop into an unconditional vector loop and a scalar cleanup loop.
auto minMap = AffineMap::get(		auto minMap = AffineMap::get(
/*dimCount=*/2, /*symbolCount=*/1,		/*dimCount=*/2, /*symbolCount=*/1,
{rewriter.getAffineSymbolExpr(0),		{rewriter.getAffineSymbolExpr(0),
rewriter.getAffineDimExpr(0) - rewriter.getAffineDimExpr(1)},		rewriter.getAffineDimExpr(0) - rewriter.getAffineDimExpr(1)},
rewriter.getContext());		rewriter.getContext());
Value end =		Value end =
rewriter.createOrFold<AffineMinOp>(loc, minMap, ValueRange{hi, iv, step});		rewriter.createOrFold<AffineMinOp>(loc, minMap, ValueRange{hi, iv, step});
return rewriter.create<vector::CreateMaskOp>(loc, mtp, end);		return rewriter.create<vector::CreateMaskOp>(loc, mtp, end);
}		}
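
The two cases in `genVectorMask` can be illustrated with a scalar C++ sketch. This is not the MLIR lowering; the helpers `tripCountDivides` and `vectorMask` are hypothetical names, and the sketch assumes that the affine min computes `min(step, hi - iv)` and that the mask enables the first `end` lanes, which is how the operands in the code above appear to be wired.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// True when the vector length evenly divides the trip count, in which
// case a constant all-true mask suffices for every iteration.
inline bool tripCountDivides(int64_t lo, int64_t hi, int64_t step) {
  return (hi - lo) % step == 0;
}

// Hypothetical helper: lane mask for the vector iteration starting at iv.
// Lanes [0, min(step, hi - iv)) are enabled, the rest are disabled so the
// last iteration does not run past the upper bound.
inline std::vector<bool> vectorMask(int64_t iv, int64_t hi, int64_t step) {
  int64_t end = std::min(step, hi - iv);
  std::vector<bool> mask(static_cast<std::size_t>(step), false);
  for (int64_t lane = 0; lane < end; ++lane)
    mask[static_cast<std::size_t>(lane)] = true;
  return mask;
}
```

For a loop such as "for i = 0, 100, 16", the final iteration starts at 96 and only 4 of the 16 lanes remain enabled.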

/// Generates a vectorized load lhs = a[ind[lo:hi]] or lhs = a[lo:hi].		/// Generates a vectorized load lhs = a[ind[lo:hi]] or lhs = a[lo:hi].
static Value genVectorLoad(CodeGen &codegen, PatternRewriter &rewriter,		static Value genVectorLoad(CodeGen &codegen, PatternRewriter &rewriter,
Value ptr, ArrayRef<Value> args) {		Value ptr, ArrayRef<Value> args) {
Location loc = ptr.getLoc();		Location loc = ptr.getLoc();
VectorType vtp = vectorType(codegen, ptr);		VectorType vtp = vectorType(codegen, ptr);
Value pass =		Value pass = constantZero(rewriter, loc, vtp);
rewriter.create<arith::ConstantOp>(loc, vtp, rewriter.getZeroAttr(vtp));
if (args.back().getType().isa<VectorType>()) {		if (args.back().getType().isa<VectorType>()) {
SmallVector<Value, 4> scalarArgs(args.begin(), args.end());		SmallVector<Value, 4> scalarArgs(args.begin(), args.end());
Value indexVec = args.back();		Value indexVec = args.back();
scalarArgs.back() = rewriter.create<arith::ConstantIndexOp>(loc, 0);		scalarArgs.back() = constantIndex(rewriter, loc, 0);
return rewriter.create<vector::GatherOp>(		return rewriter.create<vector::GatherOp>(
loc, vtp, ptr, scalarArgs, indexVec, codegen.curVecMask, pass);		loc, vtp, ptr, scalarArgs, indexVec, codegen.curVecMask, pass);
}		}
return rewriter.create<vector::MaskedLoadOp>(loc, vtp, ptr, args,		return rewriter.create<vector::MaskedLoadOp>(loc, vtp, ptr, args,
codegen.curVecMask, pass);		codegen.curVecMask, pass);
}		}

/// Generates a vectorized store a[ind[lo:hi]] = rhs or a[lo:hi] = rhs.		/// Generates a vectorized store a[ind[lo:hi]] = rhs or a[lo:hi] = rhs.
static void genVectorStore(CodeGen &codegen, PatternRewriter &rewriter,		static void genVectorStore(CodeGen &codegen, PatternRewriter &rewriter,
Value rhs, Value ptr, ArrayRef<Value> args) {		Value rhs, Value ptr, ArrayRef<Value> args) {
Location loc = ptr.getLoc();		Location loc = ptr.getLoc();
if (args.back().getType().isa<VectorType>()) {		if (args.back().getType().isa<VectorType>()) {
SmallVector<Value, 4> scalarArgs(args.begin(), args.end());		SmallVector<Value, 4> scalarArgs(args.begin(), args.end());
Value indexVec = args.back();		Value indexVec = args.back();
scalarArgs.back() = rewriter.create<arith::ConstantIndexOp>(loc, 0);		scalarArgs.back() = constantIndex(rewriter, loc, 0);
rewriter.create<vector::ScatterOp>(loc, ptr, scalarArgs, indexVec,		rewriter.create<vector::ScatterOp>(loc, ptr, scalarArgs, indexVec,
codegen.curVecMask, rhs);		codegen.curVecMask, rhs);
return;		return;
}		}
rewriter.create<vector::MaskedStoreOp>(loc, ptr, args, codegen.curVecMask,		rewriter.create<vector::MaskedStoreOp>(loc, ptr, args, codegen.curVecMask,
rhs);		rhs);
}		}

[... 25 lines not shown ...]	static Value genAffine(CodeGen &codegen, PatternRewriter &rewriter,
case AffineExprKind::Mul: {		case AffineExprKind::Mul: {
auto binOp = a.cast<AffineBinaryOpExpr>();		auto binOp = a.cast<AffineBinaryOpExpr>();
return rewriter.create<arith::MulIOp>(		return rewriter.create<arith::MulIOp>(
loc, genAffine(codegen, rewriter, binOp.getLHS(), loc),		loc, genAffine(codegen, rewriter, binOp.getLHS(), loc),
genAffine(codegen, rewriter, binOp.getRHS(), loc));		genAffine(codegen, rewriter, binOp.getRHS(), loc));
}		}
case AffineExprKind::Constant: {		case AffineExprKind::Constant: {
int64_t c = a.cast<AffineConstantExpr>().getValue();		int64_t c = a.cast<AffineConstantExpr>().getValue();
return rewriter.create<arith::ConstantIndexOp>(loc, c);		return constantIndex(rewriter, loc, c);
}		}
default:		default:
llvm_unreachable("unexpected affine subscript");		llvm_unreachable("unexpected affine subscript");
}		}
}		}

/// Generates index for load/store on sparse tensor.		/// Generates index for load/store on sparse tensor.
static Value genIndex(CodeGen &codegen, linalg::GenericOp op, OpOperand *t) {		static Value genIndex(CodeGen &codegen, linalg::GenericOp op, OpOperand *t) {
[... 32 lines not shown ...]

/// Generates insertion code to implement dynamic tensor load.		/// Generates insertion code to implement dynamic tensor load.
static Value genInsertionLoad(CodeGen &codegen, PatternRewriter &rewriter,		static Value genInsertionLoad(CodeGen &codegen, PatternRewriter &rewriter,
linalg::GenericOp op, OpOperand *t) {		linalg::GenericOp op, OpOperand *t) {
Location loc = op.getLoc();		Location loc = op.getLoc();
// Direct lexicographic index order, tensor loads as zero.		// Direct lexicographic index order, tensor loads as zero.
if (!codegen.expValues) {		if (!codegen.expValues) {
Type tp = getElementTypeOrSelf(t->get().getType());		Type tp = getElementTypeOrSelf(t->get().getType());
return rewriter.create<arith::ConstantOp>(loc, tp,		return constantZero(rewriter, loc, tp);
rewriter.getZeroAttr(tp));
}		}
// Load from expanded access pattern.		// Load from expanded access pattern.
Value index = genIndex(codegen, op, t);		Value index = genIndex(codegen, op, t);
return rewriter.create<memref::LoadOp>(loc, codegen.expValues, index);		return rewriter.create<memref::LoadOp>(loc, codegen.expValues, index);
}		}

/// Generates insertion code to implement dynamic tensor store.		/// Generates insertion code to implement dynamic tensor store.
static void genInsertionStore(CodeGen &codegen, PatternRewriter &rewriter,		static void genInsertionStore(CodeGen &codegen, PatternRewriter &rewriter,
linalg::GenericOp op, OpOperand *t, Value rhs) {		linalg::GenericOp op, OpOperand *t, Value rhs) {
Location loc = op.getLoc();		Location loc = op.getLoc();
// Direct insertion in lexicographic index order.		// Direct insertion in lexicographic index order.
if (!codegen.expValues) {		if (!codegen.expValues) {
rewriter.create<LexInsertOp>(loc, t->get(), codegen.lexIdx, rhs);		rewriter.create<LexInsertOp>(loc, t->get(), codegen.lexIdx, rhs);
return;		return;
}		}
// Generates insertion code along expanded access pattern.		// Generates insertion code along expanded access pattern.
// if (!expFilled[i]) then		// if (!expFilled[i]) then
// expFilled[i] = true		// expFilled[i] = true
// expAdded[inserts++] = i		// expAdded[inserts++] = i
// endif		// endif
// values[i] = rhs		// values[i] = rhs
Value index = genIndex(codegen, op, t);		Value index = genIndex(codegen, op, t);
Value fval = rewriter.create<arith::ConstantIntOp>(loc, 0, 1); // false		Value fval = constantI1(rewriter, loc, false);
Value tval = rewriter.create<arith::ConstantIntOp>(loc, 1, 1); // true		Value tval = constantI1(rewriter, loc, true);
// If statement.		// If statement.
Value filled = rewriter.create<memref::LoadOp>(loc, codegen.expFilled, index);		Value filled = rewriter.create<memref::LoadOp>(loc, codegen.expFilled, index);
Value cond = rewriter.create<arith::CmpIOp>(loc, arith::CmpIPredicate::eq,		Value cond = rewriter.create<arith::CmpIOp>(loc, arith::CmpIPredicate::eq,
filled, fval);		filled, fval);
scf::IfOp ifOp = rewriter.create<scf::IfOp>(loc, rewriter.getIndexType(),		scf::IfOp ifOp = rewriter.create<scf::IfOp>(loc, rewriter.getIndexType(),
cond, /*else=*/true);		cond, /*else=*/true);
// True branch.		// True branch.
rewriter.setInsertionPointToStart(&ifOp.thenRegion().front());		rewriter.setInsertionPointToStart(&ifOp.thenRegion().front());
rewriter.create<memref::StoreOp>(loc, tval, codegen.expFilled, index);		rewriter.create<memref::StoreOp>(loc, tval, codegen.expFilled, index);
rewriter.create<memref::StoreOp>(loc, index, codegen.expAdded,		rewriter.create<memref::StoreOp>(loc, index, codegen.expAdded,
codegen.expCount);		codegen.expCount);
Value one = rewriter.create<arith::ConstantIndexOp>(loc, 1);		Value one = constantIndex(rewriter, loc, 1);
Value add = rewriter.create<arith::AddIOp>(loc, codegen.expCount, one);		Value add = rewriter.create<arith::AddIOp>(loc, codegen.expCount, one);
rewriter.create<scf::YieldOp>(loc, add);		rewriter.create<scf::YieldOp>(loc, add);
// False branch.		// False branch.
rewriter.setInsertionPointToStart(&ifOp.elseRegion().front());		rewriter.setInsertionPointToStart(&ifOp.elseRegion().front());
rewriter.create<scf::YieldOp>(loc, codegen.expCount);		rewriter.create<scf::YieldOp>(loc, codegen.expCount);
rewriter.setInsertionPointAfter(ifOp);		rewriter.setInsertionPointAfter(ifOp);
// Value assignment.		// Value assignment.
codegen.expCount = ifOp.getResult(0);		codegen.expCount = ifOp.getResult(0);
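
The pseudocode in the comment above (the `expFilled` / `expAdded` bookkeeping) can be written out as a small plain C++ sketch. It is an illustration only: the `Expanded` struct and `insertExpanded` function are hypothetical names, and `double` stands in for the element type. It shows why the `scf.if` yields an updated count: the index is recorded only on its first insertion, while the value is always overwritten.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the expanded access pattern buffers.
struct Expanded {
  std::vector<double> values;      // dense scratch values
  std::vector<bool> filled;        // which positions have been touched
  std::vector<std::size_t> added;  // touched positions, in insertion order
  std::size_t count = 0;           // number of valid entries in `added`

  explicit Expanded(std::size_t n)
      : values(n, 0.0), filled(n, false), added(n, 0) {}
};

// if (!filled[i]) { filled[i] = true; added[count++] = i; }
// values[i] = rhs;
inline void insertExpanded(Expanded &e, std::size_t i, double rhs) {
  if (!e.filled[i]) {
    e.filled[i] = true;
    e.added[e.count++] = i;
  }
  e.values[i] = rhs;
}
```

Inserting at positions 5, 2, and then 5 again leaves `count` at 2, with `added` holding 5 and 2 and the value at position 5 reflecting the last store.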
[... 70 lines not shown ...]	if (codegen.curVecLength > 1) {
// to state that the indices are unsigned, which creates the potential of		// to state that the indices are unsigned, which creates the potential of
// incorrect address calculations in the unlikely case we need such		// incorrect address calculations in the unlikely case we need such
// extremely large offsets.		// extremely large offsets.
Type etp = ptr.getType().cast<MemRefType>().getElementType();		Type etp = ptr.getType().cast<MemRefType>().getElementType();
Value vload = genVectorLoad(codegen, rewriter, ptr, {s});		Value vload = genVectorLoad(codegen, rewriter, ptr, {s});
if (!etp.isa<IndexType>()) {		if (!etp.isa<IndexType>()) {
if (etp.getIntOrFloatBitWidth() < 32)		if (etp.getIntOrFloatBitWidth() < 32)
vload = rewriter.create<arith::ExtUIOp>(		vload = rewriter.create<arith::ExtUIOp>(
loc, vload, vectorType(codegen, genIntType(rewriter, 32)));		loc, vload, vectorType(codegen, rewriter.getI32Type()));
else if (etp.getIntOrFloatBitWidth() < 64 &&		else if (etp.getIntOrFloatBitWidth() < 64 &&
!codegen.options.enableSIMDIndex32)		!codegen.options.enableSIMDIndex32)
vload = rewriter.create<arith::ExtUIOp>(		vload = rewriter.create<arith::ExtUIOp>(
loc, vload, vectorType(codegen, genIntType(rewriter, 64)));		loc, vload, vectorType(codegen, rewriter.getI64Type()));
}		}
return vload;		return vload;
}		}
// For the scalar case, we simply zero extend narrower indices into 64-bit		// For the scalar case, we simply zero extend narrower indices into 64-bit
// values before casting to index without a performance penalty. Here too,		// values before casting to index without a performance penalty. Here too,
// however, indices that already are 64-bit, in theory, cannot express the		// however, indices that already are 64-bit, in theory, cannot express the
// full range as explained above.		// full range as explained above.
Value load = rewriter.create<memref::LoadOp>(loc, ptr, s);		Value load = rewriter.create<memref::LoadOp>(loc, ptr, s);
if (!load.getType().isa<IndexType>()) {		if (!load.getType().isa<IndexType>()) {
if (load.getType().getIntOrFloatBitWidth() < 64)		if (load.getType().getIntOrFloatBitWidth() < 64)
load =		load = rewriter.create<arith::ExtUIOp>(loc, load, rewriter.getI64Type());
rewriter.create<arith::ExtUIOp>(loc, load, genIntType(rewriter, 64));
load =		load =
rewriter.create<arith::IndexCastOp>(loc, load, rewriter.getIndexType());		rewriter.create<arith::IndexCastOp>(loc, load, rewriter.getIndexType());
}		}
return load;		return load;
}		}
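
The reason `genLoad` uses zero extension for narrow overhead types can be seen in a scalar C++ sketch. This is an illustration, not the generated code; `zeroExtend` and `signExtend` are hypothetical helper names, and the sign-extending variant assumes the usual two's complement conversion.

```cpp
#include <cstdint>

// Zero extension: the stored 32-bit pattern is read as an unsigned index.
inline uint64_t zeroExtend(uint32_t narrow) {
  return static_cast<uint64_t>(narrow);
}

// Sign extension, for contrast: a stored index with the top bit set
// would turn into a negative offset.
inline int64_t signExtend(uint32_t narrow) {
  return static_cast<int64_t>(static_cast<int32_t>(narrow));
}
```

An index stored as `0xFFFFFFFF` stays 4294967295 under zero extension but would become -1 under sign extension, which is the incorrect address calculation the comments above warn about.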

/// Generates an invariant value.		/// Generates an invariant value.
static Value genInvariantValue(Merger &merger, CodeGen &codegen,		static Value genInvariantValue(Merger &merger, CodeGen &codegen,
[... 115 lines not shown ...]	if (!lhs || codegen.outerParNest != op.getRank(lhs) - 1 ||
return; // not needed at this level		return; // not needed at this level
// Generate start or end of an expanded access pattern.		// Generate start or end of an expanded access pattern.
Value tensor = lhs->get();		Value tensor = lhs->get();
Location loc = op.getLoc();		Location loc = op.getLoc();
if (atStart) {		if (atStart) {
auto dynShape = {ShapedType::kDynamicSize};		auto dynShape = {ShapedType::kDynamicSize};
Type etp = tensor.getType().cast<ShapedType>().getElementType();		Type etp = tensor.getType().cast<ShapedType>().getElementType();
Type t1 = MemRefType::get(dynShape, etp);		Type t1 = MemRefType::get(dynShape, etp);
Type t2 = MemRefType::get(dynShape, genIntType(rewriter, 1));		Type t2 = MemRefType::get(dynShape, rewriter.getI1Type());
Type t3 = MemRefType::get(dynShape, genIntType(rewriter, 0));		Type t3 = MemRefType::get(dynShape, rewriter.getIndexType());
Type t4 = rewriter.getIndexType();		Type t4 = rewriter.getIndexType();
auto res =		auto res =
rewriter.create<ExpandOp>(loc, TypeRange({t1, t2, t3, t4}), tensor);		rewriter.create<ExpandOp>(loc, TypeRange({t1, t2, t3, t4}), tensor);
assert(res.getNumResults() == 4);		assert(res.getNumResults() == 4);
assert(!codegen.expValues);		assert(!codegen.expValues);
codegen.expValues = res.getResult(0);		codegen.expValues = res.getResult(0);
codegen.expFilled = res.getResult(1);		codegen.expFilled = res.getResult(1);
codegen.expAdded = res.getResult(2);		codegen.expAdded = res.getResult(2);
[... 26 lines not shown ...]	if (inits[b]) {
if (merger.isDim(b, Dim::kSparse)) {		if (merger.isDim(b, Dim::kSparse)) {
// Initialize sparse index.		// Initialize sparse index.
unsigned pat = at;		unsigned pat = at;
for (; pat != 0; pat--) {		for (; pat != 0; pat--) {
if (codegen.pidxs[tensor][topSort[pat - 1]])		if (codegen.pidxs[tensor][topSort[pat - 1]])
break;		break;
}		}
Value ptr = codegen.pointers[tensor][idx];		Value ptr = codegen.pointers[tensor][idx];
Value one = rewriter.create<arith::ConstantIndexOp>(loc, 1);		Value one = constantIndex(rewriter, loc, 1);
Value p0 = (pat == 0) ? rewriter.create<arith::ConstantIndexOp>(loc, 0)		Value p0 = (pat == 0) ? constantIndex(rewriter, loc, 0)
: codegen.pidxs[tensor][topSort[pat - 1]];		: codegen.pidxs[tensor][topSort[pat - 1]];
codegen.pidxs[tensor][idx] = genLoad(codegen, rewriter, loc, ptr, p0);		codegen.pidxs[tensor][idx] = genLoad(codegen, rewriter, loc, ptr, p0);
Value p1 = rewriter.create<arith::AddIOp>(loc, p0, one);		Value p1 = rewriter.create<arith::AddIOp>(loc, p0, one);
codegen.highs[tensor][idx] = genLoad(codegen, rewriter, loc, ptr, p1);		codegen.highs[tensor][idx] = genLoad(codegen, rewriter, loc, ptr, p1);
} else {		} else {
// Dense index still in play.		// Dense index still in play.
needsUniv = true;		needsUniv = true;
}		}
}		}
}		}

// Initialize the universal dense index.		// Initialize the universal dense index.
codegen.loops[idx] = rewriter.create<arith::ConstantIndexOp>(loc, 0);		codegen.loops[idx] = constantIndex(rewriter, loc, 0);
return needsUniv;		return needsUniv;
}		}
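
The sparse index initialization above loads `ptr[p0]` and `ptr[p0 + 1]` as the low and high bounds of the loop over a compressed dimension. A plain C++ sketch of that CSR-style lookup follows; the function name `segmentIndices` and the sample arrays used with it are hypothetical, and the sketch assumes the conventional layout in which `pointers` has one more entry than the number of parent positions.

```cpp
#include <cstddef>
#include <vector>

// Returns the stored indices that belong to parent position p0 in a
// compressed dimension: the half-open range [pointers[p0], pointers[p0+1])
// of the indices array.
inline std::vector<std::size_t>
segmentIndices(const std::vector<std::size_t> &pointers,
               const std::vector<std::size_t> &indices, std::size_t p0) {
  std::size_t lo = pointers[p0];      // corresponds to pidxs[tensor][idx]
  std::size_t hi = pointers[p0 + 1];  // corresponds to highs[tensor][idx]
  std::vector<std::size_t> out;
  for (std::size_t p = lo; p < hi; ++p)
    out.push_back(indices[p]);
  return out;
}
```

With `pointers = {0, 2, 2, 5}` and `indices = {0, 3, 1, 2, 4}`, row 0 holds columns 0 and 3, row 1 is empty, and row 2 holds columns 1, 2, and 4.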

/// Returns vectorization strategy. Any implicit inner loop in the Linalg		/// Returns vectorization strategy. Any implicit inner loop in the Linalg
/// operation is a candidate. Whether it is actually converted to SIMD code		/// operation is a candidate. Whether it is actually converted to SIMD code
/// depends on the requested strategy.		/// depends on the requested strategy.
static bool isVectorFor(CodeGen &codegen, bool isInner, bool isSparse) {		static bool isVectorFor(CodeGen &codegen, bool isInner, bool isSparse) {
switch (codegen.options.vectorizationStrategy) {		switch (codegen.options.vectorizationStrategy) {
[... 73 lines not shown ...]	static Operation *genFor(Merger &merger, CodeGen &codegen,
// Prepare vector length.		// Prepare vector length.
if (isVector)		if (isVector)
codegen.curVecLength = codegen.options.vectorLength;		codegen.curVecLength = codegen.options.vectorLength;

// Loop bounds and increment.		// Loop bounds and increment.
Location loc = op.getLoc();		Location loc = op.getLoc();
Value lo = isSparse ? codegen.pidxs[tensor][idx] : codegen.loops[idx];		Value lo = isSparse ? codegen.pidxs[tensor][idx] : codegen.loops[idx];
Value hi = isSparse ? codegen.highs[tensor][idx] : codegen.sizes[idx];		Value hi = isSparse ? codegen.highs[tensor][idx] : codegen.sizes[idx];
Value step =		Value step = constantIndex(rewriter, loc, codegen.curVecLength);
rewriter.create<arith::ConstantIndexOp>(loc, codegen.curVecLength);

// Emit a parallel loop.		// Emit a parallel loop.
if (isParallel) {		if (isParallel) {
assert(!isVector);		assert(!isVector);
scf::ParallelOp parOp = rewriter.create<scf::ParallelOp>(loc, lo, hi, step);		scf::ParallelOp parOp = rewriter.create<scf::ParallelOp>(loc, lo, hi, step);
if (isSparse)		if (isSparse)
codegen.pidxs[tensor][idx] = parOp.getInductionVars()[0];		codegen.pidxs[tensor][idx] = parOp.getInductionVars()[0];
else		else
[157 lines not shown]	for (unsigned b = 0, be = locals.size(); b < be; b++) {
if ((locals[b] || merger.isOutTensor(b, idx)) &&		if ((locals[b] || merger.isOutTensor(b, idx)) &&
merger.isDim(b, Dim::kDense)) {		merger.isDim(b, Dim::kDense)) {
unsigned tensor = merger.tensor(b);		unsigned tensor = merger.tensor(b);
assert(idx == merger.index(b));		assert(idx == merger.index(b));
unsigned pat = at;		unsigned pat = at;
for (; pat != 0; pat--)		for (; pat != 0; pat--)
if (codegen.pidxs[tensor][topSort[pat - 1]])		if (codegen.pidxs[tensor][topSort[pat - 1]])
break;		break;
Value p = (pat == 0) ? rewriter.create<arith::ConstantIndexOp>(loc, 0)		Value p = (pat == 0) ? constantIndex(rewriter, loc, 0)
: codegen.pidxs[tensor][topSort[pat - 1]];		: codegen.pidxs[tensor][topSort[pat - 1]];
codegen.pidxs[tensor][idx] = genAddress(		codegen.pidxs[tensor][idx] = genAddress(
codegen, rewriter, loc, codegen.sizes[idx], p, codegen.loops[idx]);		codegen, rewriter, loc, codegen.sizes[idx], p, codegen.loops[idx]);
}		}
}		}

// Move the insertion indices in lexicographic index order. During access		// Move the insertion indices in lexicographic index order. During access
// pattern expansion, we can skip setting the innermost dimension.		// pattern expansion, we can skip setting the innermost dimension.
if (codegen.sparseOut && !codegen.expValues) {		if (codegen.sparseOut && !codegen.expValues) {
Value pos = rewriter.create<arith::ConstantIndexOp>(loc, at);		Value pos = constantIndex(rewriter, loc, at);
rewriter.create<memref::StoreOp>(loc, codegen.loops[idx], codegen.lexIdx,		rewriter.create<memref::StoreOp>(loc, codegen.loops[idx], codegen.lexIdx,
pos);		pos);
}		}
}		}

/// Generates the induction structure for a while-loop.		/// Generates the induction structure for a while-loop.
static void genWhileInduction(Merger &merger, CodeGen &codegen,		static void genWhileInduction(Merger &merger, CodeGen &codegen,
PatternRewriter &rewriter, linalg::GenericOp op,		PatternRewriter &rewriter, linalg::GenericOp op,
[23 lines not shown]	static void genWhileInduction(Merger &merger, CodeGen &codegen,
rewriter.setInsertionPointToEnd(&whileOp.after().front());		rewriter.setInsertionPointToEnd(&whileOp.after().front());
// Finalize the induction. Note that the induction could be performed		// Finalize the induction. Note that the induction could be performed
// in the individual if-branches to avoid re-evaluating the conditions.		// in the individual if-branches to avoid re-evaluating the conditions.
// However, that would result in a rather elaborate forest of yield		// However, that would result in a rather elaborate forest of yield
// instructions during code generation. Moreover, performing the induction		// instructions during code generation. Moreover, performing the induction
// after the if-statements more closely resembles code generated by TACO.		// after the if-statements more closely resembles code generated by TACO.
unsigned o = 0;		unsigned o = 0;
SmallVector<Value, 4> operands;		SmallVector<Value, 4> operands;
Value one = rewriter.create<arith::ConstantIndexOp>(loc, 1);		Value one = constantIndex(rewriter, loc, 1);
for (unsigned b = 0, be = induction.size(); b < be; b++) {		for (unsigned b = 0, be = induction.size(); b < be; b++) {
if (induction[b] && merger.isDim(b, Dim::kSparse)) {		if (induction[b] && merger.isDim(b, Dim::kSparse)) {
unsigned tensor = merger.tensor(b);		unsigned tensor = merger.tensor(b);
assert(idx == merger.index(b));		assert(idx == merger.index(b));
Value op1 = codegen.idxs[tensor][idx];		Value op1 = codegen.idxs[tensor][idx];
Value op2 = codegen.loops[idx];		Value op2 = codegen.loops[idx];
Value op3 = codegen.pidxs[tensor][idx];		Value op3 = codegen.pidxs[tensor][idx];
Value cmp = rewriter.create<arith::CmpIOp>(loc, arith::CmpIPredicate::eq,		Value cmp = rewriter.create<arith::CmpIOp>(loc, arith::CmpIPredicate::eq,
[55 lines not shown]	if (conditions[b]) {
assert(idx == merger.index(b));		assert(idx == merger.index(b));
Value clause;		Value clause;
if (merger.isDim(b, Dim::kSparse)) {		if (merger.isDim(b, Dim::kSparse)) {
Value op1 = codegen.idxs[tensor][idx];		Value op1 = codegen.idxs[tensor][idx];
Value op2 = codegen.loops[idx];		Value op2 = codegen.loops[idx];
clause = rewriter.create<arith::CmpIOp>(loc, arith::CmpIPredicate::eq,		clause = rewriter.create<arith::CmpIOp>(loc, arith::CmpIPredicate::eq,
op1, op2);		op1, op2);
} else {		} else {
clause = rewriter.create<arith::ConstantIntOp>(loc, 1, 1); // true		clause = constantI1(rewriter, loc, true);
}		}
cond = cond ? rewriter.create<arith::AndIOp>(loc, cond, clause) : clause;		cond = cond ? rewriter.create<arith::AndIOp>(loc, cond, clause) : clause;
}		}
}		}
if (codegen.redVal)		if (codegen.redVal)
types.push_back(codegen.redVal.getType());		types.push_back(codegen.redVal.getType());
if (codegen.expValues)		if (codegen.expValues)
types.push_back(rewriter.getIndexType());		types.push_back(rewriter.getIndexType());
[255 lines not shown]

mlir/lib/Dialect/SparseTensor/Utils/CMakeLists.txt

	add_mlir_dialect_library(MLIRSparseTensorUtils			add_mlir_dialect_library(MLIRSparseTensorUtils
	Merger.cpp			Merger.cpp
				CodegenUtils.cpp

	ADDITIONAL_HEADER_DIRS			ADDITIONAL_HEADER_DIRS
	${MLIR_MAIN_INCLUDE_DIR}/mlir/Dialect/SparseTensor			${MLIR_MAIN_INCLUDE_DIR}/mlir/Dialect/SparseTensor

	LINK_LIBS PUBLIC			LINK_LIBS PUBLIC
	MLIRArithmetic			MLIRArithmetic
	MLIRIR			MLIRIR
	MLIRLinalg			MLIRLinalg
	)			)

mlir/lib/Dialect/SparseTensor/Utils/CodegenUtils.h

This file was added.

				//===- CodegenUtils.h - Utilities for generating MLIR ------------ C++ --===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// This header file defines utilities for generating MLIR.
				//
				//===----------------------------------------------------------------------===//

				#ifndef MLIR_DIALECT_SPARSETENSOR_UTILS_CODEGENUTILS_H_
				#define MLIR_DIALECT_SPARSETENSOR_UTILS_CODEGENUTILS_H_

				#include "mlir/Dialect/Arithmetic/IR/Arithmetic.h"
				#include "mlir/Dialect/SparseTensor/IR/SparseTensor.h"
				#include "mlir/ExecutionEngine/SparseTensorUtils.h"
				#include "mlir/IR/Builders.h"

				namespace mlir {
				class Location;
				class Type;
				class Value;

				namespace sparse_tensor {

				//===----------------------------------------------------------------------===//
				// ExecutionEngine/SparseTensorUtils helper functions.
				//===----------------------------------------------------------------------===//

				/// Converts an overhead storage bitwidth to its internal type-encoding.
				OverheadType overheadTypeEncoding(unsigned width);

				/// Converts the internal type-encoding for overhead storage to an mlir::Type.
				Type getOverheadType(Builder &builder, OverheadType ot);

				/// Returns the mlir::Type for pointer overhead storage.
				Type getPointerOverheadType(Builder &builder,
				const SparseTensorEncodingAttr &enc);

				/// Returns the mlir::Type for index overhead storage.
				Type getIndexOverheadType(Builder &builder,
				const SparseTensorEncodingAttr &enc);

				/// Converts a primary storage type to its internal type-encoding.
				PrimaryType primaryTypeEncoding(Type elemTp);

				/// Converts the IR's dimension level type to its internal type-encoding.
				DimLevelType dimLevelTypeEncoding(SparseTensorEncodingAttr::DimLevelType dlt);

				//===----------------------------------------------------------------------===//
				// Misc code generators.
				//
				// TODO: both of these should move upstream to their respective classes.
				// Once RFCs have been created for those changes, list them here.
				//===----------------------------------------------------------------------===//

				/// Generates a 1-valued attribute of the given type. This supports
				/// all the same types as `getZeroAttr`; however, unlike `getZeroAttr`,
				/// for unsupported types we raise `llvm_unreachable` rather than
				/// returning a null attribute.
				Attribute getOneAttr(Builder &builder, Type tp);

				/// Generates the comparison `v != 0` where `v` is of numeric type.
				/// For floating types, we use the "unordered" comparator (i.e., returns
				/// true if `v` is NaN).
				Value genIsNonzero(OpBuilder &builder, Location loc, Value v);

				//===----------------------------------------------------------------------===//
				// Constant generators.
				//
				// All these functions are just wrappers to improve code legibility;
				// therefore, we mark them as `inline` to avoid introducing any additional
				// overhead due to the legibility.
				//
				// TODO: Ideally these should move upstream, so that we don't
				// develop a design island. However, doing so will involve
				// substantial design work. For related prior discussion, see
				// <https://llvm.discourse.group/t/evolving-builder-apis-based-on-lessons-learned-from-edsc/879>
				//===----------------------------------------------------------------------===//

				/// Generates a 0-valued constant of the given type. In addition to
				/// the scalar types (`FloatType`, `IndexType`, `IntegerType`), this also
				/// works for `RankedTensorType` and `VectorType` (for which it generates
				/// a constant `DenseElementsAttr` of zeros).
				inline Value constantZero(OpBuilder &builder, Location loc, Type tp) {
				return builder.create<arith::ConstantOp>(loc, tp, builder.getZeroAttr(tp));
				}

				/// Generates a 1-valued constant of the given type. This supports all
				/// the same types as `constantZero`.
				inline Value constantOne(OpBuilder &builder, Location loc, Type tp) {
				return builder.create<arith::ConstantOp>(loc, tp, getOneAttr(builder, tp));
				}

				/// Generates a constant of `index` type.
				inline Value constantIndex(OpBuilder &builder, Location loc, int64_t i) {
				return builder.create<arith::ConstantIndexOp>(loc, i);
				}

				/// Generates a constant of `i32` type.
				inline Value constantI32(OpBuilder &builder, Location loc, int32_t i) {
				return builder.create<arith::ConstantIntOp>(loc, i, 32);
				}

				/// Generates a constant of `i16` type.
				inline Value constantI16(OpBuilder &builder, Location loc, int16_t i) {
				return builder.create<arith::ConstantIntOp>(loc, i, 16);
				}

				/// Generates a constant of `i8` type.
				inline Value constantI8(OpBuilder &builder, Location loc, int8_t i) {
				return builder.create<arith::ConstantIntOp>(loc, i, 8);
				}

				/// Generates a constant of `i1` type.
				inline Value constantI1(OpBuilder &builder, Location loc, bool b) {
				return builder.create<arith::ConstantIntOp>(loc, b, 1);
				}

				/// Generates a constant of the given `Action`.
				inline Value constantAction(OpBuilder &builder, Location loc, Action action) {
				return constantI32(builder, loc, static_cast<uint32_t>(action));
				}

				/// Generates a constant of the internal type-encoding for overhead storage.
				inline Value constantOverheadTypeEncoding(OpBuilder &builder, Location loc,
				unsigned width) {
				return constantI32(builder, loc,
				static_cast<uint32_t>(overheadTypeEncoding(width)));
				}

				/// Generates a constant of the internal type-encoding for pointer
				/// overhead storage.
				inline Value constantPointerTypeEncoding(OpBuilder &builder, Location loc,
				const SparseTensorEncodingAttr &enc) {
				return constantOverheadTypeEncoding(builder, loc, enc.getPointerBitWidth());
				}

				/// Generates a constant of the internal type-encoding for index overhead
				/// storage.
				inline Value constantIndexTypeEncoding(OpBuilder &builder, Location loc,
				const SparseTensorEncodingAttr &enc) {
				return constantOverheadTypeEncoding(builder, loc, enc.getIndexBitWidth());
				}

				/// Generates a constant of the internal type-encoding for primary storage.
				inline Value constantPrimaryTypeEncoding(OpBuilder &builder, Location loc,
				Type elemTp) {
				return constantI32(builder, loc,
				static_cast<uint32_t>(primaryTypeEncoding(elemTp)));
				}

				/// Generates a constant of the internal dimension level type encoding.
				inline Value
				constantDimLevelTypeEncoding(OpBuilder &builder, Location loc,
				SparseTensorEncodingAttr::DimLevelType dlt) {
				return constantI8(builder, loc,
				static_cast<uint8_t>(dimLevelTypeEncoding(dlt)));
				}

				} // namespace sparse_tensor
				} // namespace mlir

				#endif // MLIR_DIALECT_SPARSETENSOR_UTILS_CODEGENUTILS_H_
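
The `genIsNonzero` comment in the header above says the floating-point case uses the "unordered" comparator, so a NaN input counts as nonzero. As a rough illustration of that semantics only, here is a standalone sketch in plain C++ with no MLIR dependency. The names `isNonzeroUnordered` and `isNonzeroOrdered` are hypothetical and do not appear in the patch.

```cpp
#include <cmath>

// Hypothetical scalar analogue of an "unordered or not equal" comparison
// against zero: true when v is NaN, or when v differs from 0.
inline bool isNonzeroUnordered(double v) {
  return std::isnan(v) || v != 0.0;
}

// The ordered variant, shown for contrast: a NaN input yields false.
inline bool isNonzeroOrdered(double v) {
  return !std::isnan(v) && v != 0.0;
}
```

Under this sketch, both `0.0` and `-0.0` are treated as zero, and the two variants differ only on NaN inputs.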

mlir/lib/Dialect/SparseTensor/Utils/CodegenUtils.cpp

This file was added.

				//===- CodegenUtils.cpp - Utilities for generating MLIR -------------------===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#include "CodegenUtils.h"

				#include "mlir/IR/Types.h"
				#include "mlir/IR/Value.h"

				using namespace mlir::sparse_tensor;

				//===----------------------------------------------------------------------===//
				// ExecutionEngine/SparseTensorUtils helper functions.
				//===----------------------------------------------------------------------===//
				aartbik (inline comment, done): // Forward.
				rriddle (inline comment, done): Namespaces should only really have classes; please remove these and use static methods and full namespace resolution instead.
				rriddle (inline comment, done): Some relevant docs: https://llvm.org/docs/CodingStandards.html#use-namespace-qualifiers-to-implement-previously-declared-functions https://llvm.org/docs/CodingStandards.html#anonymous-namespaces
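
The style requested in this thread can be sketched in isolation. The following is a minimal standalone example, not code from the patch: `demo::util::twice` and `addSelf` are hypothetical names. It shows a function declared inside a namespace (as a header would do) and then defined with a full namespace qualifier, plus a file-local helper marked `static` instead of being wrapped in a namespace.

```cpp
// --- what would live in a header ---
namespace demo {
namespace util {
/// Returns twice the input.
int twice(int x);
} // namespace util
} // namespace demo

// --- what would live in the .cpp file ---

// File-local helper: marked `static` rather than placed in a namespace.
static int addSelf(int x) { return x + x; }

// Defined with a full qualifier instead of reopening the namespace. A
// qualified definition must match an existing declaration, so a typo in
// the signature is rejected at compile time rather than silently
// declaring a new overload.
int demo::util::twice(int x) { return addSelf(x); }
```

This mirrors the shape of the definitions in the listing below, such as `mlir::sparse_tensor::overheadTypeEncoding`.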

				OverheadType mlir::sparse_tensor::overheadTypeEncoding(unsigned width) {
				switch (width) {
				default:
				return OverheadType::kU64;
				case 32:
				return OverheadType::kU32;
				case 16:
				return OverheadType::kU16;
				case 8:
				return OverheadType::kU8;
				}
				}

				mlir::Type mlir::sparse_tensor::getOverheadType(mlir::Builder &builder,
				OverheadType ot) {
				switch (ot) {
				case OverheadType::kU64:
				return builder.getIntegerType(64);
				case OverheadType::kU32:
				return builder.getIntegerType(32);
				case OverheadType::kU16:
				return builder.getIntegerType(16);
				case OverheadType::kU8:
				return builder.getIntegerType(8);
				}
				llvm_unreachable("Unknown OverheadType");
				}

				mlir::Type mlir::sparse_tensor::getPointerOverheadType(
				mlir::Builder &builder, const SparseTensorEncodingAttr &enc) {
				// NOTE(wrengr): This workaround will be fixed in D115010.
				unsigned width = enc.getPointerBitWidth();
				if (width == 0)
				return builder.getIndexType();
				return getOverheadType(builder, overheadTypeEncoding(width));
				}

				mlir::Type
				mlir::sparse_tensor::getIndexOverheadType(mlir::Builder &builder,
				const SparseTensorEncodingAttr &enc) {
				// NOTE(wrengr): This workaround will be fixed in D115010.
				unsigned width = enc.getIndexBitWidth();
				if (width == 0)
				return builder.getIndexType();
				return getOverheadType(builder, overheadTypeEncoding(width));
				}

				PrimaryType mlir::sparse_tensor::primaryTypeEncoding(mlir::Type elemTp) {
				if (elemTp.isF64())
				return PrimaryType::kF64;
				if (elemTp.isF32())
				return PrimaryType::kF32;
				if (elemTp.isInteger(64))
				return PrimaryType::kI64;
				if (elemTp.isInteger(32))
				return PrimaryType::kI32;
				if (elemTp.isInteger(16))
				return PrimaryType::kI16;
				if (elemTp.isInteger(8))
				return PrimaryType::kI8;
				llvm_unreachable("Unknown primary type");
				}

				DimLevelType mlir::sparse_tensor::dimLevelTypeEncoding(
				SparseTensorEncodingAttr::DimLevelType dlt) {
				switch (dlt) {
				case SparseTensorEncodingAttr::DimLevelType::Dense:
				return DimLevelType::kDense;
				case SparseTensorEncodingAttr::DimLevelType::Compressed:
				return DimLevelType::kCompressed;
				case SparseTensorEncodingAttr::DimLevelType::Singleton:
				return DimLevelType::kSingleton;
				}
				llvm_unreachable("Unknown SparseTensorEncodingAttr::DimLevelType");
				}

				//===----------------------------------------------------------------------===//
				// Misc code generators.
				//===----------------------------------------------------------------------===//

				mlir::Attribute mlir::sparse_tensor::getOneAttr(mlir::Builder &builder,
				mlir::Type tp) {
				if (tp.isa<FloatType>())
				return builder.getFloatAttr(tp, 1.0);
				if (tp.isa<IndexType>())
				return builder.getIndexAttr(1);
				if (auto intTp = tp.dyn_cast<IntegerType>())
				return builder.getIntegerAttr(tp, APInt(intTp.getWidth(), 1));
				if (tp.isa<RankedTensorType, VectorType>()) {
				auto shapedTp = tp.cast<ShapedType>();
				if (auto one = getOneAttr(builder, shapedTp.getElementType()))
				return DenseElementsAttr::get(shapedTp, one);
				}
				llvm_unreachable("Unsupported attribute type");
				}

				mlir::Value mlir::sparse_tensor::genIsNonzero(mlir::OpBuilder &builder,
				mlir::Location loc,
				mlir::Value v) {
				mlir::Type tp = v.getType();
				mlir::Value zero = constantZero(builder, loc, tp);
				if (tp.isa<FloatType>())
				return builder.create<arith::CmpFOp>(loc, arith::CmpFPredicate::UNE, v,
				zero);
				if (tp.isIntOrIndex())
				return builder.create<arith::CmpIOp>(loc, arith::CmpIPredicate::ne, v,
				zero);
				llvm_unreachable("Non-numeric type");
				}
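
The width-to-encoding mapping in `overheadTypeEncoding` and its inverse in `getOverheadType` can be exercised without MLIR. The sketch below is a standalone approximation: `OverheadKind`, `encodeWidth`, and `bitWidthOf` are hypothetical stand-ins, and the enumerator values are assumptions rather than the values used by the runtime library.

```cpp
#include <cstdint>

// Stand-in for the runtime's overhead-type enum. The numeric values here
// are assumptions made for illustration.
enum class OverheadKind : uint32_t { kU64 = 1, kU32 = 2, kU16 = 3, kU8 = 4 };

// Mirrors the switch in the listing: 32, 16, and 8 map to their own kinds,
// and every other width (including 0) falls through to the 64-bit kind.
inline OverheadKind encodeWidth(unsigned width) {
  switch (width) {
  case 32:
    return OverheadKind::kU32;
  case 16:
    return OverheadKind::kU16;
  case 8:
    return OverheadKind::kU8;
  default:
    return OverheadKind::kU64;
  }
}

// Inverse direction: recover the integer bit width from the encoding.
inline unsigned bitWidthOf(OverheadKind kind) {
  switch (kind) {
  case OverheadKind::kU64:
    return 64;
  case OverheadKind::kU32:
    return 32;
  case OverheadKind::kU16:
    return 16;
  case OverheadKind::kU8:
    return 8;
  }
  return 0; // Not reached for valid enumerators.
}
```

Note that a width of 0 encodes as the 64-bit kind here, which is why `getPointerOverheadType` and `getIndexOverheadType` in the listing check for `width == 0` first and return the `index` type, a workaround the patch says is to be addressed in D115010.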

mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp

//===- Merger.cpp - Implementation of iteration lattices ------------------===//		//===- Merger.cpp - Implementation of iteration lattices ------------------===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "mlir/Dialect/SparseTensor/Utils/Merger.h"		#include "mlir/Dialect/SparseTensor/Utils/Merger.h"
		#include "CodegenUtils.h"
#include "mlir/Dialect/Arithmetic/IR/Arithmetic.h"		#include "mlir/Dialect/Arithmetic/IR/Arithmetic.h"

#include "mlir/IR/Operation.h"		#include "mlir/IR/Operation.h"
#include "llvm/Support/Debug.h"		#include "llvm/Support/Debug.h"

namespace mlir {		namespace mlir {
namespace sparse_tensor {		namespace sparse_tensor {

[643 lines not shown]	Value Merger::buildExp(PatternRewriter &rewriter, Location loc, unsigned e,
case kCeilF:		case kCeilF:
return rewriter.create<math::CeilOp>(loc, v0);		return rewriter.create<math::CeilOp>(loc, v0);
case kFloorF:		case kFloorF:
return rewriter.create<math::FloorOp>(loc, v0);		return rewriter.create<math::FloorOp>(loc, v0);
case kNegF:		case kNegF:
return rewriter.create<arith::NegFOp>(loc, v0);		return rewriter.create<arith::NegFOp>(loc, v0);
case kNegI: // no negi in std		case kNegI: // no negi in std
return rewriter.create<arith::SubIOp>(		return rewriter.create<arith::SubIOp>(
loc,		loc, constantZero(rewriter, loc, v0.getType()), v0);
rewriter.create<arith::ConstantOp>(loc, v0.getType(),
rewriter.getZeroAttr(v0.getType())),
v0);
case kTruncF:		case kTruncF:
return rewriter.create<arith::TruncFOp>(loc, v0, inferType(e, v0));		return rewriter.create<arith::TruncFOp>(loc, v0, inferType(e, v0));
case kExtF:		case kExtF:
return rewriter.create<arith::ExtFOp>(loc, v0, inferType(e, v0));		return rewriter.create<arith::ExtFOp>(loc, v0, inferType(e, v0));
case kCastFS:		case kCastFS:
return rewriter.create<arith::FPToSIOp>(loc, v0, inferType(e, v0));		return rewriter.create<arith::FPToSIOp>(loc, v0, inferType(e, v0));
case kCastFU:		case kCastFU:
return rewriter.create<arith::FPToUIOp>(loc, v0, inferType(e, v0));		return rewriter.create<arith::FPToUIOp>(loc, v0, inferType(e, v0));
[49 lines not shown]

utils/bazel/llvm-project-overlay/mlir/BUILD.bazel

[1,791 lines not shown]	cc_library(
includes = ["include"],		includes = ["include"],
deps = [		deps = [
":ArithmeticDialect",		":ArithmeticDialect",
":IR",		":IR",
":InferTypeOpInterface",		":InferTypeOpInterface",
":SideEffectInterfaces",		":SideEffectInterfaces",
":SparseTensorAttrDefsIncGen",		":SparseTensorAttrDefsIncGen",
":SparseTensorOpsIncGen",		":SparseTensorOpsIncGen",
":SparseTensorUtils",
":StandardOps",		":StandardOps",
"//llvm:Support",		"//llvm:Support",
],		],
)		)

cc_library(		cc_library(
name = "SparseTensorUtils",		name = "SparseTensorUtils",
srcs = glob(["lib/Dialect/SparseTensor/Utils/*.cpp"]),		srcs = glob([
hdrs = glob(["include/mlir/Dialect/SparseTensor/Utils/*.h"]),		"lib/Dialect/SparseTensor/Utils/*.cpp",
		"lib/Dialect/SparseTensor/Utils/*.h",
		]),
		hdrs = glob([
		"include/mlir/Dialect/SparseTensor/Utils/*.h",
		]) + [
		"include/mlir/ExecutionEngine/SparseTensorUtils.h",
		],
		aartbik (inline comment, done): Is this autogen? Shouldn't we just add the right deps, instead of adding outside headers to this lib?
		wrengr (author, inline comment, done): I tried using `build_cleaner` (which made the changes to the deps below), but it couldn't figure out how to fulfill these headers without pulling in a huge plethora of other stuff. I'll try taking another whack at it.
		wrengr (author, inline comment, done): Okay, I managed to get a decently clean version. The new version still has `"include/mlir/ExecutionEngine/SparseTensorUtils.h"` in the hdrs (which should be fine(?) since that's what `:SparseTensorTransforms` does as well). Alternatively we could add `":mlir_c_runner_utils"` to the deps (which will also bring that header in, though at the cost of building all of `:mlir_c_runner_utils`).
		aartbik (inline comment, done): Yeah, this looks okay to me, since now we only include what we "own" in a sense. Let's see if others chime in, but I am okay with this.
includes = ["include"],		includes = ["include"],
deps = [		deps = [
":ArithmeticDialect",		":ArithmeticDialect",
":IR",		":IR",
":LinalgOps",		":LinalgOps",
":SideEffectInterfaces",		":SparseTensor",
		aartbik (inline comment, not done): Why does the sparse tensor utils depend on sparse tensors now and not the other way around, like it used to? Is this still required with the new structure? It feels a bit the wrong way to me.
		wrengr (author, inline comment, done): Imo, it makes the most sense to have the definition of a language be a priori to and independent of any utilities/transformations on that language. The naturalness of this layering is evidenced by the fact that the IR library doesn't actually depend on the utils at all (cf., line 1807 above), as well as the utils needing to pull in the IncGen dependencies if it can't get them from the IR library (which strikes me as a failure/violation of cohesion/encapsulation). But I can revert the change if you feel strongly.
		aartbik (inline comment, not done): I suppose that is true. I was thinking mainly of the new utils that support IR building, but there are also utils that indeed use the sparse encoding. Note that part of this is also because the sparse tensor build rule is too large. I really would like to see the sparse tensor attribute being independent of the dialect, especially since this may be used outside the sparse tensor dialect without having to pull in everything (see issue https://github.com/llvm/llvm-project/issues/52748 I filed recently).
		wrengr (author, inline comment, done): Making the attributes independent of the rest of the dialect makes sense to me. Nicely separates the type definition from the operations, avoiding the analogue of the forward class declaration hack (and opening the way to split up the sparse_tensor dialect or define alternatives to it, if we needed either for some reason). So, should I revert the BUILD.bazel change or no?
":SparseTensorAttrDefsIncGen",
":SparseTensorOpsIncGen",
":StandardOps",
"//llvm:Support",		"//llvm:Support",
],		],
)		)

cc_library(		cc_library(
name = "SparseTensorTransforms",		name = "SparseTensorTransforms",
srcs = glob(["lib/Dialect/SparseTensor/Transforms/*.cpp"]),		srcs = glob([
		"lib/Dialect/SparseTensor/Transforms/*.cpp",
		]) + [
		"lib/Dialect/SparseTensor/Utils/CodegenUtils.h",
		],
hdrs = [		hdrs = [
"include/mlir/Dialect/SparseTensor/Transforms/Passes.h",		"include/mlir/Dialect/SparseTensor/Transforms/Passes.h",
"include/mlir/ExecutionEngine/SparseTensorUtils.h",		"include/mlir/ExecutionEngine/SparseTensorUtils.h",
],		],
includes = ["include"],		includes = ["include"],
deps = [		deps = [
":Affine",		":Affine",
":ArithmeticDialect",		":ArithmeticDialect",
[6,072 lines not shown]