This is an archive of the discontinued LLVM Phabricator instance.

[mlir] Model MemRef memory space as Attribute
ClosedPublic

Authored by vinograd47 on Feb 5 2021, 8:08 AM.

Details

Summary

Based on the following discussion:
https://llvm.discourse.group/t/rfc-memref-memory-shape-as-attribute/2229

The goal of the change is to make memory space property to have more
expressive representation, rather then "magic" integer values.

It will allow to have more clean ASM form:

gpu.func @test(%arg0: memref<100xf32, "workgroup">)

// instead of

gpu.func @test(%arg0: memref<100xf32, 3>)

Explanation for Attribute choice instead of plain string:

  • Attribute classes allow to use more type safe API based on RTTI.
  • Attribute classes provides faster comparison operator based on pointer comparison in contrast to generic string comparison.
  • Attribute allows to store more complex things, like structs or dictionaries. It will allows to have more complex memory space hierarchy.

This commit preserve old integer-based API and implements it on top
of the new one.

Depends on D97476

Diff Detail

Event Timeline

vinograd47 created this revision.Feb 5 2021, 8:08 AM
vinograd47 requested review of this revision.Feb 5 2021, 8:08 AM
mehdi_amini added inline comments.Feb 5 2021, 10:50 AM
mlir/include/mlir/IR/BuiltinTypes.h
184

What is the plan to get rid of the integer API? What prevents us from removing it right now?

Also wouldn't the natural name for this API be getMemorySpace? Will you rename getMemorySpaceAttr into getMemorySpace "at some point"?

rriddle added inline comments.Feb 5 2021, 1:18 PM
mlir/include/mlir/IR/BuiltinTypes.h
12

Please avoid including this here, use forward declarations instead.

rriddle added inline comments.Feb 5 2021, 1:18 PM
mlir/lib/IR/AsmPrinter.cpp
1811

Add a comment to this funciton.

mlir/lib/IR/BuiltinTypes.cpp
441

nit: Spell out auto here.

442

nit: !attr

540

Use emitOptionalError here.

560

nit: memorySpace &&

560

Drop the trivial braces here.

mlir/lib/Parser/TypeParser.cpp
244

nit: Spell out auto here.

248–255
bondhugula requested changes to this revision.Feb 7 2021, 12:06 AM
bondhugula added a subscriber: bondhugula.

Please remove RFC from the commit title. Also,
"MemRef memory space as Attribute" -> "Model MemRef memory space as Attribute" or "Switch MemRef memory space: unsigned -> Attribute".

mlir/lib/IR/BuiltinTypes.cpp
457–468

All static methods should have doc comments as well.

538

Nit: if (memorySpace && ...

561

Everything else can be?! (For eg., I would add IntegerSet to this prohibited list as well. )

mlir/lib/Parser/TypeParser.cpp
243–245

Please end all comments with a period. Here, above, and below.

This revision now requires changes to proceed.Feb 7 2021, 12:06 AM
vinograd47 updated this revision to Diff 322067.Feb 8 2021, 3:19 AM
vinograd47 retitled this revision from [mlir][RFC] MemRef memory space as Attribute to [mlir] Model MemRef memory space as Attribute.

Rebased and fix review remarks

@rriddle @mehdi_amini @bondhugula I've updated the change according to your comments.

mlir/include/mlir/IR/BuiltinTypes.h
12

I've replaced BuiltInAttributes.h header with Attributes.h. I can't replace it with forward declaration for Attribute class, because it is used as class member for MemRef::Builder.

184

The simple replacement of unsigned with Attribute breaks existing code in various dialects (std, affine, gpu). It might also break separate projects (like TensorFlow). As far as I understand, MLIR tries to avoid such breaking changes in single commit and handle compatibility with old code for some period of time to allow dependent projects to switch to the new API. I've followed this approach.

mlir/lib/IR/BuiltinTypes.cpp
561

I've updated this part. Now only IntegerAttr, StringAttr, DictionaryAttr and custom dialect attributes are allowed.

mehdi_amini added inline comments.Feb 8 2021, 10:52 AM
mlir/include/mlir/IR/BuiltinTypes.h
184

As far as I understand, MLIR tries to avoid such breaking changes in single commit and handle compatibility with old code for some period of time to allow dependent projects to switch to the new API.

Actually this is explicitly a non-goal: https://mlir.llvm.org/getting_started/DeveloperGuide/#breaking-changes

Anyway, I think my questions about naming and plan still stand even if you deemed easier to update the dialects in-tree in multiple steps.

Looks fine for the minor comments part I had.

bondhugula resigned from this revision.Feb 11 2021, 6:05 PM
vinograd47 edited the summary of this revision. (Show Details)

Removed old integer-based API from MemRefType. Updated internal dialects to use new API.

Rebased and fixed conflicts with new code.

@mehdi_amini I've updated the change and removed old integer-based API from the MemRefType. I've fixed internal dialects by switching to the new API. I still used IntegerAttr for them, because switch to another representation (like StringAttr) will require more deep modification of the dialects and lowering passes.

mehdi_amini added inline comments.Feb 16 2021, 1:05 PM
mlir/lib/Conversion/StandardToSPIRV/StandardToSPIRV.cpp
200 ↗(On Diff #323947)

How do you see the plan for moving away from IntegerAttr?
This kind of code becomes subject to crashing later.

vinograd47 added inline comments.Feb 17 2021, 5:48 AM
mlir/lib/Conversion/StandardToSPIRV/StandardToSPIRV.cpp
200 ↗(On Diff #323947)

@mehdi_amini In my understanding, the transition from IntegerAttr should be done step-by-step for separate dialects. For example:

  1. Switch from IntegerAttr to StringAttr (or something else) inside GPU dialect:
    • Adjust its operation to work with new Attribute type.
    • Add type conversion to fallback to IntegerAttr when lowering to other dialects (SPIRV, LLVM). For example, replace "local" string attribute with 3 integer attribute during the lowering pass.
  2. Switch from IntegerAttr to StringAttr inside SPIRV dialect:
    • Adjust its operation to work with new Attribute type.
    • Update lowering passes which converts to SPIRV dialect. For example, remove the type conversion from the above example or replace it with string/enum <-> string/enum conversion.
    • Update lowering passes from SPIRV dialect. Fallback to IntegerAttr as described above.
  3. etc.

Each step can be done with separate change and by separate authors who are more familiar with specific dialect.

At least that how I see it. Are you OK with such plan? Actually I'd like to delegate this work since I am not familiar with any of this dialects (GPU, SPIRV, LLVM).

@mehdi_amini Actually, while I've seen positive feedback from the forum discussion (integer memory space leads to non-informative IRs, GPU dialect is good example for this), doing such change in atomic commit by one developer is quite hard task. It is breaking change, which affects various dialects, I believe, not only inside MLIR source tree. That's why I've submitted 2 variants of this change:

  1. Non-breaking, with preserving original integer-based API in MemRefType. While it introduces dome duplication and non-convenient new API (getMemorySpaceAttr), it allows to avoid the break of the dialects. So, they can switch to the new API more smoothly.
  2. Breaking change in MemRefType API, preserving integer semantic in dependent dialects (the current change version).

Both variants have own pros and cons, I'd like to get MLIR architects/maintainers opinion regarding which path is better.

BTW, as I can see the CI failed with the following error:

/mnt/disks/ssd0/agent/llvm-project/mlir/lib/Bindings/Python/PybindUtils.h:16:10: error: 'pybind11/pybind11.h' file not found [clang-diagnostic-error]
#include <pybind11/pybind11.h>
         ^

It doesn't look like related to the change itself, but some kind of infrastructure issue. I'm not sure, how to deal with it.

@mehdi_amini Actually, while I've seen positive feedback from the forum discussion (integer memory space leads to non-informative IRs, GPU dialect is good example for this), doing such change in atomic commit by one developer is quite hard task. It is breaking change, which affects various dialects, I believe, not only inside MLIR source tree. That's why I've submitted 2 variants of this change:

  1. Non-breaking, with preserving original integer-based API in MemRefType. While it introduces dome duplication and non-convenient new API (getMemorySpaceAttr), it allows to avoid the break of the dialects. So, they can switch to the new API more smoothly.
  2. Breaking change in MemRefType API, preserving integer semantic in dependent dialects (the current change version).

Both variants have own pros and cons, I'd like to get MLIR architects/maintainers opinion regarding which path is better.

I'm fine doing it in multiple steps if it helps, I mostly pointed out that we don't make our life upstream harder for the convenience of downstream projects. Now these steps needs to be carefully crafted to control the amount of technical debt included and the potential to end up in a fragile situation (which is what I raised as a comment inline). Having some amount of extra APIs for the transition is OK, as long as there is a robust plan to get rid of them. For example when you introduce a getMemorySpaceAttr which should ultimately be named getMemorySpace, that seems to make it harder to later end up in the desired state.

What about this incremental plan:

  1. rename getMemorySpace into getMemorySpaceInteger as a NFC refactoring.
  2. Then change the storage to be a StringAttribute, introduce a getMemorySpace that returns a StringAttribute, and mark getMemorySpaceInteger deprecated (it could continue to return an integer when the string is "0", "1", etc. for now).
  3. Migrate all in-tree uses to getMemorySpace.
  4. When there is no use of getMemorySpaceInteger left in-tree it can be removed.

(I would think that going through all of these steps upstream shouldn't exceed a couple of weeks, or a month)

@mehdi_amini Submitted separate change for the first part (method renaming) : https://reviews.llvm.org/D97476

vinograd47 edited the summary of this revision. (Show Details)

Re-implemented the feature on top of separate "renaming" change.

Update related section in LangRef documentation.

Rebased to latest main

@ftynse @rriddle @mehdi_amini The dependency for this change (old method renaming) was landed. Could you please take a look at this change one more time?

ftynse added a comment.Mar 2 2021, 2:07 AM

LGTM in principle. Deferring to @rriddle for approval.

mlir/include/mlir/IR/BuiltinTypes.h
183

Could you actually mark deprecated functions with the [[deprecated]] attribute?

mlir/lib/IR/BuiltinTypes.cpp
454

Nit: is allowed

457

Nit: we use /// for top-level comments

rriddle accepted this revision.Mar 2 2021, 2:27 AM

LGTM for code (after resolving outstanding comments)

First though, can you confirm if you are going with the migration plan proposed by @mehdi_amini ? Just want to make sure there is a clear path for removing the old API.

mlir/include/mlir/IR/BuiltinTypes.h
183

You'll want to use LLVM_ATTRIBUTE_DEPRECATED.

mlir/lib/IR/BuiltinTypes.cpp
463

I'm assuming this is for checking against the builtin dialect? You should be able to do something like isa<BuiltinDialect>(memorySpace.getDialect()) instead.

468–473
This revision is now accepted and ready to land.Mar 2 2021, 2:27 AM
rriddle added inline comments.Mar 2 2021, 2:38 AM
mlir/include/mlir/IR/BuiltinTypes.h
183

Yes, disregard me. I remember that there a recent discussion, but never actually checked the outcome.

vinograd47 marked 17 inline comments as done.Mar 2 2021, 9:28 AM

Mark old comments as done.

vinograd47 updated this revision to Diff 327488.Mar 2 2021, 9:40 AM

Applied review remarks

vinograd47 marked 4 inline comments as done.Mar 2 2021, 9:48 AM

@ftynse @rriddle I've applied your review remarks except deprecated attribute. I didn't add it intentionally, since the main code base was not updated to the new method, this attribute will generate dozen of warnings during the build.

@rriddle Regarding the plan - I'm not very familiar with dialects like GPU, SPIRV, LLVM (which use memorySpace intensively). So for me it will take a lot of time to adopt them, also it will be done in low priority in background to main work :) It would be great if this task is shared with developers more familiar with this dialects specifics.

vinograd47 updated this revision to Diff 329320.Mar 9 2021, 6:44 AM

Rebased and adjusted to latest changes in upstream.

@ftynse @rriddle @mehdi_amini Do you have any more comments for this change?

mehdi_amini accepted this revision.Mar 9 2021, 12:54 PM

@rriddle Regarding the plan - I'm not very familiar with dialects like GPU, SPIRV, LLVM (which use memorySpace intensively). So for me it will take a lot of time to adopt them, also it will be done in low priority in background to main work :) It would be great if this task is shared with developers more familiar with this dialects specifics.

Can you file individual bugs against each of these components to update to richer attributes?

This revision was automatically updated to reflect the committed changes.

Can you file individual bugs against each of these components to update to richer attributes?

Done: