This is an archive of the discontinued LLVM Phabricator instance.

[mlir][sparse] introduce a new compressed(hi) dimension level type
ClosedPublic

Authored by Peiming on Apr 18 2023, 3:00 PM.

Details

Summary

compressed(hi) is similar to compressed, but instead of reusing the previous position high as the current position low, it uses a pair of positions for each sparse index.

The patch only introduces the definition (syntax) but does not provide codegen implementation.

Diff Detail

Event Timeline

Peiming created this revision.Apr 18 2023, 3:00 PM
Herald added a project: Restricted Project. · View Herald Transcript
Peiming requested review of this revision.Apr 18 2023, 3:00 PM
Peiming edited the summary of this revision. (Show Details)Apr 18 2023, 3:00 PM
Peiming added reviewers: wrengr, bixia.
Peiming updated this revision to Diff 514753.Apr 18 2023, 3:02 PM

add roundtrip test.

aartbik added inline comments.Apr 18 2023, 3:08 PM
mlir/lib/Bindings/Python/DialectSparseTensor.cpp
30

introducing parenthesis now seems a bit taking a new path.
Would compressed-hi-xxx a bit safer with the current (granted ugly) solution?

Peiming marked an inline comment as done.Apr 18 2023, 3:15 PM
Peiming updated this revision to Diff 514755.Apr 18 2023, 3:16 PM

address comments.

I disagree with the operational description of the new level-type as given in the CL description. Like I mentioned in the chat: I think that it'd make more sense to implement the new level-type by having two separate arrays for the position-lows and position-highs, rather than interleaving them into a single array. At the very least, the external description/definition of the level-type should allow for such an implementation, rather than specifying the interleaved implementation.

mlir/include/mlir-c/Dialect/SparseTensor.h
38–41

If we go with the two-arrays implementation, then these could be given the values 0b1010_XX: i.e., where the 0bX010_XX part indicates the presence of the position-lows array, and the 0b1XXX_XX part indicates the presence of the position-highs array.

The reason I mention this is because that sort of encoding will make sense in the future once we add ELL support; i.e., we'd have the bits (hasLowPositions, hasHighPositions, hasCoordinates) and then we'd have the level types CompressedWithHi = (1,1,1); Compressed = (1,0,1); Singleton = (0,0,1); Counted = (1,0,0); Dense = (0,0,0). Of course, since this changes how the Compressed/Singleton/Dense types are encoded relative to one another, we can always wait until we add the Counted level-type to make these changes.

Peiming edited the summary of this revision. (Show Details)Apr 18 2023, 3:20 PM
Peiming marked an inline comment as done.Apr 18 2023, 3:51 PM
Peiming added inline comments.
mlir/include/mlir-c/Dialect/SparseTensor.h
38–41

Make sense, let's wait then ;-)

Though, strictly speaking, compressed level has both lo and hi for the first index on the level (or you can say the last index on the level).

Peiming updated this revision to Diff 514772.Apr 18 2023, 3:52 PM
Peiming marked an inline comment as done.

rebase

aartbik accepted this revision.Apr 18 2023, 3:58 PM
This revision is now accepted and ready to land.Apr 18 2023, 3:58 PM
This revision was landed with ongoing or failed builds.Apr 18 2023, 4:26 PM
This revision was automatically updated to reflect the committed changes.