This is an archive of the discontinued LLVM Phabricator instance.

[mlir][sparse] Add complex number reading from files.
ClosedPublic

Authored by bixia on Jun 6 2022, 11:30 AM.

Details

Summary

Support complex numbers for Matrix Market Exchange Formats. Add a test case.

Diff Detail

Event Timeline

bixia created this revision.Jun 6 2022, 11:30 AM
Herald added a project: Restricted Project. · View Herald Transcript
bixia requested review of this revision.Jun 6 2022, 11:30 AM
aartbik added inline comments.Jun 6 2022, 4:43 PM
mlir/lib/ExecutionEngine/SparseTensorUtils.cpp
1264–1265
NOTE: this now should distinguish between real, complex, and probably even integer!
1360–1370

note that for the MM in particular, we should also verify that the data is marked real/complex/integer going forward, and verify that this is compatible with the actual data type <V> (assert, or otherwise fail when imcompatible)

1372

I would add a comment on reading two doubles to make a complex here first, just so that it stays similar to L1317-1319. Otherwise, directly addressing the pattern issue reads a bit too terse.

1386–1387

This is of course the right way forward, but did you measure perf impact of introducing more calls along the path that is executed for each nonzero? Perhaps try reading in one of the larger tensors before/after this change. Would marking methods as inline help here?

mlir/test/Integration/data/test_symmetric_complex.mtx
2

This must be "complex", not "real' following proper MM format

bixia updated this revision to Diff 434945.Jun 7 2022, 2:04 PM

Add data type check to MME header parsing.
Make the functions to read a COO value inline functions.

bixia marked 2 inline comments as done.Jun 7 2022, 2:05 PM
bixia added inline comments.
mlir/lib/ExecutionEngine/SparseTensorUtils.cpp
1360–1370

Make the functions inline functions.

bixia updated this revision to Diff 434946.Jun 7 2022, 2:10 PM

Fix a typo.

aartbik added inline comments.Jun 7 2022, 2:56 PM
mlir/lib/ExecutionEngine/SparseTensorUtils.cpp
1237

This is of course a reasonable way to do it. But one may a bit surprised by having the fatal assert while reading the header.

I feel that simply reading and storing the field in the SparseTensorFile class as an enum (with pattern, integer, real, complex, undef {to deal with FROSTT}) and then having getters will let the client decide what to do once the header is properly parsed seemed a bit more along the direction the refactoring in a class was taking.

I don't feel super strongly about it, so this is more a WDYT? comment

aartbik added inline comments.Jun 7 2022, 3:00 PM
mlir/lib/ExecutionEngine/SparseTensorUtils.cpp
1263

this is a very narrow subset of possibilities (i.e. must match exactly).

I can see cases where we allow ints to be assigned to real and vv (as we did before),
or real/integer to complex with imaginary part zero.

All such cases seem a lot easier to decide in the "client" (even if that is just us right now ;-)
with the enum approach

bixia updated this revision to Diff 434994.Jun 7 2022, 4:47 PM

Use enum to store the value type to SparseTensorFile and check type compatible in openSparseTensorCOO.

bixia marked an inline comment as done.Jun 7 2022, 4:49 PM
bixia added inline comments.
mlir/lib/ExecutionEngine/SparseTensorUtils.cpp
1237

Add enum ValueKind to store the value type information and use the information in the reader.

1360–1370

Move this check to the reader.

aartbik accepted this revision.Jun 7 2022, 5:01 PM

I love it! Thanks for making the change. Two last nits, but good to go after that.

mlir/lib/ExecutionEngine/SparseTensorUtils.cpp
1299–1349

Maybe add a comment, e.g.

// The FROSTT format does not define the data type of the nonzero elements.
valueKind_ = ValueKind::kUndefined;

1367

This does not seem to be exhaustive, i.e. complex vs isint seems an error too

given that we have accepted real and int before, I would be perfectly okay that the only case we reject right now is complex file vs. non-complex data (which we can later refine into zero-imaginary part)

This revision is now accepted and ready to land.Jun 7 2022, 5:01 PM
bixia updated this revision to Diff 435242.Jun 8 2022, 10:26 AM
bixia marked an inline comment as done.

Add a comment.

mlir/lib/ExecutionEngine/SparseTensorUtils.cpp
1367

complex data vs integer tensor is an error. Maybe you misread tensorIsReal? It also includes integer.

bixia updated this revision to Diff 435288.Jun 8 2022, 12:14 PM

There is one place where isValid() should be used, not isValid.

This revision was automatically updated to reflect the committed changes.