Page MenuHomePhabricator

[mlir][sparse] further implement singleton dimension level type
ClosedPublic

Authored by wrengr on Sep 29 2022, 7:05 PM.

Details

Summary

Handle more cases of singleton DLT including direct sparse2sparse conversion. (Followup to D134096)

Depends On D134926

Diff Detail

Event Timeline

wrengr created this revision.Sep 29 2022, 7:05 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 29 2022, 7:05 PM
wrengr requested review of this revision.Sep 29 2022, 7:05 PM
aartbik accepted this revision.Oct 5 2022, 1:20 PM
aartbik added inline comments.
mlir/include/mlir/ExecutionEngine/SparseTensor/Storage.h
865

you want to be defensive here too with an else assert dense (for the future)

924

period at end

mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp
453

nice refinement!

This revision is now accepted and ready to land.Oct 5 2022, 1:20 PM
wrengr updated this revision to Diff 465539.Oct 5 2022, 1:38 PM
wrengr marked 2 inline comments as done.

Adding future-proofing, per nits

wrengr updated this revision to Diff 465572.Oct 5 2022, 3:13 PM

Fixing typo in the new assertions

This change broke the mlir build on Windows in Debug. I'm actually not entirely sure why it only broke in Debug and not in Release as well because I would expect the issue to be the same in both.

What's happening now is SparseTensorConversion.cpp has calls to a couple of functions defined in Enums.h. However, Enums.h is setup as if it belongs to a shared library (but it doesn't, see comment in the CMakelists file when it was created here: https://reviews.llvm.org/D133462) - so when it is included in SparseTensorConversion.cpp the functions are marked as dllimport but there's no library that exports them, so the build fails to link a number of executables. For example:

cmd.exe /C "cd . && "C:\Program Files\CMake\bin\cmake.exe" -E vs_link_exe --intdir=tools\mlir\tools\mlir-opt\CMakeFiles\mlir-opt.dir --rc=C:\PROGRA~2\WI3CF2~1\10\bin\100203~1.0\x64\rc.exe --mt=C:\PROGRA~2\WI3CF2~1\10\bin\100203~1.0\x64\mt.exe --manifests  -- C:\PROGRA~1\MIB055~1\2022\PROFES~1\VC\Tools\MSVC\1433~1.316\bin\Hostx64\x64\link.exe /nologo @CMakeFiles\mlir-opt.rsp  /out:bin\mlir-opt.exe /implib:lib\mlir-opt.lib /pdb:bin\mlir-opt.pdb /version:0.0 /machine:x64 /STACK:10000000 /debug /INCREMENTAL /subsystem:console  && cd ."
LINK Pass 1: command "C:\PROGRA~1\MIB055~1\2022\PROFES~1\VC\Tools\MSVC\1433~1.316\bin\Hostx64\x64\link.exe /nologo @CMakeFiles\mlir-opt.rsp /out:bin\mlir-opt.exe /implib:lib\mlir-opt.lib /pdb:bin\mlir-opt.pdb /version:0.0 /machine:x64 /STACK:10000000 /debug /INCREMENTAL /subsystem:console /MANIFEST /MANIFESTFILE:tools\mlir\tools\mlir-opt\CMakeFiles\mlir-opt.dir/intermediate.manifest tools\mlir\tools\mlir-opt\CMakeFiles\mlir-opt.dir/manifest.res" failed (exit code 1120) with the following output:
MLIRSparseTensorTransforms.lib(SparseTensorConversion.cpp.obj) : error LNK2019: unresolved external symbol "bool __cdecl mlir::sparse_tensor::isDenseDLT(enum mlir::sparse_tensor::DimLevelType)" (?isDenseDLT@sparse_tensor@mlir@@YA_NW4DimLevelType@12@@Z) referenced in function "bool __cdecl `anonymous namespace'::canUseDirectConversion(class llvm::ArrayRef<enum mlir::sparse_tensor::SparseTensorEncodingAttr::DimLevelType>)" (?canUseDirectConversion@?A0x0125d971@@YA_NV?$ArrayRef@W4DimLevelType@SparseTensorEncodingAttr@sparse_tensor@mlir@@@llvm@@@Z)
MLIRSparseTensorTransforms.lib(SparseTensorConversion.cpp.obj) : error LNK2019: unresolved external symbol "bool __cdecl mlir::sparse_tensor::isCompressedDLT(enum mlir::sparse_tensor::DimLevelType)" (?isCompressedDLT@sparse_tensor@mlir@@YA_NW4DimLevelType@12@@Z) referenced in function "bool __cdecl `anonymous namespace'::canUseDirectConversion(class llvm::ArrayRef<enum mlir::sparse_tensor::SparseTensorEncodingAttr::DimLevelType>)" (?canUseDirectConversion@?A0x0125d971@@YA_NV?$ArrayRef@W4DimLevelType@SparseTensorEncodingAttr@sparse_tensor@mlir@@@llvm@@@Z)
MLIRSparseTensorTransforms.lib(SparseTensorConversion.cpp.obj) : error LNK2019: unresolved external symbol "bool __cdecl mlir::sparse_tensor::isSingletonDLT(enum mlir::sparse_tensor::DimLevelType)" (?isSingletonDLT@sparse_tensor@mlir@@YA_NW4DimLevelType@12@@Z) referenced in function "bool __cdecl `anonymous namespace'::canUseDirectConversion(class llvm::ArrayRef<enum mlir::sparse_tensor::SparseTensorEncodingAttr::DimLevelType>)" (?canUseDirectConversion@?A0x0125d971@@YA_NV?$ArrayRef@W4DimLevelType@SparseTensorEncodingAttr@sparse_tensor@mlir@@@llvm@@@Z)
bin\mlir-opt.exe : fatal error LNK1120: 3 unresolved externals
ninja: build stopped: subcommand failed.

This can be fixed in a couple of ways:

  1. Assume that mlir_sparse_tensor_utils is a static library (which it is) and then remove any export/import definitions. This should be fairly straightforward. In this case we might want to consider the library naming as well since mlir_xxx_utils are generally shared libraries.
  2. Make mlir_sparse_tensor_utils a shared library and make all the necessary fixes for it to work. One caveat here is that we'd end up with a situation where one of the transform libraries (and as far as I can tell only one of the transform libraries) requires linking to a shared library.

I suppose a third solution would be to move the definitions from Enums.h to the existing MLIRSparseTensorUtils which is also a static library.

Thoughts?

wrengr added a comment.Oct 7 2022, 4:37 PM

What's happening now is SparseTensorConversion.cpp has calls to a couple of functions defined in Enums.h. However, Enums.h is setup as if it belongs to a shared library (but it doesn't, see comment in the CMakelists file when it was created here: https://reviews.llvm.org/D133462) - so when it is included in SparseTensorConversion.cpp the functions are marked as dllimport but there's no library that exports them, so the build fails to link a number of executables.

When factoring out the mlir_sparse_tensor_utils library I was having a lot of issues getting things to work under MSVC, and ultimately I could only resolve the issues by making mlir_sparse_tensor_utils static rather than shared. Fwiw, the chain of reasoning leading to the current state is/was: mlir_c_runner_utils is a shared library so the stuff in ExecutionEngine/SparseTensorUtils.h must be marked with dllimport/dllexport, but SparseTensorUtils.h includes ExecutionEngine/SparseTensor/Enums.h and needs to reexport the enum definitions in order for the SparseTensorUtils.h functions to be usable, so the stuff in Enums.h ends up needing to be dllimport/dllexport as well. (And since Enums.h includes ExecutionEngine/Float16bits.h the same thing happens again, though the mlir_float16_utils library can be made a shared lib without complications.)

As I recall, the only thing mlir_sparse_tensor_utils really needs to export as part of its own API are the enum definitions themselves. So I think the quickest fix might be to just remove the MLIR_SPARSETENSOR_EXPORT attribute from all the functions in Enums.h and only leave it on the enums themselves. That does still leave things in an awkward state, but if it makes things green then that'll buy time to come up with a better fix. Though I'm not sure if leaving the MLIR_SPARSETENSOR_EXPORT attribute on the enum definitions would still cause the sorts of problems you're seeing. This is my first time working with DLLs on Windows, so any enlightenment you can share re best practices would be most welcome.

Also, it seems like the LLVM Windows buildbot is a lot stricter than the MLIR/Phabricator Windows buildbot. Is there any way I can send differentials to be run by the LLVM buildbot before landing them?

This can be fixed in a couple of ways:

  1. Assume that mlir_sparse_tensor_utils is a static library (which it is) and then remove any export/import definitions. This should be fairly straightforward. In this case we might want to consider the library naming as well since mlir_xxx_utils are generally shared libraries.
  2. Make mlir_sparse_tensor_utils a shared library and make all the necessary fixes for it to work. One caveat here is that we'd end up with a situation where one of the transform libraries (and as far as I can tell only one of the transform libraries) requires linking to a shared library.

I suppose a third solution would be to move the definitions from Enums.h to the existing MLIRSparseTensorUtils which is also a static library.

Thoughts?

Just as a bit of background: The reason we split mlir_sparse_tensor_utils out from mlir_c_runner_utils is because another team at Google wanted to be able to use the C++ library directly rather than going through the C-API exposed by mlir_c_runner_utils. That team doesn't care whether the library is shared vs static nor care about the Windows side, so whatever solution we can come up with should be fine for their needs. Originally I intended to make it a shared library, following the pattern of all the other mlir_xxx_utils libraries under the ExecutionEngine directory, but kept running into MSVC linker errors. Eventually I learned about dllimport/dllexport, which solved most of the problems— but ultimately had to make the library static because std::vector isn't set up to be used/exposed by DLLs.

TBH, I'm not really sure whether mlir_sparse_tensor_utils "should" be static or dynamic (nor what MLIR's stance is on such things). We definitely want to be able to share the enums between both mlir_c_runner_utils and SparseTensorTransforms, since the transforms need to be able to be able to codegen calls into the runtime library, so both the transforms and the runtime need to agree on the enums. And I'd really like to keep the predicate functions side-by-side with the enums, since that helps ensure they remain in sync as we evolve and change the enum definitions (e.g., as in D135004); though the predicate functions only need to be called by SparseTensorTransforms and mlir_sparse_tensor_utils itself, they don't need to be exported from mlir_c_runner_utils. I suppose we could split Enums.h off into a separate library from the rest of mlir_sparse_tensor_utils, but that doesn't help the latter become a sharedlib since there's still the std::vector issue.

Moving things into MLIRSparseTensorUtils is a no-go, since that's a completely unrelated library. (This is an example of why I hate using "utils" rather than more expressive names :) Though I'm totally cool with renaming mlir_sparse_tensor_utils to avoid the name causing any confusion about whether it's a sharedlib or not.

wrengr added a comment.Oct 7 2022, 5:04 PM

I just uploaded D135502 to see if that fixes things. (Apparently enum class definitions don't need the dllexport/dllimport stuff?) So let's continue the conversation on that differential