Page MenuHomePhabricator

[mlir][IR] Refactor the internal implementation of Value
ClosedPublic

Authored by rriddle on Mar 2 2021, 1:37 PM.

Details

Summary

The current implementation of Value involves a pointer int pair with several different kinds of owners, i.e. BlockArgumentImpl*, Operation *, TrailingOpResult*. This design arose from the desire to save memory overhead for operations that have a very small number of results (generally 0-2). There are, unfortunately, many problematic aspects of the current implementation that make Values difficult to work with or just inefficient.

Operation result types are stored as a separate array on the Operation. This is very inefficient for many reasons: we use TupleType for multiple results, which can lead to huge amounts of memory usage if multi-result operations change types frequently(they do). It also means that simple methods like Value::getType/Value::setType now require complex logic to get to the desired type.

Value only has one pointer bit free, severely limiting the ability to use it in things like PointerUnion/PointerIntPair. Given that we store the kind of a Value along with the "owner" pointer, we only leave one bit free for users of Value. This creates situations where we end up nesting PointerUnions to be able to use Value in one.

As noted above, most of the methods in Value need to branch on at least 3 different cases which is both inefficient, possibly error prone, and verbose. The current storage of results also creates problems for utilities like ValueRange/TypeRange, which want to efficiently store base pointers to ranges (of which Operation* isn't really useful as one).

This revision greatly simplifies the implementation of Value by the introduction of a new ValueImpl class. This class contains all of the state shared between all of the various derived value classes; i.e. the use list, the type, and the kind. This shared implementation class provides several large benefits:

  • Most of the methods on value are now branchless, and often one-liners.
  • The "kind" of the value is now stored in ValueImpl instead of Value

This frees up all of Value's pointer bits, allowing for users to take full advantage of PointerUnion/PointerIntPair/etc. It also allows for storing more operation results as "inline", 6 now instead of 2, freeing up 1 word per new inline result.

  • Operation result types are now stored in the result, instead of a side array

This drops the size of zero-result operations by 1 word. It also removes the memory crushing use of TupleType for operations results (which could lead up to hundreds of megabytes of "dead" TupleTypes in the context). This also allowed restructured ValueRange, making it simpler and one word smaller.

This revision does come with two conceptual downsides:

  • Operation::getResultTypes no longer returns an ArrayRef<Type>

This conceptually makes some usages slower, as the iterator increment is slightly more complex.

  • OpResult::getOwner is slightly more expensive, as it now requires a little bit of arithmetic

From profiling, neither of the conceptual downsides have resulted in any perceivable hit to performance. Given the advantages of the new design, most compiles are slightly faster.

Diff Detail

Event Timeline

rriddle created this revision.Mar 2 2021, 1:37 PM
rriddle requested review of this revision.Mar 2 2021, 1:37 PM
jpienaar accepted this revision.Mar 2 2021, 2:03 PM

Nice, thanks for the super fast improvement!

mlir/include/mlir/IR/Value.h
43–44

So, when I was looking at this I was wondering how large this impact is :)

Esp as I think this would enable simpler Type array accessing if all of these were equally spaced apart, but the performance impact/overhead may more than offset that

363

s/proper//

mlir/lib/IR/OperationSupport.cpp
482

Useful clang-tidy warning :)

mlir/lib/IR/Value.cpp
118

const_cast? (in addition)

118

This is a little bit magical for me, could you add a comment? (bonus point for ascii diagram ;-))

This revision is now accepted and ready to land.Mar 2 2021, 2:03 PM
rriddle updated this revision to Diff 327627.Mar 2 2021, 4:33 PM
rriddle marked 5 inline comments as done.

address comments

mlir/include/mlir/IR/Value.h
43–44

When I originally made the change there was a decent memory reduction for operations, given that a very high distribution of operations have a very small number of results. To reconfirm, I just added a small instrumentation to a TF model to see what the distribution is there (I didn't do it for upstream because I know for a fact a majority of those are single result) and came up with:

Operation with 0 results : 1137
Operation with 1 results : 33587
Operation with 2 results : 156
Operation with 3 results : 36
Operation with 4 results : 352
Operation with 5 results : 121
Operation with 6 results : 320
Operation with 7 results : 217
Operation with 16 results : 1
Operation with 84 results : 5
Operation with 171 results : 2
Operation with 309 results : 1
Operation with 310 results : 3
Operation with 394 results : 2
Operation with 395 results : 3
Operation with 396 results : 3

Every result up to 6 costs 1 less word, which in this example ended up being something like ~300kb. On bigger models, it gets much more pronounced. The logic could be simpler by just keeping everything the same size, but the size savings + negligible execution time change makes it compelling for me at least.

jpienaar accepted this revision.Mar 2 2021, 4:36 PM

Thanks!

This revision was landed with ongoing or failed builds.Mar 3 2021, 2:34 PM
This revision was automatically updated to reflect the committed changes.

This is a really nice design!

$ cmake -G Ninja -DCMAKE_CXX_COMPILER=$(command -v clang++) -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS="clang;debuginfo-tests;mlir" ../llvm
...

$ ninja projects/debuginfo-tests/CMakeFiles/check-gdb-mlir-support.dir/llvm-prettyprinters/gdb/mlir-support.cpp.o
...
/home/nathan/src/llvm-project/debuginfo-tests/llvm-prettyprinters/gdb/mlir-support.cpp:13:33: error: no member named 'Kind' in 'mlir::Value'
                   mlir::Value::Kind::TrailingOpResult});
                   ~~~~~~~~~~~~~^
1 error generated.

D98613 is needed to avoid other errors in this file.

thopre added a subscriber: thopre.Tue, Apr 6, 4:06 AM
$ cmake -G Ninja -DCMAKE_CXX_COMPILER=$(command -v clang++) -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS="clang;debuginfo-tests;mlir" ../llvm
...

$ ninja projects/debuginfo-tests/CMakeFiles/check-gdb-mlir-support.dir/llvm-prettyprinters/gdb/mlir-support.cpp.o
...
/home/nathan/src/llvm-project/debuginfo-tests/llvm-prettyprinters/gdb/mlir-support.cpp:13:33: error: no member named 'Kind' in 'mlir::Value'
                   mlir::Value::Kind::TrailingOpResult});
                   ~~~~~~~~~~~~~^
1 error generated.

D98613 is needed to avoid other errors in this file.

I'm still getting this error with top of main. Shouldn't Value be changed to a mlir::OutOfLineOpResult ? There's no more TrailingOpResult in mlir::ValueImpl.