This is an archive of the discontinued LLVM Phabricator instance.

[mlir][LLVMIR] Memorize compatible LLVM types
ClosedPublic

Authored by myhsu on Jun 15 2022, 4:36 PM.

Details

Summary

This patch memorizes compatible LLVM types in LLVM::isCompatibleType in order to avoid redundant works.

This is especially useful when the size of program is big and there are multiple occurrences of some deeply nested LLVM struct types, in which case we can gain quite some speedups with this patch.

Existing tests should be sufficient to cover the changes here.

Diff Detail

Event Timeline

myhsu created this revision.Jun 15 2022, 4:36 PM
myhsu requested review of this revision.Jun 15 2022, 4:36 PM
Mogball requested changes to this revision.Jun 15 2022, 8:23 PM
Mogball added inline comments.
mlir/lib/Dialect/LLVMIR/IR/LLVMTypes.cpp
832

Definitely do not want to be using thread_local or static or anything whose lifetime could exceed that of an MLIRContext. I believe the problem is with nested LLVM types and not nested builtin types like vector<*xvector<*xvector<...>>>. You could make all LLVM types (or just the problematic one) store a verified bit instead of using a set. Alternatively, you can put the set on the LLVMDialect (and make sure its access is thread-safe) and move this function to be a member function of LLVMDialect.

This revision now requires changes to proceed.Jun 15 2022, 8:23 PM
rriddle added inline comments.Jun 15 2022, 9:11 PM
mlir/lib/Dialect/LLVMIR/IR/LLVMTypes.cpp
832

Definitely do not want to be using thread_local or static or anything whose lifetime could exceed that of an MLIRContext.

+1 here

myhsu updated this revision to Diff 437766.Jun 16 2022, 6:27 PM
myhsu marked 2 inline comments as done.

This is a major revision. I choose to put the type cache (wrapped with ThreadLocalCache) inside LLVM dialect.
The reason I didn't put a verified flag inside aggregate types was because (i) it effectively makes the affected type mutable. Though it's fine for LLVMStructType, which is already mutable, we also want to cache other LLVM types like LLVMPointer as well. Making them mutable solely based on this purpose didn't really justify IMHO. (ii) The LLVM::isCompatibleType function is highly likely to be used in different threads, so we need to lock on this verified flag in every affected types.

On the matter of using bits: yes, it would require making more types mutable. On locking the verified bit: mutate already locks the storage uniquer (if threading is enabled). Reading the bit, however, doesn't need to be locked.

I assume ThreadLocalCache is faster than a shared DenseSet with a mutex?

Mostly LG otherwise

mlir/include/mlir/Dialect/LLVMIR/LLVMOpBase.td
61

I wouldn't really want these to be publicly exposed. Can you make the isCompatibleType a static function of LLVMDialect?

70

drop name spaces (like in the rest of the file)

mlir/lib/Dialect/LLVMIR/IR/LLVMTypes.cpp
772

type.getContext()->getLoadedDialect<LLVMDialect>()?

This would allow you to cache any type (if the LLVM dialect is loaded, which it should if there are LLVM types floating around in the context).

myhsu updated this revision to Diff 438489.Jun 20 2022, 3:29 PM
myhsu marked 3 inline comments as done.
  • Put isCompatibleType as a static member function of LLVMDialect.
  • Cache non-LLVM aggregate types as well.
myhsu added a comment.Jun 20 2022, 3:31 PM

On the matter of using bits: yes, it would require making more types mutable. On locking the verified bit: mutate already locks the storage uniquer (if threading is enabled). Reading the bit, however, doesn't need to be locked.

Good to know!

I assume ThreadLocalCache is faster than a shared DenseSet with a mutex?

Correct. It's using thread_local underlying but avoid the lifetime issue you mentioned in earlier comments. Thus, it only locks when a new thread is created.

Mostly LG otherwise

Mogball accepted this revision.Jun 20 2022, 4:02 PM

LGTM but please take a look at the comments

mlir/lib/Dialect/LLVMIR/IR/LLVMTypes.cpp
766–772

I think this should speed up the case of a type already being verified.

I wonder if it would not be possible to remove callstack. I remember you had a comment about it but I don't know where it went. Perhaps something like: optimistically insert the type being verified into compatibleTypes to break cycles but then remove the type if it turns out to be incompatible. I don't think a SetVector is necessary.

And if a cached compatibleTypes set can't be retrieved from the LLVMDialect, just use a local one.

771–772

I would outline the call to cacheCompatibleType:

bool result = llvm::TypeSwitch<Type, bool>(type)...
if (result && compatibleTypes)
  compatibleTypes->insert(type);
return result;
This revision is now accepted and ready to land.Jun 20 2022, 4:02 PM
Mogball added inline comments.Jun 20 2022, 4:03 PM
mlir/lib/Dialect/LLVMIR/IR/LLVMTypes.cpp
766–772

(this can be a follow-up patch, since I think it would improve performance).

myhsu updated this revision to Diff 438759.Jun 21 2022, 10:51 AM
myhsu marked 3 inline comments as done.

Use a single DenseSet for both dependency breaking and cache for compatible types. Please take another look.

myhsu added inline comments.Jun 21 2022, 10:53 AM
mlir/lib/Dialect/LLVMIR/IR/LLVMTypes.cpp
766–772

I wonder if it would not be possible to remove callstack. I remember you had a comment about it but I don't know where it went.

Yes I still remember that (I forgot why I removed it though): originally I only used a single DenseSet to break the cycle and memorize the compatible types, and I simply didn't remove the type after its visiting, based on an (not so solid) assumption that if isCompatibleType returns false, most of the caller will just bail out. It turned out that assumption worked well if we only use a single dialect, if we loaded multiple dialects, isCompatibleType will be used to dispatch some of the parsing / printing logics and the DenseSet would eventually memorize some incompatible types.

optimistically insert the type being verified into compatibleTypes to break cycles but then remove the type if it turns out to be incompatible. I don't think a SetVector is necessary.

sounds like a way. I think I can do it in this patch.

817

FYI: SetVector::pop_back is not faster than DenseSet::erase since the former actually uses DenseSet::erase underlying.

Mogball accepted this revision.Jun 21 2022, 10:53 AM

Nice! LGTM

mlir/lib/Dialect/LLVMIR/IR/LLVMTypes.cpp
817

Yeah, that's exactly what I mean when I said SetVector is not needed...

This revision was landed with ongoing or failed builds.Jun 27 2022, 9:48 AM
This revision was automatically updated to reflect the committed changes.