This is an archive of the discontinued LLVM Phabricator instance.

[mlir] First-party modeling of LLVM types
ClosedPublic

Authored by ftynse on Jul 22 2020, 9:39 AM.

Details

Summary

The current modeling of LLVM IR types in MLIR is based on the LLVMType class
that wraps a raw llvm::Type * and delegates uniquing, printing and parsing to
LLVM itself. This model makes thread-safe type manipulation hard and is being
progressively replaced with a cleaner MLIR model that replicates the type
system. Introduce a set of classes reflecting the LLVM IR type system in MLIR
instead of wrapping the existing types. These are currently introduced as
separate classes without affecting the dialect flow, and are exercised through
a test dialect. Once feature parity is reached, the old implementation will be
gradually substituted with the new one.

Depends On D84171

Diff Detail

Event Timeline

ftynse created this revision.Jul 22 2020, 9:39 AM
Herald added a project: Restricted Project. · View Herald Transcript
jdd added a subscriber: jdd.Jul 23 2020, 11:06 AM
jdd added inline comments.
mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
2225

Why are you passing around the output stream instead of the DialogAsmPrinter? I would prefer the stack be in DialogAsmPrinter, be a stack of mlir::Type, and be automatically managed by DialogAsmPrinter so you're not relying on the closed nature of the LLVM type system. @rriddle seemed to be supportive of that idea or at least confirmed it would work. Do you consider this to be out-of-scope for this change?

rriddle added inline comments.Jul 23 2020, 12:12 PM
mlir/include/mlir/Dialect/LLVMIR/LLVMDialect.h
246

Please fix lint check.

mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
1774

I would move these to an enum for KeyFlags/MutableFlags. Right now it is slightly confusing.

1860

Can we restructure this storage a bit? We could move the identified body Type[] to be a trailing object, and move that size here(as unsigned). We could then steal two bits from each of the sizes for the flags.

1892

You could pack this into the resultType with PointerIntPair, you just need to define PointerLikeTypeTraits for LLVMTypeNew(or drop it down to a Type and just case when accessing it).

ftynse updated this revision to Diff 280517.Jul 24 2020, 10:43 AM
ftynse marked 3 inline comments as done.

Ready for review

ftynse marked 3 inline comments as done.Jul 24 2020, 10:44 AM
ftynse added inline comments.
mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
1860

We could move the identified body Type[] to be a trailing object, and move that size here(as unsigned).

This would mean that we need to know the size of the storage, i.e. the number of elements in the body of an identified struct, when we construct the storage, but we don't. It may remain unknown until we mutate the storage, at which point we cannot reallocate it anymore.

I considered having TrailingObjects<Storage, char, Type> for the key, but it creates a wrong impression that both the name and the (literal) type list can be present at the same time, and increases code complexity.

1892

I've been looking for the place where alignment is indicated, but failed to find it... Can you point me to it?

2225

Well, because I only need a stream in these functions and because some functions are reused with other streams (error reporting in parsing also calls these before the type is fully constructed). LLVM types do and likely will keep relying on closedness, otherwise, e.g., !llvm.ptr<i32> is ambiguous in pointing to standard or LLVM type.

Though I am happy to use a DialectAsmPrinter/Parser with a stack instead of rolling my own, we are already looking at 1300 LoC without comments, and I'm not even done here. When somebody gets to implement those things, I can consider moving this code to use them.

ftynse retitled this revision from WIP: LLVM types to [mlir] First-party modeling of LLVM types.Jul 24 2020, 10:45 AM
ftynse edited the summary of this revision. (Show Details)
ftynse marked an inline comment as done.
rriddle added inline comments.Jul 24 2020, 11:37 AM
mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
1860

SGTM.

ftynse updated this revision to Diff 280856.Jul 27 2020, 4:53 AM

more cmake

rriddle marked an inline comment as done.Jul 27 2020, 7:42 PM

Looks really great Alex! Made a first pass through, nothing really big mostly nits.

mlir/include/mlir/Dialect/LLVMIR/LLVMTypes.h
56 ↗(On Diff #280517)

nit: The other enums in MLIR use CamelCase for elements.

158 ↗(On Diff #280517)

Can you separate these class definitions with comment blocks?

//===-...-===//
//
//===-...-===//
189 ↗(On Diff #280517)

nit: Checks -> Returns

194 ↗(On Diff #280517)

Why have the duplicate method?

197 ↗(On Diff #280517)

nit: Maybe add signless integer type

213 ↗(On Diff #280517)

nit: of this integer type?

255 ↗(On Diff #280517)

typos: prodiving and lits

257 ↗(On Diff #280517)

nit: without a body

288 ↗(On Diff #280517)

nit: Such a struct

289 ↗(On Diff #280517)

nit: with a mutable body

296 ↗(On Diff #280517)

nit: Set the body

432 ↗(On Diff #280517)

nit: Can you set this to the same value as PointerLikeTypeTraits<mlir::Type>::NumLowBitsAvailable?

mlir/include/mlir/IR/DialectImplementation.h
329

nit: Can you add a one line comment here?

mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
1757

Why was this removed?

mlir/lib/Dialect/LLVMIR/IR/LLVMTypes.cpp
49 ↗(On Diff #280856)

Could we use llvm/ADT/BitFields here instead?

105 ↗(On Diff #280856)

Can you move these to a TypeDetail header in this directory?

115 ↗(On Diff #280856)

nit: whether the

133 ↗(On Diff #280856)

nit: I think you could state - mutable (Identified structs only) which would avoid the need to state it for each point? Feel free to ignore, given that the current description matches the style immutable part.

193 ↗(On Diff #280856)

Can you use the enum names instead of numbers here?

203 ↗(On Diff #280856)
if (isIdentified())
  return other.isIdentified() && identifier.equals(...);
return !other.isIdentified() && other.isPacked() == ...;

?

227 ↗(On Diff #280856)

nit: Prefer getFoo style getters and setters. Same applies below.

259 ↗(On Diff #280856)

typo: unintilized

289 ↗(On Diff #280856)

nit: Prefer early return here.

353 ↗(On Diff #280856)

BumpPtrAllocator::Allocate<T> (used by copyInto) ensures address alignment for a given T. Are you wanting the char * key to have the same alignment as the Type *?

Also, what would be within the pointer union?

462 ↗(On Diff #280856)

Please remove the ::mlir:: in this file.

629 ↗(On Diff #280856)

Could you also move the printing/parsing to a different file?

894 ↗(On Diff #280856)

Please add a comment here.

941 ↗(On Diff #280856)

nit: Can you just capture the diagnostic as a variable, and then attach the note?

auto diag = parser.emitError(subtypesLoc) << ...;
diag.attachNote() << ...;
969 ↗(On Diff #280856)

nit: Just return directly.

ftynse updated this revision to Diff 281171.Jul 28 2020, 3:27 AM
ftynse marked 28 inline comments as done.

Address review

ftynse updated this revision to Diff 281181.Jul 28 2020, 4:08 AM

Try to appease MSVC

ftynse updated this revision to Diff 281188.Jul 28 2020, 4:51 AM

Even more MSVC

ftynse added inline comments.Jul 28 2020, 6:01 AM
mlir/include/mlir/Dialect/LLVMIR/LLVMTypes.h
194 ↗(On Diff #280517)

LLVM has it as params but that contradicts the usual MLIR style. I am fine either way.

257 ↗(On Diff #280517)

Intentionality matters here. A struct can be created without a body with the intention of setting the body later, or with the intention of keeping it opaque. I referred to the latter as intentionally opaque, do you have a better name?

mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
1757

code motion accident

mlir/lib/Dialect/LLVMIR/IR/LLVMTypes.cpp
49 ↗(On Diff #280856)

Sure, thanks for mentioning this!

193 ↗(On Diff #280856)

The "packed" flag is present in two enums and while it currently has the same value in both, it can change leading to subtle problems with hashing here, e.g., if it has the same value as identified. I can add named variables here instead.

353 ↗(On Diff #280856)

BumpPtrAllocator::Allocate<T> (used by copyInto) ensures address alignment for a given T. Are you wanting the char * key to have the same alignment as the Type *?

I want at least one bit for the pointer union itself. If I get three bits, I can use the remaining bits for flags instead of stealing them from the size. But that may consume extra space, so I'm also fine with writing the casts myself.

Also, what would be within the pointer union?

The same thing as in keyPtr with type erasure: first element of the key.

969 ↗(On Diff #280856)

debug leftover, sorry

ftynse updated this revision to Diff 281543.Jul 29 2020, 5:59 AM

Use LLVMTypeNew inside struct storage for consistency with other storages

rriddle accepted this revision.Jul 29 2020, 3:38 PM
rriddle added inline comments.
mlir/lib/Dialect/LLVMIR/IR/LLVMTypeSyntax.cpp
1 ↗(On Diff #281543)

Should we name this TypeParser instead? This would match the naming that the Quant dialect used.

mlir/lib/Dialect/LLVMIR/IR/LLVMTypes.cpp
193 ↗(On Diff #280856)

I can add named variables here instead.

Ah, thanks for explaining. Named variables works for me.

353 ↗(On Diff #280856)

I want at least one bit for the pointer union itself. If I get three bits, I can use the
remaining bits for flags instead of stealing them from the size. But that may consume
extra space, so I'm also fine with writing the casts myself.

Ah, we could add an alignment parameter to StorageAllocator::copyInto. We'd just need to reimplement what BumpPtrAllocator<T>::Allocate does(effectively just uses the type-erased Allocate and casts back to a T*) and allow for a new desired alignment. We'd just need to assert that the desired alignment is still valid for T.

This revision is now accepted and ready to land.Jul 29 2020, 3:38 PM
ftynse added inline comments.Jul 30 2020, 2:39 PM
mlir/lib/Dialect/LLVMIR/IR/LLVMTypeSyntax.cpp
1 ↗(On Diff #281543)

I considered that, but the file contains both parsing and printing so it felt incorrect to emphasize the parser. I can rename if you think it makes more sense.

mlir/lib/Dialect/LLVMIR/IR/LLVMTypes.cpp
353 ↗(On Diff #280856)

Ah, we could add an alignment parameter to StorageAllocator::copyInto. We'd just need to reimplement what BumpPtrAllocator<T>::Allocate does(effectively just uses the type-erased Allocate and casts back to a T*) and allow for a new desired alignment. We'd just need to assert that the desired alignment is still valid for T.

I'll keep it as a more detailed TODO, it is unclear to me if we want to potentially waste additional memory for alignment just to remove a couple of casts when we pack bits into sizes in the same object.

This revision was landed with ongoing or failed builds.Aug 3 2020, 6:45 AM
This revision was automatically updated to reflect the committed changes.