This is an archive of the discontinued LLVM Phabricator instance.

[mlir] First-party modeling of LLVM types
ClosedPublic

Authored by ftynse on Jul 22 2020, 9:39 AM.

Details

Summary

The current modeling of LLVM IR types in MLIR is based on the LLVMType class
that wraps a raw llvm::Type * and delegates uniquing, printing and parsing to
LLVM itself. This model makes thread-safe type manipulation hard and is being
progressively replaced with a cleaner MLIR model that replicates the type
system. Introduce a set of classes reflecting the LLVM IR type system in MLIR
instead of wrapping the existing types. These are currently introduced as
separate classes without affecting the dialect flow, and are exercised through
a test dialect. Once feature parity is reached, the old implementation will be
gradually substituted with the new one.

Depends On D84171

Diff Detail

Event Timeline

ftynse created this revision.Jul 22 2020, 9:39 AM
Herald added a project: Restricted Project. · View Herald Transcript
jdd added a subscriber: jdd.Jul 23 2020, 11:06 AM
jdd added inline comments.
mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
2225 ↗(On Diff #279857)

Why are you passing around the output stream instead of the DialogAsmPrinter? I would prefer the stack be in DialogAsmPrinter, be a stack of mlir::Type, and be automatically managed by DialogAsmPrinter so you're not relying on the closed nature of the LLVM type system. @rriddle seemed to be supportive of that idea or at least confirmed it would work. Do you consider this to be out-of-scope for this change?

rriddle added inline comments.Jul 23 2020, 12:12 PM
mlir/include/mlir/Dialect/LLVMIR/LLVMDialect.h
246 ↗(On Diff #279857)

Please fix lint check.

mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
1774 ↗(On Diff #279857)

I would move these to an enum for KeyFlags/MutableFlags. Right now it is slightly confusing.

1860 ↗(On Diff #279857)

Can we restructure this storage a bit? We could move the identified body Type[] to be a trailing object, and move that size here(as unsigned). We could then steal two bits from each of the sizes for the flags.

1892 ↗(On Diff #279857)

You could pack this into the resultType with PointerIntPair, you just need to define PointerLikeTypeTraits for LLVMTypeNew(or drop it down to a Type and just case when accessing it).

ftynse updated this revision to Diff 280517.Jul 24 2020, 10:43 AM
ftynse marked 3 inline comments as done.

Ready for review

ftynse marked 3 inline comments as done.Jul 24 2020, 10:44 AM
ftynse added inline comments.
mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
1860 ↗(On Diff #279857)

We could move the identified body Type[] to be a trailing object, and move that size here(as unsigned).

This would mean that we need to know the size of the storage, i.e. the number of elements in the body of an identified struct, when we construct the storage, but we don't. It may remain unknown until we mutate the storage, at which point we cannot reallocate it anymore.

I considered having TrailingObjects<Storage, char, Type> for the key, but it creates a wrong impression that both the name and the (literal) type list can be present at the same time, and increases code complexity.

1892 ↗(On Diff #279857)

I've been looking for the place where alignment is indicated, but failed to find it... Can you point me to it?

2225 ↗(On Diff #279857)

Well, because I only need a stream in these functions and because some functions are reused with other streams (error reporting in parsing also calls these before the type is fully constructed). LLVM types do and likely will keep relying on closedness, otherwise, e.g., !llvm.ptr<i32> is ambiguous in pointing to standard or LLVM type.

Though I am happy to use a DialectAsmPrinter/Parser with a stack instead of rolling my own, we are already looking at 1300 LoC without comments, and I'm not even done here. When somebody gets to implement those things, I can consider moving this code to use them.

ftynse retitled this revision from WIP: LLVM types to [mlir] First-party modeling of LLVM types.Jul 24 2020, 10:45 AM
ftynse edited the summary of this revision. (Show Details)
ftynse marked an inline comment as done.
rriddle added inline comments.Jul 24 2020, 11:37 AM
mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
1860 ↗(On Diff #279857)

SGTM.

ftynse updated this revision to Diff 280856.Jul 27 2020, 4:53 AM

more cmake

rriddle marked an inline comment as done.Jul 27 2020, 7:42 PM

Looks really great Alex! Made a first pass through, nothing really big mostly nits.

mlir/include/mlir/Dialect/LLVMIR/LLVMTypes.h
57

nit: The other enums in MLIR use CamelCase for elements.

159

Can you separate these class definitions with comment blocks?

//===-...-===//
//
//===-...-===//
190

nit: Checks -> Returns

195

Why have the duplicate method?

198

nit: Maybe add signless integer type

214

nit: of this integer type?

256

typos: prodiving and lits

258

nit: without a body

289

nit: Such a struct

290

nit: with a mutable body

297

nit: Set the body

433

nit: Can you set this to the same value as PointerLikeTypeTraits<mlir::Type>::NumLowBitsAvailable?

mlir/include/mlir/IR/DialectImplementation.h
329

nit: Can you add a one line comment here?

mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
1757 ↗(On Diff #280517)

Why was this removed?

mlir/lib/Dialect/LLVMIR/IR/LLVMTypes.cpp
50

Could we use llvm/ADT/BitFields here instead?

106

Can you move these to a TypeDetail header in this directory?

116

nit: whether the

134

nit: I think you could state - mutable (Identified structs only) which would avoid the need to state it for each point? Feel free to ignore, given that the current description matches the style immutable part.

194

Can you use the enum names instead of numbers here?

204
if (isIdentified())
  return other.isIdentified() && identifier.equals(...);
return !other.isIdentified() && other.isPacked() == ...;

?

228

nit: Prefer getFoo style getters and setters. Same applies below.

260

typo: unintilized

290

nit: Prefer early return here.

354

BumpPtrAllocator::Allocate<T> (used by copyInto) ensures address alignment for a given T. Are you wanting the char * key to have the same alignment as the Type *?

Also, what would be within the pointer union?

463

Please remove the ::mlir:: in this file.

630

Could you also move the printing/parsing to a different file?

895

Please add a comment here.

942

nit: Can you just capture the diagnostic as a variable, and then attach the note?

auto diag = parser.emitError(subtypesLoc) << ...;
diag.attachNote() << ...;
970

nit: Just return directly.

ftynse updated this revision to Diff 281171.Jul 28 2020, 3:27 AM
ftynse marked 28 inline comments as done.

Address review

ftynse updated this revision to Diff 281181.Jul 28 2020, 4:08 AM

Try to appease MSVC

ftynse updated this revision to Diff 281188.Jul 28 2020, 4:51 AM

Even more MSVC

ftynse added inline comments.Jul 28 2020, 6:01 AM
mlir/include/mlir/Dialect/LLVMIR/LLVMTypes.h
195

LLVM has it as params but that contradicts the usual MLIR style. I am fine either way.

258

Intentionality matters here. A struct can be created without a body with the intention of setting the body later, or with the intention of keeping it opaque. I referred to the latter as intentionally opaque, do you have a better name?

mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
1757 ↗(On Diff #280517)

code motion accident

mlir/lib/Dialect/LLVMIR/IR/LLVMTypes.cpp
50

Sure, thanks for mentioning this!

194

The "packed" flag is present in two enums and while it currently has the same value in both, it can change leading to subtle problems with hashing here, e.g., if it has the same value as identified. I can add named variables here instead.

354

BumpPtrAllocator::Allocate<T> (used by copyInto) ensures address alignment for a given T. Are you wanting the char * key to have the same alignment as the Type *?

I want at least one bit for the pointer union itself. If I get three bits, I can use the remaining bits for flags instead of stealing them from the size. But that may consume extra space, so I'm also fine with writing the casts myself.

Also, what would be within the pointer union?

The same thing as in keyPtr with type erasure: first element of the key.

970

debug leftover, sorry

ftynse updated this revision to Diff 281543.Jul 29 2020, 5:59 AM

Use LLVMTypeNew inside struct storage for consistency with other storages

rriddle accepted this revision.Jul 29 2020, 3:38 PM
rriddle added inline comments.
mlir/lib/Dialect/LLVMIR/IR/LLVMTypeSyntax.cpp
2

Should we name this TypeParser instead? This would match the naming that the Quant dialect used.

mlir/lib/Dialect/LLVMIR/IR/LLVMTypes.cpp
194

I can add named variables here instead.

Ah, thanks for explaining. Named variables works for me.

354

I want at least one bit for the pointer union itself. If I get three bits, I can use the
remaining bits for flags instead of stealing them from the size. But that may consume
extra space, so I'm also fine with writing the casts myself.

Ah, we could add an alignment parameter to StorageAllocator::copyInto. We'd just need to reimplement what BumpPtrAllocator<T>::Allocate does(effectively just uses the type-erased Allocate and casts back to a T*) and allow for a new desired alignment. We'd just need to assert that the desired alignment is still valid for T.

This revision is now accepted and ready to land.Jul 29 2020, 3:38 PM
ftynse added inline comments.Jul 30 2020, 2:39 PM
mlir/lib/Dialect/LLVMIR/IR/LLVMTypeSyntax.cpp
2

I considered that, but the file contains both parsing and printing so it felt incorrect to emphasize the parser. I can rename if you think it makes more sense.

mlir/lib/Dialect/LLVMIR/IR/LLVMTypes.cpp
354

Ah, we could add an alignment parameter to StorageAllocator::copyInto. We'd just need to reimplement what BumpPtrAllocator<T>::Allocate does(effectively just uses the type-erased Allocate and casts back to a T*) and allow for a new desired alignment. We'd just need to assert that the desired alignment is still valid for T.

I'll keep it as a more detailed TODO, it is unclear to me if we want to potentially waste additional memory for alignment just to remove a couple of casts when we pack bits into sizes in the same object.

This revision was landed with ongoing or failed builds.Aug 3 2020, 6:45 AM
This revision was automatically updated to reflect the committed changes.