This is an archive of the discontinued LLVM Phabricator instance.

[MLIR][SPIRV] Support identified and recursive structs.
ClosedPublic

Authored by ergawy on Sep 6 2020, 10:09 AM.

Details

Summary

This PR adds support for identified and recursive structs. This includes: parsing,
printing, serializing, and deserializing such structs.

The following C struct:

C
struct A {
  A* next;
};

which is translated to the following MLIR code as:

mlir
!spv.struct<A, (!spv.ptr<!spv.struct<A>, Generic>)>

would be represented in the SPIR-V module as:

spirv
OpName %A "A"
OpTypeForwardPointer %APtr Generic
%A = OpTypeStruct %APtr
%APtr = OpTypePointer Generic %A

In particular the following changes are included:

  • SPIR-V structs can now be either identified or literal (i.e. non-identified).
  • All structs now have their members surrounded by a ()-pair.
  • For recursive references, (1) an OpTypeForwardPointer instruction is emitted before the OpTypeStruct instruction defining the recursive struct (2) an OpTypePointer instruction is emitted after the OpTypeStruct instruction which actually defines the recursive pointer to struct type.

Diff Detail

Event Timeline

ergawy created this revision.Sep 6 2020, 10:09 AM
ergawy requested review of this revision.Sep 6 2020, 10:09 AM

There are still a few TODOs and of course more testing is also needed. Just wanted to make sure that there are no major objections to how I approached things here.

ergawy updated this revision to Diff 290143.Sep 6 2020, 10:22 AM

Remove debug prints.

rriddle requested changes to this revision.Sep 6 2020, 11:29 AM

I have concerns about several aspects of what you are doing here. Blocking until I have time to review.

This revision now requires changes to proceed.Sep 6 2020, 11:29 AM
ergawy added a comment.Sep 7 2020, 2:47 AM

I have concerns about several aspects of what you are doing here. Blocking until I have time to review.

Thanks for taking interest. Glad to handle any comments. Hopefully, nothing atrocious was introduced :D!

ergawy updated this revision to Diff 290300.Sep 7 2020, 8:57 AM
  • Run clang-format.
ergawy updated this revision to Diff 290481.Sep 8 2020, 7:25 AM
  • Get rid of getStructContext() that was added for printing.
  • Get rid of getStructContext() that was added for parsing.
  • Rebase on top of master.
ergawy updated this revision to Diff 290484.Sep 8 2020, 7:40 AM
  • Some more clean-up.
ergawy added inline comments.Sep 8 2020, 8:15 AM
mlir/lib/Dialect/SPIRV/SPIRVDialect.cpp
597

Originally, I added a getStructContext() method to the parser but this ended up polluting the infrastructure code in a lot of places.

Instead, I resorted to this solution which is not thread-safe at all. One, not so smart solution, would be to acquire a lock whenever we start parsing an outermost struct and not release that lock until we finished parsing the struct. Would that be ok?

745

Same as my comment above for parsing.

ergawy updated this revision to Diff 290493.Sep 8 2020, 8:34 AM
  • Rename some tests.
ergawy updated this revision to Diff 290645.Sep 8 2020, 11:37 PM
  • Use thread_local instead of static.
ergawy added inline comments.Sep 9 2020, 12:16 AM
mlir/lib/Dialect/SPIRV/SPIRVDialect.cpp
597

Sorry for the noise, forgot about thread_local in C++. Changed from static to thread_local

ftynse requested changes to this revision.Sep 11 2020, 10:39 AM
ftynse added inline comments.
mlir/include/mlir/Dialect/SPIRV/SPIRVTypes.h
359

Nit: prefer TypeRange to ArrayRef<Type>

mlir/lib/Dialect/SPIRV/SPIRVDialect.cpp
597

I would consider a mechanism in DialectAsmParser that allows it, when parsing types and attributes, to be aware of the call stack. Each call to parseType can have an additional argument that is stored on the stack contained in DialectAsmParser and everybody has read-only access to it. Individual parsers can then look up the stack to see if they "recognize" some data stored there and can use it when necessary. This would be essentially a generalization of what this does and what LLVM dialect does.

631–639

Between the time this code looked up the identifier, and the time a struct with this identifier was created, another thread may have created a struct with the same identifier. And this struct may or may not correspond to what is being parsed currently. That's why there was no lookup on the uniquer.

In the LLVM dialect, this is solved by having an "initialized" bit inside the struct storage. The code always calls "get" and is guaranteed to get a type. If the type is already initialized to a different body, the parser can report an error immediately. If it is not initialized, it can attempt mutating it to set the body and the initialized bit, and the mutation fails if somebody has already mutated the type so the parser can, again, report an error. When there is a recursive reference, the parser gets a type (wrapped pointer) that will be populated later but can already be used as element type, including for itself. This approach should work here, too.

This revision now requires changes to proceed.Sep 11 2020, 10:39 AM
ergawy updated this revision to Diff 291585.Sep 14 2020, 8:58 AM
  • Rebase
  • Handle review comments: = Get rid of lookup logic.
ergawy marked an inline comment as done.Sep 14 2020, 9:15 AM
ergawy added inline comments.
mlir/include/mlir/Dialect/SPIRV/SPIRVTypes.h
359

We eventually need an ArrayRef for the allocator (see SPIRVTypes.cpp:369). Am I missing something?

mlir/lib/Dialect/SPIRV/SPIRVDialect.cpp
597

That would be certainly better. I can do that but would prefer to do it as part of another review if that's ok since it touches both dialects.

631–639

Thanks for pointing that out. I already have a flag to check whether the body was before so I didn't even need the lookup logic as you explained.

Removed all lookup methods that I introduced.

ergawy marked an inline comment as done.Sep 14 2020, 9:16 AM
ergawy updated this revision to Diff 291955.Sep 15 2020, 9:16 AM
  • Clean-up, add more docs, and rebase.
ftynse added inline comments.Sep 16 2020, 4:14 AM
mlir/include/mlir/Dialect/SPIRV/SPIRVTypes.h
359

We can add a TypeRange overload to the allocator as well, but definitely not in this patch.

mlir/lib/Dialect/SPIRV/LayoutUtils.cpp
80

Consider returning nullptr and making sure it's properly handled/propagated on the caller's side. Generally, if something can be triggered by the IR that passes the verifier, we don't abort.

mlir/lib/Dialect/SPIRV/SPIRVDialect.cpp
597

Sure thing for the separate patch, thanks!

603

Nit: please terminate sentences with dots in comments.

618

Nit: I'd use if (failed(parser.parseComma()) for consistency with conditional above

622

It looks like from this point onward, the last identifier needs to be deleted from the context before returning. Currently it only happens in non-error cases. There can be uses that parse types and ignore errors, at which point this will be keeping spurious identifiers in the thread-local storage.

628–633

This can be folded into the line that assigns identifierExistsInCtx, and the variable isn't necessary.

ergawy updated this revision to Diff 292249.Sep 16 2020, 9:12 AM
  • Handle review comments
ergawy marked 5 inline comments as done.Sep 16 2020, 9:17 AM
ftynse requested changes to this revision.Sep 17 2020, 5:16 AM
ftynse added inline comments.
mlir/lib/Dialect/SPIRV/SPIRVDialect.cpp
643

Elide trivial braces. Here and below.

677

Shouldn't we check for success here?

mlir/lib/Dialect/SPIRV/SPIRVTypes.cpp
762

This class and methods within could use some documentation, it's not a trivial storage helper anymore.

855

This does not look correct. It has to go through the mutate hook that can be called in a thread-safe way with a proper allocator by the infrastructure. See https://mlir.llvm.org/docs/Tutorials/DefiningAttributesAndTypes/#mutable-types. Following this scheme also removes the temptation to store the allocator.

865

Nit: trailing dot

874

Nit: I don't think you need explicit this-> here.

895

This should not be saving the allocator, no guarantees it will be the same on every call (even though it is, in practice).

935

This may fail if a concurrent thread set the body between Base::get and this call. You can either set the body inside get (forwarded to construct on the storage), or check the result and return nullptr on failure + document that type construction may fail.

991

This should be calling Base::mutate to have proper locking.

mlir/lib/Dialect/SPIRV/Serialization/Deserializer.cpp
115

I wonder if we can use an identified struct with unset body instead.

123

Do we actually expect zero member types to be the most frequently appearing case? I'd use something like 4.

569

Nit: SmallVector is re-exported into the mlir namespace, no need to prefix it with llvm::. Same is true for many ADT, not sure about SetVector.

1276

Generally, we need to check the result of trySetBody everytime we call it, otherwise it doesn't make sense to return the result.

mlir/lib/Dialect/SPIRV/Serialization/Serializer.cpp
1478–1480

Renaming the current processType to processTypeImpl and then introducing a new processType that defines the context and calls processTypeImpl would decrease the change surface in this diff.

mlir/lib/IR/AsmPrinter.cpp
911

This doesn't look necessary anymore.

This revision now requires changes to proceed.Sep 17 2020, 5:16 AM
ergawy updated this revision to Diff 292794.Sep 18 2020, 8:10 AM
  • Handle some style-related review comments.
  • Use mutate to properly set an identified struct's body.
  • Add more docs to StructTypeStorage.
  • Add more checking to StructTypeStorage::mutate().
ergawy marked 12 inline comments as done.Sep 18 2020, 8:19 AM
ergawy added inline comments.
mlir/lib/Dialect/SPIRV/SPIRVTypes.cpp
935

I added a comment explaining the thread-unsafe behavior.

ergawy added inline comments.Sep 18 2020, 8:19 AM
mlir/lib/Dialect/SPIRV/Serialization/Deserializer.cpp
115

+1, saves us a look up and don't see a downside to it. Done.

569

Just FYI, SetVector is indeed no re-exported.

1276

Initially, I refrained from doing so because of a small issue overlooked in the round-trip code of spv. In particular, the same context is used both for serialization and deserialization. However:

  • I adopted what you did in the LLVM dialect by failing only if the struct content is different from what is already set.
  • Started a review to handle the round-tripping issue: https://reviews.llvm.org/D87692.
ergawy updated this revision to Diff 292802.Sep 18 2020, 8:29 AM
ergawy marked 4 inline comments as done.
  • Remove some trivial braces.
ergawy updated this revision to Diff 292812.Sep 18 2020, 8:53 AM
  • Run clang-format.
ergawy edited the summary of this revision. (Show Details)Sep 18 2020, 8:58 AM
ergawy marked 6 inline comments as done.Sep 22 2020, 3:15 AM

Ping! :). I guess all comments from @ftynse are handled now. Please have another look whenever is suitable for you.

mlir/lib/Dialect/SPIRV/SPIRVDialect.cpp
597

I will prepare a review with the proposed changes this week when I have some free time.

mlir/lib/Dialect/SPIRV/SPIRVTypes.cpp
874

We do, the field and the argument have the same name.

ergawy updated this revision to Diff 294641.Sep 28 2020, 3:44 AM
ergawy marked 2 inline comments as done.
  • Rebase.
ftynse accepted this revision.Sep 29 2020, 5:26 AM

Type modeling LGTM. Sorry for the delay, this is a relatively complex and long patch that requires full attention. I would speculate that @rriddle picked at the same things as I did about type modeling. It would be great to have somebody more familiar with SPIR-V to take a look at the (de)serialization.

mlir/include/mlir/Dialect/SPIRV/SPIRVTypes.h
308

Nit: I believe the same comment as below about threads is applicable here. That is, if another thread creates (or indeed if any thread had already created a struct with the same identifier, that struct will be returned).

mlir/lib/Dialect/SPIRV/SPIRVTypes.cpp
778

Nit: I'd also require the unused parts of the key to be always empty. This seems to be the case in practice and avoids bad surprises in the future.

891–896

(Out-of-scope for this commit): any reason why this is not using ArrayRef?

mlir/lib/Dialect/SPIRV/Serialization/Deserializer.cpp
114

Nit: this isn't an ID anymore.

mlir/lib/Dialect/SPIRV/Transforms/DecorateSPIRVCompositeTypeLayoutPass.cpp
39 ↗(On Diff #294641)

Ultra-nit: drop trivial braces

ergawy updated this revision to Diff 295201.Sep 30 2020, 1:35 AM
  • Handle review comments.
ergawy marked 4 inline comments as done.Sep 30 2020, 1:39 AM

@ftynse thanks for the approval, I appreciate you taking the time for a careful review (and apologies for always forgetting to remove trivial braces, in my personal coding I like to always add them :D. But if that's the style mostly followed in MLIR then I will follow it as closely as possible).

@antiagainst promised to have a look soon :P.

(and apologies for always forgetting to remove trivial braces, in my personal coding I like to always add them :D. But if that's the style mostly followed in MLIR then I will follow it as closely as possible).

No worries, I find switching back and forth between styles to be the hardest part :) Eliding trivial braces is a part of LLVM coding standard now - https://llvm.org/docs/CodingStandards.html#don-t-use-braces-on-simple-single-statement-bodies-of-if-else-loop-statements - it has always been implicitly.

antiagainst accepted this revision.Sep 30 2020, 3:07 PM

This is impressive, thanks @ergawy for the great contribution! I love the extensive documentation and tests. :) Sorry for the delay; I have a little one to my family recently so that eats into my time and effort greatly. :) In general this looks good to me. Also Alex has been already given extensive reviews. I just have a few nits. I'll handle landing this after you addressed them. Thanks very much!

mlir/include/mlir/Dialect/SPIRV/SPIRVTypes.h
266

It would be nice to briefly talk about the two kinds of struct types: literal and identified in the documentation. Like:

Two kinds of struct types are supported: literal and identified. The former is uniqued via its fields while the later is uniqued via a string identifier. One typically see literal structs; identified structs are typically used for recursive structs.

322

Documentation for this? What if this struct is not identified?

mlir/lib/Dialect/SPIRV/SPIRVDialect.cpp
595

This function is quite lengthy right now. We need to break it down. No action required for this change though. Could you put a TODO to remind?

597

It would be nice to document this regarding what it is used for and why thread_local.

599

Does this have to be static?

mlir/lib/Dialect/SPIRV/SPIRVTypes.cpp
772

Nit: put two spaces to align info with an. :)

mlir/lib/Dialect/SPIRV/Serialization/Deserializer.cpp
1270

Nit: "expected ... type": use lower case and no trailing dot.

1475

Is this a tab?? :)

2503

TODO:

mlir/lib/Dialect/SPIRV/Serialization/Serializer.cpp
119

Just put this ahead of recursiveStructInfos in the class?

1000

Document why this context for later reference?

1020

The else here is not necessary? and also a tab in the below. :) Make sure you run clang format.

ergawy updated this revision to Diff 295540.Oct 1 2020, 5:52 AM
  • Handle review comments.
ergawy marked 9 inline comments as done.Oct 1 2020, 5:59 AM

Sorry for the delay; I have a little one to my family recently so that eats into my time and effort greatly. :)

No worries and thanks for taking the time @antiagainst.

mlir/lib/Dialect/SPIRV/SPIRVDialect.cpp
599

I declared it static since I didn't want a new lambda object to be created with each invocation.

mlir/lib/Dialect/SPIRV/Serialization/Deserializer.cpp
1475

Not sure why it shows a tab. Ran clang-format on the entire file.

I have concerns about several aspects of what you are doing here. Blocking until I have time to review.

Friendly ping to @rriddle. I'd like to land this and iterate on it if there are no major concerns from you.

ergawy updated this revision to Diff 296124.Oct 5 2020, 2:28 AM
  • Update tests to reflect latest master changes.
rriddle accepted this revision.Oct 5 2020, 1:20 PM

Apologies for any delays, this wasn't appearing in my dashboard. LGTM pending resolving comments.

mlir/include/mlir/Dialect/SPIRV/SPIRVTypes.h
265

Top-level comments should be ///.

326

nit: Can you add a comment here?

mlir/lib/Dialect/SPIRV/LayoutUtils.cpp
73

nit: Don't use else after return

73

nit: Drop else after return.

mlir/lib/Dialect/SPIRV/SPIRVDialect.cpp
604

Is there an action item to remove this? We shouldn't be relying on thread_local ideally.

750

nit: Can you flip this conditional and early return the else branch instead?

mlir/lib/Dialect/SPIRV/SPIRVTypes.cpp
764

nit: If you have an comment inside of the body, you need braces.

766

nit: Don't use else after return.

879

nit: Can you name the parameters something else to avoid the need for this->? Will avoid any potential problems refactoring this in the future.

893

Why the newline here?

895

nit: You could use a PointerIntPair to pack this boolean into one of the pointers to save space.

mlir/lib/Dialect/SPIRV/Serialization/Deserializer.cpp
564

hmm, seems like the style in this file is inconsistent. Variable comments should use ///.

1238

nit: Drop the newline here.

1247

Why the use of std::begin/std::end? Also if the iterators aren't changing within the loop, you should cache the end iterator instead of recomputing it.

1250

Same here.

mlir/lib/Dialect/SPIRV/Serialization/Serializer.cpp
1017

nit: Drop trivial braces.

1024

Is this something that we can test for?

This revision is now accepted and ready to land.Oct 5 2020, 1:20 PM
ergawy updated this revision to Diff 296433.Oct 6 2020, 6:17 AM
  • Rebase and handle review comments.
ergawy marked 15 inline comments as done.Oct 6 2020, 6:21 AM

Thanks for the review!

mlir/lib/Dialect/SPIRV/SPIRVDialect.cpp
604

I don't whether the solution is good, but I started a review to get rid of this and also to provide a unified solution for managing parsing context in general: https://reviews.llvm.org/D88090.

If you think the solution is not good enough, then I will at least add a TODO to find a proper solution.

Out of curiosity, what are the downsides for using thread_local?

mlir/lib/Dialect/SPIRV/Serialization/Deserializer.cpp
564

I fixed it for the newly added variables at least.

1247

Should I follow a different style when iterating over containers?

The containers are changed inside the loop so we need to recalculate the end iterator.

mlir/lib/Dialect/SPIRV/Serialization/Serializer.cpp
1024

I think we can. I will follow up with a review to handle that TODO. Just thought this one is long as it is.

ergawy marked 3 inline comments as done.Oct 6 2020, 6:21 AM
ergawy added a comment.Oct 8 2020, 3:54 AM

Ping! Is it ok to merge this patch now? (I don't have commit access).

ergawy updated this revision to Diff 297525.Oct 12 2020, 2:55 AM
  • Rebase.
This revision was automatically updated to reflect the committed changes.