This is an archive of the discontinued LLVM Phabricator instance.

[mlir] Added oilist primitive
ClosedPublic

Authored by shraiysh on Dec 6 2021, 8:20 PM.

Details

Summary

This patch attempts to add the oilist primitive proposed in the RFC: Extending Declarative Assembly Format to support order-independent variadic segments.

This element supports optional order-independent variadic segments for operations. This will allow OpenACC and OpenMP Dialects to have similar and relaxed requirements while encouraging the use of Declarative Assembly Format and avoiding code duplication.

An oilist element parses grammar of the form:

clause-list := clause clause-list | empty
clause := `keyword` clause1 | `otherKeyword` clause2
clause1 := <assembly-format element>
clause2 := <assembly-format element>

AssemblyFormat specification:

let assemblyFormat = [{
  oilist( `keyword` clause1
        | `otherkeyword` clause2
        ...
        )
}];

Example:

oilist( `private` `(` $arg0 `:` type($arg0) `)`
      | `nowait`
      | `reduction` custom<ReductionClause>($arg1, type($arg1)))

oilist( `private` `=` $arg0 `:` type($arg0)
      | `reduction` `=` $arg1 `:` type($arg1)
      | `firstprivate` `=` $arg3 `:` type($arg2))

Diff Detail

Event Timeline

shraiysh created this revision.Dec 6 2021, 8:20 PM
shraiysh requested review of this revision.Dec 6 2021, 8:20 PM
shraiysh updated this revision to Diff 392273.Dec 6 2021, 8:29 PM

Remove unrelated change

Nice!

mlir/test/mlir-tblgen/op-format-spec.td
368

Can you add an op to the test dialect that uses this new directive and then add some list tests to test round-tripping the new format?

mlir/tools/mlir-tblgen/FormatGen.h
49

[ and ] are called l_square and r_square in other parts of the codebase.

mlir/tools/mlir-tblgen/OpFormatGen.cpp
46

Because oilist is a keyword, I would consider it more of a "directive" and wouldn't really group it with optional.

256–257
263

Can you encapsulate these members and document them?

691

Can you be more descriptive here?

693–696

Can you also reference oilist somewhere in the error message? Otherwise, it might be a bit vague to someone who isn't aware of the construct.

792–796
1187

Use cast if you're immediately going to dereference.

1188–1189

If you're generating an int for every element whose only purpose is to index into a BitVector, why not just generate a bool flag for each element instead and ditch the bit vector?

1193

You could instead do

::llvm::StringRef oilistKey;
while (succeeded(parser.parseOptionalKeyword(&oilistKey))) {
  if (oilistKey == "literal0") { ... } // or StringSwitch
}
1198
1947

Handling the spaces can be tricky. I think you should delete this line and call genLiteralPrinter below (instead of directly generating the literal print yourself), which has the correct behaviour for printing spaces and updating the flags.

1952
1952–1953

hasAttrOfType?

1955–1960

Let genElementPrinter handle the spaces.

1964–1965
2704–2705

Can you add a comment explaining what this is meant to guard against? (I like the error message btw)

2851–2852
2855–2859

Leftover debugging?

2863
2871–2873
2875–2876

A ] is required to exit the loop, so this branch is unreachable.

Mogball requested changes to this revision.Dec 7 2021, 9:26 AM
This revision now requires changes to proceed.Dec 7 2021, 9:26 AM
shraiysh updated this revision to Diff 392918.Dec 8 2021, 1:31 PM
shraiysh marked 20 inline comments as done.

Thanks for the detailed review @Mogball. I have tried to address all your comments. Please let me know if I missed something!

mlir/tools/mlir-tblgen/OpFormatGen.cpp
1193

I tried doing that earlier, but that was causing issues when this was the last element in the assembly format.

For example,
Operation Definition:

let assemblyFormat = [{ oilist [ `keyword` | `otherKeyword` ] }];

MLIR file:

func @testing() {
  test.op keyword
  return
}

It tends to accept return as a keyword and enters the while loop. Once it is parsed, I do not know how to un-parse it for the next operation to take over. Please let me know if you have an alternative solution for this.

shraiysh added inline comments.Dec 8 2021, 1:31 PM
mlir/tools/mlir-tblgen/OpFormatGen.cpp
263

I have documented them. I am not too sure if I can encapsulate literalElements as OIListElement is itself encapsulating it, but I guess we can do something about parsingElements. Please let me know if you want me to make a new class ElementListElement for one parsing element - it is a element that is a list of elements (along with its own parsing and printing). I know that the name is a bit weird, looking for better suggestions for the same.

2871–2873

I cannot put it in the while condition, because then this testcase will be accepted with no errors:

oilist [ `something` | ]

This is because after getting a pipe, continue will check the condition and it will break the loop as curToken is now ]. So, now this will get accepted without any error reporting.

Thanks, will take a look within the next day or two.

mlir/test/lib/Dialect/Test/TestDialect.cpp
521–523

Drop trivial braces.

Mogball added inline comments.Dec 9 2021, 7:33 AM
mlir/tools/mlir-tblgen/OpFormatGen.cpp
263

I meant to make them private members and provide accessors:

auto getLiteralElements() const {
  return llvm::make_pointee_range(llvm::map_range(literalElements, [](auto &el) { return cast<LiteralElement>(el.get()); }));
}
695
1184–1187

Use a for-each loop (and the provided accessor)

1193

Ah, right, because there's no delimiter to end the list. This is fine as-is then.

1193

You can use llvm::zip to iterate over literalElements and parsingElements at the same time

for (auto it : llvm::zip(oilist->getLiteralElements(), oilist->getParsingElements())) {
  LiteralElement &literalElement = std::get<0>(it);
  auto &parsingElement = std::get<1>(it);
  ...
}

You can move printing else to the end of the loop without checking the index. Then after the loop print

body << " {\n"
2871–2873

Right, makes sense.

rriddle requested changes to this revision.Dec 10 2021, 7:45 PM

Can you add documentation for this in docs/OpDefinitions.md?

Also, how does this interact with the detection of parse ambiguities? We should still detect ambiguities inside of (and adjacent to) oilist.

mlir/test/lib/Dialect/Test/TestOps.td
648–650

Why surround with brackets instead of using something like ()(like all of the other directives)?

mlir/tools/mlir-tblgen/FormatGen.cpp
177–179

Keep this sorted.

mlir/tools/mlir-tblgen/FormatGen.h
74–75

Keep this sorted.

mlir/tools/mlir-tblgen/OpFormatGen.cpp
45–46

Please keep these sorted.

255
691

typo: coressponding

695–696

Drop the reference to the current operation, leave that to the parser to handle infra to callout.

The message is also a bit confusing, this is calling out a duplicate entry for the clause but the message seems to state that only one clause of the list is valid.

1182–1183
1184

Cache the end iterator unless it changes within the loop

2705–2711

Can you flip the if so that error happens first? We generally prefer early exit.

2846
This revision now requires changes to proceed.Dec 10 2021, 7:45 PM
shraiysh updated this revision to Diff 394012.Dec 13 2021, 12:39 PM
shraiysh marked 21 inline comments as done.

Thanks for the review @Mogball and @rriddle. I have tried to address all your comments.

@rriddle can you please elaborate on the parse ambiguities. Are you talking about something like this?

oilist ( `keyword` | `otherkeyword` ) `keyword` attr-dict

I realise this translates to no valid syntax with the current implementation of the parser. Would you suggest not allowing oilist-literals right after an oilist element? (Looking for ideas on this)

If there are any other ambiguities which should be discussed, we can discuss them before moving ahead.

Mogball accepted this revision.Dec 14 2021, 10:10 PM

LG enough for me once the issues I've commented on below are resolved. You should wait for river to sign off too.

mlir/tools/mlir-tblgen/OpFormatGen.cpp
250

Constructors and public methods should come before private methods and class members

279–281

make_pointee_range will work on any range of types that implement operator*, including unique_ptr. You don't need to map them to get(). Also, I would just inline this map function into your accessor method below.

298–299

You don't need to store this in a std::function

1192

Should this be a reference?

1951

Should this be a reference?

shraiysh updated this revision to Diff 394489.Dec 15 2021, 1:39 AM
shraiysh marked 2 inline comments as done.

Thanks for the review @Mogball. Addressed Comments.

mlir/tools/mlir-tblgen/OpFormatGen.cpp
250

I wanted the vectorConverter method to be private, but because its return type is auto, I was forced to put it before its use in getParsingElements(). So, if I try to put the other public methods above, the structure will be something like this. Please let me know if I should change it to this:

public:
  OIListElement(...) { ... }
  static bool classof(...) { ... }
  auto getLiteralElements() const {...}
private:
  static auto vectorConverter(...) { ... }
  std::vector<...> literalElements;
  std::vector<...> parsingElements;
public:
  auto getParsingElements() const {...}
279–281

Thanks for the unique_ptr comment. Addressed it.

About the inlining, I can make it an std::function there, but the declaration will not be pretty as it returns a very complex templated llvm::iterator_range. (can't use it directly or use auto with lambda, because lambdas cannot be copied and llvm::zip will try to copy lambda).

std::function<llvm::iterator_range<llvm::pointee_iterator<decltype((
    std::begin(std::declval<std::vector<std::unique_ptr<Element>>>())))>>(
    const std::vector<std::unique_ptr<Element>> &vec)>
    vectorConverter =
        [](auto &vec) { return llvm::make_pointee_range(vec); };

Let me know if you prefer this, and I will change it accordingly.

298–299

I cannot use auto as copy constructor for lambda functions is deleted but it is not deleted for std::function. The function is stored as is in a llvm::map_range iterator and it is copied while using llvm::zip. Using auto will make it a lambda and lead to errors.

rriddle requested changes to this revision.Dec 15 2021, 1:58 AM

Thanks for the review @Mogball and @rriddle. I have tried to address all your comments.

@rriddle can you please elaborate on the parse ambiguities. Are you talking about something like this?

oilist ( `keyword` | `otherkeyword` ) `keyword` attr-dict

I realise this translates to no valid syntax with the current implementation of the parser. Would you suggest not allowing oilist-literals right after an oilist element? (Looking for ideas on this)

If there are any other ambiguities which should be discussed, we can discuss them before moving ahead.

One case in particular is https://github.com/llvm/llvm-project/blob/1652871473a70e9908d794e858388edc89cd6e88/mlir/tools/mlir-tblgen/OpFormatGen.cpp#L2435
https://github.com/llvm/llvm-project/blob/1652871473a70e9908d794e858388edc89cd6e88/mlir/test/mlir-tblgen/op-format-spec.td#L463

mlir/docs/OpDefinitions.md
625

I would avoid the use of "string" here.

627

Something non-obvious here is the treatment of non-optional constructs, should those be allowed? What happens in those cases? Should we error out?

mlir/tools/mlir-tblgen/OpFormatGen.cpp
250

That method doesn't look like it pays for itself, I'd just inline it and keep proper public -> private ordering.

253

Documentation should use /// comments.

274–280

This doesn't appear super useful, I'd just inline it wherever this gets called.

293–294
294
298–299

The use of std::function is quite unfortunate, can you add a comment here capturing this?

303–304
1192

Can you spell out auto here?

1196

nit: Make sure to use camelCase for variable names.

1199

nit: Spell out auto here.

This revision now requires changes to proceed.Dec 15 2021, 1:58 AM
shraiysh updated this revision to Diff 402396.Jan 23 2022, 10:07 PM
shraiysh marked 15 inline comments as done.

Thanks for the review @rriddle. I have tried to address all the comments. Please let me know if I have missed something.

shraiysh edited the summary of this revision. (Show Details)Jan 23 2022, 10:08 PM
rriddle accepted this revision.Feb 8 2022, 12:13 AM

Thanks for the ping, appreciated!

mlir/tools/mlir-tblgen/OpFormatGen.cpp
2183

I think at head we can drop the llvm:: on these.

2522

nit: Drop the trivial braces here.

2887

Can we drop the ::mlir:: on these? (I'm not sure why they are on the others in this file).

This revision is now accepted and ready to land.Feb 8 2022, 12:13 AM
clementval added inline comments.Feb 8 2022, 1:20 AM
mlir/docs/OpDefinitions.md
626
shraiysh updated this revision to Diff 407439.Feb 10 2022, 2:30 AM

Thanks for accepting this @rriddle, and thanks for the review @clementval. Addressed comments.

@Mogball, I have rebased this patch with main and there were some substantial changes by you in these files. Can you please recheck if everything is where it is supposed to be in this patch. Thank you.

shraiysh marked 4 inline comments as done.Feb 10 2022, 2:31 AM
shraiysh requested review of this revision.Feb 15 2022, 7:03 AM
Mogball accepted this revision.Feb 15 2022, 10:48 AM

Overall LGTM.

mlir/tools/mlir-tblgen/FormatGen.h
379

Fill this in? :P

380

Let's move this into OpFormatGen.cpp. There is an expectation that elements defined in this file are available to both Attribute/Type and Operation formats.

385–386
391
399

avoid returning a copy

mlir/tools/mlir-tblgen/OpFormatGen.cpp
793

avoid copying the vector

1190
1951
2910

Unneeded?

2916

Same here

This revision is now accepted and ready to land.Feb 15 2022, 10:48 AM
shraiysh updated this revision to Diff 409482.Feb 16 2022, 9:18 PM
shraiysh marked 10 inline comments as done.

Rebase with main, addressed comments. Thanks for the review @Mogball.

mlir/tools/mlir-tblgen/FormatGen.h
391

Sadly, I have to use std::function here because lambdas can't be copied (copy constructor is required in llvm::zip). I have added a comment about this.

This revision was automatically updated to reflect the committed changes.