This is an archive of the discontinued LLVM Phabricator instance.

[mlir] Added oilist primitive
ClosedPublic

Authored by shraiysh on Dec 6 2021, 8:20 PM.

Details

Summary

This patch attempts to add the oilist primitive proposed in the RFC: Extending Declarative Assembly Format to support order-independent variadic segments.

This element supports optional order-independent variadic segments for operations. This will allow OpenACC and OpenMP Dialects to have similar and relaxed requirements while encouraging the use of Declarative Assembly Format and avoiding code duplication.

An oilist element parses grammar of the form:

clause-list := clause clause-list | empty
clause := `keyword` clause1 | `otherKeyword` clause2
clause1 := <assembly-format element>
clause2 := <assembly-format element>

AssemblyFormat specification:

let assemblyFormat = [{
  oilist( `keyword` clause1
        | `otherkeyword` clause2
        ...
        )
}];

Example:

oilist( `private` `(` $arg0 `:` type($arg0) `)`
      | `nowait`
      | `reduction` custom<ReductionClause>($arg1, type($arg1)))

oilist( `private` `=` $arg0 `:` type($arg0)
      | `reduction` `=` $arg1 `:` type($arg1)
      | `firstprivate` `=` $arg3 `:` type($arg2))

Diff Detail

Event Timeline

shraiysh created this revision.Dec 6 2021, 8:20 PM
shraiysh requested review of this revision.Dec 6 2021, 8:20 PM
shraiysh updated this revision to Diff 392273.Dec 6 2021, 8:29 PM

Remove unrelated change

Nice!

mlir/test/mlir-tblgen/op-format-spec.td
368

Can you add an op to the test dialect that uses this new directive and then add some list tests to test round-tripping the new format?

mlir/tools/mlir-tblgen/FormatGen.h
45

[ and ] are called l_square and r_square in other parts of the codebase.

mlir/tools/mlir-tblgen/OpFormatGen.cpp
72

Because oilist is a keyword, I would consider it more of a "directive" and wouldn't really group it with optional.

390–391
397

Can you encapsulate these members and document them?

895

Can you be more descriptive here?

897–900

Can you also reference oilist somewhere in the error message? Otherwise, it might be a bit vague to someone who isn't aware of the construct.

995–999
1393

Use cast if you're immediately going to dereference.

1394–1395

If you're generating an int for every element whose only purpose is to index into a BitVector, why not just generate a bool flag for each element instead and ditch the bit vector?

1399

You could instead do

::llvm::StringRef oilistKey;
while (succeeded(parser.parseOptionalKeyword(&oilistKey))) {
  if (oilistKey == "literal0") { ... } // or StringSwitch
}
1404
2152

Handling the spaces can be tricky. I think you should delete this line and call genLiteralPrinter below (instead of directly generating the literal print yourself), which has the correct behaviour for printing spaces and updating the flags.

2157
2157–2158

hasAttrOfType?

2160–2165

Let genElementPrinter handle the spaces.

2169–2170
3020–3027

Can you add a comment explaining what this is meant to guard against? (I like the error message btw)

3403–3404
3407–3411

Leftover debugging?

3415
3423–3425
3427–3428

A ] is required to exit the loop, so this branch is unreachable.

Mogball requested changes to this revision.Dec 7 2021, 9:26 AM
This revision now requires changes to proceed.Dec 7 2021, 9:26 AM
shraiysh updated this revision to Diff 392918.Dec 8 2021, 1:31 PM
shraiysh marked 20 inline comments as done.

Thanks for the detailed review @Mogball. I have tried to address all your comments. Please let me know if I missed something!

mlir/tools/mlir-tblgen/OpFormatGen.cpp
1399

I tried doing that earlier, but that was causing issues when this was the last element in the assembly format.

For example,
Operation Definition:

let assemblyFormat = [{ oilist [ `keyword` | `otherKeyword` ] }];

MLIR file:

func @testing() {
  test.op keyword
  return
}

It tends to accept return as a keyword and enters the while loop. Once it is parsed, I do not know how to un-parse it for the next operation to take over. Please let me know if you have an alternative solution for this.

shraiysh added inline comments.Dec 8 2021, 1:31 PM
mlir/tools/mlir-tblgen/OpFormatGen.cpp
397

I have documented them. I am not too sure if I can encapsulate literalElements as OIListElement is itself encapsulating it, but I guess we can do something about parsingElements. Please let me know if you want me to make a new class ElementListElement for one parsing element - it is a element that is a list of elements (along with its own parsing and printing). I know that the name is a bit weird, looking for better suggestions for the same.

3423–3425

I cannot put it in the while condition, because then this testcase will be accepted with no errors:

oilist [ `something` | ]

This is because after getting a pipe, continue will check the condition and it will break the loop as curToken is now ]. So, now this will get accepted without any error reporting.

Thanks, will take a look within the next day or two.

mlir/test/lib/Dialect/Test/TestDialect.cpp
520–522

Drop trivial braces.

Mogball added inline comments.Dec 9 2021, 7:33 AM
mlir/tools/mlir-tblgen/OpFormatGen.cpp
397

I meant to make them private members and provide accessors:

auto getLiteralElements() const {
  return llvm::make_pointee_range(llvm::map_range(literalElements, [](auto &el) { return cast<LiteralElement>(el.get()); }));
}
899
1390–1393

Use a for-each loop (and the provided accessor)

1399

Ah, right, because there's no delimiter to end the list. This is fine as-is then.

1399

You can use llvm::zip to iterate over literalElements and parsingElements at the same time

for (auto it : llvm::zip(oilist->getLiteralElements(), oilist->getParsingElements())) {
  LiteralElement &literalElement = std::get<0>(it);
  auto &parsingElement = std::get<1>(it);
  ...
}

You can move printing else to the end of the loop without checking the index. Then after the loop print

body << " {\n"
3423–3425

Right, makes sense.

rriddle requested changes to this revision.Dec 10 2021, 7:45 PM

Can you add documentation for this in docs/OpDefinitions.md?

Also, how does this interact with the detection of parse ambiguities? We should still detect ambiguities inside of (and adjacent to) oilist.

mlir/test/lib/Dialect/Test/TestOps.td
649–651

Why surround with brackets instead of using something like ()(like all of the other directives)?

mlir/tools/mlir-tblgen/FormatGen.cpp
177–179

Keep this sorted.

mlir/tools/mlir-tblgen/FormatGen.h
70–71

Keep this sorted.

mlir/tools/mlir-tblgen/OpFormatGen.cpp
54–55

Please keep these sorted.

389
895

typo: coressponding

899–900

Drop the reference to the current operation, leave that to the parser to handle infra to callout.

The message is also a bit confusing, this is calling out a duplicate entry for the clause but the message seems to state that only one clause of the list is valid.

1388–1389
1390

Cache the end iterator unless it changes within the loop

3021–3027

Can you flip the if so that error happens first? We generally prefer early exit.

3398
This revision now requires changes to proceed.Dec 10 2021, 7:45 PM
shraiysh updated this revision to Diff 394012.Dec 13 2021, 12:39 PM
shraiysh marked 21 inline comments as done.

Thanks for the review @Mogball and @rriddle. I have tried to address all your comments.

@rriddle can you please elaborate on the parse ambiguities. Are you talking about something like this?

oilist ( `keyword` | `otherkeyword` ) `keyword` attr-dict

I realise this translates to no valid syntax with the current implementation of the parser. Would you suggest not allowing oilist-literals right after an oilist element? (Looking for ideas on this)

If there are any other ambiguities which should be discussed, we can discuss them before moving ahead.

Mogball accepted this revision.Dec 14 2021, 10:10 PM

LG enough for me once the issues I've commented on below are resolved. You should wait for river to sign off too.

mlir/tools/mlir-tblgen/OpFormatGen.cpp
384

Constructors and public methods should come before private methods and class members

413–415

make_pointee_range will work on any range of types that implement operator*, including unique_ptr. You don't need to map them to get(). Also, I would just inline this map function into your accessor method below.

432–433

You don't need to store this in a std::function

1398

Should this be a reference?

2156

Should this be a reference?

shraiysh updated this revision to Diff 394489.Dec 15 2021, 1:39 AM
shraiysh marked 2 inline comments as done.

Thanks for the review @Mogball. Addressed Comments.

mlir/tools/mlir-tblgen/OpFormatGen.cpp
384

I wanted the vectorConverter method to be private, but because its return type is auto, I was forced to put it before its use in getParsingElements(). So, if I try to put the other public methods above, the structure will be something like this. Please let me know if I should change it to this:

public:
  OIListElement(...) { ... }
  static bool classof(...) { ... }
  auto getLiteralElements() const {...}
private:
  static auto vectorConverter(...) { ... }
  std::vector<...> literalElements;
  std::vector<...> parsingElements;
public:
  auto getParsingElements() const {...}
413–415

Thanks for the unique_ptr comment. Addressed it.

About the inlining, I can make it an std::function there, but the declaration will not be pretty as it returns a very complex templated llvm::iterator_range. (can't use it directly or use auto with lambda, because lambdas cannot be copied and llvm::zip will try to copy lambda).

std::function<llvm::iterator_range<llvm::pointee_iterator<decltype((
    std::begin(std::declval<std::vector<std::unique_ptr<Element>>>())))>>(
    const std::vector<std::unique_ptr<Element>> &vec)>
    vectorConverter =
        [](auto &vec) { return llvm::make_pointee_range(vec); };

Let me know if you prefer this, and I will change it accordingly.

432–433

I cannot use auto as copy constructor for lambda functions is deleted but it is not deleted for std::function. The function is stored as is in a llvm::map_range iterator and it is copied while using llvm::zip. Using auto will make it a lambda and lead to errors.

rriddle requested changes to this revision.Dec 15 2021, 1:58 AM

Thanks for the review @Mogball and @rriddle. I have tried to address all your comments.

@rriddle can you please elaborate on the parse ambiguities. Are you talking about something like this?

oilist ( `keyword` | `otherkeyword` ) `keyword` attr-dict

I realise this translates to no valid syntax with the current implementation of the parser. Would you suggest not allowing oilist-literals right after an oilist element? (Looking for ideas on this)

If there are any other ambiguities which should be discussed, we can discuss them before moving ahead.

One case in particular is https://github.com/llvm/llvm-project/blob/1652871473a70e9908d794e858388edc89cd6e88/mlir/tools/mlir-tblgen/OpFormatGen.cpp#L2435
https://github.com/llvm/llvm-project/blob/1652871473a70e9908d794e858388edc89cd6e88/mlir/test/mlir-tblgen/op-format-spec.td#L463

mlir/docs/OpDefinitions.md
626

I would avoid the use of "string" here.

628

Something non-obvious here is the treatment of non-optional constructs, should those be allowed? What happens in those cases? Should we error out?

mlir/tools/mlir-tblgen/OpFormatGen.cpp
384

That method doesn't look like it pays for itself, I'd just inline it and keep proper public -> private ordering.

387

Documentation should use /// comments.

408–414

This doesn't appear super useful, I'd just inline it wherever this gets called.

427–428
428
432–433

The use of std::function is quite unfortunate, can you add a comment here capturing this?

437–438
1398

Can you spell out auto here?

1402

nit: Make sure to use camelCase for variable names.

1405

nit: Spell out auto here.

This revision now requires changes to proceed.Dec 15 2021, 1:58 AM
shraiysh updated this revision to Diff 402396.Jan 23 2022, 10:07 PM
shraiysh marked 15 inline comments as done.

Thanks for the review @rriddle. I have tried to address all the comments. Please let me know if I have missed something.

shraiysh edited the summary of this revision. (Show Details)Jan 23 2022, 10:08 PM
rriddle accepted this revision.Feb 8 2022, 12:13 AM

Thanks for the ping, appreciated!

mlir/tools/mlir-tblgen/OpFormatGen.cpp
2386

I think at head we can drop the llvm:: on these.

2793

nit: Drop the trivial braces here.

3439

Can we drop the ::mlir:: on these? (I'm not sure why they are on the others in this file).

This revision is now accepted and ready to land.Feb 8 2022, 12:13 AM
clementval added inline comments.Feb 8 2022, 1:20 AM
mlir/docs/OpDefinitions.md
627
shraiysh updated this revision to Diff 407439.Feb 10 2022, 2:30 AM

Thanks for accepting this @rriddle, and thanks for the review @clementval. Addressed comments.

@Mogball, I have rebased this patch with main and there were some substantial changes by you in these files. Can you please recheck if everything is where it is supposed to be in this patch. Thank you.

shraiysh marked 4 inline comments as done.Feb 10 2022, 2:31 AM
shraiysh requested review of this revision.Feb 15 2022, 7:03 AM
Mogball accepted this revision.Feb 15 2022, 10:48 AM

Overall LGTM.

mlir/tools/mlir-tblgen/FormatGen.h
147

Fill this in? :P

148

Let's move this into OpFormatGen.cpp. There is an expectation that elements defined in this file are available to both Attribute/Type and Operation formats.

153–154
159
167

avoid returning a copy

mlir/tools/mlir-tblgen/OpFormatGen.cpp
996

avoid copying the vector

1396
2156
3462

Unneeded?

3468

Same here

This revision is now accepted and ready to land.Feb 15 2022, 10:48 AM
shraiysh updated this revision to Diff 409482.Feb 16 2022, 9:18 PM
shraiysh marked 10 inline comments as done.

Rebase with main, addressed comments. Thanks for the review @Mogball.

mlir/tools/mlir-tblgen/FormatGen.h
159

Sadly, I have to use std::function here because lambdas can't be copied (copy constructor is required in llvm::zip). I have added a comment about this.

This revision was automatically updated to reflect the committed changes.