This is an archive of the discontinued LLVM Phabricator instance.

[mlir] Added oilist primitive
ClosedPublic

Authored by shraiysh on Dec 6 2021, 8:20 PM.

Details

Summary

This patch attempts to add the oilist primitive proposed in the RFC: Extending Declarative Assembly Format to support order-independent variadic segments.

This element supports optional order-independent variadic segments for operations. This will allow OpenACC and OpenMP Dialects to have similar and relaxed requirements while encouraging the use of Declarative Assembly Format and avoiding code duplication.

An oilist element parses grammar of the form:

clause-list := clause clause-list | empty
clause := `keyword` clause1 | `otherKeyword` clause2
clause1 := <assembly-format element>
clause2 := <assembly-format element>

AssemblyFormat specification:

let assemblyFormat = [{
  oilist( `keyword` clause1
        | `otherkeyword` clause2
        ...
        )
}];

Example:

oilist( `private` `(` $arg0 `:` type($arg0) `)`
      | `nowait`
      | `reduction` custom<ReductionClause>($arg1, type($arg1)))

oilist( `private` `=` $arg0 `:` type($arg0)
      | `reduction` `=` $arg1 `:` type($arg1)
      | `firstprivate` `=` $arg3 `:` type($arg2))

Diff Detail

Event Timeline

shraiysh created this revision.Dec 6 2021, 8:20 PM
shraiysh requested review of this revision.Dec 6 2021, 8:20 PM
shraiysh updated this revision to Diff 392273.Dec 6 2021, 8:29 PM

Remove unrelated change

Nice!

mlir/test/mlir-tblgen/op-format-spec.td
368

Can you add an op to the test dialect that uses this new directive and then add some list tests to test round-tripping the new format?

mlir/tools/mlir-tblgen/FormatGen.h
45

[ and ] are called l_square and r_square in other parts of the codebase.

mlir/tools/mlir-tblgen/OpFormatGen.cpp
71

Because oilist is a keyword, I would consider it more of a "directive" and wouldn't really group it with optional.

370–371
377

Can you encapsulate these members and document them?

831

Can you be more descriptive here?

833–836

Can you also reference oilist somewhere in the error message? Otherwise, it might be a bit vague to someone who isn't aware of the construct.

931–935
1325

Use cast if you're immediately going to dereference.

1326–1327

If you're generating an int for every element whose only purpose is to index into a BitVector, why not just generate a bool flag for each element instead and ditch the bit vector?

1331

You could instead do

::llvm::StringRef oilistKey;
while (succeeded(parser.parseOptionalKeyword(&oilistKey))) {
  if (oilistKey == "literal0") { ... } // or StringSwitch
}
1336
2078

Handling the spaces can be tricky. I think you should delete this line and call genLiteralPrinter below (instead of directly generating the literal print yourself), which has the correct behaviour for printing spaces and updating the flags.

2083
2083–2084

hasAttrOfType?

2086–2091

Let genElementPrinter handle the spaces.

2095–2096
2891–2899

Can you add a comment explaining what this is meant to guard against? (I like the error message btw)

3275–3276
3279–3283

Leftover debugging?

3287
3295–3297
3299–3300

A ] is required to exit the loop, so this branch is unreachable.

Mogball requested changes to this revision.Dec 7 2021, 9:26 AM
This revision now requires changes to proceed.Dec 7 2021, 9:26 AM
shraiysh updated this revision to Diff 392918.Dec 8 2021, 1:31 PM
shraiysh marked 20 inline comments as done.

Thanks for the detailed review @Mogball. I have tried to address all your comments. Please let me know if I missed something!

mlir/tools/mlir-tblgen/OpFormatGen.cpp
1331

I tried doing that earlier, but that was causing issues when this was the last element in the assembly format.

For example,
Operation Definition:

let assemblyFormat = [{ oilist [ `keyword` | `otherKeyword` ] }];

MLIR file:

func @testing() {
  test.op keyword
  return
}

It tends to accept return as a keyword and enters the while loop. Once it is parsed, I do not know how to un-parse it for the next operation to take over. Please let me know if you have an alternative solution for this.

shraiysh added inline comments.Dec 8 2021, 1:31 PM
mlir/tools/mlir-tblgen/OpFormatGen.cpp
377

I have documented them. I am not too sure if I can encapsulate literalElements as OIListElement is itself encapsulating it, but I guess we can do something about parsingElements. Please let me know if you want me to make a new class ElementListElement for one parsing element - it is a element that is a list of elements (along with its own parsing and printing). I know that the name is a bit weird, looking for better suggestions for the same.

3295–3297

I cannot put it in the while condition, because then this testcase will be accepted with no errors:

oilist [ `something` | ]

This is because after getting a pipe, continue will check the condition and it will break the loop as curToken is now ]. So, now this will get accepted without any error reporting.

Thanks, will take a look within the next day or two.

mlir/test/lib/Dialect/Test/TestDialect.cpp
521–523

Drop trivial braces.

Mogball added inline comments.Dec 9 2021, 7:33 AM
mlir/tools/mlir-tblgen/OpFormatGen.cpp
377

I meant to make them private members and provide accessors:

auto getLiteralElements() const {
  return llvm::make_pointee_range(llvm::map_range(literalElements, [](auto &el) { return cast<LiteralElement>(el.get()); }));
}
835
1322–1325

Use a for-each loop (and the provided accessor)

1331

Ah, right, because there's no delimiter to end the list. This is fine as-is then.

1331

You can use llvm::zip to iterate over literalElements and parsingElements at the same time

for (auto it : llvm::zip(oilist->getLiteralElements(), oilist->getParsingElements())) {
  LiteralElement &literalElement = std::get<0>(it);
  auto &parsingElement = std::get<1>(it);
  ...
}

You can move printing else to the end of the loop without checking the index. Then after the loop print

body << " {\n"
3295–3297

Right, makes sense.

rriddle requested changes to this revision.Dec 10 2021, 7:45 PM

Can you add documentation for this in docs/OpDefinitions.md?

Also, how does this interact with the detection of parse ambiguities? We should still detect ambiguities inside of (and adjacent to) oilist.

mlir/test/lib/Dialect/Test/TestOps.td
613–615

Why surround with brackets instead of using something like ()(like all of the other directives)?

mlir/tools/mlir-tblgen/FormatGen.cpp
180–181

Keep this sorted.

mlir/tools/mlir-tblgen/FormatGen.h
70–71

Keep this sorted.

mlir/tools/mlir-tblgen/OpFormatGen.cpp
52–53

Please keep these sorted.

369
831

typo: coressponding

835–836

Drop the reference to the current operation, leave that to the parser to handle infra to callout.

The message is also a bit confusing, this is calling out a duplicate entry for the clause but the message seems to state that only one clause of the list is valid.

1320–1321
1322

Cache the end iterator unless it changes within the loop

2892–2898

Can you flip the if so that error happens first? We generally prefer early exit.

3270
This revision now requires changes to proceed.Dec 10 2021, 7:45 PM
shraiysh updated this revision to Diff 394012.Dec 13 2021, 12:39 PM
shraiysh marked 21 inline comments as done.

Thanks for the review @Mogball and @rriddle. I have tried to address all your comments.

@rriddle can you please elaborate on the parse ambiguities. Are you talking about something like this?

oilist ( `keyword` | `otherkeyword` ) `keyword` attr-dict

I realise this translates to no valid syntax with the current implementation of the parser. Would you suggest not allowing oilist-literals right after an oilist element? (Looking for ideas on this)

If there are any other ambiguities which should be discussed, we can discuss them before moving ahead.

Mogball accepted this revision.Dec 14 2021, 10:10 PM

LG enough for me once the issues I've commented on below are resolved. You should wait for river to sign off too.

mlir/tools/mlir-tblgen/OpFormatGen.cpp
364

Constructors and public methods should come before private methods and class members

393–395

make_pointee_range will work on any range of types that implement operator*, including unique_ptr. You don't need to map them to get(). Also, I would just inline this map function into your accessor method below.

412–413

You don't need to store this in a std::function

1330

Should this be a reference?

2082

Should this be a reference?

shraiysh updated this revision to Diff 394489.Dec 15 2021, 1:39 AM
shraiysh marked 2 inline comments as done.

Thanks for the review @Mogball. Addressed Comments.

mlir/tools/mlir-tblgen/OpFormatGen.cpp
364

I wanted the vectorConverter method to be private, but because its return type is auto, I was forced to put it before its use in getParsingElements(). So, if I try to put the other public methods above, the structure will be something like this. Please let me know if I should change it to this:

public:
  OIListElement(...) { ... }
  static bool classof(...) { ... }
  auto getLiteralElements() const {...}
private:
  static auto vectorConverter(...) { ... }
  std::vector<...> literalElements;
  std::vector<...> parsingElements;
public:
  auto getParsingElements() const {...}
393–395

Thanks for the unique_ptr comment. Addressed it.

About the inlining, I can make it an std::function there, but the declaration will not be pretty as it returns a very complex templated llvm::iterator_range. (can't use it directly or use auto with lambda, because lambdas cannot be copied and llvm::zip will try to copy lambda).

std::function<llvm::iterator_range<llvm::pointee_iterator<decltype((
    std::begin(std::declval<std::vector<std::unique_ptr<Element>>>())))>>(
    const std::vector<std::unique_ptr<Element>> &vec)>
    vectorConverter =
        [](auto &vec) { return llvm::make_pointee_range(vec); };

Let me know if you prefer this, and I will change it accordingly.

412–413

I cannot use auto as copy constructor for lambda functions is deleted but it is not deleted for std::function. The function is stored as is in a llvm::map_range iterator and it is copied while using llvm::zip. Using auto will make it a lambda and lead to errors.

rriddle requested changes to this revision.Dec 15 2021, 1:58 AM

Thanks for the review @Mogball and @rriddle. I have tried to address all your comments.

@rriddle can you please elaborate on the parse ambiguities. Are you talking about something like this?

oilist ( `keyword` | `otherkeyword` ) `keyword` attr-dict

I realise this translates to no valid syntax with the current implementation of the parser. Would you suggest not allowing oilist-literals right after an oilist element? (Looking for ideas on this)

If there are any other ambiguities which should be discussed, we can discuss them before moving ahead.

One case in particular is https://github.com/llvm/llvm-project/blob/1652871473a70e9908d794e858388edc89cd6e88/mlir/tools/mlir-tblgen/OpFormatGen.cpp#L2435
https://github.com/llvm/llvm-project/blob/1652871473a70e9908d794e858388edc89cd6e88/mlir/test/mlir-tblgen/op-format-spec.td#L463

mlir/docs/OpDefinitions.md
626 ↗(On Diff #394489)

I would avoid the use of "string" here.

628 ↗(On Diff #394489)

Something non-obvious here is the treatment of non-optional constructs, should those be allowed? What happens in those cases? Should we error out?

mlir/tools/mlir-tblgen/OpFormatGen.cpp
364

That method doesn't look like it pays for itself, I'd just inline it and keep proper public -> private ordering.

367

Documentation should use /// comments.

388–394

This doesn't appear super useful, I'd just inline it wherever this gets called.

407–408
408
412–413

The use of std::function is quite unfortunate, can you add a comment here capturing this?

417–418
1330

Can you spell out auto here?

1334

nit: Make sure to use camelCase for variable names.

1337

nit: Spell out auto here.

This revision now requires changes to proceed.Dec 15 2021, 1:58 AM
shraiysh updated this revision to Diff 402396.Jan 23 2022, 10:07 PM
shraiysh marked 15 inline comments as done.

Thanks for the review @rriddle. I have tried to address all the comments. Please let me know if I have missed something.

shraiysh edited the summary of this revision. (Show Details)Jan 23 2022, 10:08 PM
rriddle accepted this revision.Feb 8 2022, 12:13 AM

Thanks for the ping, appreciated!

mlir/tools/mlir-tblgen/OpFormatGen.cpp
2308

I think at head we can drop the llvm:: on these.

2702

nit: Drop the trivial braces here.

3311

Can we drop the ::mlir:: on these? (I'm not sure why they are on the others in this file).

This revision is now accepted and ready to land.Feb 8 2022, 12:13 AM
clementval added inline comments.Feb 8 2022, 1:20 AM
mlir/docs/OpDefinitions.md
627 ↗(On Diff #402396)
shraiysh updated this revision to Diff 407439.Feb 10 2022, 2:30 AM

Thanks for accepting this @rriddle, and thanks for the review @clementval. Addressed comments.

@Mogball, I have rebased this patch with main and there were some substantial changes by you in these files. Can you please recheck if everything is where it is supposed to be in this patch. Thank you.

shraiysh marked 4 inline comments as done.Feb 10 2022, 2:31 AM
shraiysh requested review of this revision.Feb 15 2022, 7:03 AM
Mogball accepted this revision.Feb 15 2022, 10:48 AM

Overall LGTM.

mlir/tools/mlir-tblgen/FormatGen.h
148

Fill this in? :P

149

Let's move this into OpFormatGen.cpp. There is an expectation that elements defined in this file are available to both Attribute/Type and Operation formats.

154–155
160
168

avoid returning a copy

mlir/tools/mlir-tblgen/OpFormatGen.cpp
932

avoid copying the vector

1328
2082
3334

Unneeded?

3340

Same here

This revision is now accepted and ready to land.Feb 15 2022, 10:48 AM
shraiysh updated this revision to Diff 409482.Feb 16 2022, 9:18 PM
shraiysh marked 10 inline comments as done.

Rebase with main, addressed comments. Thanks for the review @Mogball.

mlir/tools/mlir-tblgen/FormatGen.h
160

Sadly, I have to use std::function here because lambdas can't be copied (copy constructor is required in llvm::zip). I have added a comment about this.

This revision was automatically updated to reflect the committed changes.