This is an archive of the discontinued LLVM Phabricator instance.

[MLIR] Added llvm.invoke and llvm.landingpad
ClosedPublic

Authored by shraiysh on Dec 30 2019, 8:42 AM.

Details

Summary

I have tried to implement llvm.invoke and llvm.landingpad.

  1. llvm.invoke is similar to llvm.call with two successors added, the first one is the normal label and the second one is unwind label.
  2. llvm.launchpad takes a variable number of args with either catch or filter associated with them. Catch clauses are not array types and filter clauses are array types. This is same as the criteria used by LLVM (https://github.com/llvm/llvm-project/blob/4f82af81a04d711721300f6ca32f402f2ea6faf4/llvm/include/llvm/IR/Instructions.h#L2866)

Examples:

LLVM IR

define i32 @caller(i32 %a) personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
    invoke i32 @foo(i32 2) to label %success unwind label %fail

  success:
    ret i32 2

  fail:
    landingpad {i8*, i32} catch i8** @_ZTIi catch i8** null catch i8* bitcast (i8** @_ZTIi to i8*) filter [1 x i8] [ i8 1 ]
    ret i32 3
}

MLIR LLVM Dialect

llvm.func @caller(%arg0: !llvm.i32) -> !llvm.i32 {
  %0 = llvm.mlir.constant(3 : i32) : !llvm.i32
  %1 = llvm.mlir.constant("\01") : !llvm<"[1 x i8]">
  %2 = llvm.mlir.addressof @_ZTIi : !llvm<"i8**">
  %3 = llvm.bitcast %2 : !llvm<"i8**"> to !llvm<"i8*">
  %4 = llvm.mlir.null : !llvm<"i8**">
  %5 = llvm.mlir.addressof @_ZTIi : !llvm<"i8**">
  %6 = llvm.mlir.constant(2 : i32) : !llvm.i32
  %7 = llvm.invoke @foo(%6) to ^bb1 unwind ^bb2 : (!llvm.i32) -> !llvm.i32
^bb1:	// pred: ^bb0
  llvm.return %6 : !llvm.i32
^bb2:	// pred: ^bb0
  %8 = llvm.landingpad (catch %5 : !llvm<"i8**">) (catch %4 : !llvm<"i8**">) (catch %3 : !llvm<"i8*">) (filter %1 : !llvm<"[1 x i8]">) : !llvm<"{ i8*, i32 }">
  llvm.return %0 : !llvm.i32
}

Signed-off-by: Shraiysh Vaishay <cs17btech11050@iith.ac.in>

Diff Detail

Event Timeline

shraiysh created this revision.Dec 30 2019, 8:42 AM
shraiysh edited the summary of this revision. (Show Details)Dec 30 2019, 9:00 AM
shraiysh added a reviewer: ftynse.
rriddle requested changes to this revision.Dec 30 2019, 9:36 AM

You are missing some tests.

Simple comments for now, haven't had time to think much about the modeling.

For 2. I would remove the attributes and just do the same thing that LLVM does, i.e. if the type is an Array it is a filter. I don't see a reason why attributes are necessary on top of that.

mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td
377

Can we use the same names as LLVM for 'success' and 'failure'?

379

This should be checked now, we don't model PHIInst like LLVM does so this should be even simpler.

452

Lines should be wrapped at 80 characters just like code.

mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
642

This needs to be updated

645

Why is this necessary? Why not just use result.attributes directly instead?

651

Change this to FunctionType, parseType will already check that the parsed type isa<FunctionType>

868

Stream these in:

p << ' ' << op.getOperands() << " : " << op.getType();

875

This seems unimplemented

mlir/lib/Target/LLVMIR/ConvertFromLLVMIR.cpp
294

Leftover debugging.

299

This looks unrelated.

780

Remove all trivial braces.

799

Please use proper builder.create<> instead, adding build methods to Invoke and LandingPad as necessary.

mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
317

Remove all commented out code.

This revision now requires changes to proceed.Dec 30 2019, 9:36 AM
shraiysh added a comment.EditedDec 30 2019, 11:48 AM

Yes, I'm sorry for the confusion. Thanks for the comments.

I just wanted a green light to go ahead with this implementation. This obviously needs a lot of revision 🙂

lebedev.ri retitled this revision from Added llvm.invoke and llvm.landingpad to [MLIR] Added llvm.invoke and llvm.landingpad.Dec 30 2019, 11:50 AM
ftynse added a comment.EditedDec 31 2019, 1:27 AM

Haven't looked at the code, let met know if you want me to. Please see design comments inline.

I have tried to implement llvm.invoke and llvm.landingpad.

llvm.invoke is similar to llvm.call with two successors added, the first one is the normal label and the second one is unwind label.
llvm.call was breaking for function pointers inside code, so I did not handle the same for llvm.invoke

Breaking how? If you think there's a bug, please report it (and feel free to assign to me).

handling that can probably be a separate commit as it is independent on this. Both call and invoke might need revision after that.

I have a slight preference of not changing the IR often, i.e. if we know invoke will be modeled differently soon, why not model it differently from the start?

llvm.launchpad takes a variable number of args with either catch or filter associated with them. I went with a string attribute dictionary for a list of catch and filter.

This could work, but there may be another way. Since the clauses must refer to globals anyway, we can have a list of attributes where the i-th attribute is a SymbolRef to the global if it is a catch clause, and a nested potentially empty array of SymbolRefs to globals if it is a filter clause. This is a more intrusive change compared to LLVM IR, but it's justifiable IMO since we model globals differently anyway. We can optionally store the array types as other attributes but they can always be inferred from array attributes. I haven't thought deeply about the trade-offs involved.

This would give something like
"llvm.landingpad"() {[@_ZTli, [], @_ZTd, [@_ZTf]}] : () -> !llvm<"result_type">
in the generic form. which would correspond to
landingpad catch i8* @_ZTli, filter [0xi8**] undef, catch i8* @_ZTd, filter [1xi8*] @_ZTf
It will remove the need to do addressof and declare constants before the landingpad (simplifying the verifier that checks whether it's the first operation in the block, and another one that checks the operands are only constants), but will make it hard to see the types.

Not sure if implementing another type constraint alongside LLVM_Type would be a better idea.

I don't think type constraints are relevant here. You need to _store_ the fact that an operand belongs to one clause or another, constraints are only checked in the verifier and don't affect the storage.

llvm.launchpad expects an array as an argument when filtering. Constant array and Undefined values were not handled and the program was going into a segmentation fault.

Please report a bug about the segmentation fault.

I could test that it should work using i8 constants, which are interpreted as strings, but this is not rigorous testing.
Examples:
LLVM IR

define i32 @caller(i32 %a) personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {

  invoke i32 @foo(i32 2) to label %success unwind label %fail

success:
  ret i32 2

fail:
  landingpad {i8*, i32} catch i8** null catch i8* bitcast (i8** @_ZTIi to i8*) filter [1 x i8] [ i8 1 ]
  ret i32 3

}
MLIR LLVM Dialect

llvm.func @caller(%arg0: !llvm.i32) -> !llvm.i32 {

  %0 = llvm.mlir.constant(3 : i32) : !llvm.i32
  %1 = llvm.mlir.constant("\01") : !llvm<"[1 x i8]">
  %2 = llvm.mlir.addressof @_ZTIi : !llvm<"i8**">
  %3 = llvm.bitcast %2 : !llvm<"i8**"> to !llvm<"i8*">
  %4 = llvm.mlir.null : !llvm<"i8**">
  %5 = llvm.mlir.constant(2 : i32) : !llvm.i32
  %6 = llvm.invoke @foo(%5) ^bb1, ^bb2 : (!llvm.i32) -> !llvm.i32
^bb1:	// pred: ^bb0
  llvm.return %5 : !llvm.i32
^bb2:	// pred: ^bb0
  %7 = llvm.landingpad  {clauseTypes = ["catch", "catch", "filter"], cleanup = false} %4, %3, %1 : !llvm<"{ i8*, i32 }">
  llvm.return %0 : !llvm.i32
}

Is this syntax okay? Should go ahead with verifiers and tests for currently handled inputs?

I think you can even have the llvm-like syntax:
llvm.landingpad catch %4, catch %3, filter %1 : !llvm<"{ i8*, i32 }">

You can do something like

StringRef kw;
while (succeeded(parseOptionalKeyword(&kw)) {
  if (kw.is(kCatchKeyword)) // handle catch
  else if (kw.is(kFilterKeyword)) // handle filter
  else if (kw.is(kCleanupKeyword)) // handle cleanup
}
parseColonType(..)

For 2. I would remove the attributes and just do the same thing that LLVM does, i.e. if the type is an Array it is a filter. I don't see a reason why attributes are necessary on top of that.

Does anything preclude catch type from being an array type, in theory? (In C++, it's a pointer to a std::type_info, but we should not be limited to that)

For 2. I would remove the attributes and just do the same thing that LLVM does, i.e. if the type is an Array it is a filter. I don't see a reason why attributes are necessary on top of that.

Does anything preclude catch type from being an array type, in theory? (In C++, it's a pointer to a std::type_info, but we should not be limited to that)

Langref doesn't outright state it as far as I can remember, but this has always been a constraint(from the point landing pad was added, ~9 years). You could try to use an Array for catch, but it would require some amount of changes to LLVM(i.e. removing baked-in assumptions) to get it to work.
https://github.com/llvm/llvm-project/blob/4f82af81a04d711721300f6ca32f402f2ea6faf4/llvm/include/llvm/IR/Instructions.h#L2866

Haven't looked at the code, let met know if you want me to. Please see design comments inline.

No, the code is not finished yet. I just wanted reviews on the design. Thanks.

Breaking how? If you think there's a bug, please report it (and feel free to assign to me).

It goes into a segmentation fault. I will file a bug.

I have a slight preference of not changing the IR often, i.e. if we know invoke will be modelled differently soon, why not model it differently from the start?

Agreed. Handling the function call is same for both call and invoke - so, once the function pointers are handled for call, it will be exactly the same for invoke. I think this syntax is going to work even after the change - it's just that while handling those pointers, the LLVM IR parser in MLIR might have to change a bit - that is what I meant by revision.

This could work, but there may be another way. Since the clauses must refer to globals anyway, we can have a list of attributes where the i-th attribute is a SymbolRef to the global if it is a catch clause, and a nested potentially empty array of SymbolRefs to globals if it is a filter clause. This is a more intrusive change compared to LLVM IR, but it's justifiable IMO since we model globals differently anyway. We can optionally store the array types as other attributes but they can always be inferred from array attributes. I haven't thought deeply about the trade-offs involved.

This would give something like
"llvm.landingpad"() {[@_ZTli, [], @_ZTd, [@_ZTf]}] : () -> !llvm<"result_type">
in the generic form. which would correspond to
landingpad catch i8* @_ZTli, filter [0xi8**] undef, catch i8* @_ZTd, filter [1xi8*] @_ZTf
It will remove the need to do addressof and declare constants before the landingpad (simplifying the verifier that checks whether it's the first operation in the block, and another one that checks the operands are only constants), but will make it hard to see the types.

This sounds like a better idea. I will start working on this instead of the current thing.

I don't think type constraints are relevant here. You need to _store_ the fact that an operand belongs to one clause or another, constraints are only checked in the verifier and don't affect the storage.

Okay. That makes sense.

llvm.launchpad expects an array as an argument when filtering. Constant array and Undefined values were not handled and the program was going into a segmentation fault.

Please report a bug about the segmentation fault.

Sure thing. I will report about those too.

I think you can even have the llvm-like syntax:
llvm.landingpad catch %4, catch %3, filter %1 : !llvm<"{ i8*, i32 }">

You can do something like

StringRef kw;
while (succeeded(parseOptionalKeyword(&kw)) {
  if (kw.is(kCatchKeyword)) // handle catch
  else if (kw.is(kFilterKeyword)) // handle filter
  else if (kw.is(kCleanupKeyword)) // handle cleanup
}
parseColonType(..)

Sure, I will do this.

Thanks for the comments @ftynse, @rriddle.

shraiysh updated this revision to Diff 239000.Jan 19 2020, 9:55 AM
shraiysh edited the summary of this revision. (Show Details)

I have tried to address almost all the comments.

  1. I did not make a builder for llvm.invoke and used the OperationState instead - because it would involve two specialised builders for just one instruction. Let me know if making both of them is worth the effort.
  2. Because the clauses can have any constant and not only a global variable (bitcasts and all), I decided to save them in operands instead of an array of attributes. The verifier for llvm.invoke checks for the first llvm.landingpad instruction and makes sure that whatever operations occur before it are the ones used in the clauses. I am not sure how sound this is. I will be willing to update it in case this is flawed.

I have also updated the summary. If I missed anything, please let me know, and I will address it.

Please add proper builders. Users should *not* have to use OperationState directly, it is extremely error prone and clunky.

mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td
333

Please move large blocks of c++ into the .cpp file.

349

Remove trivial braces. There are still quite a few remaining.

357

This loop seems fairly inefficient. Can you rework this?

363

This seems slightly incorrect. Other instructions are allowed if they feed into the landingpad.

mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
431

Can you print this in a more MLIR friendly way? i.e. operand : type

448

Can we use a UnitAttr for this instead?

462

Why not just resolve the operand right here?

471

I don't see where the location is used.

mlir/lib/Target/LLVMIR/ConvertFromLLVMIR.cpp
546

nit: You can elide the b.getIdentifier() and just pass the string name.

mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
223

This looks error prone, what is the invariant you want here?

232

nit: inline this into its use.

237

Start diagnostics with lower case

mlir/test/Dialect/LLVMIR/roundtrip.mlir
228

Please do not use SSA value names directly.

https://mlir.llvm.org/getting_started/TestingGuide/

This script can help with generating check lines, but should be stripped down to something appropriate.

https://github.com/llvm/llvm-project/blob/master/mlir/utils/generate-test-checks.py

271

Missing newline in some of these files.

shraiysh updated this revision to Diff 239052.Jan 20 2020, 3:23 AM
shraiysh marked 19 inline comments as done.
shraiysh edited the summary of this revision. (Show Details)

I have tried to address all the comments pointed out by @rriddle . I have also made the builder for llvm.InvokeOp. Please let me know if anything else has to be done.

shraiysh added inline comments.Jan 20 2020, 3:25 AM
mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td
357

I have avoided this now, by processing GlobalVariable in processConstant (earlier it was in processValue) and putting it at the beginning of the basic block. So, now, the first instruction has to be a landingpad instruction.

mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
471

It is used in reporting the error inside next if.

shraiysh updated this revision to Diff 239081.Jan 20 2020, 4:46 AM
rriddle added inline comments.Jan 21 2020, 11:04 AM
mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
294

Remove the newline.

300

Never use 'size' for operation/block/region lists, it is not O(1). You should be able to use getSuccessor(1)->empty() instead.

372

This should already be checked by parseType, so no need to duplicate error handling. You should be able to replace uses of 'funcType' with 'type' directly.

440

nit: auto -> Value

459

You need to check this parse method for failure.

462

The || true seems off, what invariant are you trying to capture here? Also, it is good practice to explicitly wrap optional parse methods with 'succeeded' or 'failed' to make the intent clear.

478

You shouldn't emit error for these methods, an error will already be emitted.

mlir/lib/Target/LLVMIR/ConvertFromLLVMIR.cpp
523

Remove trivial braces. Also, please cache the end iterator to avoid recomputation.

mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
228

Is this error case captured by any of the tests? I can't seem to find where it's covered.

shraiysh updated this revision to Diff 239655.Jan 22 2020, 11:05 AM
shraiysh marked 11 inline comments as done.
shraiysh added inline comments.
mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
462

Actually I was trying to make parenthesis optional - but they had to be parsed if present. Hence they || true. Now, I realize that code in the current patch would lead to valid parsing of ( catch... (catch... (filter... also (which is wrong). So, in the new patch, I am making parenthesis compulsory.

mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
228

I have added tests for invalid IR in test/Dialect/LLVMIR/invalid.mlir. Also, I have added a verified for landingpad instruction

I realized that I had forgotten to add function personality in our dialect. I will be adding it in another patch, once this gets accepted.
Apart from that, I have tried to address all the all inline comments. Please let me know if anything has to be changed. Thanks.

rriddle added inline comments.Jan 26 2020, 2:19 AM
mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td
337

nit: just invoke the other build method after adding the 'callee' attribute.

mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
301

atleast -> at least

303

This sentence seems incomplete.

304

->front()?

338

Use /// for top-level comments.

399

nit: Use range based for loop instead: funcType.getInputs()

411

nit: llvm::makeArrayRef(operands).drop_front()

423

nit: Can you hoist these above the 'if (isDirect)' branch so that you can change that to early exit instead?

485

'ua' looks unused.

491

Are the 'catch'/'filter' truly optional? i.e. should we emit an error if neither are present?

mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
226

nit: constOperand

shraiysh updated this revision to Diff 240442.Jan 26 2020, 9:08 AM
shraiysh marked 12 inline comments as done.
shraiysh added inline comments.
mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
338

// was used for other functions in the file. Should I change it for all of them for uniformity?

423

addSuccessor internally is appending successor operands to result.operands and therefore, adding successor before adding operands leads to unexpected failures.

491

I have now added a check in the verifier.

shraiysh updated this revision to Diff 240445.Jan 26 2020, 9:12 AM
rriddle accepted this revision.Jan 29 2020, 11:41 PM
rriddle added inline comments.
mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
317

nit: This is unnecessary now.

438

This seems like something that should be attached as a note.

See https://mlir.llvm.org/docs/Diagnostics/ for more details

This revision is now accepted and ready to land.Jan 29 2020, 11:41 PM
shraiysh updated this revision to Diff 241396.Jan 30 2020, 3:38 AM
shraiysh marked 2 inline comments as done.

I do not have write access to the repository. If would be great if someone could commit for me.
I have updated the patch.

This revision was automatically updated to reflect the committed changes.