This is an archive of the discontinued LLVM Phabricator instance.

[mlir][LLVMIR] Use a new way to verify GEPOp indices
ClosedPublic

Authored by myhsu on May 4 2022, 9:24 AM.

Details

Summary

Previously, GEPOp relies on findKnownStructIndices to check if a GEP index should be static. The truth is, findKnownStructIndices can only tell you a GEP index _might_ be indexing into a struct (which should use a static GEP index). But GEPOp::build and GEPOp::verify are falsely taking this information as a certain answer. Here is a counter example:

%sub_struct = type { i32, i8 }
%my_struct = type { %sub_struct, [4 x i32] }

define void @foo(%my_struct* %arg, i32 %idx) {
  %1 = getelementptr %my_struct, %my_struct* %arg, i32 0, i32 1, i32 %idx
  ret void
}

Which is a legit GEP, but the current verifier will bail out with error message "expected index 2 indexing a struct to be constant". One of the GEPOp::build functions also bailed out with a similar error message.

The solution presented here adopts a new verification scheme: When we're recursively checking the child element types of a struct type, instead of checking every child types, we only check the one dictated by the (static) GEP index value. We also combine "refinement" logics --refine/promote struct index mlir::Value into constants -- into the very
verification process since they have lots of logics in common. The resulting code is more concise and less brittle.

We also hide GEPOp::findKnownStructIndices since most of the aforementioned logics are already encapsulated within GEPOp::build and GEPOp::verify, we found little reason for findKnownStructIndices (or the
new findStructIndices) to be public.

Diff Detail

Event Timeline

myhsu created this revision.May 4 2022, 9:24 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 4 2022, 9:24 AM
myhsu requested review of this revision.May 4 2022, 9:24 AM
ftynse requested changes to this revision.May 4 2022, 9:33 AM

Thanks, something might be off with the check, but I'm not sure I agree with your current assessment. The verifier does recurse into structs - https://github.com/llvm/llvm-project/blob/ff8d0b338f48ab26919ac84aef2c79e1e1a20ef2/mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp#L406-L429. My another suspicion is that build methods are not matching your expectation, it would be helpful to have a MLIR-only test that demonstrates the problem by parsing IR, rather than building it (use the generic form to bypass parser checks). If this proves impossible, the issue is in the build method.

mlir/test/Target/LLVMIR/Import/verify-gep.ll
7 ↗(On Diff #427039)

Could we have have indices here plz?

Also, don't check for SSA value names, use something like %{{.*}} instead.

This revision now requires changes to proceed.May 4 2022, 9:33 AM
myhsu added a comment.May 5 2022, 10:57 AM

Thanks, something might be off with the check, but I'm not sure I agree with your current assessment. The verifier does recurse into structs - https://github.com/llvm/llvm-project/blob/ff8d0b338f48ab26919ac84aef2c79e1e1a20ef2/mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp#L406-L429. My another suspicion is that build methods are not matching your expectation, it would be helpful to have a MLIR-only test that demonstrates the problem by parsing IR, rather than building it (use the generic form to bypass parser checks). If this proves impossible, the issue is in the build method.

Yes I'm aware of recordStructIndices and its algorithm. But when it examines the enclosing element types (of an aggregate type), the next GEP index will be considered static if any of the element types is a struct. The key is that not every element types is a struct.
What LLVM verifier does is not only taking the type hierarchy into considerations, but also using the current GEP index value to see if a particular element type is a struct, which dictates whether the next GEP index should be static or not. This is what I think a better way.

I'll update my descriptions of this approach in the comments and the summary here though.

myhsu updated this revision to Diff 427392.May 5 2022, 11:02 AM
myhsu marked an inline comment as done.
myhsu edited the summary of this revision. (Show Details)
  • Test GEP indices in the translation test
  • Add MLIR-only test to test GEPOp::build
myhsu added a comment.May 5 2022, 11:10 AM

It turns out this patch fails this test in test/Dialect/LLVMIR/invalid.mlir:

func.func @gep_struct_variable(%arg0: !llvm.ptr<struct<(i32)>>, %arg1: i32, %arg2: i32) {
  // expected-error @below {{op expected index 1 indexing a struct to be constant}}
  llvm.getelementptr %arg0[%arg1, %arg1] : (!llvm.ptr<struct<(i32)>>, i32, i32) -> !llvm.ptr<i32>
  return
}

Which does contain an invalid GEP index. That means, we're too permissive with the new changes.

I'm incline to rewrite the algorithm used in GEPOp::findKnownStructIndices (and recordStructIndices) to adopt the better solution I mentioned previously.

ftynse added a comment.May 6 2022, 2:17 AM

I'm incline to rewrite the algorithm used in GEPOp::findKnownStructIndices (and recordStructIndices) to adopt the better solution I mentioned previously.

I think this would be the best way forward!

myhsu updated this revision to Diff 429569.May 15 2022, 3:32 PM
myhsu retitled this revision from [mlir][LLVMIR] Fix incorrect assumption on GEPOp indices to [mlir][LLVMIR] Use a new way to verify GEPOp indices.
myhsu edited the summary of this revision. (Show Details)

This is a major revision which verifies GEPOp indices using the index operand values rather than just the type. This can also solve multiple other problems on the way, for instance, refine/promote the struct index value into constant and check if the index is out of bound.

Note that I removed GEPOp::findKnownStructIndices because it's not used anywhere now and more importantly, it cannot give us precise answers on the validity of struct indices anymore.

ftynse accepted this revision.May 17 2022, 7:40 AM

Thanks!

mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
403

Nit: please add documentation to these classes.

423

Nit: MLIR uses camelCase for variable names, we would use os here.

568–571

Nit: I would spell out the auto here.

642

Ditto.

This revision is now accepted and ready to land.May 17 2022, 7:40 AM
myhsu updated this revision to Diff 430114.May 17 2022, 10:26 AM
myhsu marked 4 inline comments as done.
  • Prefix the name of our llvm::ErrorInfo classes with "GEP" and add some documentations (NFC).
  • Addressed minor issues.
This revision was landed with ongoing or failed builds.May 17 2022, 10:30 AM
This revision was automatically updated to reflect the committed changes.
zero9178 added inline comments.
mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
525

I am sorry I am responding this late after this has landed but was disallowing using static indices in dynamic positions an intention of this patch?

I've had a lot of GEPOp that consisted purely of constants and hence were always of the form llvm.getelementptr %ptr[0, 1]. Since updating to a newer MLIR revision including this patch I am now running into crashes when using the build method creating such GEPs that crash here as so far I always passed an empty range as indices.

This is surprising to me

  1. because this is a straight up crash and not a verifier failure or similar as any accesses into indices in this and the above methods are completely unchecked
  2. This previously worked without issues.

If the static indices should from now on only be allowed to be used for indexing into a struct that would be okay I guess, but it should probably be documented somewhere

myhsu added inline comments.Jul 8 2022, 2:02 PM
mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
525

Right, I don't see any problem using (integer) attribute on non-struct indices (other than the fact that the operand is called "structIndices" in the operation definition) so I think this is a bug. Could you help me file a bug on GitHub and tag me (mshockwave)?