This is an archive of the discontinued LLVM Phabricator instance.

[mlir][AsmParser] Improve parse{Attribute,Type} error handling
ClosedPublic

Authored by rkayaith on Feb 25 2023, 9:21 PM.

Details

Summary

Currently these functions report errors directly to stderr, this updates
them to use diagnostics instead. This also makes partially-consumed
strings an error if the numRead parameter isn't provided (the
docstrings already claimed this happened, but it didn't.)

While here I also tried to reduce the number of overloads by switching
to using default parameters.

Diff Detail

Event Timeline

rkayaith created this revision.Feb 25 2023, 9:21 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 25 2023, 9:21 PM
rkayaith published this revision for review.Feb 25 2023, 10:31 PM
rkayaith added a reviewer: rriddle.
rkayaith added inline comments.
mlir/lib/AsmParser/DialectSymbolParser.cpp
317

switched this to true since I found the parser reads past the end of the buffer if it isn't null-terminated (e.g. StringRef("999", 1) gets parsed as 999 instead of 9)

mlir/test/Dialect/Linalg/transform-op-pad.mlir
108–109

"foo" is still an error, but it emits an extra diagnostic that can't be checked through expected-error:

foo:1:1: error: expected attribute value

(previously that error was just ending up on stderr)

rriddle accepted this revision.Feb 27 2023, 12:24 PM
rriddle added inline comments.
mlir/tools/mlir-linalg-ods-gen/mlir-linalg-ods-yaml-gen.cpp
320

Do you need the temp variable here? (after the RequiresNullTerminator change?)

This revision is now accepted and ready to land.Feb 27 2023, 12:24 PM
rkayaith added inline comments.Feb 27 2023, 12:39 PM
mlir/tools/mlir-linalg-ods-gen/mlir-linalg-ods-yaml-gen.cpp
320

Yea, all RequiresNullTerminator seems to do is enable an assert that checks that the input is null-terminated. Which ends up triggering here without the std::string copy.

rriddle added inline comments.Feb 27 2023, 12:41 PM
mlir/tools/mlir-linalg-ods-gen/mlir-linalg-ods-yaml-gen.cpp
320

Hmrm, we should likely change the API to better differentiate cases where the user knows there is a null-terminator, vs where one is needed. We have a bug related to this as well.

bixia added a subscriber: bixia.Mar 1 2023, 4:16 PM

This change breaks tensorflow test github.com/tensorflow/tensorflow/blob/master/tensorflow/compiler/mlir/tensorflow/tests/compile_mlir_util/mlir-module-serialized-str-attr.mlir

In particular, the parser returns error ":16:1: error: found trailing characters: ': return [[IDENTITY]] : tensor<?xi32> loc(unknown)" because it doesn't finish reading the trailing comments in the test input.
I couldn't figure out which part of this PR may cause this change in the parser.

I temporary fix the test by this change:
tensorflow/compiler/mlir/tensorflow/utils/tf_xla_mlir_translate.cc ====
388c388,391

< mlir::Attribute attr = mlir::parseAttribute(input, context);

// When the parser doesn't read all the input chars, it issues an error unless
// an output parameter is provided for returning the number of chars read.
size_t numRead;
mlir::Attribute attr = mlir::parseAttribute(input, context, {}, &numRead);

This change breaks tensorflow test github.com/tensorflow/tensorflow/blob/master/tensorflow/compiler/mlir/tensorflow/tests/compile_mlir_util/mlir-module-serialized-str-attr.mlir

In particular, the parser returns error ":16:1: error: found trailing characters: ': return [[IDENTITY]] : tensor<?xi32> loc(unknown)" because it doesn't finish reading the trailing comments in the test input.
I couldn't figure out which part of this PR may cause this change in the parser.

I temporary fix the test by this change:
tensorflow/compiler/mlir/tensorflow/utils/tf_xla_mlir_translate.cc ====
388c388,391

< mlir::Attribute attr = mlir::parseAttribute(input, context);

// When the parser doesn't read all the input chars, it issues an error unless
// an output parameter is provided for returning the number of chars read.
size_t numRead;
mlir::Attribute attr = mlir::parseAttribute(input, context, {}, &numRead);

If I understand correctly, the error here is an intentional change (previously the extra input would be silently ignored, which could hide potential errors). If you know you only want to parse part of the string, you can either trim the StringRef before passing it to parseAttribute, or pass numRead as you did in your fix.

bixia added a comment.Mar 2 2023, 10:38 AM

The input contains the "meaningfule string for string attr" and followed by comment lines. Do you know why the parser stop in the middle of a comment line?

The input contains the "meaningfule string for string attr" and followed by comment lines. Do you know why the parser stop in the middle of a comment line?

I'm not familiar with the parser internals, but that sounds like a bug to me. It would've existed before this change though, just hidden.

kuhar added a subscriber: kuhar.Mar 2 2023, 12:09 PM

Hi @rkayaith,

Seems like this does not allow for passing trimmed StringRefs to mlir::parseType anymore. These are not null-terminated Was this intentional?

Hi @rkayaith,

Seems like this does not allow for passing trimmed StringRefs to mlir::parseType anymore. These are not null-terminated Was this intentional?

From what I could tell, this is an existing requirement of the parser: https://reviews.llvm.org/D144804#inline-1398522

So you'd have to trim + copy to a null terminated string, e.g. std::string

Hi @rkayaith,

Seems like this does not allow for passing trimmed StringRefs to mlir::parseType anymore. These are not null-terminated Was this intentional?

From what I could tell, this is an existing requirement of the parser: https://reviews.llvm.org/D144804#inline-1398522

So you'd have to trim + copy to a null terminated string, e.g. std::string

Actually it's probably better if parseType just does this copy by default, patch for that here: https://reviews.llvm.org/D145182