[YAML] Recover gracefully when deserializing invalid YAML input.
ClosedPublic

Authored by tulloch on Jul 30 2013, 3:37 AM.

Details

Summary
  1. See http://llvm.org/bugs/show_bug.cgi?id=16221 for a user's bug report.
  2. We weren't checking for Document::getRoot() returning NULL. This happens

in the case where the input YAML/JSON object is malformed.

  1. Document::getRoot() is passed to isa<NullNode>, which asserts on it's

argument being NULL.

  1. To fix this, we check for this return value, and set the error code

accordingly. We also modify the calling code to check the return value
of input::setCurrentDocument().

The added unit tests hit the assert in trunk, while passing on this branch.

Additionally, the command that reproduces this assert in the bug report:

llvm-build/Debug+Asserts/bin/clang-format  -style="{BasedOnStyle:llvm}" tools/clang/tools/clang-format/ClangFormat.cpp

now outputs http://llvm-reviews.chandlerc.com/P55

Diff Detail

tulloch updated this revision.Jul 30 2013, 5:45 PM

Removed braces around single-line conditional block.

LGTM

include/llvm/Support/YAMLTraits.h
956

Wouldn't this whole body be simpler as:

if (yin.setCurrentDocument())
  yamlize(yin, docMap, true);
return yin;
tulloch added inline comments.Jul 31 2013, 11:10 AM
include/llvm/Support/YAMLTraits.h
956

Yes, that would be simpler. I'll update.

lib/Support/YAMLTraits.cpp
53

This call can potentially output errors without allowing us to set the DiagHandler.

unittests/Support/YAMLIOTest.cpp
1393–1394

This unfortunately outputs an error message into the unit test logs, since Document.begin() is called before we setDiagHandler.

We can convert it to

a) optionally pass DiagHandler in the constructor,
b) lazily initialize the DocIterator,
c) explicitly initialize Input, etc.
d) just ignore it.

Any thoughts on this?

tulloch updated this revision.Aug 1 2013, 10:27 PM

Cleaned up control flow somewhat.

Does this handle empty input well (http://llvm.org/PR15927)? If so, please add a test.

Plus a few comments inline.

unittests/Support/YAMLIOTest.cpp
76

nit: add a space after the comma.

1393–1394

Option a) seems like the best solution here.

tulloch planned changes to this revision.Nov 13 2013, 9:06 AM

@alexfh, thanks for the comments (and the pointer to the other bug report). Will update the diff with your suggested changes.

unittests/Support/YAMLIOTest.cpp
76

Sure, good catch.

tulloch updated this revision.Nov 15 2013, 5:51 AM

Updated with reviewer comments, and added unit tests covering the empty string for the sequence/map with {required, optional} keys cases.

tulloch updated this revision.Nov 15 2013, 6:24 AM

Fixed spelling error in unit test naming.

Thanks for the updated patch! It looks good overall. Please see a few comments inline.

include/llvm/Support/YAMLTraits.h
692

nit: Please add spaces around assignments.

I'd also rename Handler to DiagHandler and rephrase the comment:
"The DiagHandler can be specified to provide alternative error reporting."

698–699

I don't see any need in this method, as it's only used directly after constructor (which can be wrong, if the constructor call generates errors on its own). So I suggest removing it.

I only could find usages of this method in two files:
unittests/Support/YAMLIOTest.cpp
tools/clang/tools/extra/clang-apply-replacements/lib/Tooling/ApplyReplacements.cpp

(the second one resides in a different branch of the repository, and can be updated separately after this patch gets in).

lib/Support/YAMLTraits.cpp
114

Is CurrentNode NULL only when the document is empty? Can it be NULL when parsing invalid documents?

146–147

nit: please put a period in the end of the comment.

unittests/Support/YAMLIOTest.cpp
1392

nit: in LLVM it's more common to write:
/*Ctxt=*/NULL,

tulloch updated this revision.Nov 17 2013, 3:45 AM

Updated with reviewer comments.

tulloch added inline comments.Nov 17 2013, 6:18 AM
lib/Support/YAMLTraits.cpp
114

It would be null in the case where the document is invalid, yes. In that case we would have set EC in setCurrentDocument(), and so wouldn't have reached this part of the method.

Would you like this comment to be clarified?

Looks good with one change in a comment.

Do you commit access? If not, I can commit the patch for you (and make the change in the comment myself).

lib/Support/YAMLTraits.cpp
114

Yes, the comment needs to be clarified. Mostly by removing the description of _what_ we're doing, and leaving only _why_, e.g. "CurrentNode is null for empty documents, which is an error in the presence of required keys." or something similar. What do you think?

@alexfh - thanks for reviewing the patch. I don't have commit access, and feel free to commit the patch (with whatever changes you feel like - the rewording you proposed is much clearer).

Thanks!

Andrew

lib/Support/YAMLTraits.cpp
114

I think your wording is much clearer, thanks!

alexfh accepted this revision.Nov 18 2013, 7:31 AM
alexfh closed this revision.Nov 18 2013, 7:56 AM

Committed revision 195016.