This is an archive of the discontinued LLVM Phabricator instance.

[LibTooling] Change Transformer's TextGenerator to a partial function.
ClosedPublic

Authored by ymandel on Apr 23 2019, 6:53 AM.

Details

Summary

Changes the signature of the std::function to return an Expected<std::string>
instead of std::string to allow for (non-fatal) failures. Previously, we
expected that any failures would be expressed with assertions. However, that's
unfriendly to running the code in servers or other places that don't want their
library calls to crash the program.

Diff Detail

Repository
rL LLVM

Event Timeline

ymandel created this revision.Apr 23 2019, 6:53 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 23 2019, 6:53 AM

Why would we consider this a legitimate failure, rather than a programming error?
Same argument could be made about any form of format-string-like functions, e.g. llvm::formatv or sprintf.
Yet, they return strings and not Expected<string> or their equivalent.

Why would we consider this a legitimate failure, rather than a programming error?
Same argument could be made about any form of format-string-like functions, e.g. llvm::formatv or sprintf.
Yet, they return strings and not Expected<string> or their equivalent.

It's not that it's a legitimate failure so much as an invalid argument -- that is, the caller caused it and therefore should handle the failure. For a standalone tool making the call, I'd argue they should treat it as a programming error and assert to crash. However, a server that takes the input from, say, a user will want to propagate that back to the user.

As for formatv and sprintf -- I think the difference is that this has a more dynamic design. Both of those take explicit arguments at the callsite, vs. TextGenerator which takes a MatchResult. But, that's just a detail -- the key thing is that we're trying to support a usecase where the input may be flawed and we want to fail gracefully. We do so at the cost of the added complexity of Expected<>.

Alternatives:

  1. add a separate validation function. But, this will complicate the design since we'd need to pass two functions rather than one. That is, TextGenerator would need to be a pair of functions.
  2. return an empty string on failure. This has the typical tradeoffs of using a sentinel value.

I'd argue it's the server's job to validate the inputs in that case.

The code that landed so far clearly looks like a C++ DSL to describe transformations of the source. While it can be used a dependency in the server-side, I don't see why doing user-input checking should be done by the library, rather than the server itself.
User input would have to be transformed into the calls of the C++ API somehow, that looks like a proper layer to do the validation.

I'd argue it's the server's job to validate the inputs in that case.

The code that landed so far clearly looks like a C++ DSL to describe transformations of the source. While it can be used a dependency in the server-side, I don't see why doing user-input checking should be done by the library, rather than the server itself.
User input would have to be transformed into the calls of the C++ API somehow, that looks like a proper layer to do the validation.

The problem is that validation can't* be done in the abstract. It has to be done with respect to a specific match result. Unfortunately, the server won't be layered directly on top of the call to the TextGenerator -- the rewrite rule is interposed between them. That is, the client of the TG is the rewriterule and that's where the validation has to happen, but the TG is opaque to the rewrite rule, so it can't hardcode that validation logic. So, we'd need to change TextGenerator to bundle a validation function and string generator. Is it still worth it in that case?

*"can't" is a bit strong. I could imagine a design which allows full analysis and validation of rewrite rules before they are executed, but it would be far more sophisticated than the current design.

The problem is that validation can't* be done in the abstract. It has to be done with respect to a specific match result. Unfortunately, the server won't be layered directly on top of the call to the TextGenerator -- the rewrite rule is interposed between them. That is, the client of the TG is the rewriterule and that's where the validation has to happen, but the TG is opaque to the rewrite rule, so it can't hardcode that validation logic. So, we'd need to change TextGenerator to bundle a validation function and string generator. Is it still worth it in that case?

*"can't" is a bit strong. I could imagine a design which allows full analysis and validation of rewrite rules before they are executed, but it would be far more sophisticated than the current design.

Could you provide more details into how the server app is layered? Specifically, what are the user inputs and how do they get translated into the transformer library calls?
I imagine their should be a syntax for textual representation of the generator, the representation should allow referring to the named match results. While parsing this representation, it should be possible to collect all matches of named nodes and make sure there are corresponding binding in the rewrite rule.

If the setup is different, I may be missing where the complexity comes from and I would love to learn about it.

ymandel updated this revision to Diff 197146.Apr 29 2019, 10:53 AM

Updated comment to more explicity describe motivation for new signature.

ymandel retitled this revision from [LibTooing] Change Transformer's TextGenerator to a partial function. to [LibTooling] Change Transformer's TextGenerator to a partial function..Apr 29 2019, 10:53 AM
ymandel updated this revision to Diff 197250.Apr 29 2019, 7:59 PM

Add test for new behavior.

In the process, tweak the handling of errors from TextGenerators in Transformer:
instead of printing to llvm::errs, we set the error in the AtomicChange.

ymandel updated this revision to Diff 197255.Apr 29 2019, 8:05 PM

Updates comments on Transformer to make explicit the error reporting.

ilya-biryukov added inline comments.Apr 30 2019, 3:28 AM
clang/include/clang/Tooling/Refactoring/Transformer.h
47 ↗(On Diff #197255)

NIT: maybe shorten a bit, still capturing the essence? Something like the following should be enough:

Note that \p TextGenerator is allowed to fail, e.g. when trying to access a matched node that was not bound.
Allowing this to fail simplifies error handling for interactive tools like clang-query.
58 ↗(On Diff #197255)

Maybe drop trivially successful? Does not seem to be super-important.

238 ↗(On Diff #197255)

s/it's/its

clang/lib/Tooling/Refactoring/Transformer.cpp
164 ↗(On Diff #197255)

Maybe follow a typical pattern for handling errors here (to avoid OrErr suffixes and an extra Err variable)? I.e.

auto Replacement = Edit.Replacement(Result);
if (!Replacement)
  return Replacement.takeError();
T.Replacement = std::move(*Replacement);
205 ↗(On Diff #197255)

This looks super-complicated.
Having Error in AtomicChange seems like a bad idea in the first place, why would we choose to use it here?

The following alternatives would encourage clients to handle errors properly:

  1. accept an Expected<AtomicChange> in our callback,
  2. provide a separate callback to consume errors.

WDYT about picking one of those two?

ymandel marked 2 inline comments as done.Apr 30 2019, 5:34 AM
ymandel added inline comments.
clang/lib/Tooling/Refactoring/Transformer.cpp
205 ↗(On Diff #197255)

Agreed! I was using setError on the assumption that it was the "standard" way to express errors. Given that it seems to be totally ignored otherwise, let's go with option 1. I'll update the revision.

ymandel updated this revision to Diff 197316.Apr 30 2019, 6:44 AM
ymandel marked 2 inline comments as done.

Addresses comments, including error handling style and signature of ChangeConsumer.

Updates testing code to use new ChangeConsumer signature.

ymandel marked 4 inline comments as done.Apr 30 2019, 6:51 AM
ymandel added inline comments.
clang/lib/Tooling/Refactoring/Transformer.cpp
164 ↗(On Diff #197255)

Here and elsewhere.

ilya-biryukov accepted this revision.Apr 30 2019, 9:13 AM

Thanks! LGTM

This revision is now accepted and ready to land.Apr 30 2019, 9:13 AM
This revision was automatically updated to reflect the committed changes.
ymandel marked an inline comment as done.
Herald added a project: Restricted Project. · View Herald TranscriptApr 30 2019, 9:47 AM
thakis added a subscriber: thakis.Apr 30 2019, 10:50 AM

This breaks a test: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap-msan/builds/12112/steps/check-llvm%20check-clang%20stage3%2Fmsan/logs/stdio

[----------] 1 test from TransformerTest
[ RUN      ] TransformerTest.NodePartNameDeclRefFailure
/b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm/tools/clang/unittests/Tooling/TransformerTest.cpp:66: Failure
Value of: MaybeActual
  Actual: false
Expected: true
Rewrite failed. Expecting: 
    struct Y {
      int operator*();
    };
    int neutral(int x) {
      Y y;
      int (Y::*ptr)() = &Y::operator*;
      return *y + x;
    }
  
[  FAILED  ] TransformerTest.NodePartNameDeclRefFailure (83 ms)
[----------] 1 test from TransformerTest (83 ms total)

Can you take a look?

This breaks a test: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap-msan/builds/12112/steps/check-llvm%20check-clang%20stage3%2Fmsan/logs/stdio

[----------] 1 test from TransformerTest
[ RUN      ] TransformerTest.NodePartNameDeclRefFailure
/b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm/tools/clang/unittests/Tooling/TransformerTest.cpp:66: Failure
Value of: MaybeActual
  Actual: false
Expected: true
Rewrite failed. Expecting: 
    struct Y {
      int operator*();
    };
    int neutral(int x) {
      Y y;
      int (Y::*ptr)() = &Y::operator*;
      return *y + x;
    }
  
[  FAILED  ] TransformerTest.NodePartNameDeclRefFailure (83 ms)
[----------] 1 test from TransformerTest (83 ms total)

Can you take a look?

Fixed in r359578.