This is an archive of the discontinued LLVM Phabricator instance.

Move clang's ParseHelper into Support (llvm part)
Needs ReviewPublic

Authored by olista01 on Dec 7 2016, 3:42 AM.

Details

Reviewers
hans
AndyG
Summary

This moves clang's ParseHelper class (currently used to parse comments for the -verify option) into the Support library so that it can be used elsewhere in LLVM.

Diff Detail

Repository
rL LLVM

Event Timeline

olista01 updated this revision to Diff 80563.Dec 7 2016, 3:42 AM
olista01 retitled this revision from to Refactor clang's ParseHelper into Support (llvm part).
olista01 updated this object.
olista01 added reviewers: EricWF, AndyG, hans.
olista01 set the repository for this revision to rL LLVM.
olista01 added a subscriber: llvm-commits.
hans edited edge metadata.Dec 7 2016, 11:24 AM

This moves clang's ParseHelper class (currently used to parse comments for the -verify option) into the Support library so that it can be used elsewhere in LLVM.

What use do you have in mind?

I'd suggest renaming the title of the patch from "Refactor clang's ParseHelper into Support " to something like "Move clang's ParseHelper ..." since you're not so much refactoring it as moving it.

ParseHelper is currently in an anonmymous namespace inside VerifyDiagnosticConsumer, that is, it's an implementation detail of that class. The name makes sense in that context: it helps with the parsing in that file, but does the name make sense externally?

include/llvm/Support/ParseHelper.h
53

Does it make sense to have all of these as public members?

olista01 retitled this revision from Refactor clang's ParseHelper into Support (llvm part) to Move clang's ParseHelper into Support (llvm part).Dec 8 2016, 2:42 AM
olista01 edited edge metadata.

I'm going to use this for D27514, adding similar diagnostic verifying functionality to llvm-mc.

I think the name is still accurate, it is a helper class for writing simple parsers. Do you have a suggestion for a better name?

include/llvm/Support/ParseHelper.h
53

The current user of this class uses all of these except for End, because it needs to know the current position in the string for diagnostics, and to read back parts of the string found by SearchClosingBrace.

hans added a comment.Dec 8 2016, 3:05 PM

I'm going to use this for D27514, adding similar diagnostic verifying functionality to llvm-mc.

Interesting, that looks like cool functionality.

It seems unfortunate that not more code could be shared between llvm-mc and clang's -verify though. It looks like you're essientially implementing a large chunk of it again.

It would be really cool if we could build some kind of DiagnosticVerifier utility that would work for both clang and llvm-mc. And then this ParserHelper would be an implementation of that.

It would be really cool if we could build some kind of DiagnosticVerifier utility that would work for both clang and llvm-mc.

I did consider that, but unfortunately the clang version of this uses clang's own systems for diagnostics, source management etc, so I think this would only be able to share a small amount of code, at the cost of a lot of added complexity.

hans added a comment.Dec 9 2016, 10:39 AM

What worries me is that we're parsing a ton of different kinds of files all over LLVM, but this ParserHelper still seems only useful for parsing -verify things. (For example, SearchClosingBrace seems very specific, and exposing End, C, and P seems like a very specialized interface.) So there's kind of mismatch between the name and what it's for.

Would it really be that hard to extract a generally useful -verify utility? Something that parses a file and digs out the expected strings and their positions? That seems like it would be really useful.

hans added a comment.Dec 9 2016, 1:24 PM

Looking again at D27512, you're essentially re-implementing VerifyDiagnosticConsumer.cpp's ParseDirective (much of it with copy-paste). That seems really unfortunate. If we decide to change something in the -verify syntax, we'll now have multiple places to change. If there are bugs, there are multiple places to fix.

It must be possible to at least share the code that parses a comment and figures out what directives it contains, and ideally also the code that matches those directives against diagnostics.

AndyG edited edge metadata.Dec 12 2016, 9:23 AM

Scanning over what is being proposed here and in D27514, I have just a couple of comments.

Its seems, firstly, that some functionality of the -verify syntax is being lost in translation to the llvm-mc tool. The missing part that I noticed is the expected-no-diagnostics directive. This may be intentional, or it may be because the reason behind its inclusion may not have been apparent, which was to ensure that test-cases would not pass when -verify was selected but no diagnostics produced, and no directives were parsed. And the reason that this might happen is simply because the test-case forgot to parse a file at all -- see http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20120924/065007.html for the background that led to this being implemented in the first place.

I think this sort of thing also argues for a more holistic approach, as suggested by Hans, in bringing this feature up to the tool-wide level.

I think it would be really nice to implement a class that encapsulates both ParserHelper and ParseDirective. There are two ways then that I could see this working. One, and probably the easier, would be as a base abstract class which would step through the main syntax of the directives, deferring to methods implemented in a sub-class which would then do the actual handling and implementation. The first method of which would be bool handleDirectiveType(SourceMgr::DiagKind Kind) which in your implementation would simply store Kind for later use, but in the clang implementation, would pick the correct DirectiveList* for later use, and so on through out.

An alternative, and perhaps conceptually clearer, approach could be to implement ParserHelper/ParseDirective as an iterator object, so that the following sort of work-flow could be implemented (each VerifyDirective object holding a pre-processed directive):

for (VerifyDirective& VD : VerifyDirectiveParser(CommentText)) {
  switch (VD.getDiagnosticKind()) {
    case SourceMgr::DK_Error:
      ...
  }
  if (VD.isRegexDirective()) {
    ...
  }
  ...
}

I haven't thought through all the details of either implementation, for example how to handle syntax errors or the best way to handle finding diagnostics in external files, but maybe there is some food for thought here. Beyond that, I really don't have an major issue with your proposal, but like Hans, I think it would be nice if it were possible for more code to be shared between the two (current) implementations.

EricWF resigned from this revision.Dec 23 2016, 9:51 PM
EricWF removed a reviewer: EricWF.

Resigning as a reviewer since I don't feel comfortable reviewing generic LLVM patches.