This is an archive of the discontinued LLVM Phabricator instance.

[preprocessor] When preprocessor option 'SingleFileParseMode' is enabled, parse all directive blocks if the condition uses undefined macros
ClosedPublic

Authored by akyrtzi on Jun 15 2017, 5:09 PM.

Diff Detail

Repository
rL LLVM

Event Timeline

akyrtzi created this revision.Jun 15 2017, 5:09 PM

Hello Argyrios,
This is a good addition to simplify reuse of preprocessor in IDEs. Thanks for doing this.

From our experience of integrating clang PP into NetBeans, the following change gives more flexibility:

  • introduce method in PPCallbacks and consult it what is the preferred value for undefined symbol

It will be similar to approach used for unresolved includes and FileNotFound method.
It also gives dynamic behavior for clients: client might still prefer to keep skipping some blocks like unix, mac, win, solaris, LP64, cplusplus, ...

Thanks,
Vladimir.

Hey Vladimir, what you are proposing is orthogonal to this patch. You are proposing for "the client to provide the value for an undefined identifier", and the patch is about the client not knowing what the value should be so it fallbacks to parsing all tokens to get the max amount of info. Note that both of the techniques can be combined well, if the client provides the value, the preprocessor will take it into account, otherwise if it is stays unresolved it will fallback to lexing all tokens.
But what you are proposing is not a replacement for what the patch is doing.

benlangmuir edited edge metadata.Jun 16 2017, 10:34 AM

It would be nice if the doc comment for the single file parse mode flag was updated to include this new functionality.

lib/Lex/PPDirectives.cpp
2709 ↗(On Diff #102761)

Nitpick: this is misaligned. Same with many other calls to pushConditionalLevel in this patch.

2774 ↗(On Diff #102761)

Why is wasSkipping false here?

akyrtzi added inline comments.Jun 16 2017, 11:19 AM
lib/Lex/PPDirectives.cpp
2774 ↗(On Diff #102761)

For #if it sets wasskip to true to indicate later for HandleElseDirective that the block should not be skipped.
But here (inside HandleElseDirective) setting wasskip doesn't really affect anything, the HandleEndifDirective doesn't check it. I chose to set it to false as indication that the #else block does get parsed.

benlangmuir added inline comments.Jun 16 2017, 11:23 AM
lib/Lex/PPDirectives.cpp
2774 ↗(On Diff #102761)

Thanks for explaining. That makes sense to me.

akyrtzi updated this revision to Diff 102859.Jun 16 2017, 12:28 PM

Enhanced doc-comment for the preprocessor option, and fixed indentation on a couple of calls.

Thanks, this looks good to me. I'd appreciate if @klimek could take a quick look though.

include/clang/Lex/PreprocessorOptions.h
102 ↗(On Diff #102859)

nitpick: should use the full words "maximum" and probably also "information" here

akyrtzi updated this revision to Diff 102885.Jun 16 2017, 3:03 PM

Improving doc comment.

klimek edited edge metadata.Jun 19 2017, 1:20 AM

Generally this patch lg from my side - how many patches for single file mode are coming down the road, though? I'm somewhat concerned about the overall complexity it'll add to clang.
When I saw your first patch my reaction was "wow, this works with this little change - cool".
With this patch it's "well, that doesn't seem to add too much complexity, so looks good".
I don't want to realize 5 patches down the line that we're sprinkling all of clang with single file conditional code without some discussion about the strategy with Richard.

include/clang/Lex/Preprocessor.h
1838–1841 ↗(On Diff #102885)

A short comment on the struct what the semantics are would be helpful.

Hey Vladimir, what you are proposing is orthogonal to this patch. You are proposing for "the client to provide the value for an undefined identifier", and the patch is about the client not knowing what the value should be so it fallbacks to parsing all tokens to get the max amount of info. Note that both of the techniques can be combined well, if the client provides the value, the preprocessor will take it into account, otherwise if it is stays unresolved it will fallback to lexing all tokens.
But what you are proposing is not a replacement for what the patch is doing.

I'm not sure :-)

What I find problematic in this patch is PPOpts->SingleFileParseMode checks.
Let's suppose we implement what I mentioned above => how is it going to co-exist nicely? I think code will be less understandable with both: check of flag and call to PPCallbacks.
What I propose is to move the logic from the PPOpts single flag into PPCallbacks. And from check of flag to query of PPCallbacks.
Of course when you create PPCallbacks you can consult global SingleFileParseMode to create default implementation to answer "any symbol is defined", so you get your use-case easily handled, but it also gives others more flexibility.

Thanks,
Vladimir.

how many patches for single file mode are coming down the road, though? I'm somewhat concerned about the overall complexity it'll add to clang.

There is no other patch for using this preprocessor option. Any related improvements down the road will be about general improvements for error recovery, for example things like this:

  • Introduce an UnresolvedTypename type instead of changing unresolved types to int.
  • For @interace A : B, don't completely drop B from the super-class list of A if it is unresolved.

These kind of improvements are not conditional.

What I find problematic in this patch is PPOpts->SingleFileParseMode checks.
Let's suppose we implement what I mentioned above => how is it going to co-exist nicely? I think code will be less understandable with both: check of flag and call to PPCallbacks.
What I propose is to move the logic from the PPOpts single flag into PPCallbacks. And from check of flag to query of PPCallbacks.
Of course when you create PPCallbacks you can consult global SingleFileParseMode to create default implementation to answer "any symbol is defined", so you get your use-case easily handled, but it also gives others more flexibility.

There is a bit of misunderstanding on what this patch is doing, the patch is not about answering "any symbol is defined". See my cfe-dev email (http://lists.llvm.org/pipermail/cfe-dev/2017-June/054254.html) that provides some more context.
The patch is doing these 2 things:

  • Keep track of whether a preprocessor directive condition used a symbol that was not resolved at all
  • If above is true then make the preprocessor parse all directive blocks, not arbitrary choose one of them.

Here's an example to clarify the difference:

Say that you have this code:

#if TARGET_IOS
@interface A @end
#else
@interface B @end
#endif
  1. Without any changes this will happen:
    • TARGET_IOS is unresolved so it will be treated as 0
    • #if block will be skipped, #else will be parsed
    • AST will contain @interface B
  1. With adding a PPCallback that answers "any symbol is defined" (what you are proposing):
    • TARGET_IOS is undefined but the PPCallback sets it to 1
    • #if block will be parsed, #else will be skipped
    • AST will contain @interface A
  1. With this patch and enabling SingleFileParseMode
    • TARGET_IOS is unresolved
    • Both #if and #else blocks will be parsed
    • AST will contain both @interface A and @interface B

As you can see 2. and 3. above are not overlapping so they can "co-exist nicely" like this:

  • TARGET_IOS is undefined, ask the PPCallback if it has a value for it or not
    • If the callback provides a value then parse as 1. or 2. depending on the value
    • If the callback cannot provide a specific value then parse as 3.
akyrtzi updated this revision to Diff 103066.Jun 19 2017, 9:50 AM

Provide doc-comments for struct DirectiveEvalResult.

akyrtzi updated this revision to Diff 103067.Jun 19 2017, 9:52 AM

Update doc-comment for Preprocessor::EvaluateDirectiveExpression

how many patches for single file mode are coming down the road, though? I'm somewhat concerned about the overall complexity it'll add to clang.

There is no other patch for using this preprocessor option. Any related improvements down the road will be about general improvements for error recovery, for example things like this:

  • Introduce an UnresolvedTypename type instead of changing unresolved types to int.
  • For @interace A : B, don't completely drop B from the super-class list of A if it is unresolved.

These kind of improvements are not conditional.

That's great! I was always hoping we'd get some diag improvements out of these :D

klimek accepted this revision.Jun 20 2017, 1:00 AM

oh, and lg from my side

This revision is now accepted and ready to land.Jun 20 2017, 1:00 AM

Here's an example to clarify the difference:

Thanks for the example. You are right, I missed this difference in patch.

This revision was automatically updated to reflect the committed changes.
cfe/trunk/lib/Lex/PPExpressions.cpp