This is an archive of the discontinued LLVM Phabricator instance.

make -frewrite-includes also rewrite conditions in #if/#elif
ClosedPublic

Authored by llunak on Jun 18 2019, 12:00 PM.

Details

Summary

Qt5 has a wrapper macro that makes __has_include simpler to use:

#ifdef __has_include
#  define QT_HAS_INCLUDE(x)             __has_include(x)
#else
#  define QT_HAS_INCLUDE(x)             0
#endif
#if QT_HAS_INCLUDE(<chrono>)
#  include <chrono>
#endif

The code handling this so far ignores macros entirely. This patch makes -frewrite-includes rewrite all #if/#elif conditions, which is more reliable and simpler than trying to handle __has_include specifically.

Diff Detail

Repository
rL LLVM

Event Timeline

llunak created this revision.Jun 18 2019, 12:00 PM
llunak edited the summary of this revision. (Show Details)Jun 18 2019, 12:17 PM

Luboš, can you please clarify what is wrong with the current Clang behavior? I'm just thinking about what should we do with nested macros.

The code uses a raw lexer, so it doesn't expand macros. Thus the piece of code quoted will not get the "#if QT_HAS_INCLUDE(<chrono>)" part rewritten. My specific use case is the Icecream distributed build tool which does remote builds in a chroot and so this #if will be false and the expanded contents of <chrono> will be ignored, leading to build errors (https://github.com/icecc/icecream/issues/471).

I can change the code to use 'while' instead of 'if' to handle nested macros, but I think such an approach should go only so far. My patch is simple and handles a realistic scenario, but if more complex macro handling is needed (which is a question), then IMO the code should be changed to do macro expansion properly. It was a long time ago when I wrote the code, but seeing that usage of Preprocessor::SetMacroExpansionOnlyInDirectives() I think such an expansion could be done similarly how handling macro expansion is handled in other directives - which IIRC is getting that information from PPCallbacks.

Just a thought: From the point of view of -frewrite-includes, it should technically do full/nested macro expansion right? But from the point of view of Icecream, that would defeat the purpose of using -frewrite-includes, since the goal is to embed all includes into a single source file, ship it to a remote node, and do the rest of the preprocessing there. Or is there a way to expand all the parts that may resolve to a __has_include but leave others as is (for later processing)? How does the code deal with #if FOO(123) && MACRO_HAS_INCLUDE(<foo.h>) eg?

From the Qt side, this patch is good enough, since we only use one level of indirection (to handle compilers without this feature), so I'm just thinking out loud to understand the problem better 😃

Icecream's usage of -frewrite-includes is not special, the purpose is always to merge everything necessary for the compilation into one source file that can be later compiled on its own as if the original file was compiled. The #include expansion done during -frewrite-includes will evaluate all the #if conditions etc. the same way they will be evaluated again during the actual compilation.

rsmith added a subscriber: rsmith.Jun 19 2019, 10:18 AM

Perhaps we should rewrite all #if-like directives to #if 0 or #if 1? Either that or we could emit pragmas indicating the values of later __has_includes. Special-casing macros of this specific form is at best a partial solution.

Perhaps we should rewrite all #if-like directives to #if 0 or #if 1?

If I understand your suggestion correctly, wouldn't the latter break this case?

#if QT_HAS_INCLUDE(<nope.h>)
#  include <nope.h>
#  define HAS_NOPE
#endif

#ifdef HAS_NOPE
void bar() { nope(); }
#endif

Perhaps we should rewrite all #if-like directives to #if 0 or #if 1?

I think that would work too and it'd be in fact a reliable simple solution.

llunak updated this revision to Diff 206070.Jun 21 2019, 2:01 PM
llunak retitled this revision from make -frewrite-includes handle __has_include wrapped in a macro to make -frewrite-includes also rewrite conditions in #if/#elif.
llunak edited the summary of this revision. (Show Details)

Implemented in the current version of the patch.

Patch generally looks good; just a minor concern about the output format.

clang/lib/Frontend/Rewrite/InclusionRewriter.cpp
487–490 ↗(On Diff #206070)

Adding an extra line here is going to mess up presumed line numbering. Also, we don't need parentheses around the constant 0 or 1. Perhaps instead we could prepend a #if 0 /* or #if 1 /* to the directive:

#if FOO(x) == \
  BAZ
#ifndef BAR

->

#if 1 /*#if FOO(x) == \
  BAZ*/
#if 0 /*#ifndef BAR*/

I don't think we really need the "evaluated by -frewrite-includes" part, but I have no objection to including it. The best place is probably between the /* and the #:

#if 1 /*evaluated by -frewrite-includes: #if FOO(x) == \
  BAZ*/
#if 0 /*evaluated by -frewrite-includes: #ifndef BAR*/

Patch generally looks good; just a minor concern about the output format.

Also, we don't need parentheses around the constant 0 or 1.

That was consistent with what the current code does, but I don't mind changing that.

Adding an extra line here is going to mess up presumed line numbering.

That's ok. The code already inserts extra lines and always fixes them up using line markers. And I think that's unavoidable due to reasons below.

Perhaps instead we could prepend a #if 0 /* or #if 1 /* to the directive:

That's fragile. Do it with something like "#if value /*blah*/ " and you'll have a problem with comment nesting. That is why all the changes are inserted between the two extra #if 0 / #endif lines. IIRC I thought about this for a while when designing -frewrite-includes and this is the best I came up with.
But you made me realize that the previous patch didn't handle this properly, so the current one has been fixed to be consistent there.

I don't think we really need the "evaluated by -frewrite-includes" part, but I have no objection to including it.

It's strictly speaking not necessary, but it's there to make it easier to identify those extra #if lines added by -frewrite-includes.

llunak updated this revision to Diff 207220.Jun 30 2019, 7:59 AM

Updated patch, removed ()'s around 1/0, fixed to use CommentOutDirective().

llunak updated this revision to Diff 207229.Jun 30 2019, 10:31 AM

Updated the patch again, commenting out just a part of #if didn't work either, so it seems there's no good way to comment out unwanted #if. This patch surrounds rewritten #if conditions by extra #if 0 #endif block to disable them, moreover it makes the conditions be part of an empty #if #endif block, so they do not break nesting.

I've also created https://reviews.llvm.org/D63979 to verify that the results in fact do compile.

rsmith added inline comments.Jul 12 2019, 12:59 PM
clang/test/Frontend/rewrite-includes-conditions.c
17 ↗(On Diff #207229)

Did you mean for this to be value1 < value2 rather than value2 < value2?

54–67 ↗(On Diff #207229)

I find this output pretty hard to read -- it's hard to tell what's an original #if / #endif that's been effectively commented out, and what's an added #if / #endif that's doing the commenting.

What do you think about modifying the original line so it's no longer recognized as a preprocessing directive? For instance, instead of:

#if 0 /* disabled by -frewrite-includes */
#if value1 == value2
#endif
#endif /* disabled by -frewrite-includes */
#if 0 /* evaluated by -frewrite-includes */
# 14 "{{.*}}rewrite-includes-conditions.c"
line3
#if 0 /* disabled by -frewrite-includes */
#if 0
#elif value1 > value2
#endif
#endif /* disabled by -frewrite-includes */
#elif 0 /* evaluated by -frewrite-includes */
# 16 "{{.*}}rewrite-includes-conditions.c"
line4
#if 0 /* disabled by -frewrite-includes */
#if 0
#elif value1 < value2
#endif
#endif /* disabled by -frewrite-includes */
#elif 1 /* evaluated by -frewrite-includes */
# 18 "{{.*}}rewrite-includes-conditions.c"
line5
[...]

you might produce

#if 0 /* rewritten by -frewrite-includes */
!#if value1 == value2
#endif
#if 0 /* evaluated by -frewrite-includes */
# 14 "{{.*}}rewrite-includes-conditions.c"
line3
#if 0 /* rewritten by -frewrite-includes */
!#elif value1 > value2
#endif
#elif 0 /* evaluated by -frewrite-includes */
# 16 "{{.*}}rewrite-includes-conditions.c"
line4
#if 0 /* rewritten by -frewrite-includes */
!#elif value1 < value2
#endif
#elif 1 /* evaluated by -frewrite-includes */
# 18 "{{.*}}rewrite-includes-conditions.c"
line5
[...]

(or whatever transformation you like that prevents recognition of a #if/#elif within the #if 0 block).

Also, maybe we could move the quoted directives inside their respective #if / #elif block, and only comment them out if the block was entered:

#if 0 /* evaluated by -frewrite-includes */
was: #if value1 == value2
# 14 "{{.*}}rewrite-includes-conditions.c"
line3
#elif 0 /* evaluated by -frewrite-includes */
was: #elif value1 > value2
# 16 "{{.*}}rewrite-includes-conditions.c"
line4
#elif 1 /* evaluated by -frewrite-includes */
#if 0
was: #elif value1 < value2
#endif
# 18 "{{.*}}rewrite-includes-conditions.c"
line5
[...]
llunak added inline comments.Jul 16 2019, 12:49 PM
clang/test/Frontend/rewrite-includes-conditions.c
17 ↗(On Diff #207229)

Yes, not that it'd matter in practice. I'll fix that in the next iteration of the patch.

54–67 ↗(On Diff #207229)

I think that's a matter of taste/opinion. I find your proposal visually harder to read, because although it saves one line, it also is incorrect syntax, it visually breaks if-endif nesting and is inconsistent with how -frewrite-includes does other rewriting. Moreover I think this will be so rarely examined by a human that it doesn't really matter. So unless you insist I'd prefer to keep it as it is.

BTW, the other related review I linked above, is it enough to mention it here, or should I do something more about it? I'm not sure how to pick reviewers if there's not an obvious one.

llunak updated this revision to Diff 220378.Sep 16 2019, 12:56 PM
  • updated to apply to current trunk
  • changed misleading condition in a test

I've noticed that the Developer Policy says that it's actually allowed to commit patches without approval for parts "that you have contributed or maintain". Given that I'm the author of -rewrite-includes I take it that it's ok if I commit this if there are no further comments.

Just for understanding what's going on here, I'm going to ask some stupid questions 😄

So, this will make -frewrite-includes do more work, to ensure that it not only covers the "top level" #include or __has_include case, but also __has_include in one or more levels of macros?

Does that effectively means it needs to do full preprocessing? Eg. to determine cases like this:

#ifdef FOO
#define BAR(x) __has_include(x)
#else
#define BAR(x) 0
#endif

Or do we draw the line there and only support __has_include in macros if those are not conditional themselves? Does it depend on whether -E is passed with -frewrite-includes?

Will this mean that cases like icecream and distcc that use -frewrite-includes will have fully preprocessed sources to send to the remote hosts (with correct line/column number when compilation fails), or should they still pass -D options during the remote compilation? I'm referring to this quote from the distcc manual:

run_second_cpp

If false, ccache will first run preprocessor to preprocess the source code and then on a cache miss run the compiler on the preprocessed source code instead of the original source code. This makes cache misses slightly faster since the source code only has to be preprocessed once. The downside is that some compilers won’t produce the same result (for instance diagnostics warnings) when compiling preprocessed source code.

A solution to the above mentioned downside is to set run_second_cpp to false and pass -fdirectives-only (for GCC) or -frewrite-includes (for Clang) to the compiler. This will cause the compiler to leave the macros and other preprocessor information, and only process the #include directives. When run in this way, the preprocessor arguments will be passed to the compiler since it still has to do some preprocessing (like macros).

How does ICECC_REMOTE_CPP work in this case?

So, this will make -frewrite-includes do more work, to ensure that it not only covers the "top level" #include or __has_include case, but also __has_include in one or more levels of macros?

Does that effectively means it needs to do full preprocessing?

This patch does not make -frewrite-includes do any more work. It already does a partial preprocessing run which evaluates all preprocessor directives. The patch just make better use of that information.

Does it depend on whether -E is passed with -frewrite-includes?

You cannot pass -frewrite-includes without -E.

Will this mean that cases like icecream and distcc that use -frewrite-includes will have fully preprocessed sources to send to the remote hosts (with correct line/column number when compilation fails), or should they still pass -D options during the remote compilation?

The patch does not change anything about the usage. -frewrite-includes still only expands all #include directives and that's it.

How does ICECC_REMOTE_CPP work in this case?

The same way as before. You preferably do nothing, which defaults to ICECC_REMOTE_CPP=1, which uses -frewrite-includes to send full source to the remote, which does full preprocessing and compilation and you get pretty much the same result as if compiling locally. Or you may go with ICECC_REMOTE_CPP=0, which will not do this and then you'll most likely get warnings caused by compiling already preprocessed source. As for ccache, I personally consider suggesting explicit passing of -frewrite-includes to the compilation as somewhat lame; if ccache requires it for the cpp run, it can silently add it under the hood the same way icecream does, or at least it can have it configurable the same as other options and not require messing with build options. I myself use ccache with CCACHE_DEPEND=1, which completely avoids any preprocessor usage.

rsmith accepted this revision.Sep 17 2019, 7:29 AM
This revision is now accepted and ready to land.Sep 17 2019, 7:29 AM

Thank you for the explanation! 🙂

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptSep 18 2019, 12:08 PM
Herald added a subscriber: llvm-commits. · View Herald Transcript