This is an archive of the discontinued LLVM Phabricator instance.

[clang-cl, PCH] Implement support for MS-style PCH through headers
ClosedPublic

Authored by mikerice on May 9 2018, 11:12 AM.

Details

Summary

Implement support for MS-style PCH through headers.

This enables support for /Yc and /Yu where the through header is either
on the command line or included in the source. It replaces the current
support the requires the header also be specified with /FI.

This change adds a -cc1 option -pch-through-header that is used to either
start or stop compilation during PCH create or use.

When creating a PCH, the compilation ends after compilation of the through
header.

When using a PCH, tokens are skipped until after the through header is seen.

Diff Detail

Event Timeline

mikerice created this revision.May 9 2018, 11:12 AM
rnk edited reviewers, added: zturner, hans; removed: cfe-commits.May 10 2018, 3:40 PM
rnk added a subscriber: cfe-commits.
kimgr added a subscriber: kimgr.May 11 2018, 1:50 AM

When using a PCH, tokens are skipped until after the through header is seen.

I know this is what MSVC does (though in their case I think it's an artifact of their implementation), but I remember this being a source of mysterious bugs:

#define MY_SYMBOL 1
#include "stdafx.h"  // include pch

The #define would silently disappear, and the symbol would never be defined. I wonder if it would be worth adding a warning for this in SkipTokensUntilPCHThroughHeader?

Thanks for taking a look at the patch. I added a warning as suggested. I also fixed skipping so it ignores all directives except #include and now #define.

mikerice updated this revision to Diff 146513.May 13 2018, 8:17 AM

Added warning when macro is defined when skipping. Also fixed skipping so directives are completely ignored expect for #include and #define.

rnk removed a reviewer: rnk.May 16 2018, 11:37 AM
rnk added a subscriber: rnk.

@hans @thakis, do you remember how these flags are supposed to work? I've forgotten anything I ever knew about them...

kimgr added a comment.May 16 2018, 1:22 PM

I've done some terrible things with MSVC PCHs over the years, so I'll pile on some more questions:

  • Can you test what happens when you do clang-cl.exe /Yc stdafx.h /Tp stdafx.h, i.e. compile the header directly as C++ code and generate .pch from it? The normal MSVC modus operandi is to have stdafx.h included in all source files (with /Yu) and stdafx.cpp to generate it (with /Yc). That stdafx.cpp file serves no purpose, so I've played this trick to generate PCH from the header directly. If it works, it might also be useful for your tests.
  • I think the current skip-warning is over-specific -- what if I put an inline function before my PCH include? A global variable? A #pragma? Or am I misunderstanding what skipping does? It seems to me that any non-comment tokens before the PCH #include should raise a warning.
  • I find the "through header" terminology is a little hard to interpret, but I did find it on MSDN, so maybe it's well established.

I'll try to look more at this patch in context later.

  • Can you test what happens when you do clang-cl.exe /Yc stdafx.h /Tp stdafx.h, i.e. compile the header directly as C++ code and generate .pch from it? The normal MSVC modus operandi is to have stdafx.h included in all source files (with /Yu) and stdafx.cpp to generate it (with /Yc). That stdafx.cpp file serves no purpose, so I've played this trick to generate PCH from the header directly. If it works, it might also be useful for your tests.

It isn't always the case that the .cpp file serves no purpose. It's true that the MS project generators give you this setup but we've seen many projects over the years that use a regular source file with code in it to generate the PCH. The object file generated during PCH create can have real code and the compiler can take advantage of that fact by compiling less code in the compilation units that use the PCH (for example not instantiating some COMDAT data/functions that are know to be in the create object). This can speed up compilations significantly in some cases.

I am not sure about your command: clang-cl.exe /Yc stdafx.h /Tp stdafx.h. The space makes this a compile with just /Yc (no through header). This patch doesn't address /Yc without the through header. Something like clang-cl /Yc /Tpstdafx.h would probably work if/when that is implemented.

  • I think the current skip-warning is over-specific -- what if I put an inline function before my PCH include? A global variable? A #pragma? Or am I misunderstanding what skipping does? It seems to me that any non-comment tokens before the PCH #include should raise a warning.

Not any tokens. The through header "feature" compiles all the code in the file up to the through header. So normally you are used to seeing just comments and #include <stdafx.h>. That's one usage but your PCH can be created from a source like this:

#include <header.h>
#ifdef FOO
#define FOO1
#endif
#include <another.h>
#include <stdafx.h>

If the files using the PCH also have exactly this 'pch prefix' there is no problem and certainly no warning is expected. Some compilers also check that this 'prefix' matches and won't use a PCH if it doesn't match. The MS compiler does not do this but they do give a warning if a #define is seen that doesn't match the definition in the PCH (that may be new I hadn't noticed it before you mentioned it). It's a sign that the user is likely doing something wrong so a warning is probably warranted. We can do some simple checks here but I suppose the user is most interested in compilation speed when using a PCH.

  • I find the "through header" terminology is a little hard to interpret, but I did find it on MSDN, so maybe it's well established.

Sorry about the terminology. I've been responsible for emulating this PCH mechanism for many years so I may be one of the few that understands it. I did try to keep that terminology out of the diagnostics but I don't really have a better name for through header.

Thanks for taking a look at this.

Can you test what happens when you do clang-cl.exe /Yc stdafx.h /Tp stdafx.h, i.e. compile the header
directly as C++ code and generate .pch from it? The normal MSVC modus operandi is to have stdafx.h
included in all source files (with /Yu) and stdafx.cpp to generate it (with /Yc). That stdafx.cpp file serves
no purpose, so I've played this trick to generate PCH from the header directly. If it works, it might also
be useful for your tests.

It isn't always the case that the .cpp file serves no purpose. It's true that the MS project generators give you this setup but we've seen many projects over the years that use a regular source file with code in it to generate the PCH. The > object file generated during PCH create can have real code and the compiler can take advantage of that fact by compiling less code in the compilation units that use the PCH [...]

Cool, it's never occurred to me that the source file has any other purpose than including the pch header.

I am not sure about your command: clang-cl.exe /Yc stdafx.h /Tp stdafx.h. The space makes this a compile with just
/Yc (no through header). This patch doesn't address /Yc without the through header. Something like clang-cl /Yc
/Tpstdafx.h would probably work if/when that is implemented.

I no longer have access to an MSVC environment, so I can't spell out the exact command, but I was trying to phrase a command that generates a PCH from the through-header *without a distinct source file*. It's proved to be a useful trick. I didn't mean to omit the through header, let's try again:

cl.exe /Ycstdafx.h /TP stdafx.h

or if that doesn't work, a /FI switch may be required, I can't remember for sure what we did:

cl.exe /Ycstdafx.h /TP /FI stdafx.h stdafx.h

I think the current skip-warning is over-specific -- what if I put an inline function before my PCH include?
A global variable? A #pragma? Or am I misunderstanding what skipping does? It seems to me that any
non-comment tokens before the PCH #include should raise a warning.

Not any tokens. The through header "feature" compiles all the code in the file up to the through header.
So normally you are used to seeing just comments and #include <stdafx.h>. That's one usage but your
PCH can be created from a source like this [...]

Thanks for the example, I must have misunderstood how MSVC PCHs work. With that background, the warning seems fine.

I find the "through header" terminology is a little hard to interpret, but I did find it on MSDN, so maybe it's well established.

Sorry about the terminology. I've been responsible for emulating this PCH mechanism for many years so I may be one
of the few that understands it. I did try to keep that terminology out of the diagnostics but I don't really have a better
name for through header.

:) Sounds like it's well-established!

Ping. Still looking for a reviewer, mostly Lex and clang-cl driver changes.

Ping. This can significantly improve compile time for projects that use the common /Yc<header.h> /Yu<header.h> PCH mechanism.

yvvan added a subscriber: yvvan.Jun 18 2018, 11:02 PM
rnk added a comment.Jun 20 2018, 2:22 PM

@hans, think you'll have time to look at this with your recent dllexport PCH experimentation?

hans added a comment.Jun 20 2018, 11:58 PM
In D46652#1138375, @rnk wrote:

@hans, think you'll have time to look at this with your recent dllexport PCH experimentation?

Yes, I mean to look at it, marked it unread in my inbox and everything.

Godin added a subscriber: Godin.Jun 22 2018, 5:46 AM
hans added a comment.Jun 26 2018, 6:03 AM

This looks reasonable to me as far as I can tell. Thanks!

I think the expert on the driver side for this stuff is Nico, so hopefully he can take a look as well.

include/clang/Basic/DiagnosticLexKinds.td
412

Should probably just have a warn_ prefix.

include/clang/Driver/CC1Options.td
608

The "through header" terminology was new to me, and I didn't see it when browsing the MSDN articles about precompiled headers. The HelpText here probably isn't the right place, but it would be good if the term could be documented somewhere to make it clearer exactly what the behaviour is. It's not really obvious to me what "stop at this file" and "start after this file" means. I can guess, but it would be nice if it were more explicit :-)

include/clang/Lex/Preprocessor.h
2050

I would probably define this out-of-line like the other methods you added, but no big deal.

lib/Driver/ToolChains/Clang.cpp
1066–1067

In r335466, I added the -building-pch-with-obj flag. This is probably the right place to pass it when there's a /Yc flag.

lib/Lex/PPDirectives.cpp
1887

Returning here seemed surprising to me. Isn't just setting the flag and then carrying on as usual what we want?

lib/Lex/PPLexerChange.cpp
344

Maybe move this down to where it's first used?

lib/Lex/Preprocessor.cpp
583

I know some functions already do this, but I don't think repeating the comment from the .h file here is necessary, it's just another thing to keep in sync. Same for the other small methods below.

607

nit: curly on the previous line

test/PCH/pch-through2.cpp
4

What's s2t2 for? From what I see, there's only two pch files used in this test, so I would have gone with %t.1 and %t.2

mikerice updated this revision to Diff 153067.Jun 27 2018, 6:21 AM
mikerice marked 8 inline comments as done.

Thanks for the review. Updated based on comments.

include/clang/Driver/CC1Options.td
608

You definitely have to look hard at the MSDN docs to find mention of through headers. If you look at the documentation for /Yc and /Yu you can see some vague references. I think it may have been more prominent many years ago.

The MSDN page says "For /Yc, filename specifies the point at which precompilation stops; the compiler precompiles all code though(sic) filename..." https://msdn.microsoft.com/en-us/library/z0atkd6c.aspx

I'll look for a place to document this better in a comment at least.

lib/Lex/PPDirectives.cpp
1887

When skipping we always return since we don't want to include any headers encountered because they are in the PCH. When the 'through header' is seen we also return because it too is in the PCH. In that case we also set the flag to stop the skipping.

So with /Yu"hdr2.h":

#include "hdr1.h" <- in PCH
#include "hdr2.h" <- in PCH
#include "hdr3.h" <- next include processed normally.

That's because /Yc"hdr2.h" contains all the headers "through" hdr2.h (i.e. hdr1.h and hdr2.h).

hans accepted this revision.Jun 29 2018, 4:54 AM

This looks good to me.

include/clang/Driver/CC1Options.td
608

Thanks for the pointer. Yeah, a clear explanation in a comment somewhere would be really helpful. Another idea might be to have a little "precompiled headers" section in the clang-cl section of docs/UsersManual.rst. Maybe we could do a better job than MS at explaining it precisely? :-)

This revision is now accepted and ready to land.Jun 29 2018, 4:54 AM
mikerice added inline comments.Jul 5 2018, 9:57 AM
include/clang/Driver/CC1Options.td
608

Thanks for the review. In the last patch I updated the comment in PreprocessorOptions.h to better explain what a through header is. I like the idea of explaining this in UsersManual.rst. I'll work on that in the next few weeks.

This revision was automatically updated to reflect the committed changes.
This revision was automatically updated to reflect the committed changes.

Sorry about missing this. Looks great, thanks for doing this. This was a pretty big omission, glad to see it's done now :-)

I always imagined we might reuse the precompiled preamble stuff (Lexer::ComputePreamble() etc), but it's pretty clear as-is.

Also, were you planning on also adding support for the (filename-less version of) hdrstop pragma? After this change, that should probably be fairly straightforward.

And finally (sorry about all the mails), this should probably be mentioned in the release notes (docs/ReleaseNotes.rst in the clang repo) since it's a notable new feature :-)

And finally (sorry about all the mails), this should probably be mentioned in the release notes (docs/ReleaseNotes.rst in the clang repo) since it's a notable new feature :-)

I added this to releasenotes in r337381.

Also, were you planning on also adding support for the (filename-less version of) hdrstop pragma? After this change, that should probably be fairly straightforward.

Thanks for taking a look. I'll be going through our PCH tests at some point soon and will likely add support for this and any other interesting issues that remain.

And finally (sorry about all the mails), this should probably be mentioned in the release notes (docs/ReleaseNotes.rst in the clang repo) since it's a notable new feature :-)

I added this to releasenotes in r337381.

Thanks!

Also, were you planning on also adding support for the (filename-less version of) hdrstop pragma? After this change, that should probably be fairly straightforward.

Thanks for taking a look. I'll be going through our PCH tests at some point soon and will likely add support for this and any other interesting issues that remain.

Cool :-) I gave hrdstop a try (wip at https://reviews.llvm.org/D49496) but kind of got stuck and haven't had time to become unstuck. If you get to it first, maybe the wip patch is useful (or not).

It looks like your change did break using /P together with /Yu: https://bugs.llvm.org/show_bug.cgi?id=38508