This patch is an optimization for speed: whenever possible, it will avoid creating a new process for the -cc1 invocation, and instead will call the cc1 tool inside the calling process (clang tool).
On Windows, this has a moderate impact on build times (please see the timings in the comments below).
CFG (control flow guard) is disabled on Windows.
This patch also mildly improves build times on a Haswell 6-core Linux PC.
@Meinersbur reported no change in timings on a many-core Linux machine.
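To make the idea in the summary concrete, here is a minimal sketch of the dispatch: call cc1 in-process when an entry point is available, otherwise fall back to a new process. All names here (CC1ToolFn, runCC1, spawnStub, fakeCC1Main) are hypothetical and are not taken from the patch; the spawn path is a stub rather than a real process launch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical signature for an in-process cc1 entry point.
// The signature used by the actual patch may differ.
using CC1ToolFn = int (*)(const std::vector<std::string> &Args);

enum class RunMode { InProcess, NewProcess };

struct RunResult {
  RunMode Mode;
  int ExitCode;
};

// Stand-in for the real cc1 entry point (illustration only).
int fakeCC1Main(const std::vector<std::string> &Args) {
  return Args.size() > 2 ? 0 : 1; // pretend success if an input was given
}

// Stub for the slow path. A real driver would create a child process
// here; this sketch only returns a placeholder exit code.
int spawnStub(const std::vector<std::string> &) { return 0; }

// Run a command, preferring the in-process path when the command is a
// -cc1 invocation and an entry point was provided.
RunResult runCC1(const std::vector<std::string> &Args, CC1ToolFn InProcess) {
  assert(!Args.empty() && "expected at least the program name");
  const bool IsCC1 = Args.size() > 1 && Args[1] == "-cc1";
  if (IsCC1 && InProcess)
    return {RunMode::InProcess, InProcess(Args)};
  return {RunMode::NewProcess, spawnStub(Args)};
}
```

Passing a null pointer models a tool that does not link the cc1 code, in which case every invocation takes the new-process path.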
If you'd like to try this on your configuration, use the script below (Windows only) so that measurements are taken the same way everywhere:
Warning: the script pulls and reverts all local changes!
Please edit the script before running it, so that it uses the right filename when applying the patch.
The script can be used as follows:
# run the test (lengthy, hours!)
> powershell .\abba_test.ps1
By default, it'll do a 2-stage build, including running cmake with a fixed set of options, followed by AB/BA testing where it alternately rebuilds using Clang 10 with and without the patch, for at least 5 hours. If you want to change the number of hours it runs, you can do:
> powershell .\abba_test.ps1 50
Here, '50' is the number of hours it'll run. You can also skip building the two stages, if you have already built them, by setting the second parameter:
> powershell .\abba_test.ps1 50 $True
You then end up with something like this:
Total iterations: 6

     | Min               | Mean              | Median            | Max               |
A    | 00:11:55.8588259  | 00:12:11.5531275  | 00:12:09.5867878  | 00:12:30.2468040  |
B    | 00:09:35.1556958  | 00:09:52.5302980  | 00:09:55.2329476  | 00:10:04.4994028  |
Diff | -00:02:20.7031301 | -00:02:19.0228296 | -00:02:14.3538402 | -00:02:25.7474012 |
If you find this useful, I could convert it to a Python script and send it for review separately.
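For readers who want to reproduce the summary rows from their own timings, the following is a rough sketch of how the Min, Mean, Median and Max columns and the Diff row could be computed. It is not the PowerShell script itself: the names Summary, summarize and columnDiff are hypothetical, and times are assumed to be in seconds. The Diff row is taken as B minus A per column, which matches the sample output.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// One row of the summary: statistics over all iterations of a configuration.
struct Summary {
  double Min, Mean, Median, Max;
};

// Summarize the build times (in seconds) of one configuration.
// Takes the vector by value because it is sorted locally.
Summary summarize(std::vector<double> Seconds) {
  assert(!Seconds.empty() && "need at least one timing sample");
  std::sort(Seconds.begin(), Seconds.end());
  const std::size_t N = Seconds.size();
  const double Sum = std::accumulate(Seconds.begin(), Seconds.end(), 0.0);
  // For an even sample count, use the average of the two middle values.
  const double Median = (N % 2 == 1)
                            ? Seconds[N / 2]
                            : (Seconds[N / 2 - 1] + Seconds[N / 2]) / 2.0;
  return {Seconds.front(), Sum / static_cast<double>(N), Median,
          Seconds.back()};
}

// Column-wise difference B - A; negative values mean B was faster.
Summary columnDiff(const Summary &A, const Summary &B) {
  return {B.Min - A.Min, B.Mean - A.Mean, B.Median - A.Median, B.Max - A.Max};
}
```

Note that the Diff row compares the summary statistics of the two configurations; it is not a summary of per-iteration differences.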
It's not really a callback though. How about just "Pointer to the ExecuteCC1Tool function, if available"?
It would also be good if the comment explained why the pointer is needed.
And why does it need to be thread-local? Can different threads end up with different values for this?