This patch adds a benchmark for command-line round-tripping.
Below are the results of running command-line parsing, preprocessing and compilation of a minimal file (int main() { return 0; }) with 137 CC1 arguments (that's typical on macOS; on Linux, CC1 usually gets about half as many), in a release build with assertions enabled:
---------------------------------------------------------------------
Benchmark                              Time           CPU  Iterations
---------------------------------------------------------------------
BM_CompilerInvocationCreate/0     145905 ns     145882 ns        4768
BM_CompilerInvocationCreate/1     432513 ns     432354 ns        1622
BM_Preprocess/0                  1442563 ns    1442200 ns         489
BM_Preprocess/1                  1748370 ns    1748310 ns         393
BM_Compile/0                     2656841 ns    2656802 ns         263
BM_Compile/1                     2971966 ns    2970355 ns         231
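Command-line parsing is ~3x slower. That makes sense, given that we're doing the parse twice and also generating the original command line from CompilerInvocation. The absolute delta is small though: ~0.3 ms. Preprocessing is ~21% slower and compilation ~12% slower; in both cases the absolute delta is again ~0.3 ms, so the percentages mostly reflect how little work the minimal file needs.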
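For anyone who wants to re-derive these figures, here is a minimal sketch of the arithmetic. It assumes that the /0 rows are the runs without round-tripping and the /1 rows the runs with it, which is my reading of the table rather than something the benchmark output states. The names Timing, kMinimalFile, slowdownFactor, overheadPercent, deltaMs and printSummary are hypothetical helpers and are not part of the patch.

```cpp
#include <cassert>
#include <cstdio>

// One benchmark pair. Assumption: off_ns is the "/0" row (no round-trip)
// and on_ns is the "/1" row (round-trip enabled).
struct Timing {
  const char *name;
  double off_ns;
  double on_ns;
};

// Wall-clock "Time" values (ns) copied from the minimal-file results.
static const Timing kMinimalFile[] = {
    {"BM_CompilerInvocationCreate", 145905.0, 432513.0},
    {"BM_Preprocess", 1442563.0, 1748370.0},
    {"BM_Compile", 2656841.0, 2971966.0},
};

// How many times slower the round-trip run is than the baseline.
inline double slowdownFactor(const Timing &t) {
  assert(t.off_ns > 0 && "baseline time must be positive");
  return t.on_ns / t.off_ns;
}

// Relative overhead of round-tripping, in percent of the baseline.
inline double overheadPercent(const Timing &t) {
  return (slowdownFactor(t) - 1.0) * 100.0;
}

// Absolute overhead of round-tripping, in milliseconds.
inline double deltaMs(const Timing &t) {
  return (t.on_ns - t.off_ns) / 1e6;
}

// Prints factor, relative overhead and absolute delta for each pair.
inline void printSummary() {
  for (const Timing &t : kMinimalFile)
    std::printf("%-28s %.2fx  +%.1f%%  +%.3f ms\n", t.name,
                slowdownFactor(t), overheadPercent(t), deltaMs(t));
}
```

Calling printSummary() from a small driver prints one line per benchmark pair. Swapping in the numbers from another run only requires editing kMinimalFile.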
On a real-world compile of clang/lib/Frontend/CompilerInvocation.cpp with -O3, the command-line parsing time is naturally insignificant:
-----------------------------------------------------------------------
Benchmark                               Time            CPU  Iterations
-----------------------------------------------------------------------
BM_CompilerInvocationCreate/0      208180 ns      208155 ns        3352
BM_CompilerInvocationCreate/1      587957 ns      587869 ns        1183
BM_Preprocess/0                 403769607 ns   403672500 ns           2
BM_Preprocess/1                 405895925 ns   405802000 ns           2
BM_Compile/0                  22408258046 ns 22403926000 ns           1
BM_Compile/1                  22363847808 ns 22358900000 ns           1
Running check-clang and Clang's Frontend LIT tests doesn't show any measurable performance impact.
Not sure if we should do anything else here. I mostly cargo-culted this from clangd.