This speeds up linking chrome.dll with PGO instrumentation by 13%
(154271ms -> 134033ms).
LLVM's Option library is very slow. In particular, it allocates at least
one large-ish heap object (Arg) for every argument. When PGO
instrumentation is enabled, all the __profd_* symbols are added to the
@llvm.used list, which compiles down to /INCLUDE: directives in the
.drectve section. This means we have O(#symbols) directives to parse, so
we end up allocating an Arg for every function symbol in the object file.
This is unnecessary.
To address the issue and speed up the link, extend the fast path that we
already have for /EXPORT:, which has similar scaling issues.
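As a rough illustration of what such a fast path can look like, here is a standalone sketch. It is not the code from this change: `ParsedDirectives`, `matchFast`, and `parseDirectives` are hypothetical names, the tokenizer splits on whitespace only and ignores quoting, and `std::string_view` stands in for LLVM's string types. The point is that /INCLUDE: and /EXPORT: tokens are recognized by a cheap prefix check and never reach the general option parser.

```cpp
#include <cctype>
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical result container; the names are illustrative only.
struct ParsedDirectives {
  std::vector<std::string_view> includes; // symbol names from /INCLUDE:
  std::vector<std::string_view> exports;  // payloads from /EXPORT:
  std::vector<std::string_view> rest;     // tokens left for the general parser
};

// Returns true if `s` starts with `lowerPrefix`, ignoring the case of `s`.
// `lowerPrefix` must already be lowercase.
static bool startsWithInsensitive(std::string_view s,
                                  std::string_view lowerPrefix) {
  if (s.size() < lowerPrefix.size())
    return false;
  for (std::size_t i = 0; i < lowerPrefix.size(); ++i)
    if (std::tolower(static_cast<unsigned char>(s[i])) != lowerPrefix[i])
      return false;
  return true;
}

// If `tok` looks like "/name:value" or "-name:value" (any case), stores
// `value` in `out` and returns true. `lowerName` is e.g. "include:".
static bool matchFast(std::string_view tok, std::string_view lowerName,
                      std::string_view &out) {
  if (tok.empty() || (tok[0] != '/' && tok[0] != '-'))
    return false;
  if (!startsWithInsensitive(tok.substr(1), lowerName))
    return false;
  out = tok.substr(1 + lowerName.size());
  return true;
}

// Splits `s` on whitespace (an assumption: quoting is not handled) and
// routes /INCLUDE: and /EXPORT: tokens around the general parser.
ParsedDirectives parseDirectives(std::string_view s) {
  ParsedDirectives result;
  std::size_t pos = 0;
  while (pos < s.size()) {
    // Skip whitespace before the next token.
    while (pos < s.size() && std::isspace(static_cast<unsigned char>(s[pos])))
      ++pos;
    if (pos == s.size())
      break;
    // Find the end of the token.
    std::size_t end = pos;
    while (end < s.size() && !std::isspace(static_cast<unsigned char>(s[end])))
      ++end;
    std::string_view tok = s.substr(pos, end - pos);
    pos = end;

    std::string_view value;
    if (matchFast(tok, "include:", value))
      result.includes.push_back(value); // fast path, no per-argument object
    else if (matchFast(tok, "export:", value))
      result.exports.push_back(value);  // fast path, no per-argument object
    else
      result.rest.push_back(tok);       // everything else goes the slow way
  }
  return result;
}
```

The returned views point into the input string, so the directive text has to outlive the result. Only the tokens in `rest` would still pay the per-argument allocation described above.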
I promise that I took a hard look at optimizing the Option library, but
its data structures are very general and would need a lot of cleanup. We
have accumulated lots of optional features (option groups, aliases,
multiple values) over the years, and these are now properties of every
parsed argument, when the vast majority of arguments do not use these
features.
Curious: how many includes do you have per .drectve section? Is it worth calling .reserve() somehow before inserting? MS-STL grows the std::vector buffer geometrically, so if a .drectve section has many tokens, we would probably allocate and move memory several times per section. I think calling includes.reserve(tokensNum) and exports.reserve(tokensNum) up front is better than paying the cost of reallocation, even if we waste a little extra memory. Unless you do two-step parsing.
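To make the suggestion concrete, here is a minimal sketch, assuming the section has already been split into tokens. `collectIncludes` is a hypothetical helper, not an lld function, and it matches the lowercase `/include:` prefix only, to keep the example short. The token count is used as an upper bound for a single up-front reserve.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Appends the symbol named by every "/include:" token to `includes`.
// Hypothetical helper: exact-case prefix match only, for brevity.
void collectIncludes(const std::vector<std::string_view> &tokens,
                     std::vector<std::string_view> &includes) {
  constexpr std::string_view prefix = "/include:";
  // tokens.size() is an upper bound on the number of new entries, so the
  // push_back calls below cannot trigger a reallocation.
  includes.reserve(includes.size() + tokens.size());
  for (std::string_view tok : tokens)
    if (tok.substr(0, prefix.size()) == prefix)
      includes.push_back(tok.substr(prefix.size()));
}
```

If `includes` accumulates entries across many sections, a reserve on every call can itself reallocate each time, so whether this helps overall would need measuring on a real link.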