As noted by @nikic, D126061 causes a compile time regression of about
0.5% on -O0 builds:
This happens because, in a number of places, D126061 creates an
additional local variable of type ParsedAttributes. In the large
majority of cases, no attributes are added to this ParsedAttributes,
but it turns out that creating an empty ParsedAttributes, then
destroying it is a relatively expensive operation.
The reason for this is because AttributePool uses a TinyPtrVector as
its underlying vector class, and the technique that TinyPtrVector
employs to achieve its extreme memory frugality makes the begin() and
end() member functions relatively slow. The ParsedAttributes
destructor iterates over the attributes in its AttributePool, and this
is a relatively expensive operation because TinyPtrVector's begin() and
end() are relatively slow.
The fix for this is to replace TinyPtrVector in ParsedAttributes and
AttributePool with SmallVector. ParsedAttributes and
AttributePool objects are only ever allocated on the stack (they're
not part of the AST), and only a small number of these objects are live
at any given time, so they don't need the extreme memory frugality of
TinyPtrVector.
I've confirmed with valgrind that this patch does not increase heap
memory usage, and it actually makes compiles slightly faster than they
were before D126061.
Here are instruction count measurements (obtained with callgrind)
running clang -c MultiSource/Applications/JM/lencod/parsetcommon.c
(a file from llvm-test-suite that exhibited a particularly large
compile-time regression):
7acc88be0312c721bc082ed9934e381d297f4707
(baseline one commit before D126061 landed)
102,280,068 instructions
8c7b64b5ae2a09027c38db969a04fc9ddd0cd6bb
(the patch that landed D126061)
103,289,454 instructions
(+0.99% relative to baseline)
This patch applied onto
8c7b64b5ae2a09027c38db969a04fc9ddd0cd6bb
101,117,584 instructions
(-1.14% relative to baseline)