[llvm] Make YAML serialization up to 2.5 times faster

Authored by kbobyrev on Aug 20 2018, 12:00 AM.


[llvm] Make YAML serialization up to 2.5 times faster

This patch significantly improves performance of the YAML serializer by
optimizing YAML::isNumeric function. This function is called on the
most strings and is highly inefficient for two reasons:

  • It uses Regex, which is parsed and compiled each time this function is called
  • It uses multiple passes which are not necessary

This patch introduces stateful ad hoc YAML number parser which does not
rely on Regex. It also fixes YAML number format inconsistency: current
implementation supports C-stile octal number format (01234567) which
was present in YAML 1.0 specialization (http://yaml.org/spec/1.0/),
[Section 2.4. Tags, Example 2.19] but was deprecated and is no longer
present in latest YAML 1.2 specification
(http://yaml.org/spec/1.2/spec.html), see [Section 10.3.2. Tag
Resolution]. Since the rest of the rest of the implementation does not
support other deprecated YAML 1.0 numeric features such as sexagecimal
numbers, commas as delimiters it is treated as inconsistency and not
longer supported. This patch also adds unit tests to ensure the validity
of proposed implementation.

This performance bottleneck was identified while profiling Clangd's
global-symbol-builder tool with my colleague @ilya-biryukov. The
substantial part of the runtime was spent during a single-thread Reduce
phase, which concludes with YAML serialization of collected symbol
collection. Regex matching was accountable for approximately 45% of the
whole runtime (which involves sharded Map phase), now it is reduced to
18% (which is spent in clang::clangd::CanonicalIncludes and can be
also optimized because all used regexes are in fact either suffix
matches or exact matches).

llvm-yaml-numeric-parser-fuzzer was used to ensure the validity of the
proposed regex replacement. Fuzzing for ~60 hours using 10 threads did
not expose any bugs.

Benchmarking global-symbol-builder (using `hyperfine --warmup 2
--min-runs 5 'command 1' 'command 2'`) tool by processing a reasonable
amount of code (26 source files matched by
clang-tools-extra/clangd/*.cpp with all transitive includes) confirmed
our understanding of the performance bottleneck nature as it speeds up
the command by the factor of 1.6x:

CommandMean [s]Min…Max [s]
this patch (D50839)84.7 ± 0.683.3…84.7
master (rL339849)133.1 ± 0.8132.4…134.6

Using smaller samples (e.g. by collecting symbols from
clang-tools-extra/clangd/AST.cpp only) yields even better performance
improvement, which is expected because Map phase takes less time
compared to Reduce and is 2.05x faster and therefore would significantly
improve the performance of standalone YAML serializations.

CommandMean [ms]Min…Max [ms]
this patch (D50839)3702.2 ± 48.73635.1…3752.3
master (rL339849)7607.6 ± 109.57533.3…7796.4

Reviewed by: zturner, ilya-biryukov

Differential revision: https://reviews.llvm.org/D50839

llvm-svn: 340154


kbobyrevAug 20 2018, 12:00 AM
Differential Revision
D50839: [llvm] Make YAML serialization up to 2.5 times faster
rG6f1740d52f1b: [SimplifyCFG] Replace some uses of bitwise or with logical or