Value::printAsOperand has been scanning the entire module just to
print a single value as an operand, regardless being asked to print a
type or not at all, and regardless really needing to scan the module
to print a type.
It made some of the users of the method exceptionally slow on large
IR-modules (or large MIR-files with large IR-modules embedded).
This patch defers scanning a module looking for struct types, mostly
numbered struct types, as much as possible, speeding up those users
w/o changing any APIs at all.
See speedup examples below:
Release Build:
# 83 seconds -> 5.5 seconds time ./bin/llc -start-before=irtranslator -stop-after=irtranslator \ -global-isel -global-isel-abort=2 -simplify-mir sqlite3.O0.ll -o \ sqlite3.O0.mir
# 133 seconds -> 6.2 seconds time ./bin/opt sqlite3.O0.ll -dot-cfg -disable-output
Release + Asserts Build:
# 1096 seconds -> 553 seconds time ./bin/llc -debug-only=isel -fast-isel=false -stop-after=isel \ sqlite3.O0.ll -o /dev/null 2> err
# 95 seconds -> 5.5 seconds time ./bin/llc -start-before=irtranslator -stop-after=irtranslator \ -global-isel -global-isel-abort=2 -simplify-mir sqlite3.O0.ll -o \ sqlite3.O0.mir
# 146 seconds -> 6.2 seconds time ./bin/opt sqlite3.O0.ll -dot-cfg -disable-output
where sqlite3.O0.ll is non-optimized IR produced from
sqlite-amalgamation (http://sqlite.org/download.html), which is entire
SQLite3 implementation in a single C-file.
Benchmarked on 4-cores / 8 threads PCI-E SSD iMac running macOS
Maybe move this comment up so it's a bit clearer that the sizes matching imply we've already built the index table?