This is an archive of the discontinued LLVM Phabricator instance.

[MIR] Making MIR Printing, opt -dot-cfg, and -debug printing faster
ClosedPublic

Authored by rtereshin on Mar 5 2018, 8:57 PM.

Details

Summary

Value::printAsOperand has been scanning the entire module just to
print a single value as an operand, regardless being asked to print a
type or not at all, and regardless really needing to scan the module
to print a type.

It made some of the users of the method exceptionally slow on large
IR-modules (or large MIR-files with large IR-modules embedded).

This patch defers scanning a module looking for struct types, mostly
numbered struct types, as much as possible, speeding up those users
w/o changing any APIs at all.

See speedup examples below:

Release Build:

# 83 seconds -> 5.5 seconds
time ./bin/llc -start-before=irtranslator -stop-after=irtranslator \
  -global-isel -global-isel-abort=2 -simplify-mir sqlite3.O0.ll -o \
  sqlite3.O0.mir
# 133 seconds -> 6.2 seconds
time ./bin/opt sqlite3.O0.ll -dot-cfg -disable-output

Release + Asserts Build:

# 1096 seconds -> 553 seconds
time ./bin/llc -debug-only=isel -fast-isel=false -stop-after=isel \
  sqlite3.O0.ll -o /dev/null 2> err
# 95 seconds -> 5.5 seconds
time ./bin/llc -start-before=irtranslator -stop-after=irtranslator \
  -global-isel -global-isel-abort=2 -simplify-mir sqlite3.O0.ll -o \
  sqlite3.O0.mir
# 146 seconds -> 6.2 seconds
time ./bin/opt sqlite3.O0.ll -dot-cfg -disable-output

where sqlite3.O0.ll is non-optimized IR produced from
sqlite-amalgamation (http://sqlite.org/download.html), which is entire
SQLite3 implementation in a single C-file.

Benchmarked on 4-cores / 8 threads PCI-E SSD iMac running macOS

Diff Detail

Repository
rL LLVM

Event Timeline

rtereshin created this revision.Mar 5 2018, 8:57 PM
rtereshin edited the summary of this revision. (Show Details)Mar 5 2018, 9:00 PM
bogner accepted this revision.Mar 14 2018, 2:43 PM

Nice improvement! I have a couple of minor style/clarity comments, but LGTM with those addressed.

lib/IR/AsmWriter.cpp
501–505

Maybe move this comment up so it's a bit clearer that the sizes matching imply we've already built the index table?

507

range-for would be a bit more concise.

511–512

AFAICT this assert and the one inside the loop are checking the same thing. Do we need them both?

526

It's a bit surprising to null out the module to mark that we already ran this, but I suppose it's simpler than keeping an extra bool around just for that. Maybe add comments to point out what's happening with M, both here and by the member declaration?

This revision is now accepted and ready to land.Mar 14 2018, 2:43 PM
rtereshin added inline comments.Mar 14 2018, 3:01 PM
lib/IR/AsmWriter.cpp
501–505

Sure, will improve comments around here!

507

will do

511–512

Not quite, but I'm checking the uniqueness one totally wrong, I will fix it, good catch!

526

Sure, will do, thanks!

rtereshin updated this revision to Diff 139186.Mar 20 2018, 1:50 PM
rtereshin marked 3 inline comments as done.

Addressing the review comments

Still looks good - two very nitpicky style comments below for when you commit.

lib/IR/AsmWriter.cpp
508

Better to spell this "const auto &P"

510

I think you can avoid default constructing the item by using NumberedTypes.count(P.second) instead of operator[]. Of course, it really doesn't matter much in this case since the value's just an unsigned.

rtereshin added inline comments.Mar 21 2018, 12:13 PM
lib/IR/AsmWriter.cpp
510

NumberedTypes is a vector

This revision was automatically updated to reflect the committed changes.