This is an archive of the discontinued LLVM Phabricator instance.

[LiveDebugValues] Visit open var locs just once in transferRegisterDef, NFC
ClosedPublic

Authored by vsk on Feb 14 2020, 11:22 AM.

Details

Summary

For a file in WebKit, this brings the time spent in LiveDebugValues down
from 16 minutes to 2 minutes. The reduction comes from iterating the set
of open variable locations just once in transferRegisterDef. Post-patch,
the most expensive item inside of transferRegisterDef is a call to
VarLoc::isDescribedByReg, which we have to do.

Testing: I built LNT using the Os-g cmake cache with & without this
patch, then diffed the object files to verify there was no binary diff.

rdar://59446577

Diff Detail

Event Timeline

vsk created this revision.Feb 14 2020, 11:22 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 14 2020, 11:22 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
vsk added a comment.Feb 14 2020, 11:23 AM

(for reference, MachineCSE takes 4 minutes on this file...)

vsk updated this revision to Diff 244785.Feb 14 2020, 3:53 PM
vsk added a reviewer: NikolaPrica.
  • Exit early if Reg is undef.
jmorse accepted this revision.Feb 17 2020, 6:15 AM

LGTM,

/me rustles around some drawers,

I've uploaded D74706 which might be relevant to your interests. We've run into similar performance problems, but caused by hoards of DBG_VALUEs rather than by many clobbered registers, and indexing VarMap by machine location gave a significant compile-time performance improvement at the expense of some memory. I can prioritise getting it landed if you think it'd be useful.

Note that it doesn't address register masks, as we don't see many of those on x86.

llvm/lib/CodeGen/LiveDebugValues.cpp
935

Could we replace "inline" with "statically allocated" or something similar? Seeing "inline" here immediately makes me think of function-inlining (which is a relevant topic in LiveDebugValues), and confused me for a bit.

This revision is now accepted and ready to land.Feb 17 2020, 6:15 AM
vsk marked an inline comment as done.Feb 17 2020, 1:26 PM

LGTM,

/me rustles around some drawers,

I've uploaded D74706 which might be relevant to your interests. We've run into similar performance problems, but caused by hoards of DBG_VALUEs rather than by many clobbered registers, and indexing VarMap by machine location gave a significant compile-time performance improvement at the expense of some memory. I can prioritise getting it landed if you think it'd be useful.

This is very interesting, thank you for sharing. I think I explored a substantially similar approach, and decided that it wouldn't really work until we make iterating over VarLocSet /much/ faster. I've left comments in D74706.

llvm/lib/CodeGen/LiveDebugValues.cpp
935

Yes, I'll fix this before committing.

This revision was automatically updated to reflect the committed changes.