This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Speed up live-in virtual register set computaion in GCNScheduleDAGMILive
ClosedPublic

Authored by vpykhtin on May 24 2019, 9:06 AM.

Details

Summary

On functions with large number of basic block live-in virtual register set computation on every BB in the scheduler takes very long time. Currently it has complexity:

C1 = O(NumVirtRegs * lg(averageLiveRangeSegmentsPerReg) * BBNumber).

This patch changes the complexity to:

C2 = O(NumVirtRegs * averageLiveRangeSegmentsPerReg * lg(BBNumber)).

For this purpose live-ins are calculated all at once for all BBs. BB's starting SlotIndexes are collected and sorted and then searched using binary seach from within LiveRange segments giving logarithm on BB number. This gives almost 3 times speedup on luxmark hotel scene.

Diff Detail

Repository
rL LLVM

Event Timeline

vpykhtin created this revision.May 24 2019, 9:06 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 24 2019, 9:06 AM
arsenm added inline comments.May 24 2019, 9:17 AM
include/llvm/CodeGen/LiveInterval.h
608–627 ↗(On Diff #201265)

This should be split into a separate patch

LGTM, but you need to split LiveInterval.h part in a parent review.

vpykhtin updated this revision to Diff 201286.May 24 2019, 10:46 AM

split LiveInterval.h change into different patch

This revision is now accepted and ready to land.May 24 2019, 10:50 AM
vpykhtin updated this revision to Diff 201538.May 27 2019, 9:00 AM

replaced std::vector with SmallVector, moved out of the loop.

This revision was automatically updated to reflect the committed changes.