This patch uses the same approach used in GCNScheduleDAGMILive::getBBLiveInMap,
getLiveRegMap has complexity O(NumVirtRegs * averageLiveRangeSegmentsPerReg * lg(NumBB))
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/Target/AMDGPU/SIFormMemoryClauses.cpp | ||
---|---|---|
280 | You seem to be assuming a single clause per block. I'd expect to handle this a full clause in a time, within a single block. |
llvm/lib/Target/AMDGPU/SIFormMemoryClauses.cpp | ||
---|---|---|
280 | Not quite, I just compute the live-in set for the first clause per BB to reset the RPTracker, then it is advanced to the next clause. |
llvm/lib/Target/AMDGPU/SIFormMemoryClauses.cpp | ||
---|---|---|
280 | Can you do this per block, instead of calculating this for every block? |
llvm/lib/Target/AMDGPU/SIFormMemoryClauses.cpp | ||
---|---|---|
280 | This is the whole point of doing that all at once:
which of SlotIndexes fall into Segments of the virtual register - that would mean the register is live at those SlotIndexes.
I now realize the complexity is not what I thought before, it should be (per register): O( min ( NumSegments * lg(NumSlotIndexes), NumSlotIndexes * lg(NumSegments) ) See getLiveRegMap, LiveRange::findIndexesLiveAt. |
llvm/lib/Target/AMDGPU/SIFormMemoryClauses.cpp | ||
---|---|---|
280 | Sorry I mean first instruction of a first clause per BB, following clauses are processed using 'advance' |
llvm/lib/Target/AMDGPU/SIFormMemoryClauses.cpp | ||
---|---|---|
286 | Can you add a comment explaining this? I'm still not following how you're doing it all at once, but subsequent clauses are handled later? |
You seem to be assuming a single clause per block. I'd expect to handle this a full clause in a time, within a single block.