Sorting is the slowest operation in an already slow pass.
Instead of sorting, just put the stall list into an ordered
map.
Repository: rG LLVM Github Monorepo
Event Timeline
llvm/lib/Target/AMDGPU/GCNRegBankReassign.cpp

| Lines | Comment |
|---|---|
| 107–129 | Is push_front more expensive if we are calling it for every candidate? |
| 457 | Make this an assertion combined with other change to collectCandidate suggested below? |
| 613 | Presumably we don't need to do this if we are not using weights at all (i.e. no sorting)? |

llvm/lib/Target/AMDGPU/GCNRegBankReassign.cpp

| Lines | Comment |
|---|---|
| 613 | Actually, since it is cheap, it makes sense to keep the loop depth weight. The operand forwarding part is expensive, but the sort is not. I will experiment tomorrow and bring back the sort and the MLI part, disabling only the operand scan part. |

Hm... In practice it is std::list::sort() that takes most of the time. Maybe it is viable to change the list to a vector of lists, where the vector is ordered by weight and each list holds candidates of equal weight.
In fact, the slowest part was the sorting. I have just changed the data structure to avoid sorting. The weight calculation itself turns out not to be that expensive.
This almost halves the time the pass takes.
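For readers following along, here is a minimal sketch of the "ordered map instead of sort" idea. It is not the code from this patch: `Candidate`, `CandidateBuckets`, `push`, `pop`, and `drainOrder` are hypothetical names chosen for illustration. Candidates are bucketed by weight in a `std::map` with a descending comparator, so the heaviest bucket is always at `begin()` and no explicit sort is required.

```cpp
#include <cassert>
#include <functional>
#include <list>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a reassignment candidate; not the pass's real type.
struct Candidate {
  std::string Name;
  unsigned Weight;
};

// Buckets candidates by weight. std::map keeps its keys ordered,
// so the container never needs an explicit sort.
class CandidateBuckets {
  // std::greater puts the largest weight at begin().
  std::map<unsigned, std::list<Candidate>, std::greater<unsigned>> Buckets;

public:
  // Appends the candidate to the bucket for its weight.
  void push(Candidate C) {
    unsigned W = C.Weight;
    Buckets[W].push_back(std::move(C));
  }

  bool empty() const { return Buckets.empty(); }

  // Removes and returns the oldest candidate from the heaviest bucket.
  Candidate pop() {
    assert(!Buckets.empty() && "pop() called on an empty container");
    auto It = Buckets.begin();
    Candidate C = std::move(It->second.front());
    It->second.pop_front();
    if (It->second.empty())
      Buckets.erase(It); // Drop empty buckets so begin() stays meaningful.
    return C;
  }
};

// Drains the container and returns the candidate names in visiting order.
std::vector<std::string> drainOrder(CandidateBuckets &B) {
  std::vector<std::string> Order;
  while (!B.empty())
    Order.push_back(B.pop().Name);
  return Order;
}
```

In this sketch each insertion costs one map lookup, logarithmic in the number of distinct weights, and candidates of equal weight keep their insertion order.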
Is push_front more expensive if we are calling it for every candidate?
Probably just always push_back?
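On the cost question: for `std::list`, both `push_front` and `push_back` are constant-time, so the choice mainly affects the order of candidates within a bucket. The sketch below only illustrates that ordering difference; `viaPushBack` and `viaPushFront` are hypothetical helpers, not functions from the patch.

```cpp
#include <list>
#include <vector>

// Builds a list with push_back: elements stay in insertion order.
std::vector<int> viaPushBack(const std::vector<int> &In) {
  std::list<int> L;
  for (int V : In)
    L.push_back(V);
  return std::vector<int>(L.begin(), L.end());
}

// Builds a list with push_front: elements end up in reverse insertion order.
std::vector<int> viaPushFront(const std::vector<int> &In) {
  std::list<int> L;
  for (int V : In)
    L.push_front(V);
  return std::vector<int>(L.begin(), L.end());
}
```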