The testcase in https://bugs.llvm.org/show_bug.cgi?id=32984 went from 3 minutes to 10 minutes of compile time after a change that made the LoopUnroll pass more aggressive (by increasing the threshold).
-time-passes reveals that a large part of the time is spent in PHI Elimination (in the backend).
My profiling shows that all the time in PHI elimination goes to llvm::LiveVariables::addNewBlock. This is because we keep the Defs/Kills registers in a SmallSet, and the lookup in llvm/ADT/SmallSet.h -> VIterator vfind(const T &V); is O(N).
Switching to a DenseSet reduces the time spent in the pass from 297 seconds to 97 seconds. Profiling still shows a lot of time is spent iterating the data structure, so I guess there's room for improvement.
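To make the cost difference concrete, here is a minimal sketch. It is not the actual LLVM code: LinearSet and HashedSet are hypothetical stand-ins built on the standard library for a linearly scanned small set and for a hashed set such as DenseSet, and countMembers is a made-up helper that mimics a loop querying the set once per register. With N elements in the set and M queries, the linear version does O(N*M) comparisons, while the hashed version does O(M) expected work.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a set whose lookup is a linear scan.
class LinearSet {
  std::vector<unsigned> Elems;

public:
  // O(N): walks the whole vector in the worst case.
  bool contains(unsigned V) const {
    return std::find(Elems.begin(), Elems.end(), V) != Elems.end();
  }
  // Returns true if V was newly inserted, false if it was already present.
  bool insert(unsigned V) {
    if (contains(V))
      return false;
    Elems.push_back(V);
    return true;
  }
};

// Hypothetical stand-in for a hashed set (std::unordered_set, not DenseSet).
class HashedSet {
  std::unordered_set<unsigned> Elems;

public:
  // O(1) expected: a single hash lookup.
  bool contains(unsigned V) const { return Elems.find(V) != Elems.end(); }
  // Returns true if V was newly inserted, false if it was already present.
  bool insert(unsigned V) { return Elems.insert(V).second; }
};

// Made-up helper: counts how many of the queried registers are in the set.
template <typename SetT>
std::size_t countMembers(const SetT &Set, const std::vector<unsigned> &Queries) {
  std::size_t Count = 0;
  for (unsigned Reg : Queries)
    if (Set.contains(Reg))
      ++Count;
  return Count;
}
```

Both classes expose the same two operations, so the helper above works unchanged with either one; only the per-query cost differs.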
I tried a SparseBitVector and it's slightly slower than the DenseSet solution. I wanted to try a BitVector but I'm not sure I have an upper bound on the number of elements.
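One way to sidestep the missing upper bound, sketched below purely as an illustration, is a bit set that grows on demand. GrowableBitSet is a hypothetical class built on std::vector<bool>, not llvm::BitVector, and it assumes the elements can be mapped to small, dense indices. That is an assumption: raw register numbers may not be dense, and a sparse index space would make this waste memory.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical bit set that grows whenever an index beyond the current
// size is set, so no upper bound has to be known up front.
class GrowableBitSet {
  std::vector<bool> Bits;

public:
  // Sets bit Idx, enlarging the storage if Idx is out of range.
  void set(std::size_t Idx) {
    if (Idx >= Bits.size())
      Bits.resize(Idx + 1, false);
    Bits[Idx] = true;
  }
  // Indices that were never reached are reported as unset.
  bool test(std::size_t Idx) const { return Idx < Bits.size() && Bits[Idx]; }
  // Number of bits currently tracked (largest index set so far, plus one).
  std::size_t size() const { return Bits.size(); }
};
```

Membership tests are a single indexed load, at the price of storage proportional to the largest index ever set rather than to the number of elements.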