With AMDGPU we end up with 208 passes to initialise, which makes
the loop over LastUser in PMTopLevelManager::setLastUser show up
quite high in CPU profiles. Since threading forces us to initialise
the context and passes for every shader, we hit this cost quite
often.
In a test that initialises a bunch of shaders and then exits, this
takes the runtime down from 2.3s to 2.2s, since MapVector is more
efficient to iterate over.
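
For illustration only, here is a rough sketch of why iterating a
MapVector-style container is cheap. SimpleMapVector and
retargetLastUser are hypothetical names made up for this note, not
the LLVM types or the actual patch; the sketch assumes the container
keeps its entries in a contiguous vector alongside a key-to-index
map, so a full scan like the one in setLastUser walks dense storage
instead of hash buckets. Passes are modelled as plain ints here.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a MapVector-like container: lookups go
// through Index, iteration goes straight over the dense Entries vector.
template <typename K, typename V>
class SimpleMapVector {
  std::unordered_map<K, std::size_t> Index; // key -> position in Entries
  std::vector<std::pair<K, V>> Entries;     // contiguous, insertion order

public:
  typedef typename std::vector<std::pair<K, V>>::iterator iterator;

  // Return the value for Key, default-constructing it on first use.
  V &operator[](const K &Key) {
    auto Result = Index.insert(std::make_pair(Key, Entries.size()));
    if (Result.second)
      Entries.push_back(std::make_pair(Key, V()));
    assert(Result.first->second < Entries.size());
    return Entries[Result.first->second].second;
  }

  iterator begin() { return Entries.begin(); }
  iterator end() { return Entries.end(); }
  std::size_t size() const { return Entries.size(); }
};

// Same shape as the hot loop described above: every entry whose last
// user is OldUser gets NewUser instead. Returns how many were changed.
template <typename Map>
std::size_t retargetLastUser(Map &LastUser, int OldUser, int NewUser) {
  std::size_t Changed = 0;
  for (auto &Entry : LastUser) {
    if (Entry.second == OldUser) {
      Entry.second = NewUser;
      ++Changed;
    }
  }
  return Changed;
}
```

The scan is still linear in the number of entries; the gain in this
sketch comes only from touching contiguous memory with no empty
slots to skip.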