With this patch I'm trying to solve the following problem:
I'm running llvm in multiple threads (each instance uses its private LLVMContext).
With a release-asserts build of llvm I got terrible multi-thread performance. I found that the problem is the mutex in PassRegistry::getPassInfo(), mainly called from PMDataManager::verifyPreservedAnalysis().
My first idea was to make an option to disable verifyPreservedAnalysis(), even in an asserts-build.
But I didn't like it because I don't want to give up this verification, and the mutex also causes overhead outside verifyPreservedAnalysis().
So I did the following:
I added a lock() function in PassRegistry which I call when I'm sure that all passes are registered.
After the PassRegistry is locked it can safely access its maps without using a mutex.
This completely solves the multi-thread performance problem.
The use of the lock() function is optional. If not used, nothing changes.
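To illustrate the idea, here is a minimal sketch of the pattern, not the actual patch. `ToyRegistry`, `registerInfo` and `getInfo` are hypothetical stand-ins for PassRegistry and its maps; the only point is that lookups skip the mutex once lock() has been called.

```cpp
#include <atomic>
#include <cassert>
#include <map>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for PassRegistry; not LLVM's actual class.
class ToyRegistry {
  mutable std::mutex Mutex;
  std::atomic<bool> Locked{false};
  std::map<int, std::string> Infos;

  // Plain map lookup; the caller decides whether the mutex is needed.
  const std::string *find(int ID) const {
    auto It = Infos.find(ID);
    return It == Infos.end() ? nullptr : &It->second;
  }

public:
  // Registration always takes the mutex and is only legal before lock().
  void registerInfo(int ID, std::string Name) {
    std::lock_guard<std::mutex> Guard(Mutex);
    assert(!Locked.load() && "registration after lock()");
    Infos.emplace(ID, std::move(Name));
  }

  // The caller promises that no further registrations will happen.
  void lock() {
    std::lock_guard<std::mutex> Guard(Mutex);
    Locked.store(true, std::memory_order_release);
  }

  // Once locked, the map is read-only, so concurrent readers need no mutex.
  const std::string *getInfo(int ID) const {
    if (Locked.load(std::memory_order_acquire))
      return find(ID);
    std::lock_guard<std::mutex> Guard(Mutex);
    return find(ID);
  }
};
```

If lock() is never called, every lookup still goes through the mutex, which matches the "nothing changes" behaviour described above.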
"Pedantic" comment: since your motivation is purely performance here, you may want to replace all uses of `locked`` with `locked.load(memory_order_consume)``. Maybe in a separate private method like that:
Especially since I don't think there is a guarantee that atomic<bool> is lock_free.
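A rough sketch of what such a helper could look like, with hypothetical names (`LockFlag`, `isLocked`, `isLockFree`) that are not taken from the patch. It uses `memory_order_acquire` rather than consume, on the assumption that mainstream compilers currently implement consume as acquire anyway, and it exposes `is_lock_free()` so the lock-free question can be checked on a given target instead of assumed.

```cpp
#include <atomic>

// Hypothetical flag holder; the member names are illustrative only.
class LockFlag {
  std::atomic<bool> Locked{false};

public:
  // Publish the "no more registrations" state to other threads.
  void lock() { Locked.store(true, std::memory_order_release); }

  // Single place where the memory ordering of the read is chosen.
  bool isLocked() const { return Locked.load(std::memory_order_acquire); }

  // Reports whether operations on the flag avoid an internal lock here.
  bool isLockFree() const { return Locked.is_lock_free(); }
};
```

Keeping the load in one method means the ordering can be changed later without touching every use of the flag.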