This is an archive of the discontinued LLVM Phabricator instance.

use a MapVector for pass manage LastUser
AbandonedPublic

Authored by airlied on Jun 25 2018, 7:21 PM.

Details

Reviewers
chandlerc
Summary

with AMDGPU we end up having 208 passes to initialise, this
makes the loop over LastUser in PMTopLevelManager::setLastUser
appear quite high on the CPU usage. Since we have to initialise
the context and pass for every shader due to threading, we
see this quite often.

In a test of initing a bunch of shaders and exiting, this takes
the runtime down from 2.3s to 2.2s since the MapVector has
more efficient iterating.

Diff Detail

Repository
rL LLVM

Event Timeline

airlied created this revision.Jun 25 2018, 7:21 PM
mareko added a subscriber: mareko.Jun 29 2018, 11:39 AM
chandlerc requested changes to this revision.Jul 6 2018, 7:08 PM

I really don't think this is the right approach.

First, this is fundamentally a more expensive data structure than DenseMap. I'm honestly surprised that *any* query pattern is actually faster with it, but I can understand why your particular one happens to be faster. But it does make every other code path a bit slower. It doesn't really seem like the right tradeoff for all of LLVM, especially given that this doesn't even help *that* much.

If you want to fix this, you should fix the bad algorithm in setLastUser. We already have an inverse map, so there should be some way that simply doesn't iterate over the entire thing in this way.

But honestly, I'm not sure its worth spending a lot of time tuning this. If you see lots of overhead due to these kinds of pass management infrastructure, I would *strongly* suggest working on moving to the new pass manager. It fundamentally doesn't have these pitfalls. So far no one has stepped forward to work on making this happen for the code generator, but it really does need to happen...

This revision now requires changes to proceed.Jul 6 2018, 7:08 PM
airlied abandoned this revision.Jul 9 2018, 4:27 PM

Thanks for the review, Chandler,

Yes it does appear as if the amdgpu usage pattern is a bit special here, I've spent some time getting patches into the mesa side to avoid recreating the pass managers so often and it removes the needs for this.

Dave.

foad added a subscriber: foad.Nov 30 2020, 3:53 AM

If you want to fix this, you should fix the bad algorithm in setLastUser. We already have an inverse map, so there should be some way that simply doesn't iterate over the entire thing in this way.

I've done this in D92309.