This is an archive of the discontinued LLVM Phabricator instance.

[PATCH/RFC] Use a BumpAllocator for DIEs
Needs ReviewPublic

Authored by friss on Jan 20 2015, 11:28 AM.

Details

Summary

First of all, although this patch passes the testsuite, I don't
think it is ready to go in, at least not without a thorough
discussion of its downsides.

What the patch does is move away from using unique_ptr<>s to
track DIE object memory and allocate most of them using a
BumpAllocator. Making releasing the memory for the DIE tree
basically free. (DIEBlocks and DIELocs are still allocated using
the standard allocator, but they were already using separate
bookkeeping, so this doesn't complicate the patch).

I'm going down this road because the teardown of the DIE tree
uses a lot of CPU time, especially in dsymutil's usecase. In
my prototype, the DIE destructors take ~40 seconds to run for
DIE trees worth of 700MB of debug infomration. That's quite a
lot especially considering that dsymutil's runtime on this kind
of data should be around 2 minutes. I really need to find an
acceptale solution for dsymutil's usecase.

This patch achieves the performance gains that I expect, but there
is one big downside: it limits the number of attributes a DIE can
accept to a fixed maximum (well technically, if you remove the
assert in the code, it will work, but it will leak the attribute
vectors going over the limit).

Imposing this limit on users seems too strong as a constraint.
Especially considering not all frontends are in-tree. I can imagine
more flexible solutions involving complicated allocation strategies,
but I wanted to ask for opinions before I work on something that I
fear would end up much more complicated than this patch. Maybe
there's an easier solution that I haven't seen, happy to hear
suggestions.

BTW, the runtime difference is definitely measurable on a clang
debug build also, although the gained seconds do not represent
40% of the runtime like in dsymutil's case. On my machine, a clang
debug build goes from:

real 14m38.308s
user 45m26.856s
sys 3m53.519s

without the patch, to:

real 14m19.762s
user 44m14.438s
sys 3m45.520s

Shaving more than one minute of user time and 15 to 20 seconds from
wall-clock time.

Diff Detail

Event Timeline

friss updated this revision to Diff 18447.Jan 20 2015, 11:28 AM
friss retitled this revision from to [PATCH/RFC] Use a BumpAllocator for DIEs.
friss added reviewers: dblaikie, echristo.
friss updated this object.
friss added a subscriber: Unknown Object (MLST).
friss updated this revision to Diff 18851.Jan 27 2015, 2:08 PM

This update removes the limitation to a fixed maximum number of
attributes per DIE. The patch is very similar to the first one that
I sent for discussion. It only adds a pointer to a vector of DIEs
as arguement to DIE::addValue(). If a DIE overflows its inline storage
for attributes, it is added to the vector. The vector is then iterated
just before the BumpPtrAllocator is destroyed to call the destructors
of all these DIEs.

This approach gives the most flexibilty as it doesn't impose the
memory management upon the user (for example DIEHashTest.cpp continues
to use stack allocated DIEs and it works fine).

I've been playing with this for the last few days, and I have quite
a few different implementations lying around (for example using
std::vectors with a custom stateful allocator). They are all more
complicated and don't perform as well as this patch.

There's one point that I'd like to mention: the patch unifies the
existing bookkeeping of the DIEBlock and DIELoc objects and treats
them the same as standard DIEs. This means that we will call ~DIE
on DIEBlock and DIELoc objects which will result in a 'partial'
destruction (DIEBlock inherits both from DIEValue and from DIE).
I think this is fine. It should do exactly what we want and just
call the destructors of the SmallVectors in the DIE part of the
object, but I wanted to mention it in case someone thinks its not
legal to do that.

dblaikie added inline comments.Mar 10 2015, 11:46 AM
include/llvm/CodeGen/DIE.h
135

This might be able to be private inheritance? (just handle the DIEs by unique_ptr and then pass them into addChild)

Hmm, I suppose it'd need to friend the ilist instantiation & maybe other things, for this to be private inheritance?

157

I've a bit of an aversion to intrusive lists - so I wonder if you'd be willing to revisit the issues in the comment above (including the issue about why list<DIE> was insufficient - picking up from where I left off to see if we could reasonably rely on post-insertion pointer validity without too much convolution?).

Actually why move to ilist? These objects will never have their dtors run, right? So you could just switch from vector<unique_ptr<DIE>> to vector<DIE*> - everything's non-owning now, since their memory is managed by the BumpPtrAllocator entirely?

220

If we're going to go with ilist, could we improve/overload ilist's push_back to take unique_ptr for clarity?

lib/CodeGen/AsmPrinter/DwarfUnit.cpp
67

Doesn't seem like it'd be too important to allocate the unit die with the DIEValueAllocator, does it?

I guess if you don't, then it can end up in the list of DIEs to deallocate, and ends up double-deleted? Again, this would be the sort of subtlety it might be best to avoid.

lib/CodeGen/AsmPrinter/DwarfUnit.h
116

This seems a bit subtle - having the SmallVector's dtors not be run unless they happen to go over the limit & reallocate. Could we use the existing BumpPtrAllocator as the allocator to the SmallVectors? Or something similar. (perhaps we discussed this, it's been a while)

unittests/CodeGen/DIEHashTest.cpp
626

Again, bit subtle - in the real use case this is non-owning because the parent's dtor is never run, etc, etc. But in the test case it is owning because the parent's dtor is run?

Hmm, that raises another question - once a DIE is added to the "DIEs to delete" list, what stops it from deleting its children DIEs? (since it's a member std::ilist in there) possibly leading to excess deletes (inefficiencies you're trying to remove) or double/triple/multiple deletes if children are in the "DIEs to delete" list too?

friss added inline comments.Mar 10 2015, 12:14 PM
include/llvm/CodeGen/DIE.h
135

Could try it, though I don't think I've seen that pattern anywhere else.

157

If we use list<DIE>, we'd need to use a std custom allocator that wraps the BumpPtrAllocator which would complicate things quite a bit. I could try it though. The whole point of the ilist here is that allocating a DIE allocates everything that's needed for the list bookkeeping.

Re: vector<DIE*>, you'd leak the vector storage if you do not call the destructor on every DIE.

220

I can try that.

lib/CodeGen/AsmPrinter/DwarfUnit.cpp
67

It's not important from a performance standpoint, but if I find the logic easier to follow if the rule is "allocate everything DIE related on the BumpPtrAllocator". If we allow this on to be a an exception to this rule, we must then handle its destruction specifically (as you point out, removing it from the DIEsToDelete list, but also making sure it drops its children and does not delete them).

lib/CodeGen/AsmPrinter/DwarfUnit.h
116

SmallVector don't accept configurable allocators AFAICS. I tried going with a std::vector with a custom allocator that wrapped a mix of recycliing + BumpPtrAllocator, but it didn't perform well.

I agree it's subtle, but it's the best tradeoff I could find.

unittests/CodeGen/DIEHashTest.cpp
626

This approach basically gives you the choice of ownership. If you wanna use something like a BumpPtrAllocator, then you use the DIEsToDelete trick to avoid leaking memory.

If you wanna use a more 'traditional' memory management scheme, it's also possible like in this test. The DIEs are still owned by their parents through the ilist of children, which means calling the root destructor will destroy everything.

This is handled in the BumpPtrAllocator by preventing the DIEs from the DIEsToDelete array to delete their children (look for clearAndLeakNodesUnsafely() in ~DwarfUnit). Yeah, subtle again...

dexonsmith resigned from this revision.Oct 6 2020, 3:35 PM