Page MenuHomePhabricator

unique_ptrify ownership of ModuleFiles in the ModuleManager
Needs ReviewPublic

Authored by dblaikie on Oct 24 2014, 10:30 AM.

Details

Reviewers
rsmith
Summary

We could maintain ownership in 'Chain' rather than 'Modules', though that'll
require a little more reordering in removeModules (if it's OK to just move the
Chain loop/remove_if to after the victim loop, then that's easy)

The main uncertainty I have about this change is whether it's safe to move the
Modules.erase from the start of the victim loop (in removeModules) to the end.
Is that OK? Or do some of the operations in the mody of that loop rely on the
entry to have already been removed from 'Modules'?)

Diff Detail

Event Timeline

dblaikie updated this revision to Diff 15420.Oct 24 2014, 10:30 AM
dblaikie retitled this revision from to unique_ptrify ownership of ModuleFiles in the ModuleManager.
dblaikie added a reviewer: rsmith.
dblaikie updated this object.
dblaikie added a subscriber: Unknown Object (MLST).
rsmith edited edge metadata.Oct 24 2014, 1:18 PM

I think this is correct. I'm on the fence as to whether it's an improvement, due to the sort-of dual ownership of these pointers.

include/clang/Serialization/ModuleManager.h
34–37

Storing pointers to ModuleFiles that point into values owned by a unique_ptr worries me a little. Seems like there's a risk here of a dangling pointer getting left in Chain; I think it'd be more obvious this happened if the delete were explicit.

dblaikie added inline comments.Oct 26 2014, 11:37 AM
include/clang/Serialization/ModuleManager.h
34–37

Not sure how much more obvious it'd be - as it stands the likely problem is a memory leak (such as 220569 ) rather than an inconsistency between these two data structures.

At least if something's dangling it'll likely fail pretty quick, rather than silently leak... but yeah, tradeoffs.

This certainly isn't the only case where we have owning and non-owning data structures at the same scope (including member scope), so I don't think it's all that surprising, but I could be wrong. Certainly we could comment the members more clearly to indicate that the poniters of one point to the same objects as the unique_ptrs of the other and that these need to be kept in step (this latter property is probably worth describing in more detail regardless)

klimek added a subscriber: klimek.Oct 27 2014, 10:02 AM
klimek added inline comments.
include/clang/Serialization/ModuleManager.h
34–37

I'd vote for putting the unique_ptr into Chain, and having Modules point to that. I find that to be a pattern I'm using quite a bit in the new world, and at least for me personally having an explicit delete in there will help less than than the nicely visible ownership we get through unique_ptr.

dblaikie added inline comments.Oct 27 2014, 3:06 PM
include/clang/Serialization/ModuleManager.h
34–37

Yeah, I'm totally OK with having Chains as the owner if that's any more compelling. I forget why I picked Modules - but all evidence points to Chains being the "owner" in this code (its the data structure iterated over in the dtor, iterators to Chains are what're handled in removeModules, etc).

dblaikie added inline comments.Jan 22 2015, 11:17 AM
include/clang/Serialization/ModuleManager.h
34–37

Found some old git branches I had lying around including this one.

Turns out one reason making Chain the owner is a bit harder is that it's exposed in more places (ModuleManager exposes iterators to Chain through begin/end/etc). This goes into many places, including llvm Graph APIs (llvm/Support/GraphWriter, et al), other places like ModuleManager.cpp:186: llvm::SmallPtrSet<ModuleFile *, 4> victimSet(first, last);

So to have Chain as the owner, we'd probably need to implement an iterator adapter from iterator over unique_ptr<T> to iterator over T* rvalue (since there's no underlying T* to point to/reference). Not sure if that's worth it here, but we'll probably eventually want such an abstraction, so maybe this can wait until I find the time/inclination to implement such a thing.