I'm refactoring my Static Analyzer checker UninitializedObjectChecker, and I need to store lightweight polymorphic objects in llvm::ImmutableList. I wish to use std::unique_ptrs, but llvm::ImmutableList requires it's elements to be copy constructible for no good reason. This patch aims to fix this.
Details
Diff Detail
Event Timeline
Test coverage?
(& also, maybe rather than modifying concat's signature (& causing its
arguments to reverse in a way that might be confusing), just take the
argument by value and move into the destination?)
& does changing these signatures (of concat and add) not break any existing
code?
I strongly support argument order swap because it's consistent with other immutable data structures.
Not sure if i support turning add into some sort of emplace. I'd rather just restrict it to a move-constructor by accepting a single rvalue, because it'll make caller code easy to understand. Maybe add another emplace-like method if you need it. I think it's an important distinction to make.
This is used only in clang and there's D49986 and there should be no functional change here.
Not sure why didn't we ever write unit tests for these data structures. But it should more or less work as tested by clang lit tests.
Not sure I follow - which examples of argument order are you comparing here?
It looks like concat orders the arguments in the same way that the output would be (so concat(X, list) produces [X, list]) - so preserving that argument order seems like it improves/retains readability compared to switching them around so 'concat(list, X)' produces '[X, list]'.
This is used only in clang and there's D49986 and there should be no functional change here.
There's a functional change as described by the patch description - this data structure is getting new features/functionality and that can be unit tested. Also, it's best to test it in LLVM so that it doesn't get broken by LLVM developers who may not be compiling/testing Clang.
Not sure why didn't we ever write unit tests for these data structures. But it should more or less work as tested by clang lit tests.
Yeah, often the older corners don't have unit tests - but when they're improved/changed, it's a good chance to correct that & start putting in unit tests. The overhead/work involved in making the first unit test isn't high enough that I feel like it's a great imposition to ask the next developer to start it.
ImmutableMap and ImmutableSet both have add(Collection, Item) argument order, but ImmutableList writes the same thing as add(Item, Collection).
It looks like concat orders the arguments in the same way that the output would be (so concat(X, list) produces [X, list]) - so preserving that argument order seems like it improves/retains readability compared to switching them around so 'concat(list, X)' produces '[X, list]'.
Yeah, i guess that might have been the motivation behind such inconsistency. I'll be fine with fixing the order for other data structures.
Yup, that makes sense.
Added a new emplace method, and the rest of the factory methods now take the data argument by value.
I also added a unittest file. It does contain code unrelated to this patch, but since the file didn't exist, I though it's okay to hit two birds with one stone.
On a somewhat unrelated note, is there a need for concat to be public? Also, to me, when I think if the word concatenation, I would first think that it would work similar to std::list<T>::append().
I ran into a serious problem. It seems like ImmutableList doesn't run the destructor for std::unique_ptrs or std::shared_ptrs:
struct Obj { ~Obj() { llvm_unreachable(""); } // never called }; using ObjRef = std::unique_ptr<Obj>; // same with std::shared_ptr namespace llvm { // Specializing FoldingSetTrait so we can store ObjRef objects in ImmutableList. template <> struct FoldingSetTrait<ObjRef> : public DefaultFoldingSetTrait<ObjRef> { static void Profile(const ObjRef &FN, FoldingSetNodeID &ID) { ID.AddPointer(FN.get()); } }; } // end of namespace llvm TEST_F(ImmutableListTest, UniquePtrTest) { ImmutableList<ObjRef>::Factory f; ImmutableList<ObjRef> L = f.create(ObjRef(new Obj)); }
What is very interesting though, a simple wrapper works perfectly:
struct ObjRef { Obj *ptr; ObjRef(Obj *ptr) : ptr(ptr) {} ~ObjRef() { delete ptr; } Obj *get() const { return ptr; } }; // llvm_unreachable is actually reached in Obj::~Obj().
I've been stuck on this issue for a while now. Do you know anything about why this could happen?
requires it's elements to be copy constructible for no good reason
It seems like ImmutableList doesn't run the destructor for std::unique_ptrs
Seems there was a reason: and ImmutableList and its members are assumed to be a POD (plain old datatype), which do not need destructors.
E.g. see the code at the bottom of ImmutableList.h:
template <typename T> struct isPodLike; template <typename T> struct isPodLike<ImmutableList<T>> { static const bool value = true; };
@NoQ might want to correct me here, but I'm not sure how achievable is your use case, or whether it makes sense.
The point of functional ImmutableList is that you can copy them by value at O(1) cost everywhere.
If your list contains unique_ptr you can no longer copy it at all without violating unique_ptr semantics.
Would it make more sense to store references in ImmutableList for your use case instead?
My guess is, in your wrapper code the destructor for the temporary ObjRef at the end of the full-expression f.create(ObjRef(new Obj)) deletes the object, and the copy of ObjRef within the immutable list now contains a dangling pointer to the object (from which it won't be deleted again because immutable list doesn't call destructors; your code also doesn't dereference the pointer).
Your custom wrapper doesn't define any reasonable copy/move constructors, so a default copy is used.
But generally, yeah, i guess the main problem with putting non-POD objects into the immutable list is that you cannot easily make the factory call destructors when it dies because the allocator within it is only good in providing chunks of memory, not tracking them.
Putting smart pointers into an immutable list doesn't make it non-POD or harder to copy, but we are still unable to recall what we need to destroy when time comes.
Thanks for the quick and well detailed replies!
Oh wow. Thank you so much for pointing this out, I was stuck on this one for days. However, to me, it seems like, T doesn't need to be llvm::isPodLike type, just std::is_trivially_destructible.
Would it make more sense to store references in ImmutableList for your use case instead?
Yes, dynamic memory management has a way too great overhead, and since my algorithm relies on recursion, using references should be fine. Great idea!
In any case, tests are great, I really appreciate those, but I think we have established that the overall direction here does not make too much sense (or would require much more invasive changes)
- Moved tests to D50646
- Changed ordering back to how it was
- Added a static_assert, so only trivially destructible types can be stored in ImmutableList.
I still don't see why we wouldn't like to remove the copy constructability restriction, to me the changes for it seem relatively non-invasive. However, can we at least add static_assert(std::is_trivially_destructible<T>::value, ...) to maybe save a developer from falling in the same pit?
Re-uploaded with unittests.
include/llvm/ADT/ImmutableList.h | ||
---|---|---|
172 | I intentionally left it like this, because one can just std::move to it. That does achieve the same thing, right? |
include/llvm/ADT/ImmutableList.h | ||
---|---|---|
172 | In my understanding, one creates an extra copy, and one does not, but my c++fu is not that strong for templates+references. @NoQ might clarify |
@Szelethus I'm still not sure why do you pass by value instead of using a universal reference.
include/llvm/ADT/ImmutableList.h | ||
---|---|---|
172 |
Nah, it doesn't achieve the same thing. Reference is always a reference, passing it into a function is always free-of-charge, i.e. never invokes a constructor. Also std::move doesn't invoke move-constructor on its own, it simply marks the object as movable, i.e. move is simply an lvalue-to-xvalue-cast. But initialization of a value-type argument is a non-elidable constructor, which will be copy-constructor if the object is not movable and move-constructor if the object is movable. So if you omit && for Head, you are losing the benefits of a free-of-charge pass-by-reference and instead have to deal with an actual constructor call. |
Shouldn't we have && here as well for Head?