Analagous to the std::make_(unqiue|shared)_for_overwrite added in c++20.
If T is POD, and the container gets larger, any new values added wont be initialized.
This is useful when using SmallVector as a buffer where its planned to overwrite any potential new values added.
If T is not POD, new (Storage) T functions identically to new (Storage) T() so this will function identically to resize(size_type).
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
I wonder if this can be tested, something like:
V.push_back(5); V.pop_back(); V.resize_for_overwrite(V.size() + 1); EXPECT_EQ(5, V.back()); V.pop_back(); V.resize(V.size() + 1); EXPECT_EQ(0, V.back());
llvm/include/llvm/ADT/SmallVector.h | ||
---|---|---|
466 | This whitespace change seems unrelated; can you commit separately? |
Was thinking of a good way to test what is essentially undefined behaviour, how will this work under msan??
llvm/include/llvm/ADT/SmallVector.h | ||
---|---|---|
466 | It's not strictly unrelated, but it was a clang format artefact |
Given that we own SmallVector and destroy_range is a no-op, is it undefined behaviour?
If the problem is the call to new (&*) T;, then can we add a function in SmallVectorTemplateCommon (or whatever it is that's specialized for PODs) called uninitialized_construct or something that's definitely a no-op?
llvm/include/llvm/ADT/SmallVector.h | ||
---|---|---|
466 | Hmm... is it possible that git-diff gives clang-format a different diff than Phab is showing? Phab's version doesn't have this statement changing except for the whitespace... if clang-format is getting the same diff and formatting the line, that sounds like a bug in clang-format; alternatively, I wonder if you ran clang-format when the patch was different and at the time it looked like the statement had changed? |
If the call to new (&*) T; is tripping sanitisers up, I'm happy to keep that behaviour. After calling resize_for_overwrite it should be required that you write to any newly allocated items before you read them, Explicitly making it a no-op will hide that testing route.
For the record I can't seem to run gtest under msan, there seems to be a false-positive use-of-uninitialized-value occurring in basic_string::push_back, obviously not related to this change.
llvm/include/llvm/ADT/SmallVector.h | ||
---|---|---|
466 | Thats what I mean when I say an artefact from clang-format. I often format as I go. |
Ish. We have used msan_* and asan_* functions to annotate LLVM's allocators so that uses of memory that hasn't been allocated into the pool, but not assigned to any user of the allocator can be detected (or memory used after it's returned to the allocator). We can/possibly should add such annotations to SmallVector and I think it could catch bugs like this.
LGTM. I'm hopeful we can somehow keep the tests once we instrument SmallVector for the sanitizers (see my comment below); even if not, I suppose we can drop the tests at that time.
When that happens, will there be a way to update this testcase, maybe to something like this?
V.push_back(5); V.pop_back(); V.resize_for_overwrite(V.size() + 1); if (!is_sanitizer_poisoning_pop_back()) EXPECT_EQ(5, V.back());
or:
V.resize_for_overwrite(V.size() + 1); if (is_sanitizer_poisoning_pop_back()) EXPECT_TRUE(is_sanitizer_poison(V.back())); else EXPECT_EQ(5, V.back());
llvm/include/llvm/ADT/SmallVector.h | ||
---|---|---|
466 | Ah, right, thanks. | |
llvm/unittests/ADT/SmallVectorTest.cpp | ||
346 ↗ | (On Diff #312924) | Nit: period at end of sentence. |
357 ↗ | (On Diff #312924) | Nit: period at end of sentence. |
We can detect MSAN using
#if LLVM_MEMORY_SANITIZER_BUILD // We have msan, don't run tests. #else <The test code> #endif
If these tests are causing issues under msan, then just don't run them.
As an added bonus if the tests are failing under msan, that would mean that msan will help catch bugs when people abuse this by not writing to the new storage before reading.
This whitespace change seems unrelated; can you commit separately?