Details

Reviewers

dblaikie
mehdi_amini
silvas
dexonsmith

Commits

rG5d10b8ad595d: [ADT] Add resize_for_overwrite method to SmallVector.

Summary

Analogous to the std::make_(unique|shared)_for_overwrite added in C++20.
If T is POD and the container gets larger, any new values added won't be initialized.
This is useful when using SmallVector as a buffer where it's planned to overwrite any potential new values added.
If T is not POD, new (Storage) T functions identically to new (Storage) T(), so this will function identically to resize(size_type).
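
As a rough illustration of the distinction the summary draws, here is a minimal sketch of a fixed-capacity buffer. It is not the SmallVector implementation: TinyBuffer, Seven, slot and resizeImpl are hypothetical names invented for this example, and growth past the capacity is simply clamped. The only difference between the two resize paths is `new (p) T` (default-initialization) versus `new (p) T()` (value-initialization).

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <type_traits>

// Hypothetical element type with a user-provided default constructor.
struct Seven {
  int V;
  Seven() : V(7) {}
};

// Hypothetical fixed-capacity buffer; a sketch only, not SmallVector.
template <typename T, std::size_t Capacity> class TinyBuffer {
  static_assert(std::is_trivially_destructible<T>::value,
                "this sketch never runs destructors");

  alignas(T) unsigned char Storage[Capacity * sizeof(T)];
  std::size_t Size = 0;

  T *slot(std::size_t I) { return reinterpret_cast<T *>(Storage) + I; }

  template <bool ForOverwrite> void resizeImpl(std::size_t N) {
    if (N > Capacity)
      N = Capacity; // Assumption: clamp instead of reallocating.
    for (std::size_t I = Size; I < N; ++I) {
      if (ForOverwrite)
        new (slot(I)) T; // Default-initialization: an int is left indeterminate.
      else
        new (slot(I)) T(); // Value-initialization: an int becomes 0.
    }
    Size = N; // Shrinking needs no work because T is trivially destructible.
  }

public:
  void resize(std::size_t N) { resizeImpl<false>(N); }
  void resize_for_overwrite(std::size_t N) { resizeImpl<true>(N); }
  std::size_t size() const { return Size; }

  T &operator[](std::size_t I) {
    assert(I < Size && "index out of range");
    return *slot(I);
  }
};
```

With T = int, resize zero-fills the new elements, while resize_for_overwrite leaves them indeterminate until the caller writes to them. With Seven, both paths run the constructor, so the two calls behave the same.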

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

njames93 requested review of this revision. Dec 18 2020, 4:28 AM

njames93 created this revision.

Herald added a project: Restricted Project. Dec 18 2020, 4:28 AM

Herald added a subscriber: llvm-commits.

Harbormaster completed remote builds in B82940: Diff 312753. Dec 18 2020, 5:34 AM

Fix a lint warning

Harbormaster completed remote builds in B82980: Diff 312821. Dec 18 2020, 10:15 AM

I wonder if this can be tested, something like:

V.push_back(5);
V.pop_back();
V.resize_for_overwrite(V.size() + 1);
EXPECT_EQ(5, V.back());
V.pop_back();
V.resize(V.size() + 1);
EXPECT_EQ(0, V.back());

llvm/include/llvm/ADT/SmallVector.h
466	This whitespace change seems unrelated; can you commit separately?

In D93532#2463881, @dexonsmith wrote:
I wonder if this can be tested, something like:
V.push_back(5);
V.pop_back();
V.resize_for_overwrite(V.size() + 1);
EXPECT_EQ(5, V.back());
V.pop_back();
V.resize(V.size() + 1);
EXPECT_EQ(0, V.back());

I was trying to think of a good way to test what is essentially undefined behaviour. How will this work under msan?

llvm/include/llvm/ADT/SmallVector.h
466	It's not strictly unrelated, but it was a clang format artefact

In D93532#2463995, @njames93 wrote:
In D93532#2463881, @dexonsmith wrote:
I wonder if this can be tested, something like:
V.push_back(5);
V.pop_back();
V.resize_for_overwrite(V.size() + 1);
EXPECT_EQ(5, V.back());
V.pop_back();
V.resize(V.size() + 1);
EXPECT_EQ(0, V.back());
Was thinking of a good way to test what is essentially undefined behaviour, how will this work under msan??

Given that we own SmallVector and destroy_range is a no-op, is it undefined behaviour?

If the problem is the call to new (&*) T;, then can we add a function in SmallVectorTemplateCommon (or whatever it is that's specialized for PODs) called uninitialized_construct or something that's definitely a no-op?

llvm/include/llvm/ADT/SmallVector.h
466	Hmm... is it possible that git-diff gives clang-format a different diff than Phab is showing? Phab's version doesn't have this statement changing except for the whitespace... if clang-format is getting the same diff and formatting the line, that sounds like a bug in clang-format; alternatively, I wonder if you ran `clang-format` when the patch was different and at the time it looked like the statement had changed?

In D93532#2464225, @dexonsmith wrote:

Given that we own SmallVector and destroy_range is a no-op, is it undefined behaviour?

If the problem is the call to new (&*) T;, then can we add a function in SmallVectorTemplateCommon (or whatever it is that's specialized for PODs) called uninitialized_construct or something that's definitely a no-op?

If the call to new (&*) T; is tripping sanitisers up, I'm happy to keep that behaviour. After calling resize_for_overwrite it should be required that you write to any newly allocated items before you read them. Explicitly making it a no-op would hide that testing route.
For the record, I can't seem to run gtest under msan; there seems to be a false-positive use-of-uninitialized-value occurring in basic_string::push_back, obviously not related to this change.

llvm/include/llvm/ADT/SmallVector.h
466	That's what I meant by an artefact from clang-format. I often format as I go.

Added test case, removed format artefact.

Harbormaster completed remote builds in B83041: Diff 312924. Dec 19 2020, 4:25 AM

In D93532#2464225, @dexonsmith wrote:
In D93532#2463995, @njames93 wrote:
In D93532#2463881, @dexonsmith wrote:
I wonder if this can be tested, something like:
V.push_back(5);
V.pop_back();
V.resize_for_overwrite(V.size() + 1);
EXPECT_EQ(5, V.back());
V.pop_back();
V.resize(V.size() + 1);
EXPECT_EQ(0, V.back());
Was thinking of a good way to test what is essentially undefined behaviour, how will this work under msan??
Given that we own SmallVector and destroy_range is a no-op, is it undefined behaviour?

Ish. We have used msan_* and asan_* functions to annotate LLVM's allocators so that uses of memory that has been allocated into the pool but not yet assigned to any user of the allocator can be detected (as can uses of memory after it's returned to the allocator). We could, and possibly should, add such annotations to SmallVector, and I think they could catch bugs like this.
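
For a concrete picture of what such an annotation might look like, here is a small sketch covering MemorySanitizer only. markUninitialized, markInitialized and SKETCH_HAS_MSAN are hypothetical names made up for this example; `__msan_poison` and `__msan_unpoison` are the interface functions from `<sanitizer/msan_interface.h>`. This is not how LLVM's allocators or SmallVector are actually annotated, and builds without MSan compile the helpers down to no-ops.

```cpp
#include <cassert>
#include <cstddef>

// Detect MemorySanitizer via Clang's __has_feature; compilers without it
// fall through to the no-op path.
#if defined(__has_feature)
#if __has_feature(memory_sanitizer)
#include <sanitizer/msan_interface.h>
#define SKETCH_HAS_MSAN 1
#endif
#endif
#ifndef SKETCH_HAS_MSAN
#define SKETCH_HAS_MSAN 0
#endif

// Hypothetical helper: tell MSan that these bytes hold no meaningful value,
// so any read before the next write is reported.
inline void markUninitialized(void *Ptr, std::size_t Bytes) {
  assert((Ptr != nullptr || Bytes == 0) && "null range must be empty");
#if SKETCH_HAS_MSAN
  __msan_poison(Ptr, Bytes);
#else
  (void)Ptr;
  (void)Bytes;
#endif
}

// Hypothetical helper: tell MSan that these bytes are fully initialized.
inline void markInitialized(void *Ptr, std::size_t Bytes) {
  assert((Ptr != nullptr || Bytes == 0) && "null range must be empty");
#if SKETCH_HAS_MSAN
  __msan_unpoison(Ptr, Bytes);
#else
  (void)Ptr;
  (void)Bytes;
#endif
}
```

A container annotated this way could, for example, call markUninitialized on the slot vacated by pop_back, so that a later resize_for_overwrite followed by a read without an intervening write would be flagged. Whether SmallVector should do so is the open question in this thread.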

LGTM. I'm hopeful we can somehow keep the tests once we instrument SmallVector for the sanitizers (see my comment below); even if not, I suppose we can drop the tests at that time.

In D93532#2464555, @dblaikie wrote:

In D93532#2464225, @dexonsmith wrote:

In D93532#2463995, @njames93 wrote:

Was thinking of a good way to test what is essentially undefined behaviour, how will this work under msan??

Given that we own SmallVector and destroy_range is a no-op, is it undefined behaviour?

Ish. We have used msan_* and asan_* functions to annotate LLVM's allocators so that uses of memory that hasn't been allocated into the pool, but not assigned to any user of the allocator can be detected (or memory used after it's returned to the allocator). We can/possibly should add such annotations to SmallVector and I think it could catch bugs like this.

When that happens, will there be a way to update this testcase, maybe to something like this?

V.push_back(5);
V.pop_back();
V.resize_for_overwrite(V.size() + 1);
if (!is_sanitizer_poisoning_pop_back())
  EXPECT_EQ(5, V.back());

or:

V.resize_for_overwrite(V.size() + 1);
if (is_sanitizer_poisoning_pop_back())
  EXPECT_TRUE(is_sanitizer_poison(V.back()));
else
  EXPECT_EQ(5, V.back());

llvm/include/llvm/ADT/SmallVector.h
466	Ah, right, thanks.
llvm/unittests/ADT/SmallVectorTest.cpp
346 ↗	(On Diff #312924)	Nit: period at end of sentence.
357 ↗	(On Diff #312924)	Nit: period at end of sentence.

This revision is now accepted and ready to land. Dec 21 2020, 3:13 PM

Period after comments.

In D93532#2466993, @dexonsmith wrote:
LGTM. I'm hopeful we can somehow keep the tests once we instrument SmallVector for the sanitizers (see my comment below); even if not, I suppose we can drop the tests at that time.

When that happens, will there be a way to update this testcase, maybe to something like this?
V.push_back(5);
V.pop_back();
V.resize_for_overwrite(V.size() + 1);
if (!is_sanitizer_poisoning_pop_back())
  EXPECT_EQ(5, V.back());
or:
V.resize_for_overwrite(V.size() + 1);
if (is_sanitizer_poisoning_pop_back())
  EXPECT_TRUE(is_sanitizer_poison(V.back()));
else
  EXPECT_EQ(5, V.back());

We can detect MSAN using

#if LLVM_MEMORY_SANITIZER_BUILD
// We have msan, don't run tests.
#else
<The test code>
#endif

If these tests cause issues under msan, then just don't run them.
As an added bonus, if the tests fail under msan, that would mean msan helps catch bugs when people abuse this by not writing to the new storage before reading.
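
A minimal sketch of the gating idea, assuming Clang's `__has_feature(memory_sanitizer)` check. SKETCH_MSAN_BUILD and kStaleValueTestsEnabled are hypothetical stand-ins for this example, not the actual definition of LLVM_MEMORY_SANITIZER_BUILD.

```cpp
// Define a 0/1 macro describing whether this translation unit is built
// with MemorySanitizer. Compilers lacking __has_feature get 0.
#if defined(__has_feature)
#if __has_feature(memory_sanitizer)
#define SKETCH_MSAN_BUILD 1
#endif
#endif
#ifndef SKETCH_MSAN_BUILD
#define SKETCH_MSAN_BUILD 0
#endif

// Hypothetical flag a test file could consult before running checks that
// deliberately read a stale value after resize_for_overwrite.
constexpr bool kStaleValueTestsEnabled = (SKETCH_MSAN_BUILD == 0);
```

A test could then be wrapped in `#if !SKETCH_MSAN_BUILD` or could return early when kStaleValueTestsEnabled is false.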

Harbormaster completed remote builds in B83239: Diff 313284. Dec 22 2020, 4:53 AM

This revision was landed with ongoing or failed builds. Dec 22 2020, 9:19 AM

Closed by commit rG5d10b8ad595d: [ADT] Add resize_for_overwrite method to SmallVector. (authored by njames93).

This revision was automatically updated to reflect the committed changes.

njames93 added a commit: rG5d10b8ad595d: [ADT] Add resize_for_overwrite method to SmallVector.

In D93532#2466993, @dexonsmith wrote:
LGTM. I'm hopeful we can somehow keep the tests once we instrument SmallVector for the sanitizers (see my comment below); even if not, I suppose we can drop the tests at that time.

In D93532#2464555, @dblaikie wrote:

In D93532#2464225, @dexonsmith wrote:

In D93532#2463995, @njames93 wrote:

Was thinking of a good way to test what is essentially undefined behaviour, how will this work under msan??

Given that we own SmallVector and destroy_range is a no-op, is it undefined behaviour?

Ish. We have used msan_* and asan_* functions to annotate LLVM's allocators so that uses of memory that hasn't been allocated into the pool, but not assigned to any user of the allocator can be detected (or memory used after it's returned to the allocator). We can/possibly should add such annotations to SmallVector and I think it could catch bugs like this.

When that happens, will there be a way to update this testcase, maybe to something like this?
V.push_back(5);
V.pop_back();
V.resize_for_overwrite(V.size() + 1);
if (!is_sanitizer_poisoning_pop_back())
  EXPECT_EQ(5, V.back());
or:
V.resize_for_overwrite(V.size() + 1);
if (is_sanitizer_poisoning_pop_back())
  EXPECT_TRUE(is_sanitizer_poison(V.back()));
else
  EXPECT_EQ(5, V.back());

In D93532#2467643, @njames93 wrote:
In D93532#2466993, @dexonsmith wrote:
LGTM. I'm hopeful we can somehow keep the tests once we instrument SmallVector for the sanitizers (see my comment below); even if not, I suppose we can drop the tests at that time.

When that happens, will there be a way to update this testcase, maybe to something like this?
V.push_back(5);
V.pop_back();
V.resize_for_overwrite(V.size() + 1);
if (!is_sanitizer_poisoning_pop_back())
  EXPECT_EQ(5, V.back());
or:
V.resize_for_overwrite(V.size() + 1);
if (is_sanitizer_poisoning_pop_back())
  EXPECT_TRUE(is_sanitizer_poison(V.back()));
else
  EXPECT_EQ(5, V.back());
We can detect MSAN using
#if LLVM_MEMORY_SANITIZER_BUILD
// We have msan, don't run tests.
#else
<The test code>
#endif
If these tests are causing issues under msan, then just don't run them.
As an added bonus if the tests are failing under msan, that would mean that msan will help catch bugs when people abuse this by not writing to the new storage before reading.

Yep. Sounds good!

MaskRay mentioned this in D93761: [libObject/Decompressor] - Use `resize_for_overwrite` in Decompressor::resizeAndDecompress(). Dec 23 2020, 10:06 AM

grimar added a subscriber: grimar. Dec 23 2020, 11:36 PM

Nice! I hadn't realized this is a very new feature. Used it in D93761.

This is an archive of the discontinued LLVM Phabricator instance.

[ADT] Add resize_for_overwrite method to SmallVector.
ClosedPublic
