This is an archive of the discontinued LLVM Phabricator instance.

[ADT] Fix join_impl using the wrong size when calculating total length
ClosedPublic

Authored by njames93 on Jul 7 2020, 7:11 AM.

Diff Detail

Event Timeline

njames93 created this revision.Jul 7 2020, 7:11 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 7 2020, 7:11 AM

Any chance of testing this? Could test the 'capacity' after join to check its the right size?

Any chance of testing this? Could test the 'capacity' after join to check its the right size?

How do you suppose I do that, maybe expensive checks to get the capacity after reserve and after all items have been added and ensure they are the same, otherwise the string grew during appending.

Any chance of testing this? Could test the 'capacity' after join to check its the right size?

How do you suppose I do that, maybe expensive checks to get the capacity after reserve and after all items have been added and ensure they are the same, otherwise the string grew during appending.

Oh, nah, nothing that'd require expensive checks - you could add an assert that size == Len at the end of the function (would have to have some way of avoiding the case for short results that are below the starting capacity, though... perhaps checking if capacity grew over the reserve call, and only asserting in that case)). But I was thinking more along the lines of a unit test that tests join and checks the capacity isn't bigger than the size for a few hardcoded samples.

Any chance of testing this? Could test the 'capacity' after join to check its the right size?

How do you suppose I do that, maybe expensive checks to get the capacity after reserve and after all items have been added and ensure they are the same, otherwise the string grew during appending.

Oh, nah, nothing that'd require expensive checks - you could add an assert that size == Len at the end of the function (would have to have some way of avoiding the case for short results that are below the starting capacity, though... perhaps checking if capacity grew over the reserve call, and only asserting in that case)). But I was thinking more along the lines of a unit test that tests join and checks the capacity isn't bigger than the size for a few hardcoded samples.

Oh that assert is fine, the unit test won't be a good test. As we don't control std string, it's capacity for calls to reserve will be implementation defined

Any chance of testing this? Could test the 'capacity' after join to check its the right size?

How do you suppose I do that, maybe expensive checks to get the capacity after reserve and after all items have been added and ensure they are the same, otherwise the string grew during appending.

Oh, nah, nothing that'd require expensive checks - you could add an assert that size == Len at the end of the function (would have to have some way of avoiding the case for short results that are below the starting capacity, though... perhaps checking if capacity grew over the reserve call, and only asserting in that case)). But I was thinking more along the lines of a unit test that tests join and checks the capacity isn't bigger than the size for a few hardcoded samples.

Oh that assert is fine, the unit test won't be a good test. As we don't control std string, it's capacity for calls to reserve will be implementation defined

The spec seems to be a bit more precise than that (using cppreference, which isn't authoritative, but more readable): "If new_cap is greater than the current capacity(), new storage is allocated, and capacity() is made equal or greater than new_cap."

So a unit test that joins some large strings (if it wanted to be fancy, it could even generate a string intentionally larger than std::string's default capacity, whatever it happens to be) and if the resulting string has a larger capacity than std::string's default capacity, verify that the capacity is exactly equal to the string length.

The spec seems to be a bit more precise than that (using cppreference, which isn't authoritative, but more readable): "If new_cap is greater than the current capacity(), new storage is allocated, and capacity() is made equal or greater than new_cap."

So a unit test that joins some large strings (if it wanted to be fancy, it could even generate a string intentionally larger than std::string's default capacity, whatever it happens to be) and if the resulting string has a larger capacity than std::string's default capacity, verify that the capacity is exactly equal to the string length.

That still sounds like it'll cause some issues. The main thing we want to test (which is the purpose of this patch) is that the string didn't grow while we are appending all the pieces. It's impossible to tell externally if that happened as reserve may choose to over allocate.
Putting an assert in the function that the capacity before hasn't changed should be more than enough, WDYT?

njames93 updated this revision to Diff 316264.Jan 12 2021, 3:57 PM

Add assert to enusre correct behaviour.

dblaikie accepted this revision.Jan 12 2021, 7:07 PM

Sounds good - thanks!

This revision is now accepted and ready to land.Jan 12 2021, 7:07 PM
This revision was landed with ongoing or failed builds.Jan 13 2021, 3:37 AM
This revision was automatically updated to reflect the committed changes.