This is an archive of the discontinued LLVM Phabricator instance.

[Support] Fix utf16 path's index upper bound
AbandonedPublic

Authored by danix800 on Jul 7 2020, 9:38 AM.

Details

Summary

Size of paths in utf8 is possibly bigger than in utf16.

Diff Detail

Event Timeline

danix800 created this revision.Jul 7 2020, 9:38 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 7 2020, 9:38 AM
lattner edited reviewers, added: compnerd; removed: lattner.Jul 7 2020, 9:44 AM
danix800 edited reviewers, added: respindola; removed: andrewng.Jul 8 2020, 7:39 AM

The usual practice is to include the entire context in the diff. However, the change itself LGTM.

danix800 updated this revision to Diff 276609.EditedJul 8 2020, 5:35 PM

Full context attached @andrewng .

Could anyone merge this revision if acceptable? I don't have commit access.

Full context attached @andrewng .

Thanks for doing that, makes it easier for reviewers to see around the code without having to load up the file in question.

Perhaps this particular use case needs to be added to the unit testing ("llvm/unittests/Support/Path.cpp")?

It's just so obvious.

The unit-test TEST_F(FileSystemTest, widenPath) in llvm/unittests/Support/Path.cpp:2080-2146
has already demonstrated this.

#ifdef _WIN32
TEST_F(FileSystemTest, widenPath) {
  const std::wstring LongPathPrefix(L"\\\\?\\");

  // Test that the length limit is checked against the UTF-16 length and not the
  // UTF-8 length.
  std::string Input("C:\\foldername\\");
  const std::string Pi("\xcf\x80"); // UTF-8 lower case pi.
  // Add Pi up to the MAX_PATH limit.
  const size_t NumChars = MAX_PATH - Input.size() - 1;
  for (size_t i = 0; i < NumChars; ++i)
    Input += Pi;
  // Check that UTF-8 length already exceeds MAX_PATH.
  EXPECT_TRUE(Input.size() > MAX_PATH);
  SmallVector<wchar_t, MAX_PATH + 16> Result;
  ASSERT_NO_ERROR(windows::widenPath(Input, Result));
  // Result should not start with the long path prefix.
  EXPECT_TRUE(std::wmemcmp(Result.data(), LongPathPrefix.c_str(),
                           LongPathPrefix.size()) != 0);
  EXPECT_EQ(Result.size(), (size_t)MAX_PATH - 1);

  // Add another Pi to exceed the MAX_PATH limit.

Input.size() > MAX_PATH is expected to be true and Result.size() is expected to be MAX_PATH - 1, meaning
that PathUTF16[Path.size() - 1] is an out-of-bound array access.

That test is for widenPath and it does cover a similar situation (I know because I wrote it). The code you have changed relates to directory_iterator_construct which does not have coverage of the situation where the UTF16 length is less than the UTF8 length, i.e. the issue that you are fixing.

If another reviewer is happy to approve this change without any test coverage, then that's fine with me.

danix800 abandoned this revision.Jul 11 2020, 12:47 AM