This is an archive of the discontinued LLVM Phabricator instance.

[Support/TarWriter] - Don't allow TarWriter to add the same file more than once.
ClosedPublic

Authored by grimar on Nov 29 2017, 8:00 AM.

Details

Summary

This is for PR35460.

Currently when LLD adds files to TarWriter it may pass the same file multiple times.
For example it happens for clang reproduce file which specifies archive (.a) files more
than once in command line. Though that can be catched on LLD side, I think it is reasonable
to restrict TarWriter to add the same file more than once, I believe current behavior does not
make much sence.

Diff Detail

Repository
rL LLVM

Event Timeline

grimar created this revision.Nov 29 2017, 8:00 AM
ruiu added inline comments.Nov 29 2017, 2:40 PM
include/llvm/Support/TarWriter.h
16 ↗(On Diff #124755)

Please use llvm::StringSet.

24 ↗(On Diff #124755)

Just keep it return void as you are not going to use its return value.

lib/Support/TarWriter.cpp
176 ↗(On Diff #124755)

Exists

177–178 ↗(On Diff #124755)

Or, just if (Seen.insert(Fullpath).second) return;

grimar updated this revision to Diff 124901.Nov 30 2017, 3:05 AM
grimar marked 4 inline comments as done.
grimar edited the summary of this revision. (Show Details)
  • Addressed review comments.
grimar added inline comments.Nov 30 2017, 3:08 AM
include/llvm/Support/TarWriter.h
16 ↗(On Diff #124755)

It does not work:

>D:\Work\llvm\include\llvm/ADT/StringMap.h(209): error C2039: 'Deallocate': is not a member of 'std::basic_string<char,std::char_traits<char>,std::allocator<char>>'
1>  C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\include\xstring(2633): note: see declaration of 'std::basic_string<char,std::char_traits<char>,std::allocator<char>>'
1>  D:\Work\llvm\include\llvm/ADT/StringMap.h(301): note: see reference to function template instantiation 'void llvm::StringMapEntry<ValueTy>::Destroy<AllocatorTy>(AllocatorTy &)' being compiled
1>          with
1>          [
1>              ValueTy=char,
1>              AllocatorTy=std::string
1>          ]

I think llvm containers does not know how to handle std::string. So I had to use std::set. It seems common in LLVM to use it for std::string.

lib/Support/TarWriter.cpp
177–178 ↗(On Diff #124755)

That looks much better, thanks !

grimar updated this revision to Diff 124958.Nov 30 2017, 9:44 AM
  • Use llvm::StringSet instead of std::set<std::string>.
ruiu accepted this revision.Nov 30 2017, 12:18 PM

LGTM with a test.

This revision is now accepted and ready to land.Nov 30 2017, 12:18 PM
grimar updated this revision to Diff 125095.Dec 1 2017, 3:23 AM
  • Added testcase.
ruiu added inline comments.Dec 1 2017, 12:01 PM
unittests/Support/TarWriterTest.cpp
124–143 ↗(On Diff #125095)

You should split this into three different test functions.

grimar updated this revision to Diff 125298.Dec 4 2017, 12:20 AM

Addressed review comments:

  • Split test into three.
  • Use llvm::sys::fs::file_size API for finding output tar file size.
ruiu accepted this revision.Dec 4 2017, 4:39 PM

LGTM with this change.

unittests/Support/TarWriterTest.cpp
142 ↗(On Diff #125298)

These test names made me think for a while why you needed to check for file size (e.g. did we have a logic to pick the largest file or something?)

It turned out that you are using file size just to verify that output is correct.

Please rename these tests SingleFile, NoDuplicate and Duplicate.

This revision was automatically updated to reflect the committed changes.