This is an archive of the discontinued LLVM Phabricator instance.

[llvm-objcopy] Fix crash when writing empty binary output
ClosedPublic

Authored by rupprecht on Jan 16 2019, 1:43 PM.

Details

Summary

When using llvm-objcopy -O binary and the resulting file will be empty (e.g. removing the only section that would be written, or using --only-keep with a section that doesn't exist/isn't SHF_ALLOC), we crash because FileOutputBuffer expects Size > 0. Add a regression test, and change Buffer to open/truncate the output file in this case.

Diff Detail

Repository
rL LLVM

Event Timeline

rupprecht created this revision.Jan 16 2019, 1:43 PM
jhenderson added inline comments.Jan 17 2019, 2:27 AM
llvm/test/tools/llvm-objcopy/ELF/binary-output-empty.test
3 ↗(On Diff #182138)

Nit: existant -> existent

26 ↗(On Diff #182138)

I'd like this to be above the yaml, to make it easier to see the expected output.

llvm/tools/llvm-objcopy/Buffer.cpp
33 ↗(On Diff #182138)

Could you return here, to avoid the need to write "else"/indent below?

39 ↗(On Diff #182138)

There's a weird mismatch between using auto here and not above. I'd err on the side of not using auto in either case.

rupprecht updated this revision to Diff 182389.Jan 17 2019, 1:39 PM
rupprecht marked 5 inline comments as done.
  • Misc code review comments
llvm/tools/llvm-objcopy/Buffer.cpp
33 ↗(On Diff #182138)

I did that... but personally found it to be a bit awkward for some reason. It felt like the return was hidden in a bunch of code.
Instead, I refactored out the bit of creating/truncating the file, and added an early handle+return condition for that instead (so the reverse order), which I think is a more readable/common pattern of returning early. WDYT?

llvm/tools/llvm-objcopy/Buffer.cpp
31 ↗(On Diff #182389)

so given we are trying to get rid of "reportError" (to enable us to build a library from this code in the future), maybe we can return an error from the method "allocate" instead ? (similarly to what "commit" does)

rupprecht updated this revision to Diff 182438.Jan 17 2019, 7:08 PM
rupprecht marked 2 inline comments as done.
  • Return Error instead of void
llvm/tools/llvm-objcopy/Buffer.cpp
31 ↗(On Diff #182389)

SGTM, though I'd like to punt on changing the signature of allocate() in this patch. I'll start working on a separate patch now...

llvm/tools/llvm-objcopy/Buffer.cpp
31 ↗(On Diff #182389)

oh, probably my bad,
my second thought is the following: basically if i remember correctly we call Buffer::commit in the very end (and in the case of FileBuffer it does smth like renaming the temporary file backing the buffer (which was mmaped into the memory) - i don't remember precisely, need to double check it).
If there is an error in the tool somewhere in the middle of the whole pipeline, the files are untouched and I think it's a nice property to have. But if we truncate a file inside the method "allocate" this won't be true anymore. So ideally all the "real" changes should be applied to the file system only inside the "commit" method + would be cool to have a test for this scenario. What do you think ?

rupprecht planned changes to this revision.Jan 17 2019, 10:40 PM
rupprecht marked an inline comment as done.

Planning to move file truncation from finalize() to commit()

llvm/tools/llvm-objcopy/Buffer.cpp
31 ↗(On Diff #182389)

No worries, error handling here looks like something we should fix. I have something in progress (https://reviews.llvm.org/differential/diff/182462/) but I actually want to increase the scope of that refactoring, so it's not ready yet. I just want it to be orthogonal to this crash fix :)

I see what you mean about commit(). In the instance of the crash I'm fixing, it doesn't seem like it's at all possible for a crash in between finalize() and write(), but I can't say I've ruled out other situations where Size = 0.
I don't know if it's testable, but I'll try to move this block to commit().

rupprecht set the repository for this revision to rL LLVM.Jan 18 2019, 1:26 PM
  • Make sure file modification only happens in commit()
rupprecht marked 2 inline comments as done.Jan 23 2019, 10:18 AM
rupprecht added inline comments.
llvm/tools/llvm-objcopy/Buffer.cpp
31 ↗(On Diff #182389)

I couldn't come up with a way to add a test for this, but I manually verified it by throwing in an assert(false) and making sure the file isn't changed.

$ echo abcd > /tmp/existing-file.txt
$ llvm-objcopy -R .text -O binary binary-output-empty.test.tmp.o /tmp/existing-file.txt
<crash>
$ cat /tmp/existing-file.txt
abcd

(Before your suggestion, it was printing an empty file)

Given the state of the code, I don't think a crash is really possible -- unless something like malloc failed -- so I don't have a test to offer.

jhenderson added inline comments.Jan 24 2019, 1:24 AM
llvm/tools/llvm-objcopy/Buffer.cpp
57 ↗(On Diff #183125)

Instead of this assert, how practical would it be to call allocate in the constructor?

31 ↗(On Diff #182389)

This is probably fine as is, but a couple of ideas I had where this might fail (not 100% sure) is a) if another file has a lock on it, or b) if the file is read-only maybe? Are either of those testable in the current framework? I think you could test a) by using python to open it and then spawn llvm-objcopy without closing the file.

rupprecht marked 2 inline comments as done.Jan 24 2019, 9:32 AM
rupprecht added inline comments.
llvm/tools/llvm-objcopy/Buffer.cpp
57 ↗(On Diff #183125)

allocate() is often called long after construction, e.g. llvm-objcopy.cpp creates the FileBuffer and ELF/Object.cpp calls allocate() after the object has been processed (and then it knows how much to allocate).

31 ↗(On Diff #182389)

I think both of those would describe createAndTruncateFile() failures. If commit() fails, then that's "fine" in the sense that there are no side effects -- any file modification should fail. The original problem was:

  1. allocate() is called, which truncates some existing data
  2. some extra code is called, which crashes
  3. commit() is never called, so the processed object is never written, but the original data is still destroyed

vs as implemented in the latest version of the patch:

  1. allocate() is called, which remembers to defer truncating until later
  2. some extra code is called, which crashes
  3. commit() is never called, so truncation never happens and the original data is still there

But, I don't see any reasonable ways for the part in between to crash. The relevent snippet I'm trying to break is this:

// From finalize()
for (auto &Section : Obj.sections()) {
  if ((Section.Flags & SHF_ALLOC) == 0)
    continue;
  AllocatedSections.push_back(&Section);
}
TotalSize = 0;
for (const auto &Section : AllocatedSections) {
  if (Section->Type != SHT_NOBITS)
    TotalSize = std::max(TotalSize, Section->Offset + Section->Size);
}
Buf.allocate(TotalSize);
SecWriter = llvm::make_unique<BinarySectionWriter>(Buf);

// From write()
for (auto &Section : Obj.sections()) {
  if ((Section.Flags & SHF_ALLOC) == 0)
    continue;
  Section.accept(*SecWriter);
}
return Buf.commit();

The things that could fail:

  • SecWriter construction fails, e.g. malloc failure
  • Section.accept() fails, but given the constraint that TotalSize is 0 (in order for allocate(0) to be called), the only sections it could be called on would be SHT_NOBITS or have zero offset/size. It looks like section visitors check SHT_NOBITS and return early, and I don't know if zero offset/size is valid or possible to construct.
This revision is now accepted and ready to land.Jan 24 2019, 10:35 AM
jakehehrlich accepted this revision.Jan 24 2019, 10:55 AM

Does overwriting a file create a new file? --build-id-link-dir and friends relay on that behavior. In general tools should *never* modify a file in place. Consider any build that uses GN (like Fuchsia, or Chromeium, and now optionally llvm/clang). Say they have a rule (like how we strip in Fuchsia) that writes a file. Now say a copy rule is used to put that somewhere else or you use --build-id-link-dir to copy into a .build-id directory (both happen in Fuchsia). The correct behavior should be for a new file to be created on disk, not modification of the existing one. This for instance keeps the file in the .build-id directory correct. I think this truncates the same existing inode.

jakehehrlich requested changes to this revision.Jan 24 2019, 10:55 AM

Sorry, meant to request changes.

This revision now requires changes to proceed.Jan 24 2019, 10:55 AM
  • Make sure a new file is created instead of truncating an inode in place

Yup, this was truncating in-place. Changed to nuke the old file + create a new one now (using CD_CreateNew to make sure it's not in-place). PTAL.

  • Fix casing
jakehehrlich added inline comments.Jan 24 2019, 6:21 PM
llvm/tools/llvm-objcopy/Buffer.cpp
25–26 ↗(On Diff #183365)

Remove has a default parameter that sets weather or not it errors out if the file already exists. This removes a (well for this it doesn't actually matter in any case I've seen) race condition. To eliminate the condition entirely you need to create a temporary empty file on the same partition and then rename it into place. That behaves atomically. That's what FileBuffer is doing under the hood.

rupprecht updated this revision to Diff 183481.Jan 24 2019, 9:46 PM
  • Use tempfile
rupprecht marked 2 inline comments as done.Jan 24 2019, 9:48 PM
rupprecht added inline comments.
llvm/tools/llvm-objcopy/Buffer.cpp
25–26 ↗(On Diff #183365)

Thanks for that pointer! Even though this race condition is probably overkill to worry about, it ends up being much less code to use a temp file anyway, so I'm happy to steal it :D

rupprecht updated this revision to Diff 183482.Jan 24 2019, 9:50 PM
rupprecht marked an inline comment as done.
  • typo
jhenderson accepted this revision.Jan 25 2019, 3:05 AM

Looks good from my point of view, but obviously wait for @jakehehrlich in particular to confirm he's happy.

jakehehrlich accepted this revision.Jan 25 2019, 1:47 PM
This revision is now accepted and ready to land.Jan 25 2019, 1:47 PM
This revision was automatically updated to reflect the committed changes.