This is an archive of the discontinued LLVM Phabricator instance.

[lld-macho] check minimum header length when opening linkable input files
ClosedPublic

Authored by gkm on Feb 26 2021, 11:51 PM.

Details

Summary

Bifurcate the readFile() API into ...

  • readRawFile() which performs no checks, and
  • readLinkableFile() which enforces minimum length of 20 bytes, same as ld64

There are no new tests because tweaks to existing tests are sufficient.

Diff Detail

Unit TestsFailed

Event Timeline

gkm created this revision.Feb 26 2021, 11:51 PM
Herald added a project: Restricted Project. · View Herald Transcript
gkm requested review of this revision.Feb 26 2021, 11:51 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 26 2021, 11:51 PM
gkm updated this revision to Diff 326880.Feb 26 2021, 11:53 PM
  • resubmit to catch parent-diff dependence
int3 added inline comments.Feb 27 2021, 9:22 AM
lld/MachO/Driver.cpp
266

petition to keep this as addFile -- it's not like we have addRawFile, so the additional word seems unnecessary

406

I think the old name made more sense, we are adding the files in the file list, not the filelist file itself

lld/test/MachO/invalid/bad-archive.s
9 ↗(On Diff #326880)

the original intent of the test was to exercise the archive-parsing code and its error handling, so this is no longer serving its purpose. We could tweak the test, but perhaps it would be easier to just require linkable files to have 4 bytes for the magic number, rather than 20? (plus, I'd like to avoid doing things just because ld64 is doing it, until we find some deeper reason)

similar issue for invalid-fat-narch.s below

lld/test/MachO/rename.s
17

we shouldn't be emitting this message if /dev/null is an output file parameter

gkm updated this revision to Diff 326911.EditedFeb 27 2021, 10:01 AM
  • strip "lld: " from FileCheck prefixes BAD1 & BAD2
gkm updated this revision to Diff 326915.Feb 27 2021, 11:05 AM
  • add -DAG to FileCheck prefixes BAD1 & BAD2
gkm marked 4 inline comments as done.Feb 27 2021, 11:41 AM
gkm added inline comments.
lld/test/MachO/rename.s
17

The -o was swallowed by -rename_section because of a missing sub arg. Therefore, it is interpreted as an input parameter.

gkm updated this revision to Diff 326918.Feb 27 2021, 11:41 AM
gkm marked an inline comment as done.
gkm edited the summary of this revision. (Show Details)
  • revise according to review feedback
int3 accepted this revision.Feb 27 2021, 1:37 PM
int3 added inline comments.
lld/test/MachO/rename.s
17

ah I see. IMO it's kind of weird to test this zero-size-file edge case in rename.s, since it's only tangentially related to renaming. I'd prefer we create a separate test file under test/invalid for it

This revision is now accepted and ready to land.Feb 27 2021, 1:37 PM
int3 added inline comments.Feb 27 2021, 1:39 PM
lld/test/MachO/rename.s
17

(also, test seems to be failing on Windows since there isn't a real /dev/null there. Looks like they create a fake temporary file instead)

gkm marked 2 inline comments as done.Feb 27 2021, 1:45 PM
gkm updated this revision to Diff 326924.Feb 27 2021, 1:45 PM
  • remove /dev/null from FileCheck patterns to appease Windoze
tschuett added inline comments.
lld/MachO/InputFiles.cpp
124

What are you going to do with. e.g. 4, 5, or 6 bytes files. It only has the magic numbers, but there is no real header?

gkm updated this revision to Diff 326926.Feb 27 2021, 2:02 PM
  • add test file lld/test/MachO/invalid/tiny-input.s for degenerate inputs
gkm marked an inline comment as done.Feb 27 2021, 2:25 PM
gkm added inline comments.
lld/MachO/InputFiles.cpp
124

Those will be validated downstream, based on magic number. If the magic number doesn't match anything we handle, then it will be rejected as such. If the magic number is valid, then the code for that file type will check that the header & contents are complete & consistent.

This revision was automatically updated to reflect the committed changes.
gkm marked an inline comment as done.
int3 added inline comments.Feb 27 2021, 2:44 PM
lld/MachO/InputFiles.cpp
124

I think it's fairly unlikely that we would have a file with a valid magic number but nothing else. We're not trying to defend against fuzzer inputs anyway, as per https://github.com/llvm/llvm-project/blob/main/lld/docs/NewLLD.rst:

The current policy is that it is your responsibility to give trustworthy object files. The function is guaranteed to return as long as you do not pass corrupted or malicious object files. A corrupted file could cause a fatal error or SEGV. That being said, you don't need to worry too much about it if you create object files in the usual way and give them to the linker. It is naturally expected to work, or otherwise it's a linker's bug.

thakis added a subscriber: thakis.Mar 1 2021, 4:07 PM

a) I think this breaks --reproduce=repro.tar for -order_file since that previously called readFile() which added to the repro tar, but now readRawFile() doesn't. (Looks like -sectcreate has the same problem likely.)
b) What's the motivation for this change? lld assumes that inputs are produced by a compiler or similar, and are generally valid. What problem is this solving?

smeenai added a subscriber: smeenai.Mar 1 2021, 5:18 PM

b) What's the motivation for this change? lld assumes that inputs are produced by a compiler or similar, and are generally valid. What problem is this solving?

I believe avoiding ASAN errors, per https://reviews.llvm.org/D97600?id=326855#inline-915154

gkm added a comment.Mar 2 2021, 1:40 AM

a) I think this breaks --reproduce=repro.tar for -order_file since that previously called readFile() which added to the repro tar, but now readRawFile() doesn't. (Looks like -sectcreate has the same problem likely.)

-sectcreate is subtle. It should be an opaque object immune to meddling, but as-is, if it happens to be MachO::FAT_MAGIC its arch component will be extracted. I am not going to fret about that until it proves troublesome.

b) What's the motivation for this change? lld assumes that inputs are produced by a compiler or similar, and are generally valid. What problem is this solving?

The immediate motive was to fix an ASAN overflow when checking magic# of an empty buffer. My solution was an ill-conceived refactor with insufficient understanding. Apologies. This reverts the refactor and adds a modest one-line check that the file buffer is long-enough for a magic number: https://reviews.llvm.org/D97757