Page MenuHomePhabricator

[llvm-objcopy] preserve file ownership when overwritten by root
ClosedPublic

Authored by jcai19 on Dec 28 2020, 6:45 PM.

Details

Summary
As of binutils 2.36, GNU strip calls chown(2) for "sudo strip foo" and
"sudo strip foo -o foo", but no "sudo strip foo -o bar" or "sudo strip
foo -o ./foo". In other words, while "sudo strip foo -o bar" creates a
new file bar with root access, "sudo strip foo" will keep the owner and
group of foo unchanged. Currently llvm-objcopy and llvm-strip behave
differently, always changing the owner and group to root. The
discrepancy prevents Chrome OS from migrating to llvm-objcopy and
llvm-strip as they change file ownership and cause intended users/groups
to lose access when invoked by sudo with the following sequence
(recommended in man page of GNU strip).

1.<Link the executable as normal.>
2.<Copy "foo" to "foo.full">
3.<Run "strip --strip-debug foo">
4.<Run "objcopy --add-gnu-debuglink=foo.full foo">

This patch makes llvm-objcopy and llvm-strip follow GNU's behavior.

Link: crbug.com/1108880

Diff Detail

Unit TestsFailed

TimeTest
260 msx64 debian > libarcher.races::task-dependency.c
Script: -- : 'RUN: at line 13'; /mnt/disks/ssd0/agent/llvm-project/build/./bin/clang -fopenmp -pthread -fno-experimental-isel -g -O1 -fsanitize=thread -I /mnt/disks/ssd0/agent/llvm-project/openmp/tools/archer/tests -I /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/runtime/src -L /mnt/disks/ssd0/agent/llvm-project/build/lib -Wl,-rpath,/mnt/disks/ssd0/agent/llvm-project/build/lib /mnt/disks/ssd0/agent/llvm-project/openmp/tools/archer/tests/races/task-dependency.c -o /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/task-dependency.c.tmp -latomic && env TSAN_OPTIONS='ignore_noninstrumented_modules=0:ignore_noninstrumented_modules=1' /mnt/disks/ssd0/agent/llvm-project/openmp/tools/archer/tests/deflake.bash /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/task-dependency.c.tmp 2>&1 | tee /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/task-dependency.c.tmp.log | /mnt/disks/ssd0/agent/llvm-project/build/./bin/FileCheck /mnt/disks/ssd0/agent/llvm-project/openmp/tools/archer/tests/races/task-dependency.c
260 msx64 debian > libarcher.races::task-taskgroup-unrelated.c
Script: -- : 'RUN: at line 13'; /mnt/disks/ssd0/agent/llvm-project/build/./bin/clang -fopenmp -pthread -fno-experimental-isel -g -O1 -fsanitize=thread -I /mnt/disks/ssd0/agent/llvm-project/openmp/tools/archer/tests -I /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/runtime/src -L /mnt/disks/ssd0/agent/llvm-project/build/lib -Wl,-rpath,/mnt/disks/ssd0/agent/llvm-project/build/lib /mnt/disks/ssd0/agent/llvm-project/openmp/tools/archer/tests/races/task-taskgroup-unrelated.c -o /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/task-taskgroup-unrelated.c.tmp -latomic && env TSAN_OPTIONS='ignore_noninstrumented_modules=0:ignore_noninstrumented_modules=1' /mnt/disks/ssd0/agent/llvm-project/openmp/tools/archer/tests/deflake.bash /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/task-taskgroup-unrelated.c.tmp 2>&1 | tee /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/task-taskgroup-unrelated.c.tmp.log | /mnt/disks/ssd0/agent/llvm-project/build/./bin/FileCheck /mnt/disks/ssd0/agent/llvm-project/openmp/tools/archer/tests/races/task-taskgroup-unrelated.c
280 msx64 debian > libarcher.races::task-taskwait-nested.c
Script: -- : 'RUN: at line 13'; /mnt/disks/ssd0/agent/llvm-project/build/./bin/clang -fopenmp -pthread -fno-experimental-isel -g -O1 -fsanitize=thread -I /mnt/disks/ssd0/agent/llvm-project/openmp/tools/archer/tests -I /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/runtime/src -L /mnt/disks/ssd0/agent/llvm-project/build/lib -Wl,-rpath,/mnt/disks/ssd0/agent/llvm-project/build/lib /mnt/disks/ssd0/agent/llvm-project/openmp/tools/archer/tests/races/task-taskwait-nested.c -o /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/task-taskwait-nested.c.tmp -latomic && env TSAN_OPTIONS='ignore_noninstrumented_modules=0:ignore_noninstrumented_modules=1' /mnt/disks/ssd0/agent/llvm-project/openmp/tools/archer/tests/deflake.bash /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/task-taskwait-nested.c.tmp 2>&1 | tee /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/task-taskwait-nested.c.tmp.log | /mnt/disks/ssd0/agent/llvm-project/build/./bin/FileCheck /mnt/disks/ssd0/agent/llvm-project/openmp/tools/archer/tests/races/task-taskwait-nested.c
320 msx64 debian > libarcher.races::task-two.c
Script: -- : 'RUN: at line 13'; /mnt/disks/ssd0/agent/llvm-project/build/./bin/clang -fopenmp -pthread -fno-experimental-isel -g -O1 -fsanitize=thread -I /mnt/disks/ssd0/agent/llvm-project/openmp/tools/archer/tests -I /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/runtime/src -L /mnt/disks/ssd0/agent/llvm-project/build/lib -Wl,-rpath,/mnt/disks/ssd0/agent/llvm-project/build/lib /mnt/disks/ssd0/agent/llvm-project/openmp/tools/archer/tests/races/task-two.c -o /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/task-two.c.tmp -latomic && env TSAN_OPTIONS='ignore_noninstrumented_modules=0:ignore_noninstrumented_modules=1' /mnt/disks/ssd0/agent/llvm-project/openmp/tools/archer/tests/deflake.bash /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/task-two.c.tmp 2>&1 | tee /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/task-two.c.tmp.log | /mnt/disks/ssd0/agent/llvm-project/build/./bin/FileCheck /mnt/disks/ssd0/agent/llvm-project/openmp/tools/archer/tests/races/task-two.c
280 msx64 debian > libarcher.task::task-barrier.c
Script: -- : 'RUN: at line 15'; /mnt/disks/ssd0/agent/llvm-project/build/./bin/clang -fopenmp -pthread -fno-experimental-isel -g -O1 -fsanitize=thread -I /mnt/disks/ssd0/agent/llvm-project/openmp/tools/archer/tests -I /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/runtime/src -L /mnt/disks/ssd0/agent/llvm-project/build/lib -Wl,-rpath,/mnt/disks/ssd0/agent/llvm-project/build/lib /mnt/disks/ssd0/agent/llvm-project/openmp/tools/archer/tests/task/task-barrier.c -o /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/tools/archer/tests/task/Output/task-barrier.c.tmp -latomic && env TSAN_OPTIONS='ignore_noninstrumented_modules=0:ignore_noninstrumented_modules=1' /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/tools/archer/tests/task/Output/task-barrier.c.tmp 2>&1 | tee /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/tools/archer/tests/task/Output/task-barrier.c.tmp.log | /mnt/disks/ssd0/agent/llvm-project/build/./bin/FileCheck /mnt/disks/ssd0/agent/llvm-project/openmp/tools/archer/tests/task/task-barrier.c
View Full Test Results (13 Failed)

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
jcai19 added inline comments.Jan 6 2021, 2:49 PM
llvm/lib/Support/Unix/Path.inc
1214

Good catch! git clang-format HEAD~1 somehow did not notice it. I have updated it.

1219

What's the RetryAfterSignal about?

I think it basically keeps calling the same function again if EINTR (interrupted function call) happens https://llvm.org/doxygen/Errno_8h_source.html#l00033.

Should it be commented?

I just added some comments in the latest patch. FYI this function is called several times in the same file and has not been commented anywhere else, so let me know if you think we should just remove it to keep consistent.

llvm/test/tools/llvm-objcopy/preserve-file-ownership.test
2 ↗(On Diff #314795)

That's a good point. I could remove sudo from the commands but this change won't get tested. Any suggestion on how we could better test this?

MaskRay added a comment.EditedJan 6 2021, 3:06 PM

I don't think we want to introduce a filename based chmod API. Unless it is essential (very unlikely), it is easy for developers to make time-of-check time-of-use (TOCTOU) race condition allowing bypass of protection mechanism due to symlink attacks.

binutils 2.36 will fix the issue https://sourceware.org/bugzilla/show_bug.cgi?id=26945

Regarding whether llvm-objcopy should adopt the GNU objcopy smart_rename behavior, I am a bit hesitant, somewhat inclining to no.

It seems that very few packages actually require the behavior - most packages should use something like install -o ... -g ... -m ... to copy files, the owner/group is not too useful.

I don't think we want to introduce a filename based chmod API. Unless it is essential (very unlikely), it is easy for developers to make time-of-check time-of-use (TOCTOU) race condition allowing bypass of protection mechanism due to symlink attacks.

binutils 2.36 will fix the issue https://sourceware.org/bugzilla/show_bug.cgi?id=26945

Can you please clarify what is being fixed, the use of internal renaming APIs or the current behavior?

Regarding whether llvm-objcopy should adopt the GNU objcopy smart_rename behavior, I am a bit hesitant, somewhat inclining to no.

It seems that very few packages actually require the behavior - most packages should use something like install -o ... -g ... -m ... to copy files, the owner/group is not too useful.

We are hitting this problem in Portage (used in Gentoo Linux and Chrome OS) which is both a build system and a package manager. One of the steps after building a package is to install the files with the required permissions. And another action is to strip the files and add the debuginfo link etc. These steps execute as root privilege in portage as they modify the file system and we do not have control over it.

Because of the deviation from GNU objcopy/strip, stripping or just adding debuginfo link causes the "security sensitive" binaries to lose the intended permissions. We already had to revert twice switching to llvm objcopy/strip for this reason in Chrome OS before we figured out the problem. Therefore, I'd argue that this change is needed for GNU compatibility.
It is also what most tools already do when dealing with existing files e.g. "sudo vi foo.c" or "echo foo |& sudo tee a.c" does not change file ownership or group.

It sounds like portage should be using something like setuidgid to drop privileges for parts of the build.

It sounds like portage should be using something like setuidgid to drop privileges for parts of the build.

I do not think that matters. If a file has owner other than default [1] , stripping or adding debuginfo link to this file will still require root privileges.

[1] https://github.com/gentoo/gentoo/search?q=fowners&type=code

GNU objcopy and strip preserve the ownership of a file when it's overwritten, e.g. while "sudo objcopy foo -o bar" creates a new file bar with root access, "sudo objcopy foo" will keep the owner and group of foo unchanged. Currently llvm-objcopy and llvm-strip behave differently, always changing the ownership to root.

objcopy foo -o bar => objcopy foo bar

I am unsure that we want to adopt the GNU behavior because

(1) there were security vulnerabilities
(2) the code is subtle and untestable
(3) the behavior is self inconsistent. The behaviors of sudo objcopy foo and sudo -u mail objcopy foo (if user mail has write permission to the directory) are different.
(4) I believe very few packages require the behavior. It can only be used by root (CAP_CHOWN), so it is a very minor use case.

The discrepancy prevents Chrome OS from migrating to llvm-objcopy and llvm-strip as Portage (the build system) builds binaries following recommended steps documented at the man page of GNU strip as follows,

Not having a behavior has pros and cons and I understand that it may sometimes cause friction.
But from https://chromium-review.googlesource.com/q/gnu+strip , it seems that very few crbug.com bugs require the GNU fchown (was: chown) behavior.
Could you just update the build instructions for these few packages?

I don't think we want to introduce a filename based chmod API. Unless it is essential (very unlikely), it is easy for developers to make time-of-check time-of-use (TOCTOU) race condition allowing bypass of protection mechanism due to symlink attacks.

binutils 2.36 will fix the issue https://sourceware.org/bugzilla/show_bug.cgi?id=26945

Can you please clarify what is being fixed, the use of internal renaming APIs or the current behavior?

GNU objcopy before 2.36 had the symlink attach vulnerability. The patch would introduce the similar vulnerability to llvm-objcopy.

Regarding whether llvm-objcopy should adopt the GNU objcopy smart_rename behavior, I am a bit hesitant, somewhat inclining to no.

It seems that very few packages actually require the behavior - most packages should use something like install -o ... -g ... -m ... to copy files, the owner/group is not too useful.

We are hitting this problem in Portage (used in Gentoo Linux and Chrome OS) which is both a build system and a package manager. One of the steps after building a package is to install the files with the required permissions. And another action is to strip the files and add the debuginfo link etc. These steps execute as root privilege in portage as they modify the file system and we do not have control over it.

Because of the deviation from GNU objcopy/strip, stripping or just adding debuginfo link causes the "security sensitive" binaries to lose the intended permissions. We already had to revert twice switching to llvm objcopy/strip for this reason in Chrome OS before we figured out the problem. Therefore, I'd argue that this change is needed for GNU compatibility.
It is also what most tools already do when dealing with existing files e.g. "sudo vi foo.c" or "echo foo |& sudo tee a.c" does not change file ownership or group.

It's different. vi foo.c and tee a.c edits the file in place while strip/llvm-strip creates a temporary file and renames it to the original name.

GNU objcopy and strip preserve the ownership of a file when it's overwritten, e.g. while "sudo objcopy foo -o bar" creates a new file bar with root access, "sudo objcopy foo" will keep the owner and group of foo unchanged. Currently llvm-objcopy and llvm-strip behave differently, always changing the ownership to root.

objcopy foo -o bar => objcopy foo bar

I am unsure that we want to adopt the GNU behavior because

(1) there were security vulnerabilities
(2) the code is subtle and untestable
(3) the behavior is self inconsistent. The behaviors of sudo objcopy foo and sudo -u mail objcopy foo (if user mail has write permission to the directory) are different.

While I can understand the hesitation, we have a real need for this. And IIUC, even the trunk GNU tools has the existing old behavior, though they may have changed the internal implementation.

(4) I believe very few packages require the behavior. It can only be used by root (CAP_CHOWN), so it is a very minor use case.

I think I already provided examples of usage in Gentoo and Chrome OS that llvm objcopy breaks. This does not seem to be a case of minor use case to me.

The discrepancy prevents Chrome OS from migrating to llvm-objcopy and llvm-strip as Portage (the build system) builds binaries following recommended steps documented at the man page of GNU strip as follows,

Not having a behavior has pros and cons and I understand that it may sometimes cause friction.
But from https://chromium-review.googlesource.com/q/gnu+strip , it seems that very few crbug.com bugs require the GNU fchown (was: chown) behavior.

Please see the query links below for a more accurate count.

Could you just update the build instructions for these few packages?

I see over 600 uses of of fowners in Gentoo and over 55 in Chrome OS (public). Not all uses are related to binary files but asking to fix the build system instead of fixing the incompatibility is a non-trivial and a no-go.

[1] https://github.com/gentoo/gentoo/search?q=fowners&type=code
[2] https://source.chromium.org/search?q=fowners&sq=&ss=chromiumos

I don't think we want to introduce a filename based chmod API. Unless it is essential (very unlikely), it is easy for developers to make time-of-check time-of-use (TOCTOU) race condition allowing bypass of protection mechanism due to symlink attacks.

binutils 2.36 will fix the issue https://sourceware.org/bugzilla/show_bug.cgi?id=26945

Can you please clarify what is being fixed, the use of internal renaming APIs or the current behavior?

GNU objcopy before 2.36 had the symlink attach vulnerability. The patch would introduce the similar vulnerability to llvm-objcopy.

Regarding whether llvm-objcopy should adopt the GNU objcopy smart_rename behavior, I am a bit hesitant, somewhat inclining to no.

It seems that very few packages actually require the behavior - most packages should use something like install -o ... -g ... -m ... to copy files, the owner/group is not too useful.

We are hitting this problem in Portage (used in Gentoo Linux and Chrome OS) which is both a build system and a package manager. One of the steps after building a package is to install the files with the required permissions. And another action is to strip the files and add the debuginfo link etc. These steps execute as root privilege in portage as they modify the file system and we do not have control over it.

Because of the deviation from GNU objcopy/strip, stripping or just adding debuginfo link causes the "security sensitive" binaries to lose the intended permissions. We already had to revert twice switching to llvm objcopy/strip for this reason in Chrome OS before we figured out the problem. Therefore, I'd argue that this change is needed for GNU compatibility.
It is also what most tools already do when dealing with existing files e.g. "sudo vi foo.c" or "echo foo |& sudo tee a.c" does not change file ownership or group.

It's different. vi foo.c and tee a.c edits the file in place while strip/llvm-strip creates a temporary file and renames it to the original name.

I am simply trying to point out the normal behavior of tools when modifying existing files by various. The internal implementation obviously will vary between tools. From user point of view, it does not matter if the tool does a copy/rename or in place modification to do so.

MaskRay added a comment.EditedJan 6 2021, 6:45 PM

GNU objcopy and strip preserve the ownership of a file when it's overwritten, e.g. while "sudo objcopy foo -o bar" creates a new file bar with root access, "sudo objcopy foo" will keep the owner and group of foo unchanged. Currently llvm-objcopy and llvm-strip behave differently, always changing the ownership to root.

objcopy foo -o bar => objcopy foo bar

I am unsure that we want to adopt the GNU behavior because

(1) there were security vulnerabilities
(2) the code is subtle and untestable
(3) the behavior is self inconsistent. The behaviors of sudo objcopy foo and sudo -u mail objcopy foo (if user mail has write permission to the directory) are different.

While I can understand the hesitation, we have a real need for this. And IIUC, even the trunk GNU tools has the existing old behavior, though they may have changed the internal implementation.

(4) I believe very few packages require the behavior. It can only be used by root (CAP_CHOWN), so it is a very minor use case.

I think I already provided examples of usage in Gentoo and Chrome OS that llvm objcopy breaks. This does not seem to be a case of minor use case to me.

The discrepancy prevents Chrome OS from migrating to llvm-objcopy and llvm-strip as Portage (the build system) builds binaries following recommended steps documented at the man page of GNU strip as follows,

Not having a behavior has pros and cons and I understand that it may sometimes cause friction.
But from https://chromium-review.googlesource.com/q/gnu+strip , it seems that very few crbug.com bugs require the GNU fchown (was: chown) behavior.

Please see the query links below for a more accurate count.

Could you just update the build instructions for these few packages?

I see over 600 uses of of fowners in Gentoo and over 55 in Chrome OS (public). Not all uses are related to binary files but asking to fix the build system instead of fixing the incompatibility is a non-trivial and a no-go.

[1] https://github.com/gentoo/gentoo/search?q=fowners&type=code
[2] https://source.chromium.org/search?q=fowners&sq=&ss=chromiumos

OK. If that is the case, I am happy that llvm-objcopy adopts the GNU objcopy behavior.

But I think we should do better: instead of: (1) create a temp file (2) open it by name again (3) check it is a regular file with one hard link & it is writable (S_IWUSR) (4) fstat (5) fchown,
we should just modify the destination in place (for objcopy file and strip file). If the file has been mapped readonly, on many platforms it may be unwritable. We have to accept this.

The various security checks is to prevent vulnerability.

I don't think we want to introduce a filename based chmod API. Unless it is essential (very unlikely), it is easy for developers to make time-of-check time-of-use (TOCTOU) race condition allowing bypass of protection mechanism due to symlink attacks.

binutils 2.36 will fix the issue https://sourceware.org/bugzilla/show_bug.cgi?id=26945

Can you please clarify what is being fixed, the use of internal renaming APIs or the current behavior?

GNU objcopy before 2.36 had the symlink attach vulnerability. The patch would introduce the similar vulnerability to llvm-objcopy.

Regarding whether llvm-objcopy should adopt the GNU objcopy smart_rename behavior, I am a bit hesitant, somewhat inclining to no.

It seems that very few packages actually require the behavior - most packages should use something like install -o ... -g ... -m ... to copy files, the owner/group is not too useful.

We are hitting this problem in Portage (used in Gentoo Linux and Chrome OS) which is both a build system and a package manager. One of the steps after building a package is to install the files with the required permissions. And another action is to strip the files and add the debuginfo link etc. These steps execute as root privilege in portage as they modify the file system and we do not have control over it.

Because of the deviation from GNU objcopy/strip, stripping or just adding debuginfo link causes the "security sensitive" binaries to lose the intended permissions. We already had to revert twice switching to llvm objcopy/strip for this reason in Chrome OS before we figured out the problem. Therefore, I'd argue that this change is needed for GNU compatibility.
It is also what most tools already do when dealing with existing files e.g. "sudo vi foo.c" or "echo foo |& sudo tee a.c" does not change file ownership or group.

It's different. vi foo.c and tee a.c edits the file in place while strip/llvm-strip creates a temporary file and renames it to the original name.

I am simply trying to point out the normal behavior of tools when modifying existing files by various. The internal implementation obviously will vary between tools. From user point of view, it does not matter if the tool does a copy/rename or in place modification to do so.

Could you just update the build instructions for these few packages?

I see over 600 uses of of fowners in Gentoo and over 55 in Chrome OS (public). Not all uses are related to binary files but asking to fix the build system instead of fixing the incompatibility is a non-trivial and a no-go.

[1] https://github.com/gentoo/gentoo/search?q=fowners&type=code
[2] https://source.chromium.org/search?q=fowners&sq=&ss=chromiumos

OK. If that is the case, I am happy that llvm-objcopy adopts the GNU objcopy behavior.

But I think we should do better: instead of: (1) create a temp file (2) open it by name again (3) check it is a regular file with one hard link & it is writable (S_IWUSR) (4) fstat (5) fchown,
we should just modify the destination in place (for objcopy file and strip file). If the file has been mapped readonly, on many platforms it may be unwritable. We have to accept this.

The various security checks is to prevent vulnerability.

Thanks, we will work on an improved patch with the added checks.

From a user standpoint, I agree that "objcopy foo.o" should not be changing the ownership of a file, regardless of the implementation (which, in the case of llvm-objcopy, is getting it "wrong" compared to GNU objcopy, because we use temporary file and rename it at the end, without fixing ownership). Sounds like there's agreement there.

I also wonder if we should make this logic uniform to be least surprising, e.g. "llvm-objcopy foo.o bar.o" or "llvm-strip foo.o -o bar.o" should similarly have the same ownership for foo.o and bar.o. Also, if "llvm-objcopy foo.o --split-dwo=foo.dwo" should propagate ownership of foo.o to foo.dwo. As a user, that may be the most intuitive. This seems to break from what GNU objcopy does, however.

If so, I think we'd want to move some of this elsewhere, e.g. into FileOutputBuffer (chown the temporary file immediately instead of patching it at the end), or restoreStatOnFile (so it runs on both the object file and the split dwo file), so that it doesn't apply just to this narrow use case.

jcai19 added a comment.EditedJan 8 2021, 11:43 AM

GNU objcopy before 2.36 had the symlink attach vulnerability. The patch would introduce the similar vulnerability to llvm-objcopy.

@MaskRay, I am not familiar with this vulnerability, so just to clarify, my code can be exploited by a symlink attack after it opens the file but before it calls fchown, i.e. (1) llvm-objcopy opens the output file (2) attackers created a symlink with the same name, (3) fchown would change the ownership of the symlink instead of the intended file. Is that correct?

But I think we should do better: instead of: (1) create a temp file (2) open it by name again (3) check it is a regular file with one hard link & it is writable (S_IWUSR) (4) fstat (5) fchown,
we should just modify the destination in place (for objcopy file and strip file). If the file has been mapped readonly, on many platforms it may be unwritable. We have to accept this.

Can we use absolute path instead of relative path to shield the code from symlink attacks? This should also avoid the above-mentioned issue on readonly files.

EDIT: while testing about readonly files, I found another discrepancy between llvm-objcopy and objcopy

$ chmod -w foo
$ objcopy foo
objcopy: unable to copy file 'foo'; reason: Permission denied

The same commands worked with llvm-objcopy.

$ chmod -w foo
$ llvm-objcopy foo
$ echo $?
0

In-place write may fix that as well. I will start working on that approach.

jcai19 added a comment.EditedJan 8 2021, 11:55 AM

From a user standpoint, I agree that "objcopy foo.o" should not be changing the ownership of a file, regardless of the implementation (which, in the case of llvm-objcopy, is getting it "wrong" compared to GNU objcopy, because we use temporary file and rename it at the end, without fixing ownership). Sounds like there's agreement there.

Thanks for the support!

I also wonder if we should make this logic uniform to be least surprising, e.g. "llvm-objcopy foo.o bar.o" or "llvm-strip foo.o -o bar.o" should similarly have the same ownership for foo.o and bar.o. Also, if "llvm-objcopy foo.o --split-dwo=foo.dwo" should propagate ownership of foo.o to foo.dwo. As a user, that may be the most intuitive. This seems to break from what GNU objcopy does, however.

This sounds interesting and probably is worth another thread for discussion.

If so, I think we'd want to move some of this elsewhere, e.g. into FileOutputBuffer (chown the temporary file immediately instead of patching it at the end), or restoreStatOnFile (so it runs on both the object file and the split dwo file), so that it doesn't apply just to this narrow use case.

IMO it's safer to decouple the GNU-compatible and GNU-incompatible changes in case the latter breaks existing code and we need to revert it before we find a long-term solution.

But I think we should do better: instead of: (1) create a temp file (2) open it by name again (3) check it is a regular file with one hard link & it is writable (S_IWUSR) (4) fstat (5) fchown,
we should just modify the destination in place (for objcopy file and strip file). If the file has been mapped readonly, on many platforms it may be unwritable. We have to accept this.

@MaskRay I took a look at the how ELF binaries are handled and from what I understand, the code directly copies the content of the input file into temporary file (https://github.com/llvm-mirror/llvm/blob/master/tools/llvm-objcopy/ELF/Object.cpp#L1786), before replacing the output file with it. So unless we are willing to allocate memory space to save its content (which will be very costive for large binaries), it does not appear modifying the input file in place is easy. And if we ever decide to take that path, it looks like we would have to make a lot of changes as the code of interest is buried deeply in the call hierarchy, and in addition we would likely have to change how the other file formats handle such cases to be consistent. Please let me know if I misunderstood your proposal or overlooked anything, but so far it appears to me the best place to preserve the file ownership without major change to the code is in the proposed place.

MaskRay added subscribers: robertu94, mgorny.EditedJan 14 2021, 3:25 PM

But I think we should do better: instead of: (1) create a temp file (2) open it by name again (3) check it is a regular file with one hard link & it is writable (S_IWUSR) (4) fstat (5) fchown,
we should just modify the destination in place (for objcopy file and strip file). If the file has been mapped readonly, on many platforms it may be unwritable. We have to accept this.

@MaskRay I took a look at the how ELF binaries are handled and from what I understand, the code directly copies the content of the input file into temporary file (https://github.com/llvm-mirror/llvm/blob/master/tools/llvm-objcopy/ELF/Object.cpp#L1786), before replacing the output file with it. So unless we are willing to allocate memory space to save its content (which will be very costive for large binaries), it does not appear modifying the input file in place is easy. And if we ever decide to take that path, it looks like we would have to make a lot of changes as the code of interest is buried deeply in the call hierarchy, and in addition we would likely have to change how the other file formats handle such cases to be consistent. Please let me know if I misunderstood your proposal or overlooked anything, but so far it appears to me the best place to preserve the file ownership without major change to the code is in the proposed place.

If that case, I'll personally want to rethink whether the Gentoo usage is a case we want to support directly. I have mentioned that the GNU objcopy/strip smart_rename still has vulnerability and the owner preservation does not work for non-root (CAP_CHOWN) users.

chmod o+wx .
sudo chown mail a.o
sudo -u bin strip a.o  # owner becomes bin. llvm-strip has the same behavior.

You can create a wrapper script for strip used by root.

CC some Gentoo folks: @mgorny @robertu94

I have mentioned that the GNU objcopy/strip smart_rename still has vulnerability and the owner preservation does not work for non-root (CAP_CHOWN) users.

Can this be solved by limiting the affected user to root only?

chmod o+wx .
sudo chmod mail a.o
sudo -u bin strip a.o  # owner becomes bin. llvm-strip has the same behavior.

I tried to verify with this example but failed. Could you share the complete list of commands? Thanks.

robertu94 added a comment.EditedJan 15 2021, 7:07 AM

I could be wrong (@mgorny would know better), but it was my understanding that portage (if sandbox and userpriv FEATURES in are set in make.conf or not overwritten from /usr/share/portage/config/make.globals) drops user privileges to the portage user and group using the setuid/setgid functions during the unpack, prepare, configure, compile, and test phases. During the install process fakeroot can be used (but not the default), but I don't believe that privileges can be dropped separately from fakeroot.

jcai19 updated this revision to Diff 320981.Feb 2 2021, 8:28 PM

Update the ownership of the temporary file, instead of the overwritten file.

manojgupta added inline comments.Feb 2 2021, 8:49 PM
llvm/lib/Support/FileOutputBuffer.cpp
142

Can we pass the file descriptor (File.FD) directly?

llvm/lib/Support/Unix/Path.inc
1216

Please use the existing file descriptor and call fchown. I don't think there is any need to open and close the file if file is already open.

jcai19 updated this revision to Diff 321007.Feb 2 2021, 10:32 PM

Address @manojgupta's comments.

jcai19 marked 2 inline comments as done.Feb 2 2021, 10:33 PM
jhenderson added inline comments.Feb 3 2021, 12:51 AM
llvm/include/llvm/Support/FileOutputBuffer.h
38

Nit: comments should end with a . (same above).

llvm/lib/Support/FileOutputBuffer.cpp
131

Is there a way we could improve this API to create the temporary file with the right UID and GID in the first place? That seems like a better idea to me than having to do something post creation.

A thought to consider (I don't know what the desirable behaviour is here): what happens if a user were to Ctrl-C (or the program otherwise terminated unexpectedly) after the TempFile creation, before the ownership was updated? Would the temp file be in a desirable state?

llvm/lib/Support/Unix/Path.inc
1216

Nit: missing trailing full stop at end of comment. It would be helpful to explain the why behind the retry in this comment. It isn't obvious to those less experienced with this sort of code.

rupprecht added inline comments.Feb 3 2021, 4:31 PM
llvm/lib/Support/FileOutputBuffer.cpp
131

Calling setfsuid was suggested at one point, but that isn't a portable solution. Something like that would be the ideal fix IMO.

jcai19 updated this revision to Diff 321298.Feb 3 2021, 7:40 PM
jcai19 marked an inline comment as done.

Add trailing comma to comments.

jcai19 marked an inline comment as done.Feb 3 2021, 7:46 PM
jcai19 added inline comments.
llvm/lib/Support/FileOutputBuffer.cpp
131

Thanks @rupprecht for the clarification. @jhenderson I agree that would be the best and it was in fact what I had in mind when I started, but I could not find suitable functions to use.

llvm/lib/Support/Unix/Path.inc
1216

I called RetryAfterSignal simply to follow the way the other functions in this file were written. Unfortunately I do not know exactly when this will be helpful.

Thanks for the update. No more comments from me at this time, but I'd prefer others to finish looking and approve it.

rupprecht added inline comments.Feb 4 2021, 3:52 PM
llvm/lib/Support/FileOutputBuffer.cpp
140

WDYT about doing a stat on the newly-generated tempfile so we can skip the redundant fchown if possible? (i.e. the common case of running as a non-root user)

208–209

Personally, I was a little confused by Flags & F_keep_ownership, and I think it would be simpler to pass in the unsigned Flags param itself instead of bool KeepOwnership, and deferring the Flags & F_keep_ownership check until later.

llvm/lib/Support/Unix/Path.inc
1215

auto FChown

dxf added a subscriber: dxf.Feb 4 2021, 4:54 PM
rupprecht added inline comments.Feb 4 2021, 5:13 PM
llvm/lib/Support/FileOutputBuffer.cpp
140

IIUC, this has to be fstat on the input file in order to work securely. Otherwise, there is a race between reading the file (for copying the object from), and deciding what to fchown the new file to.

jcai19 marked an inline comment as done.Feb 4 2021, 6:43 PM
jcai19 added inline comments.
llvm/lib/Support/FileOutputBuffer.cpp
140

WDYT about doing a stat on the newly-generated tempfile so we can skip the redundant fchown if possible? (i.e. the common case of running as a non-root user)

I'm not sure I followed here. How can we set the ownership of the tempfile to the ownership of the output file without inquiring it? And how do we skip fchown in that case?

(i.e. the common case of running as a non-root user)

Right, I will limit this to only affect root user, i.e. when Stat.getUser() is 0. This should also avoid some of the issues with "sudo -u" issue mentioned by @MaskRay.

IIUC, this has to be fstat on the input file in order to work securely. Otherwise, there is a race between reading the file (for copying the object from), and deciding what to fchown the new file to.

I think fs::status by default calls stat (https://llvm.org/doxygen/Unix_2Path_8inc_source.html#l00739), which should have the same effect with fast according to man page: "fstat() is identical to stat(), except that the file about which information is to be retrieved is specified by the file descriptor fd."

208–209

We could definitely delay the and operation into createOnDiskBuffer, but it is not a member function so we would have to use something like Flags & FileOutputBuffer::F_keep_ownership. I feel with the current implementation we could avoid spilling implementation details of FileOutputBuffer into createOnDiskBuffer. WDTY?

jcai19 added inline comments.Feb 4 2021, 6:48 PM
llvm/lib/Support/FileOutputBuffer.cpp
208–209

To further clarify, I meant we could create less dependency between FileOutputBuffer and createOnDiskBuffer with the current implementation so I feel it could be marginally better. Also correction to my typo, WDTY -> WDYT.

jcai19 updated this revision to Diff 321632.Feb 4 2021, 7:19 PM
jcai19 marked an inline comment as done.

Limit the patch to root user.

manojgupta added inline comments.Feb 4 2021, 7:23 PM
llvm/lib/Support/FileOutputBuffer.cpp
142

The check for root user looks incorrect to me. The stat check for root should be done on the temp file, not the existing file.

jcai19 added inline comments.Feb 4 2021, 7:24 PM
llvm/lib/Support/FileOutputBuffer.cpp
142

Yes, I realized it the moment I sent the update. Will fix it.

manojgupta added inline comments.Feb 4 2021, 7:37 PM
llvm/lib/Support/FileOutputBuffer.cpp
142

I also think that calling fstat (fd) is better than stat. So I think using the existing status function which uses file descriptor is better here: std::error_code status(int FD, file_status &Result)

rupprecht added inline comments.Feb 4 2021, 7:57 PM
llvm/lib/Support/FileOutputBuffer.cpp
140

The suggestion to use fstat is here: https://sourceware.org/bugzilla/show_bug.cgi?id=26945#c2. Although as mentioned, it may not be exploitable.

jcai19 updated this revision to Diff 321645.Feb 4 2021, 8:40 PM

Check the user of the temporary file instead of the output file to decide the user that invokes the tool.

jcai19 marked an inline comment as done.Feb 4 2021, 8:47 PM
jcai19 added inline comments.
llvm/lib/Support/FileOutputBuffer.cpp
140

Thanks for the link, but if I understand you correctly, it seems to me that we only have path of the output file, so I have to call stat instead, which is the same as fstat except it takes the path instead of the file descriptor.

rupprecht added inline comments.Feb 5 2021, 11:55 AM
llvm/lib/Support/FileOutputBuffer.cpp
140

I do realize that fstat is just stat but applied to a file descriptor. But the point is that relying on the stat properties to not change is a race if you use path names instead of file descriptors.

The idea is to do:

  1. int input_fd = fopen("input-file.o") // llvm-objcopy copies sections etc. from this fd
  2. stat input_stat = fstat(input_fd)
  3. fchown(output_fd, input_stat.uid, input_stat.gid)

As opposed to:

  1. int input_fd = fopen("input-file.o") // llvm-objcopy copies sections etc. from this fd
  2. Attacker changes "input-file.o"
  3. stat input_stat = stat("input-file.o") // stat coming from the wrong file
  4. fchown(output_fd, input_stat.uid, input_stat.gid) // wrong chown applied
manojgupta added inline comments.Feb 5 2021, 12:08 PM
llvm/lib/Support/FileOutputBuffer.cpp
140

yes, I think this should be doable to avoid the call to stat completely here since objcopy already has the Stat information for the input file.
Jian, please see my comment for llvm-objcopy.cpp file.

llvm/tools/llvm-objcopy/llvm-objcopy.cpp
275

Since original file Stat is available, please re-use this instead of calling stat again on original file later.

jcai19 updated this revision to Diff 321855.Feb 5 2021, 12:55 PM

Reuse the result of stat instead of calling it on the output file again.

llvm/lib/Support/FileOutputBuffer.cpp
140

@rupprecht @manojgupta Thank you for the clarification, now I understand what you mean :). I've updated my patch to reuse the stat from earlier, as you suggest.

MaskRay requested changes to this revision.Feb 11 2021, 4:57 PM

Overall, LGTM. I am fine with fchown, but the description needs a lot of changes to proceed.

For one, the fix advice in https://reviews.llvm.org/D93881#2483542 needs to be taken, objcopy xxx -o yyy does not work.

You can mention that as of binutils 2.36, GNU strip calls chown(2) for sudo strip a and sudo strip a -o a, but not sudo strip a -o b or sudo strip a -o ./a.

The discrepancy prevents Chrome OS from migrating to llvm-objcopy and llvm-strip as Portage (the build system) builds binaries following recommended steps documented at the man page of GNU strip as follows,

This part should be rephrased. I think whatever llvm-objcopy/llvm-strip resolution is, Portage may need to migrate away from in-place strip as well @mgorny @robertu94, like https://git.archlinux.org/pacman.git/commit/?id=88d054093c1c99a697d95b26bd9aad5bc4d8e170
To prevent pessimistic double write behavior with binutils 2.37.

With llvm-strip and llvm-objcopy, the last two steps change ownership of binaries to root and cause them to become inaccessible to the intended user.

The question is about when root strips that is intended to have different owner/group, after atomic rename, should root restore owner/group?
I have asked many folks for their opinions. The result is contentious. It can be argued either way. In particular, I got objection like:
"the question is not 'how to justify fchown' but 'why justify fchown'" "if someone wants to sudo strip, let them do so and deal with the consequences"

I finally swing to fchown because the implicit intention to keep the file accessible by original owner/group.
The argument is stronger than "because doing this patch makes us more compatible with GNU, so we do it." While generally it makes sense to keep compatibility when I can do that for free,
the binutils behavior is quite problematic. Do we want to take its new behavior? D96308 I have thought very hard and finally think "no".

This does mean unfortunate platform differences and slightly inconsistent behaviors, but we are going to take it.

llvm/lib/Support/FileOutputBuffer.cpp
137

if requested.

This revision now requires changes to proceed.Feb 11 2021, 4:57 PM

Overall, LGTM. I am fine with fchown, but the description needs a lot of changes to proceed.

Thanks!

For one, the fix advice in https://reviews.llvm.org/D93881#2483542 needs to be taken, objcopy xxx -o yyy does not work.

Oh i've updated my commit message locally a while ago, I did not realize this has not been updated in Phabricator. Will definitely update it.

You can mention that as of binutils 2.36, GNU strip calls chown(2) for sudo strip a and sudo strip a -o a, but not sudo strip a -o b or sudo strip a -o ./a.

The discrepancy prevents Chrome OS from migrating to llvm-objcopy and llvm-strip as Portage (the build system) builds binaries following recommended steps documented at the man page of GNU strip as follows,

This part should be rephrased. I think whatever llvm-objcopy/llvm-strip resolution is, Portage may need to migrate away from in-place strip as well @mgorny @robertu94, like https://git.archlinux.org/pacman.git/commit/?id=88d054093c1c99a697d95b26bd9aad5bc4d8e170
To prevent pessimistic double write behavior with binutils 2.37.

With llvm-strip and llvm-objcopy, the last two steps change ownership of binaries to root and cause them to become inaccessible to the intended user.

The question is about when root strips that is intended to have different owner/group, after atomic rename, should root restore owner/group?
I have asked many folks for their opinions. The result is contentious. It can be argued either way. In particular, I got objection like:
"the question is not 'how to justify fchown' but 'why justify fchown'" "if someone wants to sudo strip, let them do so and deal with the consequences"

I finally swing to fchown because the implicit intention to keep the file accessible by original owner/group.
The argument is stronger than "because doing this patch makes us more compatible with GNU, so we do it." While generally it makes sense to keep compatibility when I can do that for free,
the binutils behavior is quite problematic. Do we want to take its new behavior? D96308 I have thought very hard and finally think "no".

This does mean unfortunate platform differences and slightly inconsistent behaviors, but we are going to take it.

Will address the rest.

Overall, LGTM. I am fine with fchown, but the description needs a lot of changes to proceed.

Thanks!

For one, the fix advice in https://reviews.llvm.org/D93881#2483542 needs to be taken, objcopy xxx -o yyy does not work.

Oh i've updated my commit message locally a while ago, I did not realize this has not been updated in Phabricator. Will definitely update it.

If you use arc diff 'HEAD^', you need --verbatim to amend the description.

You can mention that as of binutils 2.36, GNU strip calls chown(2) for sudo strip a and sudo strip a -o a, but not sudo strip a -o b or sudo strip a -o ./a.

The discrepancy prevents Chrome OS from migrating to llvm-objcopy and llvm-strip as Portage (the build system) builds binaries following recommended steps documented at the man page of GNU strip as follows,

This part should be rephrased. I think whatever llvm-objcopy/llvm-strip resolution is, Portage may need to migrate away from in-place strip as well @mgorny @robertu94, like https://git.archlinux.org/pacman.git/commit/?id=88d054093c1c99a697d95b26bd9aad5bc4d8e170
To prevent pessimistic double write behavior with binutils 2.37.

With llvm-strip and llvm-objcopy, the last two steps change ownership of binaries to root and cause them to become inaccessible to the intended user.

The question is about when root strips that is intended to have different owner/group, after atomic rename, should root restore owner/group?
I have asked many folks for their opinions. The result is contentious. It can be argued either way. In particular, I got objection like:
"the question is not 'how to justify fchown' but 'why justify fchown'" "if someone wants to sudo strip, let them do so and deal with the consequences"

I finally swing to fchown because the implicit intention to keep the file accessible by original owner/group.
The argument is stronger than "because doing this patch makes us more compatible with GNU, so we do it." While generally it makes sense to keep compatibility when I can do that for free,
the binutils behavior is quite problematic. Do we want to take its new behavior? D96308 I have thought very hard and finally think "no".

This does mean unfortunate platform differences and slightly inconsistent behaviors, but we are going to take it.

Will address the rest.

jcai19 retitled this revision from [llvm-objcopy] preserve file ownership when overwritten to [llvm-objcopy] preserve file ownership when overwritten by root.Feb 11 2021, 5:55 PM
jcai19 edited the summary of this revision. (Show Details)

If you use arc diff 'HEAD^', you need --verbatim to amend the description.

Gotcha. Thanks. I've updated the message.

jcai19 updated this revision to Diff 323196.Feb 11 2021, 5:59 PM

Update a comment.

jcai19 marked an inline comment as done.Feb 11 2021, 6:00 PM
MaskRay accepted this revision.Feb 11 2021, 6:10 PM

always changing the owner and gropu to root. The

gropu -> group.

The subsequent bullet points can use 1/2/3/4 instead of 1/1/1/1.

This revision is now accepted and ready to land.Feb 11 2021, 6:10 PM
jcai19 edited the summary of this revision. (Show Details)Feb 11 2021, 6:51 PM