This is an archive of the discontinued LLVM Phabricator instance.

[ELF] Fail the link early if an output path is invalid
ClosedPublic

Authored by jhenderson on Feb 28 2017, 1:06 AM.

Details

Summary

If a user has a long link, e.g. due to a large LTO link, they do not wish to run it and find that it failed because there was a mistake in their command-line, after they waited for some significant amount of time. This change adds some basic checking of the linker output file path, which is run shortly after parsing the command-line and linker script. An error is emitted if LLD cannot write to the specified path.

Diff Detail

Repository
rL LLVM

Event Timeline

jhenderson created this revision.Feb 28 2017, 1:06 AM
ruiu edited edge metadata.Feb 28 2017, 9:38 AM

I see the need for doing this, but we shouldn't introduce a new flag. You can do that unconditionally.

I'd define a bool Writable(StringRef Path) function which returns true if Path is writable, and use that for checking. In that function, I'd just open the given path in O_APPEND mode to see if it succeeds (and then close it). In general, when you do a file-system operation, the easiest and most reliable way of knowing whether it will succeed is simply to do it, so if you are in doubt whether you can open a file, try to open it and see if it fails.
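A minimal standalone sketch of the check ruiu describes (the helper name is an assumption for illustration, not LLD's actual code), using POSIX open(2) with O_APPEND:

```cpp
#include <fcntl.h>
#include <unistd.h>
#include <string>

// Hypothetical helper, not LLD's actual API: returns true if Path can be
// opened for appending (creating it if it does not exist yet).
static bool isWritable(const std::string &Path) {
  int FD = open(Path.c_str(), O_WRONLY | O_APPEND | O_CREAT, 0666);
  if (FD < 0)
    return false; // open(2) failed: bad path, no permission, etc.
  close(FD);
  return true;
}
```

Note that with O_CREAT this check creates the file if it is missing, which is exactly the kind of side effect debated in the rest of this thread.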

In D30449#688721, @ruiu wrote:

I see the need for doing this, but we shouldn't introduce a new flag. You can do that unconditionally.

I don't have a problem with removing the flag, but that does mean we lose any testing of the later final attempt to write the file (see the changes in ELF/map-file.s and ELF/driver.test).

I'd define a bool Writable(StringRef Path) function which returns true if Path is writable, and use that for checking. In that function, I'd just open the given path in O_APPEND mode to see if it succeeds (and then close it). In general, when you do a file-system operation, the easiest and most reliable way of knowing whether it will succeed is simply to do it, so if you are in doubt whether you can open a file, try to open it and see if it fails.

Would we mind if a file is created or its modtime updated when it otherwise wouldn't be (e.g. an opt-remarks file would always be generated, regardless of the presence of bitcode)? This might, for example, cause a build system to incorrectly think a file was up to date. We could maybe work around this by listing such files and resetting them to their previous state where necessary (e.g. deleting them or resetting their mod-time).

Relatedly, I noticed that Path.inc in the LLVMSupport library already has an "access" function which reports whether or not a file is writable. It doesn't attempt to write to the file explicitly, so we can't be 100% certain that the real write will work, but from what I can see it does all the basic checks (file exists or can be created, isn't read-only, etc.), and it returns a std::error_code if things go wrong, which we could use to give a sensible message. So we could simply use something like that directly rather than writing our own version.

ruiu added a comment.Mar 1 2017, 10:41 AM

I see the need for doing this, but we shouldn't introduce a new flag. You can do that unconditionally.

I don't have a problem with removing the flag, but that does mean we lose any testing of the later final attempt to write the file (see the changes in ELF/map-file.s and ELF/driver.test).

That is fine. I don't want to add a new flag just for the test.

I'd define a bool Writable(StringRef Path) function which returns true if Path is writable, and use that for checking. In that function, I'd just open the given path in O_APPEND mode to see if it succeeds (and then close it). In general, when you do a file-system operation, the easiest and most reliable way of knowing whether it will succeed is simply to do it, so if you are in doubt whether you can open a file, try to open it and see if it fails.

Would we mind if a file is created or its modtime updated when it otherwise wouldn't be (e.g. an opt-remarks file would always be generated, regardless of the presence of bitcode)? This might, for example, cause a build system to incorrectly think a file was up to date. We could maybe work around this by listing such files and resetting them to their previous state where necessary (e.g. deleting them or resetting their mod-time).

Relatedly, I noticed that Path.inc in the LLVMSupport library already has an "access" function which reports whether or not a file is writable. It doesn't attempt to write to the file explicitly, so we can't be 100% certain that the real write will work, but from what I can see it does all the basic checks (file exists or can be created, isn't read-only, etc.), and it returns a std::error_code if things go wrong, which we could use to give a sensible message. So we could simply use something like that directly rather than writing our own version.

Ah, I had totally forgotten until now that Unix has the access(2) function (and that LLVM provides a wrapper for it). That's probably what we want. I'd define a function like

void ensureWritable(StringRef Path) {
  if (auto EC = access(Path, AccessMode::Write))
    error("cannot access " + Path + ": " + EC.message());
}

and use that in the driver like

// Fail early if output files are not writable. If a user has a long link, e.g.
// due to a large LTO link, they do not wish to run it and find that it failed
// because there was a mistake in their command-line.
ensureWritable(Config->Output);
ensureWritable(Config->MapFile);
...
if (ErrorCount)
  return;
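For reference, LLVM's sys::fs::access wraps POSIX access(2). A minimal standalone illustration of the underlying call's behaviour (assuming a POSIX system; the helper and file names here are arbitrary, not from LLD):

```cpp
#include <unistd.h>
#include <cstdio>

// access(2) checks permissions against the process's real UID/GID without
// opening the file; W_OK asks whether writing would be permitted.
static bool posixWritable(const char *Path) {
  return access(Path, W_OK) == 0;
}
```

Like the sys::fs wrapper, this fails with ENOENT for a file that does not exist yet, even if it could be created -- the exact limitation jhenderson runs into later in the thread.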

Using access won't account for more modern systems based on ACLs.

Terse comment. My point is that things are not so simple - the *only* way to know if a file is writable is to try to write to it. Let's embrace that. I think that LLD has the opportunity now to simply open all the files that it might write to upfront. This is simple and robust behaviour. Yes, this is a behaviour change, in that we would touch files that we might not touch with the current behaviour, but the new behaviour is benign. For example, in the case of an opt-remarks file, emitting an empty file is just as correct as not emitting a file.

ruiu added a comment.Mar 2 2017, 9:13 AM

As long as access(2) works conservatively, it should be fine. In other words, we can tolerate access(2) reporting non-writable files as writable, but not the other way round. Are you suggesting that in some environments access(2) reports writable files as non-writable?

For "access" reporting writable files as non-writable - I can't think of a non-artificial case (e.g. a custom filesystem/driver or a bug).

What I am suggesting is that "access" or any other permissions check is only ever an approximation of the checks that the system will do when you attempt to open a file. Therefore, don't rely on it. Opening the files upfront is simple and robust.

Some reasons not to try to open the files upfront:

  1. Build tools might not cope with files being touched when the link did not actually complete.
  2. The existence of an output file might trigger some action; an empty file could trigger that action when it was not intended.
  3. A link might fail when it would succeed with the current lld behaviour. This is because we might give an error that a file is not writable even though (due to the code path taken during the link) lld would never actually have written to that file.

Case 1: Not a problem because lld will return a non-zero return code. Build systems must cope with this scenario as tools can error after partially writing output files.

Case 2: Whatever action is triggered will be quick (as it will have nothing to do) and will be correct as an empty output file is semantically equivalent to a missing one.

Case 3: This can be argued to be a feature rather than a problem. It is also a problem with the proposed "access" check scheme or any other early permissions check scheme.

ruiu added a comment.Mar 2 2017, 11:54 AM

I'm not totally against the idea of opening files upfront, and I agree that it is in general good practice to avoid issues such as security problems, but what we are trying to do here is to fail early if the link is likely to fail eventually. Unless access(2) reports writable files as non-writable, the behavior of the linker won't change with this patch: it will fail sooner or later if files are not writable. So I guess this use of access(2) is a hint.

For "access" reporting writable files as non-writable - I can't think of a non-artificial case (e.g. a custom filesystem/driver or a bug).

What I am suggesting is that "access" or any other permissions check is only ever an approximation of the checks that the system will do when you attempt to open a file. Therefore, don't rely on it. Opening the files upfront is simple and robust.

Some reasons not to try to open the files upfront:

  1. Build tools might not cope with files being touched when the link did not actually complete.
  2. The existence of an output file might trigger some action; an empty file could trigger that action when it was not intended.
  3. A link might fail when it would succeed with the current lld behaviour. This is because we might give an error that a file is not writable even though (due to the code path taken during the link) lld would never actually have written to that file.

Case 1: Not a problem because lld will return a non-zero return code. Build systems must cope with this scenario as tools can error after partially writing output files.

Case 2: Whatever action is triggered will be quick (as it will have nothing to do) and will be correct as an empty output file is semantically equivalent to a missing one.

For files specified by -o, this is not a problem. We don't leave empty files behind if the link fails: we use the create-new-file-and-rename technique, so files are always created atomically.
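The create-new-file-and-rename technique ruiu mentions can be sketched in a few lines (a hedged illustration with simplified naming and error handling; this is not LLD's actual FileOutputBuffer implementation):

```cpp
#include <cstdio>
#include <fstream>
#include <string>

// Write Data to a temp file in the same directory, then rename it into
// place. On POSIX, rename(2) replaces the destination atomically, so a
// failed link never leaves a partial or empty output file behind.
static bool writeAtomically(const std::string &Path, const std::string &Data) {
  std::string Tmp = Path + ".tmp";
  {
    std::ofstream OS(Tmp, std::ios::binary | std::ios::trunc);
    if (!OS)
      return false; // could not create the temp file
    OS << Data;
    if (!OS)
      return false; // write failed (e.g. disk full)
  } // stream closed and flushed before the rename
  if (std::rename(Tmp.c_str(), Path.c_str()) != 0) {
    std::remove(Tmp.c_str()); // clean up the temp file on failure
    return false;
  }
  return true;
}
```

This also shows ruiu's earlier point about subtle semantics: the rename succeeds even when the destination file itself is read-only, as long as the containing directory is writable.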

Case 3: This can be argued to be a feature rather than a problem. It is also a problem with the proposed "access" check scheme or any other early permissions check scheme.

@ruiu - I didn't know lld used "create and rename". Nice.

I really like the "open all files upfront" approach. However, I agree that if we can guarantee that "access" is conservative then using access is a smaller behaviour change (we only need to worry about case 3 not case 2 in my list) and a smaller change.

I did a bit more thinking about when "access" could fail. From https://linux.die.net/man/2/access:

access() may not work correctly on NFS file systems with UID mapping enabled, because UID mapping is done on the server and hidden from the client, which checks permissions.

I suppose that NFS comes under the custom filesystems I mentioned - the difference is that linking over NFS is something that I have seen in real life.

I have no idea if using NFS could result in "access" reporting a file as non-writable when it is actually writable and I don't have the means to test this.

jhenderson updated this revision to Diff 90484.Mar 3 2017, 8:12 AM
jhenderson edited the summary of this revision. (Show Details)

I've implemented a version that tries to write to the file to test whether it is writable or not. I think that @bd1976llvm is right to worry about the various pitfalls that access may or may not run into. However, more importantly, I discovered several other issues with access (see below). I also noticed that the way -print-map works is to override the MapFile config setting, which means that any writability checking needs to happen after this point. I've added a test for this to the existing map file test. Finally, I extended the new test to cover read-only files, as that is the other most obvious use-case beyond invalid paths, which is now covered by the updated behaviour.

This new change exposed a crash in LLVMSupport to do with handling of stdout redirection via raw_fd_ostream. I have uploaded a patch, D30540, to fix this issue.

The first problem with access() is that it works just as well on directories as on files: if you call access on a directory to test whether it is writable, it will pass (assuming the user has the relevant permissions to write to the directory). The other problem is that it fails if the file does not exist, even if the file could be created. This leads to much more complicated-looking code that also has to check whether a path is a directory, whether it points to a valid directory to create a file in, etc. I've included a possible implementation below. This, combined with the other potential issues raised, leads me to think that opening the file for append is the safer bet.

static void ensureWritable(StringRef Path, StringRef FileDescription) {
  if (Path.empty()) {
    return;
  }

  auto EmitError = [&](StringRef Reason) {
    error("cannot write to " + FileDescription + " " + Path + ": " + Reason);
  };

  if (auto EC = sys::fs::access(Path, sys::fs::AccessMode::Write)) {
    // access() emits a no_such_file_or_directory error if the file does not
    // exist, even if we could create it. In this case, check that the
    // containing directory exists and is accessible.
    if (EC == std::errc::no_such_file_or_directory) {
      SmallString<128> Directory = Path;
      sys::path::remove_filename(Directory);
      if ((EC = sys::fs::access(Directory, sys::fs::AccessMode::Write))) {
        // Directory is not writable.
        EmitError(EC.message());
      } else if (!sys::fs::is_directory(Directory)) {
        // The "directory" is actually a writable file, so the path is
        // invalid. EC holds a success value here, so use an explicit message.
        EmitError("parent path is not a directory");
      }
    } else {
      // File is not writable.
      EmitError(EC.message());
    }
  } else if (sys::fs::is_directory(Path)) {
    EmitError("path is a directory");
  }
}
ruiu added inline comments.Mar 6 2017, 9:18 AM
ELF/Driver.cpp
537–541 ↗(On Diff #90484)

We don't actually open a file in append mode to write to the file specified by the -o option. What we do is create a temporary file in the same directory as the output file and then rename it to the destination file name after everything is done.

That causes a subtle difference if a file is not writable but the directory containing it is, because appending to the file will fail but renaming over it will succeed. There are other subtle semantics like this, and we don't want to replicate all that logic here.

How about using FileOutputBuffer here? FileOutputBuffer doesn't write to the specified file until commit() is called, so we can use it for error checking. You can call FileOutputBuffer::create() to simulate opening the file, check whether that succeeded, and then just return from this function without calling commit(). This doesn't change the file's access time and does exactly what FileOutputBuffer will do later.

540 ↗(On Diff #90484)

cannot write to -> cannot open, because we are just trying to open a file.

632–641 ↗(On Diff #90484)

This is too early to check for file permissions because you can set the output file name using linker scripts. You want to do that in LinkerDriver::link.

test/ELF/map-file.s
14 ↗(On Diff #90484)

To make sure the test does not interpret - as an option, you want to replace - with ./-. But if you use FileOutputBuffer for error checking, I wouldn't bother checking for this.

jhenderson added inline comments.Mar 9 2017, 7:32 AM
ELF/Driver.cpp
537–541 ↗(On Diff #90484)

FileOutputBuffer doesn't currently check for writability of the file in question - there's a FIXME about it, and I have started work on a series of changes to implement and test it, which hopefully should be ready soon. Other than that, I agree that it would make sense to use it for the output file, but we don't use it for the other two instances. I guess in an ideal world we should use the same process as we do later on in each case, but that means two different ensureWritable functions, and also a reliance on later implementation details, which seems fragile.

The semantics of "-" for the map file and opt-remarks file are important: we don't want to fail here at all in this case, even if there is, for example, a read-only file called "-", because in this case we never create the file anyway. We could special-case that particular behaviour (don't check if the file path is "-"), which would only be a problem if the output file was set to "-" and couldn't be created for some reason (and even then, our later attempt to write it should still pick it up).

Using FileOutputBuffer may also have the same flaws as described originally - without committing, it is impossible to know for certain that it is possible to write the final output file. However, I think it would be unlikely to give false errors; it would just occasionally let through things that it shouldn't.

632–641 ↗(On Diff #90484)

Thanks, I didn't know that output file was a linker script option.

test/ELF/map-file.s
14 ↗(On Diff #90484)

I think this is a missing test case anyway, even without my change. It documents the behaviour of "-Map=-" to print to standard out and not create a file.

jhenderson added inline comments.Mar 9 2017, 9:16 AM
ELF/Driver.cpp
537–541 ↗(On Diff #90484)

So it turns out that FileOutputBuffer doesn't like it when you pass a directory as the path, so after I experimented with this, I started getting assertion failures from the map file test... Indeed, it is possible to provoke an assertion in LLD without my changes by specifying "-o /". The fix is simple, and I will look at that soon, hopefully.

632–641 ↗(On Diff #90484)

I looked further at this, and discovered that to bail out truly early, we need to do the checks before linker script processing, because otherwise, the link starts to try to add the input files. I don't know the cost of this phase, but I would think we can check both before and after linker script processing, since the first check is a no-op if the output file hasn't been set yet, and the second will just be a repeat of the first, if it is set by the command-line.

ruiu added inline comments.Mar 9 2017, 9:47 AM
ELF/Driver.cpp
537–541 ↗(On Diff #90484)

File writability doesn't really matter; the FIXME is wrong. What matters here is the writability of the directory containing the destination file. Even if the output file is not writable, we can and will delete it and create a new one, as long as the directory is writable. And that was my point -- it isn't easy to predict whether a file-system operation will succeed or fail, so the best way of knowing is to do the same thing early. In this case, that's FileOutputBuffer.

The problem that FileOutputBuffer doesn't work if you pass a directory is a different problem. Let's not fix that in this patch.

Can you fix one problem at a time? This patch has been hanging around for a while, but I want to see it checked in because it addresses a real problem. The -o option is the most common case you want to address; other command-line options, like --map-file, are a lower priority. So let's do it incrementally. I think you want to use FileOutputBuffer to check the -o option.

jhenderson updated this revision to Diff 91348.Mar 10 2017, 8:05 AM
jhenderson marked 2 inline comments as done.
jhenderson edited the summary of this revision. (Show Details)

Addressed review comments. In particular, I've removed the references to the map file and optimization remarks file, and switched to using the FileOutputBuffer. I've also moved the check later to make sure it covers the linker script case.

On Windows, at least, file writability does matter, in that FileOutputBuffer tries and fails to delete read-only files. However, given that point, the FIXME is irrelevant anyway and should probably be removed to avoid further confusion.

I agree on the directory issue being a separate patch (I wasn't proposing fixing it as part of this one anyway, but potentially as a precursor to it).

In response to Rafael's mailing list comments:

Given that Windows and Unix read-only file/directory behaviour differs, we need to be careful about any assumptions we make. In particular, just because we can write to a directory on Windows does not mean we can overwrite a file in that directory, so the principle of "can we create a file in the given directory" is insufficient. We need to know that we can both create the file and overwrite an existing file if one is present. That's a more general point than this change is intended to address, though. As long as we use the same system early on as we do later on (which this change now does), we will emit errors early in the same cases, whatever those may be (bad paths, read-only files or directories, etc.).

I'm not sure I agree that leaving broken output during a crashing link is a big deal, personally, but again, that's not really relevant given the nature of this change now. It may be relevant in the case of the map file and optimization remarks files later on though.

ruiu added a comment.Mar 10 2017, 11:58 AM

LGTM. I think this is very useful. Thank you for doing this!

ELF/Driver.cpp
804 ↗(On Diff #91348)

Replace auto with ErrorOr<std::unique_ptr<FileOutputBuffer>>

ruiu added a comment.Mar 10 2017, 12:03 PM

It's still LGTM, but for the sake of future development, I will write a comment.

So the reason we need to use FileOutputBuffer twice, once in the driver and once in the writer, is that when you call FileOutputBuffer::create, you need to know the size of the output file. Obviously, until we finish linking, we don't know the size of the output, so we can't do that in the driver. However, there should be no technical reason not to allow file resizing after an instance of FileOutputBuffer is created. So I think the right way of fixing this is to change FileOutputBuffer to allow that and use it in LLD.

jhenderson updated this revision to Diff 91528.Mar 13 2017, 2:43 AM
jhenderson marked an inline comment as done.

Changed to explicit type instead of auto.

@ruiu - I don't currently have commit access (hopefully coming soon). Would you mind committing on my behalf?

This revision was automatically updated to reflect the committed changes.