Page MenuHomePhabricator

Make --reproduce produce more reproducible logs.
ClosedPublic

Authored by ruiu on Apr 29 2016, 12:40 PM.

Details

Summary

The aim of this patch is to make it easy to re-run the command without
updating paths in the command line. Here is a use case.

Assume that Alice is having an issue with lld and is reporting the issue
to developer Bob. Alice's current directly is /home/alice/work and her
command line is "ld.lld -o foo foo.o ../bar.o". She adds "--reproduce repro"
to the command line and re-run. Then the following text will be produced as
invocation.txt (notice that the paths are rewritten so that they are
relative to /home/alice/work/repro.)

-o home/alice/work/foo home/alice/work/foo.o home/alice/bar.o

The command also produces the following files by copying inputs.

/home/alice/repro/home/alice/work/foo.o
/home/alice/repro/home/alice/bar.o

Alice zips the directory and send it to Bob. Bob get an archive from Alice
and extract it to his home directory as /home/bob/repro. Now his directory
have the following files.

/home/bob/repro/invocation.txt
/home/bob/repro/home/alice/work/foo.o
/home/bob/repro/home/alice/bar.o

Bob then re-run the command with these files by the following commands.

cd /home/bob/repro
ld.lld @invocation.txt

Diff Detail

Repository
rL LLVM

Event Timeline

ruiu updated this revision to Diff 55649.Apr 29 2016, 12:40 PM
ruiu retitled this revision from to Make --reproduce produce more reproducible logs..
ruiu updated this object.
ruiu added reviewers: davide, rafael.
ruiu added a subscriber: llvm-commits.
silvas added a subscriber: silvas.EditedApr 29 2016, 1:22 PM

This sounds like it would lead to a highly degraded experience for windows users.

Let me explain the technique that I use to reconstruct games which works pretty well and has been used across windows/linux/mac for some time now, to see if it would work for your use case too:

Basically, your resulting directory structure is as follows (filenames changed for LLD's --reproduce use case):

repro.dir/
    invocation.txt
    FS_ROOT/
        C/
            Users/
                alice/

On Unix it would be something like:

repro.dir/
    invocation.txt
    FS_ROOT/
        home/
            alice/

The key thing is that the FS_ROOT captures every file that LLD needs to reproduce the build.

Then there is one more key step: all paths on the command line are converted to be relative paths to repro.dir. That is, LLD, when run with its CWD set to repro.dir, sees only paths of the form FS_ROOT/.... (I call such paths "relatively absolute").

This approach has worked for me in practice robustly across Windows, Mac, and Linux.

In the future, we can make invocation.txt a bit more robust, such as making it a rsp file so that we know we will interpret correctly when invoked as ld.lld @invocation.txt. We control the response file quoting for this and so we can do this robustly. It seems overcomplicated to me (and non-portable) to have a dependency on sh/bash simply for running our own invocation.

silvas requested changes to this revision.Apr 29 2016, 1:31 PM
silvas added a reviewer: silvas.

I think we can accomplish the goals of this patch without adding a dependence on a shell.

ELF/DriverUtils.cpp
126 ↗(On Diff #55649)

It would be better to make invocation.txt be a response file (where we control the quoting) so that we don't depend on having a unix shell. That also avoids needing to hardcode /ld.lld which makes things further unportable.

This revision now requires changes to proceed.Apr 29 2016, 1:31 PM
ruiu added a comment.Apr 29 2016, 2:40 PM

I don't agree that this would degrade the experience on Windows because currently on Windows (or even on Unix) you have to edit the command line to re-run the command. And it wouldn't add dependency to the Unix shell -- I'm trying to make it reconstractable so I employed a mechanical rule that is the same as the Unix's single quotation rule. Currently, you cannot distinguish spaces between filename with original spaces between arguments. So, at least, this wouldn't make things worse.

But I think it is a nice idea to make it produce a response file.

I could rewrite paths so that they are relative to FS_ROOT, but how would you handle thin archive files that contain pathnames rather than actual member files?

ruiu updated this revision to Diff 55679.Apr 29 2016, 3:53 PM
ruiu edited edge metadata.
  • Updated as per Sean's comment. invocation.txt is now a response file.
ruiu updated this object.Apr 29 2016, 3:54 PM
ruiu edited edge metadata.
ruiu updated this revision to Diff 55680.Apr 29 2016, 3:56 PM
  • Fix typo.
In D19737#417445, @ruiu wrote:

I could rewrite paths so that they are relative to FS_ROOT, but how would you handle thin archive files that contain pathnames rather than actual member files?

We could look inside them in this code for --reproduce or save some information in our InputFile structures. Don't we have a single point of truth for every file we open?

ELF/DriverUtils.cpp
122 ↗(On Diff #55680)

Clang generates response files for the linker and has been in production for some time. Can we share that code? I think the code we want to share is part of Command::writeResponseFile.

144 ↗(On Diff #55680)

You may want to call this invocation.rsp since it is a response file (it obviously doesn't matter, but since we are changing the meaning from the previous invocation.txt it might help to avoid confusion).

152 ↗(On Diff #55680)

Nice reusing libOption for this! This is really nice!

silvas added inline comments.Apr 29 2016, 4:54 PM
test/ELF/reproduce.s
20 ↗(On Diff #55680)

When I tried the first version of your patch this was failing on windows for the same reason as the original reproduce patch did (lit expands a path foo/C:/...). Can we avoid the problem?

ruiu added inline comments.Apr 29 2016, 5:00 PM
ELF/DriverUtils.cpp
122 ↗(On Diff #55680)

I'll try.

144 ↗(On Diff #55680)

How about response.txt? I think on Windows "rsp" extension is not recognized so that it is not easy to look inside.

152 ↗(On Diff #55680)

:)

test/ELF/reproduce.s
20 ↗(On Diff #55680)

Reid suggested defining a new variable "%:t" to get a path without a drive letter. But I'm not sure if it's worth to do. I'm inclined to simply add "REQUIRES: shell" to this file.

silvas added inline comments.Apr 29 2016, 5:07 PM
test/ELF/reproduce.s
20 ↗(On Diff #55680)

As we have seen there are some Windows-specific considerations for --reproduce, so disabling testing for this feature on Windows is unfortunately not an option.

ruiu updated this revision to Diff 55698.Apr 29 2016, 6:40 PM
  • Make tests work on Windows.
ELF/DriverUtils.cpp
122 ↗(On Diff #55680)

Command::writeResponseFile is not reusable for this purpose, and that function is too small to generalize. I think this function is just fine.

ruiu updated this revision to Diff 55699.Apr 29 2016, 6:41 PM
  • Update comments.
rafael accepted this revision.Apr 29 2016, 7:15 PM
rafael edited edge metadata.

I think this is an improvement. Both in feature (getting a response file with quotes) and code organization.

There is more work to be done, but I think it is fine to do it as a follow up patches.

So, fine with me, but please wait to see if Sean has more to add.

silvas accepted this revision.Apr 29 2016, 7:57 PM
silvas edited edge metadata.

Yeah, let's get this in.

This revision is now accepted and ready to land.Apr 29 2016, 7:57 PM
davide accepted this revision.Apr 29 2016, 8:06 PM
davide edited edge metadata.

LGTM

ruiu added a comment.Apr 29 2016, 8:07 PM

Can you guys also take a look at http://reviews.llvm.org/D19757? This patch depends on that.

This revision was automatically updated to reflect the committed changes.