This is an archive of the discontinued LLVM Phabricator instance.

[libcxx][lit] Support shared directories in ssh executor
Needs RevisionPublic

Authored by arichardson on Jul 23 2020, 8:24 AM.

Details

Reviewers
ldionne
Group Reviewers
Restricted Project
Summary

If the remote system and the local one share a directory (using e.g. NFS or
SMB), we can use this to avoid two ssh invocations and one scp invocation.
This commit adds new flags --shared-mount-{local,remote}-path to ssh.py
that when passed use the shared directory instead of scp to upload files
to the remote system.

Example usage:
./bin/llvm-lit projects/libcxx/test "-Dexecutor=/path/to/llvm-project/libcxx/utils/ssh.py --shared-mount-local-path=$(pwd)/tmp --shared-mount-remote-path=/mnt/tmp --host testuser@local-qemu"

This can massively speed up running tests:
Running the libcxxabi test suite via ssh.py on localhost on a Linux test
system takes 87 seconds instead of previously 200.
real 4m10.025s -> 1m46.165s
user 1m3.192s -> 0m45.396s
sys 0m12.088s -> 0m9.795s

Depends on D84097

Diff Detail

Event Timeline

arichardson created this revision.Jul 23 2020, 8:24 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 23 2020, 8:24 AM
Herald added 1 blocking reviewer(s): Restricted Project. · View Herald Transcript
ldionne requested changes to this revision.Sep 14 2020, 11:59 AM

I like to keep these executors really simple and doing only a single thing. Would you consider splitting this out into a separate executor instead? Here's what I got, roughly:

def ssh(args, command):
    cmd = ['ssh', '-oBatchMode=yes']
    if args.extra_ssh_args is not None:
        cmd.extend(shlex.split(args.extra_ssh_args))
    return cmd + [args.host, command]


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('--host', type=str, required=True)
    parser.add_argument('--execdir', type=str, required=True)
    parser.add_argument('--debug', action="store_true", required=False)
    parser.add_argument('--extra-ssh-args', type=str, required=False)
    parser.add_argument('--shared-mount-local-path', type=str, required=False,
                        help="Local path that is shared with the remote system (e.g. via NFS)")
    parser.add_argument('--shared-mount-remote-path', type=str, required=False,
                        help="Path for the shared directory on the remote system")
    parser.add_argument('--codesign_identity', type=str, required=False, default=None)
    parser.add_argument('--env', type=str, nargs='*', required=False, default=dict())
    parser.add_argument("command", nargs=argparse.ONE_OR_MORE)
    args = parser.parse_args()
    commandLine = args.command

    # Allow using a directory that is shared between the local system and the
    # remote on. This can significantly speed up testing by avoiding three
    # additional ssh connections for every test.
    if args.shared_mount_local_path:
        if not os.path.isdir(args.shared_mount_local_path):
            sys.exit("ERROR: --shared-mount-local-path is not a directory.")
        if not args.shared_mount_remote_path:
            sys.exit("ERROR: missing --shared-mount-remote-path argument.")

    # Create a temporary directory where the test will be run.
    # That is effectively the value of %T on the remote host.
    localTmp = tempfile.mkdtemp(prefix="libcxx.", dir=args.shared_mount_local_path)
    remoteTmp = os.path.join(args.shared_mount_remote_path, os.path.basename(localTmp))

    # HACK:
    # If an argument is a file that ends in `.tmp.exe`, assume it is the name
    # of an executable generated by a test file. We call these test-executables
    # below. This allows us to do custom processing like codesigning test-executables
    # and changing their path when running on the remote host. It's also possible
    # for there to be no such executable, for example in the case of a .sh.cpp
    # test.
    isTestExe = lambda exe: exe.endswith('.tmp.exe') and os.path.exists(exe)
    pathOnRemote = lambda file: posixpath.join(remoteTmp, os.path.basename(file))

    try:
        # Do any necessary codesigning of test-executables found in the command line.
        if args.codesign_identity:
            for exe in filter(isTestExe, commandLine):
                subprocess.check_call(['xcrun', 'codesign', '-f', '-s', args.codesign_identity, exe], env={})

        shutil.copy2(args.execdir, args.shared_mount_local_path)

        # Make sure all test-executables in the remote command line have 'execute'
        # permissions on the remote host. The host that compiled the test-executable
        # might not have a notion of 'executable' permissions.
        for exe in map(pathOnRemote, filter(isTestExe, commandLine)):
            remoteCommands.append('chmod +x {}'.format(exe))

        # Execute the command through SSH in the temporary directory, with the
        # correct environment. We tweak the command line to run it on the remote
        # host by transforming the path of test-executables to their path in the
        # temporary directory on the remote host.
        commandLine = (pathOnRemote(x) if isTestExe(x) else x for x in commandLine)
        remoteCommands.append('cd {}'.format(remoteTmp))
        if args.env:
            remoteCommands.append('export {}'.format(' '.join(args.env)))
        remoteCommands.append(subprocess.list2cmdline(commandLine))

        # Finally, SSH to the remote host and execute all the commands.
        executeRemoteCommand = ssh(args, ' && '.join(remoteCommands))
        rc = subprocess.call(executeRemoteCommand)
        return rc

    finally:
        # Make sure the temporary directory is removed when we're done.
        shutil.rmtree(localTmp)


if __name__ == '__main__':
    exit(main())

The interesting part is that we don't need the additional scp args anymore, and we don't need to go through hoops related to the tarball.

There is more duplication indeed, however it might be possible to remove it in other ways (e.g. factoring it out into something common). Also, for other stuff like code signing, I've been thinking it should be part of the compilation step anyway, not the executor. So this bit of duplication would go away from all executors. Thoughts?

This revision now requires changes to proceed.Sep 14 2020, 11:59 AM