This is an archive of the discontinued LLVM Phabricator instance.

Differential D84421

[libcxx][lit] Support shared directories in ssh executor
Needs Revision · Public

Authored by arichardson on Jul 23 2020, 8:24 AM.

Details

Reviewers

ldionne

Group Reviewers

Restricted Project

Summary

If the remote system and the local one share a directory (e.g. via NFS or
SMB), we can use it to avoid two ssh invocations and one scp invocation per
test. This commit adds new flags --shared-mount-{local,remote}-path to ssh.py;
when they are passed, files are copied into the shared directory instead of
being uploaded to the remote system with scp.
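
The path translation this relies on can be sketched as follows. This is a minimal illustration, not code from the patch: the helper name to_remote_path and the sample paths are hypothetical, and the remote host is assumed to use POSIX-style paths.

```python
import os
import posixpath


def to_remote_path(local_path, local_root, remote_root):
    """Map a path under the local shared mount to its remote equivalent.

    Hypothetical helper (not part of the patch). The remote side is assumed
    to use POSIX path separators.
    """
    rel = os.path.relpath(local_path, local_root)
    # Refuse paths that escape the shared directory.
    if rel == os.pardir or rel.startswith(os.pardir + os.sep):
        raise ValueError('{} is not inside {}'.format(local_path, local_root))
    if rel == os.curdir:
        return remote_root
    # Re-join the relative components with POSIX separators for the remote.
    return posixpath.join(remote_root, *rel.split(os.sep))


print(to_remote_path('/home/user/tmp/libcxx.abc123/test.tmp.exe',
                     '/home/user/tmp', '/mnt/tmp'))
```

Because both sides see the same files, only the directory prefix differs; nothing has to be transferred over the ssh connection.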

Example usage:
./bin/llvm-lit projects/libcxx/test "-Dexecutor=/path/to/llvm-project/libcxx/utils/ssh.py --shared-mount-local-path=$(pwd)/tmp --shared-mount-remote-path=/mnt/tmp --host testuser@local-qemu"

This can massively speed up running tests:
Running the libcxxabi test suite via ssh.py on localhost on a Linux test
system takes 87 seconds instead of previously 200.
real 4m10.025s -> 1m46.165s
user 1m3.192s -> 0m45.396s
sys 0m12.088s -> 0m9.795s
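
For readers who want the wall-clock figures above as a ratio, here is a small sketch that converts the `time`-style stamps to seconds. The helper name to_seconds is hypothetical and only handles the `<minutes>m<seconds>s` form shown above.

```python
import re


def to_seconds(stamp):
    """Convert a `time`-style stamp such as '4m10.025s' to seconds."""
    match = re.fullmatch(r'(\d+)m(\d+(?:\.\d+)?)s', stamp)
    if match is None:
        raise ValueError('unrecognised time stamp: {!r}'.format(stamp))
    return int(match.group(1)) * 60 + float(match.group(2))


# The "real" (wall-clock) values quoted in the summary.
before = to_seconds('4m10.025s')
after = to_seconds('1m46.165s')
print('wall-clock: {:.3f}s -> {:.3f}s, ratio {:.2f}'.format(before, after, before / after))
```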

Depends on D84097

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

arichardson created this revision. Jul 23 2020, 8:24 AM

Herald added a project: Restricted Project. Jul 23 2020, 8:24 AM

Herald added 1 blocking reviewer(s): Restricted Project.

Herald added subscribers: libcxx-commits, dexonsmith.

Harbormaster failed remote builds in B65385: Diff 280138! Jul 23 2020, 9:04 AM

ping?

ldionne mentioned this in D84097: [libcxx][lit] Add support for custom ssh/scp flags in ssh.py. Sep 14 2020, 11:55 AM

I like to keep these executors really simple and doing only a single thing. Would you consider splitting this out into a separate executor instead? Here's what I got, roughly:

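import argparse
import os
import posixpath
import shlex
import shutil
import subprocess
import sys
import tempfile


def ssh(args, command):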
    cmd = ['ssh', '-oBatchMode=yes']
    if args.extra_ssh_args is not None:
        cmd.extend(shlex.split(args.extra_ssh_args))
    return cmd + [args.host, command]


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('--host', type=str, required=True)
    parser.add_argument('--execdir', type=str, required=True)
    parser.add_argument('--debug', action="store_true", required=False)
    parser.add_argument('--extra-ssh-args', type=str, required=False)
    parser.add_argument('--shared-mount-local-path', type=str, required=False,
                        help="Local path that is shared with the remote system (e.g. via NFS)")
    parser.add_argument('--shared-mount-remote-path', type=str, required=False,
                        help="Path for the shared directory on the remote system")
    parser.add_argument('--codesign_identity', type=str, required=False, default=None)
    parser.add_argument('--env', type=str, nargs='*', required=False, default=dict())
    parser.add_argument("command", nargs=argparse.ONE_OR_MORE)
    args = parser.parse_args()
    commandLine = args.command

    # Allow using a directory that is shared between the local system and the
    # remote one. This can significantly speed up testing by avoiding three
    # additional ssh connections for every test.
    if args.shared_mount_local_path:
        if not os.path.isdir(args.shared_mount_local_path):
            sys.exit("ERROR: --shared-mount-local-path is not a directory.")
        if not args.shared_mount_remote_path:
            sys.exit("ERROR: missing --shared-mount-remote-path argument.")

    # Create a temporary directory where the test will be run.
    # That is effectively the value of %T on the remote host.
    localTmp = tempfile.mkdtemp(prefix="libcxx.", dir=args.shared_mount_local_path)
    remoteTmp = os.path.join(args.shared_mount_remote_path, os.path.basename(localTmp))

    # HACK:
    # If an argument is a file that ends in `.tmp.exe`, assume it is the name
    # of an executable generated by a test file. We call these test-executables
    # below. This allows us to do custom processing like codesigning test-executables
    # and changing their path when running on the remote host. It's also possible
    # for there to be no such executable, for example in the case of a .sh.cpp
    # test.
    isTestExe = lambda exe: exe.endswith('.tmp.exe') and os.path.exists(exe)
    pathOnRemote = lambda file: posixpath.join(remoteTmp, os.path.basename(file))

    try:
        # Do any necessary codesigning of test-executables found in the command line.
        if args.codesign_identity:
            for exe in filter(isTestExe, commandLine):
                subprocess.check_call(['xcrun', 'codesign', '-f', '-s', args.codesign_identity, exe], env={})

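        # Copy the contents of the execution directory into the shared temporary
        # directory, which the remote host sees as remoteTmp.
        for name in os.listdir(args.execdir):
            src = os.path.join(args.execdir, name)
            dst = os.path.join(localTmp, name)
            if os.path.isdir(src):
                shutil.copytree(src, dst)
            else:
                shutil.copy2(src, dst)

        remoteCommands = []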

        # Make sure all test-executables in the remote command line have 'execute'
        # permissions on the remote host. The host that compiled the test-executable
        # might not have a notion of 'executable' permissions.
        for exe in map(pathOnRemote, filter(isTestExe, commandLine)):
            remoteCommands.append('chmod +x {}'.format(exe))

        # Execute the command through SSH in the temporary directory, with the
        # correct environment. We tweak the command line to run it on the remote
        # host by transforming the path of test-executables to their path in the
        # temporary directory on the remote host.
        commandLine = (pathOnRemote(x) if isTestExe(x) else x for x in commandLine)
        remoteCommands.append('cd {}'.format(remoteTmp))
        if args.env:
            remoteCommands.append('export {}'.format(' '.join(args.env)))
        remoteCommands.append(subprocess.list2cmdline(commandLine))

        # Finally, SSH to the remote host and execute all the commands.
        executeRemoteCommand = ssh(args, ' && '.join(remoteCommands))
        rc = subprocess.call(executeRemoteCommand)
        return rc

    finally:
        # Make sure the temporary directory is removed when we're done.
        shutil.rmtree(localTmp)


if __name__ == '__main__':
    exit(main())

The interesting part is that we don't need the additional scp args anymore, and we don't need to jump through hoops related to the tarball.

There is indeed more duplication; however, it might be possible to remove it in other ways (e.g. by factoring it out into something common). Also, for other stuff like code signing, I've been thinking it should be part of the compilation step anyway, not the executor. So this bit of duplication would go away from all executors. Thoughts?
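
One way the "factoring it out into something common" idea could look is sketched below. This is only an illustration under stated assumptions: the helper name build_remote_script and its signature are hypothetical, and it quotes arguments with shlex.quote (Python 3) rather than the subprocess.list2cmdline call used in the code above.

```python
import shlex


def build_remote_script(remote_tmp, command_line, env=(), test_exes=()):
    """Return the shell snippet an executor would run on the remote host.

    Hypothetical shared helper; neither the name nor the signature comes
    from the patch or from the review.
    """
    commands = []
    # Test executables may lack the execute bit on the remote host.
    for exe in test_exes:
        commands.append('chmod +x {}'.format(shlex.quote(exe)))
    # Run the test from inside the temporary directory.
    commands.append('cd {}'.format(shlex.quote(remote_tmp)))
    if env:
        # Each entry is expected to look like NAME=value.
        commands.append('export {}'.format(' '.join(shlex.quote(e) for e in env)))
    commands.append(' '.join(shlex.quote(arg) for arg in command_line))
    return ' && '.join(commands)


exe = '/mnt/tmp/libcxx.abc123/t.tmp.exe'
print(build_remote_script('/mnt/tmp/libcxx.abc123', [exe, '--flag'],
                          env=['FOO=1'], test_exes=[exe]))
```

Both the scp-based executor and a shared-directory executor could call such a function and differ only in how the files reach the remote temporary directory.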

This revision now requires changes to proceed. Sep 14 2020, 11:59 AM

Revision Contents

Path: libcxx/utils/ssh.py
Size: 78 lines

Diff 280138

libcxx/utils/ssh.py

 #!/usr/bin/env python
 #===----------------------------------------------------------------------===##
 #
 # Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
 # See https://llvm.org/LICENSE.txt for license information.
 # SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
 #
 #===----------------------------------------------------------------------===##

 """
 Runs an executable on a remote host.

 This is meant to be used as an executor when running the C++ Standard Library
 conformance test suite.
 """
+from __future__ import print_function

 import argparse
 import os
 import posixpath
 import shlex
+import shutil
 import subprocess
 import sys
 import tarfile
 import tempfile

 def ssh(args, command):
     cmd = ['ssh', '-oBatchMode=yes']
     if args.extra_ssh_args is not None:
         cmd.extend(shlex.split(args.extra_ssh_args))
     return cmd + [args.host, command]


 def scp(args, src, dst):
     cmd = ['scp', '-q', '-oBatchMode=yes']
     if args.extra_scp_args is not None:
         cmd.extend(shlex.split(args.extra_scp_args))
     return cmd + [src, '{}:{}'.format(args.host, dst)]


+def debug(cmdlineArgs, *args, **kwargs):
+    if cmdlineArgs.debug:
+        print(*args, file=sys.stderr, **kwargs)
+
+
+def createTempdir(args):
+    if args.shared_mount_local_path:
+        localTmp = tempfile.mkdtemp(prefix="libcxx.", dir=args.shared_mount_local_path)
+        remoteTmp = os.path.join(args.shared_mount_remote_path, os.path.basename(localTmp))
+        debug(args, "Created local tmp dir:", localTmp)
+        debug(args, "Assuming remote path is:", remoteTmp)
+        return localTmp, remoteTmp
+    remoteTmp = subprocess.check_output(ssh(args, 'mktemp -d /tmp/libcxx.XXXXXXXXXX'),
+                                        universal_newlines=True).strip()
+    debug(args, "Create remote tmp dir:", remoteTmp)
+    return None, remoteTmp
+
+
+def cleanupTempdir(args, localTmp, remoteTmp):
+    if localTmp is not None:
+        # If we have a shared mount we can simply delete the local directory.
+        assert args.shared_mount_local_path is not None
+        debug(args, "Deleting local tmp dir:", localTmp)
+        shutil.rmtree(localTmp)
+    else:
+        debug(args, "Deleting remote tmp dir:", remoteTmp)
+        subprocess.check_call(ssh(args, 'rm -r {}'.format(remoteTmp)))
+
+
+def uploadTarball(args, src, dst):
+    if args.shared_mount_local_path:
+        # TODO: when using a shared mount we should probably just copy all files
+        # and skip creating the tarball.
+        remoteRelPath = os.path.relpath(dst, args.shared_mount_remote_path)
+        # The remote path should be inside the shared directory:
+        assert not remoteRelPath.startswith('..'), remoteRelPath
+        localPath = os.path.join(args.shared_mount_local_path, remoteRelPath)
+        debug(args, "Copying", src, "->", localPath)
+        shutil.copy2(src, localPath)
+    else:
+        debug(args, "Uploading", src, "->", dst, "using scp")
+        subprocess.check_call(scp(args, src, dst))
+    return dst
+
+
 def main():
     parser = argparse.ArgumentParser()
     parser.add_argument('--host', type=str, required=True)
     parser.add_argument('--execdir', type=str, required=True)
+    parser.add_argument('--debug', action="store_true", required=False)
     parser.add_argument('--extra-ssh-args', type=str, required=False)
     parser.add_argument('--extra-scp-args', type=str, required=False)
+    parser.add_argument('--shared-mount-local-path', type=str, required=False,
+                        help="Local path that is shared with the remote system (e.g. via NFS)")
+    parser.add_argument('--shared-mount-remote-path', type=str, required=False,
+                        help="Path for the shared directory on the remote system")
     parser.add_argument('--codesign_identity', type=str, required=False, default=None)
     parser.add_argument('--env', type=str, nargs='*', required=False, default=dict())
     parser.add_argument("command", nargs=argparse.ONE_OR_MORE)
     args = parser.parse_args()
     commandLine = args.command

+    # Allow using a directory that is shared between the local system and the
+    # remote one. This can significantly speed up testing by avoiding three
+    # additional ssh connections for every test.
+    if args.shared_mount_local_path:
+        if not os.path.isdir(args.shared_mount_local_path):
+            sys.exit("ERROR: --shared-mount-local-path is not a directory.")
+        if not args.shared_mount_remote_path:
+            sys.exit("ERROR: missing --shared-mount-remote-path argument.")
+
     # Create a temporary directory where the test will be run.
     # That is effectively the value of %T on the remote host.
-    tmp = subprocess.check_output(ssh(args, 'mktemp -d /tmp/libcxx.XXXXXXXXXX'), universal_newlines=True).strip()
+    localTmp, remoteTmp = createTempdir(args)

     # HACK:
     # If an argument is a file that ends in `.tmp.exe`, assume it is the name
     # of an executable generated by a test file. We call these test-executables
     # below. This allows us to do custom processing like codesigning test-executables
     # and changing their path when running on the remote host. It's also possible
     # for there to be no such executable, for example in the case of a .sh.cpp
     # test.
     isTestExe = lambda exe: exe.endswith('.tmp.exe') and os.path.exists(exe)
-    pathOnRemote = lambda file: posixpath.join(tmp, os.path.basename(file))
+    pathOnRemote = lambda file: posixpath.join(remoteTmp, os.path.basename(file))

     try:
         # Do any necessary codesigning of test-executables found in the command line.
         if args.codesign_identity:
             for exe in filter(isTestExe, commandLine):
                 subprocess.check_call(['xcrun', 'codesign', '-f', '-s', args.codesign_identity, exe], env={})

         # tar up the execution directory (which contains everything that's needed
         # to run the test), and copy the tarball over to the remote host.
         try:
             tmpTar = tempfile.NamedTemporaryFile(suffix='.tar', delete=False)
             with tarfile.open(fileobj=tmpTar, mode='w') as tarball:
                 tarball.add(args.execdir, arcname=os.path.basename(args.execdir))

             # Make sure we close the file before we scp it, because accessing
             # the temporary file while still open doesn't work on Windows.
             tmpTar.close()
-            remoteTarball = pathOnRemote(tmpTar.name)
-            subprocess.check_call(scp(args, tmpTar.name, remoteTarball))
+            remoteTarball = uploadTarball(args, tmpTar.name, pathOnRemote(tmpTar.name))
         finally:
             # Make sure we close the file in case an exception happens before
             # we've closed it above -- otherwise close() is idempotent.
             tmpTar.close()
             os.remove(tmpTar.name)

         # Untar the dependencies in the temporary directory and remove the tarball.
         remoteCommands = [
-            'tar -xf {} -C {} --strip-components 1'.format(remoteTarball, tmp),
+            'tar -xf {} -C {} --strip-components 1'.format(remoteTarball, remoteTmp),
             'rm {}'.format(remoteTarball)
         ]

         # Make sure all test-executables in the remote command line have 'execute'
         # permissions on the remote host. The host that compiled the test-executable
         # might not have a notion of 'executable' permissions.
         for exe in map(pathOnRemote, filter(isTestExe, commandLine)):
             remoteCommands.append('chmod +x {}'.format(exe))

         # Execute the command through SSH in the temporary directory, with the
         # correct environment. We tweak the command line to run it on the remote
         # host by transforming the path of test-executables to their path in the
         # temporary directory on the remote host.
         commandLine = (pathOnRemote(x) if isTestExe(x) else x for x in commandLine)
-        remoteCommands.append('cd {}'.format(tmp))
+        remoteCommands.append('cd {}'.format(remoteTmp))
         if args.env:
             remoteCommands.append('export {}'.format(' '.join(args.env)))
         remoteCommands.append(subprocess.list2cmdline(commandLine))

         # Finally, SSH to the remote host and execute all the commands.
-        rc = subprocess.call(ssh(args, ' && '.join(remoteCommands)))
+        executeRemoteCommand = ssh(args, ' && '.join(remoteCommands))
+        debug(args, "Executing test using", executeRemoteCommand)
+        rc = subprocess.call(executeRemoteCommand)
         return rc

     finally:
         # Make sure the temporary directory is removed when we're done.
-        subprocess.check_call(ssh(args, 'rm -r {}'.format(tmp)))
+        cleanupTempdir(args, localTmp, remoteTmp)


 if __name__ == '__main__':
     exit(main())