
Details

Reviewers

klimek
mehdi_amini

Commits

rG8b62e0887d72: Added optional validation of svn sources to Dockerfiles.
rL313359: Added optional validation of svn sources to Dockerfiles.

Summary

This commit also adds a script to compute sha256 hashes of llvm checkouts.

Diff Detail

Build Status

Buildable 10130
Build 10130: arc lint + arc unit

Event Timeline

ilya-biryukov created this revision. Aug 24 2017, 1:40 AM

ilya-biryukov added a parent revision: D37098: Use temporary directory when building docker image. Aug 24 2017, 1:42 AM

Would this be all obsolete with git repos?

In D37099#852251, @mehdi_amini wrote:

Would this be all obsolete with git repos?

I would think so.

Also all the python code seems complicated to me for what is basically find | grep -v '/\.git/' | grep -v '/\.svn/' | LC_ALL=C sort | tar -cf - -T - --no-recursion | sha1sum, can you elaborate why this is needed?

That's roughly what I started with, but it's more complicated if you want to compute separate checksums for projects inside a single source tree checkout (e.g. llvm, llvm/tools/clang, llvm/projects/lldb, etc.).
The use-case is rather bizarre:

1. You continuously import and review commits from the svn repo for a set of llvm projects (e.g., llvm, clang, lldb). You use a single source tree checkout for that.
2. You want to build docker images for a subset of those projects (e.g., only for llvm and clang). However, you want to check out source code from the official svn repo, not your mirror.
3. You want to make sure the code you check out matches the code you reviewed in step 1.

There is also a problem that some files inside LLVM repo use svn substitutions. The substitutions are mostly fine, but $Date$ and $LastChangedDate$ are locale-specific and we have to account for that.
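The date-keyword problem described above can be sketched as follows. This is a minimal illustration, not the code under review: it assumes Python 3 and file contents read as bytes, and `collapse_svn_dates` is a hypothetical helper name. The sketch uses a raw replacement string so that `\1` is read as a group backreference.

```python
import re

# Matches expanded keywords such as "$Date: 2017-08-24 ... $".
SVN_DATES_REGEX = re.compile(rb"\$(Date|LastChangedDate)[^\$]+\$")


def collapse_svn_dates(contents):
    """Return contents with expanded date keywords collapsed to their bare form."""
    # The raw replacement keeps "\1" as a backreference to the keyword name.
    return SVN_DATES_REGEX.sub(rb"$\1$", contents)


# Two checkouts of the same revision made under different locales (made-up data).
english = b"// $Date: 2017-08-24 10:40:00 +0200 (Thu, 24 Aug 2017) $\n"
german = b"// $Date: 2017-08-24 10:40:00 +0200 (Do, 24. Aug 2017) $\n"
assert collapse_svn_dates(english) == collapse_svn_dates(german) == b"// $Date$\n"
```

After collapsing, both variants hash identically, which is the property the checksum script needs.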

I totally agree with you, though, the code is quite complicated. Wish it was simpler.

In D37099#852298, @ilya-biryukov wrote:

In D37099#852251, @mehdi_amini wrote:

Would this be all obsolete with git repos?

I would think so.

Also all the python code seems complicated to me for what is basically find | grep -v '/\.git/' | grep -v '/\.svn/' | LC_ALL=C sort | tar -cf - -T - --no-recursion | sha1sum, can you elaborate why this is needed?

That's roughly what I started with, but it's more complicated if you want to compute separate checksums for projects inside a single source tree checkout (e.g. llvm, llvm/tools/clang, llvm/projects/lldb, etc.).

Well, first, using a flat layout makes it easier: check out clang next to llvm instead of inside llvm/tools (and use -DLLVM_ENABLE_PROJECTS).
Then it is a matter of wrapping the command I gave in a loop around the directories and producing one entry per project.
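The loop suggested here could look roughly like the sketch below. It is an assumption-laden illustration: it hashes paths and file contents directly with `hashlib` instead of piping through `tar` and `sha1sum`, it assumes a flat layout with only regular files, and `hash_directory` and `hash_projects` are hypothetical names.

```python
import hashlib
import os


def hash_directory(root):
    """Hash relative paths and file contents under root, skipping .git and .svn."""
    hasher = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune VCS metadata in place and fix the traversal order.
        dirnames[:] = sorted(d for d in dirnames if d not in (".git", ".svn"))
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            hasher.update(os.path.relpath(path, root).encode("utf-8"))
            with open(path, "rb") as f:
                # Feed a fixed-length per-file digest rather than raw content.
                hasher.update(hashlib.sha256(f.read()).digest())
    return hasher.hexdigest()


def hash_projects(checkout_root, project_names):
    """Return {project name: checksum} for projects checked out side by side."""
    return {
        name: hash_directory(os.path.join(checkout_root, name))
        for name in project_names
        if os.path.isdir(os.path.join(checkout_root, name))
    }
```

Projects that are not checked out are simply left out of the result, one entry per project otherwise.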

There is also a problem that some files inside LLVM repo use svn substitutions. The substitutions are mostly fine, but $Date$ and $LastChangedDate$ are locale-specific and we have to account for that.

OK that's terrible :(

I totally agree with you, though, the code is quite complicated. Wish it was simpler.

Yeah, like just using git ;)

Anyway, I'm fine with adding this, but I won't have time to review the python code deeply right now.

klimek added inline comments. Aug 30 2017, 7:03 AM

utils/docker/scripts/llvm_checksum/llvm_checksum_utils.py
1 ↗	(On Diff #112515)	Why 2 files? (generally, I dislike files named "util" :)
27–28 ↗	(On Diff #112515)	Why's the algo not part of the content_hasher? Can't we just use a hasher that has the exact interface we need?
48 ↗	(On Diff #112515)	Call it process_file? (proc_file sounds like a file in /proc :P)
82–83 ↗	(On Diff #112515)	Can't we put that in the same line?
122–123 ↗	(On Diff #112515)	Given that we feed in paths, this seems redundant?
131–133 ↗	(On Diff #112515)	Why not just feed the content (file content or symlink target)?

Addressed review comments.

Renamed proc_file to process_file.
Don't feed number of files to the hasher, it's redundant.
Removed next_dirs local var.

utils/docker/scripts/llvm_checksum/llvm_checksum_utils.py
1 ↗	(On Diff #112515)	Totally agree, `util` is a bad choice. One file (`util`) is the source of a library, the other is the source of an executable. The idea is to separate the command-line argument parsing from the actual library code. How about `_lib` instead of `_util`? Or do you think merging the files is better?
27–28 ↗	(On Diff #112515)	`content_hasher` is responsible for reading files and replacing their contents, while `hash_algo` is responsible for providing hash functions. We don't call `content_hasher` for broken symlinks; we use `hash_algo` on the symlink target instead. It does not make sense to do any replacements there, as it's not a file. It's also nice that this function controls the lifetime of `hash_algo()` objects; otherwise, there's a chance the client will create a hasher only once and use it for all subsequent calls. It also seems nice to have the checksumming algorithm name as part of the function interface.
48 ↗	(On Diff #112515)	All short names are already taken by Linux :-( Renamed to `process_file`.
122–123 ↗	(On Diff #112515)	Totally agree, removed it.
131–133 ↗	(On Diff #112515)	To distinguish between a broken symlink and a file with content equivalent to the broken symlink's target.

klimek added inline comments. Aug 31 2017, 2:15 AM

utils/docker/scripts/llvm_checksum/llvm_checksum_utils.py
27–28 ↗	(On Diff #112515)	I think my main concern is that we're mixing levels of abstraction; generally, we have a strategy for how to do the hashing, and we have an algorithm that does the visitation (given project structure, whitelist, etc.). I'd probably just do a visit_project function that takes a class with perhaps 2 functions:
visit_file(path, content)
visit_symlink(path, link)
Then we'd have a Checksum class that implements these 2 functions, has a hasher member and does the hashing in the visitation, splitting along the responsibilities. That way, we could also split this up into multiple files more easily:
project_tree.py (perhaps with a better name :) allows visiting all files in the project.
llvm_checksum.py can now be the main method and the checksumming parts of the algo.
131–133 ↗	(On Diff #112515)	Why do we care?
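The visitor split proposed in the comment on lines 27–28 might be sketched like this. It is only an illustration of the suggested shape, not the code that landed: the names `visit_project`, `visit_file`, `visit_symlink` and `Checksum` come from the comment, while the bodies, the `hexdigest` method and the Python 3 byte handling are assumptions. Symlinks to directories are not reported in this sketch.

```python
import hashlib
import os


def visit_project(project_root, visitor):
    """Call visitor.visit_file or visitor.visit_symlink for each file entry."""
    for dirpath, dirnames, filenames in os.walk(project_root):
        dirnames[:] = sorted(d for d in dirnames if d not in (".git", ".svn"))
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            relpath = os.path.relpath(path, project_root)
            if os.path.islink(path):
                visitor.visit_symlink(relpath, os.readlink(path))
            else:
                with open(path, "rb") as f:
                    visitor.visit_file(relpath, f.read())


class Checksum(object):
    """Visitor that owns the hasher, keeping hashing separate from traversal."""

    def __init__(self):
        self._hasher = hashlib.sha256()

    def _add(self, path, payload):
        self._hasher.update(path.encode("utf-8"))
        self._hasher.update(hashlib.sha256(payload).digest())

    def visit_file(self, path, content):
        self._add(path, content)

    def visit_symlink(self, path, link):
        self._add(path, link.encode("utf-8"))

    def hexdigest(self):
        return self._hasher.hexdigest()
```

With this shape, the traversal knows nothing about hashing, and a different visitor (for example one that only lists files) could reuse `visit_project` unchanged.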

ilya-biryukov added inline comments. Aug 31 2017, 3:45 AM

utils/docker/scripts/llvm_checksum/llvm_checksum_utils.py
27–28 ↗	(On Diff #112515)	Good point, will do.
131–133 ↗	(On Diff #112515)	Frankly, we don't. Broken symlinks and files are different beasts, but for our use-case this is probably irrelevant. Could remove this logic altogether and make the script simpler.

Separated project tree walking and checksumming implementations.

Harbormaster completed remote builds in B10130: Diff 114840. Sep 12 2017, 7:47 AM

Followed @klimek's suggestions and split the checksumming and project walking logic.
Also simplified the logic of dealing with broken symlinks: now we simply hash the target of the symlink, so it's indistinguishable from a file with the same content. This makes the code a little simpler.
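The simplified symlink handling can be sketched as a single helper. This is an illustration under assumptions (Python 3, a POSIX system; `read_entry` is a hypothetical name), showing why a broken symlink and a regular file holding the same text produce the same bytes to hash.

```python
import os


def read_entry(path):
    """Return the bytes to hash for path."""
    if os.path.islink(path) and not os.path.exists(path):
        # Broken symlink: there is no content to read, so use the target path.
        return os.readlink(path).encode("utf-8")
    with open(path, "rb") as f:
        # Regular files and working symlinks: use the file content.
        return f.read()
```

`os.path.exists` follows symlinks, so it returns False exactly when the link target is missing.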

klimek added inline comments. Sep 14 2017, 5:34 AM

utils/docker/scripts/llvm_checksum/llvm_checksum.py
86	Substitute subsitutions for substitutions.
114	So the main reason we hash each file is for debugging purposes?

Renamed read_replacing_subsitutions to read_and_collapse_svn_substitutions.

Harbormaster completed remote builds in B10225: Diff 115212. Sep 14 2017, 7:02 AM

ilya-biryukov marked an inline comment as done. Sep 14 2017, 7:03 AM

ilya-biryukov added inline comments.

utils/docker/scripts/llvm_checksum/llvm_checksum.py
86	Substituted with a name mentioning less substitutions.
114	Yes, it makes debugging easier when checksums don't match. An alternative I was thinking about is to just feed all files to a single hasher, but that would probably make it easy to craft two different directory trees with the same hash.
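The concern about a single hasher can be made concrete with a small sketch. Both functions are hypothetical simplifications (in-memory dicts instead of directory trees, Python 3): feeding paths and contents straight into one hasher loses the boundary between them, while hashing each file first gives every file a fixed-length contribution.

```python
import hashlib


def single_hasher(files):
    """Feed every path and content into one hasher (the rejected alternative)."""
    hasher = hashlib.sha256()
    for path, content in sorted(files.items()):
        hasher.update(path.encode("utf-8"))
        hasher.update(content)
    return hasher.hexdigest()


def per_file_digests(files):
    """Hash each file separately, then hash the (path, digest) pairs."""
    hasher = hashlib.sha256()
    for path, content in sorted(files.items()):
        hasher.update(path.encode("utf-8"))
        hasher.update(hashlib.sha256(content).hexdigest().encode("ascii"))
    return hasher.hexdigest()


tree_a = {"a": b"bc"}   # file "a" containing "bc"
tree_b = {"ab": b"c"}   # file "ab" containing "c"
assert single_hasher(tree_a) == single_hasher(tree_b)   # both hash b"abc"
assert per_file_digests(tree_a) != per_file_digests(tree_b)
```

The per-file digests also give a natural place to log which file changed when two project checksums differ.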

This revision is now accepted and ready to land. Sep 15 2017, 4:59 AM

Closed by commit rL313359: Added optional validation of svn sources to Dockerfiles. (authored by ibiryukov). Sep 15 2017, 6:37 AM

This revision was automatically updated to reflect the committed changes.

Are the checksums intended to be valid across multiple platforms? I recall that @zturner noticed that we use SVN in weird ways to munge newlines for some files on Windows.

In D37099#873837, @jlebar wrote:

Are the checksums intended to be valid across multiple platforms? I recall that @zturner noticed that we use SVN in weird ways to munge newlines for some files on Windows.

Currently the checksums are only intended to match on Linux. I haven't checked, but some files will definitely have different line-endings on Windows.

Diff 114840

utils/docker/build_docker_image.sh

@@ 32 lines hidden @@ LLVM-specific:
   -p|--llvm-project   name of an svn project to checkout. Will also add the
                       project to a list LLVM_ENABLE_PROJECTS, passed to CMake.
                       For clang, please use 'clang', not 'cfe'.
                       Project 'llvm' is always included and ignored, if
                       specified.
                       Can be specified multiple times.
   -i|--install-target name of a cmake install target to build and include in
                       the resulting archive. Can be specified multiple times.
+  -c|--checksums      name of a file, containing checksums of llvm checkout.
+                      Script will fail if checksums of the checkout do not
+                      match.

 Required options: --source and --docker-repository, at least one
   --install-target.

 All options after '--' are passed to CMake invocation.

 For example, running:
 $ build_docker_image.sh -s debian8 -d mydocker/debian8-clang -t latest \
@@ 12 lines hidden @@ $ ./build_docker_image.sh -s debian8 -d mydocker/clang-debian8 -t "latest" \
   -- \
   -DLLVM_TARGETS_TO_BUILD=Native -DCMAKE_BUILD_TYPE=Release \
   -DBOOTSTRAP_CMAKE_BUILD_TYPE=Release \
   -DCLANG_ENABLE_BOOTSTRAP=ON \
   -DCLANG_BOOTSTRAP_TARGETS="install-clang;install-clang-headers"
 EOF
 }

+CHECKSUMS_FILE=""
 SEEN_INSTALL_TARGET=0
 while [[ $# -gt 0 ]]; do
   case "$1" in
     -h|--help)
       show_usage
       exit 0
       ;;
     -s|--source)
@@ 13 lines hidden @@ -t|--docker-tag)
       ;;
     -i|--install-target|-r|--revision|-b|--branch|-p|--llvm-project)
       if [ "$1" == "-i" ] || [ "$1" == "--install-target" ]; then
         SEEN_INSTALL_TARGET=1
       fi
       BUILDSCRIPT_ARGS="$BUILDSCRIPT_ARGS $1 $2"
       shift 2
       ;;
+    -c|--checksums)
+      shift
+      CHECKSUMS_FILE="$1"
+      shift
+      ;;
     --)
       shift
       BUILDSCRIPT_ARGS="$BUILDSCRIPT_ARGS -- $*"
       shift $#
       ;;
     *)
       echo "Unknown argument $1"
       exit 1
@@ 30 lines hidden @@

 BUILD_DIR=$(mktemp -d)
 trap "rm -rf $BUILD_DIR" EXIT
 echo "Using a temporary directory for the build: $BUILD_DIR"

 cp -r "$SOURCE_DIR/$IMAGE_SOURCE" "$BUILD_DIR/$IMAGE_SOURCE"
 cp -r "$SOURCE_DIR/scripts" "$BUILD_DIR/scripts"

+mkdir "$BUILD_DIR/checksums"
+if [ "$CHECKSUMS_FILE" != "" ]; then
+  cp "$CHECKSUMS_FILE" "$BUILD_DIR/checksums/checksums.txt"
+fi
+
 if [ "$DOCKER_TAG" != "" ]; then
   DOCKER_TAG=":$DOCKER_TAG"
 fi

 echo "Building from $IMAGE_SOURCE"
 echo "Building $DOCKER_REPOSITORY-build$DOCKER_TAG"
 docker build -t "$DOCKER_REPOSITORY-build$DOCKER_TAG" \
   --build-arg "buildscript_args=$BUILDSCRIPT_ARGS" \
@@ 12 lines hidden @@

utils/docker/debian8/build/Dockerfile

@@ 13 lines hidden @@
 # Install build dependencies of llvm.
 # First, Update the apt's source list and include the sources of the packages.
 RUN grep deb /etc/apt/sources.list | \
     sed 's/^deb/deb-src /g' >> /etc/apt/sources.list

 # Install compiler, python and subversion.
 RUN apt-get update && \
     apt-get install -y --no-install-recommends ca-certificates gnupg \
-           build-essential python2.7 wget subversion ninja-build && \
+           build-essential python wget subversion ninja-build && \
     rm -rf /var/lib/apt/lists/*

 # Import public key required for verifying signature of cmake download.
 RUN gpg --keyserver hkp://pgp.mit.edu --recv 0x2D2CEF1034921684

 # Download, verify and install cmake version that can compile clang into /usr/local.
 # (Version in debian8 repos is is too old)
 RUN mkdir /tmp/cmake-install && cd /tmp/cmake-install && \
     wget "https://cmake.org/files/v3.7/cmake-3.7.2-SHA-256.txt.asc" && \
     wget "https://cmake.org/files/v3.7/cmake-3.7.2-SHA-256.txt" && \
     gpg --verify cmake-3.7.2-SHA-256.txt.asc cmake-3.7.2-SHA-256.txt && \
     wget "https://cmake.org/files/v3.7/cmake-3.7.2-Linux-x86_64.tar.gz" && \
     ( grep "cmake-3.7.2-Linux-x86_64.tar.gz" cmake-3.7.2-SHA-256.txt | \
       sha256sum -c - ) && \
     tar xzf cmake-3.7.2-Linux-x86_64.tar.gz -C /usr/local --strip-components=1 && \
     cd / && rm -rf /tmp/cmake-install

+ADD checksums /tmp/checksums
+ADD scripts /tmp/scripts
+
 # Arguments passed to build_install_clang.sh.
 ARG buildscript_args

 # Run the build. Results of the build will be available as /tmp/clang.tar.gz.
-ADD scripts/build_install_llvm.sh /tmp
-RUN /tmp/build_install_llvm.sh ${buildscript_args}
+RUN /tmp/scripts/build_install_llvm.sh ${buildscript_args}

utils/docker/example/build/Dockerfile

@@ 12 lines hidden @@
 FROM ubuntu

 # FIXME: Change maintainer name
 LABEL maintainer "Maintainer <maintainer@email>"

 # FIXME: Install llvm/clang build dependencies. Including compiler to
 # build stage1, cmake, subversion, ninja, etc.

-# Arguments to pass to build_install_clang.sh.
+ADD checksums /tmp/checksums
+ADD scripts /tmp/scripts
+
+# Arguments passed to build_install_clang.sh.
 ARG buildscript_args

 # Run the build. Results of the build will be available as /tmp/clang.tar.gz.
-ADD scripts/build_install_llvm.sh /tmp
-RUN /tmp/build_install_llvm.sh ${buildscript_args}
+RUN /tmp/scripts/build_install_llvm.sh ${buildscript_args}

utils/docker/nvidia-cuda/build/Dockerfile

@@ 11 lines hidden @@

 LABEL maintainer "LLVM Developers"

 # Arguments to pass to build_install_clang.sh.
 ARG buildscript_args

 # Install llvm build dependencies.
 RUN apt-get update && \
-    apt-get install -y --no-install-recommends ca-certificates cmake python2.7 \
+    apt-get install -y --no-install-recommends ca-certificates cmake python \
         subversion ninja-build && \
     rm -rf /var/lib/apt/lists/*

+ADD checksums /tmp/checksums
+ADD scripts /tmp/scripts
+
+# Arguments passed to build_install_clang.sh.
+ARG buildscript_args
+
 # Run the build. Results of the build will be available as /tmp/clang.tar.gz.
-ADD scripts/build_install_llvm.sh /tmp
-RUN /tmp/build_install_llvm.sh ${buildscript_args}
+RUN /tmp/scripts/build_install_llvm.sh ${buildscript_args}

utils/docker/scripts/build_install_llvm.sh

@@ 175 lines hidden @@

 if [ $CLANG_TOOLS_EXTRA_ENABLED -ne 0 ]; then
   echo "Checking out https://llvm.org/svn/llvm-project/clang-tools-extra to $CLANG_BUILD_DIR/src/clang/tools/extra"
   svn co -q $SVN_REV_ARG \
     "https://llvm.org/svn/llvm-project/clang-tools-extra/$LLVM_BRANCH" \
     "$CLANG_BUILD_DIR/src/clang/tools/extra"
 fi

+CHECKSUMS_FILE="/tmp/checksums/checksums.txt"
+
+if [ -f "$CHECKSUMS_FILE" ]; then
+  echo "Validating checksums for LLVM checkout..."
+  python "$(dirname $0)/llvm_checksum/llvm_checksum.py" -c "$CHECKSUMS_FILE" \
+    --partial --multi_dir "$CLANG_BUILD_DIR/src"
+else
+  echo "Skipping checksumming checks..."
+fi
+
 mkdir "$CLANG_BUILD_DIR/build"
 pushd "$CLANG_BUILD_DIR/build"

 # Run the build as specified in the build arguments.
 echo "Running build"
 cmake -GNinja \
   -DCMAKE_INSTALL_PREFIX="$CLANG_INSTALL_DIR" \
   -DLLVM_ENABLE_PROJECTS="$CMAKE_LLVM_ENABLE_PROJECTS" \
@@ 15 lines hidden @@

utils/docker/scripts/llvm_checksum/llvm_checksum.py

This file was added.

Property	Old Value	New Value
File Mode	null	100755

#!/usr/bin/python
""" A small program to compute checksums of LLVM checkout.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import hashlib
import logging
import re
import sys
from argparse import ArgumentParser
from project_tree import *

SVN_DATES_REGEX = re.compile(r"\$(Date|LastChangedDate)[^\$]+\$")


def main():
  parser = ArgumentParser()
  parser.add_argument(
      "-v", "--verbose", action="store_true", help="enable debug logging")
  parser.add_argument(
      "-c",
      "--check",
      metavar="reference_file",
      help="read checksums from reference_file and " +
      "check they match checksums of llvm_path.")
  parser.add_argument(
      "--partial",
      action="store_true",
      help="ignore projects from reference_file " +
      "that are not checked out in llvm_path.")
  parser.add_argument(
      "--multi_dir",
      action="store_true",
      help="indicates llvm_path contains llvm, checked out " +
      "into multiple directories, as opposed to a " +
      "typical single source tree checkout.")
  parser.add_argument("llvm_path")

  args = parser.parse_args()
  if args.check is not None:
    with open(args.check, "r") as f:
      reference_checksums = ReadLLVMChecksums(f)
  else:
    reference_checksums = None

  if args.verbose:
    logging.basicConfig(level=logging.DEBUG)

  llvm_projects = CreateLLVMProjects(not args.multi_dir)
  checksums = ComputeLLVMChecksums(args.llvm_path, llvm_projects)

  if reference_checksums is None:
    WriteLLVMChecksums(checksums, sys.stdout)
    sys.exit(0)

  if not ValidateChecksums(reference_checksums, checksums, args.partial):
    sys.stdout.write("Checksums differ.\nNew checksums:\n")
    WriteLLVMChecksums(checksums, sys.stdout)
    sys.stdout.write("Reference checksums:\n")
    WriteLLVMChecksums(reference_checksums, sys.stdout)
    sys.exit(1)
  else:
    sys.stdout.write("Checksums match.")


def ComputeLLVMChecksums(root_path, projects):
  """Compute checksums for LLVM sources checked out using svn.

  Args:
    root_path: a directory of llvm checkout.
    projects: a list of LLVMProject instances, which describe checkout paths,
      relative to root_path.

  Returns:
    A dict mapping from project name to project checksum.
  """
  hash_algo = hashlib.sha256

  def replace_svn_substitutions(contents):
    # Replace svn substitutions for $Date$ and $LastChangedDate$.
    # Unfortunately, these are locale-specific for local machine.
    return SVN_DATES_REGEX.sub("$\1$", contents)

  def read_replacing_subsitutions(file_path):
    with open(file_path, "rb") as f:
      contents = f.read()
      new_contents = replace_svn_substitutions(contents)
      if contents != new_contents:
        logging.debug("Replaced svn keyword substitutions in %s", file_path)
        logging.debug("\n\tBefore\n%s\n\tAfter\n%s", contents, new_contents)
      return new_contents

  project_checksums = dict()
  # Hash each project using dir_checksum.
  for proj in projects:
    project_root = os.path.join(root_path, proj.relpath)
    if not os.path.exists(project_root):
      logging.info("Folder %s doesn't exist, skipping project %s", proj.relpath,
                   proj.name)
      continue

    files = list()

    def add_file_hash(file_path):
      if os.path.islink(file_path) and not os.path.exists(file_path):
        content = os.readlink(file_path)
      else:
        content = read_replacing_subsitutions(file_path)
      hasher = hash_algo()
      hasher.update(content)
      file_digest = hasher.hexdigest()
      logging.debug("Checksum %s for file %s", file_digest, file_path)
      files.append((file_path, file_digest))

    logging.info("Computing checksum for %s", proj.name)
    WalkProjectFiles(root_path, projects, proj, add_file_hash)

    # Compute final checksum.
    files.sort(key=lambda x: x[0])
    hasher = hash_algo()
    for file_path, file_digest in files:
      file_path = os.path.relpath(file_path, project_root)
      hasher.update(file_path)
      hasher.update(file_digest)
    project_checksums[proj.name] = hasher.hexdigest()
  return project_checksums


def WriteLLVMChecksums(checksums, f):
  """Writes checksums to a text file.

  Args:
    checksums: a dict mapping from project name to project checksum (result of
      ComputeLLVMChecksums).
    f: a file object to write into.
  """

  for proj in sorted(checksums.keys()):
    f.write("{} {}\n".format(checksums[proj], proj))


def ReadLLVMChecksums(f):
  """Reads checksums from a text file, produced by WriteLLVMChecksums.

  Returns:
    A dict, mapping from project name to project checksum.
  """
  checksums = {}
  while True:
    line = f.readline()
    if line == "":
      break
    checksum, proj = line.split()
    checksums[proj] = checksum
  return checksums


def ValidateChecksums(reference_checksums,
                      new_checksums,
                      allow_missing_projects=False):
  """Validates that reference_checksums and new_checksums match.

  Args:
    reference_checksums: a dict of reference checksums, mapping from a project
      name to a project checksum.
    new_checksums: a dict of checksums to be checked, mapping from a project
      name to a project checksum.
    allow_missing_projects:
      When True, reference_checksums may contain more projects than
        new_checksums. Projects missing from new_checksums are ignored.
      When False, new_checksums and reference_checksums must contain checksums
        for the same set of projects. If there is a project in
        reference_checksums, missing from new_checksums, ValidateChecksums
        will return False.

  Returns:
    True, if checksums match with regards to allow_missing_projects flag value.
    False, otherwise.
  """
  if not allow_missing_projects:
    if len(new_checksums) != len(reference_checksums):
      return False

  for proj, checksum in new_checksums.iteritems():
    # We never computed a checksum for this project.
    if proj not in reference_checksums:
      return False
    # Checksum did not match.
    if reference_checksums[proj] != checksum:
      return False

  return True


if __name__ == "__main__":
  main()
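
The checksum file format and the `--partial` semantics can be exercised in isolation. The sketch below is a condensed Python 3 restatement under assumptions, not the committed code: the function names are hypothetical and the checksum values are made-up placeholders.

```python
import io


def write_checksums(checksums, f):
    """Write one "<checksum> <project>" line per project, sorted by name."""
    for proj in sorted(checksums):
        f.write("{} {}\n".format(checksums[proj], proj))


def read_checksums(f):
    """Parse lines written by write_checksums back into a dict."""
    checksums = {}
    for line in f:
        checksum, proj = line.split()
        checksums[proj] = checksum
    return checksums


def validate(reference, new, allow_missing_projects=False):
    """Every new checksum must match; a size mismatch fails unless allowed."""
    if not allow_missing_projects and len(new) != len(reference):
        return False
    return all(reference.get(proj) == checksum for proj, checksum in new.items())


reference = {"llvm": "aa", "clang": "bb", "lldb": "cc"}
buf = io.StringIO()
write_checksums(reference, buf)
buf.seek(0)
assert read_checksums(buf) == reference

built = {"llvm": "aa", "clang": "bb"}  # lldb was not checked out
assert not validate(reference, built)
assert validate(reference, built, allow_missing_projects=True)
```

This mirrors how build_install_llvm.sh passes `--partial`: the reference file may cover more projects than the docker image checks out.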

utils/docker/scripts/llvm_checksum/project_tree.py

This file was added.

"""Contains helper functions to compute checksums for LLVM checkouts.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import logging
import os
import os.path
import sys


class LLVMProject(object):
  """An LLVM project with a descriptive name and a relative checkout path.
  """

  def __init__(self, name, relpath):
    self.name = name
    self.relpath = relpath

  def is_subproject(self, other_project):
    """ Check if self is checked out as a subdirectory of other_project.
    """
    return self.relpath.startswith(other_project.relpath)


def WalkProjectFiles(checkout_root, all_projects, project, visitor):
  """ Walk over all files inside a project without recursing into subprojects, '.git' and '.svn' subfolders.

    checkout_root: root of the LLVM checkout.
    all_projects: projects in the LLVM checkout.
    project: a project to walk the files of. Must be inside all_projects.
    visitor: a function called on each visited file.
  """
  assert project in all_projects

  ignored_paths = set()
  for other_project in all_projects:
    if other_project != project and other_project.is_subproject(project):
      ignored_paths.add(os.path.join(checkout_root, other_project.relpath))

  def raise_error(err):
    raise err

  project_root = os.path.join(checkout_root, project.relpath)
  for root, dirs, files in os.walk(project_root, onerror=raise_error):
    dirs[:] = [
        d for d in dirs
        if d != ".svn" and d != ".git" and
        os.path.join(root, d) not in ignored_paths
    ]
    for f in files:
      visitor(os.path.join(root, f))


def CreateLLVMProjects(single_tree_checkout):
  """Returns a list of LLVMProject instances, describing relative paths of a typical LLVM checkout.

  Args:
    single_tree_checkout:
      When True, relative paths for each project points to a typical single
        source tree checkout.
      When False, relative paths for each projects points to a separate
        directory. However, clang-tools-extra is an exception, its relative path
        will always be 'clang/tools/extra'.
  """
  # FIXME: cover all of llvm projects.

  # Projects that reside inside 'projects/' in a single source tree checkout.
  ORDINARY_PROJECTS = [
      "compiler-rt", "dragonegg", "libcxx", "libcxxabi", "libunwind",
      "parallel-libs", "test-suite"
  ]
  # Projects that reside inside 'tools/' in a single source tree checkout.
  TOOLS_PROJECTS = ["clang", "lld", "lldb", "llgo"]

  if single_tree_checkout:
    projects = [LLVMProject("llvm", "")]
    projects += [
        LLVMProject(p, os.path.join("projects", p)) for p in ORDINARY_PROJECTS
    ]
    projects += [
        LLVMProject(p, os.path.join("tools", p)) for p in TOOLS_PROJECTS
    ]
    projects.append(
        LLVMProject("clang-tools-extra",
                    os.path.join("tools", "clang", "tools", "extra")))
  else:
    projects = [LLVMProject("llvm", "llvm")]
    projects += [LLVMProject(p, p) for p in ORDINARY_PROJECTS]
    projects += [LLVMProject(p, p) for p in TOOLS_PROJECTS]
    projects.append(
        LLVMProject("clang-tools-extra", os.path.join("clang", "tools",
                                                      "extra")))
  return projects
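
`is_subproject` decides nesting with a string-prefix test on `relpath`. In the walk this looks harmless, because an ignored path that does not actually lie below the project root is never reached by `os.walk`, but a per-component comparison states the intent more precisely. A small sketch of the difference; `is_subpath` is a hypothetical helper and not part of the diff.

```python
import os


def is_subpath(path, parent):
    """True if path equals parent or lies below it, compared per component."""
    if parent == "":
        return True  # the checkout root contains every project
    path_parts = os.path.normpath(path).split(os.sep)
    parent_parts = os.path.normpath(parent).split(os.sep)
    return path_parts[:len(parent_parts)] == parent_parts


lld = os.path.join("tools", "lld")
lldb = os.path.join("tools", "lldb")
assert lldb.startswith(lld)        # plain prefix test treats lldb as inside lld
assert not is_subpath(lldb, lld)   # per-component test does not
assert is_subpath(os.path.join("tools", "clang", "tools", "extra"),
                  os.path.join("tools", "clang"))
```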

This is an archive of the discontinued LLVM Phabricator instance.

Added optional validation of svn sources to Dockerfiles.
ClosedPublic
