
Details

Reviewers

modocache
compnerd
djasper
jbcoe
srhines
ddunbar

Commits

rG40304afb06f5: Make git-clang-format python 3 compatible
rC303871: Make git-clang-format python 3 compatible
rL303871: Make git-clang-format python 3 compatible

Summary

This patch attempts to make git-clang-format both python2 and python3 compatible. Currently it only works in python2.

Diff Detail

Event Timeline

EricWF created this revision. Mar 8 2017, 8:41 PM

There seem to be a couple cases where it's non-trivial to convert the output from bytes to str. I'll look into this further and update.

In D30773#696301, @EricWF wrote:

There seem to be a couple cases where it's non-trivial to convert the output from bytes to str. I'll look into this further and update.

Generally bytes<->str conversion should be done using .decode() and .encode(), i.e. respecting a particular character encoding in bytes.

In D30773#696349, @mgorny wrote:

In D30773#696301, @EricWF wrote:

There seem to be a couple cases where it's non-trivial to convert the output from bytes to str. I'll look into this further and update.

Generally bytes<->str conversion should be done using .decode() and .encode(), i.e. respecting a particular character encoding in bytes.

Right, and that's what I'm currently doing. The non-trivial cases are where we have an open file object referring to output from a child process instead of a string representing the full output.
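The non-trivial case described above, a file object attached to a child process rather than a complete output string, can be handled by decoding one line at a time. The following is a minimal sketch, assuming the child emits UTF-8; `iter_decoded_lines` is a hypothetical helper, not a function from the patch, and the child command only stands in for `git diff-tree`.

```python
import subprocess
import sys


def iter_decoded_lines(pipe, encoding='utf-8'):
    """Yield each line read from a binary pipe as decoded text."""
    for raw_line in pipe:
        # A pipe opened by subprocess yields binary strings, so decode explicitly.
        yield raw_line.decode(encoding)


# Hypothetical child process that prints two diff-like lines.
child = subprocess.Popen(
    [sys.executable, '-c', 'print("+++ b/foo.cpp"); print("@@ -1 +1,2 @@")'],
    stdout=subprocess.PIPE)
lines = [line.rstrip('\r\n') for line in iter_decoded_lines(child.stdout)]
child.stdout.close()
child.wait()
```

Decoding per line keeps the streaming behaviour of the original loop instead of reading the whole output into memory first.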

EricWF added reviewers: djasper, srhines, ddunbar. Mar 8 2017, 11:39 PM

Some nits from a Python3 hacker.

tools/clang-format/git-clang-format
144	I find `print('template', tail)` surprising in Python3, but it could be because we don't use it in our local standards. I'd jump immediately to formatting to make this 2/3-compatible: print(' %s' % filename) or in this case maybe just join the strings: print(' ' + filename)
325	This should not be necessary for iteration -- in Python3 it returns a generator instead of a list, but generators can be iterated.
501–504	No need for parentheses around the string
502	I wonder if `file=sys.stderr` works with Python 2.7's print statement. Have you tested this with 2.7? You can use `__future__` to get the print function behavior in 2.7 as well: http://stackoverflow.com/questions/32032697/how-to-use-from-future-import-print-function
522–525	This looks wrong -- where does `to_bytes` come from? I can never get my head around Python2's string/bytes philosophy. In Python3, stdout and stderr are always `bytes`, so you would do `stdout = stdout.decode('utf-8')` and `stderr = stderr.decode('utf-8')` (assuming the input is utf-8). Not sure how that plays with Python2, but the current code doesn't look right.

EricWF marked 4 inline comments as done. Mar 9 2017, 12:34 AM

EricWF added inline comments.

tools/clang-format/git-clang-format
325	This was done by the 2to3 tool. I'll have to see what it thinks it was up to. But I agree this seems wrong.
502	It doesn't. I think getting the new behavior from `__future__` is the only reasonable way to go.
522–525	This is wrong. And it will be removed in the next set of changes.
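As a rough illustration of the `__future__` approach agreed on for line 502: with the import in place, `print(..., file=sys.stderr)` is parsed as a function call on Python 2.7 as well as Python 3. The `warn` helper and its `stream` parameter are hypothetical, added only so the output can be captured; the `io.StringIO` capture is written with Python 3 in mind.

```python
from __future__ import print_function  # no-op on Python 3, required on Python 2.7

import io
import sys


def warn(message, stream=None):
    """Print an error line to `stream`, defaulting to stderr."""
    if stream is None:
        stream = sys.stderr
    # With the print function, `file=` selects the destination stream.
    print('error:', message, file=stream)


captured = io.StringIO()
warn(u'unstaged changes', stream=captured)
```

The import has to be the first statement in the file (after the module docstring), which is where the patch places it.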

EricWF marked 2 inline comments as done. Mar 9 2017, 12:37 AM

EricWF added inline comments.

tools/clang-format/git-clang-format
325	Ah, so this is needed because in python 3 it's a runtime error to remove items from a dictionary while iterating over the keys. So we have to make a copy first.

kimgr added inline comments. Mar 9 2017, 1:08 AM

tools/clang-format/git-clang-format
325	Oh, I see, I didn't catch that. Carry on.
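The behaviour discussed for line 325 can be reproduced in a few lines. This is a small sketch with hypothetical helper names and sample data: on Python 3, deleting from a dictionary while iterating over its keys raises `RuntimeError`, whereas iterating over a `list()` copy of the keys is safe on both versions.

```python
def drop_keys_without_copy(mapping, unwanted):
    """Delete while iterating over the live key view (breaks on Python 3)."""
    for key in mapping.keys():
        if key in unwanted:
            del mapping[key]


def drop_keys_with_copy(mapping, unwanted):
    """Delete while iterating over a snapshot of the keys (safe)."""
    for key in list(mapping.keys()):
        if key in unwanted:
            del mapping[key]


# Hypothetical stand-in for the changed_lines dictionary.
changed = {'a.cpp': [1], 'b.txt': [2], 'c.h': [3]}

try:
    drop_keys_without_copy(dict(changed), {'b.txt'})
    raised = False
except RuntimeError:
    # Python 3 reports "dictionary changed size during iteration".
    raised = True

safe = dict(changed)
drop_keys_with_copy(safe, {'b.txt'})
```

On Python 2, `keys()` already returns a list, so the first function happens to work there; the copy only matters on Python 3.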

This still isn't working in python3 due to more bytes vs str issues, but this is my progress so far.

mgorny added inline comments. Mar 9 2017, 8:10 AM

tools/clang-format/git-clang-format
293	Pretty much a nit but using variable names that match type names can be a bit confusing here.
302	Shouldn't this be: return bytes.decode('utf-8') ? Otherwise, unless I'm missing something this function will always return the parameter [with no modifications], either in the first conditional if it's of `str` type, or in the conditional inside `to_bytes()` if it's of `bytes` type.
305	Any reason not to use: br'^\+...' ? i.e. make it bytes immediately instead of converting.
306	This logic looks really weird to me. What is the purpose of having both `to_string()` and `convert_string()`? Why do `to_bytes()` and `to_string()` use `isinstance()` to recognize types, and here you rely on exceptions? Why is `to_string()` called after decoding?
310	I don't think this can ever succeed. If the argument is not valid utf8, it certainly won't be valid ASCII.
327	Since you are using `keys_to_delete` now, you can remove the `list()`.
328	Another nit. I think it'd be better to just append a single item instead of a list of 1 item ;-). keys_to_delete.append(filename)
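The point about line 310 follows from ASCII being a strict subset of UTF-8: any byte sequence that fails to decode as UTF-8 also fails as ASCII, so an ASCII fallback never helps. A short sketch of this is below; the Latin-1 fallback in `decode_with_fallback` is an assumption chosen for illustration (it accepts every byte value), not something the patch does.

```python
def decodes_as(data, encoding):
    """Return True if `data` is valid in the given encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


def decode_with_fallback(data):
    """Decode as UTF-8, falling back to Latin-1 (assumed fallback)."""
    try:
        return data.decode('utf-8')
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a code point, so this cannot fail.
        return data.decode('latin-1')


latin1_bytes = b'caf\xe9'      # 'café' encoded as Latin-1, invalid as UTF-8
utf8_bytes = b'caf\xc3\xa9'    # 'café' encoded as UTF-8

utf8_ok = decodes_as(latin1_bytes, 'utf-8')
ascii_ok = decodes_as(latin1_bytes, 'ascii')
```

Both checks on `latin1_bytes` fail, which is why retrying with ASCII after a UTF-8 failure cannot succeed.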

Add complete implementation.

As far as I've tested this works perfectly in both python2 and python3.

EricWF marked 4 inline comments as done. Mar 10 2017, 9:27 PM

EricWF added inline comments.

tools/clang-format/git-clang-format
293	These functions are all lifted directly from `lit/util.py`
302	Nope. In python3 this line will never be hit, but it will be in python2. Specifically, `convert_string` calls `to_string(bytes.decode('utf-8'))`, which in python2 calls `to_string` with the type `unicode`. We need to encode the unicode into a string using `bytes.encode('utf-8')`. I agree it's a bit misleading that it's calling a function called `to_bytes` though.
305	I tried `rb'blah'` and that didn't work in python2. However this has since been removed.
306	`to_string` is called after decoding because in python2 the result of decoding is a `unicode` type, and we need to encode it to a `str` type. Hence `to_string`.

rename variables so they don't conflict with type names.

mgorny added inline comments. Mar 10 2017, 11:35 PM

tools/clang-format/git-clang-format
306	No offense intended but this sounds really horrible. In modern Python, everything is either `bytes` or `unicode`. The difference basically is that `str` in py2 was pretty much `bytes` (except that `bytes` explicitly removes some operations that are unsuitable for bytestrings), and that `str` in py3 is equivalent to `unicode` before. So if you are specifically converting to `str`, it means that you want to have two distinct types in py2/py3. Which really sounds like you're doing something wrong.

EricWF added inline comments. Mar 11 2017, 2:06 AM

tools/clang-format/git-clang-format
306	No offence taken. I had to do way too much work to answer your questions accurately, which means the code is way too complicated or non-obvious. However, there are a couple of things you got wrong. In python2 everything is normally `bytes`, `str`, or `unicode`. The `to_string` method converts both `unicode` and `bytes` to the type `str`. You wrote: "So if you are specifically converting to `str`, it means that you want to have two distinct types in py2/py3. Which really sounds like you're doing something wrong." I don't think having the Python library return different types in py2/py3 means something is wrong. In fact that's exactly what's happening and that's exactly what `convert_string` is trying to fix. In python 2 `stdout`, `stderr`, and `stdin` return strings, but in python 3 they return `bytes`. `convert_string` is meant to transform these different types into `str`. Regardless, I'll remove the call to `to_bytes` from inside `to_string` because it's too confusing.

Rewrite to_string for clarity's sake.

mgorny added inline comments. Mar 11 2017, 3:10 AM

tools/clang-format/git-clang-format
306	In Python 2, `bytes` and `str` are the same type, e.g. evaluating `bytes('foo').__class__` gives `str`. In other words, in each version of Python there are two kinds of strings: binary strings (`std::string` in C++) and Unicode strings (like `std::u32string`). In Python 2, `str` is the binary string (`bytes` is also accepted for compatibility) and `unicode` is the Unicode string. In Python 3, `bytes` is the binary string and `str` is the Unicode string. Now, let's consider subprocess streams. In Python 2 they use `str` -- which means binary strings. In Python 3 they use `bytes` -- i.e. also binary strings. So there's really no change here, except that Python 3 is more strict in handling different types. So if you plan to use the data as binary data, you can just use the data directly. If you need to use Unicode strings, you can reliably use `.decode()` and `.encode()` to convert. If you really believe you need to use `str` (i.e. two distinct types at the same time), it just means that the code is most likely wrong and partially relies on characteristics of one type, and partially on the other. For completeness, I should also mention that the string types used by standard system streams (i.e. `sys.std*`) have changed between Python 2 and 3. Python 2 is operating them in `bytes` mode by default, while Python 3 is operating them (transcoding) in `str` (i.e. unicode) mode. In the latter case, you can use the `.buffer` attribute to access the underlying `bytes` stream -- e.g. `sys.stdin.buffer` will yield bytes.
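The type model described in this comment can be checked directly. The sketch below is illustrative only: `binary_stream` is a hypothetical helper that returns the `.buffer` attribute when a text stream has one and the stream itself otherwise, and an in-memory `io.TextIOWrapper` stands in for `sys.stdin`.

```python
import io
import sys

PY2 = sys.version_info[0] == 2

# On Python 2 `bytes` is an alias of `str`; on Python 3 they are distinct types.
bytes_is_str = bytes is str

# The Unicode string type: `unicode` on Python 2, `str` on Python 3.
text_type = type(u'')

binary = b'foo'
text = binary.decode('utf-8')   # binary string -> Unicode string


def binary_stream(stream):
    """Return the underlying binary stream of a text stream, if it has one."""
    return getattr(stream, 'buffer', stream)


# In-memory stand-in for a transcoding stream such as Python 3's sys.stdin.
raw = io.BytesIO(b'line\n')
wrapped = io.TextIOWrapper(raw, encoding='utf-8')
```

Written this way, code that needs binary data can ask for `binary_stream(sys.stdin)` and get a bytes-yielding stream on either version.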

I've committed all but the str vs byte changes upstream.

EricWF added reviewers: jbcoe, compnerd, modocache. Apr 20 2017, 5:01 PM

LGTM

This revision is now accepted and ready to land. Apr 21 2017, 2:45 AM

EricWF closed this revision. May 25 2017, 8:24 AM

Diff 96025

tools/clang-format/git-clang-format

[14 lines not shown]

This file provides a clang-format integration for git. Put it somewhere in your		This file provides a clang-format integration for git. Put it somewhere in your
path and ensure that it is executable. Then, "git clang-format" will invoke		path and ensure that it is executable. Then, "git clang-format" will invoke
clang-format on the changes in current files or a specific commit.		clang-format on the changes in current files or a specific commit.

For further details, run:		For further details, run:
git clang-format -h		git clang-format -h

Requires Python 2.7		Requires Python 2.7 or Python 3
"""		"""

from __future__ import print_function		from __future__ import print_function
import argparse		import argparse
import collections		import collections
import contextlib		import contextlib
import errno		import errno
import os		import os
[104 lines not shown]		def main():
if opts.verbose >= 1:		if opts.verbose >= 1:
ignored_files = set(changed_lines)		ignored_files = set(changed_lines)
filter_by_extension(changed_lines, opts.extensions.lower().split(','))		filter_by_extension(changed_lines, opts.extensions.lower().split(','))
if opts.verbose >= 1:		if opts.verbose >= 1:
ignored_files.difference_update(changed_lines)		ignored_files.difference_update(changed_lines)
if ignored_files:		if ignored_files:
print('Ignoring changes in the following files (wrong extension):')		print('Ignoring changes in the following files (wrong extension):')
for filename in ignored_files:		for filename in ignored_files:
print(' %s' % filename)		print(' %s' % filename)
if changed_lines:		if changed_lines:
print('Running clang-format on the following files:')		print('Running clang-format on the following files:')
for filename in changed_lines:		for filename in changed_lines:
print(' %s' % filename)		print(' %s' % filename)
if not changed_lines:		if not changed_lines:
print('no modified files to format')		print('no modified files to format')
return		return
# The computed diff outputs absolute paths, so we must cd before accessing		# The computed diff outputs absolute paths, so we must cd before accessing
[100 lines not shown]
def get_object_type(value):		def get_object_type(value):
"""Returns a string description of an object's type, or None if it is not		"""Returns a string description of an object's type, or None if it is not
a valid git object."""		a valid git object."""
cmd = ['git', 'cat-file', '-t', value]		cmd = ['git', 'cat-file', '-t', value]
p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)		p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
stdout, stderr = p.communicate()		stdout, stderr = p.communicate()
if p.returncode != 0:		if p.returncode != 0:
return None		return None
return stdout.strip()		return convert_string(stdout.strip())


def compute_diff_and_extract_lines(commits, files):		def compute_diff_and_extract_lines(commits, files):
"""Calls compute_diff() followed by extract_lines()."""		"""Calls compute_diff() followed by extract_lines()."""
diff_process = compute_diff(commits, files)		diff_process = compute_diff(commits, files)
changed_lines = extract_lines(diff_process.stdout)		changed_lines = extract_lines(diff_process.stdout)
diff_process.stdout.close()		diff_process.stdout.close()
diff_process.wait()		diff_process.wait()
[15 lines not shown]		if len(commits) > 1:
git_tool = 'diff-tree'		git_tool = 'diff-tree'
cmd = ['git', git_tool, '-p', '-U0'] + commits + ['--']		cmd = ['git', git_tool, '-p', '-U0'] + commits + ['--']
cmd.extend(files)		cmd.extend(files)
p = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)		p = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
p.stdin.close()		p.stdin.close()
return p		return p


def extract_lines(patch_file):		def extract_lines(patch_file):
"""Extract the changed lines in `patch_file`.		"""Extract the changed lines in `patch_file`.

The return value is a dictionary mapping filename to a list of (start_line,		The return value is a dictionary mapping filename to a list of (start_line,
line_count) pairs.		line_count) pairs.

The input must have been produced with ``-U0``, meaning unidiff format with		The input must have been produced with ``-U0``, meaning unidiff format with
zero lines of context. The return value is a dict mapping filename to a		zero lines of context. The return value is a dict mapping filename to a
list of line `Range`s."""		list of line `Range`s."""
matches = {}		matches = {}
for line in patch_file:		for line in patch_file:
		line = convert_string(line)
match = re.search(r'^\+\+\+\ [^/]+/(.*)', line)		match = re.search(r'^\+\+\+\ [^/]+/(.*)', line)
if match:		if match:
filename = match.group(1).rstrip('\r\n')		filename = match.group(1).rstrip('\r\n')
match = re.search(r'^@@ -[0-9,]+ \+(\d+)(,(\d+))?', line)		match = re.search(r'^@@ -[0-9,]+ \+(\d+)(,(\d+))?', line)
if match:		if match:
start_line = int(match.group(1))		start_line = int(match.group(1))
line_count = 1		line_count = 1
if match.group(3):		if match.group(3):
line_count = int(match.group(3))		line_count = int(match.group(3))
if line_count > 0:		if line_count > 0:
matches.setdefault(filename, []).append(Range(start_line, line_count))		matches.setdefault(filename, []).append(Range(start_line, line_count))
return matches		return matches


def filter_by_extension(dictionary, allowed_extensions):		def filter_by_extension(dictionary, allowed_extensions):
"""Delete every key in `dictionary` that doesn't have an allowed extension.		"""Delete every key in `dictionary` that doesn't have an allowed extension.

`allowed_extensions` must be a collection of lowercase file extensions,		`allowed_extensions` must be a collection of lowercase file extensions,
excluding the period."""		excluding the period."""
allowed_extensions = frozenset(allowed_extensions)		allowed_extensions = frozenset(allowed_extensions)
for filename in list(dictionary.keys()):		for filename in list(dictionary.keys()):
base_ext = filename.rsplit('.', 1)		base_ext = filename.rsplit('.', 1)
if len(base_ext) == 1 or base_ext[1].lower() not in allowed_extensions:		if len(base_ext) == 1 or base_ext[1].lower() not in allowed_extensions:
del dictionary[filename]		del dictionary[filename]


def cd_to_toplevel():		def cd_to_toplevel():
"""Change to the top level of the git repository."""		"""Change to the top level of the git repository."""
toplevel = run('git', 'rev-parse', '--show-toplevel')		toplevel = run('git', 'rev-parse', '--show-toplevel')
os.chdir(toplevel)		os.chdir(toplevel)


[44 lines not shown]		def create_tree(input_lines, mode):
'--index-info' is must be a list of values suitable for "git update-index		'--index-info' is must be a list of values suitable for "git update-index
--index-info", such as "<mode> <SP> <sha1> <TAB> <filename>". Any other mode		--index-info", such as "<mode> <SP> <sha1> <TAB> <filename>". Any other mode
is invalid."""		is invalid."""
assert mode in ('--stdin', '--index-info')		assert mode in ('--stdin', '--index-info')
cmd = ['git', 'update-index', '--add', '-z', mode]		cmd = ['git', 'update-index', '--add', '-z', mode]
with temporary_index_file():		with temporary_index_file():
p = subprocess.Popen(cmd, stdin=subprocess.PIPE)		p = subprocess.Popen(cmd, stdin=subprocess.PIPE)
for line in input_lines:		for line in input_lines:
p.stdin.write('%s\0' % line)		p.stdin.write(to_bytes('%s\0' % line))
p.stdin.close()		p.stdin.close()
if p.wait() != 0:		if p.wait() != 0:
die('`%s` failed' % ' '.join(cmd))		die('`%s` failed' % ' '.join(cmd))
tree_id = run('git', 'write-tree')		tree_id = run('git', 'write-tree')
return tree_id		return tree_id


def clang_format_to_blob(filename, line_ranges, revision=None,		def clang_format_to_blob(filename, line_ranges, revision=None,
[38 lines not shown]		def clang_format_to_blob(filename, line_ranges, revision=None,
clang_format.stdout.close()		clang_format.stdout.close()
stdout = hash_object.communicate()[0]		stdout = hash_object.communicate()[0]
if hash_object.returncode != 0:		if hash_object.returncode != 0:
die('`%s` failed' % ' '.join(hash_object_cmd))		die('`%s` failed' % ' '.join(hash_object_cmd))
if clang_format.wait() != 0:		if clang_format.wait() != 0:
die('`%s` failed' % ' '.join(clang_format_cmd))		die('`%s` failed' % ' '.join(clang_format_cmd))
if git_show and git_show.wait() != 0:		if git_show and git_show.wait() != 0:
die('`%s` failed' % ' '.join(git_show_cmd))		die('`%s` failed' % ' '.join(git_show_cmd))
return stdout.rstrip('\r\n')		return convert_string(stdout).rstrip('\r\n')


@contextlib.contextmanager		@contextlib.contextmanager
def temporary_index_file(tree=None):		def temporary_index_file(tree=None):
"""Context manager for setting GIT_INDEX_FILE to a temporary file and deleting		"""Context manager for setting GIT_INDEX_FILE to a temporary file and deleting
the file afterward."""		the file afterward."""
index_path = create_temporary_index(tree)		index_path = create_temporary_index(tree)
old_index_path = os.environ.get('GIT_INDEX_FILE')		old_index_path = os.environ.get('GIT_INDEX_FILE')
[40 lines not shown]		def apply_changes(old_tree, new_tree, force=False, patch_mode=False):
Bails if there are local changes in those files and not `force`. If		Bails if there are local changes in those files and not `force`. If
`patch_mode`, runs `git checkout --patch` to select hunks interactively."""		`patch_mode`, runs `git checkout --patch` to select hunks interactively."""
changed_files = run('git', 'diff-tree', '--diff-filter=M', '-r', '-z',		changed_files = run('git', 'diff-tree', '--diff-filter=M', '-r', '-z',
'--name-only', old_tree,		'--name-only', old_tree,
new_tree).rstrip('\0').split('\0')		new_tree).rstrip('\0').split('\0')
if not force:		if not force:
unstaged_files = run('git', 'diff-files', '--name-status', *changed_files)		unstaged_files = run('git', 'diff-files', '--name-status', *changed_files)
if unstaged_files:		if unstaged_files:
print('The following files would be modified but '		print('The following files would be modified but '
'have unstaged changes:', file=sys.stderr)		'have unstaged changes:', file=sys.stderr)
print(unstaged_files, file=sys.stderr)		print(unstaged_files, file=sys.stderr)
print('Please commit, stage, or stash them first.', file=sys.stderr)		print('Please commit, stage, or stash them first.', file=sys.stderr)
sys.exit(2)		sys.exit(2)
if patch_mode:		if patch_mode:
# In patch mode, we could just as well create an index from the new tree		# In patch mode, we could just as well create an index from the new tree
# and checkout from that, but then the user will be presented with a		# and checkout from that, but then the user will be presented with a
# message saying "Discard ... from worktree". Instead, we use the old		# message saying "Discard ... from worktree". Instead, we use the old
# tree as the index and checkout from new_tree, which gives the slightly		# tree as the index and checkout from new_tree, which gives the slightly
# better message, "Apply ... to index and worktree". This is not quite		# better message, "Apply ... to index and worktree". This is not quite
# right, since it won't be applied to the user's index, but oh well.		# right, since it won't be applied to the user's index, but oh well.
with temporary_index_file(old_tree):		with temporary_index_file(old_tree):
subprocess.check_call(['git', 'checkout', '--patch', new_tree])		subprocess.check_call(['git', 'checkout', '--patch', new_tree])
index_tree = old_tree		index_tree = old_tree
else:		else:
with temporary_index_file(new_tree):		with temporary_index_file(new_tree):
run('git', 'checkout-index', '-a', '-f')		run('git', 'checkout-index', '-a', '-f')
return changed_files		return changed_files


def run(*args, **kwargs):		def run(*args, **kwargs):
stdin = kwargs.pop('stdin', '')		stdin = kwargs.pop('stdin', '')
verbose = kwargs.pop('verbose', True)		verbose = kwargs.pop('verbose', True)
strip = kwargs.pop('strip', True)		strip = kwargs.pop('strip', True)
for name in kwargs:		for name in kwargs:
raise TypeError("run() got an unexpected keyword argument '%s'" % name)		raise TypeError("run() got an unexpected keyword argument '%s'" % name)
p = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE,		p = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
stdin=subprocess.PIPE)		stdin=subprocess.PIPE)
stdout, stderr = p.communicate(input=stdin)		stdout, stderr = p.communicate(input=stdin)

		stdout = convert_string(stdout)
		stderr = convert_string(stderr)

if p.returncode == 0:		if p.returncode == 0:
if stderr:		if stderr:
if verbose:		if verbose:
print('`%s` printed to stderr:' % ' '.join(args), file=sys.stderr)		print('`%s` printed to stderr:' % ' '.join(args), file=sys.stderr)
print(stderr.rstrip(), file=sys.stderr)		print(stderr.rstrip(), file=sys.stderr)
if strip:		if strip:
stdout = stdout.rstrip('\r\n')		stdout = stdout.rstrip('\r\n')
return stdout		return stdout
if verbose:		if verbose:
print('`%s` returned %s' % (' '.join(args), p.returncode), file=sys.stderr)		print('`%s` returned %s' % (' '.join(args), p.returncode), file=sys.stderr)
if stderr:		if stderr:
print(stderr.rstrip(), file=sys.stderr)		print(stderr.rstrip(), file=sys.stderr)
sys.exit(2)		sys.exit(2)


def die(message):		def die(message):
print('error:', message, file=sys.stderr)		print('error:', message, file=sys.stderr)
sys.exit(2)		sys.exit(2)


		def to_bytes(str_input):
		# Encode to UTF-8 to get binary data.
		if isinstance(str_input, bytes):
		return str_input
		return str_input.encode('utf-8')


		def to_string(bytes_input):
		if isinstance(bytes_input, str):
		return bytes_input
		return bytes_input.encode('utf-8')


		def convert_string(bytes_input):
		try:
		return to_string(bytes_input.decode('utf-8'))
		except AttributeError: # 'str' object has no attribute 'decode'.
		return str(bytes_input)
		except UnicodeError:
		return str(bytes_input)

if __name__ == '__main__':		if __name__ == '__main__':
main()		main()

This is an archive of the discontinued LLVM Phabricator instance.

Make git-clang-format python 3 compatible
ClosedPublic
