Details

Reviewers

anemet
fhahn

Commits

rG55bfb497d246: [opt-viewer] Put critical items in parallel
rL293261: [opt-viewer] Put critical items in parallel

Summary

Put opt-viewer critical items in parallel

Requires features from Python 2.7

Performance
Below are performance results across various configurations. These were taken on an i5-5200U (dual core + HT) with a small subset (60 YAML files) of the YAML output from building Python 3.6.0b3 with LTO+PGO.

"multiprocessing" is the current submission contents. "baseline" is as of 544f14c6b2a07a94168df31833dba9dc35fd8289 (I think this is aka r287505).

"ImportError" vs "class<...CLoader>" below are just confirming the expected configuration (with/without CLoader).

The results below were measured on an AMD A8-5500B (4 cores) with 224 input YAML files and show a ~1.75x speedup over the baseline with libYAML. I suspect it would scale well on high-end servers.

**************************************** MULTIPROCESSING ****************************************
PyYAML:
        Traceback (most recent call last):
          File "<string>", line 1, in <module>
        ImportError: cannot import name CLoader
        Python 2.7.10
489.42user 5.53system 2:38.03elapsed 313%CPU (0avgtext+0avgdata 400308maxresident)k
0inputs+31392outputs (0major+473540minor)pagefaults 0swaps

PyYAML+libYAML:
        <class 'yaml.cyaml.CLoader'>
        Python 2.7.10
78.69user 5.45system 0:32.63elapsed 257%CPU (0avgtext+0avgdata 398560maxresident)k
0inputs+31392outputs (0major+542022minor)pagefaults 0swaps

PyPy/PyYAML:
        Traceback (most recent call last):
          File "<builtin>/app_main.py", line 75, in run_toplevel
          File "<builtin>/app_main.py", line 601, in run_it
          File "<string>", line 1, in <module>
        ImportError: cannot import name 'CLoader'
        Python 2.7.9 (2.6.0+dfsg-3, Jul 04 2015, 05:43:17)
        [PyPy 2.6.0 with GCC 4.9.3]
154.27user 8.12system 0:53.83elapsed 301%CPU (0avgtext+0avgdata 627960maxresident)k
808inputs+30376outputs (0major+727994minor)pagefaults 0swaps
**************************************** BASELINE        ****************************************
PyYAML:
        Traceback (most recent call last):
          File "<string>", line 1, in <module>
        ImportError: cannot import name CLoader
        Python 2.7.10
358.08user 4.05system 6:08.37elapsed 98%CPU (0avgtext+0avgdata 315004maxresident)k
0inputs+31392outputs (0major+85252minor)pagefaults 0swaps

PyYAML+libYAML:
        <class 'yaml.cyaml.CLoader'>
        Python 2.7.10
50.32user 3.30system 0:56.59elapsed 94%CPU (0avgtext+0avgdata 307296maxresident)k
0inputs+31392outputs (0major+79335minor)pagefaults 0swaps

PyPy/PyYAML:
        Traceback (most recent call last):
          File "<builtin>/app_main.py", line 75, in run_toplevel
          File "<builtin>/app_main.py", line 601, in run_it
          File "<string>", line 1, in <module>
        ImportError: cannot import name 'CLoader'
        Python 2.7.9 (2.6.0+dfsg-3, Jul 04 2015, 05:43:17)
        [PyPy 2.6.0 with GCC 4.9.3]
72.94user 5.18system 1:23.41elapsed 93%CPU (0avgtext+0avgdata 455312maxresident)k
0inputs+30392outputs (0major+110280minor)pagefaults 0swaps
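The speedups implied by the logs above can be checked with a few lines of arithmetic. The sketch below is only an illustration: it copies the `elapsed` fields from the two sections and divides baseline by multiprocessing; the helper name `elapsed_seconds` is made up for this example.

```python
def elapsed_seconds(stamp):
    """Convert a GNU time 'elapsed' field such as '2:38.03' to seconds."""
    seconds = 0.0
    for part in stamp.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


# (baseline, multiprocessing) elapsed times copied from the logs above.
runs = {
    "PyYAML": ("6:08.37", "2:38.03"),
    "PyYAML+libYAML": ("0:56.59", "0:32.63"),
    "PyPy/PyYAML": ("1:23.41", "0:53.83"),
}

speedups = dict(
    (name, elapsed_seconds(base) / elapsed_seconds(par))
    for name, (base, par) in runs.items()
)

for name in sorted(speedups):
    print("%-16s %.2fx" % (name, speedups[name]))
```

The PyYAML+libYAML ratio comes out close to the ~1.75x quoted in the summary; the pure-Python PyYAML configuration benefits more, and PyPy somewhat less.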

Diff Detail

Repository: rL LLVM

Event Timeline

bcain updated this revision to Diff 78858. Nov 22 2016, 5:30 AM

bcain retitled this revision from to Put opt-viewer critical items in parallel.

bcain updated this object.

bcain added reviewers: anemet, fhahn.

bcain set the repository for this revision to rL LLVM.

bcain added a subscriber: llvm-commits.

Herald added a subscriber: mehdi_amini. Nov 22 2016, 5:30 AM

bcain mentioned this in D26789: opt-viewer parallelized. Nov 22 2016, 5:33 AM

Is baseline with PyYAML+libYAML just slightly slower than multiprocessing with PyYAML+libYAML (0:12.57elapsed vs 0:09.66elapsed) or am I missing something?

Re-generated the diff with maximum context.

In D26967#602463, @fhahn wrote:

Is baseline with PyYAML+libYAML just slightly slower than multiprocessing with PyYAML+libYAML (0:12.57elapsed vs 0:09.66elapsed) or am I missing something?

Sorry, it was a bad case to present. Only 60 input files and minimal parallelism (only 4 tasks spawned). It will scale significantly better on many-core servers + hundreds of input files.

I see. I think it would be good to present a case where multiprocessing provides a good speedup compared to the baseline PyYAML+libYAML, which alone seems to yield a 10x speedup vs without PyYAML+libYAML .

bcain updated this object. Nov 23 2016, 5:41 AM

In D26967#602510, @fhahn wrote:

I see. I think it would be good to present a case where multiprocessing provides a good speedup compared to the baseline PyYAML+libYAML, which alone seems to yield a 10x speedup vs without PyYAML+libYAML .

Updated the summary to show the unabridged YAML from Python3.6.0b3 on a 4-core desktop. It's 1.75x over the existing PyYAML+libYAML.

Here are some numbers I gathered for analyzing 250 opt.yaml files generated by compiling clang/llvm with the following options: "-O2 -g -fsave-optimization-record -mllvm -pass-remarks -mllvm -pass-remarks-missed". The size of the .opt.yaml files varies between a few kB and a few MB.

I used Python 2.7.6 on Linux with PyYAML and libyaml on an Intel Xeon with a large number of cores (time used to measure the runtime):

baseline:
./opt-viewer-base.py `cat yamls_250.txt` `pwd`/out-baseline  257.55s user 86.05s system 91% cpu 6:15.89 total

parallel with 10 processes:
./opt-viewer.py `cat yamls_250.txt` `pwd`/out-parallel-10  607.80s user 268.75s system 160% cpu 9:04.52 total

parallel with 4 processes:
./opt-viewer.py `cat yamls_250.txt` `pwd`/out-parallel-4  524.31s user 196.45s system 163% cpu 7:20.54 total

I'm not sure if I did something wrong. How big were the opt.yaml files you used?

In D26967#604923, @fhahn wrote:

Here are some numbers I gathered for analyzing 250 opt.yaml files generated by compiling clang/llvm with the following options: "-O2 -g -fsave-optimization-record -mllvm -pass-remarks -mllvm -pass-remarks-missed". The size of the .opt.yaml files varies between a few kB and a few MB.

I used Python 2.7.6 on Linux with PyYAML and libyaml on an Intel Xeon with a large number of cores (time used to measure the runtime):

Hmm, that's kinda surprising. Maybe the overhead of the Lock() is bigger when there's more competition. Could you artificially limit the parallelism (min(cpu_count(), 4) or using cgroups/taskset)? Or comment out the Lock?

The servers I have access to will be mildly more difficult to run libYAML on but I will give it a go.

I used the Python3.6.0b3 source and only "-fsave-optimization-record". 224 files, ranging from ~10k up to 3.4M.
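The suggestion above to cap parallelism with `min(cpu_count(), 4)` could be tried along these lines. This is a minimal sketch, not the patch's code: `capped_jobs`, `run_capped`, and the `square` stand-in worker are hypothetical names used only for illustration.

```python
from multiprocessing import Pool, cpu_count


def capped_jobs(limit=4):
    """Return the worker count: the CPU count, but never more than `limit`."""
    return min(cpu_count(), limit)


def square(x):
    # Stand-in for per-file work such as parsing one YAML file.
    return x * x


def run_capped(items, limit=4):
    """Map the stand-in worker over `items` with a capped process pool."""
    pool = Pool(processes=capped_jobs(limit))
    try:
        return pool.map(square, items)
    finally:
        # Pool is not a context manager on Python 2.7, so clean up by hand.
        pool.close()
        pool.join()
```

Calling `run_capped(range(8))` under an `if __name__ == '__main__':` guard runs the work on at most four processes, which makes it easier to tell whether contention grows with the worker count.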

In D26967#604923, @fhahn wrote:

Here are some numbers I gathered for analyzing 250 opt.yaml files generated by compiling clang/llvm with the following options: "-O2 -g -fsave-optimization-record -mllvm -pass-remarks -mllvm -pass-remarks-missed". The size of the .opt.yaml files varies between a few kB and a few MB.

There is no need to pass -pass-remarks*, -fsave-optimization-record gathers all remarks.

In D26967#604923, @fhahn wrote:

Here are some numbers I gathered for analyzing 250 opt.yaml files generated by compiling clang/llvm with the following options: "-O2 -g -fsave-optimization-record -mllvm -pass-remarks -mllvm -pass-remarks-missed". The size of the .opt.yaml files varies between a few kB and a few MB.

...

I'm not sure if I did something wrong. How big were the opt.yaml files you used?

I wasn't able to reproduce these results on an i7 4x2 machine either. I switched the test input from CPython to Apache Thrift. It's a C++ project, so it will leverage c++filt more. I varied the number of processes, and the time either improved or was neutral as I added processes. I tried with and without -source-dir because I noticed your experiments omitted it. Nothing that I measured was ever slower than the baseline.

Florian, can you upload your specific subset of the LLVM .yaml files somewhere? That way I can try to isolate whether the problem is only visible in that test case or something else. If it's convenient, can you do an experiment where the yaml files are in a tmpfs partition like /dev/shm and see if this patch is still slower than the baseline?

Hi Brian,

I will also do some experiments with this patch but here are my comments on the patch itself. In general feel free to peel off your formatting changes and just commit them right away.

Adam

utils/opt-viewer/opt-viewer.py
60–61 ↗	(On Diff #78860)	Please commit formatting changes separately.
68–69 ↗	(On Diff #78860)	This too, please commit separately.
134–143 ↗	(On Diff #78860)	Looks like another formatting change.
203 ↗	(On Diff #78860)	I am still not sure that this is the right thing to do but either way it should go into a separate review.
221–222 ↗	(On Diff #78860)	Formatting-only change.
346 ↗	(On Diff #78860)	merge_file_remark_dicts
351 ↗	(On Diff #78860)	k -> file, k_ -> line, v_ -> remarks
378–396 ↗	(On Diff #78860)	Since you're passing all_remarks, it's confusing that you're not passing file_remarks.

I'll omit the formatting changes from autopep8 and the encoding change, and fix the exception handling bug in _get_remarks() and the render_file() file_remarks/all_remarks bug.

utils/opt-viewer/opt-viewer.py
337 ↗	(On Diff #78860)	This is not how this failure should get reported.
378–396 ↗	(On Diff #78860)	Yes, this looks to me like a bug.

anemet added inline comments. Dec 1 2016, 5:27 PM

utils/opt-viewer/opt-viewer.py
203 ↗	(On Diff #78860)	Actually, I think you're right about the encoding. Feel free to commit this part. Thanks!

I see consistent speed-ups but the resulting HTML directories don't match the original. It *seems* there are some extra remarks so I am assuming the uniquing does not work? Can you please look into it? After that I think this is ready to go in.

Also one more high-level comment. We should probably expose the level of parallelism through a '-j' option defaulting to cpu_count().

Removed formatting and other unrelated changes.

Added "--jobs".

Fixed bug in passing file_remarks/all_remarks

Response: most changes made as requested.

utils/opt-viewer/opt-viewer.py
346 ↗	(On Diff #78860)	I left this as-is. It's designed to be an abstract merge among an iterable of dictionaries. `[ {'a': [3], }, {'a': [4], }, {'b': [6] }] -> {'a': [3,4,], 'b': [6]}` If you feel it would be better with the specific concepts we're leveraging here, I'll change it.

anemet added inline comments. Dec 5 2016, 8:24 PM

utils/opt-viewer/opt-viewer.py
346 ↗	(On Diff #78860)	Sounds good, just add a comment at the call then that this is merging the list of remarks at each line of each file.
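The abstract merge discussed in this thread can be sketched as follows. This is an illustration under assumptions rather than the patch's `merge_dicts`: the function names are hypothetical, the first variant reproduces the one-level example from the comment, and the second applies the same idea to the file -> line -> remarks nesting.

```python
from collections import defaultdict


def merge_list_dicts(dicts):
    """Merge dicts whose values are lists by concatenating the lists."""
    merged = defaultdict(list)
    for d in dicts:
        for key, values in d.items():
            merged[key].extend(values)
    return dict(merged)


def merge_nested_list_dicts(dicts):
    """Same idea one level deeper: file -> line -> list of remarks."""
    merged = defaultdict(lambda: defaultdict(list))
    for d in dicts:
        for outer, inner in d.items():
            for inner_key, values in inner.items():
                merged[outer][inner_key].extend(values)
    return dict((outer, dict(inner)) for outer, inner in merged.items())


flat = merge_list_dicts([{'a': [3]}, {'a': [4]}, {'b': [6]}])
print(flat)

# Made-up sample data: strings stand in for remark objects.
nested = merge_nested_list_dicts([
    {'a.c': {10: ['r1']}},
    {'a.c': {10: ['r2'], 12: ['r3']}},
    {'b.c': {7: ['r4']}},
])
print(nested)
```

Note that neither variant removes duplicates; lists are concatenated in the order the input dicts arrive.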

Brian, just to make sure we're not deadlocked here waiting for each other. I am still looking for an answer to this question:

I see consistent speed-ups but the resulting HTML directories don't match the original. It *seems* there are some extra remarks so I am assuming the uniquing does not work? Can you please look into it? After that I think this is ready to go in.

In D26967#616015, @anemet wrote:

Brian, just to make sure we're not deadlocked here waiting for each other. I am still looking for an answer to this question:

I see consistent speed-ups but the resulting HTML directories don't match the original. It *seems* there are some extra remarks so I am assuming the uniquing does not work? Can you please look into it? After that I think this is ready to go in.

I'd hoped that might go away when I fixed the args getting passed to generate_report().

I'll run some tests to see if I can reproduce the problem.

Updated comments for merge_dicts().

Adam, can you provide a test case that illustrates the problem? I didn't see it with the last two revisions.

In D26967#616742, @bcain wrote:

Adam, can you provide a test case that illustrates the problem? I didn't see it with the last two revisions.

I just sent you an email with the details.

Thanks,
Adam

In D26967#617705, @anemet wrote:

In D26967#616742, @bcain wrote:

Adam, can you provide a test case that illustrates the problem? I didn't see it with the last two revisions.

I just sent you an email with the details.

The differences are limited to the index.html. Most of the differences are inconsequential. I'm trying to explain them one by one and a few more unexplained ones remain, so stay tuned.

The inconsequential ones are due to ordering differences in the index.html between entries with identical hotness. I can make those differences go away if I change both the baseline and the patch to have this line:

sorted_remarks = sorted(all_remarks.itervalues(), key=lambda r: (r.Hotness, r.__dict__), reverse=True)

... instead of sorting on Hotness alone, this tuple also considers all the other fields of the remark.

I will include this change with my next update. It would stabilize comparisons going forward but it won't make the output comparison from this patch match the baseline.
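To illustrate why a tuple key stabilizes the ordering, here is a small sketch. It is not the line proposed above: comparing `r.__dict__` values relies on Python 2 dict ordering, which Python 3 does not support, so this example spells the tie-breaking fields out explicitly. The `Remark` namedtuple and the sample values are hypothetical.

```python
from collections import namedtuple

# Hypothetical stand-in for opt-viewer's Remark class.
Remark = namedtuple("Remark", "Hotness File Line Column Pass Name")


def sort_key(r):
    # Hotness first; the remaining fields only break ties.
    return (r.Hotness, r.File, r.Line, r.Column, r.Pass, r.Name)


# Made-up sample remarks; a and b share the same hotness.
a = Remark(5, "a.c", 10, 3, "inline", "Inlined")
b = Remark(5, "a.c", 12, 1, "licm", "Hoisted")
c = Remark(9, "b.c", 2, 7, "gvn", "LoadElim")

# The same remarks arriving in two different orders.
order1 = sorted([a, b, c], key=sort_key, reverse=True)
order2 = sorted([c, b, a], key=sort_key, reverse=True)
```

With hotness alone as the key, `a` and `b` would keep whatever relative order they arrived in, so serial and parallel runs could emit different index.html files for identical data.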

Thanks for the update, Brian:

In D26967#650414, @bcain wrote:

In D26967#617705, @anemet wrote:

I just sent you an email with the details.

The differences are limited to the index.html. Most of the differences are inconsequential. I'm trying to explain them one by one and a few more unexplained ones remain, so stay tuned.

There were differences in the source views as well. I just investigated and looks like the remark uniquing is working differently between the serial and parallel -- neither of them being correct.

When we're uniquing the remarks using all_remarks, key does not include the containing function, so if there is no hotness difference between the different functions (inlining contexts) they will get uniqued.

On the other hand, the parallel version only does uniquing within one thread. We do merge all_remarks from each thread but file_remarks at that point could contain duplicates. We should probably unique in merge_dicts?

If you want to see an example of this, diff _Users_adam_proj_org_llvm_build-rel_bin_.._include_c++_v1___functional_base.html between html-orig and html-par from the tarball I sent you. On the very first line we annotate (line 63), there are licm remarks with different inlining contexts on the parallel version but not on the serial one.
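The two uniquing behaviors described above can be illustrated with a small sketch. Everything here is an assumption for illustration: the real `Remark.key` fields are not shown on this page, so the namedtuple, both key functions, and the sample data are hypothetical.

```python
from collections import namedtuple

# Hypothetical stand-in for a remark; field names are assumptions.
Remark = namedtuple("Remark", "File Line Column Pass Name Hotness Function")


def key_without_function(r):
    # Remarks from different inlining contexts collapse into one.
    return (r.File, r.Line, r.Column, r.Pass, r.Name, r.Hotness)


def key_with_function(r):
    # Adding the containing function keeps inlining contexts apart.
    return key_without_function(r) + (r.Function,)


def unique(remarks, key):
    """Keep the first remark seen for each key, preserving order."""
    seen = set()
    result = []
    for r in remarks:
        k = key(r)
        if k not in seen:
            seen.add(k)
            result.append(r)
    return result


# Made-up results from two workers; one remark is reported by both.
worker_a = [Remark("f.h", 63, 5, "licm", "Hoisted", 0, "foo")]
worker_b = [Remark("f.h", 63, 5, "licm", "Hoisted", 0, "foo"),
            Remark("f.h", 63, 5, "licm", "Hoisted", 0, "bar")]

merged = worker_a + worker_b
collapsed = unique(merged, key_without_function)
per_function = unique(merged, key_with_function)
```

Uniquing after the merge, rather than only inside each worker, is what removes the duplicate reported by both workers; which key to use decides whether the `foo` and `bar` contexts survive as separate remarks.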

In D26967#651691, @anemet wrote:

Thanks for the update, Brian:

In D26967#650414, @bcain wrote:

In D26967#617705, @anemet wrote:

I just sent you an email with the details.

The differences are limited to the index.html. Most of the differences are inconsequential. I'm trying to explain them one by one and a few more unexplained ones remain, so stay tuned.

There were differences in the source views as well. I just investigated and looks like the remark uniquing is working differently between the serial and parallel -- neither of them being correct.

Oh, yes. I suppose I won't be able to produce most of the source views correctly for this example unless I have all of the source files (and have them installed at the same paths as the producer).

When we're uniquing the remarks using all_remarks, key does not include the containing function, so if there is no hotness difference between the different functions (inlining contexts) they will get uniqued.

On the other hand, the parallel version only does uniquing within one thread. We do merge all_remarks from each thread but file_remarks at that point could contain duplicates. We should probably unique in merge_dicts?

file_remarks is indexed on file-line. The difference I can see between the serial and parallel implementations is not the presence or absence of duplicates but the order in which they're append()ed. But that shouldn't matter if file_remarks are only referenced while generating the source file renders, right?

If the .key property of Remark would be better off by having a self.Function element, should I just add it?

Also: note that this parallel implementation spawns processes, not threads. This means that globally visible changes like e.g. class-members have to be manually delivered back to the parent. This is why the Remark.max_hotness is not updated in get_remarks(). Changing this implementation to be multithreaded is simple and only requires slightly different imports.

If you want to see an example of this, diff _Users_adam_proj_org_llvm_build-rel_bin_.._include_c++_v1___functional_base.html between html-orig and html-par from the tarball I sent you. On the very first line we annotate (line 63), there are licm remarks with different inlining contexts on the parallel version but not on the serial one.

I'll see if I can make a minimal reproducer, sorting through all of these differences is difficult to reason about IMO.
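The point above about processes versus threads can be sketched as follows: a worker process cannot usefully assign to a class attribute such as `Remark.max_hotness`, so it returns its local result and the parent combines them. This is a minimal illustration with hypothetical names (`local_max`, `gather_max`, `gather_max_with_processes`), not the patch's code.

```python
from multiprocessing import Pool


class Remark(object):
    # Hypothetical stand-in: only the class attribute matters here.
    max_hotness = 0


def local_max(hotness_values):
    """Worker: return the local maximum instead of assigning
    Remark.max_hotness, which would only change the worker's own copy."""
    return max(hotness_values) if hotness_values else 0


def gather_max(map_fn, batches):
    """Parent: reduce the per-worker maxima and update the class once."""
    results = list(map_fn(local_max, batches))
    Remark.max_hotness = max([Remark.max_hotness] + results)
    return Remark.max_hotness


def gather_max_with_processes(batches, jobs=2):
    """Same reduction, with the workers running in separate processes."""
    pool = Pool(processes=jobs)
    try:
        return gather_max(pool.map, batches)
    finally:
        pool.close()
        pool.join()
```

Passing the builtin `map` to `gather_max` gives the serial behavior, which makes the reduction easy to compare against the process-based version called under an `if __name__ == '__main__':` guard.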

Hi Brian!

In D26967#653334, @bcain wrote:

If the .key property of Remark would be better off by having a self.Function element, should I just add it?

I think we should just go with your patch as is and incrementally improve the situation from there. As I said the original version is also incorrect so we're technically not regressing anything here.

I have been holding off commits to avoid causing merge conflicts for you but the queue is getting pretty long; I'd like to get this in and go from there.

Also: note that this parallel implementation spawns processes, not threads. This means that globally visible changes like e.g. class-members have to be manually delivered back to the parent. This is why the Remark.max_hotness is not updated in get_remarks(). Changing this implementation to be multithreaded is simple and only requires slightly different imports.

If you want to see an example of this, diff _Users_adam_proj_org_llvm_build-rel_bin_.._include_c++_v1___functional_base.html between html-orig and html-par from the tarball I sent you. On the very first line we annotate (line 63), there are licm remarks with different inlining contexts on the parallel version but not on the serial one.

I'll see if I can make a minimal reproducer, sorting through all of these differences is difficult to reason about IMO.

I wouldn't worry about this. Why don't you just post the latest version of the patch, and then we can improve/fix things further in tree?

Thanks again for your work.

Adam

In D26967#653413, @anemet wrote:

Hi Brian!

In D26967#653334, @bcain wrote:

If the .key property of Remark would be better off by having a self.Function element, should I just add it?

I think we should just go with your patch as is and incrementally improve the situation from there. As I said the original version is also incorrect so we're technically not regressing anything here.

Ok, thanks. Can you commit this patch under review, then? I don't believe that I have commit privileges.

-Brian

In D26967#655238, @bcain wrote:

Ok, thanks. Can you commit this patch under review, then? I don't believe that I have commit privileges.

Sure!

anemet accepted this revision. Jan 24 2017, 11:41 AM

This revision is now accepted and ready to land. Jan 24 2017, 11:41 AM

Closed by commit rL293261: [opt-viewer] Put critical items in parallel (authored by anemet). Jan 26 2017, 10:49 PM

This revision was automatically updated to reflect the committed changes.

Brian,

I've found a few more things that needed fixing. Please see commits r293262-r293266. Let me know if you have comments.

Also, my speedup for the files under lib/Transforms/Scalar is from 9m42 to 4m17.

Thanks again,
Adam

Diff 86016

llvm/trunk/utils/opt-viewer/opt-viewer.py

Show All 9 Lines
For faster parsing, you may want to use libYAML with PyYAML.'''		For faster parsing, you may want to use libYAML with PyYAML.'''

import yaml		import yaml
# Try to use the C parser.		# Try to use the C parser.
try:		try:
from yaml import CLoader as Loader		from yaml import CLoader as Loader
except ImportError:		except ImportError:
from yaml import Loader		from yaml import Loader

		import functools
		from collections import defaultdict
		import itertools
		from multiprocessing import Pool
		from multiprocessing import Lock, cpu_count
		import errno
import argparse		import argparse
import os.path		import os.path
import re		import re
import subprocess		import subprocess
import shutil		import shutil
from pygments import highlight		from pygments import highlight
from pygments.lexers.c_cpp import CppLexer		from pygments.lexers.c_cpp import CppLexer
from pygments.formatters import HtmlFormatter		from pygments.formatters import HtmlFormatter

parser = argparse.ArgumentParser(description=desc)
parser.add_argument('yaml_files', nargs='+')
parser.add_argument('output_dir')
parser.add_argument('-source-dir', '-s', default='', help='set source directory')
args = parser.parse_args()

p = subprocess.Popen(['c++filt', '-n'], stdin=subprocess.PIPE, stdout=subprocess.PIPE)		p = subprocess.Popen(['c++filt', '-n'], stdin=subprocess.PIPE, stdout=subprocess.PIPE)
		p_lock = Lock()


def demangle(name):		def demangle(name):
		with p_lock:
p.stdin.write(name + '\n')		p.stdin.write(name + '\n')
return p.stdout.readline().rstrip()		return p.stdout.readline().rstrip()


class Remark(yaml.YAMLObject):		class Remark(yaml.YAMLObject):
max_hotness = 0		max_hotness = 0

# Work-around for http://pyyaml.org/ticket/154.		# Work-around for http://pyyaml.org/ticket/154.
yaml_loader = Loader		yaml_loader = Loader

▲ Show 20 Lines • Show All 104 Lines • ▼ Show 20 Lines	class Missed(Remark):
yaml_tag = '!Missed'		yaml_tag = '!Missed'

@property		@property
def color(self):		def color(self):
return "red"		return "red"


class SourceFileRenderer:		class SourceFileRenderer:
def __init__(self, filename):		def __init__(self, source_dir, output_dir, filename):
existing_filename = None		existing_filename = None
if os.path.exists(filename):		if os.path.exists(filename):
existing_filename = filename		existing_filename = filename
else:		else:
fn = os.path.join(args.source_dir, filename)		fn = os.path.join(source_dir, filename)
if os.path.exists(fn):		if os.path.exists(fn):
existing_filename = fn		existing_filename = fn

self.stream = open(os.path.join(args.output_dir, SourceFileRenderer.html_file_name(filename)), 'w')		self.stream = open(os.path.join(output_dir, SourceFileRenderer.html_file_name(filename)), 'w')
if existing_filename:		if existing_filename:
self.source_stream = open(existing_filename)		self.source_stream = open(existing_filename)
else:		else:
self.source_stream = None		self.source_stream = None
print('''		print('''
<html>		<html>
<h1>Unable to locate file {}</h1>		<h1>Unable to locate file {}</h1>
</html>		</html>
▲ Show 20 Lines • Show All 61 Lines • ▼ Show 20 Lines
</html>''', file=self.stream)		</html>''', file=self.stream)

@classmethod		@classmethod
def html_file_name(cls, filename):		def html_file_name(cls, filename):
return filename.replace('/', '_') + ".html"		return filename.replace('/', '_') + ".html"


class IndexRenderer:		class IndexRenderer:
def __init__(self):		def __init__(self, output_dir):
self.stream = open(os.path.join(args.output_dir, 'index.html'), 'w')		self.stream = open(os.path.join(output_dir, 'index.html'), 'w')

def render_entry(self, r):		def render_entry(self, r):
print('''		print('''
<tr>		<tr>
<td><a href={r.Link}>{r.DebugLocString}</a></td>		<td><a href={r.Link}>{r.DebugLocString}</a></td>
<td>{r.RelativeHotness}</td>		<td>{r.RelativeHotness}</td>
<td>{r.DemangledFunctionName}</td>		<td>{r.DemangledFunctionName}</td>
<td class=\"column-entry-{r.color}\">{r.Pass}</td>		<td class=\"column-entry-{r.color}\">{r.Pass}</td>
Show All 17 Lines	</tr>''', file=self.stream)
for remark in all_remarks:		for remark in all_remarks:
self.render_entry(remark)		self.render_entry(remark)
print('''		print('''
</table>		</table>
</body>		</body>
</html>''', file=self.stream)		</html>''', file=self.stream)


		def get_remarks(input_file):
		max_hotness = 0
all_remarks = dict()		all_remarks = dict()
file_remarks = dict()		file_remarks = defaultdict(functools.partial(defaultdict, list))

for input_file in args.yaml_files:		with open(input_file) as f:
f = open(input_file)
docs = yaml.load_all(f, Loader=Loader)		docs = yaml.load_all(f, Loader=Loader)

for remark in docs:		for remark in docs:
# Avoid remarks withoug debug location or if they are duplicated		# Avoid remarks withoug debug location or if they are duplicated
if not hasattr(remark, 'DebugLoc') or remark.key in all_remarks:		if not hasattr(remark, 'DebugLoc') or remark.key in all_remarks:
continue		continue
all_remarks[remark.key] = remark		all_remarks[remark.key] = remark

file_remarks.setdefault(remark.File, dict()).setdefault(remark.Line, []).append(remark)		file_remarks[remark.File][remark.Line].append(remark)

		max_hotness = max(max_hotness, remark.Hotness)

		return max_hotness, all_remarks, file_remarks


		def _render_file(source_dir, output_dir, entry):
		filename, remarks = entry
		SourceFileRenderer(source_dir, output_dir, filename).render(remarks)


		def gather_results(pool, filenames):
		all_remarks = dict()
		remarks = pool.map(get_remarks, filenames)

		def merge_dicts(dicts):
		''' Takes an iterable of dicts and merges them into
		a single dict. Nested dicts are merged as well.
		>>> merge_dicts([ {'a': [3], }, {'a': [4], }, {'b': [6] }])
		{'a': [3,4,], 'b': [6]}
		>>> merge_dicts([ {'a': {'q': [6,3], 'f': [30],}, }, {'a': {'f': [4,10]}, }, {'b': [6] }])
		{'a': [{'q': [6,3]}, {'f': [4,10,30]}], 'b': [6]}

		'''
		merged = defaultdict(functools.partial(defaultdict, list))

		for k, v in itertools.chain(*[d.iteritems() for d in dicts]):
		for k_, v_ in v.items():
		merged[k][k_] += v_

Remark.max_hotness = max(Remark.max_hotness, remark.Hotness)		return merged

# Set up a map between function names and their source location for function where inlining happened		file_remark_dicts = [entry[2] for entry in remarks]
		# merge the list of remarks at each line of each file
		file_remarks = merge_dicts(file_remark_dicts)

		# merge individual 'all_remark' results:
		for _, all_rem, _ in remarks:
		all_remarks.update(all_rem)

		Remark.max_hotness = max(entry[0] for entry in remarks)

		return all_remarks, file_remarks


		def map_remarks(all_remarks):
		# Set up a map between function names and their source location for
		# function where inlining happened
for remark in all_remarks.itervalues():		for remark in all_remarks.itervalues():
if type(remark) == Passed and remark.Pass == "inline" and remark.Name == "Inlined":		if isinstance(remark, Passed) and remark.Pass == "inline" and remark.Name == "Inlined":
for arg in remark.Args:		for arg in remark.Args:
caller = arg.get('Caller')		caller = arg.get('Caller')
if caller:		if caller:
Remark.caller_loc[caller] = arg['DebugLoc']		Remark.caller_loc[caller] = arg['DebugLoc']


		def generate_report(pool, all_remarks, file_remarks, source_dir, output_dir):
		try:
		os.makedirs(output_dir)
		except OSError as e:
		if e.errno == errno.EEXIST and os.path.isdir(output_dir):
		pass
		else:
		raise

		_render_file_bound = functools.partial(_render_file, source_dir, output_dir)
		pool.map(_render_file_bound, file_remarks.items())

if Remark.should_display_hotness():		if Remark.should_display_hotness():
sorted_remarks = sorted(all_remarks.itervalues(), key=lambda r: r.Hotness, reverse=True)		sorted_remarks = sorted(all_remarks.itervalues(), key=lambda r: r.Hotness, reverse=True)
else:		else:
sorted_remarks = sorted(all_remarks.itervalues(), key=lambda r: (r.File, r.Line, r.Column))		sorted_remarks = sorted(all_remarks.itervalues(), key=lambda r: (r.File, r.Line, r.Column))
		IndexRenderer(args.output_dir).render(sorted_remarks)

		shutil.copy(os.path.join(os.path.dirname(os.path.realpath(__file__)),
		"style.css"), output_dir)


		if __name__ == '__main__':
		parser = argparse.ArgumentParser(description=desc)
		parser.add_argument('yaml_files', nargs='+')
		parser.add_argument('output_dir')
		parser.add_argument(
		'--jobs',
		'-j',
		default=cpu_count(),
		type=int,
		help='Max job count (defaults to current CPU count)')
		parser.add_argument(
		'-source-dir',
		'-s',
		default='',
		help='set source directory')
		args = parser.parse_args()

if not os.path.exists(args.output_dir):		if len(args.yaml_files) == 0:
os.mkdir(args.output_dir)		parser.print_help()
		sys.exit(1)

for (filename, remarks) in file_remarks.iteritems():		pool = Pool(processes=args.jobs)
SourceFileRenderer(filename).render(remarks)		all_remarks, file_remarks = gather_results(pool, args.yaml_files)

IndexRenderer().render(sorted_remarks)		map_remarks(all_remarks)

shutil.copy(os.path.join(os.path.dirname(os.path.realpath(__file__)), "style.css"), args.output_dir)		generate_report(pool, all_remarks, file_remarks, args.source_dir, args.output_dir)

This is an archive of the discontinued LLVM Phabricator instance.

Put opt-viewer critical items in parallel
ClosedPublic