MatzeB (Matthias Braun)
User

Projects

User does not belong to any projects.

User Details

User Since
Aug 13 2013, 3:10 PM (257 w, 6 h)

Recent Activity

Yesterday

MatzeB added a comment to D49353: [RegAlloc] Skip global splitting if the live range is huge and its spill is trivially rematerializable .
In D49353#1163951, @wmi wrote:

For the record: The check is based on a LiveInterval::size() which gives you the number of segments. So I assume what is "huge" here is the number of basic blocks?

One segment can span multiple basicblocks. I am not sure whether one basicblock can have multiple segments inside of it theoretically, but it is uncommon. So emperically large number of segments mean large number of basicblocks, then large number of edge bundle nodes and high hopfield neural network algorithm cost.

Mon, Jul 16, 1:01 PM
MatzeB added inline comments to D49353: [RegAlloc] Skip global splitting if the live range is huge and its spill is trivially rematerializable .
Mon, Jul 16, 10:56 AM
MatzeB added a comment to D49353: [RegAlloc] Skip global splitting if the live range is huge and its spill is trivially rematerializable .

For the record: The check is based on a LiveInterval::size() which gives you the number of segments. So I assume what is "huge" here is the number of basic blocks?

Mon, Jul 16, 10:55 AM

Fri, Jul 13

MatzeB added a comment to D43542: [CodeGen][FastRegAlloc] Disable registers spilling for a naked function (PR28641).

I think fastregalloc/-O0 isn't really expecting all liveness flags to be complete (unfortunately hard to tell/not written down).

Fri, Jul 13, 6:24 PM
MatzeB added a comment to D48442: [SPECCPU2017] Add addition platform options and missing flags..

My LGTM still stands (in case you are waiting for another approval).

Fri, Jul 13, 6:19 PM
MatzeB added inline comments to D48776: [MachineOutliner] Add support for target-default outlining.
Fri, Jul 13, 2:05 PM
MatzeB accepted D47999: MIR YAML TracksRegLiveness default to true in yaml parser.

I still think we should rather try to not set TracksRegLiveness early in the codegen pipeline instead. Let me try how well that works...

Fri, Jul 13, 1:50 PM
MatzeB added a comment to D47999: MIR YAML TracksRegLiveness default to true in yaml parser.

I still think we should rather try to not set TracksRegLiveness early in the codegen pipeline instead. Let me try how well that works...

Fri, Jul 13, 12:50 PM
MatzeB added inline comments to D45308: [IPRA] Do not collect register usage information on functions that can be derefined.
Fri, Jul 13, 12:49 PM
MatzeB added inline comments to D45308: [IPRA] Do not collect register usage information on functions that can be derefined.
Fri, Jul 13, 12:47 PM
MatzeB added inline comments to D45308: [IPRA] Do not collect register usage information on functions that can be derefined.
Fri, Jul 13, 12:44 PM
MatzeB added inline comments to D48046: [test-suite] Backprop kernel from Rodinia Benchmark.
Fri, Jul 13, 12:32 PM
MatzeB requested changes to D48046: [test-suite] Backprop kernel from Rodinia Benchmark.
  • Patches should be relative to the toplevel test-suite directory.
  • 94k of reference output seems a lot and smells like you may end up in the anti-pattern of spending most of the benchmark time in printing instead of doing calculations.
  • We currently tend to not add -ffast-math as part of the makefiles, but instead people choose it together with their other optimization flags.
  • How long does this benchmark run? (We aim for 0.5-1s on typical PC hardware); Bonus points if the duration can be changed from the commandline (no expections for matching reference outputs for different input size though).
  • srand()/rand() produce different output with different libc implementations!
Fri, Jul 13, 12:27 PM
MatzeB added a comment to D47524: [MachineInstruction] Put implicit operands part of MCID before ones that aren't in addOperand().

or maybe prependImplicitDef()?

Fri, Jul 13, 12:19 PM
MatzeB added a comment to D47524: [MachineInstruction] Put implicit operands part of MCID before ones that aren't in addOperand().

I would consider setDesc() a rare/exceptional operation. So I'm not sure I like an extra loop for the common addOperand() maybe we can rather provide an insertOperand() to accomodate the setDesc() scenario you describe?

Fri, Jul 13, 12:18 PM
MatzeB accepted D48582: Reverse subregister saved loops in register usage info collector..

I fear the covered check will hit us perf wise either way (+CC some people from our GPU team) and that we need to find a different solution long term (like the Targets announcing the saved registers themselfes instead of the generic code piecing together the information...).
Anyway if this works for you I'd be fine with this patch as a stopgap.

Fri, Jul 13, 12:14 PM
MatzeB added inline comments to D46315: [RegUsageInfoCollector] Fix handling of callee saved registers with CSR optimization..
Fri, Jul 13, 11:30 AM

Thu, Jul 12

MatzeB added reviewers for D48582: Reverse subregister saved loops in register usage info collector.: jonpa, vivekvpandya.

Adding more people; still reading up on why the covered logic was necessary in the first place...

Thu, Jul 12, 11:28 AM
MatzeB updated the diff for D49256: CodeGen: Remove pipeline dependencies on StackProtector; NFC.

turns out my checkout was slightly outdated too, rebased. Fixed warning.

Thu, Jul 12, 11:19 AM
MatzeB added a comment to D48994: Update DBG_VALUE register operand during WebAssemblyOptimizeLiveIntervals pass.

Makes sense, thanks for the patch.

Thu, Jul 12, 11:12 AM
MatzeB added a comment to D37582: CodeGen: Remove pipeline dependencies on StackProtector; NFC.

Put an updated version of the patch here: https://reviews.llvm.org/D49256

Thu, Jul 12, 10:53 AM
MatzeB created D49256: CodeGen: Remove pipeline dependencies on StackProtector; NFC.
Thu, Jul 12, 10:53 AM
MatzeB added a comment to D37582: CodeGen: Remove pipeline dependencies on StackProtector; NFC.

Thanks @rinon for this. Are you still interested in committing this? I would like to get this in to fix some X86 machine verifier issues.

Yep! I'll rebase and take a look at your MFI comments now, unless @MatzeB is already doing that.

Thu, Jul 12, 10:15 AM
MatzeB added a comment to D37582: CodeGen: Remove pipeline dependencies on StackProtector; NFC.

Sorry this fell to the cracks. I tried comitting it yesterday, but it needs some adaptation for GlobalISel now. I'm currently making some simple changes for that and will then commit...

Thu, Jul 12, 9:49 AM

Wed, Jul 11

MatzeB added a comment to D49203: [PM] Use getAnalysisIfAvailable / addUsedIfAvailable with StackProtector.

See also: https://reviews.llvm.org/D37582
(Looks like I never got around comitting it and nobody pinged, oops)

Wed, Jul 11, 3:34 PM

Mon, Jul 2

MatzeB added a comment to D48674: SelectionDAGBuilder, mach-o: Skip trap after noreturn call (for Mach-O).

Out of curiosity, what's the use case/motivation for this?

Mon, Jul 2, 10:46 AM

Thu, Jun 28

MatzeB closed D48674: SelectionDAGBuilder, mach-o: Skip trap after noreturn call (for Mach-O).

Landed in rL335877

Thu, Jun 28, 10:49 AM

Wed, Jun 27

MatzeB updated the diff for D48674: SelectionDAGBuilder, mach-o: Skip trap after noreturn call (for Mach-O).

update

Wed, Jun 27, 2:29 PM
MatzeB created D48674: SelectionDAGBuilder, mach-o: Skip trap after noreturn call (for Mach-O).
Wed, Jun 27, 2:28 PM

Thu, Jun 21

MatzeB accepted D48442: [SPECCPU2017] Add addition platform options and missing flags..

I don't have a copy of Spec2017 yet so I cannot test any of it.

Thu, Jun 21, 10:20 AM

Wed, Jun 20

MatzeB accepted D48398: [test-suite] Set the language standard for SPEC benchmarks that do not compile using the newest..

LGTM

Wed, Jun 20, 4:15 PM

Mon, Jun 18

MatzeB added a comment to D47999: MIR YAML TracksRegLiveness default to true in yaml parser.

Looking around the code I would guess it just works. You should however also remove the && MRI.tracksLiveness() from MachineBasicBlock::print() if you change the default.

Mon, Jun 18, 1:51 PM
MatzeB added a comment to D47999: MIR YAML TracksRegLiveness default to true in yaml parser.

IMO MachineFunction::init() should not set TracksLiveness by default and instead VirtRegMap / FastRegAlloc should set it once it actually filled in the basic block live-in lists.

Mon, Jun 18, 1:49 PM
MatzeB requested changes to D47999: MIR YAML TracksRegLiveness default to true in yaml parser.
Mon, Jun 18, 12:36 PM
MatzeB added a comment to D47999: MIR YAML TracksRegLiveness default to true in yaml parser.

From what I remember our use of this flag is strange:

Mon, Jun 18, 12:35 PM

Jun 14 2018

MatzeB accepted D48154: [VirtRegRewriter] Avoid clobbering registers when expanding copy bundles.

LGTM with this round of nitpicks addressed.

Jun 14 2018, 11:20 AM

Jun 13 2018

MatzeB added inline comments to D48154: [VirtRegRewriter] Avoid clobbering registers when expanding copy bundles.
Jun 13 2018, 5:22 PM
MatzeB added a comment to D48154: [VirtRegRewriter] Avoid clobbering registers when expanding copy bundles.

Take a look at RAGreedy::selectOrSplit for an example on how to report fatal errors in the backend.

Jun 13 2018, 5:09 PM
MatzeB accepted D48109: [Timers] Use the pass argument name for JSON keys in time-passes.

@chandlerc @jlebar can you confirm that the new PM doesn't have timer support?

I'm pretty sure I only added the timer->statistics support for the "legacy" pass manager.

Jun 13 2018, 11:33 AM

Jun 8 2018

MatzeB added a comment to D47967: [debuginfo-tests] Always use the system python to invoke llgdb.py..

usr/bin/env is the recommended way to invoke python... Or is lldb somehow hardcoded to use /usr/bin/python and you need to match this here?

Jun 8 2018, 5:26 PM

May 29 2018

MatzeB added a comment to D47486: [LNT] Request authentization for lnt submit command via secret_key in lnt.cfg file..

I'd also recommend using the api_auth_token. Actually if you look at the Rest API you can see it supporting submissions only with api_auth_token set. I just never wanted to introduce this to the /submitRun pages to remain compatible with old clients.

May 29 2018, 10:46 AM
MatzeB added a comment to D42200: Fix RegScavenger::unprocess.

I assume you don't have commit access?
Usually it's the author of the patch that commits it unless he doesn't have commit access yet in which case it's upon the reviewers.

May 29 2018, 10:35 AM

May 22 2018

MatzeB accepted D46267: [test-suite] Enable MicroBenchmarks by default.

LGTM

May 22 2018, 2:28 PM
MatzeB added a comment to D47186: [LNT] Sort machines in Compare to by names..

Why not let the database do the sorting? .order_by(ts.Machine.name) or whatever sqlalchemy wants there?

May 22 2018, 2:24 PM
MatzeB added a comment to D46267: [test-suite] Enable MicroBenchmarks by default.

Cross compiling currently is a mess. Basically cmake only works properly when you have a complete gcc style cross toolchain, or a MacOS style toolchain on Mac. Most llvm developers currently are not in a position to set one of those two things up when they just want to test a compiler they made some changes to; to be honest when I started hacking on the llvm test-suite cmake I wasn't even aware of what assumptions are made there/what problem exist...

May 22 2018, 1:15 PM

May 18 2018

MatzeB accepted D46267: [test-suite] Enable MicroBenchmarks by default.

LGTM. Please watch the buildbots carefully (and revert in case of problems) when you push this.

May 18 2018, 3:24 PM

May 16 2018

MatzeB added a comment to D46433: [LNT] lnt profile upgrade command for large Spec2017 perf.data aborts.

No feedback from my side, I#m not familiar with the profile code.

May 16 2018, 8:42 AM

May 14 2018

MatzeB added a comment to D46837: [MachineScheduler] Skip an implicit def of a super-reg added by regalloc in findDefIdx..

(FYI: I'm busy with non-llvm at the moment, so I may not be responsive for things I cannot answer right away; but I'll try to find a timeslot this week going through the backlog of llvm things I have to look at)

May 14 2018, 3:40 PM

May 11 2018

MatzeB added a comment to D46735: [Test-Suite] Added Box Blur And Sobel Edge Detection.

First, let me keep the record straight:

  • Only SingleSource/Benchmarks/Polybench profits from the (mostly tiling) transformations applied by Polly.
  • There are various reasons why other benchmarks are "not optimizable" by Polly but only a fraction is caused by manual "pre-optimizations" (except the input language choice obviously).
  • Adding simple "Polly-optimizable" benchmarks is all good and well (as it makes for nicer evaluation sections in future papers...), but I would argue it is much more interesting to investigate if the existing benchmarks could be optimized and why they currently are not.

I agree that studying existing sources, why they are not optimized, and improve the optimizer such that the reason is not an obstacle anymore, is the primary goal.
The problem is that this not feasible for all sources. For instance, the array-to-pointers-to-arrays style (which @proton unfortunately also used here; I call then "jagged arrays", although not necessarily jagged) cannot be optimized because the pointers may overlap. Either the frontend language has to ensure that this never happens, or each pointer has to be compared pairwise for aliasing. The first is not the case in C++ (without extensions such as the restrict keyword), the latter involves a super-constant overhead. Unfortunately, very many benchmarks use jagged arrays.
Second, even if it is possible to remove an optimization obstacle, I would like to know whether it is worth it.
Third, researchers in the field of polyhedral optimization work on improving the optimizer algorithm and ignore language-level details (e.g. whether a jagged, row-major or column-major arrays are used)

Regarding these (and other new) benchmarks:

  • Please describe why/how the codes differ from existing ones we have (e.g., Halide/blur). Polybench already contains various kernels including many almost identical ones.

The Halide benchmarks are special in many regards; for instance, works only on x86.

  • Please describe why/how the magic constants (aka sizes) are chosen. "#define windows 10" is not necessarily helpful.

At some point a problem size has to be arbitrarily defined. What kind of explanation do you expect?

  • I fail to see how Polly is going to optimize this code (in a way that is general enough for real codes). So my question is: Did you choose a linked data structure on purpose or do you actually want to have a multi-dimensional array?

+1

Are you writing these from scratch? If so, I'd like to make some suggestions:

  • Please aim for a runtime for 0.5-1 second on typical hardware. Shorter benchmarks tend to be hard to time correctly, running longer doesn't increase precision in our experience.

But the fraction of noise will be more for shorter runtimes. A longer runtime will help us when we to see the performance improvement after applying optimization.

The question is: At what point is a performance change interesting? If we posit that a performance change is interesting at the ~1% level, and we can distinguish application-time running-time differences at around 0.01s, then running for 1-2s is sufficient for tracking. As the test suite gets larger, we have an overarching goal of keeping the overall execution time in check (in part, so we can run it more often). It's often better to collect statistics over multiple runs, compared to a single longer run, regardless.

Also, if there are particular kernels you're trying to benchmark it's better to time them separately. We have a nice infrastructure to do that now, making use of the Google benchmark library, in the MicroBenchmarks subdirectory.

IMHO we also want to see effects on workingset sizes larger than the last-level-cache. A micro-benchmark is great for small workingsets, but I am not sure whether Google's benchmark library works well with longer ones.

I don't see why it wouldn't work on longer-running kernels. Nevertheless, modern machines have bandwidths in the GB/s range, so a 1s running time is certainly long enough to move around a working set larger than your cache size.

Some optimizations (e.g. cache-locality, parallelization) can cut the execution time by order by magnitudes. With gemm, I have seen single-thread speed-ups of 34x. With parallelization, it will be even more. If the execution time without optimization is one second, it will be too short with optimization, especially with parallelization and accelerator-offloading which adds invocation overheads.

There are two difficult issues here. First, running with multiple threads puts you in a different regime for several reasons, and often one that really needs to be tested separately (because of different bandwidth constraints, different effects of prefetching, effects from using multiple hardware threads, and so on). We don't currently have an infrastructure for testing threaded code (although we probably should).

Second, I don't think that we can have a set of problem sizes that can stay the same across 40x performance improvements. If the compiler starts doing that, we'll need to change the test somehow. If we make the test long enough that, once 40x faster, it will have a reasonable running time, then until then, the test suite will be unreasonably slow for continuous integration. I think that we need to pick problems that works reasonably now, and when the compiler improves, we'd need to change the test. One of the reasons that I like the Google bechmark library is that it dynamically adjusts the number of iterations, thus essentially changing this for us as needed.

May 11 2018, 7:01 PM
MatzeB added a comment to D46735: [Test-Suite] Added Box Blur And Sobel Edge Detection.

Some optimizations (e.g. cache-locality, parallelization) can cut the execution time by order by magnitudes. With gemm, I have seen single-thread speed-ups of 34x. With parallelization, it will be even more. If the execution time without optimization is one second, it will be too short with optimization, especially with parallelization and accelerator-offloading which adds invocation overheads.

It's great to have a discussion on how such benchmarks should look like.

Instead of one-size-fits-it-all, should we have multiple problem sizes? There is already SMALL_DATASET, which is smaller than the default, but what about larger ones? SPEC has "test" (should execute everything at least once, great to check correctness), "train" (for PGO-training), "ref" (the scored benchmark input; in CPU 2017 runs up to 2 hrs). Polybench has MINI_DATASET to EXTRALARGE_DATASET which are defined by workingset-size, instead of purpose or runtime.

  • First: We have to choose a default problem size and I just wanted to emphasize that it should not be too big, so running the llvm test-suite finishes in a reasonable timeframe.
May 11 2018, 6:12 PM

May 10 2018

MatzeB added a comment to D46735: [Test-Suite] Added Box Blur And Sobel Edge Detection.

Are you writing these from scratch? If so, I'd like to make some suggestions:

May 10 2018, 5:23 PM
MatzeB added a comment to D46714: [test-suite] Add list of programs we might add..

It's odd to have this in the repository, but admittedly we don't really have a wiki or similar in LLVM so I may be ok.

May 10 2018, 1:00 PM

Apr 30 2018

MatzeB added a comment to D37582: CodeGen: Remove pipeline dependencies on StackProtector; NFC.

(this patch is still good to go/accepted)

Apr 30 2018, 6:25 PM
MatzeB added a comment to D46290: Remove \brief commands from doxygen comments..

BTW: I did not do this years ago, because the doxygen documentation claims that the brief comment stops at the first linebreak or the first dot. Hence I was preaching the rule that you still need \brief when the comment is more than 1 line. I now heard that apparently this is not how doxygen behaves (so documentation bug I guess).

Apr 30 2018, 3:47 PM

Apr 26 2018

MatzeB added inline comments to D45558: [test-suite] Save stats for LTO step too..
Apr 26 2018, 4:26 PM

Apr 25 2018

MatzeB accepted D45000: [LNT] fix some typos/dead code.

LGTM
Tgiugh this is another sign that the global_status view probably isn‘t used byanyone ans has no tests. I still think we should just remove it...

Apr 25 2018, 9:10 AM
MatzeB added inline comments to D45558: [test-suite] Save stats for LTO step too..
Apr 25 2018, 9:00 AM

Apr 24 2018

MatzeB accepted D44006: [LNT] Display order url via proper template method in v4_machine.html..

LGTM

Apr 24 2018, 2:00 PM
MatzeB accepted D44010: [LNT] Add support for logarithmic scale..

I think this is fine, but maybe wait a week or so before comitting in case @cmatthews has more comments.

Apr 24 2018, 1:58 PM
MatzeB accepted D44011: [LNT] Make new Order template in Compare To in v4_run.html view..

This seems straightforward.

Apr 24 2018, 1:56 PM
MatzeB added a comment to D44013: [LNT] Make Run-Over-Run and Run-Over-Baseline changes an accordion..

Maybe the answer here would be to further break up templates/reporting/runs.html into smaller pieces that can then be included from the WebUI and the E-Mail reports. So we can leave the accordions in the web-ui part. (That may be a bunch of refactoring though)

Apr 24 2018, 1:55 PM
MatzeB added a comment to D44013: [LNT] Make Run-Over-Run and Run-Over-Baseline changes an accordion..

templates/reporting/runs.html reporting is also used to produce E-Mail reports I believe. And I'm not sure the E-Mail output has the CSS required for the accordions?

Apr 24 2018, 1:52 PM
MatzeB accepted D44681: [LNT] Split newlines in all params/fields for Machine and Run in run view..

Looks good enought to commit, though I would recommend factoring things out differently:

Apr 24 2018, 9:31 AM

Apr 18 2018

MatzeB resigned from D45770: [AArch64] Disable spill slot scavenging when stack realignment required..
Apr 18 2018, 10:42 AM
MatzeB added a comment to D45770: [AArch64] Disable spill slot scavenging when stack realignment required..

Oh sorry, I didn't notice there is another "scavenging" thing in there.

Apr 18 2018, 10:42 AM
MatzeB requested changes to D45770: [AArch64] Disable spill slot scavenging when stack realignment required..

Huh? You cannot just disable stack slot scavenging. Who guarantees you that you don't need it?
Is the bug related to the latest changes in PEI? (I've seen some commits talking about base and frame pointers flying by but did not have time to read them or dive into the details...)

Apr 18 2018, 9:43 AM
MatzeB added a comment to D45708: [NFC] Remove doxygen brief tag from BasicBlock.h.

I diffed the two outputs and also pulled up the text in a browser. I do not see any significant changes in browser, and there wasn't. However, I am happy to revert if any noticeable changes are witnessed. I see multiline/paragraph text in the output.

Apr 18 2018, 1:39 AM
MatzeB added a comment to D45558: [test-suite] Save stats for LTO step too..
  • Do non gold linkers all accept the -Wl,-plugin-opt lines?
  • Maybe the clang driver knows if a gold linker is used and could add the -Wl,-plugin opt lines itself when -save-stats is used?
Apr 18 2018, 1:09 AM
MatzeB added a comment to D45708: [NFC] Remove doxygen brief tag from BasicBlock.h.

I am pretty sure this is not correct: Autobrief is stupid and stops at the first dot or at the line ending AFAIK. This means when the first sentence stretches over more than 1 line you still need need \brief. So a bunch of the changes here appear invalid to me, please check the doxygen output and revert if necessary.

Apr 18 2018, 1:07 AM

Apr 17 2018

MatzeB added a comment to D45695: [CodeGen] Use RegUnits to track register aliases (NFC).

Have you seen LiveRegUnits::accumulate() glancing over the code here (and earlier) it seems to me that TII::tracksRegDefsUses() was just invented to do the same thing: figuring out which registers are free/usable over a range of instructions...

Yes, actually my current change in TII::tracksRegDefsUses() was inspired by LiveRegUnits::accumulate(). The only difference is that we need to track UsedRegs and DefReg separately in TII::tracksRegDefsUses(), while LiveRegUnits::accumulate() track both used/defed registers together. Based on your comment in https://reviews.llvm.org/D41463#inline-400109, would it make sense to introduce new static member function in LiveRegUnits like :

static void LiveRegUnits::accumulateUseDef(MachineInstr &MI, LiveRegUnits &ModifiedRegUnits, LiveRegUnits &UsedRegUnits)  ?
Apr 17 2018, 3:35 PM

Apr 16 2018

MatzeB added inline comments to D41463: [CodeGen] Add a new pass for PostRA sink.
Apr 16 2018, 5:36 PM
MatzeB added a comment to D45695: [CodeGen] Use RegUnits to track register aliases (NFC).

Have you seen LiveRegUnits::accumulate() glancing over the code here (and earlier) it seems to me that TII::tracksRegDefsUses() was just invented to do the same thing: figuring out which registers are free/usable over a range of instructions...

Apr 16 2018, 5:33 PM

Apr 3 2018

MatzeB added a comment to D45229: [MI-sched] schedule following instruction latencies.

or

$ clang -arch arm64 -O3 -S -o- t.c -mcpu=generic
...
	ldrb		w8, [x1]
	ldrb	w9, [x1, #1]
	ldrb	w10, [x1, #2]
	ldrb	w11, [x1, #3]
	strb		w8, [x0]
	strb	w9, [x0, #1]
	strb	w10, [x0, #2]
	strb	w11, [x0, #3]
	ret
Apr 3 2018, 9:31 PM
MatzeB added a comment to D45229: [MI-sched] schedule following instruction latencies.
Apr 3 2018, 9:27 PM
MatzeB added a comment to D45211: Try to import parse_requirements from pip._internal.req too..

Should we maybe move the requirement list into setup.py instead of using this functionality which they apparently don't want us to use like this? (I'm fine with this patch in the meantime)

Apr 3 2018, 10:09 AM

Mar 26 2018

MatzeB accepted D44389: [LNT] Error accessing URL /db_default/v4/nts/profile/ajax/getFunctions; NOT FOUND.

LGTM, though none of us apple people know much about the profile code as it only works on linux right now. But unless someone with more knowledge of the code objects, I'd push this.

Mar 26 2018, 10:27 AM

Mar 8 2018

MatzeB added a comment to D43314: [lit] - Allow 1 test to report multiple micro-test results to provide support for microbenchmarks..

Tried some things and was able to get it working using /dev/stdout.

That doesn't work on Windows, may not work on Mac?

Should I change it back to rm %t.results.out after FileCheck? I'm not sure how to do this with a pipe.

Mar 8 2018, 11:02 AM

Mar 5 2018

MatzeB added a comment to D43916: Named VReg support for MIR.

Thanks for working on this; I'm looking forward to have this!

Mar 5 2018, 5:05 PM

Mar 4 2018

MatzeB added a comment to D43951: [RFC] llvm-mca: a static performance analysis tool..

This is great! I always wanted to see a IACTA like tool for other architectures.

Mar 4 2018, 12:36 PM

Feb 28 2018

MatzeB added a comment to D43862: [LNT] Error accessing URL /db_default/v4/nts/profile/ajax/getFunctions; NOT FOUND.

... looks like we should add some tests for the stuff in profile_views.py

Feb 28 2018, 7:25 AM

Feb 27 2018

MatzeB added a comment to D43850: [CodeGen] Avoid handling DBG_VALUE in the LivePhysRegs (addUses,removeDefs,stepForward).

This is surprising: DBG_VALUE should not have any defs so removeDefs() should have no effect and all uses should be marked with the debug flag so none of them should pass the MachineOperand::readsReg() test in addUses. Could you check whether the debug flag is set correctly? Did you try running your test with -verify-machineinstrs?

Feb 27 2018, 6:15 PM · debug-info
MatzeB added a comment to D43093: [FastISel] Sink local value materializations to first use.

Now, should we go to such length to make the debug information be friendlier to some debugger, I am not sure, but I let you decide.

Definitely not! That debugger constraint feels like it is designed for ancient compilers producing code and register allocating for one C statement at a time.

Feb 27 2018, 1:37 PM

Feb 25 2018

MatzeB added a comment to D43542: [CodeGen][FastRegAlloc] Disable registers spilling for a naked function (PR28641).

I've been looking into this, and we do indeed produce something like this for naked function with a parameter (this from http://llvm.org/PR28641 compiled with -O0):

bb.0 (%ir-block.1):
  liveins: $edi
  %0:gr32 = COPY $edi
  %1:gr32 = COPY killed %0
  INLINEASM &ret [sideeffect] [attdialect], $0:[clobber], implicit-def early-clobber $rdi, $1:[clobber], implicit-def early-clobber $eflags, !2
Feb 25 2018, 12:06 PM

Feb 22 2018

MatzeB added a comment to D43093: [FastISel] Sink local value materializations to first use.

Looks good from what I can see, though I'm not super experienced with fast-isel either. Does this work correctly in the case when fast-isel aborts while selecting a block?

Feb 22 2018, 5:16 PM
MatzeB added a comment to D43314: [lit] - Allow 1 test to report multiple micro-test results to provide support for microbenchmarks..

Looks nearly good to me. I'd just opt for a different naming of the sub-tests. Also some nitpicks/comments on other nitpicks below:

Feb 22 2018, 5:12 PM
MatzeB accepted D43337: [CodeGen] Don't omit any redundant information in -debug output.

LGTM

Feb 22 2018, 4:51 PM
MatzeB accepted D43319: [test-suite] Adding LCALS (Livermore Compiler Analysis Loop Suite) loop kernels to test suite..
  • You should double check whether you really need to repeat the lit.local.cfg changes in each directory.
  • I haven't actually tried compiling/running the benchmark but the changes LGTM.
Feb 22 2018, 4:51 PM
MatzeB added a comment to D43353: [X86] Add phony registers for high halves of E[A-D]X, E[SD]I, E[BS]P and EIP.

This should be a good thing! Among other things it should also allow us to switch register mask operands from physical registers to register units.

Feb 22 2018, 4:45 PM
MatzeB added a comment to D43398: [InstCombine] allow fdiv folds with less than fully 'fast' ops.

Your reasoning makes sense to me. And I think it's fine to push this given no floating point expert objects.

Feb 22 2018, 4:41 PM
MatzeB added a comment to D43427: [RegAllocFast] Salvage debug values when killing operands.
  • I'd rather keep MachineRegisterInfo to things directly related to information about virtual registers (register class etc.); I don't think it's role should be to provide transformations for involving virtual registers. So I'd rather move the functionality somewhere else.
  • I'm skeptical this works as intended in the presence of control flow (= multiple uses and defs of a vreg anywhere in the control flow graph).
  • I'm not sure I completely understand what is going on here, is it looking for COPYs killing a value so it can replace the debug values with the copied value? The commit message should mention that then.
Feb 22 2018, 4:36 PM · debug-info
MatzeB accepted D43478: [LiveIntervals] Handle moving up dead partial write.

This is a tricky case, good catch.

Feb 22 2018, 4:24 PM
MatzeB added a comment to D43542: [CodeGen][FastRegAlloc] Disable registers spilling for a naked function (PR28641).

I'm still a bit confused as to why we start spilling unused input vregs in the first place. Can you add a test case so we can reproduce? (The test can't get commit without a test anyway)

Feb 22 2018, 4:24 PM
MatzeB added a comment to D43578: -ftime-report switch support in Clang.
  • Just a warning: I'm a bit skeptical timers will work reliable for things that happen in well under a millisecond (did you do some sanity checking? i.e. do all the timer roughly add up to the time spent in the frontend?)
  • In the same vain I wonder if you add overhead by adding timers to functions that are likely called hundreds of times for a typical source file. Have you measured a release build with and without your patch.
Feb 22 2018, 4:20 PM
MatzeB added inline comments to D43601: [bugpoint] Add NoStripSymbols option..
Feb 22 2018, 4:14 PM
MatzeB accepted D43609: [bugpoint] Add option categories..

LGTM with declarations moved into a header.

Feb 22 2018, 4:09 PM
MatzeB accepted D43649: [AArch64] Refactor macro fusion.

Please append a "; NFC" to the commit title.
LGTM.

Feb 22 2018, 4:06 PM

Feb 21 2018

MatzeB added a comment to D43601: [bugpoint] Add NoStripSymbols option..

Okay reading the code a bit more seems like that is what you are doing.

Feb 21 2018, 6:25 PM
MatzeB added a comment to D43601: [bugpoint] Add NoStripSymbols option..

But we don't want user flags for bugpoint, we want it to try both variants and keep the smaller one when it still reproduces the crash.

Feb 21 2018, 6:18 PM

Feb 20 2018

MatzeB added a reviewer for D43542: [CodeGen][FastRegAlloc] Disable registers spilling for a naked function (PR28641): qcolombet.

Since a "naked" function can only have nothing but "asm" statements inside, and those "asm" statements can not have input or output parameters, it is safe to just disable any registers spilling for a "naked" function.

Feb 20 2018, 6:37 PM

Feb 19 2018

MatzeB added a reviewer for D43165: [lit] Fix problem in how Python versions open files with different encodings: MaggieYi.
Feb 19 2018, 10:08 PM