MatzeB (Matthias Braun)
User

Projects

User does not belong to any projects.

User Details

User Since
Aug 13 2013, 3:10 PM (244 w, 4 d)

Recent Activity

Wed, Apr 18

MatzeB resigned from D45770: [AArch64] Disable spill slot scavenging when stack realignment required..
Wed, Apr 18, 10:42 AM
MatzeB added a comment to D45770: [AArch64] Disable spill slot scavenging when stack realignment required..

Oh sorry, I didn't notice there is another "scavenging" thing in there.

Wed, Apr 18, 10:42 AM
MatzeB requested changes to D45770: [AArch64] Disable spill slot scavenging when stack realignment required..

Huh? You cannot just disable stack slot scavenging. Who guarantees you that you don't need it?
Is the bug related to the latest changes in PEI? (I've seen some commits talking about base and frame pointers flying by but did not have time to read them or dive into the details...)

Wed, Apr 18, 9:43 AM
MatzeB added a comment to D45708: [NFC] Remove doxygen brief tag from BasicBlock.h.

I diffed the two outputs and also pulled up the text in a browser. I do not see any significant changes in browser, and there wasn't. However, I am happy to revert if any noticeable changes are witnessed. I see multiline/paragraph text in the output.

Wed, Apr 18, 1:39 AM
MatzeB added a comment to D45558: [test-suite] Save stats for LTO step too..
  • Do non gold linkers all accept the -Wl,-plugin-opt lines?
  • Maybe the clang driver knows if a gold linker is used and could add the -Wl,-plugin opt lines itself when -save-stats is used?
Wed, Apr 18, 1:09 AM
MatzeB added a comment to D45708: [NFC] Remove doxygen brief tag from BasicBlock.h.

I am pretty sure this is not correct: Autobrief is stupid and stops at the first dot or at the line ending AFAIK. This means when the first sentence stretches over more than 1 line you still need need \brief. So a bunch of the changes here appear invalid to me, please check the doxygen output and revert if necessary.

Wed, Apr 18, 1:07 AM

Tue, Apr 17

MatzeB added a comment to D45695: [CodeGen] Use RegUnits to track register aliases (NFC).

Have you seen LiveRegUnits::accumulate() glancing over the code here (and earlier) it seems to me that TII::tracksRegDefsUses() was just invented to do the same thing: figuring out which registers are free/usable over a range of instructions...

Yes, actually my current change in TII::tracksRegDefsUses() was inspired by LiveRegUnits::accumulate(). The only difference is that we need to track UsedRegs and DefReg separately in TII::tracksRegDefsUses(), while LiveRegUnits::accumulate() track both used/defed registers together. Based on your comment in https://reviews.llvm.org/D41463#inline-400109, would it make sense to introduce new static member function in LiveRegUnits like :

static void LiveRegUnits::accumulateUseDef(MachineInstr &MI, LiveRegUnits &ModifiedRegUnits, LiveRegUnits &UsedRegUnits)  ?
Tue, Apr 17, 3:35 PM

Mon, Apr 16

MatzeB added inline comments to D41463: [CodeGen] Add a new pass for PostRA sink.
Mon, Apr 16, 5:36 PM
MatzeB added a comment to D45695: [CodeGen] Use RegUnits to track register aliases (NFC).

Have you seen LiveRegUnits::accumulate() glancing over the code here (and earlier) it seems to me that TII::tracksRegDefsUses() was just invented to do the same thing: figuring out which registers are free/usable over a range of instructions...

Mon, Apr 16, 5:33 PM

Tue, Apr 3

MatzeB added a comment to D45229: [MI-sched] schedule following instruction latencies.

or

$ clang -arch arm64 -O3 -S -o- t.c -mcpu=generic
...
	ldrb		w8, [x1]
	ldrb	w9, [x1, #1]
	ldrb	w10, [x1, #2]
	ldrb	w11, [x1, #3]
	strb		w8, [x0]
	strb	w9, [x0, #1]
	strb	w10, [x0, #2]
	strb	w11, [x0, #3]
	ret
Tue, Apr 3, 9:31 PM
MatzeB added a comment to D45229: [MI-sched] schedule following instruction latencies.
Tue, Apr 3, 9:27 PM
MatzeB added a comment to D45211: Try to import parse_requirements from pip._internal.req too..

Should we maybe move the requirement list into setup.py instead of using this functionality which they apparently don't want us to use like this?

Tue, Apr 3, 10:09 AM

Mon, Mar 26

MatzeB accepted D44389: [LNT] Error accessing URL /db_default/v4/nts/profile/ajax/getFunctions; NOT FOUND.

LGTM, though none of us apple people know much about the profile code as it only works on linux right now. But unless someone with more knowledge of the code objects, I'd push this.

Mon, Mar 26, 10:27 AM

Mar 8 2018

MatzeB added a comment to D43314: [lit] - Allow 1 test to report multiple micro-test results to provide support for microbenchmarks..

Tried some things and was able to get it working using /dev/stdout.

That doesn't work on Windows, may not work on Mac?

Should I change it back to rm %t.results.out after FileCheck? I'm not sure how to do this with a pipe.

Mar 8 2018, 11:02 AM

Mar 5 2018

MatzeB added a comment to D43916: Named VReg support for MIR.

Thanks for working on this; I'm looking forward to have this!

Mar 5 2018, 5:05 PM

Mar 4 2018

MatzeB added a comment to D43951: [RFC] llvm-mca: a static performance analysis tool..

This is great! I always wanted to see a IACTA like tool for other architectures.

Mar 4 2018, 12:36 PM

Feb 28 2018

MatzeB added a comment to D43862: [LNT] Error accessing URL /db_default/v4/nts/profile/ajax/getFunctions; NOT FOUND.

... looks like we should add some tests for the stuff in profile_views.py

Feb 28 2018, 7:25 AM

Feb 27 2018

MatzeB added a comment to D43850: [CodeGen] Avoid handling DBG_VALUE in the LivePhysRegs (addUses,removeDefs,stepForward).

This is surprising: DBG_VALUE should not have any defs so removeDefs() should have no effect and all uses should be marked with the debug flag so none of them should pass the MachineOperand::readsReg() test in addUses. Could you check whether the debug flag is set correctly? Did you try running your test with -verify-machineinstrs?

Feb 27 2018, 6:15 PM · debug-info
MatzeB added a comment to D43093: [FastISel] Sink local value materializations to first use.

Now, should we go to such length to make the debug information be friendlier to some debugger, I am not sure, but I let you decide.

Definitely not! That debugger constraint feels like it is designed for ancient compilers producing code and register allocating for one C statement at a time.

Feb 27 2018, 1:37 PM

Feb 25 2018

MatzeB added a comment to D43542: [CodeGen][FastRegAlloc] Disable registers spilling for a naked function (PR28641).

I've been looking into this, and we do indeed produce something like this for naked function with a parameter (this from http://llvm.org/PR28641 compiled with -O0):

bb.0 (%ir-block.1):
  liveins: $edi
  %0:gr32 = COPY $edi
  %1:gr32 = COPY killed %0
  INLINEASM &ret [sideeffect] [attdialect], $0:[clobber], implicit-def early-clobber $rdi, $1:[clobber], implicit-def early-clobber $eflags, !2
Feb 25 2018, 12:06 PM

Feb 22 2018

MatzeB added a comment to D43093: [FastISel] Sink local value materializations to first use.

Looks good from what I can see, though I'm not super experienced with fast-isel either. Does this work correctly in the case when fast-isel aborts while selecting a block?

Feb 22 2018, 5:16 PM
MatzeB added a comment to D43314: [lit] - Allow 1 test to report multiple micro-test results to provide support for microbenchmarks..

Looks nearly good to me. I'd just opt for a different naming of the sub-tests. Also some nitpicks/comments on other nitpicks below:

Feb 22 2018, 5:12 PM
MatzeB accepted D43337: [CodeGen] Don't omit any redundant information in -debug output.

LGTM

Feb 22 2018, 4:51 PM
MatzeB accepted D43319: [test-suite] Adding LCALS (Livermore Compiler Analysis Loop Suite) loop kernels to test suite..
  • You should double check whether you really need to repeat the lit.local.cfg changes in each directory.
  • I haven't actually tried compiling/running the benchmark but the changes LGTM.
Feb 22 2018, 4:51 PM
MatzeB added a comment to D43353: [X86] Add phony registers for high halves of E[A-D]X, E[SD]I, E[BS]P and EIP.

This should be a good thing! Among other things it should also allow us to switch register mask operands from physical registers to register units.

Feb 22 2018, 4:45 PM
MatzeB added a comment to D43398: [InstCombine] allow fdiv folds with less than fully 'fast' ops.

Your reasoning makes sense to me. And I think it's fine to push this given no floating point expert objects.

Feb 22 2018, 4:41 PM
MatzeB added a comment to D43427: [RegAllocFast] Salvage debug values when killing operands.
  • I'd rather keep MachineRegisterInfo to things directly related to information about virtual registers (register class etc.); I don't think it's role should be to provide transformations for involving virtual registers. So I'd rather move the functionality somewhere else.
  • I'm skeptical this works as intended in the presence of control flow (= multiple uses and defs of a vreg anywhere in the control flow graph).
  • I'm not sure I completely understand what is going on here, is it looking for COPYs killing a value so it can replace the debug values with the copied value? The commit message should mention that then.
Feb 22 2018, 4:36 PM · debug-info
MatzeB accepted D43478: [LiveIntervals] Handle moving up dead partial write.

This is a tricky case, good catch.

Feb 22 2018, 4:24 PM
MatzeB added a comment to D43542: [CodeGen][FastRegAlloc] Disable registers spilling for a naked function (PR28641).

I'm still a bit confused as to why we start spilling unused input vregs in the first place. Can you add a test case so we can reproduce? (The test can't get commit without a test anyway)

Feb 22 2018, 4:24 PM
MatzeB added a comment to D43578: -ftime-report switch support in Clang.
  • Just a warning: I'm a bit skeptical timers will work reliable for things that happen in well under a millisecond (did you do some sanity checking? i.e. do all the timer roughly add up to the time spent in the frontend?)
  • In the same vain I wonder if you add overhead by adding timers to functions that are likely called hundreds of times for a typical source file. Have you measured a release build with and without your patch.
Feb 22 2018, 4:20 PM
MatzeB added inline comments to D43601: [bugpoint] Add NoStripSymbols option..
Feb 22 2018, 4:14 PM
MatzeB accepted D43609: [bugpoint] Add option categories..

LGTM with declarations moved into a header.

Feb 22 2018, 4:09 PM
MatzeB accepted D43649: [AArch64] Refactor macro fusion.

Please append a "; NFC" to the commit title.
LGTM.

Feb 22 2018, 4:06 PM

Feb 21 2018

MatzeB added a comment to D43601: [bugpoint] Add NoStripSymbols option..

Okay reading the code a bit more seems like that is what you are doing.

Feb 21 2018, 6:25 PM
MatzeB added a comment to D43601: [bugpoint] Add NoStripSymbols option..

But we don't want user flags for bugpoint, we want it to try both variants and keep the smaller one when it still reproduces the crash.

Feb 21 2018, 6:18 PM

Feb 20 2018

MatzeB added a reviewer for D43542: [CodeGen][FastRegAlloc] Disable registers spilling for a naked function (PR28641): qcolombet.

Since a "naked" function can only have nothing but "asm" statements inside, and those "asm" statements can not have input or output parameters, it is safe to just disable any registers spilling for a "naked" function.

Feb 20 2018, 6:37 PM

Feb 19 2018

MatzeB added a reviewer for D43165: [lit] Fix problem in how Python versions open files with different encodings: MaggieYi.
Feb 19 2018, 10:08 PM
MatzeB added a comment to D43165: [lit] Fix problem in how Python versions open files with different encodings.

Reading python docu and playing a bit with a prototype I believe this is the way to do things:

Feb 19 2018, 10:08 PM
MatzeB added a comment to D43168: Don't use -ldl on BSD.

-pthreads defines additional OS specific flags like _REENTRANT or _PTHREADS and this is the form that should be used.

Feb 19 2018, 9:37 PM
MatzeB edited reviewers for D43107: Support for the mno-stack-arg-probe flag, added: aemerson, whitequark; removed: MatzeB.

@MatzeB : mstack-probe-size=0 is already used for a different purpose. It basically, if I remember correctly, always forces calls to chkstk regardless of the actual stack usage. Normally that would only happen after 4K. Regarding your second question -- do we really need to have these flags to be mutually-exclusive? They are related but serve a different purpose. stack-probe-size merely changes the default probe size from 4K to the specified value. mstack-arg-probe/mno-stack-arg-probe enables/disables stack probes. If we disable stack probes, their size no longer matters, but we do not necessarily have to prohibit stack-probe-size. So, in fact, I can imagine somebody may even define system-wide alias if they want to override 4K (for whatever reason). Then some Makefile (which is unaware of this alias) may, for example, disable stack probes completely (like in UEFI code case), and we do not necessarily want to prohibit this combination.

Feb 19 2018, 5:20 PM
MatzeB added a comment to D43478: [LiveIntervals] Handle moving up dead partial write.

I put my handleMove() tests into unittests/MI/LiveIntervalTest.cpp, with manually crafted MI snippets and a specified move of a single instruction.

Feb 19 2018, 3:10 PM

Feb 16 2018

MatzeB added a comment to D43107: Support for the mno-stack-arg-probe flag.
  • Have you considered using stack-probe-size=0 instead?
  • Doing it as a separate no-stack-arg-probe probably needs an IR verifier rule that forbids setting stack-probe-size and no-stack-arg-probe at the same time. Maybe also some attribute compatibility rules for inlining?
Feb 16 2018, 11:45 AM

Feb 15 2018

MatzeB accepted D40308: [RegisterCoalescer] More fixes for subreg join failure in RegCoalescer.

LGTM, with same change to avoid unnecessary repeated shrinkToUses().

Feb 15 2018, 11:48 AM
MatzeB added a comment to D43337: [CodeGen] Don't omit any redundant information in -debug output.

Printing different things depending on whether asserts are enabled feels strange. Can you explain the motivation here?

I think we should be showing all the information we have (redundant reg classes, redundant successors, predecessors (always redundant)) in things like -print-after-all, or when calling MF.dump() from the debugger (basically, when we're actually debugging llvm). Also, this can be too verbose to have when calling MF.print() for testing or other purposes.
I'm basically trying to come up with an implicit -simplify-mir for MF.print().

I guess if we don't really care about having MF.print() being too verbose, we could always assume IsStandalone is true except for MIR printing.

Feb 15 2018, 11:11 AM
MatzeB added a comment to D43337: [CodeGen] Don't omit any redundant information in -debug output.

Printing different things depending on whether asserts are enabled feels strange. Can you explain the motivation here?

Feb 15 2018, 10:42 AM

Feb 14 2018

MatzeB added a comment to D43165: [lit] Fix problem in how Python versions open files with different encodings.

'Latin-1' seems completely arbitrary and wrong to me. You should rather try to get a stream of bytes rather than have python try to decode the file.

Feb 14 2018, 10:51 AM
MatzeB added a comment to D43165: [lit] Fix problem in how Python versions open files with different encodings.

'Latin-1' seems completely arbitrary and wrong to me. You should rather try to get a stream of bytes rather than have python try to decode the file.

Feb 14 2018, 10:47 AM
MatzeB accepted D43274: [RegisterClassInfo] Invalidate the register pressure set limit cache when reserved regs or callee saved regs change.

LGTM

Feb 14 2018, 10:30 AM

Feb 12 2018

MatzeB added a comment to D43168: Don't use -ldl on BSD.
In D43168#1005380, @dim wrote:

@asbirlea: Why is -dl and -pthread needed at all? Glancing at the code I don't see any reference to pthreads or dlopen/dlsym...

Hmm that is a good one, indeed! I have locally tried deleting the blocks that add -lpthread and -ldl, and at least on Linux, the programs in question all appear to link just fine.

On FreeBSD, I do get errors about pthread functions, though:

[789/4485] Linking CXX executable Bitcode/Benchmarks/Halide/bilateral_grid/halide_bilateral_grid
FAILED: Bitcode/Benchmarks/Halide/bilateral_grid/halide_bilateral_grid
: && /home/dim/obj/test-suite/build/tools/timeit --summary Bitcode/Benchmarks/Halide/bilateral_grid/halide_bilateral_grid.link.time /share/dim/llvm/324090-rel60-freebsd12-amd64-ninja-rel-1/bin/clang++  -O3 -DNDEBUG   Bitcode/Benchmarks/Halide/bilateral_grid/CMakeFiles/halide_bilateral_grid.dir/driver.cpp.o Bitcode/Benchmarks/Halide/bilateral_grid/CMakeFiles/halide_bilateral_grid.dir/__/common/x86_halide_runtime.bc.o Bitcode/Benchmarks/Halide/bilateral_grid/CMakeFiles/halide_bilateral_grid.dir/bilateral_grid.bc.o  -o Bitcode/Benchmarks/Halide/bilateral_grid/halide_bilateral_grid   && :
Bitcode/Benchmarks/Halide/bilateral_grid/CMakeFiles/halide_bilateral_grid.dir/__/common/x86_halide_runtime.bc.o: In function `halide_spawn_thread':
posix_allocator.cpp:(.text+0x3ab): undefined reference to `pthread_create'
clang-6.0: error: linker command failed with exit code 1 (use -v to see invocation)
[790/4485] Linking CXX executable Bitcode/Regression/vector_widen/widen_bug
FAILED: Bitcode/Regression/vector_widen/widen_bug
: && /home/dim/obj/test-suite/build/tools/timeit --summary Bitcode/Regression/vector_widen/widen_bug.link.time /share/dim/llvm/324090-rel60-freebsd12-amd64-ninja-rel-1/bin/clang++  -O3 -DNDEBUG   Bitcode/Regression/vector_widen/CMakeFiles/widen_bug.dir/driver.cpp.o Bitcode/Regression/vector_widen/CMakeFiles/widen_bug.dir/halide_runtime.bc.o Bitcode/Regression/vector_widen/CMakeFiles/widen_bug.dir/vector_widen.bc.o  -o Bitcode/Regression/vector_widen/widen_bug   && :
Bitcode/Regression/vector_widen/CMakeFiles/widen_bug.dir/halide_runtime.bc.o: In function `Halide::Runtime::Internal::default_do_par_for(void*, int (*)(void*, int, unsigned char*), int, int, unsigned char*)':
simd_op_check_runtime.ll:(.text+0x1dd): undefined reference to `pthread_create'
Bitcode/Regression/vector_widen/CMakeFiles/widen_bug.dir/halide_runtime.bc.o: In function `halide_spawn_thread':
simd_op_check_runtime.ll:(.text+0x651): undefined reference to `pthread_create'
clang-6.0: error: linker command failed with exit code 1 (use -v to see invocation)
[794/4485] Building CXX object Bitcode/Benchmarks/Halide/local_laplacian/CMakeFiles/halide_local_laplacian.dir/local_laplacian.bc.o
ninja: build stopped: subcommand failed.

Apparently there is a halide_spawn_thread function in the bitcode. I am unsure if there is any source for that bitcode?

Feb 12 2018, 10:47 AM
MatzeB added a comment to D43168: Don't use -ldl on BSD.

@asbirlea: Why is -dl and -pthread needed at all? Glancing at the code I don't see any reference to pthreads or dlopen/dlsym...

Feb 12 2018, 10:13 AM
MatzeB added a comment to D43168: Don't use -ldl on BSD.

For such options there are designed other options, like CMAKE_EXE_LINKER_FLAGS. target_link_libraries() should be used for libraries.

Feb 12 2018, 10:03 AM
MatzeB added a comment to D43108: Support for the mno-stack-arg-probe flag.
  • No test.
  • What about -mstack-arg-probe, shouldn't we have that for consistency?
  • I'd prefer not to review clang changes myself as I don't know that part of the code too well.
Feb 12 2018, 9:49 AM
MatzeB added a comment to D43168: Don't use -ldl on BSD.

The CMAKE_DL_LIBS change LGTM.

Feb 12 2018, 9:39 AM

Feb 9 2018

MatzeB accepted D43136: Make LLVM timer reprintable: that is, make more than one print action on the same timer feasible.

I agree that a function named printXXX() should not also clear the timer.
As far as I understand it this shouldn't change any behavior in the existing clang/llvm code, so LGTM.

Feb 9 2018, 4:34 PM
MatzeB added a comment to D42576: [test-suite] Add prototypes to functions in Olden and MiBench.

I'll just add that what we do in WebAssembly right now for unprototyped functions is to codegen them using the fixed-arg calling convention; at link-time if there are any mismatches between the definition and any calls, then the program has UB, and the resulting WebAssembly module will be invalid. I think this currently has the problem that if the definition does turn out to be vararg, then we have generated an invalid module even though the program is correct (I consider that a bug or shortcoming in our toolchain though). In any case, we do aim to support correct programs with prototypeless functions. If the test suite has non-vararg cases where there are mismatches between the callsites and/or the definition (I have seen that in some other test/benchmark suites), then even though that works on most X86 and ARM ABIs that I know of it's still UB and I'd be in favor of removing it from LLVM's test suite.

Feb 9 2018, 11:31 AM
MatzeB requested changes to D42576: [test-suite] Add prototypes to functions in Olden and MiBench.

Your example will work for x86 and probably ARM, but as far as I can tell it is up to the ABI which calling convention should be used for functions without prototypes.
For us these calling conventions are incompatible since the variadic one will pass arguments on the stack instead of in registers so if the target function is not actually variadic this will cause it to read garbage from undefined registers. While it *should* work if there is only zero or one arguments it is not guaranteed to work with more than 1.

I also found that apparently WebAssembly does the same: https://github.com/WebAssembly/tool-conventions/issues/16 and it also causes issues with functions without prototypes (https://bugs.llvm.org/show_bug.cgi?id=35385)

Feb 9 2018, 8:43 AM

Feb 8 2018

MatzeB accepted D42411: [test-suite] Fix ambigous call to overloaded function isnan.

Just commit whatever version you see fit, I'm fine with any of the proposals; the bots will come back to you and tell you if things don't build somewhere :)

Feb 8 2018, 3:01 PM
MatzeB accepted D43084: [bugpoint] Simplify the global initializer reducer, NFC.

LGTM
...and except maybe for the renaming of the class this all looks like trivial changes that are fine with post-commit review.

Feb 8 2018, 11:58 AM

Feb 6 2018

MatzeB added a comment to D42879: InstCombine: 1./x >= 0. -> x >= 0..

Underflow or overflow doesn't change sign, so 0 < C < inf && X >= 0 --> C/X >= 0.

Feb 6 2018, 4:12 PM
MatzeB added a comment to D42879: InstCombine: 1./x >= 0. -> x >= 0..

Do we need to restrict this to 1.0 / X ? If we only care about the sign of the fdiv result and we're ruling out INF, then any constant 'C / X' should be ok? Can also handle 'X / C'?

It gets slightly harder with C/X because for large C we can underflow to zero if X is big enough. I wasn't sure how to compute the limit so went with 1.0 for now.

Feb 6 2018, 10:06 AM
MatzeB added a comment to D42879: InstCombine: 1./x >= 0. -> x >= 0..

Do we need to restrict this to 1.0 / X ? If we only care about the sign of the fdiv result and we're ruling out INF, then any constant 'C / X' should be ok? Can also handle 'X / C'?

Feb 6 2018, 10:03 AM

Feb 2 2018

MatzeB added a reviewer for D42879: InstCombine: 1./x >= 0. -> x >= 0.: scanon.
Feb 2 2018, 9:18 PM
MatzeB created D42879: InstCombine: 1./x >= 0. -> x >= 0..
Feb 2 2018, 9:15 PM

Feb 1 2018

MatzeB added inline comments to D42830: Add and link to a testing philosophy document in the developer documentation..
Feb 1 2018, 6:01 PM
MatzeB accepted D42830: Add and link to a testing philosophy document in the developer documentation..

Thanks for writing this! And LGTM in it's current form already.

Feb 1 2018, 5:58 PM
MatzeB added a comment to D42667: SplitKit: Fix liveness recomputation in some remat cases..

Nice trick wmi, I tried it and could create a very small testcase that would hit the problematic code in question. Unfortunately those small inputs would not fail in -verify-machineinstrs mode without this patch (I suspect the liveranges didn't happen to actually be invalid even though we ignored the fact that PHIs can have inputs).

Feb 1 2018, 3:37 PM
MatzeB added inline comments to D42667: SplitKit: Fix liveness recomputation in some remat cases..
Feb 1 2018, 10:41 AM

Jan 31 2018

MatzeB accepted D42726: [x86] Make the retpoline thunk insertion a machine function pass..

LGTM, thanks.

Jan 31 2018, 11:17 AM
MatzeB accepted D42746: MIR PhysReg sigil change from '%' to '$'.

Thanks for doing all the testcase fixup work!

Jan 31 2018, 10:29 AM
MatzeB added a reviewer for D42746: MIR PhysReg sigil change from '%' to '$': thegameg.
Jan 31 2018, 10:25 AM
MatzeB added inline comments to D42726: [x86] Make the retpoline thunk insertion a machine function pass..
Jan 31 2018, 10:22 AM

Jan 30 2018

MatzeB accepted D42655: [LivePhysRegs] Fix handling of return instructions..

Argh:

...
    BX_RET 4, killed %cpsr, implicit %r0
    Successors according to CFG: %bb.2(0x40000000 / 0x80000000 = 50.00%)
...

A conditional return statement breaks the whole nice logic of MachineBasicBlock::isReturnBlock() returning true or false, since it's clearly neither in this case...

Jan 30 2018, 5:01 PM

Jan 29 2018

MatzeB created D42667: SplitKit: Fix liveness recomputation in some remat cases..
Jan 29 2018, 3:33 PM
MatzeB added a comment to D41944: [LLVM][IR][LIT] support of 'no-overflow' flag for sdiv\udiv instructions.

So do we still need all the .ll file changes with the syntax nof/mof syntax changes?

Jan 29 2018, 1:35 PM
MatzeB added inline comments to D42387: [AArch64] Add pipeline model for Exynos M3.
Jan 29 2018, 1:23 PM
MatzeB added a comment to D42387: [AArch64] Add pipeline model for Exynos M3.

Actually can you rename the scheduling model to AArch64SchedExynosM3.td? (Feel free to do the same with the M1 file in a separate commit without review).

Jan 29 2018, 10:32 AM
MatzeB added a comment to D42006: AArch64: Omit callframe setup/destroy when not necessary.

Of course reverting is fine. This whole commit was mostly motivated by llvm/test/CodeGen/AArch64/big-callframe.ll but you could simply XFAIL that test together with the revert.

Jan 29 2018, 10:31 AM
MatzeB added a comment to D42450: [utils] Convert update_{llc_,}test_checks.py to Python 3.

Is the decision "we should not make it Python 3" ?

Jan 29 2018, 10:24 AM
MatzeB added a comment to D42534: [mips] Compute MaxCallFrame size early on.

+CC'ing @MatzeB for additional insight on what needs to be changed.

Found the bug. What's occurring with the debug information test failures with this patch, is that during the insertion of debug instructions for function arguments, it has to determine which register to assign to the debug instruction, see SelectionDAGISel.cpp:505. Here it asks for the frame pointer register which for some RISC like targets by the size of the frame for the function.

It seems to me that we need to compute the max call frame size before that point (SelectionDAGISel.cpp:505) for targets which use call frame setup and destroy pseudo instructions, this however requires changes for every target which implements finalizeLowering() to compute the maxCallFrameSize.

Jan 29 2018, 9:55 AM
MatzeB added inline comments to D42533: [X86FixupBWInsts] Fix miscompilation if sibling sub-register is live..
Jan 29 2018, 9:44 AM
MatzeB accepted D42387: [AArch64] Add pipeline model for Exynos M3.

I can't say much about the scheduling model itself as I'm not familiar with Exynos.

Jan 29 2018, 9:40 AM
MatzeB accepted D42089: [AArch64] Expand testing of zero cycle zeroing.

I guess sooner or later we have to change this test to just check for the two cases of
-mattr=+zcz / -mattr=-zcz
and having a separate test that checks whether your favorite CPU is setting all the subtarget features you expect.

Jan 29 2018, 9:24 AM
MatzeB accepted D42393: [AArch64] Add new target feature to fuse address generation with load or store.

LGTM

Jan 29 2018, 9:22 AM
MatzeB accepted D42449: [MachineVerifier] Add check that renamable operands aren't reserved registers..

I was contemplating whether we need a check that the register assigned on a renamable operand is a physreg. Of course we cannot even write a verifier for that as using MO::isRenamable() will immediately run into an assertion failure when that is not the case (which is good!).

Jan 29 2018, 9:18 AM

Jan 28 2018

MatzeB added a comment to D42576: [test-suite] Add prototypes to functions in Olden and MiBench.

Calling K&R functions is not a problem. It is fine to use them as long as there is a declaration with the parameter types in the LLVM IR and not just an declare i32 @foo(...). The problem is that we are calling functions that don't have a declaration and therefore the compiler assumes that it is a variadic function call. This leads to runtime crashes on our CHERI platform.

They are not necessarily correct C programs since they include undefined behaviour

Jan 28 2018, 1:00 PM

Jan 26 2018

MatzeB added a comment to D42509: [LivePhysRegs] Preserve pristine registers in blocks with no successors..

Oh before, have the exact same problem in LiveRegUnits::addLiveOuts(). I can take a look at that later if you don't beat me to it.

Jan 26 2018, 2:20 PM
MatzeB added a comment to D42576: [test-suite] Add prototypes to functions in Olden and MiBench.

As far as I can tell, those are correct C programs.

Jan 26 2018, 11:01 AM
MatzeB added a comment to D42411: [test-suite] Fix ambigous call to overloaded function isnan.

Anyway let's leave it as is, if it breaks the build on your side.

Just to double check, the 'it' you're referring to is the build system / compilation options for this test and you're happy with the patch as is?

Jan 26 2018, 10:11 AM

Jan 25 2018

MatzeB added reviewers for D42235: [AArch64] Avoid unnecessary vector byte-swapping in big-endian: evandro, t.p.northover.

This apparently defeats the logic in splitStores() (aka the infamous FeatureMisalignedSlow128Bit) because that code had an exception for v2i64 vector types but wants to split all other vector types. An internal test case of ours:

define void @test_split_16B(<4 x float> %val, <4 x float>* %addr) {
  store <4 x float> %val, <4 x float>* %addr, align 8
  ret void
}

would no longer show the store getting split.

Jan 25 2018, 5:27 PM
MatzeB accepted D42509: [LivePhysRegs] Preserve pristine registers in blocks with no successors..

Ok the subtle detail I missed before is that noreturn calls may still throw an exception/cause stack unwinding apparently. In that case we indeed need to prepare for CSR restoration.

Jan 25 2018, 3:48 PM
MatzeB added a comment to D42509: [LivePhysRegs] Preserve pristine registers in blocks with no successors..

A noreturn call doesn't need to have the callers callee saved registers restored.

Why not? The non-returning call doesn't come back to its call site, but it goes somewhere. From the point of view of some function foo, all of that activity may happen inside the call to bar (which would have further calls in it). Not preserving non-volatile registers could cause them to end up being clobbered across the call to bar.

Jan 25 2018, 3:30 PM
MatzeB added a comment to D42411: [test-suite] Fix ambigous call to overloaded function isnan.

Update as per @MatzeB 's comment, checking before commit showed it broke the build on OSX.

Jan 25 2018, 3:20 PM
MatzeB added inline comments to D42533: [X86FixupBWInsts] Fix miscompilation if sibling sub-register is live..
Jan 25 2018, 3:06 PM

Jan 24 2018

MatzeB added a comment to D42450: [utils] Convert update_{llc_,}test_checks.py to Python 3.

so I must've done something to be able to run these scripts on my machine. :)

Well installing any homebrew or macports python (which is often a dependency for other packages) will give you a python2 with the proper symlinks in place. It's usually people maintaining CI infrastructured that are bothered by macOS (or linux) default installations.

Mac's system python comes with a python2.7 symlink. Today you don't need to do anything for this script to work.

Jan 24 2018, 5:22 PM
MatzeB added a comment to D42509: [LivePhysRegs] Preserve pristine registers in blocks with no successors..

A noreturn call doesn't need to have the callers callee saved registers restored. I also think this fact shouldn't change with throwing calls (I assume the unwinding tables would contain information on how to restore the callee saves in this case?).

Jan 24 2018, 3:29 PM
MatzeB added a comment to D42450: [utils] Convert update_{llc_,}test_checks.py to Python 3.
I'd like to see llvm moving away from legacy python2.7

That needs discussion on llvm-dev to make sure everyone is on the same page, GettingStarted.rst is updated etc. Forcing just a single script to python3 is not a good idea.

Jan 24 2018, 1:38 PM
MatzeB accepted D42502: [MIR] Add support for addrspace in MIR.

(+CC AMDGPU so they can veto if they don't like the syntax)

Jan 24 2018, 1:30 PM

Jan 23 2018

MatzeB added a comment to D42411: [test-suite] Fix ambigous call to overloaded function isnan.

I guess a more C++y solution would be to #include <cmath> and using std::isnan() instead of using the C headers. But I guess this is fine for now.

Jan 23 2018, 3:39 PM
MatzeB added a comment to D42450: [utils] Convert update_{llc_,}test_checks.py to Python 3.

LLVM's GettingStarted.html says "python >= 2.7" is fine for LLVM development.

Jan 23 2018, 3:29 PM
MatzeB added a comment to D42006: AArch64: Omit callframe setup/destroy when not necessary.

I didn't do in-depth performance tests. In principle this just remove a few "nop" instructions so I didn't expect big changes. The change gives the scheduler a bit more freedom though as there is not instruction redefining SP anymore... Did you only get regressions and no improvements from this?

Jan 23 2018, 1:37 PM