uweigand (Ulrich Weigand)
User

Projects

User does not belong to any projects.

User Details

User Since
Apr 14 2013, 11:48 AM (279 w, 10 h)

Recent Activity

Yesterday

uweigand added a comment to D50913: [FPEnv] Don't need copysign/fabs/fneg constrained intrinsics.

I do see that you wrote -0-x. Is there something special about the -0?

Sat, Aug 18, 7:09 PM

Fri, Aug 17

uweigand added a comment to D50913: [FPEnv] Don't need copysign/fabs/fneg constrained intrinsics.

I believe we still don't need an intrinsic for constrained fneg. As you say, floating-point negation is implemented as -0-x, using a regular fsub (*not* the constrained intrinsic). The fsub semantics say that it cannot trap, so this would still be fine -- as long as we're sure the -0-x is always implemented via a dedicated negate instruction, and never via a subtraction. But this seems to be true: the idiom -0-x is recognized early during codegen and always treated specially.
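
To illustrate why the idiom is written with -0 rather than +0, here is a minimal Python sketch of the IEEE 754 arithmetic involved, assuming the default round-to-nearest mode. It models only the arithmetic, not LLVM's code generation, and the helper names fneg_via_fsub, naive_neg and is_negative are hypothetical.

```python
import math

def fneg_via_fsub(x: float) -> float:
    # The idiom discussed above: negation written as (-0.0) - x.
    return -0.0 - x

def naive_neg(x: float) -> float:
    # Subtracting from +0.0 instead; shown only for comparison.
    return 0.0 - x

def is_negative(x: float) -> bool:
    # copysign exposes the sign bit, which distinguishes -0.0 from +0.0.
    return math.copysign(1.0, x) < 0

for x in (1.5, -2.0, 0.0, -0.0):
    print(x, fneg_via_fsub(x), naive_neg(x))
```

Under these assumptions the two forms differ only for x = +0.0, where 0 - x gives +0.0 instead of the -0.0 that a true negation would produce.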

Fri, Aug 17, 2:02 PM

Wed, Aug 15

uweigand accepted D50779: [SystemZ] New CL option to enable subreg liveness.

LGTM.

Wed, Aug 15, 7:59 AM
uweigand accepted D50725: [SystemZ] Replace subreg_r with subreg_h.

Ah, that's good to know! So if I understand this correctly, accessing even the 32-bit part (F0S) would be considered to clobber F0D. This is not really necessary, but probably doesn't hurt at this point. We could do the same thing as the HAX you mention to get this modeled exactly.

Wed, Aug 15, 7:58 AM
uweigand added a comment to D50725: [SystemZ] Replace subreg_r with subreg_h.

Well, the original rationale for using different subreg indices for float/vector registers is given here:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20150504/274358.html

Wed, Aug 15, 6:27 AM

Fri, Aug 10

uweigand added a comment to D50546: CMake: Fix native arch selection on s390 (32-bit).

Why? The SystemZ back-end doesn't support 32-bit code generation anyway, so there is no native LLVM support on s390 ...

Fri, Aug 10, 2:05 AM

Thu, Aug 9

uweigand added a comment to D45576: [RFC] Allow target to handle STRICT floating-point nodes.

Ping. Would this be easier to review if I split it up into multiple patches?

Thu, Aug 9, 10:19 AM
uweigand added a comment to D50514: [7.0 branch] Update release notes (SystemZ, TableGen).

I've checked it in now, thanks!

Thu, Aug 9, 9:19 AM
uweigand committed rL339355: [7.0 branch] Update release notes (SystemZ, TableGen).
[7.0 branch] Update release notes (SystemZ, TableGen)
Thu, Aug 9, 9:18 AM
uweigand closed D50514: [7.0 branch] Update release notes (SystemZ, TableGen).
Thu, Aug 9, 9:18 AM
uweigand created D50514: [7.0 branch] Update release notes (SystemZ, TableGen).
Thu, Aug 9, 8:27 AM

Tue, Aug 7

uweigand accepted D50358: [SelectionDAG][X86][SystemZ] Add a generic nonvolatile_store/nonvolatile_load pattern fragment in TargetSelectionDAG.td.

LGTM as well.

Tue, Aug 7, 4:27 AM

Fri, Aug 3

uweigand added a comment to D45576: [RFC] Allow target to handle STRICT floating-point nodes.

Ping?

Fri, Aug 3, 7:51 AM
uweigand accepted D50187: [SystemZ] Improve handling of instructions which expand to several groups.

Patch LGTM. Thanks!

Fri, Aug 3, 2:58 AM

Thu, Aug 2

uweigand added inline comments to D50187: [SystemZ] Improve handling of instructions which expand to several groups.
Thu, Aug 2, 10:31 AM

Wed, Aug 1

uweigand added a comment to D49994: Allow constraining virtual register's class within reason.

The new code for f2 in cond-move-03.ll is in fact better, since it now actually uses the conditional move instruction instead of a branch ...

Thanks, Ulrich. Does it fix FIXME: We should commute the LOCRMux to save one move. or is it unrelated?

Wed, Aug 1, 10:30 AM
uweigand added a comment to D49994: Allow constraining virtual register's class within reason.

The new code for f2 in cond-move-03.ll is in fact better, since it now actually uses the conditional move instruction instead of a branch ...

Wed, Aug 1, 9:37 AM
uweigand committed rL338522: Fix build bot after r338521.
Fix build bot after r338521
Wed, Aug 1, 5:07 AM
uweigand committed rL338521: [SystemZ, TableGen] Fix shift count handling.
[SystemZ, TableGen] Fix shift count handling
Wed, Aug 1, 4:58 AM
This revision was not accepted when it landed; it landed in state Needs Review.
Wed, Aug 1, 4:58 AM

Tue, Jul 31

uweigand added a comment to D50018: SystemZ: keep AND masks before SHL i128.

I've just posted a patch to reimplement the broken SystemZ shift count logic: https://reviews.llvm.org/D50096

Tue, Jul 31, 12:02 PM
uweigand created D50096: [SystemZ, TableGen] Fix shift count handling.
Tue, Jul 31, 12:01 PM

Fri, Jul 27

uweigand added a comment to D45576: [RFC] Allow target to handle STRICT floating-point nodes.

Since a and b are local to the function, the *non-strict* DIV could be hoisted out of the loop. If we assume that calls implicitly define FPC, which I think we must to model global state, wouldn't that prevent the hoist of this non-strict DIV?

Fri, Jul 27, 9:42 AM
uweigand updated the diff for D45576: [RFC] Allow target to handle STRICT floating-point nodes.

Updated patch to make use of the new multi-alternative pattern fragment features, and completed set of test cases.

Fri, Jul 27, 6:05 AM
uweigand accepted D49847: [SystemZ] Improve decoding in case of instructions with four register operands..

Ah, OK. Thanks for checking!

Fri, Jul 27, 4:48 AM

Thu, Jul 26

uweigand added inline comments to D49847: [SystemZ] Improve decoding in case of instructions with four register operands..
Thu, Jul 26, 10:19 AM

Wed, Jul 25

uweigand committed rL337939: Fix corruption of result number in LegalizeVectorOps.cpp.
Fix corruption of result number in LegalizeVectorOps.cpp
Wed, Jul 25, 10:08 AM
uweigand closed D49805: Fix corruption of result number in LegalizeVectorOps.cpp.
Wed, Jul 25, 10:08 AM
uweigand created D49805: Fix corruption of result number in LegalizeVectorOps.cpp.
Wed, Jul 25, 9:02 AM
uweigand accepted D49598: [SystemZ] Use tablegen loops in SchedModels.

LGTM, thanks.

Wed, Jul 25, 4:45 AM
uweigand added a comment to D49598: [SystemZ] Use tablegen loops in SchedModels.

Sorry, missed your last update.

Wed, Jul 25, 3:26 AM
uweigand added a comment to D49598: [SystemZ] Use tablegen loops in SchedModels.

A few more comments, LGTM with those changes.

Wed, Jul 25, 3:23 AM
uweigand added a comment to D49598: [SystemZ] Use tablegen loops in SchedModels.

See inline comments. In general I agree with this approach.

Wed, Jul 25, 2:35 AM

Mon, Jul 23

uweigand added a comment to D49598: [SystemZ] Use tablegen loops in SchedModels.

I believe in this case it would be preferable to include the full range of latency values. We should try to keep the source as concise and straightforward as possible; if this results in some units being defined that aren't used, that doesn't seem to be a drawback to me. Also, for the common code parts, even if no current instruction uses any particular value, there may always be additional instructions in future processors.

Mon, Jul 23, 11:46 AM

Jul 20 2018

uweigand committed rL337542: [SystemZ] Test case formatting fixes.
[SystemZ] Test case formatting fixes
Jul 20 2018, 5:18 AM

Jul 16 2018

uweigand added a comment to D49262: [DAGCombiner] Call SimplifyDemandedVectorElts from EXTRACT_VECTOR_ELT.

The SystemZ changes look good to me. Thanks for taking care of this!

Jul 16 2018, 10:07 AM

Jul 15 2018

uweigand added a comment to D49262: [DAGCombiner] Call SimplifyDemandedVectorElts from EXTRACT_VECTOR_ELT.

Hmm ... The SystemZ tests seem to be getting strictly worse. Before, we have in f3:

Jul 15 2018, 10:18 AM

Jul 13 2018

uweigand committed rL337023: [TableGen] Suppress type validation when parsing pattern fragments.
[TableGen] Suppress type validation when parsing pattern fragments
Jul 13 2018, 9:47 AM
uweigand closed D48887: [TableGen] Suppress type validation when parsing pattern fragments.
Jul 13 2018, 9:47 AM
uweigand added a comment to D48887: [TableGen] Suppress type validation when parsing pattern fragments.

Ping?

Jul 13 2018, 6:25 AM
uweigand abandoned D48326: [RFC] "Alternative" matches for TableGen DAG patterns.

Superseded by https://reviews.llvm.org/D48545

Jul 13 2018, 6:24 AM
uweigand committed rL336999: [TableGen] Support multi-alternative pattern fragments.
[TableGen] Support multi-alternative pattern fragments
Jul 13 2018, 6:23 AM
uweigand closed D48545: [RFC v2] "Alternative" matches for TableGen DAG patterns.
Jul 13 2018, 6:23 AM

Jul 12 2018

uweigand added a comment to D48545: [RFC v2] "Alternative" matches for TableGen DAG patterns.

LGTM (there are a lot of changes here, but given that it produces no changes to existing matching tables, that seems like pretty good test coverage).

Jul 12 2018, 3:10 PM
uweigand added a comment to D48545: [RFC v2] "Alternative" matches for TableGen DAG patterns.

Ping?

Jul 12 2018, 11:04 AM

Jul 11 2018

uweigand accepted D49161: Fix reading 32 bit gcov tag values on little-endian machines.

Ah, I missed that, probably because on SystemZ char is unsigned by default ... Sorry about that.
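
As a rough illustration of how the signedness of char can matter when a 32-bit value is assembled from individual bytes, here is a minimal Python sketch. It is not the compiler-rt code from this review; read_u32_le and as_signed_char are hypothetical names, and the model assumes two's-complement sign extension when a signed char is promoted to int.

```python
def as_signed_char(b: int) -> int:
    # Reinterpret a byte value 0..255 as a signed 8-bit char.
    return b - 256 if b >= 128 else b

def read_u32_le(buf: bytes, char_is_signed: bool) -> int:
    # Assemble a little-endian 32-bit value one byte at a time.
    value = 0
    for i, b in enumerate(buf[:4]):
        c = as_signed_char(b) if char_is_signed else b
        # Emulate promotion to int followed by a shift, truncated to 32 bits.
        value |= (c << (8 * i)) & 0xFFFFFFFF
    return value

data = bytes([0x80, 0x00, 0x00, 0x01])
print(hex(read_u32_le(data, char_is_signed=False)))
print(hex(read_u32_le(data, char_is_signed=True)))
```

With an unsigned char the bytes combine as intended. With a signed char, any byte of 0x80 or above sign-extends and sets all the higher bits, so a problem of this kind stays hidden on a platform where char is unsigned by default.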

Jul 11 2018, 2:29 AM

Jul 10 2018

uweigand added a comment to D49132: Fix gcov profiling on big-endian machines.

This caused a build bot failure here:
http://lab.llvm.org:8011/builders/clang-x86_64-linux-abi-test/builds/28963

Jul 10 2018, 10:15 AM
uweigand committed rCRT336706: [gcov] Fix fallout from r336693.
[gcov] Fix fallout from r336693
Jul 10 2018, 10:13 AM
uweigand committed rL336706: [gcov] Fix fallout from r336693.
[gcov] Fix fallout from r336693
Jul 10 2018, 10:13 AM
uweigand added a comment to D30432: [asan] Print a "PC is at a non-executable memory region" message if that's the case.

I've now disabled the test on s390 to get the build bot going reliably again.

Jul 10 2018, 10:00 AM
uweigand committed rCRT336705: [asan] Disable non-execute test on s390.
[asan] Disable non-execute test on s390
Jul 10 2018, 10:00 AM
uweigand committed rL336705: [asan] Disable non-execute test on s390.
[asan] Disable non-execute test on s390
Jul 10 2018, 10:00 AM
uweigand added a comment to D30432: [asan] Print a "PC is at a non-executable memory region" message if that's the case.

This is also failing on s390x (randomly). The problem is that we don't actually have a non-executable flag, so the processor *will* execute the contents of "array" as code. Depending on what's there (it is just uninitialized memory), this will crash sooner or later, but in a way that may or may not make the check succeed randomly.

Jul 10 2018, 9:53 AM
uweigand added a comment to D49134: Fix ABI when calling llvm_gcov_... routines from instrumentation code.

LGTM! I assume you caught all the affected function calls.

Jul 10 2018, 9:18 AM
uweigand committed rL336695: Remove s390x XFAILs now that gcov profiling works..
Remove s390x XFAILs now that gcov profiling works.
Jul 10 2018, 9:14 AM
uweigand committed rCRT336695: Remove s390x XFAILs now that gcov profiling works..
Remove s390x XFAILs now that gcov profiling works.
Jul 10 2018, 9:14 AM
uweigand committed rCRT336693: [gcov] Fix gcov profiling on big-endian machines.
[gcov] Fix gcov profiling on big-endian machines
Jul 10 2018, 9:13 AM
uweigand committed rL336693: [gcov] Fix gcov profiling on big-endian machines.
[gcov] Fix gcov profiling on big-endian machines
Jul 10 2018, 9:13 AM
uweigand closed D49132: Fix gcov profiling on big-endian machines.
Jul 10 2018, 9:13 AM
uweigand committed rL336692: [gcov] Fix ABI when calling llvm_gcov_... routines from instrumentation code.
[gcov] Fix ABI when calling llvm_gcov_... routines from instrumentation code
Jul 10 2018, 9:11 AM
uweigand closed D49134: Fix ABI when calling llvm_gcov_... routines from instrumentation code.
Jul 10 2018, 9:10 AM
uweigand created D49134: Fix ABI when calling llvm_gcov_... routines from instrumentation code.
Jul 10 2018, 7:21 AM
uweigand created D49132: Fix gcov profiling on big-endian machines.
Jul 10 2018, 7:12 AM

Jul 6 2018

uweigand added a comment to D48538: Make __gcov_flush flush counters for all shared libraries.

It looks like this causes build bot failures on s390x-linux. Three of the new tests fail:
http://lab.llvm.org:8011/builders/clang-s390x-linux/builds/17143

Jul 6 2018, 7:43 AM

Jul 3 2018

uweigand created D48887: [TableGen] Suppress type validation when parsing pattern fragments.
Jul 3 2018, 10:57 AM

Jun 25 2018

uweigand added a comment to D48326: [RFC] "Alternative" matches for TableGen DAG patterns.

I don't think that the new keyword is necessary. Why not have a new class Patterns that takes a list of input dag patterns (otherwise identical to the current Pattern), and matches when any of these inputs matches? Then you can have a PatFrags as an analogy to PatFrag.

Jun 25 2018, 6:06 AM
uweigand created D48545: [RFC v2] "Alternative" matches for TableGen DAG patterns.
Jun 25 2018, 6:02 AM
uweigand accepted D47008: [SystemZ] Reimplent SchedModel IssueWidth and WriteRes/ReadAdvance mappings to operands.

This version looks good to me, and we're also seeing good performance results.

Jun 25 2018, 2:43 AM

Jun 20 2018

uweigand added a comment to D48326: [RFC] "Alternative" matches for TableGen DAG patterns.

For the purposes of matching the main pattern is really the same as the ones specified by Pats. I am opposed to making this feature specific to the "Pattern" field, (1) because this is a very useful feature to have in general, and (2) because if the implementation is main-pattern-specific, it would need to be replaced to make it work with Pats in the future. The main strength of this feature comes from multiple "source patterns" in a PatFrag.

Jun 20 2018, 8:18 AM
uweigand added a comment to D48326: [RFC] "Alternative" matches for TableGen DAG patterns.

I don't think that the new keyword is necessary. Why not have a new class Patterns that takes a list of input dag patterns (otherwise identical to the current Pattern), and matches when any of these inputs matches? Then you can have a PatFrags as an analogy to PatFrag.

Jun 20 2018, 7:15 AM

Jun 19 2018

uweigand updated the diff for D45576: [RFC] Allow target to handle STRICT floating-point nodes.

I've come up with a suggestion to avoid even the duplicated DAG patterns, by having them implicitly generated by TableGen. This uses a new "alternative" mechanism described in a separate RFC:
https://reviews.llvm.org/D48326

Jun 19 2018, 10:05 AM
uweigand created D48326: [RFC] "Alternative" matches for TableGen DAG patterns.
Jun 19 2018, 9:46 AM
uweigand committed rT335044: [test-suite] Fix SystemZ build break (missing cycleclock::Now).
[test-suite] Fix SystemZ build break (missing cycleclock::Now)
Jun 19 2018, 6:27 AM
uweigand committed rL335044: [test-suite] Fix SystemZ build break (missing cycleclock::Now).
[test-suite] Fix SystemZ build break (missing cycleclock::Now)
Jun 19 2018, 6:27 AM

Jun 8 2018

uweigand committed rL334286: clang-s390x-linux-lnt: Move to test-suite producer.
clang-s390x-linux-lnt: Move to test-suite producer
Jun 8 2018, 6:09 AM

Jun 6 2018

uweigand accepted D47820: [SystemZ] Build TM from scratch in convertToLoadAndTest, to get CC operand in right place..

Otherwise, this LGTM.

Jun 6 2018, 4:43 AM

Jun 1 2018

uweigand added a comment to D45576: [RFC] Allow target to handle STRICT floating-point nodes.

This version now correctly implements target support for STRICT floating-point nodes using an MMO to represent the floating-point exception status. (Note that targets can and should additionally use a register use dependency on *all* floating-point instructions to represent floating-point control state, e.g. rounding mode and exception trap flags.)

Jun 1 2018, 10:15 AM
uweigand updated the diff for D45576: [RFC] Allow target to handle STRICT floating-point nodes.

Working version of the patch using memory operands.

Jun 1 2018, 10:04 AM

May 30 2018

uweigand added a comment to rL303907: Fix bug #28898.

Just to confirm: are you saying that the problem reported by Tim will go away if he updates his libedit binary?

May 30 2018, 9:21 AM
uweigand added a comment to rL303907: Fix bug #28898.

We still seem to have a bug here, so we should figure out how to fix it. Simply reverting this does not seem to be an option because it breaks lldb horribly for certain build configs of libedit. I'll try to look at what can be done here. What would certainly help this situation is if we made the value of LLDB_EDITLINE_USE_WCHAR configurable at build time (with the defaults as they are now, at least initially). Then for your internal build you can force-set the variable to true as long as you have a wide-char enabled libedit. I think that should resolve the asan problem and leave you with a working lldb.

May 30 2018, 8:42 AM

May 29 2018

uweigand added a comment to D45576: [RFC] Allow target to handle STRICT floating-point nodes.

To take a hint from the suggestion you relayed from Chris, how about we just add MMOs to these instructions, and then let mayLoad/mayStore look at optional MMOs when returning their answer? Maybe this lets us do what we want without duplicating all of the patterns?

May 29 2018, 10:43 AM
uweigand updated the diff for D45576: [RFC] Allow target to handle STRICT floating-point nodes.

(Failed) attempt to handle strict and regular FP operations without duplicating patterns ...

May 29 2018, 10:34 AM
uweigand added a comment to D47380: Make getStrictFPOpcodeAction(...) more accessible .

Is there a reason for dropping the "FP" from "StrictFP" in the method name? I think it would be better to keep it (i.e. use "getStrictFPOperationAction"), in particular since there are already other public methods in this area that use "StrictFP", like isStrictFPOpcode and MutateStrictFPToFP.

May 29 2018, 9:09 AM

May 24 2018

uweigand added a comment to D45576: [RFC] Allow target to handle STRICT floating-point nodes.

To take a hint from the suggestion you relayed from Chris, how about we just add MMOs to these instructions, and then let mayLoad/mayStore look at optional MMOs when returning their answer? Maybe this lets us do what we want without duplicating all of the patterns?

May 24 2018, 5:07 AM

May 23 2018

uweigand added a comment to D45576: [RFC] Allow target to handle STRICT floating-point nodes.

Ah, okay. That doesn't seem so bad. The fact that the number of times they're called can't be observable means, I think, that we only need to model the trapping behavior as writing (and not reading) inaccessible memory. This can't be removed or speculated, but otherwise shouldn't have undue optimization implications.

Correction: I mean writing to memory (the trap handler might write memory other than inaccessible memory (e.g., some global variable)). Anything that's escaped (trivially including globals) is fair game.

I suppose that they also get to read memory (so long as they're not using it to keep ordering/count information). So the trap handlers are modeled as reading/writing arbitrary escaped memory. That, plus register dependencies, seems like it should be sufficient.

May 23 2018, 10:02 AM
uweigand added a comment to D45576: [RFC] Allow target to handle STRICT floating-point nodes.

That's a good point. I'm not sure that we want to model the effects of trap handlers, however. We don't at the IR level (as we mark the intrinsics as IntrInaccessibleMemOnly), and don't want to (because otherwise we need to assume that every FP operation is like an arbitrary external function call). As a result, I don't think that we want to here either. Also, can we delete the instructions when they're dead?

May 23 2018, 8:18 AM
uweigand added a comment to D46967: Vector constrained FP intrinsics.

Oh, interesting. So the second comment on D45576 suggests that there's more design for this than I've seen so far. I'll close this Diff, if there are no objections. It's pretty clear that this is not the correct path going forward.

May 23 2018, 7:59 AM
uweigand added a comment to D45576: [RFC] Allow target to handle STRICT floating-point nodes.

Following the BoF at EuroLLVM, I've been talking to Chris Lattner about this, and he suggested a somewhat different approach: providing a way for an MI instruction to have the "unmodeled side effects" flag be provided by an MI *operand*. That is, right now the hasSideEffects flag is a constant: any particular MI instruction class has this either set or clear, but to model FP instructions, it might be better to have a setting where whether or not any particular instantiation of the instruction has side effects depends on the value of an operand.

But that's much stronger than necessary, and will prevent all kinds of optimizations. I don't think that we want to do that. Why not just add a dependence on the registers that matter?

May 23 2018, 4:32 AM

May 22 2018

uweigand added a comment to D45576: [RFC] Allow target to handle STRICT floating-point nodes.

The reason that I originally implemented this the way I did, mutating the strict nodes to their non-constrained equivalents, was that I thought it would require too much duplication in the .td files to implement all the pattern matching for the strict nodes. The original plan was to find some other way to communicate the "strict" state to the target after instruction selection, but I never found a way to do that.

What you've done here seems reasonable to me. Obviously it still involves a lot of updates to the td files, but your approach to making that manageable going forward seems plausible. I really don't have the expertise in instruction selection to judge that completely, but this looks like a promising direction.

May 22 2018, 9:05 AM
uweigand added a comment to D46967: Vector constrained FP intrinsics.

In addition to what Andrew said, there's another reason not to mutate this early: sooner or later, we'll need to give the target the chance to actually model strict operations differently (e.g. via more precise tracking of the FP status bit dependencies). At this point, the information which operations were strict must be preserved until the MI level anyway.

May 22 2018, 8:54 AM

May 19 2018

uweigand added a comment to D46315: [RegUsageInfoCollector] Fix handling of callee saved registers with CSR optimization..

An alternative to proving that would be to have the target explicitly say whether or not its register units cover the full registers.

May 19 2018, 5:13 AM

Apr 30 2018

uweigand added a comment to D46232: [SystemZ, IPRA] determineCalleeSaves must always add return register and DP..

So, in effect, never do the IPRA callee-saved optimization for R11, because the caller *could* use it as frame pointer? That's certainly the conservative solution.

Apr 30 2018, 11:22 AM
uweigand added a comment to D46232: [SystemZ, IPRA] determineCalleeSaves must always add return register and DP..

But we do not want to force use of a frame pointer; in general this just reduces performance for no reason on Z. Some functions require a frame pointer (those that do dynamic allocas), but most do not.

Apr 30 2018, 11:01 AM
uweigand committed rL331203: [SystemZ] Handle SADDO et.al. and ADD/SUBCARRY.
[SystemZ] Handle SADDO et.al. and ADD/SUBCARRY
Apr 30 2018, 10:58 AM
uweigand committed rL331202: [SystemZ] Do not use glue to represent condition code dependencies.
[SystemZ] Do not use glue to represent condition code dependencies
Apr 30 2018, 10:56 AM
uweigand added a comment to D46232: [SystemZ, IPRA] determineCalleeSaves must always add return register and DP..

This looks like a generic problem to me: many platforms have registers that are reserved only in some functions but not others. If function A where a register is reserved calls function B where the register is not reserved, something must ensure that the register is preserved across the call to B. Usually, this is the case because such registers need to be callee-saved. But if function B is optimized via IPRA to not save the register, we have a problem ...

Apr 30 2018, 10:00 AM
uweigand committed rL331192: [SystemZ] Refactor some VT casts in DAG match patterns.
[SystemZ] Refactor some VT casts in DAG match patterns
Apr 30 2018, 8:56 AM
uweigand committed rL331191: [SystemZ] Improve handling of Select pseudo-instructions.
[SystemZ] Improve handling of Select pseudo-instructions
Apr 30 2018, 8:53 AM
uweigand added a comment to D46232: [SystemZ, IPRA] determineCalleeSaves must always add return register and DP..

Well, usually this is handled correctly by the isPhysRegModified call in SystemZFrameLowering::determineCalleeSaves. (The only reason why we even need the extra hasCalls check is that the call instruction is currently not at all stages modeled correctly to show that R14 is clobbered.)

Apr 30 2018, 4:57 AM
uweigand added a comment to D46232: [SystemZ, IPRA] determineCalleeSaves must always add return register and DP..

I do not believe we need to do anything special with R11. This is only special if it is used as frame pointer, in which case the code before already adds it as caller-saved:

Apr 30 2018, 2:56 AM