uweigand (Ulrich Weigand)
User

Projects

User does not belong to any projects.

User Details

User Since
Apr 14 2013, 11:48 AM (270 w, 1 d)

Recent Activity

Fri, Jun 8

uweigand committed rL334286: clang-s390x-linux-lnt: Move to test-suite producer.
clang-s390x-linux-lnt: Move to test-suite producer
Fri, Jun 8, 6:09 AM

Wed, Jun 6

uweigand accepted D47820: [SystemZ] Build TM from scratch in convertToLoadAndTest, to get CC operand in right place..

Otherwise, this LGTM.

Wed, Jun 6, 4:43 AM

Fri, Jun 1

uweigand added a comment to D45576: [RFC] Allow target to handle STRICT floating-point nodes.

This version now correctly implements target support for STRICT floating-point nodes using a MMO to represent the floating-point exception status. (Note that targets can and should additionally use a register use dependency on *all* floating-point instructions to represent floating-point control state, e.g. rounding mode and exception trap flags.)

Fri, Jun 1, 10:15 AM
uweigand updated the diff for D45576: [RFC] Allow target to handle STRICT floating-point nodes.

Working version of the patch using memory operands.

Fri, Jun 1, 10:04 AM

Wed, May 30

uweigand added a comment to rL303907: Fix bug #28898.

Just to confirm: are you saying that the problem reported by Tim will go away if he updates his libedit binary?

Wed, May 30, 9:21 AM
uweigand added a comment to rL303907: Fix bug #28898.

We still seem to have a bug here, so we should figure out how to fix it. Simply reverting this does not seem to be an option because it breaks lldb horribly for certain build configs of libedit. I'll try to look at what can be done here. What would certainly help this situation is if we made the value of LLDB_EDITLINE_USE_WCHAR configurable at build time (with the defaults as they are now, at least initially). Then for your internal build to can force-set the variable to true as long as you have a wide-char enabled libedit. I think that should resolve the asan problem and leave you with a working lldb.

Wed, May 30, 8:42 AM

Tue, May 29

uweigand added a comment to D45576: [RFC] Allow target to handle STRICT floating-point nodes.

To take a hint from the suggestion you relayed from Chris, how about we just add MMOs to these instructions, and then let mayLoad/mayStore look at optional MMOs when returning their answer? Maybe this lets us do what we want without duplicating all of the patterns?

Tue, May 29, 10:43 AM
uweigand updated the diff for D45576: [RFC] Allow target to handle STRICT floating-point nodes.

(Failed) attempt to handle strict and regular FP operations without duplicating patterns ...

Tue, May 29, 10:34 AM
uweigand added a comment to D47380: Make getStrictFPOpcodeAction(...) more accessible .

Is there a reason for dropping the "FP" from "StrictFP" in the method name? I think it would be better to keep it (i.e. use "getStrictFPOperationAction"), in particular since there are already other public methods in this area that use "StrictFP", like isStrictFPOpcode and MutateStrictFPToFP.

Tue, May 29, 9:09 AM

Thu, May 24

uweigand added a comment to D45576: [RFC] Allow target to handle STRICT floating-point nodes.

To take a hint from the suggestion you relayed from Chris, how about we just add MMOs to these instructions, and then let mayLoad/mayStore look at optional MMOs when returning their answer? Maybe this lets us do what we want without duplicating all of the patterns?

Thu, May 24, 5:07 AM

Wed, May 23

uweigand added a comment to D45576: [RFC] Allow target to handle STRICT floating-point nodes.

Ah, okay. That doesn't seem so bad. The fact that the number of times they're called can't be observable means, I think, that we only need to model the trapping behavior as writing (and not reading) inaccessible memory. This can't be removed or speculated, but otherwise shouldn't have undo optimization implications.

Correction: I mean writing to memory (the trap handler might write memory other than inaccessible memory (e.g., some global variable)). Anything that's escaped (trivially including globals) is fair game.

I suppose that they also get to read memory (so long as they're not using it to keep ordering/count information). So the trap handlers are modeled as reading/writing arbitrary escaped memory. That, plus register dependencies, seems like it should be sufficient.

Wed, May 23, 10:02 AM
uweigand added a comment to D45576: [RFC] Allow target to handle STRICT floating-point nodes.

That's a good point. I'm not sure that we want to model the effects of trap handlers, however. We don't at the IR level (as we mark the intrinsics as IntrInaccessibleMemOnly), and don't want to (because otherwise we need to assume that every FP operation is like an arbitrary external function call). As a result, I don't think that we want to here either. Also, can we delete the instructions when they're dead?

Wed, May 23, 8:18 AM
uweigand added a comment to D46967: Vector constrained FP intrinsics.

Oh, interesting. So the second comment on D45576 suggests that there's more design for this than I've seen so far. I'll close this Diff, if there are not objections. It's pretty clear that this is not the correct path going forward.

Wed, May 23, 7:59 AM
uweigand added a comment to D45576: [RFC] Allow target to handle STRICT floating-point nodes.

Following the BoF at EuroLLVM, I've been talking to Chris Lattner about this, and he suggested a somewhat different approach: providing a way for an MI instruction to have the "unmodeled side effects" flag be provided by an MI *operand*. That is, right now the hasSideEffects flag is a constant: any particular MI instruction class has this either set or clear, but to model FP instructions, it might be better to have a setting where whether or not any particular instantiation of the instruction has side effects depends on the value of an operand.

But that's much stronger than necessary, and will prevent all kinds of optimizations. I don't think that we want to do that. Why not just add a dependence on the registers that matter?

Wed, May 23, 4:32 AM

Tue, May 22

uweigand added a comment to D45576: [RFC] Allow target to handle STRICT floating-point nodes.

The reason that I originally implemented this the way I did, mutating the strict nodes to their non-constrained equivalents, was that I thought it would require too much duplication in the .td files to implement all the pattern matching for the strict nodes. The original plan was to find some other way to communicate the "strict" state to the target after instruction selection, but I never found a way to do that.

What you've done here seems reasonable to me. Obviously it still involves a lot of updates to the td files, but your approach to making that manageable going forward seems plausible. I really don't have the expertise in instruction selection to judge that completely, but this looks like a promising direction.

Tue, May 22, 9:05 AM
uweigand added a comment to D46967: Vector constrained FP intrinsics.

In addition to what Andrew said, there's another reason not to mutate this early: sooner or later, we'll need to give the target the chance to actually model strict operations differently (e.g. via more precise tracking of the FP status bit dependencies). At this point, the information which operations were strict must be preserved until the MI level anyway.

Tue, May 22, 8:54 AM

May 19 2018

uweigand added a comment to D46315: [RegUsageInfoCollector] Fix handling of callee saved registers with CSR optimization..

An alternative of proving that, would be to have the target explicitly say whether or not their register units covers the full registers.

May 19 2018, 5:13 AM

Apr 30 2018

uweigand added a comment to D46232: [SystemZ, IPRA] determineCalleeSaves must always add return register and DP..

So, in effect, never do the IPRA callee-saved optimization for R11, because the caller *could* use it as frame pointer? That's certainly the conservative solution.

Apr 30 2018, 11:22 AM
uweigand added a comment to D46232: [SystemZ, IPRA] determineCalleeSaves must always add return register and DP..

But we do not want to force use of a frame pointer, in general this just reduces performance for no reason on Z. Some functions require a frame pointer (those that do dynamic allocas), but most do not.

Apr 30 2018, 11:01 AM
uweigand committed rL331203: [SystemZ] Handle SADDO et.al. and ADD/SUBCARRY.
[SystemZ] Handle SADDO et.al. and ADD/SUBCARRY
Apr 30 2018, 10:58 AM
uweigand committed rL331202: [SystemZ] Do not use glue to represent condition code dependencies.
[SystemZ] Do not use glue to represent condition code dependencies
Apr 30 2018, 10:56 AM
uweigand added a comment to D46232: [SystemZ, IPRA] determineCalleeSaves must always add return register and DP..

This looks like a generic problem to me, many platforms have registers that are reserved only in some functions but not others. If function A where a register is reserved calls function B where the register is not reserved, something must ensure that the register is preserved across the call to B. Usually, this is the case because such registers need to be callee-saved. But if function B is optimized via IPRA to not save the register, we have a problem ...

Apr 30 2018, 10:00 AM
uweigand committed rL331192: [SystemZ] Refactor some VT casts in DAG match patterns.
[SystemZ] Refactor some VT casts in DAG match patterns
Apr 30 2018, 8:56 AM
uweigand committed rL331191: [SystemZ] Improve handling of Select pseudo-instructions.
[SystemZ] Improve handling of Select pseudo-instructions
Apr 30 2018, 8:53 AM
uweigand added a comment to D46232: [SystemZ, IPRA] determineCalleeSaves must always add return register and DP..

Well, usually this is handled correctly by the isPhysRegModified call in SystemZFrameLowering::determineCalleeSaves. (The only reason why we even need the extra hasCalls check is that the call instruction is currently not at all stages modeled correctly to show that R14 is clobbered.)

Apr 30 2018, 4:57 AM
uweigand added a comment to D46232: [SystemZ, IPRA] determineCalleeSaves must always add return register and DP..

I do not believe we need to do anything special with R11. This is only special if it is used as frame pointer, in which case the code before already adds it as caller-saved:

Apr 30 2018, 2:56 AM

Apr 26 2018

uweigand added a comment to D46048: [branchfolding] When hoisting common code, remove kill flags from uses.

The test case looks fine to me, thanks.

Apr 26 2018, 1:58 AM

Apr 24 2018

uweigand committed rL330718: [SystemZ] Use preferred 16-byte function alignment.
[SystemZ] Use preferred 16-byte function alignment
Apr 24 2018, 7:07 AM

Apr 12 2018

uweigand created D45576: [RFC] Allow target to handle STRICT floating-point nodes.
Apr 12 2018, 9:54 AM

Apr 5 2018

uweigand added a comment to D45286: [MachineLICM] Re-enable hoisting of constant stores.

The test could perhaps be freed of alignment, tbaa operands and dso_local specifiers, but maybe it doesn't matter. Adding Uli as reviewer since the test is in SystemZ.

Apr 5 2018, 2:02 AM

Mar 15 2018

uweigand abandoned D43687: Improve merging of debug locations (fixes PR 36410).

The original bug in PR 36410 is now fixed in r327622, so I won't pursue this attempt any further.

Mar 15 2018, 11:30 AM
uweigand committed rL327622: [Debug] Retain both copies of debug intrinsics in HoistThenElseCodeToIf.
[Debug] Retain both copies of debug intrinsics in HoistThenElseCodeToIf
Mar 15 2018, 5:31 AM
uweigand closed D44312: Retain both sets of debug intrinsics in HoistThenElseCodeToIf (fixes PR 36410).
Mar 15 2018, 5:31 AM

Mar 14 2018

uweigand updated the diff for D44312: Retain both sets of debug intrinsics in HoistThenElseCodeToIf (fixes PR 36410).

Updated comment as requested.

Mar 14 2018, 8:51 AM

Mar 12 2018

uweigand updated the diff for D44312: Retain both sets of debug intrinsics in HoistThenElseCodeToIf (fixes PR 36410).

Would it be an option to only hoist debug info instructions where both variable and value match (just like the existing code), but then not attempt to merge the !dbg locations, but instead just hoist both copies of the original instructions with both the original locations (like in this patch)?

Mar 12 2018, 6:53 AM
uweigand requested review of D44312: Retain both sets of debug intrinsics in HoistThenElseCodeToIf (fixes PR 36410).
Mar 12 2018, 6:53 AM
uweigand added a comment to D44312: Retain both sets of debug intrinsics in HoistThenElseCodeToIf (fixes PR 36410).

I think you get approximately the right behavior if you just unconditionally ignore the debug info intrinsics (and never hoist them). Of course, the debug info gets dropped if the terminator is hoisted, but that seems okay; it's hard to handle that well in general anyway.

Mar 12 2018, 6:02 AM

Mar 9 2018

uweigand updated the diff for D43687: Improve merging of debug locations (fixes PR 36410).

Fix the local scope bug.

Mar 9 2018, 2:52 PM
uweigand added a comment to D43687: Improve merging of debug locations (fixes PR 36410).

Ah, I see. Looks like the problem is that the DIScope -> getScope() loop can actually leave the function scope and go up all the way to file scope. This is not a good idea here. I think this needs to be restricted to DILocalScope instead. I'll update the patch.

Mar 9 2018, 2:41 PM
uweigand reopened D44312: Retain both sets of debug intrinsics in HoistThenElseCodeToIf (fixes PR 36410).

Sorry, I didn't see your comment before checking this in. I believe you're right, and this can cause problems --- I've now reverted and will have another look.

Mar 9 2018, 2:03 PM
uweigand committed rL327176: Revert "[Debug] Retain both sets of debug intrinsics in HoistThenElseCodeToIf".
Revert "[Debug] Retain both sets of debug intrinsics in HoistThenElseCodeToIf"
Mar 9 2018, 2:02 PM
uweigand committed rL327175: [Debug] Retain both sets of debug intrinsics in HoistThenElseCodeToIf.
[Debug] Retain both sets of debug intrinsics in HoistThenElseCodeToIf
Mar 9 2018, 1:40 PM
uweigand closed D44312: Retain both sets of debug intrinsics in HoistThenElseCodeToIf (fixes PR 36410).
Mar 9 2018, 1:40 PM
uweigand updated the diff for D44312: Retain both sets of debug intrinsics in HoistThenElseCodeToIf (fixes PR 36410).

Updated test case as requested.

Mar 9 2018, 12:55 PM
uweigand added a comment to D43687: Improve merging of debug locations (fixes PR 36410).

OK. Maybe it is indeed a better choice to *not* merge dbg.value intrinsics, but just move over both sets from both sides of the "if" unchanged. In fact, if you have dbg.value intrinsics for *different* variables, they are currently just deleted -- moving them over should keep the (correct) information that *both* those variables now have the specfied value at the new location ...

Keeping both intrinsics is always a safe choice.

Mar 9 2018, 9:56 AM
uweigand created D44312: Retain both sets of debug intrinsics in HoistThenElseCodeToIf (fixes PR 36410).
Mar 9 2018, 9:56 AM
uweigand added a comment to D43687: Improve merging of debug locations (fixes PR 36410).

Yeah, I still need to look more closely at this - my hunch/feeling is that,
while/if this approach is necessary for dbg.value intrinsics, it'd still
produce wrong or at least suboptimal locations in general. :/ (& I'm not
even sure it's the best thing for the dbg.values either... not sure what
should be done about them - perhaps they shouldn't be merged, instead
duplicated & keep their respective locations)

Mar 9 2018, 7:41 AM
uweigand added a comment to D43687: Improve merging of debug locations (fixes PR 36410).

I just found out that this patch causes an assertion when compiling projects/compiler-rt/lib/sanitizer_common/tests/sanitizer_bvgraph_test.cc:
Assertion failed: (isa<X>(Val) && "cast<Ty>() argument of incompatible type!"), function cast, file ../include/llvm/Support/Casting.h, line 255.

Let me know if you need more detailed instructions to reproduce.

Mar 9 2018, 7:38 AM

Mar 8 2018

uweigand updated subscribers of D44118: [x86][AArch64] ask the target whether it has a vector blend instruction.

This doesn't seem to make any difference for SystemZ. Adding Hal, Kit, and Nemanja for PowerPC ...

Mar 8 2018, 8:54 AM

Mar 2 2018

uweigand committed rL326616: [SystemZ] Fix test cases after r326613.
[SystemZ] Fix test cases after r326613
Mar 2 2018, 1:25 PM
uweigand committed rL326613: [SystemZ] Allow LRV/STRV with volatile memory accesses.
[SystemZ] Allow LRV/STRV with volatile memory accesses
Mar 2 2018, 12:54 PM
uweigand committed rL326612: [SystemZ] Add support for anyregcc calling convention.
[SystemZ] Add support for anyregcc calling convention
Mar 2 2018, 12:42 PM
uweigand committed rL326611: [SystemZ] Support stackmaps and patchpoints.
[SystemZ] Support stackmaps and patchpoints
Mar 2 2018, 12:42 PM
uweigand committed rL326610: [SystemZ] Fix common-code users of stack size.
[SystemZ] Fix common-code users of stack size
Mar 2 2018, 12:41 PM
uweigand committed rL326609: [SystemZ] Support vector registers in inline asm.
[SystemZ] Support vector registers in inline asm
Mar 2 2018, 12:39 PM

Feb 27 2018

uweigand added a comment to D43687: Improve merging of debug locations (fixes PR 36410).

Have you considered a test case where there's a common scope within an inlinedAt location?

SP1 <- S1 <- IA1 <- L1
         \-- IA2 <- L2

I think it's reasonable for S1 to be the scope of the merged location - as much as it would be if IA1 and IA2 were not present. (SP = Subprogram, S = Scope, IA = InlinedAt, L = Location).

Feb 27 2018, 12:48 AM
uweigand added a comment to rL325973: [DebugInfo] Add remaining files to r325970.

This fixed the problem, thanks!

Feb 27 2018, 12:01 AM

Feb 26 2018

uweigand added a comment to rL325973: [DebugInfo] Add remaining files to r325970.

This breaks the SystemZ build bots, because the new debug-empty-source.s test case uses a "nop" instruction, which does not exist (using this name) on SystemZ.

Feb 26 2018, 10:28 AM
uweigand added a comment to D43687: Improve merging of debug locations (fixes PR 36410).

I'm not sure I understand why we should keep two debug intrinsics around in just this case, and not in other cases where the original IR has differing debug intrinsics (e.g. for different variables)?

Feb 26 2018, 6:36 AM

Feb 23 2018

uweigand created D43687: Improve merging of debug locations (fixes PR 36410).
Feb 23 2018, 11:02 AM

Feb 16 2018

uweigand added a comment to D39628: [DebugInfo] Unify logic to merge DILocations. NFC..

I've run into an internal compiler error that seems caused by this behavior of getMergedLocation: https://bugs.llvm.org/show_bug.cgi?id=36410

Feb 16 2018, 8:05 AM

Feb 14 2018

uweigand added a comment to D43235: [SchedModel] Complete models shouldn't match against itineraries when they don't use them (PR35639) (WIP).

Ah, so this is a problem when some processor models use itineraries and others don't? In that case, the change looks good to me ...

Feb 14 2018, 7:39 AM

Feb 9 2018

uweigand closed D43082: [ReleaseNotes] Add SystemZ target section.

Checked in as rev. 324726. Thanks!

Feb 9 2018, 2:31 AM
uweigand committed rL324726: [ReleaseNotes] Add SystemZ target section.
[ReleaseNotes] Add SystemZ target section
Feb 9 2018, 2:31 AM

Feb 8 2018

uweigand created D43082: [ReleaseNotes] Add SystemZ target section.
Feb 8 2018, 11:00 AM
uweigand added a comment to rC317589: SystemZ Swift TargetInfo: swifterror support in the backend is broken.

Hi Arnold, I only now noticed this commit. I wasn't aware there were any SystemZ back-end issues relating to swifterror support -- can you elaborate? Is there anything I should be doing in the back-end to fix this?

Feb 8 2018, 10:23 AM

Jan 31 2018

uweigand added a comment to D42100: Fix codegen of stores of vectors with non byte-sized elements..

Yes, that makes sense to me. It also agrees with the layout of bitfields on (all currently supported) big-endian platforms, so that's another argument for it.

Jan 31 2018, 5:15 AM

Jan 30 2018

uweigand added a comment to D42100: Fix codegen of stores of vectors with non byte-sized elements..

I've been talking to Jonas about the big-endian question, and I now think that there may actually be open questions about what the layout should be.

Jan 30 2018, 8:41 AM

Jan 22 2018

uweigand committed rL323129: [SystemZ] Fix bootstrap failure due to invalid DAG loop.
[SystemZ] Fix bootstrap failure due to invalid DAG loop
Jan 22 2018, 7:43 AM

Jan 19 2018

uweigand committed rL322989: [SystemZ] Prefer LOCHI over generating IPM sequences.
[SystemZ] Prefer LOCHI over generating IPM sequences
Jan 19 2018, 12:59 PM
uweigand abandoned D42115: [SystemZ] Condition code handling rework.
Jan 19 2018, 12:59 PM
uweigand committed rL322988: [SystemZ] Directly use CC result of compare-and-swap.
[SystemZ] Directly use CC result of compare-and-swap
Jan 19 2018, 12:56 PM
uweigand committed rL322987: [SystemZ] Rework IPM sequence generation.
[SystemZ] Rework IPM sequence generation
Jan 19 2018, 12:55 PM
uweigand committed rL322986: [SystemZ] Implement computeKnownBitsForTargetNode.
[SystemZ] Implement computeKnownBitsForTargetNode
Jan 19 2018, 12:52 PM
uweigand committed rL322985: [SelectionDAG] Teach computeKnownBits about ATOMIC_CMP_SWAP_WITH_SUCCESS….
[SelectionDAG] Teach computeKnownBits about ATOMIC_CMP_SWAP_WITH_SUCCESS…
Jan 19 2018, 12:49 PM
uweigand closed D42067: [SelectionDAG] Teach computeKnownBits about ATOMIC_CMP_SWAP_WITH_SUCCESS boolean return value.
Jan 19 2018, 12:49 PM
uweigand added a comment to rL322688: [SystemZ] Handle BRCTH branches correctly in SystemZLongBranch.cpp..

OK, I've now checked in both changes: enable long tests in the clang-s390x-linux builder, and run this large test only when long tests are enabled.

Jan 19 2018, 11:54 AM
uweigand committed rL322983: [SystemZ] Run branch-12.ll test only if long tests enabled.
[SystemZ] Run branch-12.ll test only if long tests enabled
Jan 19 2018, 11:53 AM
uweigand committed rL322981: [zorg] Enable run_long_tests for clang-s390x-linux builder.
[zorg] Enable run_long_tests for clang-s390x-linux builder
Jan 19 2018, 11:44 AM
uweigand added a comment to rL322688: [SystemZ] Handle BRCTH branches correctly in SystemZLongBranch.cpp..

I see that the build bot switched off long tests some time ago. I'll try to switch this on again for at least one of the SystemZ builders.

Jan 19 2018, 11:16 AM
uweigand added a comment to rL322688: [SystemZ] Handle BRCTH branches correctly in SystemZLongBranch.cpp..

We could also move the test to the Large directory with the other large-branch tests; this is not run by default but only when explicitly requested.

Jan 19 2018, 9:39 AM

Jan 18 2018

uweigand abandoned D38215: [SelectionDAG] Fix ATOMIC_CMP_SWAP_WITH_SUCCESS expansion.

Superseded by https://reviews.llvm.org/D41798

Jan 18 2018, 4:41 AM

Jan 16 2018

uweigand added a comment to D42067: [SelectionDAG] Teach computeKnownBits about ATOMIC_CMP_SWAP_WITH_SUCCESS boolean return value.

(I don't have a stand-alone test case for this, but this patch is required to make a test case pass that I'll be adding with a follow-on SystemZ back-end patch.)

Do have you the follow up patch available that you could make dependent on this one to show the diff?

Jan 16 2018, 10:28 AM
uweigand created D42115: [SystemZ] Condition code handling rework.
Jan 16 2018, 10:21 AM
uweigand committed rC322562: [SystemZ] Support vector registers with inline asm.
[SystemZ] Support vector registers with inline asm
Jan 16 2018, 7:40 AM
uweigand committed rL322562: [SystemZ] Support vector registers with inline asm.
[SystemZ] Support vector registers with inline asm
Jan 16 2018, 7:40 AM

Jan 15 2018

uweigand created D42067: [SelectionDAG] Teach computeKnownBits about ATOMIC_CMP_SWAP_WITH_SUCCESS boolean return value.
Jan 15 2018, 5:20 AM

Jan 8 2018

uweigand added a comment to D41798: [LegalizeDAG] Fix ATOMIC_CMP_SWAP_WITH_SUCCESS legalization..

Sure, the operands will have been any-extended from i8/i16 to i32. But the target knows that this happened, because it knows that this is a i8/i16 ATOMIC_CMP_SWAP (via getMemoryVT). In that case, if it actually requires a particular sign- or zero-extended version of the original i8/i16 constant, it can still do the appropriate in-reg extension from the any-extended i32 value it got.

Oh OK, of course. I mentioned in the original patch that I can certainly fix this in the PPC back end. I imagine that other back ends have done the same thing one way or another (or have a bug in this they don't know about just as PPC does). However, I ultimately don't see the utility in type-legalization ignoring how the target wants these inputs extended and doing an any-extend when the target is going to have to pick one down the line. And the target has already stated how it wants atomics extended in that TLI hook.

Jan 8 2018, 9:15 AM
uweigand added a comment to D41798: [LegalizeDAG] Fix ATOMIC_CMP_SWAP_WITH_SUCCESS legalization..

In any case, I'm not sure why the *input* to ATOMIC_CMP_SWAP should be extended in any way in the first place: the back-end gets told the actual type explicitly, and if any extension is necessary due to the particular way a back-end implements the operation, the back-end can just do that extension itself.

I'm not sure what you mean here. The i8 and i16 types are not legal on PPC. As such, type legalization will undoubtedly expand operands of those types. If you look at the debug dumps I posted above, you'll notice that pre-type-legalization, the inputs are i16 and post-type-legalization, they're i32.
I think the gist of the issue here is that type legalization should do the right thing - operation legalization is probably later than the optimal place for this. Although, I agree that the operands should probably be extended according to TLI.getExtendForAtomicOps().

Jan 8 2018, 8:44 AM
uweigand added a comment to D41798: [LegalizeDAG] Fix ATOMIC_CMP_SWAP_WITH_SUCCESS legalization..

This patch seems to be identical to the one I proposed here:
https://reviews.llvm.org/D38215

Jan 8 2018, 7:00 AM

Dec 6 2017

uweigand added a comment to D39720: [X86][AVX512] lowering kunpack intrinsic - llvm part.

This has caused build bot failures on SystemZ (and others):

Dec 6 2017, 9:59 AM

Dec 5 2017

uweigand committed rL319818: [SystemZ] Validate shifted compare value in adjustForTestUnderMask.
[SystemZ] Validate shifted compare value in adjustForTestUnderMask
Dec 5 2017, 11:42 AM
uweigand accepted D38128: Handle COPYs of physregs better (regalloc hints).

The SystemZ changes LGTM. Given that Quentin approved the common part, this should be fine.

Dec 5 2017, 2:27 AM

Dec 4 2017

uweigand accepted D40437: [SystemZ] set 'guessInstructionProperties = 0' and add flags as needed..

For consistency, the other variants of the tbegin intrinsic (int_s390_tbegin and int_s390_tbegin_nofloat) should likewise get the IntrWriteMem flag.

Dec 4 2017, 9:38 AM
uweigand added a comment to D40437: [SystemZ] set 'guessInstructionProperties = 0' and add flags as needed..

The TwoAddress fix has been committed, but the intrinsics flags fixing seem to require some work so I am not sure if we really want to wait with these changes until that is done?

Dec 4 2017, 5:37 AM

Dec 1 2017

uweigand added a comment to D40437: [SystemZ] set 'guessInstructionProperties = 0' and add flags as needed..

These changes (at the code level) look reasonable to me now. (There is one mayStore on a STOC instruction that is probably still superfluous now.)

Dec 1 2017, 6:03 AM

Nov 30 2017

uweigand added a comment to D40437: [SystemZ] set 'guessInstructionProperties = 0' and add flags as needed..

Thanks for providing the diff. I noticed two problems, also annotated in the detailed comment:

  • You added mayLoad to a number of load-conditional instructions that actually don't access memory (reg-reg or reg-immed instructions)
  • The guarded-storage instructions should really have the side effects flag (I overlooked this in my previous review).
Nov 30 2017, 10:16 AM

Nov 29 2017

uweigand accepted D40542: [SystemZ] Bugfix in adjustSubwordCmp().

This LGTM now, thanks!

Nov 29 2017, 9:48 AM

Nov 28 2017

uweigand requested changes to D40542: [SystemZ] Bugfix in adjustSubwordCmp().

Actually, I take it back, sorry ...

Nov 28 2017, 12:11 PM
uweigand accepted D40542: [SystemZ] Bugfix in adjustSubwordCmp().

Yes, this looks good to me. I agree that lowerBITCAST probably needs to be fixed as well.

Nov 28 2017, 11:54 AM