uweigand (Ulrich Weigand)
User

Projects

User does not belong to any projects.

User Details

User Since
Apr 14 2013, 11:48 AM (253 w, 3 d)

Recent Activity

Fri, Feb 16

uweigand added a comment to D39628: [DebugInfo] Unify logic to merge DILocations. NFC..

I've run into an internal compiler error that seems caused by this behavior of getMergedLocation: https://bugs.llvm.org/show_bug.cgi?id=36410

Fri, Feb 16, 8:05 AM

Wed, Feb 14

uweigand added a comment to D43235: [SchedModel] Complete models shouldn't match against itineraries when they don't use them (PR35639) (WIP).

Ah, so this is a problem when some processor models use itineraries and others don't? In that case, the change looks good to me ...

Wed, Feb 14, 7:39 AM

Fri, Feb 9

uweigand closed D43082: [ReleaseNotes] Add SystemZ target section.

Checked in as rev. 324726. Thanks!

Fri, Feb 9, 2:31 AM
uweigand committed rL324726: [ReleaseNotes] Add SystemZ target section.
[ReleaseNotes] Add SystemZ target section
Fri, Feb 9, 2:31 AM

Thu, Feb 8

uweigand created D43082: [ReleaseNotes] Add SystemZ target section.
Thu, Feb 8, 11:00 AM
uweigand added a comment to rC317589: SystemZ Swift TargetInfo: swifterror support in the backend is broken.

Hi Arnold, I only now noticed this commit. I wasn't aware there were any SystemZ back-end issues relating to swifterror support -- can you elaborate? Is there anything I should be doing in the back-end to fix this?

Thu, Feb 8, 10:23 AM

Wed, Jan 31

uweigand added a comment to D42100: Fix codegen of stores of vectors with non byte-sized elements..

Yes, that makes sense to me. It also agrees with the layout of bitfields on (all currently supported) big-endian platforms, so that's another argument for it.

Wed, Jan 31, 5:15 AM

Tue, Jan 30

uweigand added a comment to D42100: Fix codegen of stores of vectors with non byte-sized elements..

I've been talking to Jonas about the big-endian question, and I now think that there may actually be open questions about what the layout should be.

Tue, Jan 30, 8:41 AM

Jan 22 2018

uweigand committed rL323129: [SystemZ] Fix bootstrap failure due to invalid DAG loop.
[SystemZ] Fix bootstrap failure due to invalid DAG loop
Jan 22 2018, 7:43 AM

Jan 19 2018

uweigand committed rL322989: [SystemZ] Prefer LOCHI over generating IPM sequences.
[SystemZ] Prefer LOCHI over generating IPM sequences
Jan 19 2018, 12:59 PM
uweigand abandoned D42115: [SystemZ] Condition code handling rework.
Jan 19 2018, 12:59 PM
uweigand committed rL322988: [SystemZ] Directly use CC result of compare-and-swap.
[SystemZ] Directly use CC result of compare-and-swap
Jan 19 2018, 12:56 PM
uweigand committed rL322987: [SystemZ] Rework IPM sequence generation.
[SystemZ] Rework IPM sequence generation
Jan 19 2018, 12:55 PM
uweigand committed rL322986: [SystemZ] Implement computeKnownBitsForTargetNode.
[SystemZ] Implement computeKnownBitsForTargetNode
Jan 19 2018, 12:52 PM
uweigand committed rL322985: [SelectionDAG] Teach computeKnownBits about ATOMIC_CMP_SWAP_WITH_SUCCESS….
[SelectionDAG] Teach computeKnownBits about ATOMIC_CMP_SWAP_WITH_SUCCESS…
Jan 19 2018, 12:49 PM
uweigand closed D42067: [SelectionDAG] Teach computeKnownBits about ATOMIC_CMP_SWAP_WITH_SUCCESS boolean return value.
Jan 19 2018, 12:49 PM
uweigand added a comment to rL322688: [SystemZ] Handle BRCTH branches correctly in SystemZLongBranch.cpp..

OK, I've now checked in both changes: enable long tests in the clang-s390x-linux builder, and run this large test only when long tests are enabled.

Jan 19 2018, 11:54 AM
uweigand committed rL322983: [SystemZ] Run branch-12.ll test only if long tests enabled.
[SystemZ] Run branch-12.ll test only if long tests enabled
Jan 19 2018, 11:53 AM
uweigand committed rL322981: [zorg] Enable run_long_tests for clang-s390x-linux builder.
[zorg] Enable run_long_tests for clang-s390x-linux builder
Jan 19 2018, 11:44 AM
uweigand added a comment to rL322688: [SystemZ] Handle BRCTH branches correctly in SystemZLongBranch.cpp..

I see that the build bot switched off long tests some time ago. I'll try to switch this on again for at least one of the SystemZ builders.

Jan 19 2018, 11:16 AM
uweigand added a comment to rL322688: [SystemZ] Handle BRCTH branches correctly in SystemZLongBranch.cpp..

We could also move the test to the Large directory with the other large-branch tests; this is not run by default but only when explicitly requested.

Jan 19 2018, 9:39 AM

Jan 18 2018

uweigand abandoned D38215: [SelectionDAG] Fix ATOMIC_CMP_SWAP_WITH_SUCCESS expansion.

Superseded by https://reviews.llvm.org/D41798

Jan 18 2018, 4:41 AM

Jan 16 2018

uweigand added a comment to D42067: [SelectionDAG] Teach computeKnownBits about ATOMIC_CMP_SWAP_WITH_SUCCESS boolean return value.

(I don't have a stand-alone test case for this, but this patch is required to make a test case pass that I'll be adding with a follow-on SystemZ back-end patch.)

Do have you the follow up patch available that you could make dependent on this one to show the diff?

Jan 16 2018, 10:28 AM
uweigand created D42115: [SystemZ] Condition code handling rework.
Jan 16 2018, 10:21 AM
uweigand committed rC322562: [SystemZ] Support vector registers with inline asm.
[SystemZ] Support vector registers with inline asm
Jan 16 2018, 7:40 AM
uweigand committed rL322562: [SystemZ] Support vector registers with inline asm.
[SystemZ] Support vector registers with inline asm
Jan 16 2018, 7:40 AM

Jan 15 2018

uweigand created D42067: [SelectionDAG] Teach computeKnownBits about ATOMIC_CMP_SWAP_WITH_SUCCESS boolean return value.
Jan 15 2018, 5:20 AM

Jan 8 2018

uweigand added a comment to D41798: [LegalizeDAG] Fix ATOMIC_CMP_SWAP_WITH_SUCCESS legalization..

Sure, the operands will have been any-extended from i8/i16 to i32. But the target knows that this happened, because it knows that this is a i8/i16 ATOMIC_CMP_SWAP (via getMemoryVT). In that case, if it actually requires a particular sign- or zero-extended version of the original i8/i16 constant, it can still do the appropriate in-reg extension from the any-extended i32 value it got.

Oh OK, of course. I mentioned in the original patch that I can certainly fix this in the PPC back end. I imagine that other back ends have done the same thing one way or another (or have a bug in this they don't know about just as PPC does). However, I ultimately don't see the utility in type-legalization ignoring how the target wants these inputs extended and doing an any-extend when the target is going to have to pick one down the line. And the target has already stated how it wants atomics extended in that TLI hook.

Jan 8 2018, 9:15 AM
uweigand added a comment to D41798: [LegalizeDAG] Fix ATOMIC_CMP_SWAP_WITH_SUCCESS legalization..

In any case, I'm not sure why the *input* to ATOMIC_CMP_SWAP should be extended in any way in the first place: the back-end gets told the actual type explicitly, and if any extension is necessary due to the particular way a back-end implements the operation, the back-end can just do that extension itself.

I'm not sure what you mean here. The i8 and i16 types are not legal on PPC. As such, type legalization will undoubtedly expand operands of those types. If you look at the debug dumps I posted above, you'll notice that pre-type-legalization, the inputs are i16 and post-type-legalization, they're i32.
I think the gist of the issue here is that type legalization should do the right thing - operation legalization is probably later than the optimal place for this. Although, I agree that the operands should probably be extended according to TLI.getExtendForAtomicOps().

Jan 8 2018, 8:44 AM
uweigand added a comment to D41798: [LegalizeDAG] Fix ATOMIC_CMP_SWAP_WITH_SUCCESS legalization..

This patch seems to be identical to the one I proposed here:
https://reviews.llvm.org/D38215

Jan 8 2018, 7:00 AM

Dec 6 2017

uweigand added a comment to D39720: [X86][AVX512] lowering kunpack intrinsic - llvm part.

This has caused build bot failures on SystemZ (and others):

Dec 6 2017, 9:59 AM

Dec 5 2017

uweigand committed rL319818: [SystemZ] Validate shifted compare value in adjustForTestUnderMask.
[SystemZ] Validate shifted compare value in adjustForTestUnderMask
Dec 5 2017, 11:42 AM
uweigand accepted D38128: Handle COPYs of physregs better (regalloc hints).

The SystemZ changes LGTM. Given that Quentin approved the common part, this should be fine.

Dec 5 2017, 2:27 AM

Dec 4 2017

uweigand accepted D40437: [SystemZ] set 'guessInstructionProperties = 0' and add flags as needed..

For consistency, the other variants of the tbegin intrinsic (int_s390_tbegin and int_s390_tbegin_nofloat) should likewise get the IntrWriteMem flag.

Dec 4 2017, 9:38 AM
uweigand added a comment to D40437: [SystemZ] set 'guessInstructionProperties = 0' and add flags as needed..

The TwoAddress fix has been committed, but the intrinsics flags fixing seem to require some work so I am not sure if we really want to wait with these changes until that is done?

Dec 4 2017, 5:37 AM

Dec 1 2017

uweigand added a comment to D40437: [SystemZ] set 'guessInstructionProperties = 0' and add flags as needed..

These changes (at the code level) look reasonable to me now. (There is one mayStore on a STOC instruction that is probably still superfluous now.)

Dec 1 2017, 6:03 AM

Nov 30 2017

uweigand added a comment to D40437: [SystemZ] set 'guessInstructionProperties = 0' and add flags as needed..

Thanks for providing the diff. I noticed two problems, also annotated in the detailed comment:

  • You added mayLoad to a number of load-conditional instructions that actually don't access memory (reg-reg or reg-immed instructions)
  • The guarded-storage instructions should really have the side effects flag (I overlooked this in my previous review).
Nov 30 2017, 10:16 AM

Nov 29 2017

uweigand accepted D40542: [SystemZ] Bugfix in adjustSubwordCmp().

This LGTM now, thanks!

Nov 29 2017, 9:48 AM

Nov 28 2017

uweigand requested changes to D40542: [SystemZ] Bugfix in adjustSubwordCmp().

Actually, I take it back, sorry ...

Nov 28 2017, 12:11 PM
uweigand accepted D40542: [SystemZ] Bugfix in adjustSubwordCmp().

Yes, this looks good to me. I agree that lowerBITCAST probably needs to be fixed as well.

Nov 28 2017, 11:54 AM

Nov 24 2017

uweigand added a comment to D40437: [SystemZ] set 'guessInstructionProperties = 0' and add flags as needed..

Thanks for looking at and reviewing those flags!

Nov 24 2017, 10:58 AM

Nov 14 2017

uweigand committed rL318187: [SystemZ] Do not crash when selecting an OR of two constants.
[SystemZ] Do not crash when selecting an OR of two constants
Nov 14 2017, 12:03 PM
uweigand committed rL318177: [SystemZ] Fix invalid codegen using RISBMux on out-of-range bits.
[SystemZ] Fix invalid codegen using RISBMux on out-of-range bits
Nov 14 2017, 11:22 AM

Nov 9 2017

uweigand committed rL317807: [SystemZ] Add support for the "o" inline asm constraint.
[SystemZ] Add support for the "o" inline asm constraint
Nov 9 2017, 8:32 AM
uweigand added inline comments to D36795: [SystemZ] Increase number of LOCRs emitted by passing regalloc hints.
Nov 9 2017, 7:39 AM
uweigand added a comment to D36795: [SystemZ] Increase number of LOCRs emitted by passing regalloc hints.

See two minor inline comments. Otherwise, this now LGTM. Thanks!

Nov 9 2017, 7:03 AM

Oct 25 2017

uweigand added a comment to D38128: Handle COPYs of physregs better (regalloc hints).

Right. Always coalescing with the copy source of course extends the live range of the source, and thus increases overall register pressure. So it's not a good idea to do this unconditionally.

Oct 25 2017, 5:00 AM

Oct 24 2017

uweigand added a comment to D38894: [RFC][Tablegen] Add CCIfSplitFrom and CCPassIndirectBySamePointer Calling Convention Interfaces.

Well, the special handling of i128 arguments as such is certainly needed on SystemZ, this is why I added it in the first place in rev. 261325.

Oct 24 2017, 8:20 AM
uweigand added a comment to D38128: Handle COPYs of physregs better (regalloc hints).

Would any of these be an improvement worth trying, then?

(1) If the source phys-reg is contained in the regclass of an immediately following instruction using the vreg, then increase the priority of the hint of that phys-reg.
(2) A simpler alternative would be to simply prefer phys-reg sources more than phys-reg dest-regs (if weight is equal). That would catch all the cases of (1).

Oct 24 2017, 5:22 AM
uweigand added a comment to D38128: Handle COPYs of physregs better (regalloc hints).

Not sure if it is clear that coalescing with %R8 is generally better than %R0.

What do you mean, it isn't clear? Is the performance problem not clear? Or do you mean you're not sure how to detect this situation when you're sorting the hints?

I guess given your initial wording ("not great"), I was not sure if this is general and serious enough so that we really want to add an additional heuristic like "COPY to compare" on top of the sorting by weight. I suppose then there should be a flag like HasCompareUser which is then the tie-breaker when the weight is the same, or?

Oct 24 2017, 2:31 AM

Oct 17 2017

uweigand added a comment to D38894: [RFC][Tablegen] Add CCIfSplitFrom and CCPassIndirectBySamePointer Calling Convention Interfaces.

The SystemZ parts LGTM. I agree that it's much preferable to have this handled in common code than in the back end ...

Oct 17 2017, 4:23 AM

Oct 6 2017

uweigand accepted D37977: SystemZ: mischeduler enabled.

OK, this now looks all good to me, thanks!

Oct 6 2017, 5:04 AM

Oct 4 2017

uweigand added a comment to D37977: SystemZ: mischeduler enabled.

Have you looked into what's going on with the vec-cmp-cmp-logic-select.ll and vec-cmpsel.ll test cases? I see several of the tests now adding spill/reload code when they didn't before -- ideally this shouldn't happen if the misched does proper register pressure analysis ...

Oct 4 2017, 7:17 AM

Sep 29 2017

uweigand accepted D37899: [SystemZ] Implement shouldCoalesce() to help regalloc to avoid running out of registers with GR128 regs.

Given that:

  • Quentin approved the interface change
  • The SystemZ parts now look good to me
  • We have performance runs confirming this does not introduce regressions
Sep 29 2017, 5:22 AM
uweigand added a comment to D38215: [SelectionDAG] Fix ATOMIC_CMP_SWAP_WITH_SUCCESS expansion.

Note that as of r314428 I've switched the SystemZ back-end to use a custom expander for ATOMIC_CMP_SWAP_WITH_SUCCESS, which makes use of the condition code value set by the compare-and-swap hardware instruction to completely omit any extra comparison.

Sep 29 2017, 2:29 AM

Sep 28 2017

uweigand committed rL314465: [SystemZ] Fix fall-out from r314428.
[SystemZ] Fix fall-out from r314428
Sep 28 2017, 3:10 PM
uweigand committed rL314428: [SystemZ] Custom-expand ATOMIC_CMP_AND_SWAP_WITH_SUCCESS.
[SystemZ] Custom-expand ATOMIC_CMP_AND_SWAP_WITH_SUCCESS
Sep 28 2017, 9:24 AM
uweigand added a comment to D38128: Handle COPYs of physregs better (regalloc hints).

Sorry - with the update this fails again, so no update...
Should we make a point to try to handle this somehow?

Sep 28 2017, 7:57 AM

Sep 27 2017

uweigand added a comment to D38128: Handle COPYs of physregs better (regalloc hints).

I have attempted to extend the common code instead to handle this. While doing so, the multiple hints have become sorted by weight as requested.

Sep 27 2017, 4:50 AM
uweigand added a comment to D37899: [SystemZ] Implement shouldCoalesce() to help regalloc to avoid running out of registers with GR128 regs.

Considering other types of code with more heavy use of GR128, would we perhaps want to also give a count for a virtual register definition of a GR128? That should avoid being too optimistic when several of these overlap.

Sep 27 2017, 4:26 AM

Sep 25 2017

uweigand added a comment to D37899: [SystemZ] Implement shouldCoalesce() to help regalloc to avoid running out of registers with GR128 regs.

Not sure about the magic numbers -- does it matter whether we stop after 10 MIs? What kind of compile-time performance impact would it have to scan the whole range?

I don't know, I just thought that we basically wanted to at least handle the type of case where there is a GR128 op and then some subreg COPY(s) very close. My reason for having the instruction limit is that we can't know the register pressure, but it seems safe to think that tiny live ranges checked in this way could be coalesced.

Sep 25 2017, 8:16 AM
uweigand added a comment to D37899: [SystemZ] Implement shouldCoalesce() to help regalloc to avoid running out of registers with GR128 regs.

Patch updated.

I looked into making i128 legal when I recently added the 128-bit atomics, but in the end decided against it. Not only would it be a lot of work (since you basically would have to reimplement all the operations on i128 that are currently handled by common code), but it is not clear that we actually always want it.

I thought one could get away with adding calling setOperationAction(..., i128, Expand) on all operations that we do not support...

Sep 25 2017, 5:30 AM
uweigand created D38215: [SelectionDAG] Fix ATOMIC_CMP_SWAP_WITH_SUCCESS expansion.
Sep 25 2017, 5:02 AM

Sep 22 2017

uweigand added a comment to D37899: [SystemZ] Implement shouldCoalesce() to help regalloc to avoid running out of registers with GR128 regs.

I looked into making i128 legal when I recently added the 128-bit atomics, but in the end decided against it. Not only would it be a lot of work (since you basically would have to reimplement all the operations on i128 that are currently handled by common code), but it is not clear that we actually always want it.

Sep 22 2017, 6:29 AM
uweigand committed rL313977: [zorg] Add email notification for new SystemZ builders.
[zorg] Add email notification for new SystemZ builders
Sep 22 2017, 4:32 AM
uweigand closed D33777: [ZORG] Add email notification for new SystemZ builders by committing rL313977: [zorg] Add email notification for new SystemZ builders.
Sep 22 2017, 4:32 AM · Restricted Project

Sep 21 2017

uweigand added a comment to D38058: [llvm-readobj] Teach readobj to output .res (WindowsResource)..

Looks like this broke the s390x build bots:
http://lab.llvm.org:8011/builders/clang-s390x-linux/builds/11433/steps/ninja%20check%201/logs/FAIL%3A%20LLVM%3A%3Ares-resources.test

Sep 21 2017, 9:47 AM
uweigand added a comment to D38128: Handle COPYs of physregs better (regalloc hints).

This doesn't look in any way target-specific, so I'm wondering why common code doesn't already do that. It seems there is already support for extracting hints from copies in calculateSpillWeightAndHint, why is this not working here?

Sep 21 2017, 9:41 AM
uweigand accepted D38076: SystemZ: improved optimizeCompareZero().

LGTM, thanks!

Sep 21 2017, 6:27 AM
uweigand added a comment to D38076: SystemZ: improved optimizeCompareZero().

Test case reduced.

Also found out that the first check in resultTests() must not be used in a forward search, as in that case the reg is actually taking on a new value in between the compare and the CC-user. For this purpse, I added the 'BeforeCompare' parameter to the function.

Sep 21 2017, 5:47 AM

Sep 20 2017

uweigand added a comment to D36795: [SystemZ] Increase number of LOCRs emitted by passing regalloc hints.

Looks basically good to me, just a couple of cosmetic comments inline.

Sep 20 2017, 10:17 AM
uweigand added a comment to D38076: SystemZ: improved optimizeCompareZero().

The code change looks good to me, but please try to simplify the .mir test case following the suggestions outlined here: https://llvm.org/docs/MIRLangRef.html#simplifying-mir-files

Sep 20 2017, 10:07 AM

Sep 19 2017

uweigand committed rL313669: [SystemZ] Fix truncstore + bswap codegen bug.
[SystemZ] Fix truncstore + bswap codegen bug
Sep 19 2017, 1:51 PM
uweigand added a comment to D37899: [SystemZ] Implement shouldCoalesce() to help regalloc to avoid running out of registers with GR128 regs.

Test cases updated. As expected, there are now more register moves whenever the coalescer is disabled by this patch.

Sep 19 2017, 8:48 AM

Sep 18 2017

uweigand added a comment to D37977: SystemZ: mischeduler enabled.

Hmm. Some of those changes look like regressions, we should try to at least understand why this happens. Maybe there's a simple way to fix those again.

Sep 18 2017, 8:20 AM

Sep 17 2017

uweigand committed rL313480: [compiler-rt] Fix build break after r313277 on s390x.
[compiler-rt] Fix build break after r313277 on s390x
Sep 17 2017, 2:40 AM

Sep 15 2017

uweigand added a comment to D37898: [TargetLowering] Correctly track NumFixedArgs field of CallLoweringInfo.

As far as SystemZ is concerned, this change should be safe. In our ABI only vector arguments are affected by the IsFixed flags, and there shouldn't be any libcalls with vector arguments.

Sep 15 2017, 10:09 AM
uweigand added a comment to D37899: [SystemZ] Implement shouldCoalesce() to help regalloc to avoid running out of registers with GR128 regs.

Thanks for trying this suggestion! See a couple of inline comments.

Sep 15 2017, 9:31 AM

Aug 21 2017

uweigand added a comment to D36795: [SystemZ] Increase number of LOCRs emitted by passing regalloc hints.

I think that to handle those cases we would have to constrain regclasses somehow after coalescing.

Aug 21 2017, 7:32 AM

Aug 18 2017

uweigand added a comment to D36795: [SystemZ] Increase number of LOCRs emitted by passing regalloc hints.

The getRegAllocationHints implementation makes sense to me. However, I'm wondering if we shouldn't also check for the *destination* register -- you only force one source register to the same class as the other source register, but I think we should check whether *any* of the three registers is already allocated, and then always force the other two to the same class.

I tried this (see updated patch below), but found that it gave the identical results (with or without the DestMO lines which I commented out). My guess is that the the TwoAddress pass is already doing this job in processTiedPairs() by

Aug 18 2017, 3:39 AM

Aug 17 2017

uweigand added a comment to D36795: [SystemZ] Increase number of LOCRs emitted by passing regalloc hints.

Oh, and just comment on this:

Aug 17 2017, 7:32 AM
uweigand added a comment to D36795: [SystemZ] Increase number of LOCRs emitted by passing regalloc hints.

The getRegAllocationHints implementation makes sense to me. However, I'm wondering if we shouldn't also check for the *destination* register -- you only force one source register to the same class as the other source register, but I think we should check whether *any* of the three registers is already allocated, and then always force the other two to the same class.

Aug 17 2017, 7:28 AM

Aug 4 2017

uweigand committed rL310094: [SystemZ] Add support for 128-bit atomic load/store/cmpxchg.
[SystemZ] Add support for 128-bit atomic load/store/cmpxchg
Aug 4 2017, 11:58 AM
uweigand committed rL310093: [SystemZ] Eliminate unnecessary serialization operations.
[SystemZ] Eliminate unnecessary serialization operations
Aug 4 2017, 11:54 AM

Aug 2 2017

uweigand added a comment to D36173: [InstSimplify] Perform known-bits analysis in SimplifyAndInst.

Yes, this benefits a real workload. This happens when the improved SimplifyAndInst now simplifies a partial result, which then allows *further* simplification in InstSimplify, in my case in ThreadBinOpOverPHI. This latter simplification doesn't seem to be done anywhere else.

Aug 2 2017, 10:21 AM
uweigand added a comment to D36172: [InstCombine] Improve profitability check for folding PHI args.
Aug 2 2017, 9:04 AM
uweigand added a comment to D36172: [InstCombine] Improve profitability check for folding PHI args.

The testcase is extracted from a real-word program. On that program, the transformation (moving some of those operations out of a hot loop) is a significant overall win (about 10% improved performance of the whole program). I agree that this application is a quite special case -- this patch doesn't make much difference to the overall performance of the platform in general.

Aug 2 2017, 8:38 AM
uweigand added a comment to D36173: [InstSimplify] Perform known-bits analysis in SimplifyAndInst.

Sure, it would be possible to add the corresponding transforms to SimplifyOrInst. However, I have been unable to trigger this at all, with any workload I've tried. Should I still add the transforms anyway?

Aug 2 2017, 6:53 AM
uweigand added a comment to D36172: [InstCombine] Improve profitability check for folding PHI args.

Well, I'd be fine with NewGVN doing this transform. But at least right now, it doesn't appear to be doing so -- running the new test case (@phi_multi in phi.ll) through "opt -O2 --enable-newgvn" does not move the zext/or out of the loop. Is this supposed to happen?

Aug 2 2017, 6:12 AM
uweigand added a comment to D35053: Improve post-RA scheduling for SystemZ.

OK, the SystemZ parts look good to me now. Thanks!

Aug 2 2017, 2:01 AM

Aug 1 2017

uweigand updated subscribers of D36173: [InstSimplify] Perform known-bits analysis in SimplifyAndInst.
Aug 1 2017, 1:41 PM
uweigand created D36175: [InstSimplify] Accept more loop-invariant cases in ThreadBinOpOverPHI.
Aug 1 2017, 1:40 PM
uweigand created D36173: [InstSimplify] Perform known-bits analysis in SimplifyAndInst.
Aug 1 2017, 1:31 PM
uweigand created D36172: [InstCombine] Improve profitability check for folding PHI args.
Aug 1 2017, 1:24 PM
uweigand added a comment to D35053: Improve post-RA scheduling for SystemZ.

Thanks! The SystemZ part now looks very reasonable. Just a couple of final, really just cosmetic comments inline.

Aug 1 2017, 4:50 AM

Jul 31 2017

uweigand added inline comments to D35053: Improve post-RA scheduling for SystemZ.
Jul 31 2017, 8:44 AM
uweigand added a comment to D35053: Improve post-RA scheduling for SystemZ.

A couple more comments on the SystemZ parts, see inline comments.

Jul 31 2017, 5:42 AM

Jul 28 2017

uweigand added a comment to D35933: Eliminate TargetTransformInfo::isFoldableMemAccess().

I'm not sure about the common code changes -- what does simply calling isLegalAddressingMode instead of isFoldableMemAccessOffset do to targets that have not yet added support to do instruction-specific checks to the former? I'd assume you'd see performance regressions there.

Maybe this transition should be done on a target-by-target basis.

SystemZ was the only user of isFoldableMemAccessOffset() except for any out-of-tree targets.

Jul 28 2017, 4:46 AM

Jul 27 2017

uweigand added a comment to D35933: Eliminate TargetTransformInfo::isFoldableMemAccess().

The SystemZ part looks correct to me, see the inline comment for a style/readability suggestion.

Jul 27 2017, 6:06 AM

Jul 24 2017

uweigand added a comment to D35053: Improve post-RA scheduling for SystemZ.

It does indeed seem to be necessary to get an explicit guarantee from common code that scheduling happens in a particular order, otherwise our back-end code may silently break if common code logic happens to change in the future.

Jul 24 2017, 8:11 AM