
jonpa (Jonas Paulsson)
User

Projects

User does not belong to any projects.

User Details

User Since
Feb 24 2015, 1:18 AM (319 w, 5 d)

Recent Activity

Yesterday

jonpa updated the summary of D100242: [SystemZ / TII] Peephole optimization of zero-extension of i1..
Sat, Apr 10, 3:12 AM · Restricted Project
jonpa requested review of D100242: [SystemZ / TII] Peephole optimization of zero-extension of i1..
Sat, Apr 10, 3:12 AM · Restricted Project

Fri, Apr 9

jonpa updated the diff for D98905: [SystemZ] Reuse known zeros/ones after zero-extension of i1..

Patch improved:

  • Avoid transforming single-use loads - it is probably better to do a compare with memory.
  • Add an AssertZext node on the reused register so that it is known that this is an i1.
  • Avoid some cases where i32 setcc is zero-extended.
  • Option to try single user only
Fri, Apr 9, 4:59 AM · Restricted Project

Wed, Apr 7

jonpa requested review of D100039: [SystemZ] Isel cleanup pass: Reuse known zeros/ones after zero-extension of i1..
Wed, Apr 7, 8:20 AM · Restricted Project

Mon, Apr 5

jonpa added reviewers for D98230: [LSR] Add reconciliation of unfoldable offsets: hfinkel, wmi, lebedev.ri, Florian.

This could be enabled for SystemZ only for now, but review is still needed...

Mon, Apr 5, 2:40 PM · Restricted Project

Tue, Mar 23

jonpa added a comment to D98230: [LSR] Add reconciliation of unfoldable offsets.

Grouping huge offsets (like foldable ones) really should be a general win for most targets, although this patch is only enabled on SystemZ for now.

Tue, Mar 23, 1:47 PM · Restricted Project

Thu, Mar 18

jonpa requested review of D98905: [SystemZ] Reuse known zeros/ones after zero-extension of i1..
Thu, Mar 18, 4:02 PM · Restricted Project

Tue, Mar 16

jonpa added inline comments to D98230: [LSR] Add reconciliation of unfoldable offsets.
Tue, Mar 16, 1:03 PM · Restricted Project

Mon, Mar 15

jonpa committed rG9cfd301ec8b5: [SystemZ] Test for isinf and isfinite in testFPKind(). (authored by jonpa).
Mon, Mar 15, 2:04 PM
jonpa closed D97901: [SystemZ] Test for infinity in testFPKind()..
Mon, Mar 15, 2:04 PM · Restricted Project
jonpa updated the diff for D97901: [SystemZ] Test for infinity in testFPKind()..

This "invert" logic doesn't look correct. "isfinite" and "isinf" both need to return false on NaNs. I think you should just drop the invert logic and use a TDC mask of 0xFC0 (zero, normal, or subnormal) to implement "isfinite".
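The NaN problem with the "invert" approach can be reproduced in plain C++, without any SystemZ specifics. The following is only an illustrative sketch; the helper names isFiniteByInversion and isFiniteByClass are hypothetical and do not come from the patch.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: derives "isfinite" by inverting "isinf".
// This is wrong for NaN, because a NaN is not an infinity either.
inline bool isFiniteByInversion(double x) { return !std::isinf(x); }

// Hypothetical helper: tests the value class directly, mirroring the
// "zero, normal, or subnormal" selection suggested in the review.
inline bool isFiniteByClass(double x) {
  const int c = std::fpclassify(x);
  return c == FP_ZERO || c == FP_NORMAL || c == FP_SUBNORMAL;
}
```

For a NaN input, isFiniteByInversion returns true while isFiniteByClass returns false, which is the discrepancy the review comment points out.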

Mon, Mar 15, 11:59 AM · Restricted Project
jonpa added a comment to D98230: [LSR] Add reconciliation of unfoldable offsets.

ping!

Mon, Mar 15, 11:28 AM · Restricted Project
jonpa updated the diff for D98230: [LSR] Add reconciliation of unfoldable offsets.

Patch rebased.

Mon, Mar 15, 11:28 AM · Restricted Project

Mar 11 2021

jonpa committed rG5908c7ca41bd: [libFuzzer] Add attribute noinline on Fuzzer::ExecuteCallback(). (authored by jonpa).
Mar 11 2021, 7:08 PM
jonpa closed D97975: [libFuzzer] add attribute noinline on Fuzzer::ExecuteCallback().
Mar 11 2021, 7:08 PM · Restricted Project

Mar 10 2021

jonpa updated the diff for D93734: [LoopDeletion] Insert an early exit from dead path in loop.

Patch updated to also run with the new pass manager.

Mar 10 2021, 3:01 PM · Restricted Project

Mar 9 2021

jonpa updated subscribers of D98230: [LSR] Add reconciliation of unfoldable offsets.
Mar 9 2021, 8:52 AM · Restricted Project

Mar 8 2021

jonpa requested review of D98230: [LSR] Add reconciliation of unfoldable offsets.
Mar 8 2021, 6:29 PM · Restricted Project
jonpa added a comment to D97975: [libFuzzer] add attribute noinline on Fuzzer::ExecuteCallback().

IMO it seems better to disable the test on s390x rather than to add this noinline attribute in the sources. WDYT?

Mar 8 2021, 10:58 AM · Restricted Project

Mar 6 2021

jonpa added reviewers for D97975: [libFuzzer] add attribute noinline on Fuzzer::ExecuteCallback(): morehouse, dokyungs.
Mar 6 2021, 11:24 AM · Restricted Project
jonpa updated the diff for D97975: [libFuzzer] add attribute noinline on Fuzzer::ExecuteCallback().

clang-format

Mar 6 2021, 11:10 AM · Restricted Project
jonpa updated the diff for D97975: [libFuzzer] add attribute noinline on Fuzzer::ExecuteCallback().

it's symbolization+inlining on a specific platform that doesn't 100% work, right? are there any existing test cases or bugs that show this?

Yes, that's how it seems. I don't know if there are any tests or bugs reported for this, but I am not aware of any.

Mar 6 2021, 11:04 AM · Restricted Project
jonpa updated the summary of D97975: [libFuzzer] add attribute noinline on Fuzzer::ExecuteCallback().
Mar 6 2021, 10:51 AM · Restricted Project

Mar 4 2021

jonpa requested review of D97975: [libFuzzer] add attribute noinline on Fuzzer::ExecuteCallback().
Mar 4 2021, 1:09 PM · Restricted Project

Mar 3 2021

jonpa added inline comments to D97901: [SystemZ] Test for infinity in testFPKind()..
Mar 3 2021, 6:04 PM · Restricted Project
jonpa requested review of D97901: [SystemZ] Test for infinity in testFPKind()..
Mar 3 2021, 6:03 PM · Restricted Project
jonpa committed rG7334b3dc3ea4: [SystemZ] Reimplement the i8/i16 compare-and-swap logic. (authored by jonpa).
Mar 3 2021, 12:06 PM
jonpa closed D97604: [SystemZ] Reimplement the 1-byte compare-and-swap logic.
Mar 3 2021, 12:06 PM · Restricted Project
jonpa added a comment to D97604: [SystemZ] Reimplement the 1-byte compare-and-swap logic.

This is not needed any more -- it is already done by common code now that you set getAtomicExtendOps to ZERO_EXTEND.

Mar 3 2021, 10:36 AM · Restricted Project

Mar 2 2021

jonpa updated the diff for D97604: [SystemZ] Reimplement the 1-byte compare-and-swap logic.

Updated per review.

Mar 2 2021, 1:27 PM · Restricted Project
jonpa committed rG52bbbf4d4459: [SystemZ] Assign the full space for promoted and split outgoing args. (authored by jonpa).
Mar 2 2021, 10:57 AM
jonpa closed D97514: [SystemZ] Assign the full space for promoted and split outgoing args.
Mar 2 2021, 10:57 AM · Restricted Project

Mar 1 2021

jonpa updated the diff for D97514: [SystemZ] Assign the full space for promoted and split outgoing args.

Patch updated per review.

Mar 1 2021, 6:35 PM · Restricted Project
jonpa added a comment to D97604: [SystemZ] Reimplement the 1-byte compare-and-swap logic.

Not necessarily. Our ABI does require that "char" and "short" parameters and return values are extended, but that can be either a zero- or a sign-extension depending on the type. Also, this is implemented via the zeroext/signext type attributes on the parameters in code generated by clang; with LLVM IR generated elsewhere (like in those test cases!), we may get a plain i8 or i16 that is not extended. And of course if the i8 or i16 in question is not a function parameter but the result of some intermediate computation, it is not guaranteed to be extended anyway.

So in short, yes, the CmpVal may have to be extended. However, it is probably worthwhile to detect those (common) cases where it already *is* extended to avoid redundant effort. This is hard(er) to do at the MI level, so I think the extension is best done in SystemZTargetLowering::lowerATOMIC_CMP_SWAP at the SelectionDAG level before emitting the ATOMIC_CMP_SWAPW MI instruction.
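Why the compare value needs a defined extension can be illustrated with a portable emulation of a 1-byte compare-and-swap inside a 32-bit word. This is a minimal sketch under stated assumptions, not the SystemZ lowering: the function name byteCompareAndSwap, the use of std::atomic, and the convention that byte index 0 is the least significant byte are all chosen here for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical emulation of a 1-byte compare-and-swap within a 32-bit word.
// cmpVal is deliberately 32 bits wide: its upper 24 bits may hold anything
// (for example a sign-extended value), so they are cleared before comparing.
inline bool byteCompareAndSwap(std::atomic<std::uint32_t> &word,
                               unsigned byteIndex, std::uint32_t cmpVal,
                               std::uint8_t newVal) {
  assert(byteIndex < 4 && "a 32-bit word holds four bytes");
  const unsigned shift = byteIndex * 8;
  const std::uint32_t mask = std::uint32_t{0xFF} << shift;
  const std::uint32_t cmpByte = cmpVal & 0xFF; // explicit zero-extension

  std::uint32_t old = word.load();
  for (;;) {
    // Compare only the selected byte against the zero-extended value.
    if (((old & mask) >> shift) != cmpByte)
      return false;
    const std::uint32_t desired =
        (old & ~mask) | (std::uint32_t{newVal} << shift);
    // On failure, compare_exchange_weak refreshes 'old' and the loop retries.
    if (word.compare_exchange_weak(old, desired))
      return true;
  }
}
```

Without the masking step, a compare value such as 0xFFFFFF33 would never match the byte 0x33, even though the low 8 bits agree.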

Mar 1 2021, 5:17 PM · Restricted Project

Feb 26 2021

jonpa requested review of D97604: [SystemZ] Reimplement the 1-byte compare-and-swap logic.
Feb 26 2021, 6:19 PM · Restricted Project
jonpa added inline comments to D97514: [SystemZ] Assign the full space for promoted and split outgoing args.
Feb 26 2021, 10:01 AM · Restricted Project
jonpa updated the diff for D97514: [SystemZ] Assign the full space for promoted and split outgoing args.

Updated per review.

Feb 26 2021, 9:59 AM · Restricted Project

Feb 25 2021

jonpa added inline comments to D97514: [SystemZ] Assign the full space for promoted and split outgoing args.
Feb 25 2021, 5:23 PM · Restricted Project
jonpa requested review of D97514: [SystemZ] Assign the full space for promoted and split outgoing args.
Feb 25 2021, 4:53 PM · Restricted Project

Feb 22 2021

jonpa added a comment to D97125: Stop traping on sNaN in __builtin_isinf.
In D97125#2578853, @kpn wrote:

System/Z's TEST DATA CLASS instruction covers most (all?) of the possible FP value states. You might want to subscribe, or add as a reviewer, jonpa just to make sure everyone stays in sync.

Feb 22 2021, 9:51 AM · Restricted Project

Feb 18 2021

jonpa committed rGe57bd1ff4fb6: [CFE, SystemZ] New target hook testFPKind() for checks of FP values. (authored by jonpa).
Feb 18 2021, 10:39 AM
jonpa closed D96568: [CFE, SystemZ] Emit s390.tdc instrincic for __builtin_isnan in Constrained FP mode..
Feb 18 2021, 10:39 AM · Restricted Project

Feb 17 2021

jonpa added a comment to D96568: [CFE, SystemZ] Emit s390.tdc instrincic for __builtin_isnan in Constrained FP mode..

Sounds good to me. Hopefully I'll get round to __builtin_isinf soon and a single hook will make the patch slightly smaller.

Patch updated to call the new hook testFPKind() and make it take a BuiltinID as argument (that seems to work at least for the moment - maybe an enum type will become necessary at some point per your suggestion..?)

I am not sure if this is "only" or "typically" used in constrained FP mode, or if the mode should be independent of calling this hook. The patch as it is asserts that it is called for an FP type but leaves it to the target to decide based on the FP mode, where SystemZ opts out unless it is constrained (which I think is what is wanted...).

LGTM, we can adapt the hook later if needed. I do not know whether allowing the hook to be used for non constrained FP will prove useful but it is easy enough to ignore it for non FP so why not. Thanks for changing that!

Feb 17 2021, 3:49 PM · Restricted Project
jonpa updated the diff for D96568: [CFE, SystemZ] Emit s390.tdc instrincic for __builtin_isnan in Constrained FP mode..

Sounds good to me. Hopefully I'll get round to __builtin_isinf soon and a single hook will make the patch slightly smaller.

Feb 17 2021, 11:52 AM · Restricted Project

Feb 16 2021

jonpa added a comment to D96568: [CFE, SystemZ] Emit s390.tdc instrincic for __builtin_isnan in Constrained FP mode..

That's interesting. I presume that can be used to implement isinf as well? Perhaps better call the hook fpclassify or similar?

Hmm, the instruction doesn't really implement fpclassify in itself, it is more like a combined check for fpclassify() == <some constant>. Specifically, the TEST DATA CLASS instruction takes an immediate operand that represents a bit mask, with each bit standing for one type of floating-point value (zero, normal, subnormal, infinity, QNaN, SNaN -- each in positive and negative versions). The instruction sets the condition code depending on whether the input FP number is in one of the classes selected by the bit mask, or not.

This is why Jonas' patch uses a bit mask of 0x0F -- this has the bits for the four types of NaN set (pos/neg QNaN/SNaN). The instruction could indeed also be used to implement an isinf check (bit mask 0x30) or many other checks. We actually have a SystemZ back-end pass that tries to combine multiple FP checks into a single TEST DATA CLASS instruction.

However, the instruction does not directly implement the fpclassify semantics. To implement fpclassify, you'd still have to use multiple invocations of the instruction with different flags to determine the fpclassify output value.
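The bit-mask semantics described above can be emulated in portable C++ as a rough sketch. The function names classBitOf and testDataClass are hypothetical, and the exact bit assigned to each class is an assumption chosen to agree with the masks quoted in these comments (0x0F for any NaN, 0x30 for any infinity, 0xFC0 for zero, normal, or subnormal). The sketch also assumes IEEE binary64 doubles, where the top mantissa bit marks a quiet NaN.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Assumed 12-bit class layout, consistent with the masks quoted above.
enum : unsigned {
  kPosZero = 0x800,      kNegZero = 0x400,
  kPosNormal = 0x200,    kNegNormal = 0x100,
  kPosSubnormal = 0x080, kNegSubnormal = 0x040,
  kPosInf = 0x020,       kNegInf = 0x010,
  kPosQNaN = 0x008,      kNegQNaN = 0x004,
  kPosSNaN = 0x002,      kNegSNaN = 0x001,
};

// Returns the single class bit that describes x.
inline unsigned classBitOf(double x) {
  static_assert(sizeof(double) == sizeof(std::uint64_t), "binary64 expected");
  const bool neg = std::signbit(x);
  switch (std::fpclassify(x)) {
  case FP_ZERO:      return neg ? kNegZero : kPosZero;
  case FP_NORMAL:    return neg ? kNegNormal : kPosNormal;
  case FP_SUBNORMAL: return neg ? kNegSubnormal : kPosSubnormal;
  case FP_INFINITE:  return neg ? kNegInf : kPosInf;
  default: { // FP_NAN: inspect the quiet bit (bit 51 of the encoding).
    std::uint64_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    const bool quiet = ((bits >> 51) & 1) != 0;
    if (quiet)
      return neg ? kNegQNaN : kPosQNaN;
    return neg ? kNegSNaN : kPosSNaN;
  }
  }
}

// Emulates the check: is x in one of the classes selected by mask?
inline bool testDataClass(double x, unsigned mask) {
  assert(mask <= 0xFFF && "only 12 class bits are defined in this sketch");
  return (classBitOf(x) & mask) != 0;
}
```

With this model, an isnan check is testDataClass(x, 0x0F) and an isinf check is testDataClass(x, 0x30). A full fpclassify would still need several such tests with different masks, as the comment notes.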

I see. I'm not sure whether it's better to have several target hooks or a single one like testFPKind that would take a flag saying what do we want to test (NaN, Inf, etc.).

Feb 16 2021, 5:00 PM · Restricted Project

Feb 12 2021

jonpa updated subscribers of D96568: [CFE, SystemZ] Emit s390.tdc instrincic for __builtin_isnan in Constrained FP mode..
Feb 12 2021, 4:40 PM · Restricted Project
jonpa added a comment to D96471: [SystemZ] Fix vecintrin.h to not emit alignment hints in vec_xl/vec_xst..

Committed after changed to use __builtin_memcpy() instead.

Feb 12 2021, 4:32 PM · Restricted Project
jonpa committed rGb3ac5b84cdd4: [SystemZ] Fix vecintrin.h to not emit alignment hints in vec_xl/vec_xst. (authored by jonpa).
Feb 12 2021, 4:30 PM
jonpa closed D96471: [SystemZ] Fix vecintrin.h to not emit alignment hints in vec_xl/vec_xst..
Feb 12 2021, 4:30 PM · Restricted Project

Feb 11 2021

jonpa requested review of D96568: [CFE, SystemZ] Emit s390.tdc instrincic for __builtin_isnan in Constrained FP mode..
Feb 11 2021, 6:17 PM · Restricted Project

Feb 10 2021

jonpa requested review of D96471: [SystemZ] Fix vecintrin.h to not emit alignment hints in vec_xl/vec_xst..
Feb 10 2021, 4:53 PM · Restricted Project

Feb 7 2021

jonpa updated the diff for D44092: [SystemZ] Improve side steering of FPd unit and FXU registers..

I started to simplify the patch and handled one minor regression and then realized something... (see below :-)

Feb 7 2021, 4:57 PM · Restricted Project

Jan 29 2021

jonpa updated the diff for D44092: [SystemZ] Improve side steering of FPd unit and FXU registers..

Patch updated with latest improvements (still experimental).

Jan 29 2021, 3:41 PM · Restricted Project

Jan 28 2021

jonpa added a comment to D93734: [LoopDeletion] Insert an early exit from dead path in loop.

I may be doing something wrong, but judging from these numbers, D95468 did not seem to help very much...

Maybe outer loops have been skipped and therefore you avoided duplication of outer and inner loops (with D95468). The statistics we have are too coarse grained to exactly pinpoint what happened.

Jan 28 2021, 3:56 PM · Restricted Project

Jan 26 2021

jonpa updated the diff for D93734: [LoopDeletion] Insert an early exit from dead path in loop.

Patch rebased.

Jan 26 2021, 5:38 PM · Restricted Project
jonpa updated the diff for D44092: [SystemZ] Improve side steering of FPd unit and FXU registers..

Latest improvements - still with ongoing experiments.

Jan 26 2021, 9:01 AM · Restricted Project

Jan 15 2021

jonpa abandoned D91786: [GVN] Strengthen the updating of dominated users.

As of D90231 I'm personally still unconvinced that it is the LLVM code that needs fixing and not the particular code that is using __builtin_constant_p().

^ still am.
Are langref changes needed to justify these optimization changes?

Jan 15 2021, 8:57 AM · Restricted Project

Jan 13 2021

jonpa added a comment to D93764: [LoopUnswitch] Implement first version of partial unswitching..

If a dead path in a loop is unswitched into an empty loop, I suppose the idea is that LoopDeletion will later then delete it?

Jan 13 2021, 4:36 PM · Restricted Project
jonpa committed rGddd03842c347: [SystemZ] Clear Available set in SystemZPostRASchedStrategy::initialize(). (authored by jonpa).
Jan 13 2021, 4:20 PM
jonpa closed D94383: [SystemZ] Don't crash with -misched-cutoff.
Jan 13 2021, 4:20 PM · Restricted Project

Jan 12 2021

jonpa updated the diff for D94383: [SystemZ] Don't crash with -misched-cutoff.

Patch updated per review.

Jan 12 2021, 12:01 PM · Restricted Project

Jan 11 2021

jonpa updated the diff for D44092: [SystemZ] Improve side steering of FPd unit and FXU registers..

This patch has been improved to make use of B2B information. B2BW, B2BR, and B2BRW FUs have been added to the SchedModel so that instructions can be modeled to use these. B2BRW is not really needed, but I tried using it for readability. This is one way of keeping track of which instructions can read and/or write B2B - a disadvantage is that the enum for the ProcResources is not available from TableGen so that has been added locally instead for now. It looked like there was probably enough irregularity among the opcodes to motivate this approach - although the differences between subtargets were very small.

Jan 11 2021, 11:56 AM · Restricted Project
jonpa added a comment to D94383: [SystemZ] Don't crash with -misched-cutoff.

Could you elaborate why initPolicy is the correct place to clear the Available list? I'm wondering because the default implementation doesn't appear to do that either, it looks like common code only clears the list in the main "init" ...

Jan 11 2021, 10:47 AM · Restricted Project
jonpa committed rG171771e0780f: [SystemZ] Minor NFC fix in SchedModels. (authored by jonpa).
Jan 11 2021, 9:40 AM

Jan 10 2021

jonpa requested review of D94383: [SystemZ] Don't crash with -misched-cutoff.
Jan 10 2021, 6:04 PM · Restricted Project

Jan 9 2021

jonpa added inline comments to D93734: [LoopDeletion] Insert an early exit from dead path in loop.
Jan 9 2021, 1:42 PM · Restricted Project

Jan 8 2021

jonpa updated the diff for D93734: [LoopDeletion] Insert an early exit from dead path in loop.

Patch rebased.

Jan 8 2021, 12:32 PM · Restricted Project
jonpa added a comment to D93734: [LoopDeletion] Insert an early exit from dead path in loop.

I found out quickly that perhaps the hardest part of this was to update data structures after changing the CFG in a loop... I have barely been able to build the benchmarks as it is, so I am in need of some good advice on how to update things after a loop change, for this patch to be usable. Currently I have changed things temporarily so that LI, DT, etc. are recomputed after loop deletion (BTW, I found that only recomputing those analyses on master after LoopDeletion was not NFC, which surprised me... Is that a bug or expected with the aim to save compile time?)

Statistics on SPEC-17 on SystemZ:

Thanks for the update Jonas! It looks like the patch includes some required changes that still landed (D86844), which might impact the number of removed loops. It might be good to re-collect the statistics. I tried to collect stats with this patch on SPEC2000/SPEC2006/MultiSource for X86 with LTO, but unfortunately there have been a few crashes.

Jan 8 2021, 12:29 PM · Restricted Project

Jan 4 2021

jonpa updated the diff for D93734: [LoopDeletion] Insert an early exit from dead path in loop.

With the motto of pushing things forward even if only by aiding the other related patches, I have continued to improve my patch to use as some kind of baseline for "early exit" insertions. Perhaps it can be used during development of the partial loop-unswitching to find cases to handle, or perhaps it could be used for some cases if it would reduce the burden on the other algorithm. It would be very nice if partial unswitching could handle all this instead, of course :-)

Jan 4 2021, 4:08 PM · Restricted Project

Dec 29 2020

jonpa added a comment to D91786: [GVN] Strengthen the updating of dominated users.

ping!

Dec 29 2020, 5:15 PM · Restricted Project

Dec 27 2020

jonpa added a comment to D93764: [LoopUnswitch] Implement first version of partial unswitching..

This patch applies the idea from D93734 to LoopUnswitch.

Dec 27 2020, 5:10 PM · Restricted Project

Dec 22 2020

jonpa added a comment to D93734: [LoopDeletion] Insert an early exit from dead path in loop.

The idea is that we have

H:
  %c = ... ; invariant wrt H and L
  br %c, L, B
B: 
  side_effects
  br L
L:
  br %x, H, Exit

Exit:
  ...

right?

Yes, exactly (could be more blocks than 3, though, of course).
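The control-flow shape discussed above can be modeled at source level as a rough sketch. The function names runOriginal and runWithEarlyExit are hypothetical, and the sketch assumes that the loop has no live-out values and would terminate on its own; it only illustrates the idea and is not the transformation implemented in the patch.

```cpp
#include <cassert>
#include <vector>

// Original shape: header H tests the invariant condition c. When c is true,
// control skips block B (the side effects) and goes straight to latch L.
inline std::vector<int> runOriginal(bool c, int n) {
  std::vector<int> log;
  int i = 0;
  do {
    if (!c)
      log.push_back(i); // block B: the only side effect in the loop
    ++i;                // latch L: loop bookkeeping, no observable effect
  } while (i < n);
  return log;
}

// With an early exit: since c is loop-invariant, taking the H -> L path once
// means every later iteration takes it too, so the rest of the loop is dead.
inline std::vector<int> runWithEarlyExit(bool c, int n) {
  std::vector<int> log;
  int i = 0;
  do {
    if (c)
      break;            // early exit from the dead path
    log.push_back(i);
    ++i;
  } while (i < n);
  return log;
}
```

Both functions produce the same log for either value of c; the second one just avoids spinning through iterations that cannot have any effect.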

Dec 22 2020, 4:18 PM · Restricted Project
jonpa added a reviewer for D93734: [LoopDeletion] Insert an early exit from dead path in loop: Florian.
Dec 22 2020, 3:52 PM · Restricted Project
jonpa added inline comments to D86844: [LoopDeletion] Allows deletion of possibly infinite side-effect free loops.
Dec 22 2020, 3:34 PM · Restricted Project, Restricted Project
jonpa requested review of D93734: [LoopDeletion] Insert an early exit from dead path in loop.
Dec 22 2020, 3:32 PM · Restricted Project

Dec 14 2020

jonpa committed rG653b97690f0d: [SystemZ] Improve handling of backchain offset. (authored by jonpa).
Dec 14 2020, 10:40 AM
jonpa closed D93171: [SystemZ] Improve handling of backchain offset.
Dec 14 2020, 10:40 AM · Restricted Project
jonpa added a comment to D91786: [GVN] Strengthen the updating of dominated users.

ping!

Dec 14 2020, 10:07 AM · Restricted Project

Dec 12 2020

jonpa requested review of D93171: [SystemZ] Improve handling of backchain offset.
Dec 12 2020, 4:55 PM · Restricted Project

Dec 11 2020

jonpa committed rG42f628c84269: Reapply "[SystemZFrameLowering] Don't overrwrite R1D (backchain) when probing." (authored by jonpa).
Dec 11 2020, 4:30 PM
jonpa closed D92803: [SystemZFrameLowering] Make sure R1 holding the backchain is not corrupted by probing .
Dec 11 2020, 4:30 PM · Restricted Project
jonpa committed rG0c2d23933f06: [SystemZTTIImpl] Allow some non-prefetched accesses in getMinPrefetchStride(). (authored by jonpa).
Dec 11 2020, 4:08 PM
jonpa closed D92985: [SystemZTTIImpl::getMinPrefetchStride] Allow some non-prefetched mem accesses..
Dec 11 2020, 4:08 PM · Restricted Project
jonpa added a comment to D92059: [SLP] Control maximum vectorization factor from TTI.

This LGTM, but for one little detail: "Maximum SLP vectorization factor" should perhaps include "0=unlimited" or something similar, to avoid confusing 0 to mean default "off". Or maybe that isn't needed with a hidden option?

Dec 11 2020, 3:52 PM · Restricted Project

Dec 10 2020

jonpa requested review of D92803: [SystemZFrameLowering] Make sure R1 holding the backchain is not corrupted by probing .
Dec 10 2020, 5:21 PM · Restricted Project
jonpa reopened D92803: [SystemZFrameLowering] Make sure R1 holding the backchain is not corrupted by probing .
Dec 10 2020, 5:20 PM · Restricted Project
jonpa updated the diff for D92803: [SystemZFrameLowering] Make sure R1 holding the backchain is not corrupted by probing .

Sorry - had to revert patch since the live-in lists had not been handled properly.

Dec 10 2020, 5:20 PM · Restricted Project
jonpa added a reverting change for rGea475c77ff9e: [SystemZFrameLowering] Don't overrwrite R1D (backchain) when probing.: rGbc7a61b70360: Revert "[SystemZFrameLowering] Don't overrwrite R1D (backchain) when probing.".
Dec 10 2020, 4:08 PM
jonpa committed rGbc7a61b70360: Revert "[SystemZFrameLowering] Don't overrwrite R1D (backchain) when probing." (authored by jonpa).
Dec 10 2020, 4:08 PM
jonpa added a reverting change for D92803: [SystemZFrameLowering] Make sure R1 holding the backchain is not corrupted by probing : rGbc7a61b70360: Revert "[SystemZFrameLowering] Don't overrwrite R1D (backchain) when probing.".
Dec 10 2020, 4:08 PM · Restricted Project
jonpa committed rGea475c77ff9e: [SystemZFrameLowering] Don't overrwrite R1D (backchain) when probing. (authored by jonpa).
Dec 10 2020, 1:07 PM
jonpa closed D92803: [SystemZFrameLowering] Make sure R1 holding the backchain is not corrupted by probing .
Dec 10 2020, 1:07 PM · Restricted Project
jonpa added a comment to D92985: [SystemZTTIImpl::getMinPrefetchStride] Allow some non-prefetched mem accesses..

It doesn't seem right to bail just for one non-prefetched access, so it seems reasonable to allow a very small number of non-prefetched accesses, to make the heuristic more stable. This patch suggests allowing 1 non-prefetched memory access per 32 prefetched ones. This handles LBM, and also gives two prefetches to imagick which however do not seem to play a role.
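The 1-per-32 tolerance can be written as a simple ratio check. The helper below is a hypothetical illustration of that rule only; its name and signature are assumptions, and it is not the code from the patch or from SystemZTTIImpl.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: allow at most one non-prefetched memory access
// per 32 prefetched ones. 64-bit arithmetic keeps the multiply from
// overflowing for realistic access counts.
inline bool nonPrefetchedWithinBudget(std::uint64_t nonPrefetched,
                                      std::uint64_t prefetched) {
  return nonPrefetched * 32 <= prefetched;
}
```

For example, 1 non-prefetched access alongside 32 prefetched ones is within the budget, while 2 alongside 32 is not.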

This sounds reasonable to me, but this is the kind of question that in the end only measurement can decide ... What is the overall effect of this patch on benchmarks? Anything else beyond lbm and imagick?

Dec 10 2020, 11:30 AM · Restricted Project

Dec 9 2020

jonpa requested review of D92985: [SystemZTTIImpl::getMinPrefetchStride] Allow some non-prefetched mem accesses..
Dec 9 2020, 5:20 PM · Restricted Project
jonpa added a comment to D92803: [SystemZFrameLowering] Make sure R1 holding the backchain is not corrupted by probing .

I think it still makes sense to have the backchain store local in inlineStackProbe.

In fact, I think it would be best to have the backchain store in every iteration of the loop, i.e. to do the store in allocateAndProbe (of course that means the store then implicitly acts as a probe, so we don't need the volatile compare any more if we have a backchain).

I remember there was an issue with "store tags" which we are handling for instance when we do loop-unrolling. But maybe that is not an issue any more on newer machines (and maybe we don't need to consider that in unrolling then either)?

Dec 9 2020, 12:19 PM · Restricted Project

Dec 8 2020

jonpa added a comment to D92803: [SystemZFrameLowering] Make sure R1 holding the backchain is not corrupted by probing .

Hmm... now that R0D is used for the loop exit, and R1D is used for the backchain, perhaps the backchain actually could be handled just in emitPrologue()?

Dec 8 2020, 11:11 AM · Restricted Project
jonpa updated the diff for D92803: [SystemZFrameLowering] Make sure R1 holding the backchain is not corrupted by probing .

Updated per review. R0D is now used for the loop exit check while probing.

Dec 8 2020, 11:05 AM · Restricted Project
jonpa abandoned D90231: [GVN] Don't replace argument to @llvm.is.constant.*().
Dec 8 2020, 8:49 AM · Restricted Project
jonpa added a comment to D91786: [GVN] Strengthen the updating of dominated users.

I now have confirmation that this patch fixes the kernel issues on SystemZ: "This patch seems to fix all the kernel build issues. It builds, it runs, all looks good...".

Dec 8 2020, 8:47 AM · Restricted Project
jonpa added a comment to D91218: Prevent FENTRY_CALL reordering.

We very much need this in order to build the Linux kernel on SystemZ, so it would be nice if someone could approve this...

Dec 8 2020, 8:45 AM · Restricted Project