- User Since
- Apr 14 2013, 11:48 AM (287 w, 2 d)
Mon, Oct 15
This causes check-all to abort for me on SystemZ in Release mode (and never actually run the lit tests):
Wed, Oct 10
It would be good to have some tests for this ...
Tue, Oct 9
This does look like a nice refactoring, so it might be worthwhile getting it in anyway, even if there's not much change in codegen ...
I agree it does make sense to me to consider VF 4 in this test case, and the resulting code is better on Z.
Thu, Oct 4
OK, I understand the concern better now, thanks.
Tue, Oct 2
Yet another Ping after returning from vacation ...
I agree that this seems fine for SystemZ.
Thu, Sep 27
LGTM as well.
Sep 14 2018
Sep 12 2018
I agree, we need to update the cost model. Patch LGTM.
Aug 18 2018
Aug 17 2018
I believe we still don't need an intrinsic for constrained fneg. As you say, floating-point negation is implemented as -0-x, using a regular fsub (*not* the constrained intrinsic). The fsub semantics says that it cannot trap, so this would still be fine -- as long as we're sure the -0-x is always implemented via a dedicated negate instruction, and never via a subtraction. But this seems to be true, the idiom -0-x is recognized early during codegen and always treated specially.
Aug 15 2018
Ah, that's good to know! So if I understand this correctly, accessing even the 32-bit part (F0S) would be considered to clobber F0D. This is not really necessary, but probably doesn't hurt at this point. We could do the same thing as the HAX you mention to get this modeled exactly.
Well, the original rationale for using different subreg indices for float/vector registers is given here:
Aug 10 2018
Why? The SystemZ back-end doesn't support 32-bit code generation anyway, so there is no native LLVM support on s390 ...
Aug 9 2018
Ping. Would this be easier to review if I split it up into multiple patches?
I've checked it in now, thanks!
Aug 7 2018
LGTM as well.
Aug 3 2018
Patch LGTM. Thanks!
Aug 2 2018
Aug 1 2018
The new code for f2 in cond-move-03.ll is in fact better, since it now actually uses the conditional move instruction instead of a branch ...
Jul 31 2018
I've just posted a patch to reimplement the broken SystemZ shift count logic: https://reviews.llvm.org/D50096
Jul 27 2018
Updated patch to make use of the new multi-alternative pattern fragment features, and completed set of test cases.
Ah, OK. Thanks for checking!
Jul 26 2018
Jul 25 2018
Sorry, missed your last update.
A few more comments, LGTM with those changes.
See inline comments. In general I agree with this approach.
Jul 23 2018
I believe in this case it would be preferable to include the full range of latency values. We should try to keep the source as concise and straightforward as possible; if this results in some units being defined that aren't used, that doesn't seem to be a drawback to me. Also, for the common code parts, even if no current instruction uses any particular value, there may always be additional instructions in future processors.
Jul 20 2018
Jul 16 2018
The SystemZ changes look good to me. Thanks for taking care of this!
Jul 15 2018
Hmm ... The SystemZ tests seem to be getting strictly worse. Before, we have in f3:
Jul 13 2018
Superseded by https://reviews.llvm.org/D48545
Jul 12 2018
Jul 11 2018
Ah, I missed that, probably because on SystemZ char is unsigned by default ... Sorry about that.
Jul 10 2018
This caused a build bot failure here:
I've now disabled the test on s390 to get the build bot going reliably again.
This is also failing on s390x (randomly). The problem is that we don't actually have a non-executable flag, so the processor *will* execute the contents of "array" as code. Depending on what's there (it is just uninitialized memory), this will crash sooner or later, but in a way that may or may not make the check succeed randomly.
Jul 6 2018
It looks like this causes build bot failures on s390x-linux. Three of the new tests fail:
Jul 3 2018
Jun 25 2018
This version looks good to me, and we're also seeing good performance results.
Jun 20 2018
Jun 19 2018
I've come up with a suggestion to avoid even the duplicated DAG patterns, by having them implicitly generated by TableGen. This uses a new "alternative" mechanism described in a separate RFC:
Jun 8 2018
Jun 6 2018
Otherwise, this LGTM.
Jun 1 2018
This version now correctly implements target support for STRICT floating-point nodes using a MMO to represent the floating-point exception status. (Note that targets can and should additionally use a register use dependency on *all* floating-point instructions to represent floating-point control state, e.g. rounding mode and exception trap flags.)
Working version of the patch using memory operands.
May 30 2018
May 29 2018
(Failed) attempt to handle strict and regular FP operations without duplicating patterns ...
Is there a reason for dropping the "FP" from "StrictFP" in the method name? I think it would be better to keep it (i.e. use "getStrictFPOperationAction"), in particular since there are already other public methods in this area that use "StrictFP", like isStrictFPOpcode and MutateStrictFPToFP.