- User Since: May 22 2014, 1:24 PM
This matches my mental model for FMF propagation, so LGTM.
LGTM. I have no idea if it's worth the trade-off, but we could do this sooner in IR (instcombine) instead of, or in addition to, SDAG?
I'm not familiar with how flags are being used in this layer, but taking the union of flags doesn't match the way we deal with FMF in IR. If the fneg is strict (disregard the irrelevant 'arcp' in that last test), then it's invalid to assume that it inherits the 'nsz' property from its operand.
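The union concern can be sketched abstractly (hypothetical flag sets and helper names for illustration, not LLVM's actual SDNodeFlags API):

```python
# Hypothetical model of fast-math-flag propagation. Flags are sets of
# strings like {"nsz", "nnan", "arcp"}; this is not LLVM's API.

def union_flags(node_flags, operand_flags):
    # The questionable rule: the result claims every flag either value had.
    return node_flags | operand_flags

def intersect_flags(node_flags, operand_flags):
    # The conservative rule: keep only flags both values guarantee.
    return node_flags & operand_flags

fneg_flags = set()             # a "strict" fneg: no FMF at all
operand_flags = {"nsz", "arcp"}

# Union wrongly lets the strict fneg claim 'nsz' from its operand:
assert "nsz" in union_flags(fneg_flags, operand_flags)
# Intersection preserves the strict semantics:
assert "nsz" not in intersect_flags(fneg_flags, operand_flags)
```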
Sat, Jun 15
Fri, Jun 14
Patch updated with test changes:
- Added a bigger test for the original PR37428 example because that shows a difference: with AVX1, we do not split the ymm ops any more. I'm not sure if that's better or worse than having mul/sitofp ops in the loop, but since the original request is for an SSE target, I think we can deal with that independently.
- Added an "enable-debugify" RUN to the IR test file to verify that we are not creating naked instructions. The IRBuilder takes care of this for us here, but I fixed a related problem in rL363409.
For reference, the vector shift TLI hook was originally added for a related CGP transform:
Thu, Jun 13
Wed, Jun 12
Tue, Jun 11
LGTM - but let's wait a ~day to commit in case anyone else has comments.
We have 2 text comment acknowledgements ("Seems fine", "Looks ok") - thanks for the prompt reviews! ...Can someone make it official by toggling the Phab state to 'Approved'? :)
- Add text to header comment to indicate that this may be more powerful than getSplatValue().
- Add TODO comment about recursing through several other types of operands.
Ping * 2.
Mon, Jun 10
This matches the existing fold with 2 fsubs, and I agree with the comment in the description regarding IEEE-754 - fsub doesn't need to preserve the sign bit of a NaN. LGTM.
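A rough illustration of the NaN point (a hypothetical bit-level `fneg` modeled on IEEE-754 doubles in Python; not LLVM code):

```python
import struct

SIGN_BIT = 1 << 63

def bits(x):
    # Raw IEEE-754 double bits of x.
    return struct.unpack('<Q', struct.pack('<d', x))[0]

def fneg(x):
    # A true fneg is a pure sign-bit flip: it negates NaNs too.
    return struct.unpack('<d', struct.pack('<Q', bits(x) ^ SIGN_BIT))[0]

nan = float('nan')
# fneg deterministically toggles the sign bit, even of a NaN...
assert bits(fneg(nan)) == bits(nan) ^ SIGN_BIT
# ...whereas IEEE-754 leaves the sign of a NaN produced by an
# arithmetic fsub (e.g. -0.0 - nan) unspecified, which is why the
# fold doesn't have to preserve the NaN's sign bit.
```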
Sun, Jun 9
I'm seeing intermittent failures locally for this test on macOS since this was committed, and that seems confirmed by this bot:
(click 'Show More' at the bottom to extend the history and see this test as the only toggling failure with seemingly random NFC changes)
Sat, Jun 8
Fri, Jun 7
Only add a check for isKnownNeverNaN(). We can remove FMF.noNaNs() independently when that's less likely to introduce regressions.
How does the output differ from the default expansion/optimization of ctpop? Please commit the tests with baseline CHECK lines, so we just show that diff. Add tests for another target too (AArch64?).
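For context, when a target has no native popcount instruction the usual target-independent lowering of ctpop is the SWAR bit-trick; a rough 32-bit Python sketch of that shape (an approximation for comparison, not the exact DAG output):

```python
def popcount32(v):
    # Classic SWAR expansion for a 32-bit value (assumes 0 <= v < 2**32).
    v = v - ((v >> 1) & 0x55555555)                  # 2-bit counts
    v = (v & 0x33333333) + ((v >> 2) & 0x33333333)   # 4-bit counts
    v = (v + (v >> 4)) & 0x0F0F0F0F                  # 8-bit counts
    # Multiply sums the byte counts into the top byte; mask to 32 bits
    # because Python integers are unbounded.
    return ((v * 0x01010101) & 0xFFFFFFFF) >> 24

assert popcount32(0) == 0
assert popcount32(0b1011) == 3
assert popcount32(0xFFFFFFFF) == 32
```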