Also add a test for the codegen of strict FP vector operations so these changes get tested.
Details
Diff Detail
Event Timeline
Is it possible to break the 4 subtasks into separate reviews?
-In clang generate fcmps when appropriate for neon intrinsics -Fix legalization of scalarized strict FP vector operations -Add some missing strict FP handling to AArch64TargetLowering -Adjust the aarch64-neon-intrinsics-constrained.c clang test to expect the right output and un-XFAIL it.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
1485 | Can you split this into a separate patch? I know I sound like a broken record, but it doesn't seem to be related to the converts below. Also pre-committing as much of the test that works as possible would cut it down from this patch quite a bit. | |
3423 | Some formatting apparently needs fixing up. | |
3674 | Op.getOperand(1) -> In? | |
3695 | dl is already defined. | |
3698 | Op.getOperand(IsStrict ? 1 : 0) -> In? | |
3701 | This isn't used anywhere |
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
1485 | Instead of a separate patch just for these two, it would probably make more sense to move them into D114946 with the rest of the setOperationAction lines. On the test, without the changes in this patch it hits an assertion failure so as a separate commit before this it wouldn't be able to test anything. |
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
1485 | D114946 is already pretty big :) What about the add_v4f32 test (for example) requires the changes in this patch? That's what I meant by "precommit as much as possible". My other question about this was going to be - why can't we use the vector instructions for STRICT_FSETCCS? The FCMGE and FCMGT look like they would set exception flags to me, but I may be misunderstanding some part of it. |
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
1485 |
A bit, though only in terms of number of lines. Conceptually it fairly simple, as it's almost all "strict_xyz can be handled the same as (non-strict) xyz".
I _could_ try and disentangle the parts that compile without this patch, but it seems like more work than it's worth.
Hmm, it looks like the FCMXY instructions are inconsistent in their handling of NaNs. FCMEQ performs a quiet comparison (only signalling NaNs raise an exception) like FCMP and STRICT_FSETCC, FCMGT etc. performs a signalling comparison (all NaNs raise an exception) like FCMPE and STRICT_FSETCCS. I'll adjust the comments (and also move this to D114946). |
Can you split this into a separate patch? I know I sound like a broken record, but it doesn't seem to be related to the converts below.
Also pre-committing as much of the test that works as possible would cut it down from this patch quite a bit.