This requires a little extra work due tothe fact i32 is not a legal type. When call lowering happens post-legalisation (e.g. when an intrinsic was inserted during legalisation). A bitcast from f32
to i32 can't be introduced. This is similar to the challenges with RV32D. To handle this, we introduce target-specific DAG nodes that perform bitcast+sext for f32->i64 and trunc+bitcast for i64->f32.
Details
Diff Detail
Event Timeline
The presence of no-op any_extend can cause assertions during the DAG combiner, and there's no guarantee that the no-op any_extend will be visited first. I'll update this to avoid the issue, probably setting a target combiner for any_extend to legalise any_extend + bitcast directly to BitcastAndSextF32ToI64.
lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
8959 ↗ | (On Diff #169532) | Not sure how you're ending up in a situation where this matters; getNode() will normally fold away an ANY_EXTEND where the operand and result have the same type. |
lib/Target/RISCV/RISCVISelDAGToDAG.cpp | ||
161 | The point of this is to try to generate a "narrower" FP_TO_SINT where possible? Does this transform actually improve performance? | |
lib/Target/RISCV/RISCVInstrInfoF.td | ||
370 | Maybe better to implement this as a DAGCombine? |
lib/Target/RISCV/RISCVInstrInfoF.td | ||
---|---|---|
368 | Typo → because |
Refresh patch and fix typo in comment. This patch does not update float-intrinsics.ll as new promotion code needs to be added for FPOWI. I will post a patch to do this now.
Submitting some old in-line comments that were accidentally left unsubmitted.
Patch still applies cleanly.
lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
8959 ↗ | (On Diff #169532) | This fold is insufficient anyway. The issue is that anyextend is left hanging around after one of its children is custom legalised. Just handling everything in a target dag combine avoids this and other problems. |
lib/Target/RISCV/RISCVISelDAGToDAG.cpp | ||
161 | The alternative is fcvt.l.s + sign-extension of the result. Just emitting fcvt.w.s is fewer instructions. |
Rebased (no meaningful changes).
I would really, really appreciate a review. I believe this is ready to land and overall a pretty straight-forward patch. The introduction of target-specific nodes to avoid bitcasts to illegal types is used across other backends in LLVM, as well as for RV32D and is pretty straight-forward.
Hi Alex,
then change looks sensible to me but I found there is some sort of issue with the new DAG nodes. Consider the following example (minimised from CodeGen/Generic/2008-02-25-NegateZero.ll.
define void @test(float* %p) { entry: %tmp98 = load float, float* %p, align 4 call void (i32, ...) @foo( i32 0, float %tmp98 ) nounwind ret void } declare void @foo(i32, ...)
$ ./bin/llc -mtriple riscv64 < t.ll
.text .file "<stdin>" SoftenFloatOperand Op #0: t21: i64 = RISCVISD::BitcastAndSextF32ToI64 t5 Do not know how to soften this operator's operand! UNREACHABLE executed at /home/rferrer/fio/upstream/llvm-src/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp:753! Stack dump: 0. Program arguments: ./bin/llc -mtriple riscv64 1. Running pass 'Function Pass Manager' on module '<stdin>'. 2. Running pass 'RISCV DAG->DAG Pattern Instruction Selection' on function '@test'
I think this happens because f32 is not legal when no f is present, so the type legalizer tries to "soften" it but it doesn't know how to do this.
Ah, I think what happens is that we should not emit such patterns without the presence of the F extension: these patterns are ultimately selected as instructions in F. Does this make sense?
Thanks, the code in PerformDAGCombine is incorrectly assuming it will only see f32 if the F extension is enabled. But of course, that's only the case if you're in a post-legalisation combine call. I'll fix and add a test case.
lib/Target/RISCV/RISCVISelDAGToDAG.cpp | ||
---|---|---|
161 | I still don't follow; AssertSext doesn't lower to any instructions. So the alternative is just the one fcvt.l.s. (The "trick" here is that fptosi returns poison if the result would be out of range.) |
Updated to address all outstanding feedback. The combines are now properly guarded and a test case verified this. I've removed the logic that prefers fcvt.w[u].s and added exhaustive tests for sext/zext/aext of the results.
Looks good?
lib/Target/RISCV/RISCVISelDAGToDAG.cpp | ||
---|---|---|
161 | Thanks for pointing out that fptosi with an out-of-range input is poison. The same is true for fptoui. I've removed this logic. |
test/CodeGen/RISCV/float-arith.ll | ||
---|---|---|
148 | It should work, since fmv.x.w sign-extends:
and the lui will also sign-extend, so xor will flip all the higher bits too. |
Changes since last time:
- Use custom legalisation rather than DAG combines to produce the f32 <-> i64 nodes
- Add target DAG combines that perform the equivalent of the (bitcast (fneg/fabs x)) conversion that the DAGCombiner does
- Rename target-specific SelectionDAG nodes to match the RISC-V fmv instruction naming.
test/CodeGen/RISCV/float-arith.ll | ||
---|---|---|
148 | I've split out float-bitmanip-dagcombines.ll and double-bitmanip-dagcombines.ll to better demonstrate the opportunities here. One RV32F, target-independent code provides the combine for us. As we need to use RISCVISD::FMV_X_W_RV64 (as bitcasting to i32 wouldn't be legal), we need to provide an equivalent combine in the RISC-V backend. I've updated the patch to do so. |
Attached the wrong version of the patch, which had a simple issue with the dagcombine. Right version now attached.
Clearly everyone is too polite to point out that I attached the RV64D patch in the last update rather than RV64F. Now addressed.
Reviews would be appreciated.
lib/Target/RISCV/RISCVISelLowering.cpp | ||
---|---|---|
637 | This combine isn't consistent with the ComputeNumSignBitsForTargetNode change: if value is supposed to be sign-extended, you need to insert a SIGN_EXTEND_INREG |
Hi Eli. Thanks, as always, for the careful review. I've opted not to to introduce a SEXT_IN_REG in this case, and instead define a target-specific node that produces an anyext result (and a pattern that ensures FMV_X_W is selected for (sexti32 (riscv_fmv_x_anyextw_rv64 ..)) meaning I can replace with an ANY_EXTEND. Otherwise there are cases where we end up with an unwanted sext.w in cases where redundant GPR->FPR->GPR moves have been eliminated.
For this reason I think this is the more useful semantic, and I believe it's safe based on my understanding of the legalisation+widening expectations but I'd really welcome a second opinion.
[Comment was edited for clarity after submission].
LGTM
Making FMV_X_ANYEXTW_RV64 any-extend the result seems fine.
lib/Target/RISCV/RISCVInstrInfoF.td | ||
---|---|---|
375 | This is supposed to be assertzexti32, as opposed to zexti32, right? I doubt it has any practical impact because an "AND" would be dead, but better to be clear. |
lib/Target/RISCV/RISCVInstrInfoF.td | ||
---|---|---|
375 | Yes, the AND would be dead but as you suggest it's better to stick with assertzexti32 so people don't need to evaluate whether the AND pattern is dead and unnecessary when reading the code. I'll adjust when committing - thanks. |
The point of this is to try to generate a "narrower" FP_TO_SINT where possible? Does this transform actually improve performance?