Page MenuHomePhabricator

[RISCV] Add RV64F codegen support

Authored by asb on Oct 12 2018, 6:27 PM.



This requires a little extra work due tothe fact i32 is not a legal type. When call lowering happens post-legalisation (e.g. when an intrinsic was inserted during legalisation). A bitcast from f32
to i32 can't be introduced. This is similar to the challenges with RV32D. To handle this, we introduce target-specific DAG nodes that perform bitcast+sext for f32->i64 and trunc+bitcast for i64->f32.

Diff Detail


Event Timeline

asb created this revision.Oct 12 2018, 6:27 PM
asb planned changes to this revision.Oct 15 2018, 7:23 AM

The presence of no-op any_extend can cause assertions during the DAG combiner, and there's no guarantee that the no-op any_extend will be visited first. I'll update this to avoid the issue, probably setting a target combiner for any_extend to legalise any_extend + bitcast directly to BitcastAndSextF32ToI64.

efriedma added inline comments.
8959 ↗(On Diff #169532)

Not sure how you're ending up in a situation where this matters; getNode() will normally fold away an ANY_EXTEND where the operand and result have the same type.

161 ↗(On Diff #169532)

The point of this is to try to generate a "narrower" FP_TO_SINT where possible? Does this transform actually improve performance?

354 ↗(On Diff #169532)

Maybe better to implement this as a DAGCombine?

asb updated this revision to Diff 171079.Oct 25 2018, 6:42 AM
asb marked an inline comment as done.
asb edited the summary of this revision. (Show Details)

Updated to use target DAG combines for the int/float conversions.

rogfer01 added inline comments.Oct 25 2018, 8:22 AM
352 ↗(On Diff #171079)

Typo → because

asb updated this revision to Diff 174180.Nov 15 2018, 3:56 AM
asb marked an inline comment as done.

Refresh patch and fix typo in comment. This patch does not update float-intrinsics.ll as new promotion code needs to be added for FPOWI. I will post a patch to do this now.

asb added a comment.Nov 30 2018, 3:52 AM

Submitting some old in-line comments that were accidentally left unsubmitted.

Patch still applies cleanly.

8959 ↗(On Diff #169532)

This fold is insufficient anyway. The issue is that anyextend is left hanging around after one of its children is custom legalised. Just handling everything in a target dag combine avoids this and other problems.

161 ↗(On Diff #169532)

The alternative is fcvt.l.s + sign-extension of the result. Just emitting fcvt.w.s is fewer instructions.

jrtc27 added a subscriber: jrtc27.Dec 7 2018, 8:42 AM
asb updated this revision to Diff 180076.Jan 3 2019, 7:35 AM

Rebased on top of D56264 and D53230. Although this patch could be written to apply without those changes, given all patches modify the dag combines it's simpler to keep them linear.

I'd really appreciate a review. From my perspective, this is ready to land.

asb updated this revision to Diff 181457.Jan 12 2019, 12:56 PM
asb edited reviewers, added: efriedma; removed: eli.friedman.

Rebased (no meaningful changes).

I would really, really appreciate a review. I believe this is ready to land and overall a pretty straight-forward patch. The introduction of target-specific nodes to avoid bitcasts to illegal types is used across other backends in LLVM, as well as for RV32D and is pretty straight-forward.

Hi Alex,

then change looks sensible to me but I found there is some sort of issue with the new DAG nodes. Consider the following example (minimised from CodeGen/Generic/2008-02-25-NegateZero.ll.

define void @test(float* %p) {
	%tmp98 = load float, float* %p, align 4
	call void (i32, ...) @foo( i32 0, float %tmp98 ) nounwind 
	ret void

declare void @foo(i32, ...)
$ ./bin/llc -mtriple riscv64 < t.ll
	.file	"<stdin>"
SoftenFloatOperand Op #0: t21: i64 = RISCVISD::BitcastAndSextF32ToI64 t5

Do not know how to soften this operator's operand!
UNREACHABLE executed at /home/rferrer/fio/upstream/llvm-src/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp:753!
Stack dump:
0.	Program arguments: ./bin/llc -mtriple riscv64 
1.	Running pass 'Function Pass Manager' on module '<stdin>'.
2.	Running pass 'RISCV DAG->DAG Pattern Instruction Selection' on function '@test'

I think this happens because f32 is not legal when no f is present, so the type legalizer tries to "soften" it but it doesn't know how to do this.

Ah, I think what happens is that we should not emit such patterns without the presence of the F extension: these patterns are ultimately selected as instructions in F. Does this make sense?

asb added a comment.Jan 14 2019, 8:39 AM

Ah, I think what happens is that we should not emit such patterns without the presence of the F extension: these patterns are ultimately selected as instructions in F. Does this make sense?

Thanks, the code in PerformDAGCombine is incorrectly assuming it will only see f32 if the F extension is enabled. But of course, that's only the case if you're in a post-legalisation combine call. I'll fix and add a test case.

efriedma added inline comments.Jan 14 2019, 12:25 PM
161 ↗(On Diff #169532)

I still don't follow; AssertSext doesn't lower to any instructions. So the alternative is just the one fcvt.l.s. (The "trick" here is that fptosi returns poison if the result would be out of range.)

asb updated this revision to Diff 182851.EditedJan 21 2019, 11:26 PM
asb marked an inline comment as done.

Updated to address all outstanding feedback. The combines are now properly guarded and a test case verified this. I've removed the logic that prefers fcvt.w[u].s and added exhaustive tests for sext/zext/aext of the results.

Looks good?

asb added inline comments.
161 ↗(On Diff #169532)

Thanks for pointing out that fptosi with an out-of-range input is poison. The same is true for fptoui. I've removed this logic.

lewis-revill added inline comments.
636 ↗(On Diff #182851)

Typo 'redundant'

148 ↗(On Diff #182851)

Can RV64 employ the same technique as RV32 here, and for other similar DAG combines? I'm guessing cost-wise it would be desirable to do so.

jrtc27 added inline comments.Jan 22 2019, 6:36 AM
148 ↗(On Diff #182851)

It should work, since fmv.x.w sign-extends:

For RV64, the higher 32 bits of the destination register are filled with copies of the floating-point number’s sign bit.

and the lui will also sign-extend, so xor will flip all the higher bits too.

asb updated this revision to Diff 183618.Jan 25 2019, 2:08 PM
asb marked 4 inline comments as done.
asb edited the summary of this revision. (Show Details)

Changes since last time:

  • Use custom legalisation rather than DAG combines to produce the f32 <-> i64 nodes
  • Add target DAG combines that perform the equivalent of the (bitcast (fneg/fabs x)) conversion that the DAGCombiner does
  • Rename target-specific SelectionDAG nodes to match the RISC-V fmv instruction naming.
148 ↗(On Diff #182851)

I've split out float-bitmanip-dagcombines.ll and double-bitmanip-dagcombines.ll to better demonstrate the opportunities here. One RV32F, target-independent code provides the combine for us. As we need to use RISCVISD::FMV_X_W_RV64 (as bitcasting to i32 wouldn't be legal), we need to provide an equivalent combine in the RISC-V backend. I've updated the patch to do so.

asb updated this revision to Diff 183634.Jan 25 2019, 3:04 PM

Attached the wrong version of the patch, which had a simple issue with the dagcombine. Right version now attached.

asb updated this revision to Diff 184254.Jan 30 2019, 1:31 AM

Clearly everyone is too polite to point out that I attached the RV64D patch in the last update rather than RV64F. Now addressed.

Reviews would be appreciated.

efriedma added inline comments.Jan 30 2019, 11:27 AM
666 ↗(On Diff #184254)

This combine isn't consistent with the ComputeNumSignBitsForTargetNode change: if value is supposed to be sign-extended, you need to insert a SIGN_EXTEND_INREG

asb updated this revision to Diff 184533.EditedJan 31 2019, 9:54 AM
asb marked an inline comment as done.

Hi Eli. Thanks, as always, for the careful review. I've opted not to to introduce a SEXT_IN_REG in this case, and instead define a target-specific node that produces an anyext result (and a pattern that ensures FMV_X_W is selected for (sexti32 (riscv_fmv_x_anyextw_rv64 ..)) meaning I can replace with an ANY_EXTEND. Otherwise there are cases where we end up with an unwanted sext.w in cases where redundant GPR->FPR->GPR moves have been eliminated.

For this reason I think this is the more useful semantic, and I believe it's safe based on my understanding of the legalisation+widening expectations but I'd really welcome a second opinion.

[Comment was edited for clarity after submission].


Making FMV_X_ANYEXTW_RV64 any-extend the result seems fine.

372 ↗(On Diff #184533)

This is supposed to be assertzexti32, as opposed to zexti32, right? I doubt it has any practical impact because an "AND" would be dead, but better to be clear.

asb marked an inline comment as done.Jan 31 2019, 2:07 PM
asb added inline comments.
372 ↗(On Diff #184533)

Yes, the AND would be dead but as you suggest it's better to stick with assertzexti32 so people don't need to evaluate whether the AND pattern is dead and unnecessary when reading the code.

I'll adjust when committing - thanks.

This revision was not accepted when it landed; it landed in state Needs Review.Jan 31 2019, 2:48 PM
This revision was automatically updated to reflect the committed changes.