Page MenuHomePhabricator

[RISCV][CodeGen] Support Zfinx,Zdinx,Zhinx,Zhinxmin codegen
Needs ReviewPublic

Authored by sunshaoce on Apr 1 2022, 10:33 AM.

Diff Detail

Event Timeline

sunshaoce created this revision.Apr 1 2022, 10:33 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 1 2022, 10:33 AM
sunshaoce requested review of this revision.Apr 1 2022, 10:33 AM
sunshaoce updated this revision to Diff 420063.Apr 3 2022, 9:05 AM

Support float-arith in RV64 Zfinx

sunshaoce edited the summary of this revision. (Show Details)Apr 3 2022, 9:06 AM
sunshaoce retitled this revision from [RISCV][CodeGen] Support float-arith in RV32 zfinx to [RISCV][CodeGen] Support float-arith in Zfinx.
sunshaoce added subscribers: realqhc, liaolucy.
gabcoh added a subscriber: gabcoh.Apr 3 2022, 10:49 AM
sunshaoce updated this revision to Diff 420709.Apr 5 2022, 10:19 PM

Add abi info in test files

hughperkins added a subscriber: hughperkins.EditedApr 12 2022, 3:35 PM

Note: I get a crash when I try to compile some LLVM IR. Crash trace:

$ bin/llc --march riscv32 -mattr +zfinx test.ll
LLVM ERROR: Cannot select: t8: ch = store<(store (s32) into %ir.4)> t7, t4, t6, undef:i32
  t4: f32,ch = load<(dereferenceable load (s32) from %ir.2)> t0, FrameIndex:i32<1>, undef:i32
    t1: i32 = FrameIndex<1>
    t3: i32 = undef
  t6: i32,ch = load<(dereferenceable load (s32) from %ir.1, align 8)> t0, FrameIndex:i32<0>, undef:i32
    t5: i32 = FrameIndex<0>
    t3: i32 = undef
  t3: i32 = undef
In function: _Z10sum_floatsPfjS_
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	Program arguments: bin/llc --march riscv32 -mattr +zfinx test.ll
1.	Running pass 'Function Pass Manager' on module 'test.ll'.
2.	Running pass 'RISCV DAG->DAG Pattern Instruction Selection' on function '@_Z10sum_floatsPfjS_'
Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
0  llc                      0x000000010974962d llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) + 61
1  llc                      0x0000000109749bdb PrintStackTraceSignalHandler(void*) + 27
2  llc                      0x000000010974785b llvm::sys::RunSignalHandlers() + 139
3  llc                      0x000000010974b7d8 SignalHandler(int) + 232
4  libsystem_platform.dylib 0x00007fff206e5d7d _sigtramp + 29
5  libsystem_platform.dylib 000000000000000000 _sigtramp + 18446603339972059808
6  libsystem_c.dylib        0x00007fff205f5406 abort + 125
7  llc                      0x00000001095aae4a llvm::report_fatal_error(llvm::Twine const&, bool) + 394
8  llc                      0x0000000109432305 llvm::SelectionDAGISel::CannotYetSelect(llvm::SDNode*) + 725
9  llc                      0x000000010942dd72 llvm::SelectionDAGISel::SelectCodeCommon(llvm::SDNode*, unsigned char const*, unsigned int) + 26690
10 llc                      0x0000000106faec6c llvm::RISCVDAGToDAGISel::SelectCode(llvm::SDNode*) + 44
11 llc                      0x0000000106fade73 llvm::RISCVDAGToDAGISel::Select(llvm::SDNode*) + 31299
12 llc                      0x000000010941e7fd llvm::SelectionDAGISel::DoInstructionSelection() + 1629
13 llc                      0x000000010941cf4d llvm::SelectionDAGISel::CodeGenAndEmitDAG() + 6205
14 llc                      0x000000010941b56f llvm::SelectionDAGISel::SelectBasicBlock(llvm::ilist_iterator<llvm::ilist_detail::node_options<llvm::Instruction, true, false, void>, false, true>, llvm::ilist_iterator<llvm::ilist_detail::node_options<llvm::Instruction, true, false, void>, false, true>, bool&) + 399
15 llc                      0x0000000109419649 llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&) + 6329
16 llc                      0x0000000109415d2c llvm::SelectionDAGISel::runOnMachineFunction(llvm::MachineFunction&) + 2268
17 llc                      0x0000000106fb0668 llvm::RISCVDAGToDAGISel::runOnMachineFunction(llvm::MachineFunction&) + 56
18 llc                      0x0000000107cc0dcd llvm::MachineFunctionPass::runOnFunction(llvm::Function&) + 541
19 llc                      0x0000000108645dbc llvm::FPPassManager::runOnFunction(llvm::Function&) + 700
20 llc                      0x000000010864dce5 llvm::FPPassManager::runOnModule(llvm::Module&) + 117
21 llc                      0x0000000108646784 (anonymous namespace)::MPPassManager::runOnModule(llvm::Module&) + 772
22 llc                      0x00000001086462a8 llvm::legacy::PassManagerImpl::run(llvm::Module&) + 296
23 llc                      0x000000010864e081 llvm::legacy::PassManager::run(llvm::Module&) + 33
24 llc                      0x0000000106eeba16 compileModule(char**, llvm::LLVMContext&) + 6742
25 llc                      0x0000000106ee977d main + 1501
26 libdyld.dylib            0x00007fff206bbf3d start + 1
27 libdyld.dylib            0x0000000000000006 start + 18446603339972231370
Abort trap: 6

Corresponding LLVM IR:

define dso_local void @_Z10sum_floatsPfjS_() #0 {
  %1 = alloca float*, align 8
  %2 = alloca float, align 4
  %3 = load float, float* %2, align 4
  %4 = load float*, float** %1, align 8
  store float %3, float* %4, align 4
  ret void
}

(Note that it compiles just fine with +f (but using float registers...), or with nothing (but uses softwawre emulation for the fadd)

(Edited to shorten llvm ir code a lot :) )

hughperkins added a comment.EditedApr 13 2022, 2:41 AM

Ok, I've since learned that this PR doesnt handle load/store. Only arithmetic.

However note that immediate float arithmetic also seems to be missing, e.g. the following crashes

define noundef float @_Z4funcf(float noundef %0) #0 {
  %2 = fadd float %0, 3.000000e+00
  ret float %2
}

Compiled with:

bin/llc --march riscv32 -mattr +zfinx test_fadd_imm.ll

Gives:

LLVM ERROR: Cannot select: t15: f32,ch = load<(load (s32) from constant-pool)> t0, t19, undef:i32
  t19: i32 = ADDI t18, TargetConstantPool:i32<float 3.000000e+00> 0 [TF=3]
    t18: i32 = LUI TargetConstantPool:i32<float 3.000000e+00> 0 [TF=4]
      t16: i32 = TargetConstantPool<float 3.000000e+00> 0 [TF=4]
    t17: i32 = TargetConstantPool<float 3.000000e+00> 0 [TF=3]
  t14: i32 = undef
In function: _Z4funcf

(but compiles ok with +f)

hughperkins added inline comments.Apr 13 2022, 3:39 AM
llvm/lib/Target/RISCV/RISCVInstrInfoF.td
548

I think it's confusing that FINX in this context means F and F_INX. I feel it might be good to make more explicit that FINX includes F. How to do that is an open question. I'd suggest FFINX, but FFINX is already taken to mean FF with FF_INX. Maybe we need an explicit acronym to mean 'F and F_INX', like e.g. 'ALLF'. Or perhaps use a verbose form like 'F_AND_FINX'? (If it was my own decision I'd go for the explicit F_AND_FINX)

Also, not really part of this PR, but I don't know where else to put this. This code:

def FF        : ExtInfo_rr<FExt,       FPR32,    FPR32>;
def FF_INX    : ExtInfo_rr<ZfinxExt,   FPR32INX, FPR32INX>;
def FX        : ExtInfo_rr<FExt,       FPR32,    GPR>;
def FX_INX    : ExtInfo_rr<ZfinxExt,   FPR32INX, GPR>;
def FX_64     : ExtInfo_rr<F64Ext,     FPR32,    GPR>;
def FX_INX_64 : ExtInfo_rr<Zfinx64Ext, FPR32INX, GPR>;
def XF        : ExtInfo_rr<FExt,       GPR,      FPR32>;
def XF_64     : ExtInfo_rr<F64Ext,     GPR,      FPR32>;
def XF_INX    : ExtInfo_rr<ZfinxExt,   GPR,      FPR32INX>;
def XF_INX_64 : ExtInfo_rr<Zfinx64Ext, GPR,      FPR32INX>;

... I feel it would be good to add a docstring explaining what this is, like e.g. something like:

Each of these definitions represents a single combination of register spaces.
For example, `FX` means that the destination register is a float register (`FPR32`), e.g. `ft1`,
and that the source register is a general `x` register (`GPR`), e.g. `x1`.
The `_INX` versions are for the `+zfinx` extension. In this case, `FPR32INX` is effectively a synonym in practice
for `GPR`. At least `FPR32INX` are emitted as general `x` registers, even though they 'look like' float registers
for much of the tablegen process.

Ok, I've since learned that this PR doesnt handle load/store. Only arithmetic.

However note that immediate float arithmetic also seems to be missing, e.g. the following crashes

define noundef float @_Z4funcf(float noundef %0) #0 {
  %2 = fadd float %0, 3.000000e+00
  ret float %2
}

Compiled with:

bin/llc --march riscv32 -mattr +zfinx test_fadd_imm.ll

Gives:

LLVM ERROR: Cannot select: t15: f32,ch = load<(load (s32) from constant-pool)> t0, t19, undef:i32
  t19: i32 = ADDI t18, TargetConstantPool:i32<float 3.000000e+00> 0 [TF=3]
    t18: i32 = LUI TargetConstantPool:i32<float 3.000000e+00> 0 [TF=4]
      t16: i32 = TargetConstantPool<float 3.000000e+00> 0 [TF=4]
    t17: i32 = TargetConstantPool<float 3.000000e+00> 0 [TF=3]
  t14: i32 = undef
In function: _Z4funcf

(but compiles ok with +f)

Thanks, I'll try to fix it.

llvm/lib/Target/RISCV/RISCVInstrInfoF.td
548

Your suggestion is very good, but I think FINX should be replaced with a shorter name because it appears as a parameter, eg F_INX. And consider changing the name of F_INX to FprINX. Would you like to submit a new patch about this?

Thanks, I'll try to fix it.

Awesome :) Thank you :)

As a general comment, after thinking about this whilst brushing my teeth, I kind of feel that it might be clearer and simpler for everyone, both F people and Zfinx people, if we decoupled F and Zfinx into two separate files. This way F people don't have to figure out what is Zfinx. They just keep their clean separate F file. and Zfinx people dont have to think about how our bits mangle with F bits. Yes involves a bit of duplication, but if it's clearer/cleaner for everyone, maybe it might be a good approach?

After all, Zfinx is not an extension of F. It's a separate, mutually incompatible extension. And it's a fairly niche extension, I feel.

(Conversely, if we feel that zfinx is a non-niche extension, then it should have a shorter name I feel, eg maybe U for untyped registers)

asb added a comment.Apr 13 2022, 8:23 AM

As a general comment, after thinking about this whilst brushing my teeth, I kind of feel that it might be clearer and simpler for everyone, both F people and Zfinx people, if we decoupled F and Zfinx into two separate files. This way F people don't have to figure out what is Zfinx. They just keep their clean separate F file. and Zfinx people dont have to think about how our bits mangle with F bits. Yes involves a bit of duplication, but if it's clearer/cleaner for everyone, maybe it might be a good approach?

After all, Zfinx is not an extension of F. It's a separate, mutually incompatible extension. And it's a fairly niche extension, I feel.

Hi Hugh, and welcome! The naming of the extension is outside of our control I'm afraid.

I see your point, though the main reasoning about keeping F and Zfinx together is that they are largely the same instructions, and the hope is to define instructions and patterns together where feasible.

hughperkins added a comment.EditedApr 13 2022, 3:31 PM

I see your point, though the main reasoning about keeping F and Zfinx together is that they are largely the same instructions, and the hope is to define instructions and patterns together where feasible.

Yeah, I totally understand that. And when I first looked at this PR, I thought that made total sense. But then I saw how challenging it is to try to combine the two.

As far as the general principle of 'lets keep things together with similar patterns', the same argument can be levied against the pair of F and D, and yet they are in separate files, and I currently feel putting F and D in separate files is likely a reasonable idea. Despite their apparent similarities (32 vs 64 bits...), it is simpler and easier to keep D and F separate. Compared to D vs F, F vs Zfinx are I feel more different, and there is more reason to keep them apart. But even if they were the same difference as D and F, I feel keeping Zfinx and F apart coudl be a good idea:

  • Zfinx is fairly niche, and seems unfair to 'pollute' F with it?
  • I suspect Zfinx might be easier to implement if it is kept separaet fomr F.

Anyway, I seem likely to at least dabble in creating a decoupled version of Zfinx. I sort of have a hunch it will be eiaser. If it's not, well, I will come back, and say "well, yes, you were right, it is good to keep Zfinx and F together because ... [reasons X, Y, Z]".

asb added a comment.Apr 14 2022, 1:17 AM

@hughperkins: You make good points. It's not immediately obvious what kind of splitting would be better, and at what point the added complexity of trying to have piggy-back zfinx on current F/D patterns hampers readability and maintainability rather than enhance it. If you have time to play around with some options, I'd be really interested in hearing what you come up with.

Oooo. you've made load and store work :) Taht's fantastic :) Thank you :)

sunshaoce updated this revision to Diff 427929.May 8 2022, 6:17 AM

Fully support zfinx codegen

sunshaoce retitled this revision from [RISCV][CodeGen] Support float-arith in Zfinx to [RISCV][CodeGen] Support Zfinx codegen.May 8 2022, 6:19 AM

By the way, whilst I remember, since @sunshaoce has implemented all the functionality I need (for now), so I'm not intending to spend any time on working on this. Everything I tested in this PR worked great for me :)

sunshaoce updated this revision to Diff 427933.May 8 2022, 7:00 AM

Fix errors

sunshaoce marked an inline comment as done.

Support zdinx,zhinx codegen

sunshaoce retitled this revision from [RISCV][CodeGen] Support Zfinx codegen to [RISCV][CodeGen] Support Zfinx,Zdinx,Zhinx,Zhinxmin codegen.Mon, Jun 20, 10:47 PM