This patch added the MC layer support of Zfinx extension.
Authored-by: StephenFan
Co-Authored-by: Shao-Ce Sun
Differential D93298
[RISCV] add the MC layer support of Zfinx extension achieveartificialintelligence on Dec 15 2020, 6:53 AM. Authored by
Details This patch added the MC layer support of Zfinx extension. Authored-by: StephenFan
Diff Detail
Event TimelineThere are a very large number of changes, so older changes are hidden. Show Older Changes
Comment Actions Thanks for your work on this. The way you've managed to use multiclasses to handle this with the 'ExtInfo' definitions takes a bit of unpicking to follow, but it does a really good job of keeping the instruction definitions largely unmodified. Nice! I've got a few requests/questions on the test coverage:
Comment Actions Not an issue for this MC-layer patch, but I've created https://github.com/riscv/riscv-zfinx/issues/14 to point out what seems to be an incorrect statement about the status quo on the ABI for 32-bit floating point types on RV64 in the Zfinx spec. Comment Actions @asb Thank you very much! Could you please take a look at what needs to be modified now? Comment Actions Assuming this will be merged soon, do you want to submit a backport request for the 14.0 branch?
Comment Actions I think all my comments have been addressed. @craig.topper - are you happy your RegInfo question is addressed? Comment Actions In that case, LGTM (needs a rebase though). Thanks for your patience on this @achieveartificialintelligence. Comment Actions It appears that this is causing an assertion segfault in a rustc test over at our experimental rust + llvm@head bot:
The assertion: Impossible reg-to-reg copy UNREACHABLE executed at [...]/rust/src/llvm-project/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp:350 The IR for riscv-types.rs: ; ModuleID = 'riscv_types.4cedf4b7-cgu.0' source_filename = "riscv_types.4cedf4b7-cgu.0" target datalayout = "e-m:e-p:64:64-i64:64-i128:128-n64-S128" target triple = "riscv64" @extern_static = external dso_local global i8 @alloc55 = private unnamed_addr constant <{ [6 x i8] }> <{ [6 x i8] c"reg_i8" }>, align 1 @alloc56 = private unnamed_addr constant <{ [7 x i8] }> <{ [7 x i8] c"reg_i16" }>, align 1 @alloc57 = private unnamed_addr constant <{ [7 x i8] }> <{ [7 x i8] c"reg_i32" }>, align 1 @alloc58 = private unnamed_addr constant <{ [7 x i8] }> <{ [7 x i8] c"reg_f32" }>, align 1 @alloc59 = private unnamed_addr constant <{ [7 x i8] }> <{ [7 x i8] c"reg_i64" }>, align 1 @alloc60 = private unnamed_addr constant <{ [7 x i8] }> <{ [7 x i8] c"reg_f64" }>, align 1 @alloc61 = private unnamed_addr constant <{ [7 x i8] }> <{ [7 x i8] c"reg_ptr" }>, align 1 @alloc62 = private unnamed_addr constant <{ [8 x i8] }> <{ [8 x i8] c"freg_f32" }>, align 1 @alloc63 = private unnamed_addr constant <{ [8 x i8] }> <{ [8 x i8] c"freg_f64" }>, align 1 @alloc64 = private unnamed_addr constant <{ [5 x i8] }> <{ [5 x i8] c"a0_i8" }>, align 1 @alloc65 = private unnamed_addr constant <{ [6 x i8] }> <{ [6 x i8] c"a0_i16" }>, align 1 @alloc66 = private unnamed_addr constant <{ [6 x i8] }> <{ [6 x i8] c"a0_i32" }>, align 1 @alloc67 = private unnamed_addr constant <{ [6 x i8] }> <{ [6 x i8] c"a0_f32" }>, align 1 @alloc68 = private unnamed_addr constant <{ [6 x i8] }> <{ [6 x i8] c"a0_i64" }>, align 1 @alloc69 = private unnamed_addr constant <{ [6 x i8] }> <{ [6 x i8] c"a0_f64" }>, align 1 @alloc70 = private unnamed_addr constant <{ [6 x i8] }> <{ [6 x i8] c"a0_ptr" }>, align 1 @alloc71 = private unnamed_addr constant <{ [7 x i8] }> <{ [7 x i8] c"fa0_f32" }>, align 1 @alloc72 = private unnamed_addr constant <{ [7 x i8] }> <{ [7 x i8] c"fa0_f64" }>, align 1 ; Function Attrs: nounwind define dso_local void @sym_fn() unnamed_addr #0 { start: tail call void asm sideeffect alignstack "call ${0:c}", "s,~{vtype},~{vl},~{vxsat},~{vxrm},~{memory}"(void ()* nonnull @extern_func) #1, !srcloc !1 ret void } ; Function Attrs: nounwind define dso_local void @sym_static() unnamed_addr #0 { start: tail call void asm sideeffect alignstack "lb t0, ${0:c}", "s,~{vtype},~{vl},~{vxsat},~{vxrm},~{memory}"(i8* nonnull @extern_static) #1, !srcloc !2 ret void } ; Function Attrs: nounwind define dso_local i8 @reg_i8(i8 %x) unnamed_addr #0 { start: tail call void @dont_merge([0 x i8]* noalias noundef nonnull readonly align 1 bitcast (<{ [6 x i8] }>* @alloc55 to [0 x i8]*), i64 6) #1 %0 = tail call i8 asm sideeffect alignstack "mv ${0}, ${1}", "=&r,r,~{vtype},~{vl},~{vxsat},~{vxrm},~{memory}"(i8 %x) #1, !srcloc !3 ret i8 %0 } ; Function Attrs: nounwind define dso_local i16 @reg_i16(i16 %x) unnamed_addr #0 { start: tail call void @dont_merge([0 x i8]* noalias noundef nonnull readonly align 1 bitcast (<{ [7 x i8] }>* @alloc56 to [0 x i8]*), i64 7) #1 %0 = tail call i16 asm sideeffect alignstack "mv ${0}, ${1}", "=&r,r,~{vtype},~{vl},~{vxsat},~{vxrm},~{memory}"(i16 %x) #1, !srcloc !3 ret i16 %0 } ; Function Attrs: nounwind define dso_local i32 @reg_i32(i32 %x) unnamed_addr #0 { start: tail call void @dont_merge([0 x i8]* noalias noundef nonnull readonly align 1 bitcast (<{ [7 x i8] }>* @alloc57 to [0 x i8]*), i64 7) #1 %0 = tail call i32 asm sideeffect alignstack "mv ${0}, ${1}", "=&r,r,~{vtype},~{vl},~{vxsat},~{vxrm},~{memory}"(i32 %x) #1, !srcloc !3 ret i32 %0 } ; Function Attrs: nounwind define dso_local float @reg_f32(float %x) unnamed_addr #0 { start: tail call void @dont_merge([0 x i8]* noalias noundef nonnull readonly align 1 bitcast (<{ [7 x i8] }>* @alloc58 to [0 x i8]*), i64 7) #1 %0 = tail call float asm sideeffect alignstack "mv ${0}, ${1}", "=&r,r,~{vtype},~{vl},~{vxsat},~{vxrm},~{memory}"(float %x) #1, !srcloc !3 ret float %0 } ; Function Attrs: nounwind define dso_local i64 @reg_i64(i64 %x) unnamed_addr #0 { start: tail call void @dont_merge([0 x i8]* noalias noundef nonnull readonly align 1 bitcast (<{ [7 x i8] }>* @alloc59 to [0 x i8]*), i64 7) #1 %0 = tail call i64 asm sideeffect alignstack "mv ${0}, ${1}", "=&r,r,~{vtype},~{vl},~{vxsat},~{vxrm},~{memory}"(i64 %x) #1, !srcloc !3 ret i64 %0 } ; Function Attrs: nounwind define dso_local double @reg_f64(double %x) unnamed_addr #0 { start: tail call void @dont_merge([0 x i8]* noalias noundef nonnull readonly align 1 bitcast (<{ [7 x i8] }>* @alloc60 to [0 x i8]*), i64 7) #1 %0 = tail call double asm sideeffect alignstack "mv ${0}, ${1}", "=&r,r,~{vtype},~{vl},~{vxsat},~{vxrm},~{memory}"(double %x) #1, !srcloc !3 ret double %0 } ; Function Attrs: nounwind define dso_local i8* @reg_ptr(i8* %x) unnamed_addr #0 { start: tail call void @dont_merge([0 x i8]* noalias noundef nonnull readonly align 1 bitcast (<{ [7 x i8] }>* @alloc61 to [0 x i8]*), i64 7) #1 %0 = tail call i8* asm sideeffect alignstack "mv ${0}, ${1}", "=&r,r,~{vtype},~{vl},~{vxsat},~{vxrm},~{memory}"(i8* %x) #1, !srcloc !3 ret i8* %0 } ; Function Attrs: nounwind define dso_local float @freg_f32(float %x) unnamed_addr #0 { start: tail call void @dont_merge([0 x i8]* noalias noundef nonnull readonly align 1 bitcast (<{ [8 x i8] }>* @alloc62 to [0 x i8]*), i64 8) #1 %0 = tail call float asm sideeffect alignstack "fmv.s ${0}, ${1}", "=&f,f,~{vtype},~{vl},~{vxsat},~{vxrm},~{memory}"(float %x) #1, !srcloc !3 ret float %0 } ; Function Attrs: nounwind define dso_local double @freg_f64(double %x) unnamed_addr #0 { start: tail call void @dont_merge([0 x i8]* noalias noundef nonnull readonly align 1 bitcast (<{ [8 x i8] }>* @alloc63 to [0 x i8]*), i64 8) #1 %0 = tail call double asm sideeffect alignstack "fmv.d ${0}, ${1}", "=&f,f,~{vtype},~{vl},~{vxsat},~{vxrm},~{memory}"(double %x) #1, !srcloc !3 ret double %0 } ; Function Attrs: nounwind define dso_local i8 @a0_i8(i8 %x) unnamed_addr #0 { start: tail call void @dont_merge([0 x i8]* noalias noundef nonnull readonly align 1 bitcast (<{ [5 x i8] }>* @alloc64 to [0 x i8]*), i64 5) #1 %0 = tail call i8 asm sideeffect alignstack "mv a0, a0", "={x10},{x10},~{vtype},~{vl},~{vxsat},~{vxrm},~{memory}"(i8 %x) #1, !srcloc !4 ret i8 %0 } ; Function Attrs: nounwind define dso_local i16 @a0_i16(i16 %x) unnamed_addr #0 { start: tail call void @dont_merge([0 x i8]* noalias noundef nonnull readonly align 1 bitcast (<{ [6 x i8] }>* @alloc65 to [0 x i8]*), i64 6) #1 %0 = tail call i16 asm sideeffect alignstack "mv a0, a0", "={x10},{x10},~{vtype},~{vl},~{vxsat},~{vxrm},~{memory}"(i16 %x) #1, !srcloc !4 ret i16 %0 } ; Function Attrs: nounwind define dso_local i32 @a0_i32(i32 %x) unnamed_addr #0 { start: tail call void @dont_merge([0 x i8]* noalias noundef nonnull readonly align 1 bitcast (<{ [6 x i8] }>* @alloc66 to [0 x i8]*), i64 6) #1 %0 = tail call i32 asm sideeffect alignstack "mv a0, a0", "={x10},{x10},~{vtype},~{vl},~{vxsat},~{vxrm},~{memory}"(i32 %x) #1, !srcloc !4 ret i32 %0 } ; Function Attrs: nounwind define dso_local float @a0_f32(float %x) unnamed_addr #0 { start: tail call void @dont_merge([0 x i8]* noalias noundef nonnull readonly align 1 bitcast (<{ [6 x i8] }>* @alloc67 to [0 x i8]*), i64 6) #1 %0 = tail call float asm sideeffect alignstack "mv a0, a0", "={x10},{x10},~{vtype},~{vl},~{vxsat},~{vxrm},~{memory}"(float %x) #1, !srcloc !4 ret float %0 } ; Function Attrs: nounwind define dso_local i64 @a0_i64(i64 %x) unnamed_addr #0 { start: tail call void @dont_merge([0 x i8]* noalias noundef nonnull readonly align 1 bitcast (<{ [6 x i8] }>* @alloc68 to [0 x i8]*), i64 6) #1 %0 = tail call i64 asm sideeffect alignstack "mv a0, a0", "={x10},{x10},~{vtype},~{vl},~{vxsat},~{vxrm},~{memory}"(i64 %x) #1, !srcloc !4 ret i64 %0 } ; Function Attrs: nounwind define dso_local double @a0_f64(double %x) unnamed_addr #0 { start: tail call void @dont_merge([0 x i8]* noalias noundef nonnull readonly align 1 bitcast (<{ [6 x i8] }>* @alloc69 to [0 x i8]*), i64 6) #1 %0 = tail call double asm sideeffect alignstack "mv a0, a0", "={x10},{x10},~{vtype},~{vl},~{vxsat},~{vxrm},~{memory}"(double %x) #1, !srcloc !4 ret double %0 } ; Function Attrs: nounwind define dso_local i8* @a0_ptr(i8* %x) unnamed_addr #0 { start: tail call void @dont_merge([0 x i8]* noalias noundef nonnull readonly align 1 bitcast (<{ [6 x i8] }>* @alloc70 to [0 x i8]*), i64 6) #1 %0 = tail call i8* asm sideeffect alignstack "mv a0, a0", "={x10},{x10},~{vtype},~{vl},~{vxsat},~{vxrm},~{memory}"(i8* %x) #1, !srcloc !4 ret i8* %0 } ; Function Attrs: nounwind define dso_local float @fa0_f32(float %x) unnamed_addr #0 { start: tail call void @dont_merge([0 x i8]* noalias noundef nonnull readonly align 1 bitcast (<{ [7 x i8] }>* @alloc71 to [0 x i8]*), i64 7) #1 %0 = tail call float asm sideeffect alignstack "fmv.s fa0, fa0", "={f10},{f10},~{vtype},~{vl},~{vxsat},~{vxrm},~{memory}"(float %x) #1, !srcloc !4 ret float %0 } ; Function Attrs: nounwind define dso_local double @fa0_f64(double %x) unnamed_addr #0 { start: tail call void @dont_merge([0 x i8]* noalias noundef nonnull readonly align 1 bitcast (<{ [7 x i8] }>* @alloc72 to [0 x i8]*), i64 7) #1 %0 = tail call double asm sideeffect alignstack "fmv.d fa0, fa0", "={f10},{f10},~{vtype},~{vl},~{vxsat},~{vxrm},~{memory}"(double %x) #1, !srcloc !4 ret double %0 } ; Function Attrs: nounwind declare dso_local void @extern_func() unnamed_addr #0 ; Function Attrs: nounwind declare dso_local void @dont_merge([0 x i8]* noalias noundef nonnull readonly align 1, i64) unnamed_addr #0 attributes #0 = { nounwind "target-cpu"="generic-rv64" } attributes #1 = { nounwind } !llvm.module.flags = !{!0} !0 = !{i32 1, !"Code Model", i32 3} !1 = !{i32 1048} !2 = !{i32 1281} !3 = !{i32 1670} !4 = !{i32 2113} @achieveartificialintelligence could you please take a look? Any ideas what's going wrong here? Comment Actions This is just MC layer support. Inline assembly interacts with CodeGen; I assume the register classes/constraints still need updating to support Zfinx.
Comment Actions Hm, actually, looking at the log, it doesn't appear to be trying to use Zfinx? My guess is that putting a float in a GPR has broken as that's now legal but should only be legal for Zfinx? Comment Actions
I dont think that patch author is required to debug this issue for "experimental rust + llvm@head" - downstream.
There is no requirement, indeed. LLVM devs are not responsible to keep ""experimental rust + llvm@head" working. Comment Actions They're required to assist with fixing regressions that show when compiling legitimate IR. An IR reproducer was given, so no Rust toolchain knowledge should be needed, at that point it's the LLVM dev's responsibility, and if they can't fix it in a timely manner then it should be backed out. Comment Actions Then yeah, if IR is provided and is standard one, this is okay. Consider to add that IR as testcase for the fix. Comment Actions Because I happened to also run into this, reduced IR for -mtriple=riscv32 -mattr=+d is: define float @test(float %x) { %1 = tail call float asm sideeffect alignstack "mv a0, a0", "={x10},{x10}"(float 0.000000e+00) ret float 0.000000e+00 } Comment Actions @achieveartificialintelligence any progress with the problematic IR? Have you been able to reproduce using nikic's reduced example? Comment Actions Here's a fix that I tested on the original repreoduce. It detects that we picked one of the new register classes and tries to redirect back to the normal GPR register class if the GPR register class has the same width. I also checked for MVT::Other to use the GPR class if there was no type. diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp index 3daf2d03de03..4984872139c5 100644 --- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp +++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp @@ -10893,7 +10893,28 @@ RISCVTargetLowering::getRegForInlineAsmConstraint(const TargetRegisterInfo *TRI, } } - return TargetLowering::getRegForInlineAsmConstraint(TRI, Constraint, VT); + std::pair<Register, const TargetRegisterClass*> Res; + Res = TargetLowering::getRegForInlineAsmConstraint(TRI, Constraint, VT); + + if (Res.second == &RISCV::GPRF32RegClass) { + if (!Subtarget.is64Bit() || VT == MVT::Other) + return std::make_pair(Res.first, &RISCV::GPRRegClass); + return std::make_pair(0, nullptr); + } + + if (Res.second == &RISCV::GPRF64RegClass) { + if (Subtarget.is64Bit() || VT == MVT::Other) + return std::make_pair(Res.first, &RISCV::GPRRegClass); + return std::make_pair(0, nullptr); + } + + if (Res.second == &RISCV::GPRF16RegClass) { + if (VT == MVT::Other) + return std::make_pair(Res.first, &RISCV::GPRRegClass); + return std::make_pair(0, nullptr); + } + + return Res; }
Comment Actions @achieveartificialintelligence thank you for looking into this, sorry for the late reply! It looks like the latest version addresses @nikic's IR reproducer with -mtriple=riscv32. There seems to be an error trying this out with -mtriple=riscv64: % cat test.ll define float @test(float %x) { %1 = tail call float asm sideeffect alignstack "mv a0, a0", "={x10},{x10}"(float 0.000000e+00) ret float 0.000000e+00 } % llc -mtriple=riscv32 -mattr=+d test.ll % llc -mtriple=riscv64 -mattr=+d test.ll error: couldn't allocate output register for constraint '{x10}' I'm not sure if this is intended to work -mtriple=riscv64, but judging by the newly added test RUN: llc -mtriple=riscv64 ... line it seems like it should? Comment Actions Does a double with r for RV32 work with that fix? That's supposed to give the low half of the register. You might need to also deal with the register pair class? Comment Actions
Thank you Craig! That fixed it. |
Do we need to enforce that these can't be mixed with F, D, and Zfh?