Enable FP16 complex FMA instructions.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/Target/X86/X86ScheduleZnver3.td | ||
---|---|---|
64 ↗ | (On Diff #355794) | You could avoid this change if you add a scheduler class to whatever instruction is complaining? |
llvm/lib/Target/X86/X86ScheduleZnver3.td | ||
---|---|---|
64 ↗ | (On Diff #355794) | /me goes to add herald rule that i forgot to add |
clang/lib/Headers/avx512fp16intrin.h | ||
---|---|---|
2955 | Outer brackets |
- Rebase.
- Add _mm_mask3_fcmadd_sch and _mm_mask3_fcmadd_round_sch.
- Address comments from Yuanke and Simon.
clang/test/CodeGen/X86/avx512fp16-builtins.c | ||
---|---|---|
4223 | MADD? | |
4315 | MADD? | |
llvm/include/llvm/IR/IntrinsicsX86.td | ||
5736 | _cph? | |
5800 | _csh? | |
llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp | ||
3903 | "b" means rounding. Right? | |
3949 | Sorry, I didn't find the constraint in the spec. | |
llvm/lib/Target/X86/X86ISelLowering.cpp | ||
47414 | Could swapping LHS and RHS remove some redundant code? | |
47421 | The lambda seems to be called only once. | |
47423 | Is it possible for fast and non-fast instructions to be mixed due to inlining? Should we check the instruction's AllowContract flag? | |
47436 | Merge it into the previous line. | |
llvm/lib/Target/X86/X86InstrAVX512.td | ||
5772 | Would moving ClobberConstraint before IsCommutable save the code for the default value? | |
13593 | The name doesn't seem accurate. Is it cfmop for mul and cfmaop for fma? | |
13629 | I didn't see this flag for other scalar instructions. Why do we need it for complex instructions? | |
llvm/lib/Target/X86/X86InstrFoldTables.cpp | ||
1852 | Why is an FR32X version not needed for complex scalar instructions? |
llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp | ||
---|---|---|
3903 | broadcasting | |
3949 | #UD if (dest_reg == src1_reg) or ( dest_reg == src2_reg) | |
llvm/lib/Target/X86/X86InstrAVX512.td | ||
13629 | Because all complex instructions have the constraint "dst != src1 && dst != src2". We use earlyclobber to keep dst from being assigned to the same register as src1 or src2. |
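For readers following the 3949 and 13629 threads, the operand constraint can be written down as a small predicate. This is only an illustrative sketch, not the code from the patch: `RegId` and `destOverlapsSource` are hypothetical names, and registers are modelled as plain integers.

```cpp
#include <cstdint>

// Hypothetical register identifier for illustration only; the real
// assembler and backend use their own register types.
using RegId = std::uint32_t;

// Returns true when the operands violate the constraint quoted above,
// i.e. the case described as "#UD if (dest_reg == src1_reg) or
// (dest_reg == src2_reg)".
constexpr bool destOverlapsSource(RegId dest, RegId src1, RegId src2) {
  return dest == src1 || dest == src2;
}
```

Note that src1 and src2 may still name the same register; only an overlap with the destination is rejected. Marking the destination as earlyclobber asks the register allocator to avoid that overlap, while a check of this shape would be what an assembler-side diagnostic tests.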
llvm/lib/Target/X86/X86InstrFoldTables.cpp | ||
---|---|---|
1852 | Do you mean complex ss/sd? We don't have these instructions. |
llvm/lib/Target/X86/X86InstrFoldTables.cpp | ||
---|---|---|
1852 | No, I mean we have both X86::XXX and X86::XXX_Int for other instructions. One is FR16X which can be unfolded, one is VR128X which can't. For example, VFNMADD213SHZm and VFNMADD213SHZm_Int. |
llvm/lib/Target/X86/X86InstrFoldTables.cpp | ||
---|---|---|
1852 | The VFCMULCSHZrr instructions produce two 16-bit values packed into the lower 32 bits. That would mean we would need an FR32X result, but it couldn't interact meaningfully with any other FR32X instruction since it's really two values. I think we only have FR32/FR64 instructions for things that have generic IR equivalents or that we create from other generic IR operations. For example, I think we have an FR32 RCP and RSQRT because we can convert float div or 1/sqrt to them. |
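To make the "really two values" point concrete, here is a rough sketch of two 16-bit payloads sharing one 32-bit lane. The helper names are hypothetical, and placing the real part in the low half and the imaginary part in the high half is an assumption made for illustration.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");

// Pack two raw 16-bit payloads into one 32-bit lane.
// Assumed layout: real part in bits 0..15, imaginary part in bits 16..31.
constexpr std::uint32_t packComplexHalf(std::uint16_t re, std::uint16_t im) {
  return static_cast<std::uint32_t>(re) |
         (static_cast<std::uint32_t>(im) << 16);
}

// Recover the individual 16-bit payloads from the lane.
constexpr std::uint16_t realBits(std::uint32_t lane) {
  return static_cast<std::uint16_t>(lane & 0xFFFFu);
}
constexpr std::uint16_t imagBits(std::uint32_t lane) {
  return static_cast<std::uint16_t>(lane >> 16);
}

// Reinterpret the same 32 bits as a single float, which is what treating
// the result as an ordinary scalar float register value would amount to.
inline float reinterpretAsFloat(std::uint32_t lane) {
  float f;
  std::memcpy(&f, &lane, sizeof f);
  return f;
}
```

With the half-precision bit patterns 0x3C00 (1.0) and 0x4000 (2.0) as inputs, the lane round-trips through `realBits` and `imagBits`, but `reinterpretAsFloat` yields a number that is neither component, so a scalar float operation on it would not be meaningful.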
llvm/lib/Target/X86/X86InstrFoldTables.cpp | ||
---|---|---|
1852 | Thanks, Craig. I understand now. :) |
llvm/lib/Target/X86/X86InstrAVX512.td | ||
---|---|---|
13629 | Got it. Thanks! |
Thanks for the review.
clang/test/CodeGen/X86/avx512fp16-builtins.c | ||
---|---|---|
4223 | They are marks used when adding tests. We can remove them now. |
llvm/lib/Target/X86/X86ISelLowering.cpp | ||
---|---|---|
47544 | Sorry, I don't understand the comments. What does FMF mean? |
llvm/lib/Target/X86/X86ISelLowering.cpp | ||
---|---|---|
47544 | fast math flags? |
llvm/lib/Target/X86/X86ISelLowering.cpp | ||
---|---|---|
47544 | I understand now. Thanks, Simon. :) |
llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp | ||
---|---|---|
3903 | Sorry, my mistake. Here b is supposed to represent the EVEX.b bit in the encoding. |