Support Intel AMX-FP16 instruction
- Repository
- rG LLVM Github Monorepo
clang/test/CodeGen/amx_fp16.c | ||
---|---|---|
8 | Remove | |
clang/test/CodeGen/amx_fp16_errors.c | ||
7 ↗ | (On Diff #467683) | Remove |
llvm/lib/Support/X86TargetParser.cpp | ||
586 | It should not relate to AVX512FP16. | |
llvm/lib/Target/X86/X86ISelLowering.cpp | ||
36990–36991 | Format the code? | |
llvm/test/CodeGen/X86/amx_fp16_intrinsics.ll | ||
2 | Maybe auto-generate it? | |
2 | Why is +avx512f needed? | |
2 | Why is -O0 needed? | |
6 | The comment seems meaningless. | |
10–11 | ditto here. | |
16–17 | ditto here. |
clang/test/CodeGen/amx_fp16_errors.c | ||
---|---|---|
2 ↗ | (On Diff #467683) | Add 32-bit test coverage to ensure the intrinsics aren't visible? |
clang/test/CodeGen/amx_fp16_errors.c | ||
---|---|---|
2 ↗ | (On Diff #467683) | That is better, but let me merge this test into X86/amx_errors.c first : ) |
llvm/lib/Target/X86/X86ISelLowering.cpp | ||
36990–36991 | Makes sense, but let's stay in sync with the code above? That style seems good for reducing the line count of this big file. All other comments will be addressed soon. Thanks a lot! |
llvm/lib/Target/X86/X86ISelLowering.cpp | ||
---|---|---|
36986 | This code may be merged into the int8 and bf16 cases: |
case X86::PTDPBSSD:
case X86::PTDPBSUD:
case X86::PTDPBUSD:
case X86::PTDPBUUD:
case X86::PTDPBF16PS:
case X86::PTDPFP16PS:
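The merge suggested in this comment relies on stacked case labels sharing one body. A minimal sketch of that pattern, assuming a hypothetical `Opcode` enum and a hypothetical `isTileDotProduct` helper that only stand in for the X86:: pseudo opcodes (this is not the actual X86ISelLowering.cpp code):

```cpp
// Hypothetical stand-ins for the X86::PTDP* pseudo opcodes named in the
// review comment; in LLVM the real values come from generated tables.
enum class Opcode {
  PTDPBSSD,
  PTDPBSUD,
  PTDPBUSD,
  PTDPBUUD,
  PTDPBF16PS,
  PTDPFP16PS,
  Other
};

// Returns true for any tile dot-product pseudo opcode.
// All six labels fall into the same body, so supporting the new
// FP16 opcode costs one extra case line rather than a new block.
constexpr bool isTileDotProduct(Opcode Opc) {
  switch (Opc) {
  case Opcode::PTDPBSSD:
  case Opcode::PTDPBSUD:
  case Opcode::PTDPBUSD:
  case Opcode::PTDPBUUD:
  case Opcode::PTDPBF16PS:
  case Opcode::PTDPFP16PS:
    return true;
  default:
    return false;
  }
}
```

Whether the shared body is correct for the FP16 case depends on the lowering being identical for all six opcodes, which is what the reviewer is proposing here.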
clang/test/Driver/x86-target-features.c | ||
---|---|---|
293 | It is not good for AMX to be tested with i386. We need to update the other AMX tests too. |
llvm/lib/Target/X86/X86ISelLowering.cpp | ||
---|---|---|
36986 | Good catch! |
llvm/test/MC/X86/x86-64AmxFP16-att.s | ||
---|---|---|
2 ↗ | (On Diff #468130) | There is a "test/MC/X86/AMX/" folder. Probably move the test case to that folder, or maybe merge it into test/MC/X86/AMX/x86-64-amx-bf16-att.s |
clang/docs/ReleaseNotes.rst | ||
---|---|---|
598 | Yes, that is clearer! | |
llvm/test/MC/Disassembler/X86/x86-64AmxTileFP16-att.txt | ||
4 ↗ | (On Diff #468130) | That made sense before, but we have begun using a tool to auto-generate the tests, and splitting them is easier for the tool. We have already done that for x86-64-avx512bf16-att/intel.txt, x86-64-avx512bf16vl-att/intel.txt, x86-64-avx512vp2intersectvl-att/intel.txt, and also the KEYLOCKER tests. |
llvm/test/MC/X86/x86-64AmxFP16-att.s | ||
2 ↗ | (On Diff #468130) | Let me put it into the "test/MC/X86/AMX/" folder and keep the tool-generated file. That is clearer. |
llvm/test/MC/X86/AMX/x86-64-amx-fp16-att.s | ||
---|---|---|
6 | Merge the att/intel tests into the same file and use --check-prefix to test them. |
llvm/test/MC/Disassembler/X86/x86-64AmxTileFP16-att.txt | ||
---|---|---|
4 ↗ | (On Diff #468130) | OK - I don't suppose you could add att/intel functionality to your tool? |
llvm/test/MC/X86/AMX/x86-64-amx-fp16-att.s | ||
---|---|---|
6 | Yes, that is what we did previously. I think the main benefit is that we can easily compare them for the same encoding (by putting them together), not that it reduces the number of files. |
llvm/test/MC/Disassembler/X86/x86-64AmxTileFP16-att.txt | ||
---|---|---|
4 ↗ | (On Diff #468130) | The tool was not developed by me; I just use it : ) |
llvm/test/MC/Disassembler/X86/x86-64AmxTileFP16-intel.txt | ||
---|---|---|
1 ↗ | (On Diff #468421) | Move the test case to test/MC/Disassembler/X86/AMX/? |
llvm/test/MC/Disassembler/X86/x86-64AmxTileFP16-att.txt | ||
---|---|---|
4 ↗ | (On Diff #468130) |
Let me sync with the tools' developer, thanks : ) |
llvm/test/MC/X86/AMX/x86-64-amx-fp16-att.s | ||
---|---|---|
6 | This is an encoding test, so it seems we are not able to merge them into one file as below? |
// RUN: llvm-mc -triple x86_64-unknown-unknown --show-encoding %s | FileCheck %s --check-prefix=ATT-CHECK
// RUN: llvm-mc -triple x86_64-unknown-unknown -x86-asm-syntax=intel -output-asm-variant=1 --show-encoding %s | FileCheck %s --check-prefix=INTEL-CHECK
// ATT-CHECK: tdpfp16ps %tmm5, %tmm4, %tmm3
// ATT-CHECK: encoding: [0xc4,0xe2,0x53,0x5c,0xdc]
tdpfp16ps %tmm5, %tmm4, %tmm3
// INTEL-CHECK: tdpfp16ps tmm3, tmm4, tmm5
// INTEL-CHECK: encoding: [0xc4,0xe2,0x53,0x5c,0xdc]
tdpfp16ps tmm3, tmm4, tmm5
llvm/test/MC/X86/AMX/x86-64-amx-fp16-att.s | ||
---|---|---|
6 | Yes, the tool just folds the disassembler tests. |
Yes, so the *.s files must be split into intel and att.
So let's do the same for the disassembler tests.
Let me first move them into the AMX directory.
clang/include/clang/Driver/Options.td | ||
---|---|---|
4531–4532 | I fixed it at rG3770d2b9cad9 |
Probably mention that this is for the _tile_dpfp16ps instruction?