Add support for roundeven and implement appropriate tests.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/test/CodeGen/AMDGPU/roundeven.ll | ||
---|---|---|
17 | Does v_trunc_f32 ever raise FP exceptions? |
llvm/test/CodeGen/AMDGPU/roundeven.ll | ||
---|---|---|
17 | I assume we have tests for that in a different file, but if not I could make a new task to add them. |
Tests should be unified then
llvm/test/CodeGen/AMDGPU/roundeven.ll | ||
---|---|---|
17 | That's not testable here, it's an isa behavior question |
@arsenm how's this? The tests are unified, but it seems a little misleading to have ISel and GlobalISel tests in the same directory.
Move it out of GlobalISel/. There are plenty of unified sdag/gisel tests in the parent directory already.
llvm/test/CodeGen/AMDGPU/GlobalISel/roundeven.ll | ||
---|---|---|
1 | SDAG- is commonly used for this. ISEL_ is a bit vague. |
The warning is from clang-format. It wants to make 86 changes over hundreds of lines unrelated to this patch.
No, it only wants correct formatting for the lines you actually changed in your patch. If you "install" git clang-format, by making sure llvm-project/clang/tools/clang-format/git-clang-format is on your $PATH, then you can run: git clang-format @^. This is what the Harbormaster build does.
llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp | ||
---|---|---|
1195 | I'm perpetually confused by the variety of rounding functions. llvm.rint => round to nearest integer, in current rounding mode (which is assumed to be round nearest even), which is implied by being non-constrained llvm.round -> round, away from 0 so I think this isn't the same as round, but is supposed to be the same as llvm.rint (which we have 3 different names for) |
llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp | ||
---|---|---|
1195 | I thought llvm.roundeven is supposed to always round to nearest even regardless of the rounding mode. Does our implementation of llvm.rint ignore the rounding mode? |
llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp | ||
---|---|---|
1195 | Yes. This is all legacy cruft from when people were imaging possible solutions to supporting strictfp in the distant future without an actual design in mind. The non-strict, regular floating point intrinsics all assume RTE rounding with no fp exceptions. This cruft is bothering me; as a follow up, can you prepare a patch to deprecate the old intrinsics and auto-upgrade them so we're left with one? |
llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp | ||
---|---|---|
1195 | Can do. I agree it's needlessly confusing. |
llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp | ||
---|---|---|
1195 | For targets that don't optimize them to inline sequences, wouldn't that cause calls library calls to mutate into a different library call? Functionally it would be correct in the default environment, but might be surprising. Depending on which one you choose as canonical it could cause link errors. I don't think you could choose roundeven as canonical since its not in older libm. |
llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp | ||
---|---|---|
1195 | Part of the problem is thinking the intrinsics have anything to do with libm. You can lower the intrinsic to a different name for the libcall |
llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp | ||
---|---|---|
1195 | I didn't say we couldn't. Merely saying that someone will find that surprising and probably file a bug. |
FRINT only handles f64 so I'm updating the patch to handle the remaining scalar types.
llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp | ||
---|---|---|
309–312 | Commented out vector types? The vectors should be and have defaulted to expand |
llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp | ||
---|---|---|
309–312 | Apologies these were debug comments. I've removed them now. |
llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp | ||
---|---|---|
2194–2196 | Subtarget is already available as a member here, don't need to go through the function | |
2198–2204 | You don't need a generation check here, you can just lower to whatever opcode you choose to consolidate on and let the handling of that one take care of the subtarget specific legality considerations which should already work |
llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp | ||
---|---|---|
2198–2204 | LowerFRINT handles lowering when f64 isn't supported. Based on GlobalISel's implementation that's any generation before gfx7. |
llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp | ||
---|---|---|
2198–2204 | You do not need to directly call LowerFRINT. You can just unconditionally produce the frint, and let that be legalized. What you have here is repeating the legality condition in a second place |
llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp | ||
---|---|---|
2198–2204 | I tried that and it crashed. |
llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp | ||
---|---|---|
2198–2204 | Define "crashed". Something else is wrong, you can rely on re-legalization of new nodes |
llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp | ||
---|---|---|
2198–2204 | I didn't save the log unfortunately. I'll see if I can reproduce it. |
llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp | ||
---|---|---|
2198–2204 | You were right. I must have done something wrong the first time. |
llvm/test/CodeGen/AMDGPU/roundeven.ll | ||
---|---|---|
479 | It looks like arguments are converted from f32->f16->f32. Is that correct/efficient? |
llvm/test/CodeGen/AMDGPU/roundeven.ll | ||
---|---|---|
479 | This is the broken ABI the DAG wants to give targets without legal f16. It’s a problem and ends up with different behavior for GlobalISel |
llvm/test/CodeGen/AMDGPU/roundeven.ll | ||
---|---|---|
479 | Is there a workaround? |
llvm/test/CodeGen/AMDGPU/roundeven.ll | ||
---|---|---|
479 | use an i16 argument and bitcast to half in the IR. Should also figure out how to fix the DAG from promoting to float |
Commented out vector types? The vectors should be and have defaulted to expand