Conversion from a fixed-point number to a floating-point number is done by multiplying the fixed-point number by 2^(-n) where n is the number of fractional bits. Currently this is lowered to a vcvt (integer to floating-point) then a vmul, but it can instead be lowered directly to a vcvt (fixed-point to floating-point). This patch enables such transformations as long as the multiplication factor is a power of 2.
Details
Diff Detail
Event Timeline
Nice one. Sounds funky.
llvm/lib/Target/ARM/ARMISelDAGToDAG.cpp | ||
---|---|---|
3691 | This is pretty chunky. I think it's worth moving to it's own function, like the other try* methods here. | |
3706 | I think can be "IsUnsigned" too. The values out of range are 65520+, so can only happen under unsigned convertions. | |
3709 | Bitcast can go wrong under BE if it's changing the type size. It can move the order of elements in the vector around. Is this only interested in cases where the element sizes are the same, and if so can it add a check? | |
3716 | "valid" would be "Valid", but this may not be needed if this was a separate function. It can just return false out of the default then. | |
3720 | If this is a vdup, it won't always have a constant operand. | |
3726 | I think we ruled out 64bit floats above. | |
3752 | Should this be in some range too? Not negative? Do you have a few tests for cases like that? | |
3761–3762 | Perhaps add a default: llvm_unreachable(..); and then Opcode doesn't need a default value (but maybe 0 would be fine too?) | |
llvm/test/CodeGen/ARM/arm_q15_to_float_autovec.ll | ||
2 ↗ | (On Diff #350619) | Can you move the test to CodeGen/Thumb2/mve-<something>.ll, with all the others. Maybe mve-vcvt-fixed.ll? Also it shouldn't need -O3 on the llc line. |
4 ↗ | (On Diff #350619) | Remove dso_local and the local_unnamed_addr from all these tests, and add arm_aapcs_vfpcc. That should clean up a lot of the soft-float vmovs. |
434 ↗ | (On Diff #350619) | This kind of looks negative ;) |
llvm/lib/Target/ARM/ARMISelDAGToDAG.cpp | ||
---|---|---|
3691 | Good idea. Does tryFMUL seem like a good name? | |
3706 | 👍 | |
3709 | It is indeed. I'll add a check to make sure that the element sizes are the same. | |
3720 | 👍 | |
3726 | Indeed we did | |
3752 | I think you're right that there should be a range check, but I think it should come below where we compute FracBits as that must be less or equal to ScalarBits. Regarding it being negative, is that possible if it is a power of 2? | |
3761–3762 | Good idea | |
llvm/test/CodeGen/ARM/arm_q15_to_float_autovec.ll | ||
2 ↗ | (On Diff #350619) | Ah yes! I totally forgot to rename the file. |
4 ↗ | (On Diff #350619) | 👍 |
Move to a function, add IsUnsigned check to fp inf, check bitcast element size, remove IEEEdouble(), make sure FracBits <= ScalarBits, add llvm_unreachable, move test file and use arm_aapcs_vfpcc.
llvm/test/CodeGen/ARM/arm_q15_to_float_autovec.ll | ||
---|---|---|
434 ↗ | (On Diff #350619) | It does! I have added a check for fracbits == scalarbits - 1 when not unsigned. |
llvm/lib/Target/ARM/ARMISelDAGToDAG.cpp | ||
---|---|---|
199 | Perhaps tryFMulFixed, to explain a little more what the function tries to match? | |
3192 | Can use isa, as opposed to dyn_cast. I'm not sure what the TODO is related to? | |
llvm/test/CodeGen/Thumb2/mve-vcvt-fixed.ll | ||
304 | Can you add a testcase that uses 0x3E00000000000000 as the constant too. Plus another one that perhaps uses 0xBF00000000000000. That should cover a few negative power of 2 cases. |
llvm/lib/Target/ARM/ARMISelDAGToDAG.cpp | ||
---|---|---|
3192 | I think we are only interested in constants. We have tests for all the cases, so I'm not sure there will be cases without the operand to the vdup being a constant. And if they weren't constants, it sounds difficult to try and do anything with them here. i.e, I think you can remove the TODO :) | |
llvm/test/CodeGen/Thumb2/mve-vcvt-fixed.ll | ||
772 | This one shouldn't be converted, if the 0xBE00000000000000 is negative. With a constant of 0x3E00000000000000 (which is really 0x30000000 as a float not a double, which is apparently 4.65661287308e-10, which is 1.0f/0x80000000) it should be converted, as far as I understand. | |
986 | I think this one maybe can be converted? But the last one with a FracBits of 31 might be a difficult to prove, if the math becomes inexact. |
clang-tidy: warning: invalid case style for parameter 'dl' [readability-identifier-naming]
not useful