64-bit operand folding did not work because these constants
are usually represented as a reg_sequence. The legality checks
were also partly missing and partly too restrictive. We can use
a 64-bit immediate if it can be represented as a sign-extended
32-bit integer.
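
As a rough illustration of the legality condition described above (not the code from this patch), the check amounts to asking whether a 64-bit value survives truncation to 32 bits followed by sign extension. The helper name `fitsSignExtended32` is hypothetical; inside LLVM, `llvm::isInt<32>` from `llvm/Support/MathExtras.h` performs the same range test.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, for illustration only: returns true if Imm can be
// encoded as a 32-bit literal that is sign-extended back to 64 bits.
constexpr bool fitsSignExtended32(int64_t Imm) {
  return Imm >= INT32_MIN && Imm <= INT32_MAX;
}

// Small negative and positive values fit.
static_assert(fitsSignExtended32(-1), "-1 sign-extends from 0xFFFFFFFF");
static_assert(fitsSignExtended32(INT32_MAX), "largest positive literal");

// 0x00000000FFFFFFFF does not fit: sign-extending its low half gives -1.
static_assert(!fitsSignExtended32(0x00000000FFFFFFFFLL),
              "needs zero extension, not sign extension");

// Runtime spot check of the boundary just below the representable range.
inline void checkSignExtendedBoundary() {
  assert(!fitsSignExtended32(static_cast<int64_t>(INT32_MIN) - 1));
}
```

Note that this only models the sign-extension assumption stated in the summary; whether every instruction actually extends the literal this way is the open question in the review below.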
Event Timeline
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp | ||
---|---|---|
2954–2958 | I think this is more complicated and depends on the context instruction, which is why this was never done. I think some instructions zero-extend the 32-bit constants (including FP), and then maybe some sign-extend. |
I would also expect the immediate selection to understand the expanded set of immediates before the folding.
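
To make the concern concrete, here is a minimal sketch of how the same 32-bit literal expands to different 64-bit operands under the two extension rules. The function names are hypothetical, and the sketch makes no claim about which rule a given instruction applies.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a literal that is sign-extended to 64 bits.
constexpr uint64_t signExtendLiteral(uint32_t Lit) {
  return (Lit & 0x80000000u) ? (0xFFFFFFFF00000000ull | Lit)
                             : static_cast<uint64_t>(Lit);
}

// Hypothetical model of a literal that is zero-extended to 64 bits.
constexpr uint64_t zeroExtendLiteral(uint32_t Lit) {
  return static_cast<uint64_t>(Lit);
}

// With the top bit clear, both rules agree, so folding is safe either way.
static_assert(signExtendLiteral(0x7FFFFFFFu) == zeroExtendLiteral(0x7FFFFFFFu),
              "no difference when bit 31 is 0");

// With the top bit set, the results differ in the high 32 bits.
static_assert(signExtendLiteral(0x80000000u) == 0xFFFFFFFF80000000ull,
              "sign extension fills the high half with ones");
static_assert(zeroExtendLiteral(0x80000000u) == 0x0000000080000000ull,
              "zero extension leaves the high half clear");

// Runtime spot check for the all-ones literal.
inline void checkAllOnesLiteral() {
  assert(signExtendLiteral(0xFFFFFFFFu) != zeroExtendLiteral(0xFFFFFFFFu));
}
```

Only literals with bit 31 set are affected, so a fold that assumes sign extension is wrong exactly for those values on any instruction that zero-extends.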
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp | ||
---|---|---|
2954–2958 | Do you have an example? As far as I understand, the HW logic is quite primitive and doesn't distinguish, so the constant is always sign-extended. At least this has passed PSDB and was used specifically in the FP context. |
That's a separate optimization. I can also see how this folding could be converted into an s_mov_b64 in a limited set of constant cases, but that is again a separate optimization.
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp | ||
---|---|---|
2954–2958 | You would have to craft new execution tests for FP constants. It's unlikely that anything is really testing FP constants in a way that would stress this. |