The 'A' constraint requires an integer or a floating-point inline constant.
Details
Diff Detail
Event Timeline
llvm/docs/LangRef.rst | ||
---|---|---|
4131 | Probably should. But GCC actually supports about 30 constraints which we do not. I'm not sure how easy it will be to add fp constants because llvm's C_Immediate is for integers only. I'll take a look. |
llvm/docs/LangRef.rst | ||
---|---|---|
4131 | The type doesn't matter. The bit values should still work as integers. It doesn't need to be FP |
llvm/docs/LangRef.rst | ||
---|---|---|
4131 | Ok, thanks! |
Added support of fp inline constants. Clang part and documentation will be updated by a separate change.
llvm/lib/Target/AMDGPU/SIISelLowering.cpp | ||
---|---|---|
10893 | Should we enforce this limitation? GCC seems to allow arbitrary vector size and type provided that all elements are equal and may be inlined. |
llvm/lib/Target/AMDGPU/SIISelLowering.cpp | ||
---|---|---|
10893 | I was wrong about GCC 'vectors'. Looks like they are not regular types and used only for auto-vectorization. Here is the GCC implementation: (define_constraint "A" "Inline immediate parameter" (and (match_code "const_int,const_double,const_vector") (match_test "gcn_inline_constant_p (op)"))) ... bool gcn_inline_constant_p (rtx x) { if (GET_CODE (x) == CONST_INT) return INTVAL (x) >= -16 && INTVAL (x) <= 64; if (GET_CODE (x) == CONST_DOUBLE) return gcn_inline_fp_constant_p (x, false); if (GET_CODE (x) == CONST_VECTOR) { int n; if (!vgpr_vector_mode_p (GET_MODE (x))) return false; n = gcn_inline_constant_p (CONST_VECTOR_ELT (x, 0)); if (!n) return false; for (int i = 1; i < 64; i++) if (CONST_VECTOR_ELT (x, i) != CONST_VECTOR_ELT (x, 0)) return false; return 1; } return false; } ... inline bool vgpr_vector_mode_p (machine_mode mode) { return (mode == V64QImode || mode == V64HImode || mode == V64SImode || mode == V64DImode || mode == V64HFmode || mode == V64SFmode || mode == V64DFmode); } | |
llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp | ||
1464–1466 | When working on this fix I was under an impression that types of values passed to inline assembler are not required to match instruction operand types. In other words, I assumed that the following code is legal: "v_mov_b32 $0, $1", "=v,A"(half 0xH3C00) // 1.0h Also I assumed that this code required a type conversion to end up as "v_mov_b32 ..., 1.0" and not "v_mov_b32 ..., 0x3C00" because in the latter case the constant cannot be encoded as inline value. That assumption complicated implementation. Was the assumption incorrect? If types of values passed to inline assembler are required to match types of corresponding assembler operands, the implementation would be much simpler. | |
1469–1478 | See my comments above. |
llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp | ||
---|---|---|
1464–1466 | We have to use 32-bit instruction to materialize 16-bit values. This should be zero extended. This should end up as v_mov_b32 $0, 0x00003c00. For some cases we don't currently use in codegen, (and I'm not sure are supported by the assembler), some 64-bit operands accept 32-bit literals which are zero extended |
I guess the fundamental problem for the 16-bit immediate case is you can't know the use instruction's context. You're really accepting something that's not an inline immediate in the use context, but you have no way of knowing that.
llvm/lib/Target/AMDGPU/SIISelLowering.cpp | ||
---|---|---|
10899 | This should be getZExtValue? It looks like the tests worked correctly here, but not sure why it didn't matter |
Actually, could we emit an error in the MC layer if it turns out to not be used in a 16-bit context? That would make more sense, but we don't have a way of tracking the immediate size
I guess the fundamental problem for the 16-bit immediate case is you can't know the use instruction's context. You're really accepting something that's not an inline immediate in the use context, but you have no way of knowing that.
I believe this is a generic problem which affects all supported types because of unknown context.
In the previous version of this change I tried to alleviate this problem by emitting all fp constants as double so they would work in any context.
As you have pointed out, this is unsuitable in some cases where the constant type is importand but for these cases we could have used a 'B' constraint:
"v_mov_b32 $0, $1", "=v,B"(half 0xH3C00)
However I think the corrent version of this change is simpler and cleaner and it has its logic.
We expect constant type to match assembler operand type, but in the end it is up to programmer to decide how to use the constant in assembly code.
Actually, could we emit an error in the MC layer if it turns out to not be used in a 16-bit context? That would make more sense, but we don't have a way of tracking the immediate size
Agree, I do not see how this can be done.
llvm/lib/Target/AMDGPU/SIISelLowering.cpp | ||
---|---|---|
10899 | This does not really matter because isInlinableLiteralXX truncates input value to XX bits before checking. Consider this code (actually a corner case): ... asm "v_mov_b32 $0, $1", "=v,A"(half bitcast (i16 -1 to half)) Currently the code ends up as v_mov_b32 v0, -1 If we replace getSExtValue with getZExtValue, the result would be v_mov_b32 v0, 0xFFFF |
Shouldn't this be any inline immediate, which includes the FP values (plus the subtarget dependent one)