__builtin_astype does not generate correct LLVM IR for vec3 types. This patch inserts bitcasts to/from vec4 when necessary in addition to generating vector shuffle. A codegen test is added.
Details
Diff Detail
Event Timeline
LGTM.
test/CodeGenOpenCL/as_type.cl | ||
---|---|---|
7 | Could you also one test case that requires bitcast from vector type to 4-element vector type, please? |
lib/CodeGen/CGExprScalar.cpp | ||
---|---|---|
3403 | numElementsDst -> NumElementsDst Also I would expect formatter to align to the parameter list area... | |
3420 | So this code no longer applies to just vectors? | |
3422 | numElementsSrc -> NumElementsSrc The same below! | |
3427 | Should we check numElementsDst == 4 (the same above)? | |
3428 | I am not sure why is this chosen to be this way? If I check the OpenCL spec for type reinterpreting it doesn't seem to require shuffle vector. Also s6.2.4.2 says: "It is an error to use as_type() or as_typen() operator to reinterpret data to a type of a different number of bytes." The only valid conversion according to the spec seems to be vec4->vec3: "When the operand and result type contain a different number of elements, the result shall be implementation-defined except if the operand is a 4-component vector and the result is a 3-component vector." This change is affecting non-OpenCL code too. Is this reasonable approach for other vector implementations? | |
3429 | I think we only need bitcast if type don't match? | |
test/CodeGenOpenCL/as_type.cl | ||
7 | Should this be disallowed by the frontend -? according to the spec s6.2.4.2: "It is an error to use as_type() or as_typen() operator to reinterpret data to a type of a different number of bytes." |
lib/CodeGen/CGExprScalar.cpp | ||
---|---|---|
3420 | Right. The issue happens when converting vec3 type to/from non-vector types. | |
3427 | No. Special handling is needed not just for conversion between vec3/vec4 types, but also between vec3 and non-vector types. | |
3428 | This change only affects OpenCL code since __builtin_astype is defined as a keyword for OpenCL only. According to OpenCL spec v1.1 s6.1.5, vec3 type has the same size as vec4 type, so it is allowed to be converted to any other types which have the same size as vec4 type. | |
3429 | Builder.CreateBitCast automatically handle this. | |
test/CodeGenOpenCL/as_type.cl | ||
7 | According to the spec v1.1 s6.1.5, vec3 type has the same size as vec4 type, so it is allowed to be converted to other types which have the same size as vec4 type. |
lib/CodeGen/CGExprScalar.cpp | ||
---|---|---|
3394 | should this be 3 unstead of undef? | |
3428 | I see. Not related to your change, but I was just wondering if it would be better to change this to a Clang builtin with a custom check at some point. It would be easier to understand and we can avoid all this parsing/AST handling complications. | |
test/CodeGenOpenCL/as_type.cl | ||
9 | So what happens if the number of bytes don't match? | |
47 | Would it make sense to check that shufflevector is not generated? | |
54 | Could we add CHECK-NOT bitcast here? |
Add a sema test for mismatched type size. Fix codegen test.
lib/CodeGen/CGExprScalar.cpp | ||
---|---|---|
3394 | 3 or undef are both OK here since the 4th component is undefined. undef may be better since if optimizer checks mask first, it knows that the 4th component is undefined without further checking the second operand. | |
test/CodeGenOpenCL/as_type.cl | ||
9 | There will be compilation error and diag msg. I will add a sema test. | |
47 | will fix | |
54 | will fix |
numElementsDst -> NumElementsDst
Also I would expect formatter to align to the parameter list area...