This is an archive of the discontinued LLVM Phabricator instance.


Differential D107963

[OpenCL] Fix as_type(vec3) invalid store creation
ClosedPublic

Authored by svenvh on Aug 12 2021, 7:33 AM.

Details

Reviewers

Anastasia
jaykang10

Commits

rG7bda1a0711c6: [OpenCL] Fix as_type(vec3) invalid store creation

Summary

With -fpreserve-vec3-type enabled, a cast was not created when
converting from a vec3 type to a non-vec3 type, even though a
conversion to vec4 was performed. This resulted in creation of
invalid store instructions for the included test cases.
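
The failure mode described in the summary can be illustrated with a toy model. This is only a sketch and not clang's code: `Ty`, `lowerFromVec3`, and `storeIsValid` are hypothetical names invented for illustration, and a type is reduced to a lane count plus a lane width. With the guard in place and `-fpreserve-vec3-type` enabled, the value is widened to four lanes but never cast to the destination type, so the value type and the destination type disagree.

```cpp
#include <cassert>

// Hypothetical stand-in for an IR value type: lane count and bits per lane.
struct Ty {
  unsigned lanes;
  unsigned laneBits;
};

constexpr bool sameType(Ty a, Ty b) {
  return a.lanes == b.lanes && a.laneBits == b.laneBits;
}

// Toy model of the vec3 -> non-vec3 branch of as_type lowering.
// guarded == true models the pre-patch code, where the cast was skipped
// whenever the preserve-vec3 option was enabled.
constexpr Ty lowerFromVec3(Ty src, Ty dst, bool preserveVec3, bool guarded) {
  return (guarded && preserveVec3)
             ? Ty{4, src.laneBits} // only widened to vec4, never cast to dst
             : dst;                // widened, then cast to the destination type
}

// In this model a store is well-formed only if the value type matches
// the destination type.
constexpr bool storeIsValid(Ty value, Ty dst) { return sameType(value, dst); }

constexpr Ty kFloat3{3, 32};
constexpr Ty kDouble2{2, 64};

static_assert(!storeIsValid(lowerFromVec3(kFloat3, kDouble2, true, true), kDouble2),
              "guarded + preserve-vec3: value stays 4 x 32 bits, mismatch");
static_assert(storeIsValid(lowerFromVec3(kFloat3, kDouble2, true, false), kDouble2),
              "unguarded: value is cast to the destination type");
```

The model ignores any later optimization of the emitted IR; it only captures whether the cast step runs.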

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

svenvh created this revision. Aug 12 2021, 7:33 AM

Herald added subscribers: ldrumm, yaxunl. Aug 12 2021, 7:33 AM

svenvh requested review of this revision. Aug 12 2021, 7:33 AM

Herald added a subscriber: cfe-commits. Aug 12 2021, 7:33 AM

Harbormaster completed remote builds in B119257: Diff 365993. Aug 12 2021, 8:13 AM

Anastasia accepted this revision. Aug 17 2021, 5:25 AM

Anastasia added inline comments.

clang/lib/CodeGen/CGExprScalar.cpp
4789	While I agree with this fix and it obviously looks incorrect, I wonder if the original intent was to condition the previous statement instead so that we avoid converting to size 4 at all? Although I have a feeling we are entering the behavior that is not documented anywhere. In the spec I can see this:

> When the operand and result type contain a different number of elements, the result shall be implementation-defined except if the operand is a 4-component vector and the result is a 3-component vector. In this case, the bits in the operand shall be returned directly without modification as the new type.

but it seems to cover the inverse conversion?
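
The 4-component to 3-component rule quoted from the spec can be sketched in plain C++. This is an illustration under stated assumptions, not OpenCL or clang code: `Float4`, `Float3`, and `asFloat3` are hypothetical names, and the 3-component vector is modelled with four lanes of storage (the fourth being padding), matching the OpenCL rule that a 3-component vector has the size of a 4-component one.

```cpp
#include <array>
#include <cassert>
#include <cstring>

// Both vector kinds are modelled as four float lanes of storage.
using Float4 = std::array<float, 4>;
struct Float3 {
  std::array<float, 4> lanes; // lanes[3] is padding in this model
};

static_assert(sizeof(Float3) == sizeof(Float4), "same storage size");

// Models as_type(float4 -> float3): the bits are returned unchanged.
inline Float3 asFloat3(const Float4 &v) {
  Float3 out;
  std::memcpy(out.lanes.data(), v.data(), sizeof(v));
  return out;
}
```

For example, passing `{1.0f, 2.0f, 3.0f, 4.0f}` yields a `Float3` whose first three lanes are `1.0f`, `2.0f`, `3.0f`.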

This revision is now accepted and ready to land. Aug 17 2021, 5:25 AM

svenvh added inline comments. Aug 17 2021, 6:12 AM

clang/lib/CodeGen/CGExprScalar.cpp
4789	Yeah I have a similar fix for the inverse case (which is further down in this function) in my local branch. I did try to extend the guard to also cover the `ConvertVec3AndVec4` call, but that also led to invalid StoreInst creation. Since I wasn't sure about the intent of the conditioning on `PreserveVec3Type` here, I didn't investigate further. I was hoping @jaykang10 (who added this in D30810) might have some insight into why the guard was here in the first place. But it has been over 4 years since that was committed, so there might not be a ready answer. Either way, I'll hold off committing this for a few more days.

jaykang10 added inline comments. Aug 17 2021, 8:24 AM

clang/lib/CodeGen/CGExprScalar.cpp
4789	I am sorry for the late response. I have not been feeling well. As far as I remember, the goal was to avoid bitcast and keep load or store with vec3 type on IR level. I guess I did not consider the conversion from vec3 type to scalar type and vice versa. I guess this guard was there to avoid the bitcast. It could be wrong for scalar type. If you check the scalar type in the guard, it could be good to keep existing behavior for vector type. Additionally, you may also want to change the code below for the conversion from non-vec3 to vec3.

svenvh added inline comments. Aug 17 2021, 9:50 AM

clang/lib/CodeGen/CGExprScalar.cpp
4789	No worries, thanks for replying!

> the goal was to avoid bitcast and keep load or store with vec3 type on IR level.

I think that is already achieved by the changes in CGExpr.cpp from your previous commit. But here in CGExprScalar.cpp we are handling the case where we have to convert away to non-vec3 (because `NumElementsDst != 3`) and we do this conversion unconditionally already. I don't see why we would not want to emit the bitcast because it is needed for correctness.

> It could be wrong for scalar type.

The problem that my patch fixes is not limited to scalar types: it also occurs for e.g. `float3` to `double2`. Perhaps I should add that test case too?

> If you check the scalar type in the guard, it could be good to keep existing behavior for vector type.

My patch does not make a difference to any of the pre-existing tests in `preserve_vec3.cl`. Do you have a specific case that is not covered by the test, but for which you want to preserve the behavior?
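
The reason the widening to four lanes comes before the cast can be checked with simple bit-width arithmetic. The following is a sketch only: `totalBits` and `canBitcast` are hypothetical helpers, not LLVM APIs, and scalars are modelled as one-lane vectors. It assumes the OpenCL sizes of `int` (32 bits) and `long` (64 bits).

```cpp
#include <cassert>

// Hypothetical helper: total bits of `lanes` elements of `laneBits` each.
constexpr unsigned totalBits(unsigned lanes, unsigned laneBits) {
  return lanes * laneBits;
}

// A plain bitcast needs source and destination of equal bit width.
constexpr bool canBitcast(unsigned srcBits, unsigned dstBits) {
  return srcBits == dstBits;
}

// float3 as three lanes (96 bits) does not match double2 (128 bits)...
static_assert(!canBitcast(totalBits(3, 32), totalBits(2, 64)), "");
// ...but after widening to four lanes (128 bits) the sizes agree.
static_assert(canBitcast(totalBits(4, 32), totalBits(2, 64)), "");
// The same holds for char3 -> int and short3 -> long.
static_assert(canBitcast(totalBits(4, 8), totalBits(1, 32)), "");
static_assert(canBitcast(totalBits(4, 16), totalBits(1, 64)), "");
```

These three pairs correspond to the `float3_to_double2`, `from_char3`, and `from_short3` cases in the test file.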

jaykang10 added inline comments. Aug 17 2021, 11:30 AM

clang/lib/CodeGen/CGExprScalar.cpp
4789	> I think that is already achieved by the changes in CGExpr.cpp from your previous commit. But here in CGExprScalar.cpp we are handling the case where we have to convert away to non-vec3 (because NumElementsDst != 3) and we do this conversion unconditionally already. I don't see why we would not want to emit the bitcast because it is needed for correctness.

I agree with you. I vaguely remember it was for a transformation pass in my previous project. For correctness, please feel free to remove the guard.

> The problem that my patch fixes is not limited to scalar types: it also occurs for e.g. float3 to double2. Perhaps I should add that test case too?

Yep, if you add more test cases, it will be great.

> My patch does not make a difference to any of the pre-existing tests in preserve_vec3.cl. Do you have a specific case that is not covered by the test, but for which you want to preserve the behavior?

I cannot remember exactly what my previous patch was aiming for. If someone raises issues with removing this guard later, I think we can discuss it again.

svenvh added inline comments. Aug 18 2021, 1:27 AM

clang/lib/CodeGen/CGExprScalar.cpp
4789	Thanks! I will land the patch soon then.

> Yep, if you add more test cases, it will be great.

I'll add the float3 to double2 test case as part of my commit.

Closed by commit rG7bda1a0711c6: [OpenCL] Fix as_type(vec3) invalid store creation (authored by svenvh). Aug 19 2021, 3:57 AM

This revision was automatically updated to reflect the committed changes.

svenvh added a commit: rG7bda1a0711c6: [OpenCL] Fix as_type(vec3) invalid store creation.

svenvh mentioned this in D108470: [OpenCL] Fix as_type3 invalid store creation. Aug 20 2021, 9:13 AM

Revision Contents

Path	Size
clang/lib/CodeGen/CGExprScalar.cpp	7 lines
clang/test/CodeGenOpenCL/preserve_vec3.cl	28 lines

Diff 367448

clang/lib/CodeGen/CGExprScalar.cpp

   unsigned NumElementsDst =
       isa<llvm::VectorType>(DstTy)
           ? cast<llvm::FixedVectorType>(DstTy)->getNumElements()
           : 0;

   // Going from vec3 to non-vec3 is a special case and requires a shuffle
   // vector to get a vec4, then a bitcast if the target type is different.
   if (NumElementsSrc == 3 && NumElementsDst != 3) {
     Src = ConvertVec3AndVec4(Builder, CGF, Src, 4);
-
-    if (!CGF.CGM.getCodeGenOpts().PreserveVec3Type) {
-      Src = createCastsForTypeOfSameSize(Builder, CGF.CGM.getDataLayout(), Src,
-                                         DstTy);
-    }
+    Src = createCastsForTypeOfSameSize(Builder, CGF.CGM.getDataLayout(), Src,
+                                       DstTy);

     Src->setName("astype");
     return Src;
   }

   // Going from non-vec3 to vec3 is a special case and requires a bitcast
   // to vec4 if the original type is not vec4, then a shuffle vector to
   // get a vec3.

clang/test/CodeGenOpenCL/preserve_vec3.cl

 // RUN: %clang_cc1 %s -emit-llvm -o - -triple spir-unknown-unknown -fpreserve-vec3-type | FileCheck %s

+typedef char char3 __attribute__((ext_vector_type(3)));
+typedef short short3 __attribute__((ext_vector_type(3)));
+typedef double double2 __attribute__((ext_vector_type(2)));
 typedef float float3 __attribute__((ext_vector_type(3)));
 typedef float float4 __attribute__((ext_vector_type(4)));

 void kernel foo(global float3 *a, global float3 *b) {
   // CHECK-LABEL: spir_kernel void @foo
   // CHECK: %[[LOAD_A:.*]] = load <3 x float>, <3 x float> addrspace(1)* %a
   // CHECK: store <3 x float> %[[LOAD_A]], <3 x float> addrspace(1)* %b
   *b = *a;
 [9 unchanged lines not shown]

 void kernel float3_to_float4(global float3 *a, global float4 *b) {
   // CHECK-LABEL: spir_kernel void @float3_to_float4
   // CHECK: %[[LOAD_A:.*]] = load <3 x float>, <3 x float> addrspace(1)* %a, align 16
   // CHECK: %[[ASTYPE:.*]] = shufflevector <3 x float> %[[LOAD_A]], <3 x float> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 undef>
   // CHECK: store <4 x float> %[[ASTYPE]], <4 x float> addrspace(1)* %b, align 16
   *b = __builtin_astype(*a, float4);
 }
+
+void kernel float3_to_double2(global float3 *a, global double2 *b) {
+  // CHECK-LABEL: spir_kernel void @float3_to_double2
+  // CHECK: %[[LOAD_A:.*]] = load <3 x float>, <3 x float> addrspace(1)* %a, align 16
+  // CHECK: %[[ASTYPE:.*]] = shufflevector <3 x float> %[[LOAD_A]], <3 x float> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 undef>
+  // CHECK: %[[OUT_BC:.*]] = bitcast <2 x double> addrspace(1)* %b to <4 x float> addrspace(1)*
+  // CHECK: store <4 x float> %[[ASTYPE]], <4 x float> addrspace(1)* %[[OUT_BC]], align 16
+  *b = __builtin_astype(*a, double2);
+}
+
+void from_char3(char3 a, global int *out) {
+  // CHECK-LABEL: void @from_char3
+  // CHECK: %[[ASTYPE:.*]] = shufflevector <3 x i8> %a, <3 x i8> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 undef>
+  // CHECK: %[[OUT_BC:.*]] = bitcast i32 addrspace(1)* %out to <4 x i8> addrspace(1)*
+  // CHECK: store <4 x i8> %[[ASTYPE]], <4 x i8> addrspace(1)* %[[OUT_BC]]
+  *out = __builtin_astype(a, int);
+}
+
+void from_short3(short3 a, global long *out) {
+  // CHECK-LABEL: void @from_short3
+  // CHECK: %[[ASTYPE:.*]] = shufflevector <3 x i16> %a, <3 x i16> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 undef>
+  // CHECK: %[[OUT_BC:.*]] = bitcast i64 addrspace(1)* %out to <4 x i16> addrspace(1)*
+  // CHECK: store <4 x i16> %[[ASTYPE]], <4 x i16> addrspace(1)* %[[OUT_BC]]
+  *out = __builtin_astype(a, long);
+}