This is an archive of the discontinued LLVM Phabricator instance.

[OpenCL] Fix __builtin_astype for vec3 types.
ClosedPublic

Authored by yaxunl on May 10 2016, 2:00 PM.

Details

Summary

__builtin_astype does not generate correct LLVM IR for vec3 types. This patch inserts bitcasts to/from vec4 when necessary in addition to generating vector shuffle. A codegen test is added.

Diff Detail

Repository
rL LLVM

Event Timeline

yaxunl updated this revision to Diff 56809.May 10 2016, 2:00 PM
yaxunl retitled this revision from to [OpenCL] Fix __builtin_astype for vec3 types..
yaxunl updated this object.
yaxunl added reviewers: Anastasia, pxli168, bader.
yaxunl added subscribers: cfe-commits, tstellarAMD.
bader accepted this revision.May 12 2016, 12:34 PM
bader edited edge metadata.

LGTM.

test/CodeGenOpenCL/as_type.cl
6 ↗(On Diff #56809)

Could you also one test case that requires bitcast from vector type to 4-element vector type, please?
E.g. short2 <-> char3 or char16 <-> int3.

This revision is now accepted and ready to land.May 12 2016, 12:34 PM
yaxunl updated this revision to Diff 57392.May 16 2016, 1:37 PM
yaxunl edited edge metadata.

Add a test for casting char16 to i3 as Alexey suggested.

yaxunl updated this revision to Diff 57393.May 16 2016, 1:40 PM

Update test.

Anastasia added inline comments.May 23 2016, 10:06 AM
lib/CodeGen/CGExprScalar.cpp
3403 ↗(On Diff #57393)

numElementsDst -> NumElementsDst

Also I would expect formatter to align to the parameter list area...

3420 ↗(On Diff #57393)

So this code no longer applies to just vectors?

3422 ↗(On Diff #57393)

numElementsSrc -> NumElementsSrc

The same below!

3427 ↗(On Diff #57393)

Should we check numElementsDst == 4 (the same above)?

3428 ↗(On Diff #57393)

I am not sure why is this chosen to be this way?

If I check the OpenCL spec for type reinterpreting it doesn't seem to require shuffle vector. Also s6.2.4.2 says:

"It is an error to use as_type() or as_typen() operator to reinterpret data to a type of a different number of bytes."

The only valid conversion according to the spec seems to be vec4->vec3:

"When the operand and result type contain a different number of elements, the result shall be implementation-defined except if the operand is a 4-component vector and the result is a 3-component vector."

This change is affecting non-OpenCL code too. Is this reasonable approach for other vector implementations?

3429 ↗(On Diff #57393)

I think we only need bitcast if type don't match?

test/CodeGenOpenCL/as_type.cl
7 ↗(On Diff #57393)

Should this be disallowed by the frontend -? according to the spec s6.2.4.2:

"It is an error to use as_type() or as_typen() operator to reinterpret data to a type of a different number of bytes."
yaxunl marked 7 inline comments as done.May 30 2016, 10:56 AM
yaxunl added inline comments.
lib/CodeGen/CGExprScalar.cpp
3420 ↗(On Diff #57393)

Right. The issue happens when converting vec3 type to/from non-vector types.

3427 ↗(On Diff #57393)

No. Special handling is needed not just for conversion between vec3/vec4 types, but also between vec3 and non-vector types.

3428 ↗(On Diff #57393)

This change only affects OpenCL code since __builtin_astype is defined as a keyword for OpenCL only.

According to OpenCL spec v1.1 s6.1.5, vec3 type has the same size as vec4 type, so it is allowed to be converted to any other types which have the same size as vec4 type.

3429 ↗(On Diff #57393)

Builder.CreateBitCast automatically handle this.

test/CodeGenOpenCL/as_type.cl
7 ↗(On Diff #57393)

According to the spec v1.1 s6.1.5, vec3 type has the same size as vec4 type, so it is allowed to be converted to other types which have the same size as vec4 type.

yaxunl updated this revision to Diff 58991.May 30 2016, 1:23 PM
yaxunl marked 5 inline comments as done.

Fix variable capitalization and indentation.

Anastasia added inline comments.Jun 3 2016, 4:56 AM
lib/CodeGen/CGExprScalar.cpp
3394 ↗(On Diff #58991)

should this be 3 unstead of undef?

3428 ↗(On Diff #58991)

I see. Not related to your change, but I was just wondering if it would be better to change this to a Clang builtin with a custom check at some point. It would be easier to understand and we can avoid all this parsing/AST handling complications.

test/CodeGenOpenCL/as_type.cl
8 ↗(On Diff #58991)

So what happens if the number of bytes don't match?

46 ↗(On Diff #58991)

Would it make sense to check that shufflevector is not generated?

53 ↗(On Diff #58991)

Could we add CHECK-NOT bitcast here?

yaxunl updated this revision to Diff 59630.Jun 3 2016, 2:57 PM
yaxunl marked 10 inline comments as done.

Add a sema test for mismatched type size. Fix codegen test.

lib/CodeGen/CGExprScalar.cpp
3394 ↗(On Diff #58991)

3 or undef are both OK here since the 4th component is undefined. undef may be better since if optimizer checks mask first, it knows that the 4th component is undefined without further checking the second operand.

test/CodeGenOpenCL/as_type.cl
8 ↗(On Diff #58991)

There will be compilation error and diag msg. I will add a sema test.

46 ↗(On Diff #58991)

will fix

53 ↗(On Diff #58991)

will fix

Anastasia accepted this revision.Jun 6 2016, 8:37 AM
Anastasia edited edge metadata.

LGTM!

yaxunl marked 5 inline comments as done.Jun 6 2016, 8:53 AM

Ping

Xiuli, could you please take a look? Thanks.

This revision was automatically updated to reflect the committed changes.
pxli168 edited edge metadata.Jun 11 2016, 8:28 PM

I was on a vecation. LGTM, thanks!