According to C99 standard long long is at least 64 bits in size. However, OpenCL C defines long long as 128 bit signed integer. This prevents one to use x86 builtins when compiling OpenCL C code for x86 targets. The patch changes long long to long for OpenCL only.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
'O' is an interesting choice. Any real justification for it, or just "what was available"? It definitely needs to be documented in the top of Builtins.def however.
clang/lib/AST/ASTContext.cpp | ||
---|---|---|
9362 ↗ | (On Diff #201873) | Likely need to update the asserts in the other places this is used, such as 'Z'. Also, add the 'O' to the documentation in Builtins.def |
A different (perhaps silly) question is why 'W' isn't sufficient? It represents int64_t, which I wonder if is sufficient.
I had asked in a separate conversation not to change the signatures outside of opencl. I believe W would use 'long' on 64-bit targets that aren't Windows or X32. The builtins are defined by gcc as well so I wanted to avoid making them different as much as possible.
O is for OpenCL. I planned to use 'C' but it is already used.
I'll add docs. Thank you.
Conditional approval, commit with the comment fixed. Let me know if you need me to commit it for you.
clang/include/clang/Basic/Builtins.def | ||
---|---|---|
56 ↗ | (On Diff #202014) | End the comment with a period. |
What about a testcase? It shouldn't be hard to add a small testcase that demonstrate that the changed builtins now work when compiling OpenCL C code for x86. I don't think you have to add all changed builtins to the testcase (but a few to demonstrate the change).
@Ka-Ka good point. Thank you.
@Anastasia Would such tests be ok with you?
@erichkeane Thank you very much. I think I don't have permissions to commit changes and will need someone's help.
clang/test/CodeGen/builtins-x86.c | ||
---|---|---|
127–128 ↗ | (On Diff #202403) | I don't like the introduced casts. Can we change the testcase to operate on the appropriate types instead? tmp_V2d = __builtin_ia32_undef128(); tmp_V4d = __builtin_ia32_undef256(); What do you think? |
228–231 ↗ | (On Diff #202403) | same here? |
250 ↗ | (On Diff #202403) | and here? |
428 ↗ | (On Diff #202403) | Can we change this line to? tmp_V4i = __builtin_ia32_pmaddwd128(tmp_V8s, tmp_V8s); |
432 ↗ | (On Diff #202403) | Can we change this line to? tmp_V16c = __builtin_ia32_palignr128(tmp_V16c, tmp_V16c, imm_i); |
433 ↗ | (On Diff #202403) | Can we change this line to? tmp_V8c = __builtin_ia32_palignr(tmp_V8c, tmp_V8c, imm_i); |