Page MenuHomePhabricator

jvesely (Jan Vesely)
User

Projects

User does not belong to any projects.

User Details

User Since
Sep 24 2014, 5:35 PM (443 w, 1 d)

Recent Activity

Sep 2 2022

jvesely added a comment to D131490: [libclc] Quote addition of CLC/LLAsm flags.

This patch breaks formatting. If there's a spaces after opening "(" there should be one before closing ")"

Sep 2 2022, 7:09 AM · Restricted Project, Restricted Project

Feb 27 2022

jvesely added a comment to D107849: [Libclc] Resolve FIXME: GCN insel crashes when a == 0 or b == 0.

I don't understand what you are trying to fix. What crashes exactly?

It doesn’t crash anymore, so we can remove the workarounds in place. That’s the point.

Feb 27 2022, 12:08 PM · Restricted Project, Restricted Project

Jan 12 2022

jvesely accepted D116668: libclc: Add clspv64 target.

LGTM.

Jan 12 2022, 11:43 AM · Restricted Project

Jan 8 2022

jvesely requested changes to D116668: libclc: Add clspv64 target.

Was this tested on an existing device, or is it just for completion (since there already is clspv target)?

Jan 8 2022, 11:15 AM · Restricted Project

Apr 27 2021

jvesely added a comment to D99794: libclc: Add -cl-no-stdinc to clang flags on clang >=13.

ping, @tstellar any concerns about this one?

Apr 27 2021, 9:22 AM · Restricted Project

Apr 21 2021

jvesely added a comment to D99794: libclc: Add -cl-no-stdinc to clang flags on clang >=13.

ping

Apr 21 2021, 8:28 AM · Restricted Project

Apr 8 2021

jvesely added a comment to D99794: libclc: Add -cl-no-stdinc to clang flags on clang >=13.

Btw just out of curiosity are there any reasons for libclc not to use clang implicit headers? They support all OpenCL standards and get a lot of new features and bug fixes regularly.

historic. libclc was ignored by most of the clang opencl work and nobody volunteered to switch it to clang provided headers.
Moreover, it has been stuck in clc 1.1/1.2 so most of the new features don't apply.
The main user is mesa/clover so that one needs to be switched to using clang implicit headers first

OK, I see. Thanks for the clarification! Btw the functionality of version 1.1. and 1.2 is fairly complete in clang headers. So it should be relatively low risk if you switch to it. If there are any specific adjustments needed for libclc we can surely find a way to accommodate those.

Apr 8 2021, 7:16 AM · Restricted Project

Apr 6 2021

jvesely added a comment to D99794: libclc: Add -cl-no-stdinc to clang flags on clang >=13.

Btw just out of curiosity are there any reasons for libclc not to use clang implicit headers? They support all OpenCL standards and get a lot of new features and bug fixes regularly.

Apr 6 2021, 12:17 PM · Restricted Project

Apr 2 2021

jvesely requested review of D99794: libclc: Add -cl-no-stdinc to clang flags on clang >=13.
Apr 2 2021, 6:52 AM · Restricted Project

Mar 3 2021

jvesely added a comment to D94013: [libclc] Add clspv target for libclc.

I pushed https://github.com/alan-baker/llvm-project/commit/248f6480fecb04c343da06aee437d27d75b3a9f6 which converts tabs to spaces in fma.cl if that is helpful.

Mar 3 2021, 9:42 PM · Restricted Project
jvesely committed rG21427b8eb8e7: libclc: Add clspv target to libclc (authored by alan-baker).
libclc: Add clspv target to libclc
Mar 3 2021, 9:20 PM
jvesely closed D94013: [libclc] Add clspv target for libclc.
Mar 3 2021, 9:20 PM · Restricted Project
jvesely added a comment to D94013: [libclc] Add clspv target for libclc.

I've pushed the commit to my fork at 30d41a. I just rebased it on the latest main.

Mar 3 2021, 8:13 AM · Restricted Project

Mar 2 2021

jvesely accepted D81999: libclc: Fix rounding during type conversion.

it took a while to test both versions. this patch fixes the following conversion tests:

	 *** 622) convert_floatn_rtp( uintn ) FAILED ** 
	 *** 627) convert_floatn_rtp( intn ) FAILED ** 
	 *** 639) convert_floatn_rtz( doublen ) FAILED ** 
	 *** 649) convert_floatn_rtz( longn ) FAILED ** 
	 *** 689) convert_doublen_rtz( doublen ) FAILED **

The number of failed cases goes down from 42 to 37

Mar 2 2021, 9:01 PM · Restricted Project

Feb 28 2021

jvesely added a comment to D94013: [libclc] Add clspv target for libclc.

Thanks for the review! Would it be possible for someone with commit privileges to land this patch?

Feb 28 2021, 1:01 AM · Restricted Project

Feb 24 2021

jvesely accepted D94013: [libclc] Add clspv target for libclc.

thanks

Feb 24 2021, 11:45 AM · Restricted Project

Feb 23 2021

jvesely added a comment to D81999: libclc: Fix rounding during type conversion.

while the fix is clearly correct, it'd be nice to do a test run and report changes.
I can run it on my carrizo machine, but the conversion tests take more than a day.

Feb 23 2021, 11:32 AM · Restricted Project
jvesely added a comment to D94013: [libclc] Add clspv target for libclc.

Is there anything else necessary to complete this review?

Feb 23 2021, 11:28 AM · Restricted Project

Feb 17 2021

jvesely added a comment to D94013: [libclc] Add clspv target for libclc.

thanks. can you fix the "to avoid to use" part of the description? the "to avoid" part sounds redundant.
It explains why fma needs to be different, but it could use an explanation why it needs sw implementation of nextafter.

Feb 17 2021, 6:53 PM · Restricted Project

Feb 16 2021

jvesely updated subscribers of D94013: [libclc] Add clspv target for libclc.

The FMA does pass conformance. I don't have performance numbers for the 64-bit version vs the uint2 version. It would be worse, but the whole software FMA is already awful performance. The 64-bit version is simpler to understand, so unless you have the same restrictions as I need for clspv, then I wouldn't adopt it.

Hopefully, the different target name will clarify the intent better. This is targeted for use with a cross compiler to Vulkan SPIR-V. The needs are different than the spir targets.

There are existing spirv and spirv64 targets. The new clspv target uses the same spir-- triple as the spirv target, but doesn't use the final spv conversion step and uses O3 optimization. The selection of files is mostly a subset of the existing spirv target.

Is there somewhere in the repo you'd suggest I document the differences? Nothing jumped out at me that explains the various backends. In addition to fma, nextafter is also a different implementation than the spirv targets.

Feb 16 2021, 7:35 PM · Restricted Project

Feb 14 2021

jvesely added a comment to D94013: [libclc] Add clspv target for libclc.

The FMA does pass conformance. I don't have performance numbers for the 64-bit version vs the uint2 version. It would be worse, but the whole software FMA is already awful performance. The 64-bit version is simpler to understand, so unless you have the same restrictions as I need for clspv, then I wouldn't adopt it.

Hopefully, the different target name will clarify the intent better. This is targeted for use with a cross compiler to Vulkan SPIR-V. The needs are different than the spir targets.

Feb 14 2021, 10:19 PM · Restricted Project

Feb 10 2021

jvesely added a comment to D94013: [libclc] Add clspv target for libclc.

It'd be nice to add an explanation why this can't reuse the existing SPIR-V targets and needs to add a new one.
Does the new FMA implementation generate significantly worse code (EG/NI + mesa-spirv should be the only users) than the original? Should it just be replaced? I assume it passes the conformance suite.

Feb 10 2021, 9:16 PM · Restricted Project

Oct 16 2020

jvesely added a comment to D89372: [OpenCL] Remove unused extensions.

Ok, so the pragma is supposed to control the pointer dereferencing which seems like a valid case but however, it is not implemented now. Do we know of any immediate plans from anyone to implement this? If not I would prefer to remove the pragma for now? If someone decided to implement this functionality later fully it is absolutely fine. Pragma will be very easy to add. There is no need for everyone to pay the price for something that can't be used at the moment.

I though the current behaviour of 'you can use #pragma, but the dereferences are always legal' was intentional for backward compatibility.
I don't think there are programs that would set it to 'disabled' and expect compilation failure.

The initial state of the pragma is disabled, so if disabling the extension isn't supposed to do anything then I don't know why would anyone ever enable it?

My concern is that legacy apps would set '#pragma enabled' before using char/short. such behavior would produce warning/error if with this change applied.

Correct, it will compile with a warning but not fail to compile. I am willing to introduce a -W option (if not present already ) to suppress that warning if it helps with the clean up and backward compatibility. It might also open up opportunities for a wider clean up - removing pragma in extensions that currently require pragma for no good reason. I have written more details on this in https://github.com/KhronosGroup/OpenCL-Docs/pull/355#issuecomment-662679499

Oct 16 2020, 12:14 PM · Restricted Project
jvesely added a comment to D89372: [OpenCL] Remove unused extensions.

Ok, so the pragma is supposed to control the pointer dereferencing which seems like a valid case but however, it is not implemented now. Do we know of any immediate plans from anyone to implement this? If not I would prefer to remove the pragma for now? If someone decided to implement this functionality later fully it is absolutely fine. Pragma will be very easy to add. There is no need for everyone to pay the price for something that can't be used at the moment.

Oct 16 2020, 6:34 AM · Restricted Project

Oct 15 2020

jvesely added a comment to D89372: [OpenCL] Remove unused extensions.

cl_khr_byte_addressable_stores changes language semantics. without it, pointer dereferences of types smaller than 32 bits are illegal.

Ok, does it mean that we are missing to diagnose this in the frontend? Btw I do acknowledge that what you say makes sense but I don't think the documentation support that:
https://www.khronos.org/registry/OpenCL/sdk/2.2/docs/man/html/cl_khr_byte_addressable_store.html

Am I just looking in the wrong place?

Oct 15 2020, 8:24 PM · Restricted Project
jvesely added a comment to D89372: [OpenCL] Remove unused extensions.

cl_khr_byte_addressable_stores changes language semantics. without it, pointer dereferences of types smaller than 32 bits are illegal.
Even if all clang targets support this the macro should still be defined for backward compatibility.

Oct 15 2020, 11:07 AM · Restricted Project

Oct 1 2020

jvesely accepted D88366: libclc: Use find_package to find Python 3 and require it.

LGTM

Oct 1 2020, 8:19 AM · Restricted Project

Sep 30 2020

jvesely added a comment to D88366: libclc: Use find_package to find Python 3 and require it.

I think this should use FindPython3 and drop the shebang from gen_convert.py to fix the 2 vs. 3 issues as well.

Sep 30 2020, 8:09 AM · Restricted Project

Sep 15 2020

jvesely committed rG291bfff5dbb7: libclc: Add a __builtin to let SPIRV targets select between SW and HW FMA (authored by daniels).
libclc: Add a __builtin to let SPIRV targets select between SW and HW FMA
Sep 15 2020, 10:39 PM
jvesely closed D85910: libclc: Add a __builtin to let SPIRV targets select between SW and HW FMA.
Sep 15 2020, 10:39 PM · Restricted Project
jvesely accepted D85910: libclc: Add a __builtin to let SPIRV targets select between SW and HW FMA.
Sep 15 2020, 9:55 AM · Restricted Project
jvesely added a comment to D81999: libclc: Fix rounding during type conversion.

I think these routines need to be rewritten and tested with CTS on platforms that use them.

Sep 15 2020, 9:54 AM · Restricted Project

Sep 10 2020

jvesely committed rG16ba78ee627c: libclc/spirv: Add missing files from D85911 (authored by jvesely).
libclc/spirv: Add missing files from D85911
Sep 10 2020, 9:18 PM

Sep 9 2020

jvesely committed rG060c8e083dd6: libclc/spirv: Add various functions (authored by daniels).
libclc/spirv: Add various functions
Sep 9 2020, 11:27 PM
jvesely closed D85911: libclc: spirv: Add various functions.
Sep 9 2020, 11:27 PM · Restricted Project
jvesely added a comment to D85911: libclc: spirv: Add various functions.

@jvesely, would you be able to land this for us if you're happy with it?

Sep 9 2020, 2:39 PM · Restricted Project

Sep 2 2020

jvesely added a comment to D85911: libclc: spirv: Add various functions.

so the translator can translate calls to fma, but not call to llvm.fma? that's weird. Please add that in a comment. Otherwise LGTM.

Right, it doesn't translate the LLVM intrinsics, only what's covered by the spec (see e.g. the OpenCL extension opcode definitions) as builtins, so it can be precise about what they mean and require.

Would you like the comment in fma.inc or in a commit message?

Sep 2 2020, 6:02 PM · Restricted Project

Sep 1 2020

jvesely added a comment to D85911: libclc: spirv: Add various functions.

I understand the mechanism you're aiming for. It' not clear whether the absence of fp64 path is intentional or not. afaik, all hw exposing fp64 also implements fma.fp64 in hw so generic clc implementation works.

Agreed, I expect all fp64 hardware has a native fp64 fma. However, for hardware-native implementations of CL functionality, we don't need a libclc function which maps to an LLVM intrinsic, because the SPIRV-LLVM-Translator maps calls to fma() to a SPIR-V opcode -- regardless of whether that fma() has an implementation present in the module. In fact, we *can't* have functions which map to LLVM intrinsics, because the SPIRV-LLVM-Translator will *fail* to map that to SPIR-V at all -- it doesn't map to the SPIR-V opcode.

resuing the generic clc fma.cl still works to expose the knob for fma.fp32 added in D85910

It does not, because that results in code which calls the LLVM intrinsic, which fails to translate to SPIR-V.

Sep 1 2020, 10:55 AM · Restricted Project

Aug 31 2020

jvesely added a comment to D85911: libclc: spirv: Add various functions.

I understand the mechanism you're aiming for. It' not clear whether the absence of fp64 path is intentional or not. afaik, all hw exposing fp64 also implements fma.fp64 in hw so generic clc implementation works.

Aug 31 2020, 9:17 AM · Restricted Project
jvesely requested changes to D85910: libclc: Add a __builtin to let SPIRV targets select between SW and HW FMA.
Aug 31 2020, 9:05 AM · Restricted Project
jvesely added inline comments to D85910: libclc: Add a __builtin to let SPIRV targets select between SW and HW FMA.
Aug 31 2020, 8:13 AM · Restricted Project
jvesely added a comment to D85911: libclc: spirv: Add various functions.

Why does this need a special implementation of fma.cl and fma.inc? The fp64 paths are missing, is there an expectation that something else will translate fma(double) function calls to llvm.fma opcodes?

Aug 31 2020, 8:09 AM · Restricted Project

Aug 16 2020

jvesely accepted D84392: libclc: add printf support on amd target.

Thanks. This LGTM, but you might want to get @tstellar's opinion as well.

Aug 16 2020, 9:35 AM

Aug 9 2020

jvesely added a comment to D84392: libclc: add printf support on amd target.

Have you checked that returning NULL is handled properly?
Should the buffer wrap around instead?

Aug 9 2020, 10:29 AM

Jul 24 2020

jvesely added a comment to D84392: libclc: add printf support on amd target.

Does the printf expansion work for all amdgpu triples, including amdgcn--mesa3d?
will it work for non-ROC capable GPUs?

Nothing in the compiler actually depends on roc capable GPUs. The limitations are all runtime/driver related

The exception to this would be places in the libraries that assume flat instructions are available. Theoretically we could handle these in the backend for SI with tagged pointers or something

so there are no hidden assumptions; like the printf buffer being cache coherent with host CPU, or the implementation using s_sendmsg to trigger CPU draining the buffer before the kernel finishes execution?

The implementation use a regular device buffer that is read at the end of the execution. Mostly like a clEnqueueReadBuffer

Jul 24 2020, 7:04 PM

Jul 23 2020

jvesely added a comment to D84392: libclc: add printf support on amd target.

Does the printf expansion work for all amdgpu triples, including amdgcn--mesa3d?
will it work for non-ROC capable GPUs?

Nothing in the compiler actually depends on roc capable GPUs. The limitations are all runtime/driver related

The exception to this would be places in the libraries that assume flat instructions are available. Theoretically we could handle these in the backend for SI with tagged pointers or something

Jul 23 2020, 3:55 PM
jvesely added a comment to D84392: libclc: add printf support on amd target.

Does the printf expansion work for all amdgpu triples, including amdgcn--mesa3d?
will it work for non-ROC capable GPUs?

Jul 23 2020, 2:16 PM
jvesely added a comment to D84392: libclc: add printf support on amd target.

I don't think there are many benefits in merging this ahead of time without the rest of printf implementation so that it can be tested.

Jul 23 2020, 12:49 PM

Jul 15 2020

jvesely accepted D83473: libclc: Fix FP_ILOGBNAN definition.
Jul 15 2020, 5:15 AM · Restricted Project

Jul 14 2020

jvesely added a comment to D83473: libclc: Fix FP_ILOGBNAN definition.

Since OpenCL allows apps to provide precompiled intermediate representations to the API, in the form of SPIR or SPIR-V, that means that the app could have embedded references to FP_ILOGBNAN, which are just a constant value since it's a #define, in their own code. They could write a kernel which calls ilogb and compares the result to FP_ILOGBNAN, and compile that with either clang trunk (which uses opencl-c.h automatically), or with Khronos's offline compiler (https://github.com/KhronosGroup/SPIR) using Khronos's standard library headers (https://github.com/KhronosGroup/libclcxx). Both of these standard library implementation headers define FP_ILOGBNAN to be INT_MAX:

I've no problem with the compatibility argument. but using INT_MAX for both NaN and Inf is in my understanding wrong.

Again, I didn't see this clearly stated in the spec. Can you quote the spec or point us to the relevant section?

Jul 14 2020, 11:24 AM · Restricted Project

Jul 10 2020

jvesely added a comment to D83473: libclc: Fix FP_ILOGBNAN definition.

tldr; I don't mind changing this to be in sync with clang, just get the patch right.
However, it's not the right way to guarantee IR level compatibility between clang-cl-headers and libclc.
details below

Jul 10 2020, 8:30 PM · Restricted Project
jvesely added a comment to D83473: libclc: Fix FP_ILOGBNAN definition.

What is the problem this patch is trying to address?

Well, the primary goal was to have consistent values in clang/lib/Headers/opencl-c-base.h and libclc/generic/include/clc/float/definitions.h to avoid the mess when one links against libclc but includes opencl-c-base.h.
Not to mention that having 2 conflicting definitions in headers that both lives in the same code base and are supposed to represent the same thing is confusing, to say the least.

Jul 10 2020, 11:37 AM · Restricted Project

Jul 9 2020

jvesely added a comment to D83473: libclc: Fix FP_ILOGBNAN definition.

What is the problem this patch is trying to address?
The specs do not mandate these two values to be different.

Jul 9 2020, 8:27 PM · Restricted Project

Jun 16 2020

jvesely accepted D77589: libclc: Add Mesa/SPIR-V target.

thanks!

Jun 16 2020, 2:18 PM · Restricted Project

Jun 4 2020

jvesely added inline comments to D77589: libclc: Add Mesa/SPIR-V target.
Jun 4 2020, 8:14 AM · Restricted Project

Apr 29 2020

jvesely added a comment to D78979: OpenCL: Include builtin header by default.

I'm also wondering if using -nogpulib for the corresponding linker purpose was correct, since in the OpenCL case it's not really an offload target. Maybe this should switch to -nostdlib?

-nogpulib is fine since opencl compiler is in parallel with the device compiler of CUDA/HIP. The library it uses is the device library.

Apr 29 2020, 4:46 PM
jvesely added a comment to D78535: libclc: Pass system libraries to the linker after llvm libraries.

This is the failing cmdline (from here https://github.com/tstellar/llvm-project/runs/587088911):

/usr/bin/c++ CMakeFiles/prepare_builtins.dir/utils/prepare-builtins.cpp.o -o prepare_builtins -L/home/runner/work/llvm-project/llvm-project/build/lib -Wl,-rpath,/home/runner/work/llvm-project/llvm-project/build/lib -lz -lrt -ldl -ltinfo -lpthread -lm -lxml2 -lLLVMBitWriter -lLLVMAnalysis -lLLVMProfileData -lLLVMObject -lLLVMTextAPI -lLLVMMCParser -lLLVMMC -lLLVMDebugInfoCodeView -lLLVMDebugInfoMSF -lLLVMBitReader -lLLVMCore -lLLVMRemarks -lLLVMBitstreamReader -lLLVMBinaryFormat -lLLVMSupport -lLLVMDemangle

Apr 29 2020, 3:06 PM · Restricted Project

Apr 23 2020

jvesely added a comment to D78535: libclc: Pass system libraries to the linker after llvm libraries.

was this a problem with using -Wl,--as-needed?

LGTM.

No, I was hitting this when using the default build options for both LLVM and libclc.

Apr 23 2020, 5:25 PM · Restricted Project

Apr 22 2020

jvesely accepted D78535: libclc: Pass system libraries to the linker after llvm libraries.

was this a problem with using -Wl,--as-needed?

Apr 22 2020, 4:52 PM · Restricted Project

Apr 16 2020

jvesely added a comment to D78220: IR/ConstantRange: Add support for FP ranges.

Unless we use pointers and an extra level of indirection the end sizes for users like ValueLattice end up the same irrespective of where the union is (lower+upper vs. backend vs. user).

I'd expect that this is indeed what we want to do. Given that the float range is just so much larger, we'll want to allocate it out of line.

Apr 16 2020, 10:42 PM · Restricted Project, Restricted Project
jvesely added a comment to D77923: OpenCL: Fix some missing predefined macros.

LGTM. The objective of the change is to simplify offline compilation. We want to avoid asking users to add -D if a proper macro can be inferred from triple and cpu. Ideally, users only need to specify triple and cpu with clang driver. In case users want to override the default macros for the triple, they can always override them by -D or -U in command line.

Apr 16 2020, 3:37 PM · Restricted Project
jvesely added a comment to D78220: IR/ConstantRange: Add support for FP ranges.

I would strongly recommend to implement this as a separate type (FloatRange for example), rather than combining it into ConstantRange. There's a couple reasons for that, the main ones would be a) float ranges have significantly different semantics, in particular the fact that they are inclusive and b) ConstantRange is a size critical structure, because it is used in value lattice elements. As implement, this change is going to skyrocket memory consumption during SCCP.

+1, I think there is a strong case for not increasing the size here. We should rather try to reduce the size of the existing ConstantRange (using the 2 APInts means we store the same bit width twice :()

The point of having one combined range is to allow simplification of logic for users. you don't have to check for the type just do the same operations union/intersect/binop/contains/...
I'm not sure how splitting it to a separate class would address the memory concerns. Tracking fp ranges will require two APFloats irrespective of the structure.
If keeping both APFloat and APint bounds is a problem I can put them in a union since they are used exclusively and an instance never switches between float and int.

Even with a union, you still need extra space for the bits to distinguish between FP/integers (and other FP related flags, like canBeNaN. It should be possible to achieve the same level of convenience to this patch, but separating integer and FP ranges into separate classes. E.g. add ConstantRangeIntStorage & ConstantRangeFloatStorage which use APInt/APFloat respectively. The shared logic can be implemented in a ConstantRangeCommon template, which can be instantiated with either ConstantRangeIntStorage/ConstantRangeFPStorage. ConstantRange would turn into an instantiation of ConstantRangeCommon<ConstantRangeIntStorage>. Of course that would require updating code to make use of the FP version of ConstantRange. That might be a benefit actually, because there probably are various places that assume ConstantRange only contains integers.

When using the FP version of the constant range in ValueLattice, that would allow us to keep the size the same, if no extra bits (like canBeNaN) are added.

Apr 16 2020, 12:14 PM · Restricted Project, Restricted Project

Apr 15 2020

jvesely added a comment to D78224: ValueLattice, LVI, SCCP: Use floating point constant ranges.

Very interesting! On a high-level, it would probably be good to split those changes into 3 (ValueLattice, LVI & SCCP); not sure how tricky that's going to be given adding automatic support in markConstant & co. If it's too tricky splitting LVI & SCCP would be great.

Apr 15 2020, 2:55 PM · Restricted Project
jvesely added a comment to D78220: IR/ConstantRange: Add support for FP ranges.

I would strongly recommend to implement this as a separate type (FloatRange for example), rather than combining it into ConstantRange. There's a couple reasons for that, the main ones would be a) float ranges have significantly different semantics, in particular the fact that they are inclusive and b) ConstantRange is a size critical structure, because it is used in value lattice elements. As implement, this change is going to skyrocket memory consumption during SCCP.

The point of having one combined range is to allow simplification of logic for users. you don't have to check for the type just do the same operations union/intersect/binop/contains/...
I'm not sure how splitting it to a separate class would address the memory concerns. Tracking fp ranges will require two APFloats irrespective of the structure.
If keeping both APFloat and APint bounds is a problem I can put them in a union since they are used exclusively and an instance never switches between float and int.
The difference in semantics shouldn't impact users much. they should use contains/isEmpty/isFull/isSingleElement/getABCrange instead of checking the bounds manually.
The application to valluelattice/lvi/sccp is in D78224, to keep diff simple it hasn't implemented cleanups that are allowed by having one range class, yet.

Apr 15 2020, 2:55 PM · Restricted Project, Restricted Project
jvesely added a comment to D78224: ValueLattice, LVI, SCCP: Use floating point constant ranges.

There were few issues in SCCP tests.

  • apint-load -- the result of the division operation of the original operands depends on rounding mode. it's represented as a range rather than a constant.
  • apint-ipsccp4 -- we can now deduce the starting range of v1 in ssa_copy intrinsic as [-Inf, 2.999] which then gets merged with more accurate information from the function call. The original behaviour started with unknown which then gets combined with more accurate information. Thus having less information at the beginning gives a better result. changing the comparison condition check to only allow Icmp works around the issue and keeps the old behaviour.
  • float-nan-simplification -- NaNs with different payloads/sign are not tracked separately, so one case returns +NaN while the other one returns -NaN. I've adjusted the comment in the test to explain the situation.
Apr 15 2020, 12:03 PM · Restricted Project
jvesely created D78224: ValueLattice, LVI, SCCP: Use floating point constant ranges.
Apr 15 2020, 11:30 AM · Restricted Project
jvesely created D78222: Transform/Float2Int: Don't rely on hack support for floating point type in ConstantRange.
Apr 15 2020, 11:30 AM · Restricted Project
jvesely created D78220: IR/ConstantRange: Add support for FP ranges.
Apr 15 2020, 11:30 AM · Restricted Project, Restricted Project

Apr 14 2020

jvesely added a comment to D77923: OpenCL: Fix some missing predefined macros.

OPENCL_VERSION is something that should be really set by the opencl driver rather than the compiler.
coarse-grained SVM can be done without FEATURE_FLAT_ADDRESS_SPACE, so those gpus can get over 1.2, while the compiler can be used in an implementation that doesn't go above 1.2 even for gfx700+

I don't know why this would imply we can't just have a static __OPENCL_VERSION__ for each device in Clang? I don't see anything in the standard that says you are not allowed to have (going with your example) __OPENCL_VERSION__=200 and __OPENCL_C_VERSION=120 at the same time.

It seems like we should be able to just have __OPENCL_VERSION__ defined in Clang, and let the runtime control setting __OPENCL_C_VERSION__ via -cl-std=. If the current patch sets __OPENCL_VERSION__ differently than the runtime does then it seems like we should make it match, but I'm not sure how best to confirm that.

The __OPENCL_VERSION__ macro in the kernel needs to match the version returned by clGetDeviceInfo (both refer to OpenCL version supported by the device),
which in turn must be <= version returned by clGetPlatformInfo.
This part makes it unsuitable for hardcoding in the compiler.

__OPENCL_VERSION__ should indicate device capabilities rather than language features. specs talk about available resources.
For example, CL_DEVICE_IMAGE2D_MAX_HEIGHT is required to be >= 8192 for OpenCL 1.2 while it's required to be >=16384 for OpenCL 2.1.

But why would the driver not report the device values which we do know in the compiler? A discrepancy between these sounds like a driver bug.

Apr 14 2020, 1:30 PM · Restricted Project
jvesely committed rG24fad7278a39: libclc: Use temporary files rather than a pipe (authored by daniels).
libclc: Use temporary files rather than a pipe
Apr 14 2020, 7:28 AM
jvesely committed rGcccdd0579b50: libclc: Don't pass linker flags to CLC/LLAsm (authored by daniels).
libclc: Don't pass linker flags to CLC/LLAsm
Apr 14 2020, 7:28 AM
jvesely closed D77165: libclc: Use temporary files rather than a pipe.
Apr 14 2020, 7:28 AM · Restricted Project
jvesely committed rGacf079006e6d: libclc: Use echo rather than true for try_compile (authored by daniels).
libclc: Use echo rather than true for try_compile
Apr 14 2020, 7:28 AM
jvesely closed D77164: libclc: Don't pass linker flags to CLC/LLAsm.
Apr 14 2020, 7:28 AM · Restricted Project
jvesely committed rGe6bb1d69eccc: libclc: Fix LLVM library linking on Windows (authored by daniels).
libclc: Fix LLVM library linking on Windows
Apr 14 2020, 7:28 AM
jvesely closed D77163: libclc: Use echo rather than true for try_compile.
Apr 14 2020, 7:28 AM · Restricted Project
jvesely closed D77162: libclc: Fix LLVM library linking on Windows.
Apr 14 2020, 7:28 AM · Restricted Project

Apr 13 2020

jvesely added a comment to D77923: OpenCL: Fix some missing predefined macros.

OPENCL_VERSION is something that should be really set by the opencl driver rather than the compiler.
coarse-grained SVM can be done without FEATURE_FLAT_ADDRESS_SPACE, so those gpus can get over 1.2, while the compiler can be used in an implementation that doesn't go above 1.2 even for gfx700+

I don't know why this would imply we can't just have a static __OPENCL_VERSION__ for each device in Clang? I don't see anything in the standard that says you are not allowed to have (going with your example) __OPENCL_VERSION__=200 and __OPENCL_C_VERSION=120 at the same time.

It seems like we should be able to just have __OPENCL_VERSION__ defined in Clang, and let the runtime control setting __OPENCL_C_VERSION__ via -cl-std=. If the current patch sets __OPENCL_VERSION__ differently than the runtime does then it seems like we should make it match, but I'm not sure how best to confirm that.

Apr 13 2020, 4:52 PM · Restricted Project

Apr 11 2020

jvesely added a comment to D77923: OpenCL: Fix some missing predefined macros.

OPENCL_VERSION is something that should be really set by the opencl driver rather than the compiler.
coarse-grained SVM can be done without FEATURE_FLAT_ADDRESS_SPACE, so those gpus can get over 1.2, while the compiler can be used in an implementation that doesn't go above 1.2 even for gfx700+

Apr 11 2020, 9:52 PM · Restricted Project

Apr 10 2020

jvesely accepted D77165: libclc: Use temporary files rather than a pipe.

I've checked that custom targets are not cleaned up either, so I think it's good to go.
lgtm

Apr 10 2020, 12:30 AM · Restricted Project

Apr 6 2020

jvesely accepted D77164: libclc: Don't pass linker flags to CLC/LLAsm.

Have you tried clearing the target link flags using set_target_properties( ... LINK_FLAGS ... )?
I think having the ability to pass LINK_FLAGS here might be useful in the future.

I tried, but the manual says that a) you must use STATIC_LIBRARY_FLAGS for static libraries, and b) you must use STATIC_LIBRARY_OPTIONS in preference to the former, as it sets rather than appends.

Nevertheless, neither of them actually worked for me, and both continued to pass the machine linker flags.

Apr 6 2020, 7:38 PM · Restricted Project
jvesely accepted D77162: libclc: Fix LLVM library linking on Windows.

LGTM. thanks

Apr 6 2020, 7:38 PM · Restricted Project

Apr 3 2020

jvesely added a comment to D77165: libclc: Use temporary files rather than a pipe.

What was the generator target you used?

I'm using Ninja [...]

As noted in D17762, we have to use Ninja rather than the VS backend, as the 'standard' way to add new languages for compile in CMake doesn't work with the VS backend. CMake's VS backend has a hardcoded-in-C++-code list of acceptable source languages; when you try to produce a static library with sources in an unknown-to-CMake-VS-backend language, the target ends up completely empty with no actions to produce the output, because the CMake backend doesn't know how.

Apr 3 2020, 12:25 PM · Restricted Project
jvesely added a comment to D77165: libclc: Use temporary files rather than a pipe.

Windows does have pipes. they work both in cmd and powershell, but perhaps cmake is not using shell to invoke these commands?
What was the generator target you used?

Apr 3 2020, 6:57 AM · Restricted Project

Apr 2 2020

jvesely added a comment to D77162: libclc: Fix LLVM library linking on Windows.

I don't see what the second part is trying to address. Does visual studio not detect the static libraries on the command line? I presume the dynamic versions are not present.
If anything it should check llvm-config --shared-mode if available instead of checking for win32. Do all visual studio/clang/mingw use the same flag?

Yes, clang runs as clang-cl which accepts MSVC-type arguments, though you can also invoke clang to pass UNIX-style arguments. For the linker, however, there is no option; lld-link only accepts MSVC-type arguments on Windows and nothing else. I don't know what MinGW does. Is there an easy way to test that? Do you currently build on MinGW?

I could test llvm-config --shared-mode == static && WIN32 to apply /MT, as I tried various other methods of convincing Windows to link statically and not dynamically with no luck. On UNIX this isn't an issue as you can happily mix the two modes.

Apr 2 2020, 4:17 PM · Restricted Project
jvesely accepted D77163: libclc: Use echo rather than true for try_compile.

This will add noise to CMakeOutput.log and CMakeError.log, but I guess it's just one line.

Apr 2 2020, 3:44 PM · Restricted Project
jvesely added a comment to D77164: libclc: Don't pass linker flags to CLC/LLAsm.

Have you tried clearing the target link flags using set_target_properties( ... LINK_FLAGS ... )?
I think having the ability to pass LINK_FLAGS here might be useful in the future.

Apr 2 2020, 3:44 PM · Restricted Project

Apr 1 2020

jvesely added a comment to D77162: libclc: Fix LLVM library linking on Windows.

I think the first part should use separate_arguments the same way as LLVM_CXXFLAGS does. (you can also add the following change while at it:

-set( LLVM_CXX_FLAGS ${LLVM_CXX_FLAGS} -fno-rtti -fno-exceptions )
+list( APPEND LLVM_CXX_FLAGS -fno-rtti -fno-exceptions )
Apr 1 2020, 7:41 AM · Restricted Project

Feb 25 2020

jvesely committed rG814fb658ca26: libclc: cmake configure should depend on file list (authored by jvesely).
libclc: cmake configure should depend on file list
Feb 25 2020, 1:48 AM
jvesely closed D74662: libclc: cmake configure should depend on file lists.
Feb 25 2020, 1:48 AM · Restricted Project

Feb 21 2020

jvesely added a comment to D74662: libclc: cmake configure should depend on file lists.

@tstellar , is there a way to make libclc ML an automatic subscriber to phabricator posted libclc patches?

I think you need to ask the phabricator mantainers: https://llvm.org/docs/Phabricator.html#status

Feb 21 2020, 10:14 AM · Restricted Project

Feb 20 2020

jvesely committed rGefeafa1bdaa7: libclc: Use acos implementation from amd_builtins (authored by jvesely).
libclc: Use acos implementation from amd_builtins
Feb 20 2020, 9:06 PM
jvesely closed D74011: libclc: Use acos implementation from amd_builtins.
Feb 20 2020, 9:06 PM · Restricted Project

Feb 18 2020

jvesely added a comment to D74011: libclc: Use acos implementation from amd_builtins.

ping

Feb 18 2020, 11:01 PM · Restricted Project

Feb 14 2020

jvesely added a comment to D74662: libclc: cmake configure should depend on file lists.

@tstellar , is there a way to make libclc ML an automatic subscriber to phabricator posted libclc patches?

Feb 14 2020, 8:07 PM · Restricted Project
jvesely created D74662: libclc: cmake configure should depend on file lists.
Feb 14 2020, 7:39 PM · Restricted Project

Feb 10 2020

jvesely added a comment to D74323: AMDGPU: Move R600 test compatability hack.

Is there any test left that tests the r600 intrinsics? If yes, LGTM.

Feb 10 2020, 7:57 AM · Restricted Project

Feb 9 2020

jvesely committed rG85e2fa44c64e: libclc/r600: Use target specific builtins to implement rsqrt and native_rsqrt (authored by jvesely).
libclc/r600: Use target specific builtins to implement rsqrt and native_rsqrt
Feb 9 2020, 11:47 AM
jvesely closed D74016: libclc/r600: Use target specific builtins to implement rsqrt and native_rsqrt.
Feb 9 2020, 11:47 AM · Restricted Project
jvesely committed rG4b23a2e8e971: libclc: Move rsqrt implementation to a .cl file (authored by jvesely).
libclc: Move rsqrt implementation to a .cl file
Feb 9 2020, 11:47 AM