- User Since
- Jan 8 2015, 1:53 PM (367 w, 1 h)
LGTM in general, modulo a few nits.
Nit: looks like the changes need some clang-formatting.
Thu, Jan 13
Wed, Jan 12
I think instead of setting the triple directly from the command line, we should start with adding another --cuda-gpu-arch (AKA --offload-arch) variant and derive the triple and other parameters from it.
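As a rough illustration of the suggestion above, here is a minimal sketch of deriving the triple from the architecture name. The helper `deriveOffloadTriple` is hypothetical and not an existing clang function. It assumes only the common `sm_NN` and `gfxNNN` spellings, and the real driver logic would also need to derive the other parameters.

```cpp
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper, not part of clang: maps an --offload-arch style
// value to the target triple it implies. Only two spellings are handled.
std::optional<std::string> deriveOffloadTriple(std::string_view arch) {
  // NVIDIA GPU architectures are spelled sm_NN.
  if (arch.substr(0, 3) == "sm_")
    return "nvptx64-nvidia-cuda";
  // AMD GPU architectures are spelled gfxNNN.
  if (arch.substr(0, 3) == "gfx")
    return "amdgcn-amd-amdhsa";
  // Unknown architecture: let the caller diagnose it.
  return std::nullopt;
}
```

With something along these lines, a new architecture variant would only need another branch here, and the user would never have to spell the triple on the command line.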
Tue, Jan 11
Mon, Jan 10
Fri, Jan 7
Thu, Jan 6
Wed, Jan 5
Ping. @mojca, do you need help landing the patch?
Tue, Jan 4
Dec 14 2021
Dec 13 2021
Added @asbirlea as a reviewer for the GlobalsModRef tests.
Reverted the changes in test/Transforms/ObjCARC/basic.ll that are no longer needed.
Undo ObjC intrinsic properties change. Fixed the tests instead.
Moved nosync check into getModRefInfo, where it makes more sense as an additional check for potential side effects of the call.
Dec 9 2021
Dec 8 2021
Updated tests to use nosync attributes where the tests assumed it.
Added nosync to a couple of objc intrinsics.
Renamed the test.
Moved the check to GlobalsModRef and switched to checking for nosync.
Dec 7 2021
Dec 6 2021
Put __hip_gpubin_handle in comdat when it has linkonce_odr linkage.
Note to self: don't forget to hit "submit". The comments below have been left unsubmitted for two weeks. Sorry about that.
Nov 30 2021
Nov 29 2021
With D114601, this patch would no longer be needed.
Looks good overall. My main concern is with the test reaching out to LLVM sources and parsing tablegen files.
Nov 24 2021
Nov 22 2021
Good news: this pass does not appear to be the culprit. The miscompilation happens with earlier builds.
a9bceb2b059dc24870882a71baece895fe430107 from before this patch landed already has the issue.
FYI, there's a miscompilation apparently triggered by this change. Not sure yet whether it's the source of the problem or just exposed it.
LGTM in general, modulo push_back/append nits.
Awesome! I'll go over the script in a couple of days.
Meanwhile, could you post a representative example of the code this file generates?
Nov 18 2021
I think this patch has been obsoleted by https://reviews.llvm.org/D113249 which has already landed.
Nov 16 2021
Nov 11 2021
LGTM in general.
Nov 10 2021
Nov 9 2021
Yes, we do need to merge identical functions with identical names for templates.
The changes look good in general. Thank you for cleaning this up.
+ @rnk as it's a windows-specific change.
LGTM in general.
Nov 8 2021
I'll defer to @eugenis. Overall it looks OK to me.
Nov 5 2021
I think we're missing a few more changes here:
Nov 4 2021
With these changes, we should have consistent name mangling for kernel stubs and kernel launching mechanism on Linux and Windows.
Nov 3 2021
LGTM in general, modulo remaining nits.
Nov 2 2021
While ldu does indeed specify that it loads from read-only memory, I do not think we can treat ld.global.nc the same way.
The PTX spec says: "Load register variable d from the location specified by the source address operand a in the global state space, and optionally cache in non-coherent texture cache. Since the cache is non-coherent, the data should be read-only within the kernel's process."