- User Since
- Jan 8 2015, 1:53 PM (179 w, 4 d)
Wed, Jun 13
Tue, Jun 12
I've discussed this with @echristo . Bottom line is that this patch is OK to proceed, and that I'm on the hook to add a NVPTX-specific pass to enforce that all target-cpu and and target-feature attributes in the TU are compatible with the current compilation parameters. I.e. if we're compiling for sm_50, then IR with functions that have target-cpu=sm_60 will result in an error.
Few nits. LGTM syntax-wise. I'm not very familiar with backend, so will defer to someone who has better idea.
Last comment in the bug pointed out that those overloads should be constexpr in c++14. Maybe in a separate patch, though.
Fri, Jun 8
Thu, Jun 7
Wed, Jun 6
Few nits. LGTM overall.
Few minor nits/suggestions. LGTM otherwise.
@rsmith - Richard, can you take a look?
Tue, Jun 5
Just to make it clear, I'm not against making a sensible default choice, but rather want to make sure that it is possible to override it if the user needs to.
I'm not sure this is the right thing to do. What if user explicitly wants device-side dependencies and runs clang --cuda-device-only -M ? This patch makes it impossible.
I'll wait to see if that fixes @Hahnfeld's problem.
I've experimented a bit and I think that we may not need this patch at all.
As far as I can tell, nv_weak is only applicable to device functions. It's ignored for global kernels and is apparently forbidden for data.
For device functions nvcc produces .weak attribute in PTX.
With the updated patch description + the discussion I'm OK with the approach from the general "how do we compile/use CUDA" point of view. I'll leave the question of whether the approach works for OpenMP to someone more familiar with it.
Looks OK overall, but I'm not very familiar with CGCall, so you may want to dig through the history and find someone with more expertise.
Fri, Jun 1
IMO overriding TargetTransformInfo::areInlineCompatible to always return true on NVPTX is what we want to do instead of upgrading everything else.
AFAICT, on NVPTX there's no reason to prevent inlining due to those attributes -- we'll never generate code, nor will we ever execute it on any other GPU than we're currently compiling for.
IIUIC, nv_weak is a synonym for weak (<rant>why, oh why did they need it?</rant>)
You may need to hunt down and change few other places that deal with the weak attribute.
Thu, May 31
Wed, May 30
Tue, May 29
Perhaps unrelated documentation cleanup can be extracted into a separate patch.
LGTM in general.
It would be good to add a test that we can use bits w/o explicit casting now.
"Interoperability with other compilers" is probably a statement that's a bit too strong. At best it's kind of compatible with CUDA tools and I don't think it's feasible for other compilers. I.e. it will be useless for AMD GPUs and whatever compiler they use.
Fri, May 25
Wed, May 23
Tue, May 22
CUDA does not expose explicit AS on clang size. All pointers are treated as generic and we infer specific address space only in LLVM.
__nvvm_atom_*_[sg]_* builtins should probably be removed as they are indeed useless without pointers with explicit AS and NVCC itself does not have such builtins either. Instead, we should convert the generic AS builtin to address-space specific instruction somewhere in LLVM.
@sanjoy is more familiar with this, so I've asked him to take a look.
May 18 2018
One more thing -- it would be really good to add some tests to make sure your commands are constructed the way you want.
Sorry about the long silence. I'm back to continue the reviews. I'll handle what I can today and will continue with the rest on Tuesday.
This was not intended. :-( I was unaware that GetCPUAndFeaturesAttributes() would add any feature that looks like a valid CPU name to the target-cpu attribute.
All I needed is to make builtins available or not. Setting them as function attributes is not what we need here.
May 17 2018
May 11 2018
May 10 2018
Fixed mangled comment.
May 9 2018
May 8 2018
Great! Let's close this review then.
And good luck with cling.
May 4 2018
Perhaps we should take a step back and consider whether this is the right approach to solve your problem.
May 3 2018
May 1 2018
Apr 27 2018
Do not use subtarget feature to control codegen of short pointers.
Instead we need to add a field to TargetOptions which is the only way to pass this info to NVPTXTargetInfo constructor where we calculate DataLayout.
@arsenm Will this do?
Moved control of the feature from subtarget feature to --nvptx-short-ptr command-line option.
Use DataLayout to obtain pointer size.
Apr 26 2018
Updated feature description.
UseShortPointer -> UseShortPointers
Removed debug printout.