This is an archive of the discontinued LLVM Phabricator instance.

[NVPTX, CUDA] Improved feature constraints on NVPTX target builtins.
ClosedPublic

Authored by tra on Mar 29 2018, 1:48 PM.

Event Timeline

tra created this revision.Mar 29 2018, 1:48 PM
tra updated this revision to Diff 140334.Mar 29 2018, 2:23 PM

Run the tests with -target-cpu sm_61 to make sure intrinsics that require sm_60 are still enabled.

jlebar accepted this revision.Mar 29 2018, 4:24 PM
jlebar added inline comments.
clang/include/clang/Basic/Cuda.h
55 ↗(On Diff #140334)

Why 'static'?

59 ↗(On Diff #140334)

Could we instead rely on the fact that the CudaArch enum is dense and goes from UNKNOWN to LAST? One less bug to make...

clang/lib/Basic/Targets/NVPTX.cpp
182

Nit, if you spell it /*FeatureMap*/ here, it'll be clearer that you're intentionally ignoring that arg.

This revision is now accepted and ready to land.Mar 29 2018, 4:24 PM
tra added inline comments.Mar 30 2018, 9:29 AM
clang/include/clang/Basic/Cuda.h
55 ↗(On Diff #140334)

Old habits. Just 'inline' should do.

59 ↗(On Diff #140334)

Actually this function is not needed. It's a leftover from an old iteration of the code. I'll remove it.
As for why array -- that's what I wanted to do. Turns out that iterating over enums is a pain. I can't increment/decrement it and had to cast to an int for that. And then back, in order to use it.

tra updated this revision to Diff 140447.Mar 30 2018, 9:29 AM

Removed unneeded function.

tra updated this revision to Diff 140493.Mar 30 2018, 2:43 PM

Commented out unused argument.

The llvm change and corresponding switch from satom->sm_60 in the front end is fine.

Let's talk about the rest of it more. I'm not sure I'm seeing the need here rather than the annotations that are already here. Can you elaborate more here on why we need an additional method when you've already got subtarget features for each of the ptx versions anyhow?

Thanks.

-eric

tra added a comment.Apr 3 2018, 10:12 AM

Let's talk about the rest of it more. I'm not sure I'm seeing the need here rather than the annotations that are already here. Can you elaborate more here on why we need an additional method when you've already got subtarget features for each of the ptx versions anyhow?

The patch intends to address two issues:

  • mismatch on constraints between llvm and clang. E.g. hasPTX60() on LLVM side means "ptx60 or newer". "ptx60" in TARGET_BUILTIN on clang side means "ptx60" *only*. It is possible to address this within existing implementation by enumerating all PTX versions that are newer. It works OK for ptx60 as we only need to write "ptx60|ptx61". It gets more interesting for older GPUs E.g "ptx31+" would have to become "ptx31|ptx32|ptx40|ptx41|ptx42|ptx43|ptx50|ptx60|ptx61". Similar enumeration will need to happen for GPU version, which brings us to the next point
  • NVIDIA keeps growing PTX versions and GPU variants. Recently they've changed CUDA release frequency to ~1/quarter and they tend to add minor variants of PTX and GPU versions fairly frequently. It's going to be an unnecessary maintenance headache as after every new introduced variant I'll need to go and update all NVPTX builtins that have nothing to do with the CUDA changes. Granted, it's not a showstopper, but it is an annoyance that is guaranteed to stay.

With this patch, TARGET_BUILTIN constraints become semantically identical to LLVM and we no longer need to chase every bump in PTX version or GPU variant.

tra updated this revision to Diff 141932.Apr 10 2018, 5:19 PM

@echristo convinced me that this functionality can be implemented without growing a target-specific hook for custom interpretation of constraints used in TARGET_BUILTIN. Instead, we can hide unwieldy feature lists behind a macro.

tra retitled this revision from [NVPTX, CUDA] Use custom feature detection to handle NVPTX target builtins. to [NVPTX, CUDA] Improved feature constraints on NVPTX target builtins..
tra edited the summary of this revision. (Show Details)
tra added a reviewer: echristo.
echristo accepted this revision.Apr 10 2018, 5:28 PM

Guessing that SM_60 (etc) are defines?

Anyhow LGTM. I'm not sure you can split up the satom part of the patch, but if you can that'd be great.

-eric

tra added a comment.Apr 11 2018, 10:29 AM

Guessing that SM_60 (etc) are defines?

Anyhow LGTM. I'm not sure you can split up the satom part of the patch, but if you can that'd be great.

-eric

I'll split out removal of satom feature into a separate commit.

This revision was automatically updated to reflect the committed changes.