This is an archive of the discontinued LLVM Phabricator instance.

[NVPTX] Added NVVMIntrRange pass
ClosedPublic

Authored by tra on May 25 2016, 2:10 PM.

Details

Summary

NVVMIntrRange adds !range metadata to calls of NVVM intrinsics
that return values within known limited range.

This allows LLVM to generate better code for indexing arrays
based on tid/ctaid which is a frequently used pattern in CUDA code.

Diff Detail

Repository
rL LLVM

Event Timeline

tra updated this revision to Diff 58507.May 25 2016, 2:10 PM
tra retitled this revision from to [NVPTX] Added NVVMIntrRange pass .
tra updated this object.
tra added reviewers: jholewinski, jlebar, jingyue.
tra added a subscriber: llvm-commits.
tra updated this revision to Diff 58511.May 25 2016, 2:12 PM

Updated file name & description in the copyright banner.

jlebar edited edge metadata.May 25 2016, 2:34 PM

\o/

lib/Target/NVPTX/NVPTXTargetMachine.cpp
181 ↗(On Diff #58507)

I have no idea if this is the right way to pass this information to the pass. Seems reasonable to me, though.

lib/Target/NVPTX/NVVMIntrRange.cpp
10 ↗(On Diff #58507)

calls to

11 ↗(On Diff #58507)

a limited range

36 ↗(On Diff #58507)

I think at least a comment, and possibly more descriptive names (blockMaxes / gridMaxes?) would be helpful to readers.

Actually, since you don't use the fact that these are structs at all, maybe just six member variables would be clearer. maxBlockX, maxThreadZ, etc. What do you think?

60 ↗(On Diff #58507)

Maybe indicate that the range is [Low, High)?

66 ↗(On Diff #58507)

We're already using namespace llvm, so don't need "llvm::" here.

test/CodeGen/NVPTX/intrinsic-old.ll
10 ↗(On Diff #58507)

Probably shouldn't do it in this patch, but FYI if you add CHECK-LABEL: @foo in front of each function, that will make it a lot easier to debug if it fails. CHECK-LABEL essentially partitions the checks.

Although I think it applies only to CHECKs, not e.g. RANGEs.

314 ↗(On Diff #58507)

It looks like we're covering everything other than warpSize -- can we add that one too, for completeness?

jingyue edited edge metadata.May 25 2016, 2:50 PM

Have you considered letting Clang (instead of a late-stage IR pass) add these ranges? These ranges are useful for some target-independent IR passes, e.g. those using ValueTracking (D4150).

lib/Target/NVPTX/NVVMIntrRange.cpp
36 ↗(On Diff #58511)

Maybe name them BlockDim and GridDim

61 ↗(On Diff #58511)

Can you comment on whether the range is [Low, High) or [Low, High]?

tra marked 6 inline comments as done.May 25 2016, 3:26 PM

Have you considered letting Clang (instead of a late-stage IR pass) add these ranges? These ranges are useful for some target-independent IR passes, e.g. those using ValueTracking (D4150).

This pass runs at the very beginning of optimization pipeline. Justin created hook for early target-specific passes in D18616.

lib/Target/NVPTX/NVVMIntrRange.cpp
37 ↗(On Diff #58511)

I've made names more descriptive. IMO struct fits quite well for describing dimensions/indexes of 3d grid.

test/CodeGen/NVPTX/intrinsic-old.ll
10–11 ↗(On Diff #58511)

Having -LABEL does not matter much in this case as each function checks for its own unique register and that applies to both CHECK and RANGE.

314 ↗(On Diff #58511)

ptx.read.* and nvvm.read.ptx.sreg.* appear to have diverged.
There's no llvm.ptx.read.warpsize, so it's not in this file.
We apparently don't test nvvm.read.ptx.sreg.* intrinsics much, either.

I can add a range test for nvvm.read.ptx.sreg.warpsize to this patch, but it looks like there's enough work for a separate patch.

tra updated this revision to Diff 58522.May 25 2016, 3:27 PM
tra edited edge metadata.
tra marked 2 inline comments as done.

Addressed Justin's comments.

tra marked an inline comment as done.May 25 2016, 3:27 PM
jingyue accepted this revision.May 25 2016, 3:31 PM
jingyue edited edge metadata.
jingyue added inline comments.
test/CodeGen/NVPTX/intrinsic-old.ll
324 ↗(On Diff #58522)

Trailing blank line

This revision is now accepted and ready to land.May 25 2016, 3:31 PM
jlebar added a comment.EditedMay 25 2016, 3:33 PM

it looks like there's enough work for a separate patch.

I'm happy if you want to do it in a separate patch, but we should definitely add testing of important intrinsics like these, one way or another.

LGTM aside from the apparent integer overflow.

lib/Target/NVPTX/NVVMIntrRange.cpp
61 ↗(On Diff #58522)

MaxGridSize.x is INT_MAX for sm_30+. Then we add 1 to it, and store in a signed int...

tra updated this revision to Diff 58531.May 25 2016, 4:06 PM
tra edited edge metadata.

Use uint64_t to pass range limits so we don't risk overflow on the way to ConstantInt::get().

tra marked an inline comment as done.May 25 2016, 4:09 PM
tra added inline comments.
lib/Target/NVPTX/NVVMIntrRange.cpp
61 ↗(On Diff #58522)

Changed arg type to uint64_t.
As for the range itself, it apparently allows wrapping, so exclusive high boundary wrapped to negative value should be OK. In any case we don't have any other way to encode it considering that range values must be the same as return type which is i32 in this case.

jlebar accepted this revision.May 25 2016, 4:13 PM
jlebar edited edge metadata.
This revision was automatically updated to reflect the committed changes.
tra marked an inline comment as done.