This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP] Teams reduction on the NVPTX device.
ClosedPublic

Authored by arpith-jacob on Feb 12 2017, 4:29 PM.

Details

Summary

This patch implements codegen for the reduction clause on
any teams construct for elementary data types. It builds
on parallel reductions on the GPU. Subsequently,
the team master writes to a unique location in a global
memory scratchpad. The last team to do so loads and
reduces this array to calculate the final result.

This patch emits two helper functions that are used by
the OpenMP runtime on the GPU to perform reductions across
teams.

Patch by Tian Jin in collaboration with Arpith Jacob

Diff Detail

Event Timeline

arpith-jacob created this revision.Feb 12 2017, 4:29 PM
ABataev edited edge metadata.Feb 13 2017, 1:14 AM

General note:
Be more strict with the types. For indeces it's better to use SizeTy or UintPtrTy, if Int32Ty is specified as type of parameter, use Int32y, not IntTy

lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
725

You should use CGM.IntTy instead of CGM.Int32Ty if (*kmp_InterWarpCopyFctPtr) really uses int type as the second type.

1121

Maybe it's better to use SizeTy rather than to rely on the support of Int64Ty?

1140

Again, SizeTy instead of Int64Ty? Or Intptr?

1201–1209

It is better to deduce type from ScratchpadBasePtr rather than to rely on support of Int64Ty.

1247–1256

I don't think you should use IntTy here, you should use Int32Ty

1295–1301

Again, better to use SizeTy

1309

IntPtr

1374–1378

IntTy or Int32Ty?

Use SizeTy instead of assuming 64 bits!

arpith-jacob marked 8 inline comments as done.Feb 14 2017, 11:28 AM

Alexey, thank you for your review. I have used SizeTy instead of assuming 64-bits.

lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
725

Yes, this should be Int32Ty. I've modified the function signature in the comment to use int32_t instead of int.

arpith-jacob marked an inline comment as done.Feb 16 2017, 3:06 AM

Alexey, do you any more comments on this patch?

ABataev added inline comments.Feb 16 2017, 4:16 AM
lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
715

int32_t, not int

1040–1041

To many params already, try to join them in the struct.

1248

It is better to create Int32Ty att the beginning of the function rather than call C.getIntTypeForBitwidth() each time

1386

It is better to create Int32Ty att the beginning of the function rather than call C.getIntTypeForBitwidth() each time

Addressed review comments.

arpith-jacob marked 4 inline comments as done.Feb 16 2017, 4:59 AM
This revision is now accepted and ready to land.Feb 16 2017, 5:06 AM
This revision was automatically updated to reflect the committed changes.