User Details
- User Since: Dec 8 2015, 9:32 PM (267 w, 6 d)
Aug 4 2020
Can someone with commit access land this patch? Thanks.
Rebase
Rebase
Aug 3 2020
Addressed comments. PTAL.
Jul 31 2020
Jul 21 2020
Nov 21 2017
Nov 9 2017
I was not sure whether the *_sync intrinsics needed to be protected from CSE, since they capture all state as arguments (including the lanes in the warp to synchronize). However, on Volta, I think different lanes in a warp can execute the intrinsic from different syntactic locations (i.e., different program counters). If so, then we do indeed have to model the data exchanged.
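To make the concern concrete, here is a toy C++ model of one warp rendezvous. It is only a sketch: `toyShflSync`, `laneZeroResults`, the 4-lane warp, and the sample values are hypothetical and do not reflect the real intrinsic's implementation. It shows lane 0 issuing two calls with identical arguments and still receiving different results, because lane 1 arrives from a different call site each time.

```cpp
#include <array>
#include <cstddef>

// Toy model, NOT the real intrinsic: one rendezvous of a 4-lane warp.
// contributed[i] is the value lane i passed at whichever call site it
// arrived from (assumption made for illustration only).
constexpr std::size_t kLanes = 4;

// Returns what lane `self` receives from lane `src`. If `src` is not named
// in `mask`, this sketch simply returns the caller's own value.
int toyShflSync(const std::array<int, kLanes>& contributed, unsigned mask,
                std::size_t self, std::size_t src) {
  const bool srcParticipates = ((mask >> src) & 1u) != 0;
  return srcParticipates ? contributed[src] : contributed[self];
}

// Lane 0 calls twice with identical arguments (mask 0x3, value 10, src 1).
// Lane 1 reaches the first rendezvous from call site A (value 7) and the
// second from call site B (value 9).
std::array<int, 2> laneZeroResults() {
  const unsigned mask = 0x3u;
  const std::array<int, kLanes> first = {10, 7, 0, 0};
  const std::array<int, kLanes> second = {10, 9, 0, 0};
  return {toyShflSync(first, mask, 0, 1), toyShflSync(second, mask, 0, 1)};
}
```

In this model, merging lane 0's two calls into one would be wrong, since the result depends on data contributed by another lane and not only on the caller's own arguments.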
Apr 21 2017
Hi Jonas,
Feb 16 2017
Thank you for this. LGTM.
Addressed review comments.
Alexey, do you have any more comments on this patch?
Feb 14 2017
Hi Alexey,
Alexey, thank you for your review. I have used SizeTy instead of assuming 64-bits.
Use SizeTy instead of assuming 64 bits!
Hi Alexey,
Feb 13 2017
Feb 12 2017
Minor fixup of comment style on emitInterWarpCopyFunction().
Updated patch to address Alexey's comments. Condensed parameters in emitReduction() to a struct Options.
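As a rough illustration of condensing parameters into an options struct, the C++ sketch below may help. The struct name `ReductionOptions`, its fields, and `describeReduction` are hypothetical stand-ins; the actual members of the patch's struct are not shown in this thread.

```cpp
#include <string>

// Hypothetical option fields, for illustration only. Grouping them in one
// struct keeps the function signature stable when a new flag is added.
struct ReductionOptions {
  bool withNowait = false;       // assumed flag, not taken from the patch
  bool simpleReduction = false;  // assumed flag, not taken from the patch
};

// Stand-in for a function that previously took each flag as its own
// parameter; it now reads them from a single options argument.
std::string describeReduction(const ReductionOptions& opts) {
  std::string out = opts.simpleReduction ? "simple" : "full";
  if (opts.withNowait) {
    out += "+nowait";
  }
  return out;
}
```

Callers set only the fields they care about and leave the rest at their defaults, which avoids long runs of positional boolean arguments.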
Feb 10 2017
Feb 9 2017
Feb 3 2017
Jan 27 2017
Looks like this patch slipped through the cracks :( I've made the requested changes.
Jan 25 2017
Jan 24 2017
Jan 23 2017
Jan 18 2017
Inherit from OMPLexicalScope, with an added argument, to reduce code duplication.
Jan 17 2017
Another correction. We'll have to create a similar scope OMPTeamsScope that inherits from OMPLexicalScope for target-teams combined directives.
The patch was updated to split 'emitParallelOrTeamsOutlinedFunction' into 'emitParallelOutlinedFunction' and 'emitTeamsOutlinedFunction' to enable the use of getCapturedStmt().
Jan 16 2017
Updated 'getOpenMPCaptureRegions' to return the OMPD_teams region kind for the teams directive.
Added a method 'getCapturedStmt' as part of OMPExecutableDirective.
Thanks Alexey.
Jan 15 2017
Jan 9 2017
Use the i1 type for bool after all, but this time obtain it through the ConvertType() API.
Using CGF.ConvertTypeForMem(Context.getBoolType()) to get the right type for 'bool' rather than using i1.
Moved CommonActionTy to CGOpenMPRuntimeNVPTX.cpp and renamed it to NVPTXActionTy, allowing us to customize the class in the future, if necessary.
Jan 5 2017
Jan 3 2017
Updated patch based on reviews.
Jan 2 2017
Dec 30 2016
Alexey, thank you for your review. I've updated the patch addressing your comments.
Dec 29 2016
Dec 28 2016
Alexey and Justin, thank you for spending the time to review this patch. I've updated the patch accordingly. I've also removed a dot ('.') from the worker function name, since the NVIDIA PTX assembler, ptxas, does not accept that character in function names.
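The naming constraint can be sketched in C++ as follows. This assumes the identifier grammar as I understand it from the PTX ISA documentation (a letter followed by letters, digits, '_' or '$', or one of '_', '$', '%' followed by at least one such character). The helpers `isPtxIdentifier` and `stripDots` and the sample name are hypothetical, not code from the patch.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// True for characters allowed after the first one (assumed PTX grammar).
bool isFollowSym(unsigned char c) {
  return std::isalnum(c) != 0 || c == '_' || c == '$';
}

// Checks a name against the assumed PTX identifier grammar.
bool isPtxIdentifier(const std::string& name) {
  if (name.empty()) return false;
  const unsigned char first = static_cast<unsigned char>(name[0]);
  const bool letterStart = std::isalpha(first) != 0;
  const bool symbolStart = first == '_' || first == '$' || first == '%';
  if (!letterStart && !symbolStart) return false;
  // A leading '_', '$' or '%' must be followed by at least one character.
  if (symbolStart && name.size() < 2) return false;
  for (std::size_t i = 1; i < name.size(); ++i) {
    if (!isFollowSym(static_cast<unsigned char>(name[i]))) return false;
  }
  return true;
}

// Removes every '.' from a name, mirroring the fix described above.
std::string stripDots(std::string name) {
  name.erase(std::remove(name.begin(), name.end(), '.'), name.end());
  return name;
}
```

For a hypothetical name such as "kernel.worker", `isPtxIdentifier` rejects the original, while `stripDots` yields "kernelworker", which passes the check.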
Addressed comments in review to start function names with a lowercase letter and to fix the enum type name along with the enumerator name.
Dec 27 2016
Mar 9 2016
Stylistic changes to address feedback.
Mar 4 2016
Addressed feedback; see inline comments for details.
Thanks for the quick review! The test cases are the same as the CUDA version so it should be fine.
Mar 3 2016
Friendly ping. Rafael, do you have further concerns? Thanks.
Feb 29 2016
Don't dangle pointers :(
Feb 27 2016
Added convergent to the barrier intrinsics.
Feb 26 2016
Feb 3 2016
Alexey, I've made the change in the comments. Thanks very much for your time!
Jan 31 2016
Jan 26 2016
Committed revision 258832.
Jan 25 2016
Patch fixed.
Addressed fixes by Alexey Bataev. Thanks.
Jan 24 2016
Jan 21 2016
Committed revision 258425.
Added template instantiation test case for all feasible tests.