This is an archive of the discontinued LLVM Phabricator instance.

[NVPTX, CUDA] added optional src_size argument to __nvvm_cp_async*
Closed · Public

Authored by tra on May 17 2023, 2:44 PM.

Details

Summary

The optional argument is needed for CUDA-11+ headers when we're compiling for sm_80+ GPUs.

The intrinsics now require the src_size argument. Old calls without it can be upgraded by setting src_size to the intrinsic's transfer size.
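
As a rough illustration of the upgrade rule above, the following host-side C++ sketch models what src_size means for a cp.async-style copy, following the PTX description of the optional src-size operand: the first src_size bytes come from the source and the rest of the transfer is zero-filled. The helper name cp_async_model is hypothetical and is not part of clang or LLVM; the real builtins take shared and global address-space pointers and copy asynchronously on the GPU, which this model does not attempt to reproduce.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical host-side model, NOT the real builtin or intrinsic.
// Copies the first src_size bytes of src into dst and zero-fills the
// remaining bytes of the CpSize-byte destination.
// Omitting src_size defaults it to the full transfer size, which mirrors
// how an old two-argument call would be upgraded.
template <std::size_t CpSize>
void cp_async_model(void *dst, const void *src,
                    std::uint32_t src_size = static_cast<std::uint32_t>(CpSize)) {
  static_assert(CpSize == 4 || CpSize == 8 || CpSize == 16,
                "cp.async transfers 4, 8, or 16 bytes");
  // PTX leaves src_size > cp-size undefined; the model simply rejects it.
  assert(src_size <= CpSize);

  auto *out = static_cast<unsigned char *>(dst);
  std::memcpy(out, src, src_size);                    // bytes taken from the source
  std::memset(out + src_size, 0, CpSize - src_size);  // zero-filled tail
}
```

Under this model, cp_async_model<16>(dst, src) and cp_async_model<16>(dst, src, 16) produce the same 16 bytes, while cp_async_model<16>(dst, src, 4) copies 4 bytes and zeroes the other 12.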

Event Timeline

tra created this revision. May 17 2023, 2:44 PM
Herald added a project: Restricted Project. May 17 2023, 2:44 PM
tra updated this revision to Diff 523216. May 17 2023, 4:17 PM

Updated clang side.

tra retitled this revision from "[NVPTX] added src_size argument to __nvvm_cp_async* intrinsics." to "[NVPTX, CUDA] added optional src_size argument to __nvvm_cp_async*". May 17 2023, 4:21 PM
tra edited the summary of this revision.
tra published this revision for review. May 17 2023, 4:21 PM
tra added reviewers: jlebar, nyalloc.
Herald added projects: Restricted Project, Restricted Project. May 17 2023, 4:21 PM
jlebar accepted this revision. May 17 2023, 5:03 PM
This revision is now accepted and ready to land. May 17 2023, 5:03 PM
tra updated this revision to Diff 523426. May 18 2023, 10:06 AM

Actually connected the Sema check for the optional argument, and added a test to cover it.

tra updated this revision to Diff 523428. May 18 2023, 10:09 AM

Cosmetic test cleanup.

This revision was landed with ongoing or failed builds. May 18 2023, 11:06 AM
This revision was automatically updated to reflect the committed changes.
tra reopened this revision. May 18 2023, 11:48 AM
This revision is now accepted and ready to land. May 18 2023, 11:48 AM
tra added a comment. May 18 2023, 12:58 PM

Looks like the extra intrinsic argument broke MLIR. I'll need to figure out how to deal with that.

tra updated this revision to Diff 523566. May 18 2023, 2:43 PM

Instead of changing the existing intrinsics, introduce a new set that takes an
additional src_size argument. This should keep existing users working.

tra requested review of this revision. May 18 2023, 2:45 PM

PTAL.

jlebar accepted this revision. May 18 2023, 3:05 PM

Re-approval.

This revision is now accepted and ready to land. May 18 2023, 3:05 PM
This revision was landed with ongoing or failed builds. May 19 2023, 11:00 AM
This revision was automatically updated to reflect the committed changes.