This is an archive of the discontinued LLVM Phabricator instance.

[CUDA] Align kernel launch args correctly when the LLVM type's alignment is different from the clang type's alignment.
ClosedPublic

Authored by jlebar on Jul 27 2016, 10:46 AM.

Details

Summary

Before this patch, we computed the offsets in memory of args passed to
GPU kernel functions by throwing all of the args into an LLVM struct.

clang emits packed llvm structs basically whenever it feels like it, and
packed structs have alignment 1. So we cannot rely on the llvm type's
alignment matching the C++ type's alignment.

This patch fixes our codegen so we always respect the clang types'
alignments.

Diff Detail

Event Timeline

jlebar updated this revision to Diff 65771.Jul 27 2016, 10:46 AM
jlebar retitled this revision from to [CUDA] Align kernel launch args correctly when the LLVM type's alignment is different from the clang type's alignment..
jlebar updated this object.
jlebar added a reviewer: rnk.
jlebar added subscribers: tra, cfe-commits.
rnk added inline comments.Jul 27 2016, 11:05 AM
test/CodeGenCUDA/kernel-args-alignment.cu
1–2

Typically clang doesn't need a registered backend for a target to generate IR for that target. It "knows" a whole bunch of stuff about all target calling conventions and data layout. Unless CUDA goes out of its way to query LLVM backend information, we shouldn't need these REQUIRES lines.

You should probably test this theory, though, by configuring an ARM-only clang and running the tests. :)

jlebar updated this revision to Diff 65800.Jul 27 2016, 1:52 PM

Remove REQUIRES lines.

test/CodeGenCUDA/kernel-args-alignment.cu
1–2

Yeah, I don't think we actually need this, as we have a bunch of other codegen tests that don't have these REQUIRES lines.

rnk accepted this revision.Jul 27 2016, 3:31 PM
rnk edited edge metadata.

lgtm

This revision is now accepted and ready to land.Jul 27 2016, 3:31 PM
This revision was automatically updated to reflect the committed changes.