This is an archive of the discontinued LLVM Phabricator instance.

[NVPTX] Use align attribute for kernel pointer arg alignment
ClosedPublic

Authored by nikic on Feb 8 2022, 7:31 AM.

Details

Reviewers
tra
Group Reviewers
Restricted Project
Commits
rG1c729d719a34: [NVPTX] Use align attribute for kernel pointer arg alignment
Summary

Instead of determining the alignment based on the pointer element type (which is incompatible with opaque pointers), make use of alignment annotations added by the frontend.

In particular, clang will add alignment attributes to OpenCL kernels since D118894.

Diff Detail

Event Timeline

nikic created this revision.Feb 8 2022, 7:31 AM
nikic requested review of this revision.Feb 8 2022, 7:31 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 8 2022, 7:31 AM
tra added a comment.Feb 8 2022, 11:02 AM

make use of alignment annotations added by the frontend.

Does it mean that we now require explicit alignment annotation to generate reasonable code? What happens if alignment is not specified explicitly? Do we fall back to unaligned/naturally aligned/something else?

llvm/test/CodeGen/NVPTX/nvcl-param-align.ll
8

What's expected to happen if the alignment is not specified explicitly?

It may be worth adding a test case for that.

nikic updated this revision to Diff 407086.Feb 9 2022, 1:33 AM

Test absence of align attribute.

nikic added a comment.Feb 9 2022, 1:41 AM

make use of alignment annotations added by the frontend.

Does it mean that we now require explicit alignment annotation to generate reasonable code? What happens if alignment is not specified explicitly? Do we fall back to unaligned/naturally aligned/something else?

If the frontend does not add an explicit alignment annotation (and no alignment is inferred during optimization), then LLVM assumes that pointers are unaligned (alignment 1).

llvm/test/CodeGen/NVPTX/nvcl-param-align.ll
8

If no alignment is specified, then the default is 1. I've extended the test to check for this.

tra added inline comments.Feb 9 2022, 10:41 AM
llvm/test/CodeGen/NVPTX/nvcl-param-align.ll
8

Always defaulting to unaligned access would likely be a performance regression.
LLVM IR spec generally defaults to If the alignment is not specified, then the code generator makes a target-specific assumption.
I think we do need to infer assumed alignment from the data layout, if we do know the type.

nikic added inline comments.Feb 9 2022, 11:53 AM
llvm/test/CodeGen/NVPTX/nvcl-param-align.ll
8

There may be some confusion here: LLVM defaults to ABI alignment for alignment of memory accesses like loads and stores (or rather, these accesses always have explicit alignment, but this default is usedwhen parsing textual IR). However, for pointer arguments, the default alignment is 1 if no align attribute is provided (and it's not a byval parameter, but that's explicitly excluded here).

There should be no regressions here as long as the frontend adds the necessary alignment annotations, which was done for OpenCL kernels in D118894. As this code is only used for NVCL and not CUDA I assume that is sufficient, but please correct me if I'm wrong.

tra added inline comments.Feb 9 2022, 12:07 PM
llvm/test/CodeGen/NVPTX/nvcl-param-align.ll
8

The nominal goal of this patch is to allow dealing with opaque pointer types. It does not require defaulting to alignment of 1. Sticking with a known working default alignment for the non-opaque pointers seems to be a better choice.

I agree that it's more or less a no-op if IR is generated by clang. However, there are other LLVM IR users that generate IR themselves. E.g. I have no idea whether TensorFlow's XLA, or julia, or any other LLVM-based JIT always passes alignment into for pointer arguments.

nikic added a reviewer: Restricted Project.Feb 9 2022, 12:23 PM
nikic added inline comments.
llvm/test/CodeGen/NVPTX/nvcl-param-align.ll
8

If other frontends don't generate the necessary alignment information, then we want to break those frontends ASAP, so they can be adjusted. As a matter of general policy, we always make changes necessary for opaque pointers unconditionally, so that we can deal with their effects in an isolated fashion. Making behavior changes part of the opaque pointer switch itself means that it will be very hard to root-cause the reason for any particular change.

FWIW the equivalent change has already been made in the AMDGPU target.

tra accepted this revision.Feb 9 2022, 1:13 PM

LGTM.

llvm/test/CodeGen/NVPTX/nvcl-param-align.ll
8

After reading https://llvm.org/docs/OpaquePointers.html, I see your point.

This revision is now accepted and ready to land.Feb 9 2022, 1:13 PM
This revision was landed with ongoing or failed builds.Feb 10 2022, 2:57 AM
This revision was automatically updated to reflect the committed changes.