This is an archive of the discontinued LLVM Phabricator instance.

[Clang][AMDGPU] Set LTO CG opt level based on Clang option
Closed · Public

Authored by scott.linder on Jan 24 2023, 12:51 PM.

Details

Summary

For AMDGCN, default to mapping --lto-O# to --lto-CGO# in a 1:1 manner
(i.e. clang -O<N> implies --lto-O<N> and --lto-CGO<N>).

Ensure there is a means to override this via -Xoffload-linker and begin
to claim these arguments to avoid incorrect warnings that they are not
used.

Event Timeline

scott.linder created this revision. Jan 24 2023, 12:51 PM
Herald added a project: Restricted Project. Jan 24 2023, 12:51 PM
scott.linder requested review of this revision. Jan 24 2023, 12:51 PM

(Just adding everyone from the parent review; feel free to remove yourself as a reviewer if you aren't interested!)

It isn't ideal, but since only AMDGCN wants the 1:1 mapping from LTO opt level to CG opt level to be the default, I implemented it here. Thoughts?

I can move the missing claim call to another patch; it just seemed like a small enough change to include inline.

Making this target specific doesn’t make sense

I don't know how else to resolve others wanting the default to be the CGOptLevel = clamp(ltoOptLevel, 2, 3) behavior and AMDGPU wanting the default to be CGOptLevel = ltoOptLevel. Maybe that isn't actually the default we want? We can ask users to specify -Xoffload-linker --lto-CGO#.
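
To make the two candidate defaults concrete, here is a minimal sketch of the mappings described above. The function names and the `kNoOverride` sentinel are hypothetical and chosen only for illustration; this is not the code from the patch or from the linker, and the override handling assumes only that an explicit --lto-CGO# takes precedence over whichever default applies.

```cpp
#include <algorithm>

// Hypothetical sentinel meaning "no explicit --lto-CGO# was given".
const int kNoOverride = -1;

// Generic default discussed in the thread:
// CGOptLevel = clamp(ltoOptLevel, 2, 3).
int clampedCGOptLevel(int ltoOptLevel) {
  return std::max(2, std::min(ltoOptLevel, 3));
}

// Default proposed for AMDGCN: CGOptLevel = ltoOptLevel.
int oneToOneCGOptLevel(int ltoOptLevel) { return ltoOptLevel; }

// Assumed precedence: an explicit CG level wins, otherwise the
// target-dependent default is used.
int effectiveCGOptLevel(int ltoOptLevel, bool isAMDGCN,
                        int explicitCGO = kNoOverride) {
  if (explicitCGO != kNoOverride)
    return explicitCGO;
  return isAMDGCN ? oneToOneCGOptLevel(ltoOptLevel)
                  : clampedCGOptLevel(ltoOptLevel);
}
```

Under the clamped default, an LTO level of 0 still yields a codegen level of 2, so unoptimized codegen has to be requested explicitly. Under the 1:1 default, an LTO level of 0 yields a codegen level of 0 with no extra option.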

scott.linder added a comment. Edited Feb 2 2023, 2:48 PM

Are there any thoughts on whether this is too ugly to live? It will be awkward to teach users the current default behavior without this change, but if we can accept it as a historical quirk that may be OK.

The primary driver for wanting the 1:1 mapping is an odd interaction between our debug info (only implemented for -O0 codegen currently, higher opt levels are WIP) and our implementation of -fgpu-rdc (currently this implies -flto as we don't support true device code-object linking).

A user who compiles and debugs without -fgpu-rdc cannot simply add it to their command line; they will also need -Xoffload-linker --lto-CGO0. Of course, this is a temporary arrangement, but it also just seems "correct" to default to the 1:1 mapping. Even once our debug info supports higher opt levels, I suspect users will still explicitly request no optimization and be surprised that some arcane pass-through option like -Xoffload-linker --lto-CGO0 is required to make it work the way they expect.

MaskRay accepted this revision. Feb 3 2023, 11:41 AM

Fine with me, but you may need an AMDGPU reviewer.

This revision is now accepted and ready to land. Feb 3 2023, 11:41 AM
yaxunl accepted this revision. Feb 3 2023, 12:07 PM

LGTM. Thanks