This is an archive of the discontinued LLVM Phabricator instance.

[NVPTX] Fix debugging information being added to NVPTX target if remarks are enabled
ClosedPublic

Authored by jhuber6 on Jan 5 2021, 2:33 PM.

Details

Summary

Optimized debugging is not supported by ptxas. Debugging information is degraded to line information only if optimizations are enabled, but debugging information would be added back in by the driver if remarks were enabled. This solves https://bugs.llvm.org/show_bug.cgi?id=48153.

Diff Detail

Event Timeline

jhuber6 created this revision.Jan 5 2021, 2:33 PM
jhuber6 requested review of this revision.Jan 5 2021, 2:33 PM
jdoerfert added subscribers: serge-sans-paille, tra.

Can we have a test for this?

@tra @jholewinski I'd be interested to hear what you think about this solution. It should allow us to stop disabling -g in the frontend, thereby providing source information to things like the remarks emitted for GPU code.
@serge-sans-paille Is the a NPM way of doing this?

tra added a comment.Jan 5 2021, 2:50 PM

There's --cuda-noopt-device-debug option specifically to allow compiling GPU code with full debug info. Clang will generate optimized PTX, but ptxas optimizations will be disabled.

Without that flag clang automatically downgrades debug info generation to lineinfo only. I think -fsave-optimization-record should do the same.
Adding a pass to strip debug info may not be the best place to deal with the issue. I think not enabling full debug info would be a better choice.

In D94123#2480633, @tra wrote:

There's --cuda-noopt-device-debug option specifically to allow compiling GPU code with full debug info. Clang will generate optimized PTX, but ptxas optimizations will be disabled.

Without that flag clang automatically downgrades debug info generation to lineinfo only. I think -fsave-optimization-record should do the same.
Adding a pass to strip debug info may not be the best place to deal with the issue. I think not enabling full debug info would be a better choice.

In D94123#2480633, @tra wrote:

There's --cuda-noopt-device-debug option specifically to allow compiling GPU code with full debug info. Clang will generate optimized PTX, but ptxas optimizations will be disabled.

Without that flag clang automatically downgrades debug info generation to lineinfo only. I think -fsave-optimization-record should do the same.
Adding a pass to strip debug info may not be the best place to deal with the issue. I think not enabling full debug info would be a better choice.

Okay, so without that flag Clang will not create debug symbols in the PTX assembly output. And if the user specified --cuda-noopt-device-debug then the Cuda driver will not pass the optimization flags to the ptxas invocation, right? So if that's the case, then the problem with -fsave-optimization-record is that it's not being correctly picked up as generating debug info. So the solution here would be to make sure it treats that flag as debug information. You should be able to see it not working by checking the *.s output when build with -fsave-optimization-record having debug in the target.

tra added a comment.Jan 5 2021, 4:49 PM

Okay, so without that flag Clang will not create debug symbols in the PTX assembly output.

Only if optimizations are enabled. W/o optimization, full debug info will be there.
--cuda-noopt-device-debug re-enables full debug info but tells ptxas to expect it (and that requires disabling ptxas optimizations)
E.g. https://godbolt.org/z/1jPcnd

jhuber6 updated this revision to Diff 314899.Jan 6 2021, 8:05 AM
jhuber6 retitled this revision from [NVPTX] Strip debugging symbols for optimized NVPTX targets. to [NVPTX] Fix debugging information being added to NVPTX target if remarks are enabled.
jhuber6 edited the summary of this revision. (Show Details)

Changing the solution. The problem seems to be that after adjusting the debug info, the driver would change the debug kind if remarks were enabled. Now it adjusts the debug information after performing that change. This means that some diagnostics won't work with optimizations but it's necessary to compile correctly.

Herald added a project: Restricted Project. · View Herald TranscriptJan 6 2021, 8:05 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
tra accepted this revision.Jan 6 2021, 10:37 AM

LGTM.

This revision is now accepted and ready to land.Jan 6 2021, 10:37 AM
MaskRay added a subscriber: MaskRay.EditedJan 6 2021, 11:01 AM

If you use arc diff, you can obtain Reviewed-by: line from Phabricator. It is more useful than Reviewers: (a list of reviewers do not mean they endorse or accept the patch)

If you use arc diff, you can obtain Reviewed-by: line from Phabricator. It is more useful than Reviewers: (a list of reviewers do not mean they endorse or accept the patch)

arc land did work, now it is arc land --onto main, but it does these things for you. I like it.