Optimized debugging is not supported by ptxas. When optimizations are enabled, debugging information is degraded to line information only, but the driver would add debugging information back in if remarks were enabled. This fixes https://bugs.llvm.org/show_bug.cgi?id=48153.
Details
Diff Detail
- Repository: rG LLVM Github Monorepo
Event Timeline
Can we have a test for this?
@tra @jholewinski I'd be interested to hear what you think about this solution. It should allow us to stop disabling -g in the frontend, thereby providing source information to things like the remarks emitted for GPU code.
@serge-sans-paille Is there an NPM (new pass manager) way of doing this?
There's --cuda-noopt-device-debug option specifically to allow compiling GPU code with full debug info. Clang will generate optimized PTX, but ptxas optimizations will be disabled.
Without that flag clang automatically downgrades debug info generation to lineinfo only. I think -fsave-optimization-record should do the same.
Adding a pass to strip debug info may not be the best place to deal with the issue. I think not enabling full debug info would be a better choice.
Okay, so without that flag Clang will not create debug symbols in the PTX assembly output. And if the user specifies --cuda-noopt-device-debug, then the CUDA driver will not pass the optimization flags to the ptxas invocation, right? If that's the case, then the problem with -fsave-optimization-record is that it's not being correctly picked up as generating debug info. So the solution here would be to make sure the driver treats that flag as requesting debug information. You should be able to see it not working by checking the *.s output when building with -fsave-optimization-record and finding debug info for the target.
Only if optimizations are enabled. W/o optimization, full debug info will be there.
--cuda-noopt-device-debug re-enables full debug info but tells ptxas to expect it (and that requires disabling ptxas optimizations)
E.g. https://godbolt.org/z/1jPcnd
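The rule described above can be written down as a small decision function. This is only a sketch of the behaviour as stated in this thread, not the driver's actual code: `DeviceDebug`, `DeviceCompilePlan`, and `planDeviceCompile` are hypothetical names made up for illustration.

```cpp
// Hypothetical model of the policy described in this thread.
// None of these names exist in the clang driver.
enum class DeviceDebug { None, LineInfoOnly, Full };

struct DeviceCompilePlan {
  DeviceDebug debug;    // debug info emitted into the PTX
  bool ptxasOptimizes;  // whether ptxas runs its own optimizations
};

// requested:        what the user asked for (e.g. via -g)
// optimizing:       whether optimizations are enabled
// nooptDeviceDebug: whether --cuda-noopt-device-debug was passed
DeviceCompilePlan planDeviceCompile(DeviceDebug requested, bool optimizing,
                                    bool nooptDeviceDebug) {
  // Anything below full debug info is passed through unchanged.
  if (requested != DeviceDebug::Full)
    return {requested, optimizing};

  // Without optimizations, full debug info is kept.
  if (!optimizing)
    return {DeviceDebug::Full, false};

  // The flag keeps full debug info, at the cost of ptxas optimizations.
  if (nooptDeviceDebug)
    return {DeviceDebug::Full, false};

  // Default: optimized build, so downgrade to line info only.
  return {DeviceDebug::LineInfoOnly, true};
}
```

Under this model, an optimized build that requests full debug info gets `LineInfoOnly` unless --cuda-noopt-device-debug is given, in which case it keeps `Full` and ptxas does not optimize.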
Changing the solution. The problem seems to be that the driver adjusted the debug info first and then changed the debug kind again if remarks were enabled, undoing the adjustment. Now it adjusts the debug information after performing that change. This means that some diagnostics won't work with optimizations enabled, but it's necessary to compile correctly.
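Why the order of the two steps matters can be shown with a toy model. Everything here is an assumption for illustration: the names are hypothetical, and the exact debug level that remarks request in the real driver may differ from the `Full` used below.

```cpp
// Toy model of the ordering issue; not the clang driver's real code.
enum class DebugKind { None, LineInfoOnly, Full };

struct Options {
  bool optimizing;        // optimizations enabled
  bool nooptDeviceDebug;  // --cuda-noopt-device-debug passed
  bool remarks;           // e.g. -fsave-optimization-record passed
};

// Assumption: enabling remarks raises the debug kind so that remarks
// can carry source locations. The real level may be lower than Full.
DebugKind raiseForRemarks(DebugKind kind, const Options &opts) {
  return opts.remarks ? DebugKind::Full : kind;
}

// Device-side adjustment: full debug info cannot be combined with
// ptxas optimizations, so it is downgraded to line info only.
DebugKind adjustForDevice(DebugKind kind, const Options &opts) {
  if (kind == DebugKind::Full && opts.optimizing && !opts.nooptDeviceDebug)
    return DebugKind::LineInfoOnly;
  return kind;
}

// Old order: adjust first, then remarks undo the adjustment.
DebugKind oldOrder(DebugKind requested, const Options &opts) {
  return raiseForRemarks(adjustForDevice(requested, opts), opts);
}

// New order: remarks first, then the device adjustment has the last word.
DebugKind newOrder(DebugKind requested, const Options &opts) {
  return adjustForDevice(raiseForRemarks(requested, opts), opts);
}
```

In this model, an optimized build with remarks enabled ends up with `Full` under `oldOrder` and with `LineInfoOnly` under `newOrder`. When remarks are off, both orders give the same result.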
If you use arc diff, you can obtain the Reviewed-by: line from Phabricator. It is more useful than Reviewers: (a list of reviewers does not mean they endorse or accept the patch).
arc land did work, now it is arc land --onto main, but it does these things for you. I like it.