ptxas does not support debugging optimized code. When optimizations are enabled, debug info is downgraded to line info only, but the driver would add it back in if remarks were enabled. This fixes https://bugs.llvm.org/show_bug.cgi?id=48153.
Can we have a test for this?
@tra @jholewinski I'd be interested to hear what you think about this solution. It should allow us to stop disabling -g in the frontend, thereby providing source information to things like the remarks emitted for GPU code.
@serge-sans-paille Is there an NPM way of doing this?
There's --cuda-noopt-device-debug option specifically to allow compiling GPU code with full debug info. Clang will generate optimized PTX, but ptxas optimizations will be disabled.
Without that flag clang automatically downgrades debug info generation to lineinfo only. I think -fsave-optimization-record should do the same.
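To make the downgrade rule concrete, here is a minimal sketch of the decision as described in this thread. It is a simplified model, not Clang's driver code: `DeviceDebugLevel`, `selectDeviceDebugLevel`, and the three boolean parameters are hypothetical names standing in for "-g was passed", "device code is optimized", and "--cuda-noopt-device-debug was passed".

```cpp
// Hypothetical model of the device-side debug-info decision discussed here.
// None of these names come from Clang; they only illustrate the rule.
enum class DeviceDebugLevel { None, LineInfoOnly, Full };

// debugRequested:   the user asked for debug info (e.g. -g)
// optimizing:       device code is compiled with optimizations
// noOptDeviceDebug: the user passed --cuda-noopt-device-debug
constexpr DeviceDebugLevel selectDeviceDebugLevel(bool debugRequested,
                                                  bool optimizing,
                                                  bool noOptDeviceDebug) {
  return !debugRequested
             ? DeviceDebugLevel::None
             : (!optimizing || noOptDeviceDebug)
                   ? DeviceDebugLevel::Full          // ptxas runs unoptimized
                   : DeviceDebugLevel::LineInfoOnly; // optimized: downgrade
}

// Optimized build with -g and no override: only line info survives.
static_assert(selectDeviceDebugLevel(true, true, false) ==
                  DeviceDebugLevel::LineInfoOnly,
              "optimized device code keeps line info only");
// The override keeps full debug info even when the frontend optimizes.
static_assert(selectDeviceDebugLevel(true, true, true) ==
                  DeviceDebugLevel::Full,
              "--cuda-noopt-device-debug keeps full debug info");
```

The suggestion above amounts to making -fsave-optimization-record go through the same decision instead of bypassing it.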
Adding a pass to strip debug info may not be the best place to deal with the issue. I think not enabling full debug info would be a better choice.
Okay, so without that flag Clang will not create debug symbols in the PTX assembly output. And if the user specified --cuda-noopt-device-debug then the CUDA driver will not pass the optimization flags to the ptxas invocation, right? If that's the case, then the problem with -fsave-optimization-record is that it's not being correctly picked up as generating debug info. So the solution here would be to make sure that flag is treated as producing debug information. You should be able to see it not working by checking the *.s output when building with -fsave-optimization-record: it has debug in the target.
Changing the solution. The problem seems to be that the driver adjusted the debug info for the target first and then changed the debug kind again if remarks were enabled, undoing the adjustment. Now it adjusts the debug information after performing that change. This means that some diagnostics won't work with optimizations, but it's necessary to compile correctly.
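The ordering issue can be illustrated with a small sketch. Everything here is an assumption made for illustration: `DebugKind`, `raiseTo`, `capAt`, `oldOrder`, and `newOrder` are hypothetical names, and the level that remarks raise the debug kind to is a parameter rather than a statement about what the driver actually picks.

```cpp
// Hypothetical model of the ordering problem; not Clang's driver code.
enum class DebugKind { None = 0, LineInfoOnly = 1, Full = 2 };

// Remarks need at least `floor` worth of debug info.
constexpr DebugKind raiseTo(DebugKind k, DebugKind floor) {
  return k < floor ? floor : k;
}

// The target (e.g. optimized ptxas) supports at most `cap`.
constexpr DebugKind capAt(DebugKind k, DebugKind cap) {
  return k > cap ? cap : k;
}

// Old order: adjust for the target first, then let remarks raise the kind.
constexpr DebugKind oldOrder(DebugKind requested, bool remarks,
                             DebugKind remarkFloor, DebugKind targetCap) {
  return remarks ? raiseTo(capAt(requested, targetCap), remarkFloor)
                 : capAt(requested, targetCap);
}

// New order: let remarks raise the kind first, then adjust for the target.
constexpr DebugKind newOrder(DebugKind requested, bool remarks,
                             DebugKind remarkFloor, DebugKind targetCap) {
  return capAt(remarks ? raiseTo(requested, remarkFloor) : requested,
               targetCap);
}

// With the old order the remark step can exceed what the target supports.
static_assert(oldOrder(DebugKind::Full, true, DebugKind::Full,
                       DebugKind::LineInfoOnly) == DebugKind::Full,
              "old order lets remarks undo the target adjustment");
// With the new order the target adjustment always has the last word.
static_assert(newOrder(DebugKind::Full, true, DebugKind::Full,
                       DebugKind::LineInfoOnly) == DebugKind::LineInfoOnly,
              "new order never exceeds the target cap");
```

The trade-off is the one noted above: whenever the cap is lower than what remarks would like, the remarks get less source information than they asked for.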