The initial function label must follow a debug location for relocation info to be generated correctly.
ptxas generates the relocation info based on the first label of the function, and it cannot determine the correct debug location if there is no debug loc before that initial label. I don't know how ptxas works internally, but I do know that we had trouble whenever there was no debug loc before the label.
Rather than testing for isNVPTX in AsmPrinter.cpp, I'd rather just add a function "emitPreFunctionDebugInfo" that does nothing unless the target is NVPTX, and define it in the NVPTX backend. That's also easier to update if NVIDIA ever fixes this weirdness.
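To make the suggestion concrete, here is a rough sketch of the hook shape I have in mind. It is not LLVM code: GenericPrinter, PtxPrinter and SrcPos are hypothetical stand-ins, and the only name taken from this thread is emitPreFunctionDebugInfo. The assumption is that the target-independent path calls the hook right before emitting the function label.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical source position; the fields mirror a PTX ".loc" directive.
struct SrcPos {
  unsigned File;
  unsigned Line;
  unsigned Col;
};

// Hypothetical stand-in for the target-independent printer
// (this is NOT llvm::AsmPrinter).
class GenericPrinter {
public:
  virtual ~GenericPrinter() = default;

  // Target-independent path: run the hook, then emit the function label.
  void emitFunctionHeader(const std::string &Name, const SrcPos &Pos) {
    assert(!Name.empty() && "function must have a name");
    emitPreFunctionDebugInfo(Pos);
    Lines.push_back(Name + ":");
  }

  const std::vector<std::string> &lines() const { return Lines; }

protected:
  // Default hook: most targets need nothing before the label.
  virtual void emitPreFunctionDebugInfo(const SrcPos &) {}

  std::vector<std::string> Lines;
};

// Hypothetical NVPTX printer: overrides the hook so that a location
// directive is emitted before the label.
class PtxPrinter : public GenericPrinter {
protected:
  void emitPreFunctionDebugInfo(const SrcPos &Pos) override {
    Lines.push_back(".loc " + std::to_string(Pos.File) + " " +
                    std::to_string(Pos.Line) + " " + std::to_string(Pos.Col));
  }
};
```

With this shape the generic code never needs to ask which target it is running for, and dropping the workaround later would only mean deleting the override.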
Closer. I think you can do all of this in the NVPTX backend by having something emit an "empty" line directive where you want it, rather than reworking the target-independent code. Could you just call recordSourceLine with bogus information in the backend? Do you need a real location, or would something that looks like the line number program's initial state work for you? If you need a valid one, you could run into problems with multiple CUs in a single input file (which I agree is unlikely with CUDA, but I don't want to discount the possibility of people linking CUDA modules together).
- I cannot call DwarfDebug::recordSourceLine from the NVPTX backend, because the backend has no direct access to DwarfDebug; it is accessible only from AsmPrinter. So we need AsmPrinter::emitInitialRawDwarfLocDirective or something similar.
- Yes, we need a valid debug location here. Otherwise, the CUDA profiling tools may emit incorrect data. That's why we need the same functionality that regular debug info has. I don't think there will be any problems with multiple CUs. The debug directives for NVPTX are quite primitive: they include just a file number, a line number, and a column position. The .file directives are emitted, so I don't see why we would have trouble here.
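To illustrate both points, here is a minimal sketch of the forwarding I mean. Again, this is not the real LLVM code: DebugInfoSketch stands in for DwarfDebug, PrinterSketch for AsmPrinter, and NVPTXPrinterSketch, DebugPos and emitFunctionEntry are hypothetical names. It assumes the printer owns the debug-info object privately and that the location is written as a PTX-style ".loc file line column" directive.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical location record: file number, line number, column position.
struct DebugPos {
  unsigned FileNum;
  unsigned Line;
  unsigned Col;
};

// Hypothetical stand-in for DwarfDebug: formats one location directive.
class DebugInfoSketch {
public:
  std::string recordSourceLine(const DebugPos &P) const {
    return ".loc " + std::to_string(P.FileNum) + " " +
           std::to_string(P.Line) + " " + std::to_string(P.Col);
  }
};

// Hypothetical stand-in for AsmPrinter: it owns the debug-info object,
// so only this class can reach it.
class PrinterSketch {
public:
  virtual ~PrinterSketch() = default;

  // Public forwarding entry point, named after the proposal in this thread.
  void emitInitialRawDwarfLocDirective(const DebugPos &P) {
    Out.push_back(DD.recordSourceLine(P));
  }

  void emitLabel(const std::string &Name) {
    assert(!Name.empty() && "label must have a name");
    Out.push_back(Name + ":");
  }

  const std::vector<std::string> &output() const { return Out; }

private:
  DebugInfoSketch DD;           // Private: the backend subclass cannot touch it.
  std::vector<std::string> Out; // Emitted assembly lines, in order.
};

// Hypothetical backend printer: emits the location first, then the label.
class NVPTXPrinterSketch : public PrinterSketch {
public:
  void emitFunctionEntry(const std::string &Name, const DebugPos &P) {
    emitInitialRawDwarfLocDirective(P);
    emitLabel(Name);
  }
};
```

The backend reaches the debug-info object only through the printer's public method, and the location directive always lands ahead of the function label.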