Page MenuHomePhabricator

[DEBUG] Initial adaptation of NVPTX target for debug info emission.
ClosedPublic

Authored by ABataev on Jan 8 2018, 11:03 AM.

Details

Summary

Patch adds initial emission of the debug info for NVPTX target.
Currently, only .file and .loc directives are emitted, everything else is
commented out to not break the compilation of Cuda.

Diff Detail

Repository
rL LLVM

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
echristo added inline comments.Jan 18 2018, 2:27 PM
lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
416 ↗(On Diff #130473)

I don't believe that this is a necessary or sufficient check here. What's the point of it?

760 ↗(On Diff #130473)

So there's a difference between a begin symbol for a section and the section itself. You're doing the former and not the latter it looks like? And also I think you're breaking the possibility of multiple cus in a single module.

lib/Target/NVPTX/MCTargetDesc/NVPTXTargetStreamer.cpp
43 ↗(On Diff #130473)

I don't understand what's going on here?

lib/Target/NVPTX/MCTargetDesc/NVPTXTargetStreamer.h
27 ↗(On Diff #130473)

A description of how this works would be good.

Hi Eric, thanks for the review.

In general, I think this patch needs a lot more description and possibly breaking up. For now a few questions:

Ok, I'll add some description. As to breaking it up, it will be very hard to do. This patch adds commented out debug info, that allows successful compiling of resulting PTX file. If we will remove at least some small change from this patch, it will produce incorrect PTX file. So, I rather doubt that it is possible to break this patch into several small patches.

a) nvptx doesn't allow a debug_str section?

Here is an excerpt from CUDA Toolkit documentation (http://docs.nvidia.com/cuda/ptx-writers-guide-to-interoperability/index.html#debug-information): The PTX producer is responsible for emitting binary DWARF into the PTX file, using the .section and .b8-.b16-.b32-and-.b64 directives in PTX. This should contain the .debug_info and .debug_abbrev sections, and possibly optional sections .debug_pubnames and .debug_aranges. So, yes, .debug_str is not allowed.

b) relocations in nvptx should be against the section rather than the label? (To be fair, they can be this way on elf platforms as well which would be an improvement)

According to PTX ISA, 11.5.2. Debugging Directives: .section(http://docs.nvidia.com/cuda/parallel-thread-execution/index.html#debugging-directives) labels are not allowed inside the debugging directives, so we cannot use labels.

c) There are a bunch of directives for nvptx that you want to use for emission?

Just .debug_info and .debug_abbrevs sections, all other sections should be empty, if emitted.

d) It looks like you've got this turned off by default by commenting out everything? I'd rather you just conditionalize it and have it all work and be testable before turning on.

It is just an initial patch. It adds some default debug info generation, but this info is still incorrect. This is why I commented it out in the output PTX. After this patch I need to commit several patches to LLVM and couple patches to clang to be able to produce correct debug info, that is compilable and does not break cuda-gdb and ptxas.
We still can test it, using LLVM lit tests, but it definitely won't work with the cuda-gdb and ptxas.

I think I'd probably start splitting up the NVPTX asm printer patch into separate and testable patches, perhaps by rewriting the existing line table handling and going from there?

I think the only thing I can do separately is to remove an existing line table handling completely. If this is ok, I can prepare the patch for it.

I've also made some inline comments of things that were confusing or otherwise not sure about if you also want to tackle those.

Thanks!

-eric

Thanks

ABataev added inline comments.Jan 19 2018, 7:34 AM
lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
416 ↗(On Diff #130473)

PTX format does not allow use of .debug_ranges section, we should not emit it.

760 ↗(On Diff #130473)

I checked it, multiple CUs are handled correctly. But the reference to CU is generated as .b64 .debug_info+offset, which is supported by PTX. Don't forget, that PTX format is very limited in support of DWARF sections and we will genarte only .debug_abbrevs and .debug_info sections.

lib/Target/NVPTX/MCTargetDesc/NVPTXTargetStreamer.cpp
36 ↗(On Diff #130473)

I mean remove comments from the emitted DWARF sections once they are generated correctly.

43 ↗(On Diff #130473)

The body of the DWARF sections in PTX file must be enclosed into braces, like this:

.section .debug_info
{
...
}

This code checks that current section is one of the DWARF sections and encloses the content of these sections into braces.

lib/Target/NVPTX/MCTargetDesc/NVPTXTargetStreamer.h
27 ↗(On Diff #130473)

Ok, will add

lib/Target/NVPTX/NVPTXAsmPrinter.cpp
875 ↗(On Diff #130473)

I mean, that the comment symbols // from the emitted //, debug must be removed once the debug info is properly generated.

ABataev added inline comments.Jan 19 2018, 8:16 AM
lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
760 ↗(On Diff #130473)

After some thoughts, I realized that generally speaking, you're correct here. We need an additional offset for the compilation unit. I 'll fix this.

ABataev updated this revision to Diff 130633.Jan 19 2018, 9:09 AM

Fixed handling of multiple CUs.

ABataev updated this revision to Diff 131291.Jan 24 2018, 9:23 AM

Update against latest version.

ABataev updated this revision to Diff 132255.Jan 31 2018, 12:46 PM

Force DWARF 2 for NVPTX target.

probinson added inline comments.
lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
416 ↗(On Diff #130473)

The .debug_ranges section was not defined in DWARF v2. Better to avoid emitting it based on the DWARF version.

lib/CodeGen/AsmPrinter/DwarfDebug.cpp
274 ↗(On Diff #132255)

Unconditionally ignoring a command-line option? Are there tests that fail if you don't do this?

ABataev added inline comments.Feb 1 2018, 7:31 AM
lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
416 ↗(On Diff #130473)

It won't be emitted (at least for NVPTX). I'm not sure about other targets. Maybe they allow to use .debug_ranges sections even for DWARF2?

lib/CodeGen/AsmPrinter/DwarfDebug.cpp
274 ↗(On Diff #132255)

I'm not aware of any, did it just in case if there are such cases.

Ping

At this point I'm looking at how to break this up for you. It's going to take more time because I have other things, but I don't want to put in all of this in a single patch because there are things I want to reason about separately.

-eric

Ping

At this point I'm looking at how to break this up for you. It's going to take more time because I have other things, but I don't want to put in all of this in a single patch because there are things I want to reason about separately.

-eric

Hi Eric, thanks for your help. My ping is not a ping for the review, it is a ping for my question.
It is very hard to split this patch. It will break the generation of the PTX files so that either the compiler will crash or ptxas won't be able to compile the generated file. I asked you, is this ok if as the first part of the patch I remove generation of the debug lines info? It will reduce the size of the patch.

Ahh, and I think I can remove the part for the strings from this patch and commit it later

ABataev updated this revision to Diff 136292.Feb 28 2018, 7:38 AM

Updated to the latest version

ABataev removed a subscriber: probinson.
ABataev updated this revision to Diff 136295.Feb 28 2018, 7:57 AM

Restored accidentally deleted changes.

ABataev updated this revision to Diff 139597.Mar 23 2018, 7:30 AM

Updated to the latest version.

echristo added inline comments.Mar 31 2018, 12:37 PM
include/llvm/MC/MCAsmInfo.h
586 ↗(On Diff #139597)

Let's separate this out into a separate patch as well?

lib/CodeGen/AsmPrinter/DwarfDebug.cpp
333 ↗(On Diff #139597)

Do you still need to use dwarf2 here?

274 ↗(On Diff #132255)

Yeah, I don't think this should be here.

lib/Target/NVPTX/MCTargetDesc/NVPTXTargetStreamer.h
33 ↗(On Diff #139597)

So, this functionality is used for changing files within the line table. I think what's actually going on here is that nvptx, for some reason, only supports a single file directive.

Basically you don't support the idea of:

int foo (void) {

// some code
#include "somefile.def"
// some more code

}

having a file switch in the middle of a function. That's arguably a bug, but I think you don't want to collect them and emit them outside the functions because then it would be more actively wrong.

lib/Target/NVPTX/NVPTXAsmPrinter.cpp
477 ↗(On Diff #130473)

Seems odd that we're pulling the functionality across multiple functions that don't really work in pairs.

hintonda removed a subscriber: hintonda.Apr 1 2018, 6:42 PM
ABataev marked an inline comment as done.Apr 2 2018, 9:13 AM
ABataev added inline comments.
include/llvm/MC/MCAsmInfo.h
586 ↗(On Diff #139597)

Ok

lib/CodeGen/AsmPrinter/DwarfDebug.cpp
333 ↗(On Diff #139597)

Yes, we need to force dwarf2 for NVPTX, this is what we agreed on with Paul Robinson reviewing clang part of the patch

lib/Target/NVPTX/MCTargetDesc/NVPTXTargetStreamer.h
33 ↗(On Diff #139597)

No, it means that you may have the reference to the somefile.def, but the table of files should be emitted outside of the function, in the outer context

lib/Target/NVPTX/NVPTXAsmPrinter.cpp
477 ↗(On Diff #130473)

Yes, maybe it is worth it to add some option that allows enclosing of the dwarf sections into braces on the emission?

ABataev added inline comments.Apr 2 2018, 9:32 AM
lib/Target/NVPTX/NVPTXAsmPrinter.cpp
477 ↗(On Diff #130473)

Ahh, no, it won't work. This is caused by the fact that we only perform switch to section but not doing exit from the section emission. Because of that we just can't form pair of functions to emit braces

ABataev updated this revision to Diff 140825.Apr 3 2018, 10:43 AM

Updated to latest version

echristo added inline comments.Apr 10 2018, 6:03 PM
lib/Target/NVPTX/MCTargetDesc/NVPTXMCAsmInfo.cpp
42 ↗(On Diff #140825)

Do we still need these parts commented out?

lib/Target/NVPTX/MCTargetDesc/NVPTXTargetStreamer.cpp
39 ↗(On Diff #140825)

This should be commented as to "only dwarf sections need comments according to 'some reason'"

39 ↗(On Diff #130473)

Similar to my comment below, I think that the "which section am I in" part needs some work. I definitely don't think it should be a "!=" comparison, and perhaps should be checking for some other metadata/feature.

43 ↗(On Diff #130473)

But why looking for particular sections this way?

lib/Target/NVPTX/MCTargetDesc/NVPTXTargetStreamer.h
33 ↗(On Diff #139597)

I'm not sure what you mean. Can you show some examples and elaborate? I.e. you may have a file table entry, but if you're not emitting them before the function you might have a bad time?

What do you do with this testcase?

echristo@athyra ~> cat foo.c
void foo() {

#include "foo.def"
return;

}
echristo@athyra ~> cat foo.def
#include <stdio.h>
printf("hello world\n");

lib/Target/NVPTX/NVPTXAsmPrinter.h
350 ↗(On Diff #129123)

Can you add a rationale here as to why we do it here rather than some other place? More detail in comment descriptions is good.

ABataev added inline comments.Apr 11 2018, 8:14 AM
lib/Target/NVPTX/MCTargetDesc/NVPTXMCAsmInfo.cpp
42 ↗(On Diff #140825)

Yes, cause the debug info is still not correct. I'm afraid that it may cause some problems in the ptxas

lib/Target/NVPTX/MCTargetDesc/NVPTXTargetStreamer.cpp
39 ↗(On Diff #140825)

Ok

39 ↗(On Diff #130473)
  1. Do you suggest explicitly check for the particular debug sections?
  2. What metadata/feature are you talking about? Could you give a reference?
43 ↗(On Diff #130473)

Do you have some better solution in mind?

lib/Target/NVPTX/MCTargetDesc/NVPTXTargetStreamer.h
33 ↗(On Diff #139597)

I removed include <stdio.h> from your second file, it causes a lot of troubles. Here is what I got:

In file included from /NVPTX_DEBUG_INFO/foo.c:3:
./foo.def:2:1: warning: implicitly declaring library function 'printf' with type 'int (const char *, ...)' [-Wimplicit-function-declaration]
printf("hello world\n");
^
./foo.def:2:1: note: include the header <stdio.h> or explicitly provide a declaration for 'printf'
//
// Generated by LLVM NVPTX Back-End
//

.version 3.2
.target sm_20//, debug
.address_size 64

	// .globl	foo
.extern .func  (.param .b32 func_retval0) vprintf
(
	.param .b64 vprintf_param_0,
	.param .b64 vprintf_param_1
)
;
.global .align 1 .b8 _$_str[13] = {104, 101, 108, 108, 111, 32, 119, 111, 114, 108, 100, 10, 0};

.visible .func foo()
{
	.reg .b32 	%r<2>;
	.reg .b64 	%rd<4>;
Lfunc_begin0:
	.loc	1 1 0

	.loc	2 2 1
	mov.u64 	%rd1, _$_str;
	cvta.global.u64 	%rd2, %rd1;
	mov.u64 	%rd3, 0;
	{ // callseq 0, 0
	.reg .b32 temp_param_reg;
	.param .b64 param0;
	st.param.b64 	[param0+0], %rd2;
	.param .b64 param1;
	st.param.b64 	[param1+0], %rd3;
	.param .b32 retval0;
	call.uni (retval0), 
	vprintf, 
	(
	param0, 
	param1
	);
	ld.param.b32 	%r1, [retval0+0];
	} // callseq 0
	.loc	1 4 1
	ret;
Ltmp0:
Lfunc_end0:

}
	.file	1 "/NVPTX_DEBUG_INFO/foo.c"
	.file	2 "/NVPTX_DEBUG_INFO/./foo.def"
//	.section	.debug_abbrev
//	{
// .b8 1
// .b8 17
// .b8 1
// .b8 37
// .b8 8
// .b8 19
// .b8 5
// .b8 3
// .b8 8
// .b8 16
// .b8 6
// .b8 27
// .b8 8
// .b8 17
// .b8 1
// .b8 18
// .b8 1
// .b8 0
// .b8 0
// .b8 2
// .b8 46
// .b8 0
// .b8 17
// .b8 1
// .b8 18
// .b8 1
// .b8 3
// .b8 8
// .b8 58
// .b8 11
// .b8 59
// .b8 11
// .b8 63
// .b8 12
// .b8 0
// .b8 0
// .b8 0
//	}
//	.section	.debug_info
//	{
// .b32 166
// .b8 2
// .b8 0
// .b32 .debug_abbrev
// .b8 8
// .b8 1
// .b8 99
// .b8 108
// .b8 97
// .b8 110
// .b8 103
// .b8 32
// .b8 118
// .b8 101
// .b8 114
// .b8 115
// .b8 105
// .b8 111
// .b8 110
// .b8 32
// .b8 55
// .b8 46
// .b8 48
// .b8 46
// .b8 48
// .b8 32
// .b8 40
// .b8 116
// .b8 114
// .b8 117
// .b8 110
// .b8 107
// .b8 32
// .b8 51
// .b8 50
// .b8 57
// .b8 56
// .b8 48
// .b8 52
// .b8 41
// .b8 32
// .b8 40
// .b8 108
// .b8 108
// .b8 118
// .b8 109
// .b8 47
// .b8 116
// .b8 114
// .b8 117
// .b8 110
// .b8 107
// .b8 32
// .b8 51
// .b8 50
// .b8 57
// .b8 56
// .b8 48
// .b8 53
// .b8 41
// .b8 0
// .b8 12
// .b8 0
// .b8 102
// .b8 111
// .b8 111
// .b8 46
// .b8 99
// .b8 0
// .b32 .debug_line
// .b8 47
// .b8 103
// .b8 115
// .b8 97
// .b8 47
// .b8 121
// .b8 107
// .b8 116
// .b8 103
// .b8 115
// .b8 97
// .b8 47
// .b8 104
// .b8 111
// .b8 109
// .b8 101
// .b8 47
// .b8 97
// .b8 47
// .b8 108
// .b8 47
// .b8 97
// .b8 108
// .b8 101
// .b8 120
// .b8 101
// .b8 121
// .b8 47
// .b8 116
// .b8 101
// .b8 115
// .b8 116
// .b8 47
// .b8 78
// .b8 86
// .b8 80
// .b8 84
// .b8 88
// .b8 95
// .b8 68
// .b8 69
// .b8 66
// .b8 85
// .b8 71
// .b8 95
// .b8 73
// .b8 78
// .b8 70
// .b8 79
// .b8 0
// .b64 Lfunc_begin0
// .b64 Lfunc_end0
// .b8 2
// .b64 Lfunc_begin0
// .b64 Lfunc_end0
// .b8 102
// .b8 111
// .b8 111
// .b8 0
// .b8 1
// .b8 1
// .b8 1
// .b8 0
//	}
//	.section	.debug_macinfo
//	{
// .b8 0

//	}
1 warning generated.

As you can see .loc directives are emitted with references to files 1 and 2 and after the .text section we have 2 .file directives: one for file foo.c and another one for file foo.def

lib/Target/NVPTX/NVPTXAsmPrinter.h
350 ↗(On Diff #129123)

Ok, will add

ABataev updated this revision to Diff 142035.Apr 11 2018, 10:13 AM

Address comments from @echristo

echristo added inline comments.Apr 11 2018, 10:22 AM
lib/Target/NVPTX/MCTargetDesc/NVPTXTargetStreamer.h
33 ↗(On Diff #139597)

Typically you don't have file directives listed after the function and still have it able to assemble. Doing that in the .s file created with my testcase gives:

echristo@athyra ~> clang -c foo.s
foo.s:8:7: error: unassigned file number in '.loc' directive

.loc    1 1 0                   # foo.c:1:0
        ^

foo.s:18:7: error: unassigned file number in '.loc' directive

.loc    2 2 1 prologue_end      # ./foo.def:2:1
        ^

foo.s:22:7: error: unassigned file number in '.loc' directive

.loc    1 3 3                   # foo.c:3:3

which seems problematic?

ABataev added inline comments.Apr 11 2018, 10:35 AM
lib/Target/NVPTX/MCTargetDesc/NVPTXTargetStreamer.h
33 ↗(On Diff #139597)

Not for ptxas, ptxas supports this order of the .loc and .file directives.

Couple of small comment then this is going to be ok.

One question: "What work needs to happen to uncomment the debug info output?" In other words, what isn't done right now?

-eric

lib/Target/NVPTX/MCTargetDesc/NVPTXTargetStreamer.cpp
39 ↗(On Diff #140825)

For the record this is pretty fragile and should be fixed up in a followup patch.

lib/Target/NVPTX/MCTargetDesc/NVPTXTargetStreamer.h
33 ↗(On Diff #139597)

Can you document this in comments?

Couple of small comment then this is going to be ok.

One question: "What work needs to happen to uncomment the debug info output?" In other words, what isn't done right now?

-eric

The main problem is that the emitted debug info is correct syntactically, but not semantically. And I'm afraid that this may cause some problems with the cuda tools.

lib/Target/NVPTX/MCTargetDesc/NVPTXTargetStreamer.cpp
39 ↗(On Diff #140825)

Ok, will do

lib/Target/NVPTX/MCTargetDesc/NVPTXTargetStreamer.h
33 ↗(On Diff #139597)

Sure

ABataev updated this revision to Diff 142791.Apr 17 2018, 10:12 AM

Updated after review

echristo accepted this revision.Apr 17 2018, 10:20 AM

Good start for now. Let's see what we can do to start testing as we can and documenting any bugs in the cuda tools as we work around them.

Thanks for all of your efforts here.

-eric

This revision is now accepted and ready to land.Apr 17 2018, 10:20 AM
This revision was automatically updated to reflect the committed changes.
tra added a comment.Apr 18 2018, 9:48 AM

Hooray! @ABataev, @echristo -- Thank you for all the work you have put into landing this patch.

In D41827#1071104, @tra wrote:

Hooray! @ABataev, @echristo -- Thank you for all the work you have put into landing this patch.

:) Actually, we just started, there is a long road ahead :)

tra added a comment.Apr 18 2018, 10:12 AM
In D41827#1071104, @tra wrote:

Hooray! @ABataev, @echristo -- Thank you for all the work you have put into landing this patch.

:) Actually, we just started, there is a long road ahead :)

I know. Still, it's good to see that you are landing bits and pieces of it now.

FWIW I'm seeing internal correctness issues after this patch has been merged - I think it's related to the section changing part of the patches. I'm investigating.

FWIW I'm seeing internal correctness issues after this patch has been merged - I think it's related to the section changing part of the patches. I'm investigating.

:( Very said. Do have troubles with NVPTX target or some other targets?

FWIW I'm seeing internal correctness issues after this patch has been merged - I think it's related to the section changing part of the patches. I'm investigating.

:( Very said. Do have troubles with NVPTX target or some other targets?

NVPTX :(

FWIW I'm seeing internal correctness issues after this patch has been merged - I think it's related to the section changing part of the patches. I'm investigating.

:( Very said. Do have troubles with NVPTX target or some other targets?

NVPTX :(

Do you have a repro?

FWIW I'm seeing internal correctness issues after this patch has been merged - I think it's related to the section changing part of the patches. I'm investigating.

:( Very said. Do have troubles with NVPTX target or some other targets?

NVPTX :(

Do you have a repro?

Not yet. Trying to work one up.

FWIW I'm seeing internal correctness issues after this patch has been merged - I think it's related to the section changing part of the patches. I'm investigating.

:( Very said. Do have troubles with NVPTX target or some other targets?

NVPTX :(

Do you have a repro?

Not yet. Trying to work one up.

So this diff:

  • for (const std::string &S : DwarfFiles)
  • getStreamer().EmitRawText(S.data());

+ for (const std::string &S : DwarfFiles)
+
getStreamer().EmitRawText(DwarfFiles.front().data());

causes the problem to go away.

The problem shows up this way:

There is more than one initializer with name 'file'

during program startup (within google so you won't be able to duplicate this part). which seems a little weird, but I'm guessing that there's something going on in the way that we're outputting the file changes at section switch time with how ptxas is dealing with it.

For now I'm going to go ahead and revert this because it's making it so that cuda compilation isn't working with clang and I'll continue working up a testcase and see if I can't narrow down the problem a little better and get back to you soon.

FWIW I'm seeing internal correctness issues after this patch has been merged - I think it's related to the section changing part of the patches. I'm investigating.

:( Very said. Do have troubles with NVPTX target or some other targets?

NVPTX :(

Do you have a repro?

Not yet. Trying to work one up.

So this diff:

  • for (const std::string &S : DwarfFiles)
  • getStreamer().EmitRawText(S.data()); + for (const std::string &S : DwarfFiles) + getStreamer().EmitRawText(DwarfFiles.front().data());

    causes the problem to go away.

    The problem shows up this way:

    There is more than one initializer with name 'file'

    during program startup (within google so you won't be able to duplicate this part). which seems a little weird, but I'm guessing that there's something going on in the way that we're outputting the file changes at section switch time with how ptxas is dealing with it.

    For now I'm going to go ahead and revert this because it's making it so that cuda compilation isn't working with clang and I'll continue working up a testcase and see if I can't narrow down the problem a little better and get back to you soon.

Eric, could you send the small part of the PTX file, that causes troubles? Not the whole file, just several lines.

FWIW I'm seeing internal correctness issues after this patch has been merged - I think it's related to the section changing part of the patches. I'm investigating.

:( Very said. Do have troubles with NVPTX target or some other targets?

NVPTX :(

Do you have a repro?

Not yet. Trying to work one up.

So this diff:

  • for (const std::string &S : DwarfFiles)
  • getStreamer().EmitRawText(S.data()); + for (const std::string &S : DwarfFiles) + getStreamer().EmitRawText(DwarfFiles.front().data());

    causes the problem to go away.

    The problem shows up this way:

    There is more than one initializer with name 'file'

    during program startup (within google so you won't be able to duplicate this part). which seems a little weird, but I'm guessing that there's something going on in the way that we're outputting the file changes at section switch time with how ptxas is dealing with it.

    For now I'm going to go ahead and revert this because it's making it so that cuda compilation isn't working with clang and I'll continue working up a testcase and see if I can't narrow down the problem a little better and get back to you soon.

Eric, could you send the small part of the PTX file, that causes troubles? Not the whole file, just several lines.

Once I extract it from our build system I will :)

Hmm. Not sure, of course, but looks like the corruption of the output buffer. Like the flush is required or something like this. Need the output result for better analysis.