The target needs to be queried here, but previously we seemed to only
duplicate CUDA's (and so HIP's) behavior, and only partially. Use the
same function as codegen to find the correct address space.
I'm not certain I fully understand NVPTX's relationship with its debugger, but from https://reviews.llvm.org/D57162 I gather the "default" address space in the debugger is global, and so the frontend omits it rather than explicitly mentioning it. I think it would be simpler to carry this information throughout the compiler and only strip it late in the backend, as a quirk controllable via some "optimize for NVPTX debugger" option. In the patch as it currently stands, though, I instead just update NVPTXDWARFAddrSpaceMap.
Edit: Concerning auto variables, when coming back to the patch to post it I had missed the next patch in the series which addresses it by directly referring to the corresponding alloca rather than the addrspacecast to the default address space. I'll post that patch shortly to address the "FIXME" in this one.
The test is pretty short, so I just re-ordered the checks to match how they appear in the output (and used more descriptive names to make it easier to follow).
This does mean the test relies on the order these things are traversed. Some bits are maybe a bit surprising, like how the metadata for the __shared__ auto variable comes before the argument, but I don't imagine it is liable to change often/accidentally.
Does anyone have any thoughts on this change specifically? Is someone more familiar with NVPTX willing to weigh in on whether it makes more sense to carry the address space throughout the compiler explicitly and "drop" it late in the DWARF emission, or to do what I did in the current patch (drop it early)?
I would lean towards updating the patch to do the former (carry it throughout and drop it late), but I wanted to get feedback before plunging off to do it.
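To make the two options concrete, here is a minimal standalone sketch of "drop early" versus "drop late". Everything in it is hypothetical: the `AddrSpace` enum, the function names, the `TuneForNVPTXDebugger` flag, and the numeric class codes are placeholders for illustration and do not correspond to the real clang/LLVM APIs or to the actual CUDA DWARF address-class values. The only assumption taken from the discussion is that the debugger treats global as its default.

```cpp
#include <optional>

// Hypothetical stand-ins; none of these names or values come from LLVM.
enum class AddrSpace { Generic, Global, Shared };

// Placeholder numeric codes, chosen only for illustration.
inline unsigned toDwarfClass(AddrSpace AS) {
  switch (AS) {
  case AddrSpace::Generic: return 0;
  case AddrSpace::Global:  return 1;
  case AddrSpace::Shared:  return 2;
  }
  return 0;
}

// Assumption from the discussion: the debugger's default is global.
inline bool isDebuggerDefault(AddrSpace AS) {
  return AS == AddrSpace::Global;
}

// Option A (drop early): the frontend omits the default address space,
// so nothing downstream ever sees it.
inline std::optional<unsigned> frontendDropEarly(AddrSpace AS) {
  if (isDebuggerDefault(AS))
    return std::nullopt;
  return toDwarfClass(AS);
}

// Option B (drop late): the frontend always records the address space...
inline std::optional<unsigned> frontendKeep(AddrSpace AS) {
  return toDwarfClass(AS);
}

// ...and the backend strips the default only when asked to tune for
// the NVPTX debugger.
inline std::optional<unsigned> backendEmit(std::optional<unsigned> Class,
                                           bool TuneForNVPTXDebugger) {
  if (TuneForNVPTXDebugger && Class &&
      *Class == toDwarfClass(AddrSpace::Global))
    return std::nullopt;
  return Class;
}
```

With the tuning flag on, both options emit the same thing; the difference is that option B keeps the information available to every pass in between and makes the debugger quirk an explicit, switchable backend decision.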