This is an archive of the discontinued LLVM Phabricator instance.

Emit S_COMPILE3 CodeView record
ClosedPublic

Authored by amccarth on Sep 7 2016, 2:29 PM.

Details

Summary

CodeView has an S_COMPILE3 record to identify the compiler and source language of the compiland.

This record comes first in the debug$S section for the compiland. The debuggers rely on this record to know the source language of the code.

I'm not thrilled about how this infers the CPUType and how it has to parse the compiler version from the producer string in the DICompileUnit. If you know of a better way to get at this information, I'd be happy to improve this patch.

Diff Detail

Repository
rL LLVM

Event Timeline

amccarth updated this revision to Diff 70599.Sep 7 2016, 2:29 PM
amccarth retitled this revision from to Emit S_COMPILE3 CodeView record.
amccarth updated this object.
amccarth added a reviewer: rnk.
amccarth added a subscriber: llvm-commits.

Forgive my naïveté but under what circumstance does the debugger need to know C vs C++ vs whatever?

lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
542–543 ↗(On Diff #70599)

This shouldn't map to IA64, that would be Itanium. It's also not correct to map it to X64 because it could be 32-bit sandybridge.

Does MSVC actually change what it sticks in here depending on /arch? If not, I'd just use the ArchType from the Triple to reconstruct this info.

593 ↗(On Diff #70599)

Using the triple for this looks weird. I'd grab the cpu name from the TM via Asm->TM.getTargetCPU()

Forgive my naïveté but under what circumstance does the debugger need to know C vs C++ vs whatever?

It may not _need_ to know, but the Visual Studio debugger does _display_ the source language in places like the call stack. I suppose the source language might be used in evaluating Watch expressions.

lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
542–543 ↗(On Diff #70599)

OK, I'll try to fix that up. I was mostly relying on Wikipedia to figure out how the architectures mapped. I guess I misunderstood what I read.

Does MSVC actually change what it sticks in here depending on /arch? If not, I'd just use the ArchType from the Triple to reconstruct this info.

I don't know how MSVC chooses what goes here. I'll experiment.

It does appear, however, that clang-cl chooses one of these architectures for the triple based on /arch, so I figured these were the important ones to map back into the CodeView type.

593 ↗(On Diff #70599)

Thanks! I was looking for something like that but had failed to find it.

I had tried Asm->TM.getMCSubtargetInfo()->getCPU();, but that asserted.

amccarth updated this revision to Diff 70615.Sep 7 2016, 4:18 PM

Simpler approach for getting the CodeView CPUType.

majnemer added inline comments.Sep 7 2016, 5:42 PM
lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
574–575 ↗(On Diff #70615)

I'd make it a switch on the triple's architecture (Triple(MMI->getModule()->getTargetTriple()).getArch()). Clang also supports ARMNT.

I'd have the default label call report_fatal_error to indicate that the architecture is not supported.

majnemer added inline comments.Sep 7 2016, 6:05 PM
lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
530–532 ↗(On Diff #70615)

This should be in the anonymous namespace.

574–575 ↗(On Diff #70615)
majnemer added inline comments.Sep 7 2016, 7:42 PM
lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
574–575 ↗(On Diff #70615)

Eh, we should just report_fatal_error. Sorry for waffling.

583–585 ↗(On Diff #70615)

Apparently, Microsoft tools (like BinScope [https://www.microsoft.com/en-us/download/details.aspx?id=11910]) expect that the backend version is at least 8.0.50727 [1]

I think the backend version should be controlled by LLVM_VERSION_MAJOR, LLVM_VERSION_MINOR and LLVM_VERSION_PATCH instead of whatever is contained in the .ll file. Because of the aforementioned limitation, perhaps we should do something like LLVM_VERSION_MAJOR * 1000 + LLVM_VERSION_MINOR * 10 + LLVM_VERSION_PATCH and use zeros for the other three values.

[1] http://repo.or.cz/nasm.git/commitdiff/1df89ea03902161b76773d0eef48f277b49d918e

amccarth marked 2 inline comments as done.Sep 9 2016, 3:42 PM

It looks like I have to fix up more tests, too, because the extra record make the subsection size a little larger. I'm working through those now but may not have a new diff until Monday.

lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
530–532 ↗(On Diff #70615)

oops. I had it in there initially until I learned that LLVM style is to prefer file static functions to functions in anonymous namespaces.

583–585 ↗(On Diff #70615)

I see your point, but ...

  1. I also have to provide the textual compiler version string, and it seems awkward to get that from a different source than the structured data. This seems prone to getting conflicting information.
  1. Having the version dependent on the actual build instead of what's in the .ll file makes the test harder to write.

What if I continue to parse the version from the string but roll up the fields as you've described to solve the Binscope expectation problem?

Another possibility is to use the MS compatibility version rather than the Clang version and synthesize a string like "clang in MSVC compatibility mode 1.2.3.4" for the textual compiler version.

majnemer added inline comments.Sep 9 2016, 4:21 PM
lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
583–585 ↗(On Diff #70615)

Eh, but the frontend is the IR that came from clang and the backend is AsmPrinter's job.
The test issue can be handled by making the FileCheck use a regex.

amccarth updated this revision to Diff 71895.Sep 19 2016, 5:19 PM
amccarth marked an inline comment as done.

Several tests needed updating because the extra S_COMPILE3 record in the Symbols subsection changes the section's size and offsets some relocation information.

Also uses reports distinct front-end and back-end versions per David Majnemer's suggestion.

majnemer accepted this revision.Sep 20 2016, 10:22 AM
majnemer added a reviewer: majnemer.

LGTM, thanks!

This revision is now accepted and ready to land.Sep 20 2016, 10:22 AM
This revision was automatically updated to reflect the committed changes.
thakis added a subscriber: thakis.Sep 21 2016, 11:02 AM
llvm/trunk/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
509

(post-commit nit: since it's in anonymous namespace, the static should probably go now)

rnk added inline comments.Sep 30 2016, 10:18 AM
llvm/trunk/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
2100

I don't think we want to repeat the compile record in every symbol subsection. We have one symbol subsection per function. MSVC only emits the compiler flags once up front, like this:

CodeViewDebugInfo [
  Section: .debug$S (2)
  Magic: 0x4
  Subsection [
    SubSectionType: Symbols (0xF1)
    SubSectionSize: 0x5C
    ObjectName {
      Signature: 0x0
      ObjectName: C:\src\llvm\build\t.obj
    }
    CompilerFlags3 {
      Language: Cpp (0x1)
      Flags [ (0x6000)
        HotPatch (0x4000)
        SecurityChecks (0x2000)
      ]
      Machine: X64 (0xD0)
      FrontendVersion: 19.0.24210.0
      BackendVersion: 19.0.24210.0
      VersionName: Microsoft (R) Optimizing Compiler
    }
  ]
  Subsection [
    SubSectionType: Symbols (0xF1)
    SubSectionSize: 0x337
    UDT {
amccarth added inline comments.Sep 30 2016, 1:01 PM
llvm/trunk/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
2100

Good catch. I'll fix that this afternoon and add a test for it.