This is an archive of the discontinued LLVM Phabricator instance.

Fix incorrect DWARF 5 file name 0 when using -no-integrated-as
AcceptedPublic

Authored by osandov on Jul 8 2021, 2:38 PM.

Details

Summary

DWARF 5 specifies that in a line number program header in .debug_line,
directory entry 0 must be the compilation directory and file name entry
0 must be the compilation file name. Consider the following correct
output:

$ pwd
/home/osandov/test
$ cat foo/bar/baz.c
int my_var = 1;
$ clang -g -gdwarf-5 -c foo/bar/baz.c
$ readelf --debug-dump=line baz.o
...
 The Directory Table (offset 0x22, lines 1, columns 1):
  Entry Name
  0     (indirect line string, offset: 0x0): /home/osandov/test

 The File Name Table (offset 0x2e, lines 1, columns 3):
  Entry Dir     MD5                             Name
  0     0 0x436e7c8f87b60ba4ecab92f1eddfcbb4    (indirect line string, offset: 0x13): foo/bar/baz.c

However, when using -no-integrated-as, the compilation directory is
incorrectly given as the parent directory of the source file, and the
compilation file name is given relative to that:

$ clang -g -gdwarf-5 -no-integrated-as -c foo/bar/baz.c
$ readelf --debug-dump=line baz.o
...
 The Directory Table (offset 0x22, lines 1, columns 1):
  Entry Name
  0     (indirect line string, offset: 0x0): /home/osandov/test/foo/bar

 The File Name Table (offset 0x2e, lines 1, columns 3):
  Entry Dir     MD5                             Name
  0     0 0xb4cbdfedf192abeca40bb6878f7c6e43    (indirect line string, offset: 0x1b): baz.c

This is because the generated assembly .file directive does not split
the directory from the file name if -no-integrated-as is used:

$ clang -g -gdwarf-5 -S foo/bar/baz.c -o - | grep '\.file\s\+0'
        .file   0 "/home/osandov/test" "foo/bar/baz.c" md5 0xb4cbdfedf192abeca40bb6878f7c6e43
$ clang -g -gdwarf-5 -no-integrated-as -S foo/bar/baz.c -o - | grep '\.file\s\+0'
        .file   0 "/home/osandov/test/foo/bar/baz.c" md5 0xb4cbdfedf192abeca40bb6878f7c6e43

This is ostensibly for compatibility, since the [extended .file
syntax](https://sourceware.org/binutils/docs/as/File.html) that allows
specifying the directory separately was only recently added to the GNU
Assembler in binutils 2.35.

But, the md5 argument also relies on the extended syntax, so this
isn't actually improving compatibility when using DWARF 5. Fix this bug
by always using the directory argument for DWARF 5.

I noticed this bug when compiling the Linux kernel with Clang and DWARF
5, as the Linux kernel is compiled with -no-integrated-as.

Test Plan:
Updated CodeGen tests. Ran the reproducer above and saw that it matched
the -integrated-as version.

Diff Detail

Event Timeline

osandov created this revision.Jul 8 2021, 2:38 PM
osandov requested review of this revision.Jul 8 2021, 2:38 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 8 2021, 2:38 PM

Please include a test case (hopefully/possibly the original patch that added the functionality for compatibility already included testing and this patch will need to update that testing)

Got a link to conversations, patches, etc, that confirm this claim:

This is ostensibly for compatibility, since the extended .file syntax that allows specifying the directory separately was only recently added to the GNU Assembler in binutils 2.35.

Got a link to conversations, patches, etc, that confirm this claim:

This behavior goes back to https://github.com/llvm/llvm-project/commit/1d617acef9f102629b736db06a5b5def0e116297, which states:

With r142300 but not this patch, clang -S may emit .s files that assemblers other than llvm-mc can't parse.

(llvm-mc got support in the previous commit, https://github.com/llvm/llvm-project/commit/40f8f2ff2452c32529a10ba504ba7fbecf8731cd.)

I'll work on a test case, thanks.

osandov updated this revision to Diff 357387.Jul 8 2021, 4:47 PM
osandov edited the summary of this revision. (Show Details)

The existing CodeGen tests already cover this, so I updated them.

But, the md5 argument also relies on the extended syntax, so this isn't actually improving compatibility when using DWARF 5. Fix this bug by always using the directory argument for DWARF 5.

Hmm, I don't think this argument holds - we don't emit checksums by default, do we? They're opt-in, I think? (at the clang level)

So this change as proposed would break that backwards compatibility for assemblers for users who aren't opting in to checksums/haven't already broken it.

I'm not passing any flags to opt in to hashes, and I'm getting them:

$ clang --version
clang version 12.0.0
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
$ clang -g -gdwarf-5 -no-integrated-as -S foo/bar/baz.c -o - | grep '\.file'
        .file   "baz.c"
        .file   0 "/home/osandov/test/foo/bar/baz.c" md5 0xb4cbdfedf192abeca40bb6878f7c6e43

With the same results on the main branch. I don't have access to a non-Linux system to check there, though.

It is opt-in in the sense that you have to opt in to DWARF 5, though. And if you're asking for DWARF 5 with -no-integrated-as, then your assembler better support the directives needed for DWARF 5.

dblaikie accepted this revision.Jul 9 2021, 1:40 PM
dblaikie added a subscriber: MaskRay.

I'm not passing any flags to opt in to hashes, and I'm getting them:

$ clang --version
clang version 12.0.0
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
$ clang -g -gdwarf-5 -no-integrated-as -S foo/bar/baz.c -o - | grep '\.file'
        .file   "baz.c"
        .file   0 "/home/osandov/test/foo/bar/baz.c" md5 0xb4cbdfedf192abeca40bb6878f7c6e43

With the same results on the main branch. I don't have access to a non-Linux system to check there, though.

ah, right - fair enough then.

This revision is now accepted and ready to land.Jul 9 2021, 1:40 PM

Thank you! How do I get this landed?

Thank you! How do I get this landed?

https://llvm.org/docs/Phabricator.html#committing-someone-s-change-from-phabricator

Usually the reviewer can help commit the change on behalf of the author.

MaskRay added a comment.EditedJul 12 2021, 11:23 AM

clang/lib/Driver/ToolChains/Clang.cpp:ShouldDisableDwarfDirectory sets -fno-dwarf-directory-asm for -fno-integrated-as.

With -fintegrated-as, UseDwarfDirectory is true (the default -fdwarf-directory-asm).

With -fno-integrated-as, UseDwarfDirectory is false (as if -fno-dwarf-directory-asm is specified).

-fdwarf-directory-asm appears to be a Clang specific option (not in GCC) for compatibility when GNU as did not support the new .file. I think with modern -gdwarf-5 -fno-dwarf-directory-asm should probably just be ignored.

MaskRay added inline comments.Jul 12 2021, 12:40 PM
llvm/test/CodeGen/Generic/dwarf-source.ll
14 ↗(On Diff #357387)

I sent D105835 (driver option) to fix the original issue. Clang sets `UseDwarfDirectory propably.

A similar change is needed on llc side. llc should set UseDwarfDirectory to false only if -no-integrated-as and dwarf-version<5.

MaskRay added inline comments.Jul 12 2021, 4:34 PM
llvm/lib/MC/MCAsmStreamer.cpp
67–68

You may repurpose the patch to drop UseDwarfDirectory. The refactoring is still good to have.