This is an archive of the discontinued LLVM Phabricator instance.

Differential D41657

Do not look up symbol names when n_strx == 0
ClosedPublic

Authored by mtrent on Jan 1 2018, 2:13 PM.

Details

Reviewers

enderby
davide

Commits

rGca30902ff841: Do not look up symbol names when n_strx == 0
rL321773: Do not look up symbol names when n_strx == 0

Summary

Historical tools for working with Mach-O binaries verify that the nlist field
n_strx has a non-zero value before using that value to retrieve symbol names.
Under some circumstances, llvm-nm will attempt to display the symbol name at
position 0, even though symbol names at that position are not well defined.
This change addresses the problem by returning an empty string when n_strx
is zero.
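
The behaviour described above can be illustrated with a minimal, self-contained sketch. This is not LLVM's code: `lookupSymbolName` is a hypothetical helper that uses standard-library types in place of `StringRef` and `Expected`, and it assumes the string table is available as a flat byte buffer. It only shows the idea of treating `n_strx == 0` as "no name" before any bounds check or read.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>

// Hypothetical stand-in for a Mach-O symbol name lookup (not LLVM's API).
// Returns an empty view when n_strx == 0, std::nullopt for a bad index,
// and the NUL-terminated name otherwise.
std::optional<std::string_view> lookupSymbolName(std::string_view strtab,
                                                 std::uint32_t n_strx) {
  // An n_strx of 0 means "no name": never read the bytes at offset 0.
  if (n_strx == 0)
    return std::string_view();

  // Reject indexes that point past the end of the table.
  if (n_strx >= strtab.size())
    return std::nullopt;

  // The name runs up to the next NUL byte; an unterminated name is an error.
  std::size_t end = strtab.find('\0', n_strx);
  if (end == std::string_view::npos)
    return std::nullopt;

  return strtab.substr(n_strx, end - n_strx);
}
```

With a table whose first bytes happen to hold stray text, such as `"junk\0_main\0"`, index 0 yields an empty name instead of `junk`, while index 5 still yields `_main`.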

rdar://problem/35750548

Diff Detail

Build Status

Buildable 13504
Build 13504: arc lint + arc unit

Event Timeline

mtrent created this revision. Jan 1 2018, 2:13 PM

Herald added a subscriber: JDevlieghere. Jan 1 2018, 2:13 PM

I wonder whether you can use something like yaml2obj to craft the object? (or, assuming it's valid, llvm-mc)?
That would improve the readability a lot IMHO.
If not, can you at least add comments to the test (e.g. source file, compiler/linker version, etc.) in case we need to regenerate this in the future?

lib/Object/MachOObjectFile.cpp
1662–1665	No `{}` around single line ifs.
test/tools/llvm-nm/X86/macho-dwarf.test
3	Do you need `cat` ? Can't you just pipe the nm output to `FileCheck`?

davide requested changes to this revision. Jan 2 2018, 6:25 AM

This revision now requires changes to proceed. Jan 2 2018, 6:25 AM

In D41657#965799, @davide wrote:

I wonder whether you can use something like yaml2obj to craft the object? (or, assuming it's valid, llvm-mc)?
That would improve the readability a lot IMHO.
If not, can you at least add comments to the test (e.g. source file, compiler/linker version, etc.) in case we need to regenerate this in the future?

llvm-objdump and llvm-nm commonly use binary tests when working with bad binaries, so this test isn't unusual. It's not clear to me how you would regenerate this mach-o with lld.

lib/Object/MachOObjectFile.cpp
1662–1665	will fix.
test/tools/llvm-nm/X86/macho-dwarf.test
3	Good point, will fix.

mtrent marked 2 inline comments as done. Jan 2 2018, 11:37 AM

Updating change request with review feedback.

Main change is to rewrite the lit test to exactly match whole lines of input.
Also added a comment explaining how the test binary was formed.

In D41657#966001, @mtrent wrote:

In D41657#965799, @davide wrote:

I wonder whether you can use something like yaml2obj to craft the object? (or, assuming it's valid, llvm-mc)?
That would improve the readability a lot IMHO.
If not, can you at least add comments to the test (e.g. source file, compiler/linker version, etc.) in case we need to regenerate this in the future?

llvm-objdump and llvm-nm commonly use binary tests when working with bad binaries, so this test isn't unusual. It's not clear to me how you would regenerate this mach-o with lld.

Changes have broken tests in the past, and when the input is a checked-in binary it is harder to tell what the original intent was, IMHO.
That's why I recommended using YAML, if possible at all. That said, I don't think it's critical, and your comment is explanatory enough.

This LGTM after the two minor issues are fixed.

This revision is now accepted and ready to land. Jan 2 2018, 2:52 PM

Looks good to me with the update of the one comment.

lib/Object/MachOObjectFile.cpp
1662	Mach-O, back in 1988 when I created it, was based on 4.2 BSD Unix. The documentation there states "A n_strx value of 0 indicates that no name is associated with a particular symbol table entry". Also, the a.out(5) format at that time stored the size of the string table in the first 4 bytes of the string table, so valid string table indexes were 4 or more. In the early days of Mach-O, although the load command held the string table size, tools still reserved these 4 bytes and generally put nulls in them, so that if some tool did not correctly understand that index 0 was special it would "just work". So I would update or remove this comment.
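
The layout described in this comment can be sketched as follows. This is an illustrative model only, not code from any real linker: `StringTable`, `buildStringTable`, and `naiveNameAt` are hypothetical names, and the 4 reserved bytes are assumed to be filled with nulls as the comment says early Mach-O tools generally did.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical model of a string table with a reserved 4-byte prefix.
struct StringTable {
  std::string bytes;                  // raw table contents
  std::vector<std::uint32_t> offsets; // n_strx value assigned to each name
};

// Reserve 4 NUL bytes up front, then append each name with its terminator,
// so every real name gets an index of 4 or more.
StringTable buildStringTable(const std::vector<std::string> &names) {
  StringTable table;
  table.bytes.assign(4, '\0');
  for (const std::string &name : names) {
    table.offsets.push_back(static_cast<std::uint32_t>(table.bytes.size()));
    table.bytes += name;
    table.bytes.push_back('\0');
  }
  return table;
}

// A naive reader that does not special-case index 0. The caller must pass
// an index inside the table. Because the prefix is all NULs, index 0 still
// reads as an empty name.
std::string naiveNameAt(const StringTable &table, std::uint32_t n_strx) {
  return std::string(table.bytes.c_str() + n_strx);
}
```

In this model a reader that ignores the special meaning of index 0 still sees an empty name there, which is the "just work" behaviour the comment refers to. A table written without the null prefix gives no such guarantee, which is why the explicit `n_strx == 0` check matters.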

Updating comment based upon review feedback.

mtrent closed this revision. Jan 3 2018, 3:29 PM

Revision Contents

Path	Size
lib/Object/MachOObjectFile.cpp	4 lines
test/tools/llvm-nm/X86/Inputs/macho-dwarf-x86_64
test/tools/llvm-nm/X86/macho-dwarf.test	15 lines

Diff 128573

lib/Object/MachOObjectFile.cpp

   unsigned SymbolTableEntrySize = is64Bit() ?
     sizeof(MachO::nlist_64) :
     sizeof(MachO::nlist);
   Symb.p += SymbolTableEntrySize;
 }

 Expected<StringRef> MachOObjectFile::getSymbolName(DataRefImpl Symb) const {
   StringRef StringTable = getStringTableData();
   MachO::nlist_base Entry = getSymbolTableEntryBase(*this, Symb);
+  if (Entry.n_strx == 0)
+    // A n_strx value of 0 indicates that no name is associated with a
+    // particular symbol table entry.
+    return StringRef();
   const char *Start = &StringTable.data()[Entry.n_strx];
   if (Start < getData().begin() || Start >= getData().end()) {
     return malformedError("bad string index: " + Twine(Entry.n_strx) +
                           " for symbol at index " + Twine(getSymbolIndex(Symb)));
   }
   return StringRef(Start);
 }

test/tools/llvm-nm/X86/Inputs/macho-dwarf-x86_64

test/tools/llvm-nm/X86/macho-dwarf.test

This file was added.

# This file was constructed from 3 trivial source files and linked with macOS's
# ld64 linker.
#
# cc -gdwarf-2 -o foo.o -c foo.c
# cc -gdwarf-2 -o bar.o -c bar.c
# ld -r foo.o bar.o -o foobar.o
# cc -gdwarf-2 -o baz foobar.o baz.c

# RUN: llvm-nm -ap %p/Inputs/macho-dwarf-x86_64 | FileCheck -match-full-lines -strict-whitespace %s

# CHECK:000000000000002a - 01 0000 ENSYM
# CHECK:0000000000000010 - 01 0000 ENSYM
# CHECK:000000000000000b - 01 0000 ENSYM