This is an archive of the discontinued LLVM Phabricator instance.

COFF: Friendlier undefined symbol errors.
ClosedPublic

Authored by pcc on Apr 9 2018, 4:36 PM.

Details

Summary

This change does three things:

  • Try to find the file and line number information of an undefined symbol reference by reading codeview debug info.
  • Try to find the name of the function or global variable with the undefined symbol reference by searching the object file's symbol table.
  • Prints the information in the same style as the ELF linker.

Event Timeline

pcc created this revision.Apr 9 2018, 4:36 PM
pcc added a comment.Apr 9 2018, 4:40 PM

The error messages look a little odd on non-Windows platforms:

ra/bin/lld-link: error: undefined symbol: ?foo@@YAHXZ
>>> referenced by \path\to\foo.cpp:4
>>>               foo.obj:(main)

(i.e. backslashes rather than slashes in the file path)

@zturner Would there be any downside to using slashes in our codeview output on non-Windows?

In D45467#1062509, @pcc wrote:

The error messages look a little odd on non-Windows platforms:

ra/bin/lld-link: error: undefined symbol: ?foo@@YAHXZ
>>> referenced by \path\to\foo.cpp:4
>>>               foo.obj:(main)

(i.e. backslashes rather than slashes in the file path)

@zturner Would there be any downside to using slashes in our codeview output on non-Windows?

We should probably use the native host slash for all paths. Reason being that MS tools need to be able to link our object files, and we don't have any control over how they handle slashes. Windows' tools handling of forward slashes in tools is pretty inconsistent, but in most cases doesn't work very well, so I would expect things to break if we have slashes (e.g. even if the linker doesn't break and dutifully copies a path with forward slashes into a PDB, the debugger could break).

I just noticed you said on non-Windows. Yea, I think the path should be whatever is native to the host you're building on.

pcc updated this revision to Diff 141776.Apr 9 2018, 6:26 PM
  • Remove getDebugChunks() check
  • Update and add tests
pcc retitled this revision from [wip] COFF: Friendlier undefined symbol errors. to COFF: Friendlier undefined symbol errors..Apr 9 2018, 6:26 PM
pcc edited the summary of this revision. (Show Details)
smeenai added a subscriber: smeenai.Apr 9 2018, 6:47 PM

I just noticed you said on non-Windows. Yea, I think the path should be whatever is native to the host you're building on.

We cross-compile for Windows on Linux. If there's a change here, it would be awesome if I was CC'd, so I could test it with our setup and make sure VS and WinDbg are still happy.

pcc added a comment.Apr 9 2018, 7:02 PM

I just noticed you said on non-Windows. Yea, I think the path should be whatever is native to the host you're building on.

We cross-compile for Windows on Linux. If there's a change here, it would be awesome if I was CC'd, so I could test it with our setup and make sure VS and WinDbg are still happy.

The change is D45473 and I've cc'd you.

I just noticed you said on non-Windows. Yea, I think the path should be whatever is native to the host you're building on.

We cross-compile for Windows on Linux. If there's a change here, it would be awesome if I was CC'd, so I could test it with our setup and make sure VS and WinDbg are still happy.

Cross compiling is a weird situation here, because the paths aren't going to make sense anyway (obviously you can't run your debugger on the same machine that the program was built on in that setup, therefore any paths in the PDB are necessarily going to be useless). So I think this should be a property of the machine you're building on (since that's the machine the source resides on, which is what the paths refer to), and have nothing to do with the target.

I was out the second half of last week, so can you catch me up on what happened with the other patch and what you're waiting on here? Is the path issue all settled and now you just want feedback on the error message printing?

ruiu added inline comments.Apr 16 2018, 3:18 PM
lld/COFF/Chunks.h
195–197

If we can always do this, should we store Relocs as an ArrayRef instead of iterator_range in the first place?

lld/COFF/PDB.cpp
1161

Please add a function comment.

1170–1186

Can you factor this code out as a separate function?

1188

Maybe you can factor out this loop as a new function that returns a line table for a given address?

lld/COFF/SymbolTable.cpp
68

Add a function comment.

77–79

Instead of storing two separate values, you might be able to just save the symbol.

108

Use error().

pcc added a comment.Apr 16 2018, 3:20 PM

I was out the second half of last week, so can you catch me up on what happened with the other patch and what you're waiting on here? Is the path issue all settled and now you just want feedback on the error message printing?

Yes, that patch landed and we now emit Unix-style paths on Unix hosts. I'm just waiting on review of the error message printing here.

zturner added inline comments.Apr 16 2018, 3:33 PM
lld/COFF/Chunks.h
195–197

Alternatively, why not just return an iterator_range?

lld/COFF/PDB.cpp
1193

Can we use a DenseMap here?

1210

Don't we already iterate the subsection list for other reasons in this file? Would it be worth piggybacking off of that to save the relevant info from the various subsections into each SectionChunk so that we can just do something like DbgC->Lines

lld/COFF/PDB.h
33–34

Can you take this argument by const& instead of by non-const pointer?

pcc updated this revision to Diff 142731.Apr 16 2018, 7:49 PM
pcc marked 7 inline comments as done.
  • Address review comments
lld/COFF/Chunks.h
195–197

Switching to ArrayRef made the code a little simpler and was done in D45714.

lld/COFF/PDB.cpp
1210

I don't think that would work. We read CV subsections elsewhere to create a PDB, but that needs to happen after we issue undefined symbol errors because it requires section contents to be relocated, which requires all sections to be laid out and all symbols to be defined. But if we're issuing an undefined symbol error then obviously not all symbols will be defined.

lld/COFF/PDB.h
33–34

Const seems fine but we usually pass pointers around in this part of the code for things like chunks.

zturner added a subscriber: pcc.Apr 16 2018, 8:47 PM

Is there a particular reason to use a pointer? Inside the function it isn’t
null checked so we’re already assuming non null.

pcc added a comment.Apr 16 2018, 8:50 PM

Is there a particular reason to use a pointer? Inside the function it isn’t
null checked so we’re already assuming non null.

It's just the local convention, but I can change this one if you feel strongly.

ruiu added inline comments.Apr 16 2018, 9:21 PM
lld/COFF/Chunks.h
195

Now we probably should just export Relocs instead of defining an accessor for it.

lld/COFF/PDB.cpp
1195

auto -> coff_relocation

1277–1278

Can you use real types instead of auto?

lld/COFF/SymbolTable.cpp
196–198

Something like this is I think more straightforward

errorOrWarn("undefined symbol: " + Sym->getName() + getSymbolLocations(File, SymIndex));
In D45467#1069519, @pcc wrote:

Is there a particular reason to use a pointer? Inside the function it isn’t
null checked so we’re already assuming non null.

It's just the local convention, but I can change this one if you feel strongly.

Eh, stick with the convention. Maybe in a follow up if I feel strongly enough I can throw up a patch to change the convention :)

pcc updated this revision to Diff 142736.Apr 16 2018, 10:09 PM
pcc marked 4 inline comments as done.
  • Address review comments
ruiu accepted this revision.Apr 17 2018, 3:32 PM

LGTM

lld/COFF/SymbolTable.cpp
115

auto -> Location

This revision is now accepted and ready to land.Apr 17 2018, 3:32 PM
This revision was automatically updated to reflect the committed changes.
pcc marked an inline comment as done.