This is an archive of the discontinued LLVM Phabricator instance.

Change reference kind for FDE references the compiler already emitted
ClosedPublic

Authored by pete on Dec 10 2015, 6:38 PM.

Details

Summary

In processFDE, we currently early exit if there are any references. This is taken to mean the the compiler knew what it was doing so we trust it.

However, there were a couple of issues here.

The first is that these references may not have the correct kind. For example, we expect the function reference to be of type unwindFDEToFunction and so must change the kind of the compiler emitted reference to that. Similarly for CFI and LSDA.

The second issue was that we were stopping processing the FDE if the compiler emitted any references, but there's no guarantee that it emitted all of them. So the compiler may emit a func ref but we should still parse the FDE to generate the CFI and LSDA refs.

We now work out which refs we got from the compiler, then parse the FDE. We create those refs which the compiler did not emit, and any it did are converted to the correct kinds.

Diff Detail

Repository
rL LLVM

Event Timeline

pete updated this revision to Diff 42492.Dec 10 2015, 6:38 PM
pete retitled this revision from to Change reference kind for FDE references the compiler already emitted.
pete updated this object.
pete added a reviewer: lhames.
pete added a project: lld.
pete added a subscriber: kledzik.
kledzik added inline comments.Dec 10 2015, 6:52 PM
lib/ReaderWriter/MachO/MachONormalizedFileToAtoms.cpp
762–774 ↗(On Diff #42492)

You can also add asserts that the offset of the reference in the atom is the correct location in the compact unwind struct.

Actually, all these should not be assert()s. Malformed input files should generate a (descriptive) error message (not an assert).

824–828 ↗(On Diff #42492)

The other way to fix this is in getPairReferenceInfo() to check the type of "inAtom" and if compact unwind, generate an unwindFDEToFunction instead of delta32.

The difference between the two is that unwindFDEToFunction does not create a reloc in -r mode. That is, it upgrades the format to not need the relocation.

test/mach-o/eh-frame-relocs-arm64.yaml
1–3 ↗(On Diff #42492)

I like to add an extra round trip test here. That is, run lld again on %t to verify the new output (that should be same as original input) still passes the test.

pete added inline comments.Dec 11 2015, 10:20 AM
lib/ReaderWriter/MachO/MachONormalizedFileToAtoms.cpp
762–774 ↗(On Diff #42492)

Yeah, you're right.

There's a couple of asserts which were there already but which I can also convert to error messages. For example, we make sure that CIE references from the FDE point to the start of the CIE atom:

assert(!addend && "FDE's CIE field does not point at the start of a CIE.");
assert(cie && cie->contentType() == DefinedAtom::typeCFI &&

"FDE's CIE field does not point at the start of a CIE.");

In this case, I should actually also add that the contentType is CFI but also that we return true from ArchHandler::isDwarfCIE(isBig, atom).

In terms of the error checking, it will be hard to write the error checking tests without committing (some version of) this patch first. Would you be ok with me doing extensive error checking (including replacing most of the asserts) in follow up commits from this one?

824–828 ↗(On Diff #42492)

That actually sounds better to handle this in getPairReferenceInfo(). I took a look but its slightly more complicated than just checking the atom type. For example, CIE's and FDE's both get typeCFI, so when processing a delta64 we don't know if we are in an FDE with calling ArchHandler::isDwarfCIE(isBig, atom). That function is unfortunately long enough that I don't think I want to call it on all delta64's:

bool ArchHandler::isDwarfCIE(bool isBig, const DefinedAtom *atom) {

assert(atom->contentType() == DefinedAtom::typeCFI);
if (atom->rawContent().size() < sizeof(uint32_t))
  return false;
uint32_t size = read32(atom->rawContent().data(), isBig);

uint32_t idOffset = sizeof(uint32_t);
if (size == 0xffffffffU)
  idOffset += sizeof(uint64_t);

return read32(atom->rawContent().data() + idOffset, isBig) == 0;

}

Perhaps I could add a typeCIE and typeFDE and classify these early, and remove typeCFI?

test/mach-o/eh-frame-relocs-arm64.yaml
1–3 ↗(On Diff #42492)

Ah yeah, i've seen that in other tests. I'll add that.

kledzik added inline comments.Dec 11 2015, 1:21 PM
lib/ReaderWriter/MachO/MachONormalizedFileToAtoms.cpp
824–828 ↗(On Diff #42492)

The design issue this is dancing around is unwindFDEToFunction vs delta32. In terms of the fixup they do in the final executable, they are the same thing. The reason for the existence of unwindFDEToFunction is two-fold: 1) In -r mode it suppresses a relocation being added to __compact_unwind section (since it is not needed), 2) slightly easier to differentiate ref to function vs ref to CIE when walking references.

By having unwindFDEToFunction, we make the parsing of the .o file slightly harder, but the rest of the linker is easier.

In this case we don't need/want relocations on the __compact_unwind section. Maybe another approach is to not have unwindFDEToFunction (just use delta32), but have a new attribute on sections saying where they use relocations or not. That way applyFixupFinal() would consult that section attribute to decide whether to make a reloc or not.

pete updated this revision to Diff 44332.Jan 8 2016, 8:58 AM

Updated patch to address Nick's feedback about setting the kind of the reference in getPairReferenceInfo.

After r257099 (Don't emit relocs for the __eh_frame section as they can be implicit.), I've also had to update this patch to handle parsing the reloc in the CIE too, as it was otherwise being emitted as an implicit reloc (i.e., the reloc isn't edited at all, and the fixup bytes contain the offset to the personality function), but not parsed as an implicit reloc.

I'd like to handle more error checking in a follow up patch as this one has grown quite large already. Once the structure of this one is ok, i'll make sure all the asserts are errors and we have tests for them.

lhames accepted this revision.Feb 11 2016, 4:26 PM
lhames edited edge metadata.

Thanks Pete!

lib/ReaderWriter/MachO/MachONormalizedFileToAtoms.cpp
1001 ↗(On Diff #44332)

These blocks (this one and the previous ones that use currentRefGetter) look pretty regular. Could these be pulled out into some sort of "fixEHFrameAtomReference" function? That could probably contain the logic from currentRefGetter too.

Otherwise LGTM.

This revision is now accepted and ready to land.Feb 11 2016, 4:26 PM
This revision was automatically updated to reflect the committed changes.