This is an archive of the discontinued LLVM Phabricator instance.

[cfi-verify] Support cross-DSO
ClosedPublic

Authored by jgalenson on Jul 16 2018, 9:31 AM.

Details

Summary

When used in cross-DSO mode, CFI will generate calls to special functions rather than trap instructions. For example, instead of generating

if (!InlinedFastCheck(f))

abort();

call *f

CFI generates

if (!InlinedFastCheck(f))

__cfi_slowpath(CallSiteTypeId, f);

call *f

This patch teaches cfi-verify to recognize calls to __cfi_slowpath and abort and treat them as trap functions.

In addition to normal symbols, we also parse the dynamic relocations to handle cross-DSO calls in libraries.

We also extend cfi-verify to recognize other patterns that occur using cross-DSO. For example, some indirect calls are not guarded by a branch to a trap but instead follow a call to __cfi_slowpath. For example:

if (!InlinedFastCheck(f))

call *f

else {

__cfi_slowpath(CallSiteTypeId, f);
call *f

}

In this case, the second call to f is not marked as protected by the current code. We thus recognize if indirect calls directly follow a call to a function that will trap on CFI violations and treat them as protected.

We also ignore indirect calls in the PLT, since on AArch64 each entry contains an indirect call that should not be protected by CFI, and these are labeled incorrectly when debug information is not present.

Diff Detail

Repository
rL LLVM

Event Timeline

jgalenson created this revision.Jul 16 2018, 9:31 AM

It would be really nice for llvm object library to support PLT detection/parsing.

Just for cfi-verify tool, we can cheat and link with -Wl,-emit-relocs. We already rely on line tables to tell data from code, so the tool does not work on fully stripped production executables anyway.

tools/llvm-cfi-verify/lib/FileAnalysis.cpp
563 ↗(On Diff #155696)

TrapName should be just a function name.
"@plt" is assembly convention, the user of the tool should not care whether the call is local or external.

576 ↗(On Diff #155696)

I'm not sure you can rely on the order of these sections being the same.

Generally speaking, you need to disassemble each PLT entry and find the indirect branch instruction that points to the address of the corresponding .rela.plt relocation.

This sounds complicated, but there are only a few _recommended_ PLT instruction sequences, usually found in SYSV architecture supplement documents.

Maybe relying on .rela.plt order is not such a bad idea after all, provided that all linkers behave the same way.

580 ↗(On Diff #155696)

.rela.plt can be found by DT_JMPREL entry in dynamic section.

It's very uncommon, but the section table can be stripped from the binary.

pcc added inline comments.Jul 24 2018, 5:02 PM
tools/llvm-cfi-verify/lib/FileAnalysis.cpp
55 ↗(On Diff #155696)

Since we know what all of the names of the "trap functions" are, does this really need to be a flag?

Also, __cfi_slowpath technically isn't a trap function so maybe there's a better name for this concept.

Thanks for the comments!

It would be really nice for llvm object library to support PLT detection/parsing.

Just for cfi-verify tool, we can cheat and link with -Wl,-emit-relocs. We already rely on line tables to tell data from code, so the tool does not work on fully stripped production executables anyway.

What exactly does -Wl,-emit-relocs do? Does it force the linker to keep the rela.plt section? There's not much we can do without that flag, is there?

tools/llvm-cfi-verify/lib/FileAnalysis.cpp
55 ↗(On Diff #155696)

Good point; I added this specifically for __cfi_slowpath@plt. If we want to hardcode that and other trap functions, that's fine by me. What other functions should I add? abort?

It's true that __cfi_slowpath isn't exactly a trap function, but it acts like one in the context of cfi-verify, as it will trap if and only if the call is invalid. I guess I can try to rename this to make that clearer (e.g., willTrapOnIllegalCall or something).

563 ↗(On Diff #155696)

True. But the whole rest of this function only looks for functions in the PLT, which end with a "@plt" suffix (see line 598, which I added to match GNU objdump's notation). So I just added this as an optimization to avoid this logic if it's not needed.

576 ↗(On Diff #155696)

I'm definitely not an expert on ELF files, and I couldn't find out whether these sections always have the same order (besides a random StackOverflow answer saying they did).

Disassembling the PLT does sound complicated, especially for something like this that should be as architecture-independent as possible.

Should we just assume they have the same order, or should we look into this more?

580 ↗(On Diff #155696)

Okay, I'll look into using the ELF::DT_JMPREL flag instead of this. There's no way to do something like that to find the PLT sections, is there?

If the section table is stripped, this code will fail to find the section and fail, so at least we don't get incorrect results.

--emit-relocs will generate a relocation at the call site, the same kind you find in object files.

tools/llvm-cfi-verify/lib/FileAnalysis.cpp
580 ↗(On Diff #155696)

On a second thought, I would not worry about executables w/o a section table, especially since there is no reliable way to find .plt there.

jgalenson updated this revision to Diff 157297.Jul 25 2018, 9:54 AM
jgalenson retitled this revision from [cfi-verify] Add an option to treat calls to a specified function as traps. to [cfi-verify] Support cross-DSO by treating certain calls as traps..
jgalenson edited the summary of this revision. (Show Details)

I replaced the command-line flag to specify functions with a hardcoded list of __cfi_slowpath, abort, and their @plt variants. Does that seem reasonable?

pcc added a comment.Jul 25 2018, 1:33 PM

I agree with eugenis that the PLT symbolization should live in lib/Object. That would also help us implement PLT symbolization in llvm-objdump. I was imagining that there would be a function that would take an object file and return a list of (PLT entry address, symbol) pairs. That would seem to be sufficient for llvm-objdump as well as for this code.

I was discussing PLT symbolization offline with eugenis and did a little more investigation. It turns out that it shouldn't be that hard to symbolize the PLT, at least not on aarch64. All of bfd/gold/lld start their PLT entries with a straightforward ADRP/LDR sequence, which you can pretty straighforwardly map onto a GOT entry and then onto a symbol.

tools/llvm-cfi-verify/lib/FileAnalysis.cpp
539 ↗(On Diff #157297)

Where does abort come from? In the diagnostic mode the failure path will end up calling __ubsan_handle_cfi_check_fail or __ubsan_handle_cfi_check_fail_abort.

eugenis added inline comments.Jul 25 2018, 1:55 PM
tools/llvm-cfi-verify/lib/FileAnalysis.cpp
539 ↗(On Diff #157297)

I think it comes from a combination of -fsanitize-trap and -ftrap-function=abort.

In D49383#1175738, @pcc wrote:

I agree with eugenis that the PLT symbolization should live in lib/Object. That would also help us implement PLT symbolization in llvm-objdump. I was imagining that there would be a function that would take an object file and return a list of (PLT entry address, symbol) pairs. That would seem to be sufficient for llvm-objdump as well as for this code.

I was discussing PLT symbolization offline with eugenis and did a little more investigation. It turns out that it shouldn't be that hard to symbolize the PLT, at least not on aarch64. All of bfd/gold/lld start their PLT entries with a straightforward ADRP/LDR sequence, which you can pretty straighforwardly map onto a GOT entry and then onto a symbol.

As I said initially, I did think putting this logic in core LLVM was the best way, but I still don't understand how to do it (other than the way I did). Do you have more details or a sample patch?

tools/llvm-cfi-verify/lib/FileAnalysis.cpp
539 ↗(On Diff #157297)

Yes, I figured it was a common function that might get used instead of a trap instruction. I can remove it.

I didn't add the diagnostic ones because I couldn't easily generate a test file using them and I didn't want to add them without a test.

I think a better API would be something like ObjectFile::symbols() - a content_iterator<PltRef>, where PltRef would be a DataRefImpl to the start of the current plt entry.
You don't really need to pull in the disassembler to parse the entry, you can just match bytes for known entry type(s) and extract the immediate offset.

On AArch64, searching for adrp/ldr pairs is still a heuristic; is there not a more principled way of doing it? What about x86? And how do we get the name of the function this way?

I'm not aware of a better way.
X86 will need to match x86 plt entry format. Luckily, it's just a few lines of code for each target.

Function name is obtained from .plt.rela relocation for the address extracted from adrp.

pcc added a comment.Jul 25 2018, 3:26 PM
In D49383#1175738, @pcc wrote:

I agree with eugenis that the PLT symbolization should live in lib/Object. That would also help us implement PLT symbolization in llvm-objdump. I was imagining that there would be a function that would take an object file and return a list of (PLT entry address, symbol) pairs. That would seem to be sufficient for llvm-objdump as well as for this code.

I was discussing PLT symbolization offline with eugenis and did a little more investigation. It turns out that it shouldn't be that hard to symbolize the PLT, at least not on aarch64. All of bfd/gold/lld start their PLT entries with a straightforward ADRP/LDR sequence, which you can pretty straighforwardly map onto a GOT entry and then onto a symbol.

As I said initially, I did think putting this logic in core LLVM was the best way, but I still don't understand how to do it (other than the way I did). Do you have more details or a sample patch?

There are three parts: disassembling the .plt section, finding the PLT entry points with corresponding virtual addresses for GOT entries and mapping them onto the dynamic relocations for the GOT. The first two parts are architecture specific and the third is mostly architecture independent (except for the relocation type for GOT entries). I was imagining that you would add a virtual function to MCInstrAnalysis:

std::vector<std::pair</*PLT VA*/uint64_t, /*GOT entry VA*/uint64_t>> findPltEntries(uint64_t PltSectionVA, ArrayRef<uint8_t> PltContents);

whose implementations would basically do something like this function:

http://llvm-cs.pcc.me.uk/tools/llvm-cfi-verify/lib/FileAnalysis.cpp#417

except that they would look for instructions that appear in PLT entries (i.e. ADRP/LDR instructions in the AArch64 implementation).

Then you'd add a function to ELFObjectFileBase

std::vector<std::pair</* symbol */DataRefImpl, /* PLT VA */uint64_t>> getPltAddresses()

that would

  1. find the .plt section by enumerating sections using section_iterator
  2. use getSectionContents to get the contents and then pass them to findPltEntries
  3. find .rela.plt or .rel.plt in the same way (or just in the same loop during step 1) and once you've found it use SectionRef::relocations to enumerate relocations
  4. from there you'll be able to get the "offset" (actually a VA in executables and DSOs), type and symbol. Check that the type is correct (correct meaning has the correct relocation type -- this could just be determined with a switch statement like
switch (something) {
case aarch64:
  return R_AARCH64_JUMP_SLOT;
....
}

) and if the offset/VA matches one of the GOT entry VAs that you previously got from findPltEntries then add the (symbol, PLT VA) pair to the vector that you're going to return.

pcc added a comment.Jul 25 2018, 3:32 PM

And agreed with eugenis that if you can avoid using the disassembler in findPltEntries that might be simpler.

pcc added a comment.Jul 25 2018, 3:36 PM

I'm not sure that DataRefImpl would be a better API because you would need to keep track of a lot of state between getting the info for each PLT entry and that could get complicated. DataRefImpl is better suited for the thin wrappers around object file data structures where you're just enumerating the elements of an array so the between-element state is smaller.

By the state to be tracked, do you mean an index of .got.plt relocations by address? I guess it could be too heavy for an iterator, which needs to be copyable.

pcc added a comment.Jul 25 2018, 3:45 PM

By the state to be tracked, do you mean an index of .got.plt relocations by address?

Yes as well as the list of PLT entries. Otherwise we'd end up with some of the state belonging to the target-dependent code (i.e. my findPltEntries would need to be replaced with APIs for getting the first/next PLT with state passed in) and that could get complicated.

Thanks for the explanations! pcc, that example API was very helpful. I have the core part of it working, including integration with cfi-verify and objdump. I'll finish it up and put the patches up for review at some point.

I have a few questions:

Is there a reason you suggested the getPltAddresses method should return a DataRefImpl instead of a SectionRef? The latter seems easier to use.

Is the reason to avoid having findPltEntries use the disassembler just to be more lightweight or avoid the extra dependency? Re-implementing even just a tiny portion of it seems a bit strange to me.

The 64-bit x86 PLT format seems pretty straightforward, but the 32-bit format seems to contain just an offset from the beginning of the GOT. Am I missing a simple way to use that to compute the GOT virtual address for a PLT entry, or do I need to extend the API to give it a way to compute the GOT base address? Also, I assume it's fine for X86MCInstrAnalysis to try both the 32 and 64-bit decodings to see if either matches.

pcc added a comment.Jul 27 2018, 2:33 PM

Thanks for the explanations! pcc, that example API was very helpful. I have the core part of it working, including integration with cfi-verify and objdump. I'll finish it up and put the patches up for review at some point.

Great!

I have a few questions:

Is there a reason you suggested the getPltAddresses method should return a DataRefImpl instead of a SectionRef? The latter seems easier to use.

You mean a SymbolRef? Yes, I think it should return one of those instead.

Is the reason to avoid having findPltEntries use the disassembler just to be more lightweight or avoid the extra dependency? Re-implementing even just a tiny portion of it seems a bit strange to me.

Main advantage is simplicity: there's no need to use a generic disassembler here when a simple specialized one would do.

The 64-bit x86 PLT format seems pretty straightforward, but the 32-bit format seems to contain just an offset from the beginning of the GOT. Am I missing a simple way to use that to compute the GOT virtual address for a PLT entry, or do I need to extend the API to give it a way to compute the GOT base address?

It seems fine to pass the .got address into findPltEntries as well.

Also, I assume it's fine for X86MCInstrAnalysis to try both the 32 and 64-bit decodings to see if either matches.

It seems better to pass a target triple or something into the function so that it knows which instruction set to use. There may also be architectures where this is a critical piece of information, such as architectures which support multiple instruction endiannesses.

jgalenson updated this revision to Diff 158853.Aug 2 2018, 3:40 PM
jgalenson retitled this revision from [cfi-verify] Support cross-DSO by treating certain calls as traps. to [cfi-verify] Support cross-DSO.
jgalenson edited the summary of this revision. (Show Details)

Updated to use the new scheme suggested by pcc and eugenis, which lets other parts of LVLM (e.g., objdump) use the PLT information.

jgalenson updated this revision to Diff 159581.Aug 7 2018, 1:07 PM

Added __cfi_slowpath_diag to the list of functions that trap on CFI violations.

Friendly ping.

pcc added inline comments.Aug 16 2018, 1:37 PM
tools/llvm-cfi-verify/lib/FileAnalysis.cpp
539 ↗(On Diff #157297)

I'd remove it since we shouldn't be trapping directly in cross-DSO mode, so even if a trap function is specified we shouldn't be emitting a call to it in connection with a CFI check.

jgalenson added inline comments.Aug 16 2018, 1:42 PM
tools/llvm-cfi-verify/lib/FileAnalysis.cpp
539 ↗(On Diff #157297)

You mean remove __cfi_slowpath_diag from this list? I actually want to keep it since I see calls to it in real binaries. It's also specified in https://clang.llvm.org/docs/ControlFlowIntegrityDesign.html#cfi-slowpath just next to __cfi_slowpath, which apparently calls __cfi_slowpath_diag.

pcc added inline comments.Aug 16 2018, 1:43 PM
tools/llvm-cfi-verify/lib/FileAnalysis.cpp
539 ↗(On Diff #157297)

Sorry, I meant abort.

eugenis added inline comments.Aug 16 2018, 1:46 PM
tools/llvm-cfi-verify/lib/FileAnalysis.cpp
539 ↗(On Diff #157297)

Why remove abort from the list? This tool is intended to check if the binary is safe while relying on the cfi implementation details as little as possible. Any function that does not return when the call is unsafe is fine in this list, IMHO.

pcc added inline comments.Aug 16 2018, 1:49 PM
tools/llvm-cfi-verify/lib/FileAnalysis.cpp
539 ↗(On Diff #157297)

Ah, that is indeed a good reason to keep it. I withdraw my request then.

jgalenson marked an inline comment as done.Aug 17 2018, 8:31 AM

Any other comments?

tools/llvm-cfi-verify/lib/FileAnalysis.cpp
539 ↗(On Diff #157297)

Yes, I didn't actually see anything using this, but I figured it would be correct, so I left it there.

pcc accepted this revision.Aug 17 2018, 10:54 AM

LGTM

This revision is now accepted and ready to land.Aug 17 2018, 10:54 AM
This revision was automatically updated to reflect the committed changes.
jgalenson marked an inline comment as done.