This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
llvm/trunk/
-
trunk/
-
test/tools/llvm-cfi-verify/
-
tools/
-
llvm-cfi-verify/
-
AArch64/
-
Inputs/
-
function-only-check.o
-
function-only-check.s
-
X86/
-
Inputs/
-
function-only-check.o
-
function-only-check.s
-
tools/llvm-cfi-verify/lib/
-
llvm-cfi-verify/
-
lib/
-
FileAnalysis.h
-
FileAnalysis.cpp
-
GraphBuilder.cpp

Differential D49383

[cfi-verify] Support cross-DSO
ClosedPublic

Authored by jgalenson on Jul 16 2018, 9:31 AM.

Download Raw Diff

Details

Reviewers

pcc
vlad.tsyrklevich
javed.absar
eugenis

Commits

rG6cc0e63e2f02: [cfi-verify] Support cross-DSO
rL340612: [cfi-verify] Support cross-DSO

Summary

When used in cross-DSO mode, CFI will generate calls to special functions rather than trap instructions. For example, instead of generating

if (!InlinedFastCheck(f))

abort();

call *f

CFI generates

if (!InlinedFastCheck(f))

__cfi_slowpath(CallSiteTypeId, f);

call *f

This patch teaches cfi-verify to recognize calls to __cfi_slowpath and abort and treat them as trap functions.

In addition to normal symbols, we also parse the dynamic relocations to handle cross-DSO calls in libraries.

We also extend cfi-verify to recognize other patterns that occur using cross-DSO. For example, some indirect calls are not guarded by a branch to a trap but instead follow a call to __cfi_slowpath. For example:

if (!InlinedFastCheck(f))

call *f

else {

__cfi_slowpath(CallSiteTypeId, f);
call *f

}

In this case, the second call to f is not marked as protected by the current code. We thus recognize if indirect calls directly follow a call to a function that will trap on CFI violations and treat them as protected.

We also ignore indirect calls in the PLT, since on AArch64 each entry contains an indirect call that should not be protected by CFI, and these are labeled incorrectly when debug information is not present.

Diff Detail

Repository: rL LLVM

Event Timeline

jgalenson created this revision.Jul 16 2018, 9:31 AM

Herald added a reviewer: javed.absar. · View Herald TranscriptJul 16 2018, 9:31 AM

Herald added a subscriber: llvm-commits. · View Herald Transcript

jgalenson added a child revision: D49675: [cfi-verify] Detect more protected calls using cross-DSO..Jul 23 2018, 9:40 AM

jgalenson added a reviewer: eugenis.Jul 24 2018, 8:24 AM

It would be really nice for llvm object library to support PLT detection/parsing.

Just for cfi-verify tool, we can cheat and link with -Wl,-emit-relocs. We already rely on line tables to tell data from code, so the tool does not work on fully stripped production executables anyway.

tools/llvm-cfi-verify/lib/FileAnalysis.cpp
563 ↗	(On Diff #155696)	TrapName should be just a function name. "@plt" is assembly convention, the user of the tool should not care whether the call is local or external.
576 ↗	(On Diff #155696)	I'm not sure you can rely on the order of these sections being the same. Generally speaking, you need to disassemble each PLT entry and find the indirect branch instruction that points to the address of the corresponding .rela.plt relocation. This sounds complicated, but there are only a few _recommended_ PLT instruction sequences, usually found in SYSV architecture supplement documents. Maybe relying on .rela.plt order is not such a bad idea after all, provided that all linkers behave the same way.
580 ↗	(On Diff #155696)	.rela.plt can be found by DT_JMPREL entry in dynamic section. It's very uncommon, but the section table can be stripped from the binary.

pcc added inline comments.Jul 24 2018, 5:02 PM

tools/llvm-cfi-verify/lib/FileAnalysis.cpp
55 ↗	(On Diff #155696)	Since we know what all of the names of the "trap functions" are, does this really need to be a flag? Also, `__cfi_slowpath` technically isn't a trap function so maybe there's a better name for this concept.

Thanks for the comments!

In D49383#1174223, @eugenis wrote:

It would be really nice for llvm object library to support PLT detection/parsing.

Just for cfi-verify tool, we can cheat and link with -Wl,-emit-relocs. We already rely on line tables to tell data from code, so the tool does not work on fully stripped production executables anyway.

What exactly does -Wl,-emit-relocs do? Does it force the linker to keep the rela.plt section? There's not much we can do without that flag, is there?

tools/llvm-cfi-verify/lib/FileAnalysis.cpp
55 ↗	(On Diff #155696)	Good point; I added this specifically for __cfi_slowpath@plt. If we want to hardcode that and other trap functions, that's fine by me. What other functions should I add? abort? It's true that __cfi_slowpath isn't exactly a trap function, but it acts like one in the context of cfi-verify, as it will trap if and only if the call is invalid. I guess I can try to rename this to make that clearer (e.g., willTrapOnIllegalCall or something).
563 ↗	(On Diff #155696)	True. But the whole rest of this function only looks for functions in the PLT, which end with a "@plt" suffix (see line 598, which I added to match GNU objdump's notation). So I just added this as an optimization to avoid this logic if it's not needed.
576 ↗	(On Diff #155696)	I'm definitely not an expert on ELF files, and I couldn't find out whether these sections always have the same order (besides a random StackOverflow answer saying they did). Disassembling the PLT does sound complicated, especially for something like this that should be as architecture-independent as possible. Should we just assume they have the same order, or should we look into this more?
580 ↗	(On Diff #155696)	Okay, I'll look into using the ELF::DT_JMPREL flag instead of this. There's no way to do something like that to find the PLT sections, is there? If the section table is stripped, this code will fail to find the section and fail, so at least we don't get incorrect results.

--emit-relocs will generate a relocation at the call site, the same kind you find in object files.

tools/llvm-cfi-verify/lib/FileAnalysis.cpp
580 ↗	(On Diff #155696)	On a second thought, I would not worry about executables w/o a section table, especially since there is no reliable way to find .plt there.

I replaced the command-line flag to specify functions with a hardcoded list of __cfi_slowpath, abort, and their @plt variants. Does that seem reasonable?

I agree with eugenis that the PLT symbolization should live in lib/Object. That would also help us implement PLT symbolization in llvm-objdump. I was imagining that there would be a function that would take an object file and return a list of (PLT entry address, symbol) pairs. That would seem to be sufficient for llvm-objdump as well as for this code.

I was discussing PLT symbolization offline with eugenis and did a little more investigation. It turns out that it shouldn't be that hard to symbolize the PLT, at least not on aarch64. All of bfd/gold/lld start their PLT entries with a straightforward ADRP/LDR sequence, which you can pretty straighforwardly map onto a GOT entry and then onto a symbol.

tools/llvm-cfi-verify/lib/FileAnalysis.cpp
539 ↗	(On Diff #157297)	Where does `abort` come from? In the diagnostic mode the failure path will end up calling `__ubsan_handle_cfi_check_fail` or `__ubsan_handle_cfi_check_fail_abort`.

eugenis added inline comments.Jul 25 2018, 1:55 PM

tools/llvm-cfi-verify/lib/FileAnalysis.cpp
539 ↗	(On Diff #157297)	I think it comes from a combination of -fsanitize-trap and -ftrap-function=abort.

In D49383#1175738, @pcc wrote:

I agree with eugenis that the PLT symbolization should live in lib/Object. That would also help us implement PLT symbolization in llvm-objdump. I was imagining that there would be a function that would take an object file and return a list of (PLT entry address, symbol) pairs. That would seem to be sufficient for llvm-objdump as well as for this code.

I was discussing PLT symbolization offline with eugenis and did a little more investigation. It turns out that it shouldn't be that hard to symbolize the PLT, at least not on aarch64. All of bfd/gold/lld start their PLT entries with a straightforward ADRP/LDR sequence, which you can pretty straighforwardly map onto a GOT entry and then onto a symbol.

As I said initially, I did think putting this logic in core LLVM was the best way, but I still don't understand how to do it (other than the way I did). Do you have more details or a sample patch?

tools/llvm-cfi-verify/lib/FileAnalysis.cpp
539 ↗	(On Diff #157297)	Yes, I figured it was a common function that might get used instead of a trap instruction. I can remove it. I didn't add the diagnostic ones because I couldn't easily generate a test file using them and I didn't want to add them without a test.

I think a better API would be something like ObjectFile::symbols() - a content_iterator<PltRef>, where PltRef would be a DataRefImpl to the start of the current plt entry.
You don't really need to pull in the disassembler to parse the entry, you can just match bytes for known entry type(s) and extract the immediate offset.

On AArch64, searching for adrp/ldr pairs is still a heuristic; is there not a more principled way of doing it? What about x86? And how do we get the name of the function this way?

I'm not aware of a better way.
X86 will need to match x86 plt entry format. Luckily, it's just a few lines of code for each target.

Function name is obtained from .plt.rela relocation for the address extracted from adrp.

In D49383#1175843, @jgalenson wrote:

In D49383#1175738, @pcc wrote:

I agree with eugenis that the PLT symbolization should live in lib/Object. That would also help us implement PLT symbolization in llvm-objdump. I was imagining that there would be a function that would take an object file and return a list of (PLT entry address, symbol) pairs. That would seem to be sufficient for llvm-objdump as well as for this code.

I was discussing PLT symbolization offline with eugenis and did a little more investigation. It turns out that it shouldn't be that hard to symbolize the PLT, at least not on aarch64. All of bfd/gold/lld start their PLT entries with a straightforward ADRP/LDR sequence, which you can pretty straighforwardly map onto a GOT entry and then onto a symbol.

As I said initially, I did think putting this logic in core LLVM was the best way, but I still don't understand how to do it (other than the way I did). Do you have more details or a sample patch?

There are three parts: disassembling the .plt section, finding the PLT entry points with corresponding virtual addresses for GOT entries and mapping them onto the dynamic relocations for the GOT. The first two parts are architecture specific and the third is mostly architecture independent (except for the relocation type for GOT entries). I was imagining that you would add a virtual function to MCInstrAnalysis:

std::vector<std::pair</*PLT VA*/uint64_t, /*GOT entry VA*/uint64_t>> findPltEntries(uint64_t PltSectionVA, ArrayRef<uint8_t> PltContents);

whose implementations would basically do something like this function:

http://llvm-cs.pcc.me.uk/tools/llvm-cfi-verify/lib/FileAnalysis.cpp#417

except that they would look for instructions that appear in PLT entries (i.e. ADRP/LDR instructions in the AArch64 implementation).

Then you'd add a function to ELFObjectFileBase

std::vector<std::pair</* symbol */DataRefImpl, /* PLT VA */uint64_t>> getPltAddresses()

that would

find the .plt section by enumerating sections using section_iterator
use getSectionContents to get the contents and then pass them to findPltEntries
find .rela.plt or .rel.plt in the same way (or just in the same loop during step 1) and once you've found it use SectionRef::relocations to enumerate relocations
from there you'll be able to get the "offset" (actually a VA in executables and DSOs), type and symbol. Check that the type is correct (correct meaning has the correct relocation type -- this could just be determined with a switch statement like

switch (something) {
case aarch64:
  return R_AARCH64_JUMP_SLOT;
....
}

) and if the offset/VA matches one of the GOT entry VAs that you previously got from findPltEntries then add the (symbol, PLT VA) pair to the vector that you're going to return.

And agreed with eugenis that if you can avoid using the disassembler in findPltEntries that might be simpler.

I'm not sure that DataRefImpl would be a better API because you would need to keep track of a lot of state between getting the info for each PLT entry and that could get complicated. DataRefImpl is better suited for the thin wrappers around object file data structures where you're just enumerating the elements of an array so the between-element state is smaller.

By the state to be tracked, do you mean an index of .got.plt relocations by address? I guess it could be too heavy for an iterator, which needs to be copyable.

In D49383#1175928, @eugenis wrote:

By the state to be tracked, do you mean an index of .got.plt relocations by address?

Yes as well as the list of PLT entries. Otherwise we'd end up with some of the state belonging to the target-dependent code (i.e. my findPltEntries would need to be replaced with APIs for getting the first/next PLT with state passed in) and that could get complicated.

Thanks for the explanations! pcc, that example API was very helpful. I have the core part of it working, including integration with cfi-verify and objdump. I'll finish it up and put the patches up for review at some point.

I have a few questions:

Is there a reason you suggested the getPltAddresses method should return a DataRefImpl instead of a SectionRef? The latter seems easier to use.

Is the reason to avoid having findPltEntries use the disassembler just to be more lightweight or avoid the extra dependency? Re-implementing even just a tiny portion of it seems a bit strange to me.

The 64-bit x86 PLT format seems pretty straightforward, but the 32-bit format seems to contain just an offset from the beginning of the GOT. Am I missing a simple way to use that to compute the GOT virtual address for a PLT entry, or do I need to extend the API to give it a way to compute the GOT base address? Also, I assume it's fine for X86MCInstrAnalysis to try both the 32 and 64-bit decodings to see if either matches.

In D49383#1178807, @jgalenson wrote:

Thanks for the explanations! pcc, that example API was very helpful. I have the core part of it working, including integration with cfi-verify and objdump. I'll finish it up and put the patches up for review at some point.

Great!

I have a few questions:

Is there a reason you suggested the getPltAddresses method should return a DataRefImpl instead of a SectionRef? The latter seems easier to use.

You mean a SymbolRef? Yes, I think it should return one of those instead.

Is the reason to avoid having findPltEntries use the disassembler just to be more lightweight or avoid the extra dependency? Re-implementing even just a tiny portion of it seems a bit strange to me.

Main advantage is simplicity: there's no need to use a generic disassembler here when a simple specialized one would do.

The 64-bit x86 PLT format seems pretty straightforward, but the 32-bit format seems to contain just an offset from the beginning of the GOT. Am I missing a simple way to use that to compute the GOT virtual address for a PLT entry, or do I need to extend the API to give it a way to compute the GOT base address?

It seems fine to pass the .got address into findPltEntries as well.

Also, I assume it's fine for X86MCInstrAnalysis to try both the 32 and 64-bit decodings to see if either matches.

It seems better to pass a target triple or something into the function so that it knows which instruction set to use. There may also be architectures where this is a critical piece of information, such as architectures which support multiple instruction endiannesses.

Updated to use the new scheme suggested by pcc and eugenis, which lets other parts of LVLM (e.g., objdump) use the PLT information.

Herald added subscribers: JDevlieghere, kristof.beyls, aprantl. · View Herald TranscriptAug 2 2018, 3:40 PM

jgalenson added a parent revision: D50203: Find PLT entries for x86, x86_64, and AArch64.Aug 2 2018, 3:41 PM

Added __cfi_slowpath_diag to the list of functions that trap on CFI violations.

Friendly ping.

pcc added inline comments.Aug 16 2018, 1:37 PM

tools/llvm-cfi-verify/lib/FileAnalysis.cpp
539 ↗	(On Diff #157297)	I'd remove it since we shouldn't be trapping directly in cross-DSO mode, so even if a trap function is specified we shouldn't be emitting a call to it in connection with a CFI check.

jgalenson added inline comments.Aug 16 2018, 1:42 PM

tools/llvm-cfi-verify/lib/FileAnalysis.cpp
539 ↗	(On Diff #157297)	You mean remove `__cfi_slowpath_diag` from this list? I actually want to keep it since I see calls to it in real binaries. It's also specified in https://clang.llvm.org/docs/ControlFlowIntegrityDesign.html#cfi-slowpath just next to `__cfi_slowpath`, which apparently calls `__cfi_slowpath_diag`.

pcc added inline comments.Aug 16 2018, 1:43 PM

tools/llvm-cfi-verify/lib/FileAnalysis.cpp
539 ↗	(On Diff #157297)	Sorry, I meant `abort`.

eugenis added inline comments.Aug 16 2018, 1:46 PM

tools/llvm-cfi-verify/lib/FileAnalysis.cpp
539 ↗	(On Diff #157297)	Why remove abort from the list? This tool is intended to check if the binary is safe while relying on the cfi implementation details as little as possible. Any function that does not return when the call is unsafe is fine in this list, IMHO.

pcc added inline comments.Aug 16 2018, 1:49 PM

tools/llvm-cfi-verify/lib/FileAnalysis.cpp
539 ↗	(On Diff #157297)	Ah, that is indeed a good reason to keep it. I withdraw my request then.

Any other comments?

tools/llvm-cfi-verify/lib/FileAnalysis.cpp
539 ↗	(On Diff #157297)	Yes, I didn't actually see anything using this, but I figured it would be correct, so I left it there.

LGTM

This revision is now accepted and ready to land.Aug 17 2018, 10:54 AM

Closed by commit rL340612: [cfi-verify] Support cross-DSO (authored by jgalenson). · Explain WhyAug 24 2018, 8:23 AM

This revision was automatically updated to reflect the committed changes.

jgalenson marked an inline comment as done.

Revision Contents

Path

Size

llvm/

trunk/

test/

tools/

llvm-cfi-verify/

AArch64/

Inputs/

function-only-check.o

function-only-check.s

9 lines

X86/

Inputs/

function-only-check.o

function-only-check.s

8 lines

tools/

llvm-cfi-verify/

lib/

FileAnalysis.h

12 lines

FileAnalysis.cpp

56 lines

GraphBuilder.cpp

18 lines

Diff 162385

llvm/trunk/test/tools/llvm-cfi-verify/AArch64/Inputs/function-only-check.o

This is a binary file.

llvm/trunk/test/tools/llvm-cfi-verify/AArch64/function-only-check.s

				# RUN: llvm-cfi-verify -search-length-undef=6 %S/Inputs/function-only-check.o \| FileCheck %s

				# CHECK-LABEL: {{^Instruction: .* \(PROTECTED\)}}
				# CHECK-NEXT: tiny.cc:9

				# CHECK: Expected Protected: 1 (100.00%)
				# CHECK: Unexpected Protected: 0 (0.00%)
				# CHECK: Expected Unprotected: 0 (0.00%)
				# CHECK: Unexpected Unprotected (BAD): 0 (0.00%)

llvm/trunk/test/tools/llvm-cfi-verify/X86/Inputs/function-only-check.o

This is a binary file.

llvm/trunk/test/tools/llvm-cfi-verify/X86/function-only-check.s

				# RUN: llvm-cfi-verify %S/Inputs/function-only-check.o \| FileCheck %s

				# CHECK-LABEL: {{^Instruction: .* \(PROTECTED\)}}

				# CHECK: Expected Protected: 1 (100.00%)
				# CHECK: Unexpected Protected: 0 (0.00%)
				# CHECK: Expected Unprotected: 0 (0.00%)
				# CHECK: Unexpected Unprotected (BAD): 0 (0.00%)

llvm/trunk/tools/llvm-cfi-verify/lib/FileAnalysis.h

//===- FileAnalysis.h -------------------------------------------- C++ --===//		//===- FileAnalysis.h -------------------------------------------- C++ --===//
//		//
// The LLVM Compiler Infrastructure		// The LLVM Compiler Infrastructure
//		//
// This file is distributed under the University of Illinois Open Source		// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.		// License. See LICENSE.TXT for details.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#ifndef LLVM_CFI_VERIFY_FILE_ANALYSIS_H		#ifndef LLVM_CFI_VERIFY_FILE_ANALYSIS_H
#define LLVM_CFI_VERIFY_FILE_ANALYSIS_H		#define LLVM_CFI_VERIFY_FILE_ANALYSIS_H

#include "llvm/ADT/DenseMap.h"		#include "llvm/ADT/DenseMap.h"
		#include "llvm/ADT/SmallSet.h"
#include "llvm/BinaryFormat/ELF.h"		#include "llvm/BinaryFormat/ELF.h"
#include "llvm/DebugInfo/Symbolize/Symbolize.h"		#include "llvm/DebugInfo/Symbolize/Symbolize.h"
#include "llvm/MC/MCAsmInfo.h"		#include "llvm/MC/MCAsmInfo.h"
#include "llvm/MC/MCContext.h"		#include "llvm/MC/MCContext.h"
#include "llvm/MC/MCDisassembler/MCDisassembler.h"		#include "llvm/MC/MCDisassembler/MCDisassembler.h"
#include "llvm/MC/MCInst.h"		#include "llvm/MC/MCInst.h"
#include "llvm/MC/MCInstPrinter.h"		#include "llvm/MC/MCInstPrinter.h"
#include "llvm/MC/MCInstrAnalysis.h"		#include "llvm/MC/MCInstrAnalysis.h"
▲ Show 20 Lines • Show All 83 Lines • ▼ Show 20 Lines	public:
// respectively. Returns nullptr if the next/prev instruction doesn't exist,		// respectively. Returns nullptr if the next/prev instruction doesn't exist,
// or if the provided instruction doesn't exist.		// or if the provided instruction doesn't exist.
const Instr *getPrevInstructionSequential(const Instr &InstrMeta) const;		const Instr *getPrevInstructionSequential(const Instr &InstrMeta) const;
const Instr *getNextInstructionSequential(const Instr &InstrMeta) const;		const Instr *getNextInstructionSequential(const Instr &InstrMeta) const;

// Returns whether this instruction is used by CFI to trap the program.		// Returns whether this instruction is used by CFI to trap the program.
bool isCFITrap(const Instr &InstrMeta) const;		bool isCFITrap(const Instr &InstrMeta) const;

		// Returns whether this instruction is a call to a function that will trap on
		// CFI violations (i.e., it serves as a trap in this instance).
		bool willTrapOnCFIViolation(const Instr &InstrMeta) const;

// Returns whether this function can fall through to the next instruction.		// Returns whether this function can fall through to the next instruction.
// Undefined (and bad) instructions cannot fall through, and instruction that		// Undefined (and bad) instructions cannot fall through, and instruction that
// modify the control flow can only fall through if they are conditional		// modify the control flow can only fall through if they are conditional
// branches or calls.		// branches or calls.
bool canFallThrough(const Instr &InstrMeta) const;		bool canFallThrough(const Instr &InstrMeta) const;

// Returns the definitive next instruction. This is different from the next		// Returns the definitive next instruction. This is different from the next
// instruction sequentially as it will follow unconditional branches (assuming		// instruction sequentially as it will follow unconditional branches (assuming
▲ Show 20 Lines • Show All 57 Lines • ▼ Show 20 Lines	protected:

// Constructs and initialises members required for disassembly.		// Constructs and initialises members required for disassembly.
Error initialiseDisassemblyMembers();		Error initialiseDisassemblyMembers();

// Parses code sections from the internal object file. Saves them into the		// Parses code sections from the internal object file. Saves them into the
// internal members. Should only be called once by Create().		// internal members. Should only be called once by Create().
Error parseCodeSections();		Error parseCodeSections();

		// Parses the symbol table to look for the addresses of functions that will
		// trap on CFI violations.
		Error parseSymbolTable();

private:		private:
// Members that describe the input file.		// Members that describe the input file.
object::OwningBinary<object::Binary> Binary;		object::OwningBinary<object::Binary> Binary;
const object::ObjectFile *Object = nullptr;		const object::ObjectFile *Object = nullptr;
Triple ObjectTriple;		Triple ObjectTriple;
std::string ArchName;		std::string ArchName;
std::string MCPU;		std::string MCPU;
const Target *ObjectTarget = nullptr;		const Target *ObjectTarget = nullptr;
Show All 19 Lines	private:
std::map<uint64_t, Instr> Instructions;		std::map<uint64_t, Instr> Instructions;

// Contains a mapping between a specific address, and a list of instructions		// Contains a mapping between a specific address, and a list of instructions
// that use this address as a branch target (including call instructions).		// that use this address as a branch target (including call instructions).
DenseMap<uint64_t, std::vector<uint64_t>> StaticBranchTargetings;		DenseMap<uint64_t, std::vector<uint64_t>> StaticBranchTargetings;

// A list of addresses of indirect control flow instructions.		// A list of addresses of indirect control flow instructions.
std::set<uint64_t> IndirectInstructions;		std::set<uint64_t> IndirectInstructions;

		// The addresses of functions that will trap on CFI violations.
		SmallSet<uint64_t, 4> TrapOnFailFunctionAddresses;
};		};

class UnsupportedDisassembly : public ErrorInfo<UnsupportedDisassembly> {		class UnsupportedDisassembly : public ErrorInfo<UnsupportedDisassembly> {
public:		public:
static char ID;		static char ID;
std::string Text;		std::string Text;

UnsupportedDisassembly(StringRef Text);		UnsupportedDisassembly(StringRef Text);
Show All 9 Lines

llvm/trunk/tools/llvm-cfi-verify/lib/FileAnalysis.cpp

Show First 20 Lines • Show All 99 Lines • ▼ Show 20 Lines	Expected<FileAnalysis> FileAnalysis::Create(StringRef Filename) {

// Init the rest of the object.		// Init the rest of the object.
if (auto InitResponse = Analysis.initialiseDisassemblyMembers())		if (auto InitResponse = Analysis.initialiseDisassemblyMembers())
return std::move(InitResponse);		return std::move(InitResponse);

if (auto SectionParseResponse = Analysis.parseCodeSections())		if (auto SectionParseResponse = Analysis.parseCodeSections())
return std::move(SectionParseResponse);		return std::move(SectionParseResponse);

		if (auto SymbolTableParseResponse = Analysis.parseSymbolTable())
		return std::move(SymbolTableParseResponse);

return std::move(Analysis);		return std::move(Analysis);
}		}

FileAnalysis::FileAnalysis(object::OwningBinary<object::Binary> Binary)		FileAnalysis::FileAnalysis(object::OwningBinary<object::Binary> Binary)
: Binary(std::move(Binary)) {}		: Binary(std::move(Binary)) {}

FileAnalysis::FileAnalysis(const Triple &ObjectTriple,		FileAnalysis::FileAnalysis(const Triple &ObjectTriple,
const SubtargetFeatures &Features)		const SubtargetFeatures &Features)
▲ Show 20 Lines • Show All 44 Lines • ▼ Show 20 Lines
const Instr &FileAnalysis::getInstructionOrDie(uint64_t Address) const {		const Instr &FileAnalysis::getInstructionOrDie(uint64_t Address) const {
const auto &InstrKV = Instructions.find(Address);		const auto &InstrKV = Instructions.find(Address);
assert(InstrKV != Instructions.end() && "Address doesn't exist.");		assert(InstrKV != Instructions.end() && "Address doesn't exist.");
return InstrKV->second;		return InstrKV->second;
}		}

bool FileAnalysis::isCFITrap(const Instr &InstrMeta) const {		bool FileAnalysis::isCFITrap(const Instr &InstrMeta) const {
const auto &InstrDesc = MII->get(InstrMeta.Instruction.getOpcode());		const auto &InstrDesc = MII->get(InstrMeta.Instruction.getOpcode());
return InstrDesc.isTrap();		return InstrDesc.isTrap() \|\| willTrapOnCFIViolation(InstrMeta);
		}

		bool FileAnalysis::willTrapOnCFIViolation(const Instr &InstrMeta) const {
		const auto &InstrDesc = MII->get(InstrMeta.Instruction.getOpcode());
		if (!InstrDesc.isCall())
		return false;
		uint64_t Target;
		if (!MIA->evaluateBranch(InstrMeta.Instruction, InstrMeta.VMAddress,
		InstrMeta.InstructionSize, Target))
		return false;
		return TrapOnFailFunctionAddresses.count(Target) > 0;
}		}

bool FileAnalysis::canFallThrough(const Instr &InstrMeta) const {		bool FileAnalysis::canFallThrough(const Instr &InstrMeta) const {
if (!InstrMeta.Valid)		if (!InstrMeta.Valid)
return false;		return false;

if (isCFITrap(InstrMeta))		if (isCFITrap(InstrMeta))
return false;		return false;
▲ Show 20 Lines • Show All 249 Lines • ▼ Show 20 Lines	if (!LineInfoValid)
inconvertibleErrorCode());		inconvertibleErrorCode());
}		}

for (const object::SectionRef &Section : Object->sections()) {		for (const object::SectionRef &Section : Object->sections()) {
// Ensure only executable sections get analysed.		// Ensure only executable sections get analysed.
if (!(object::ELFSectionRef(Section).getFlags() & ELF::SHF_EXECINSTR))		if (!(object::ELFSectionRef(Section).getFlags() & ELF::SHF_EXECINSTR))
continue;		continue;

		// Avoid checking the PLT since it produces spurious failures on AArch64
		// when ignoring DWARF data.
		StringRef SectionName;
		if (!Section.getName(SectionName) && SectionName == ".plt")
		continue;

StringRef SectionContents;		StringRef SectionContents;
if (Section.getContents(SectionContents))		if (Section.getContents(SectionContents))
return make_error<StringError>("Failed to retrieve section contents",		return make_error<StringError>("Failed to retrieve section contents",
inconvertibleErrorCode());		inconvertibleErrorCode());

ArrayRef<uint8_t> SectionBytes((const uint8_t *)SectionContents.data(),		ArrayRef<uint8_t> SectionBytes((const uint8_t *)SectionContents.data(),
Section.getSize());		Section.getSize());
parseSectionContents(SectionBytes, Section.getAddress());		parseSectionContents(SectionBytes, Section.getAddress());
▲ Show 20 Lines • Show All 71 Lines • ▼ Show 20 Lines	void FileAnalysis::addInstruction(const Instr &Instruction) {
if (!KV.second) {		if (!KV.second) {
errs() << "Failed to add instruction at address "		errs() << "Failed to add instruction at address "
<< format_hex(Instruction.VMAddress, 2)		<< format_hex(Instruction.VMAddress, 2)
<< ": Instruction at this address already exists.\n";		<< ": Instruction at this address already exists.\n";
exit(EXIT_FAILURE);		exit(EXIT_FAILURE);
}		}
}		}

		Error FileAnalysis::parseSymbolTable() {
		// Functions that will trap on CFI violations.
		SmallSet<StringRef, 4> TrapOnFailFunctions;
		TrapOnFailFunctions.insert("__cfi_slowpath");
		TrapOnFailFunctions.insert("__cfi_slowpath_diag");
		TrapOnFailFunctions.insert("abort");

		// Look through the list of symbols for functions that will trap on CFI
		// violations.
		for (auto &Sym : Object->symbols()) {
		auto SymNameOrErr = Sym.getName();
		if (!SymNameOrErr)
		consumeError(SymNameOrErr.takeError());
		else if (TrapOnFailFunctions.count(*SymNameOrErr) > 0) {
		auto AddrOrErr = Sym.getAddress();
		if (!AddrOrErr)
		consumeError(AddrOrErr.takeError());
		else
		TrapOnFailFunctionAddresses.insert(*AddrOrErr);
		}
		}
		if (auto *ElfObject = dyn_cast<object::ELFObjectFileBase>(Object)) {
		for (const auto &Addr : ElfObject->getPltAddresses()) {
		object::SymbolRef Sym(Addr.first, Object);
		auto SymNameOrErr = Sym.getName();
		if (!SymNameOrErr)
		consumeError(SymNameOrErr.takeError());
		else if (TrapOnFailFunctions.count(*SymNameOrErr) > 0)
		TrapOnFailFunctionAddresses.insert(Addr.second);
		}
		}
		return Error::success();
		}

UnsupportedDisassembly::UnsupportedDisassembly(StringRef Text) : Text(Text) {}		UnsupportedDisassembly::UnsupportedDisassembly(StringRef Text) : Text(Text) {}

char UnsupportedDisassembly::ID;		char UnsupportedDisassembly::ID;
void UnsupportedDisassembly::log(raw_ostream &OS) const {		void UnsupportedDisassembly::log(raw_ostream &OS) const {
OS << "Could not initialise disassembler: " << Text;		OS << "Could not initialise disassembler: " << Text;
}		}

std::error_code UnsupportedDisassembly::convertToErrorCode() const {		std::error_code UnsupportedDisassembly::convertToErrorCode() const {
return std::error_code();		return std::error_code();
}		}

} // namespace cfi_verify		} // namespace cfi_verify
} // namespace llvm		} // namespace llvm

llvm/trunk/tools/llvm-cfi-verify/lib/GraphBuilder.cpp

Show First 20 Lines • Show All 305 Lines • ▼ Show 20 Lines	for (const auto *ParentMetaPtr : CFCrossRefs) {
else		else
BranchNode.Fallthrough = Address;		BranchNode.Fallthrough = Address;

HasValidCrossRef = true;		HasValidCrossRef = true;
buildFlowsToUndefined(Analysis, Result, BranchNode, ParentMeta);		buildFlowsToUndefined(Analysis, Result, BranchNode, ParentMeta);
Result.ConditionalBranchNodes.push_back(BranchNode);		Result.ConditionalBranchNodes.push_back(BranchNode);
}		}

		// When using cross-DSO, some indirect calls are not guarded by a branch to a
		// trap but instead follow a call to __cfi_slowpath. For example:
		// if (!InlinedFastCheck(f))
		// call *f
		// else {
		// __cfi_slowpath(CallSiteTypeId, f);
		// call *f
		// }
		// To mark the second call as protected, we recognize indirect calls that
		// directly follow calls to functions that will trap on CFI violations.
		if (CFCrossRefs.empty()) {
		const Instr *PrevInstr = Analysis.getPrevInstructionSequential(ChildMeta);
		if (PrevInstr && Analysis.willTrapOnCFIViolation(*PrevInstr)) {
		Result.IntermediateNodes[PrevInstr->VMAddress] = Address;
		HasValidCrossRef = true;
		}
		}

if (!HasValidCrossRef)		if (!HasValidCrossRef)
Result.OrphanedNodes.push_back(Address);		Result.OrphanedNodes.push_back(Address);

OpenedNodes.erase(Address);		OpenedNodes.erase(Address);
}		}

} // namespace cfi_verify		} // namespace cfi_verify
} // namespace llvm		} // namespace llvm

This is an archive of the discontinued LLVM Phabricator instance.

[cfi-verify] Support cross-DSOClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 162385

llvm/trunk/test/tools/llvm-cfi-verify/AArch64/Inputs/function-only-check.o

llvm/trunk/test/tools/llvm-cfi-verify/AArch64/function-only-check.s

llvm/trunk/test/tools/llvm-cfi-verify/X86/Inputs/function-only-check.o

llvm/trunk/test/tools/llvm-cfi-verify/X86/function-only-check.s

llvm/trunk/tools/llvm-cfi-verify/lib/FileAnalysis.h

llvm/trunk/tools/llvm-cfi-verify/lib/FileAnalysis.cpp

llvm/trunk/tools/llvm-cfi-verify/lib/GraphBuilder.cpp

[cfi-verify] Support cross-DSO
ClosedPublic