
Commit 31b4531

Committed Sep 20, 2017
Introduce the llvm-cfi-verify tool (resubmission of D37937).
Summary:
Resubmission of D37937. Fixed i386 target building (conversion from
std::size_t& to uint64_t& failed). Fixed documentation warning failure about
docs/CFIVerify.rst not being in the tree.

Reviewers: vlad.tsyrklevich

Reviewed By: vlad.tsyrklevich

Patch by Mitch Phillips

Subscribers: sbc100, mgorny, pcc, llvm-commits, kcc

Differential Revision: https://reviews.llvm.org/D38089

llvm-svn: 313809
1 parent adc4bc6 commit 31b4531


6 files changed, +375 -1 lines changed


‎llvm/docs/CFIVerify.rst

+91
@@ -0,0 +1,91 @@
==============================================
Control Flow Verification Tool Design Document
==============================================

.. contents::
   :local:

Objective
=========

This document provides an overview of an external tool to verify the protection
mechanisms implemented by Clang's *Control Flow Integrity* (CFI) schemes
(``-fsanitize=cfi``). This tool, provided a binary or DSO, should infer whether
indirect control flow operations are protected by CFI, and should output these
results in a human-readable form.

This tool should also be added as part of Clang's continuous integration testing
framework, so that modifications to the compiler can be checked to ensure that
CFI protection schemes are still present in the final binary.

Location
========

This tool will be present as a part of the LLVM toolchain, and will reside in
the "/llvm/tools/llvm-cfi-verify" directory, relative to the LLVM trunk. It will
be tested in two ways:

- Unit tests to validate code sections, present in
  "/llvm/unittests/llvm-cfi-verify".
- Integration tests, present in "/llvm/tools/clang/test/LLVMCFIVerify". These
  integration tests live in clang and run as part of a continuous integration
  framework, ensuring that updates to the compiler that reduce CFI coverage on
  indirect control flow instructions are identified.

Background
==========

This tool will continuously validate that CFI directives are properly
implemented around all indirect control flows by analysing the output machine
code. The analysis of machine code is important as it ensures that any bugs
present in the linker or compiler do not subvert CFI protections in the final
shipped binary.

Unprotected indirect control flow instructions will be flagged for manual
review. These unexpected control flows may simply have not been accounted for in
the compiler implementation of CFI (e.g. indirect jumps to facilitate switch
statements may not be fully protected).

It may be possible in the future to extend this tool to flag unnecessary CFI
directives (e.g. CFI directives around a static call to a non-polymorphic base
type). This type of directive has no security implications, but may carry a
performance cost.

Design Ideas
============

This tool will disassemble binaries and DSOs from their machine code format and
analyse the disassembled machine code. The tool will inspect virtual calls and
indirect function calls. This tool will also inspect indirect jumps, as inlined
functions and jump tables should also be subject to CFI protections. Non-virtual
calls (``-fsanitize=cfi-nvcall``) and cast checks (``-fsanitize=cfi-*cast*``)
are not covered, as the machine code does not carry enough information to check
them.

The tool would operate by searching for indirect control flow instructions in
the disassembly. A control flow graph would be generated from a small buffer of
the instructions surrounding the 'target' control flow instruction. If the
target instruction is branched-to, the fallthrough of the branch should be the
CFI trap (on x86, this is a ``ud2`` instruction). If the target instruction is
the fallthrough (i.e. immediately succeeds) of a conditional jump, the
conditional jump target should be the CFI trap. If an indirect control flow
instruction does not conform to one of these formats, the target will be noted
as being CFI-unprotected.
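
The two accepted shapes described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the tool's implementation: ``Kind``, ``Instr``, ``kNoTarget`` and ``isCfiProtected`` are hypothetical names, instructions are modelled as indices into a flat vector, and the fallthrough case is simplified to the instruction immediately preceding the target.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified instruction kinds. A real tool would derive these
// from the disassembler's instruction descriptions.
enum class Kind { Other, CondJump, Indirect, Trap };

// Marker for instructions that have no static jump target.
static const std::size_t kNoTarget = static_cast<std::size_t>(-1);

struct Instr {
  Kind kind;
  std::size_t target; // For CondJump: index of the instruction jumped to.
};

// Returns true if the indirect control flow instruction at `idx` matches one
// of the two shapes: it is the fallthrough of a conditional jump to a trap,
// or it is the destination of a conditional jump whose fallthrough is a trap.
bool isCfiProtected(const std::vector<Instr> &code, std::size_t idx) {
  if (idx >= code.size() || code[idx].kind != Kind::Indirect)
    return false;

  // Target is the fallthrough of a conditional jump: the jump's destination
  // must be the trap.
  if (idx > 0 && code[idx - 1].kind == Kind::CondJump) {
    std::size_t dest = code[idx - 1].target;
    if (dest < code.size() && code[dest].kind == Kind::Trap)
      return true;
  }

  // Target is branched-to: the fallthrough of that branch must be the trap.
  for (std::size_t i = 0; i + 1 < code.size(); ++i) {
    if (code[i].kind == Kind::CondJump && code[i].target == idx &&
        code[i + 1].kind == Kind::Trap)
      return true;
  }
  return false;
}
```

Any indirect instruction for which ``isCfiProtected`` returns false would be reported for manual review in this model.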

Note that in the second case outlined above (where the target instruction is the
fallthrough of a conditional jump), if the target represents a vcall that takes
arguments, these arguments may be pushed to the stack after the branch but
before the target instruction. In these cases, a secondary 'spill graph' is
constructed, to ensure the register argument used by the indirect jump/call is
not spilled from the stack at any point in the interim period. If there are no
spills that affect the target register, the target is marked as CFI-protected.
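
The interim check can be illustrated with a linear scan. This is only a sketch under simplifying assumptions: it walks a straight-line instruction range instead of building a graph, and ``InstrEffects`` and ``registerSurvives`` are hypothetical names, with registers modelled as plain integers.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model: the registers an instruction overwrites, for example
// when a value is reloaded from the stack.
struct InstrEffects {
  std::vector<unsigned> writes;
};

// Returns true if no instruction strictly between the conditional branch at
// `branchIdx` and the indirect instruction at `targetIdx` overwrites `reg`.
bool registerSurvives(const std::vector<InstrEffects> &code,
                      std::size_t branchIdx, std::size_t targetIdx,
                      unsigned reg) {
  for (std::size_t i = branchIdx + 1; i < targetIdx && i < code.size(); ++i)
    for (unsigned written : code[i].writes)
      if (written == reg)
        return false;
  return true;
}
```

In this model, a target whose register survives the interim instructions would be marked CFI-protected, and any other target would be flagged.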

Other Design Notes
~~~~~~~~~~~~~~~~~~

Only machine code sections that are marked as executable will be subject to this
analysis. Non-executable sections do not require analysis as any execution
present in these sections has already violated the control flow integrity.
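
As a rough illustration of this rule, the sketch below filters a section list down to the executable entries. ``SectionInfo`` and ``sectionsToAnalyse`` are hypothetical names and the ``executable`` flag is assumed to be precomputed. In LLVM's object library, ``SectionRef::isText()`` may be a suitable query for it, though that choice is an assumption and not something this document specifies.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified view of a section header.
struct SectionInfo {
  std::string name;
  bool executable; // Assumed to be derived from the section's flags.
};

// Returns the names of the sections that should be analysed, in input order.
std::vector<std::string>
sectionsToAnalyse(const std::vector<SectionInfo> &sections) {
  std::vector<std::string> result;
  for (const SectionInfo &section : sections)
    if (section.executable)
      result.push_back(section.name);
  return result;
}
```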

Suitable extensions may be made at a later date to include analysis for indirect
control flow operations across DSO boundaries. Currently, these CFI features are
only experimental with an unstable ABI, making them unsuitable for analysis.

‎llvm/docs/index.rst

+5 -1
@@ -159,7 +159,7 @@ representation.
    misunderstood instruction.
 
 :doc:`Frontend/PerformanceTips`
-   A collection of tips for frontend authors on how to generate IR
+   A collection of tips for frontend authors on how to generate IR
    which LLVM is able to effectively optimize.
 
 :doc:`Docker`
@@ -281,6 +281,7 @@ For API clients and LLVM developers.
    XRayExample
    XRayFDRFormat
    PDB/index
+   CFIVerify
 
 :doc:`WritingAnLLVMPass`
    Information on how to write LLVM transformations and analyses.
@@ -411,6 +412,9 @@ For API clients and LLVM developers.
 :doc:`The Microsoft PDB File Format <PDB/index>`
    A detailed description of the Microsoft PDB (Program Database) file format.
 
+:doc:`CFIVerify`
+   A description of the verification tool for Control Flow Integrity.
+
 Development Process Documentation
 =================================
 
‎llvm/tools/LLVMBuild.txt

+1
@@ -25,6 +25,7 @@ subdirectories =
  llvm-as
  llvm-bcanalyzer
  llvm-cat
+ llvm-cfi-verify
  llvm-cov
  llvm-cvtres
  llvm-diff
+15
@@ -0,0 +1,15 @@
set(LLVM_LINK_COMPONENTS
  AllTargetsAsmPrinters
  AllTargetsAsmParsers
  AllTargetsDescs
  AllTargetsDisassemblers
  AllTargetsInfos
  MC
  MCParser
  Object
  Support
  )

add_llvm_tool(llvm-cfi-verify
  llvm-cfi-verify.cpp
  )
+22
@@ -0,0 +1,22 @@
;===- ./tools/llvm-cfi-verify/LLVMBuild.txt --------------------*- Conf -*--===;
;
;                     The LLVM Compiler Infrastructure
;
; This file is distributed under the University of Illinois Open Source
; License. See LICENSE.TXT for details.
;
;===------------------------------------------------------------------------===;
;
; This is an LLVMBuild description file for the components in this subdirectory.
;
; For more information on the LLVMBuild system, please see:
;
;   http://llvm.org/docs/LLVMBuild.html
;
;===------------------------------------------------------------------------===;

[component_0]
type = Tool
name = llvm-cfi-verify
parent = Tools
required_libraries = MC MCDisassembler MCParser Support all-targets
@@ -0,0 +1,241 @@
//===-- llvm-cfi-verify.cpp - CFI Verification tool for LLVM --------------===//
//
//                     The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
// This tool verifies Control Flow Integrity (CFI) instrumentation by static
// binary analysis. See the design document in /docs/CFIVerify.rst for more
// information.
//
// This tool is currently incomplete. It currently only does disassembly for
// object files, and searches through the code for indirect control flow
// instructions, printing them once found.
//
//===----------------------------------------------------------------------===//

#include "llvm/MC/MCAsmInfo.h"
#include "llvm/MC/MCContext.h"
#include "llvm/MC/MCDisassembler/MCDisassembler.h"
#include "llvm/MC/MCInst.h"
#include "llvm/MC/MCInstPrinter.h"
#include "llvm/MC/MCInstrAnalysis.h"
#include "llvm/MC/MCInstrDesc.h"
#include "llvm/MC/MCInstrInfo.h"
#include "llvm/MC/MCObjectFileInfo.h"
#include "llvm/MC/MCRegisterInfo.h"
#include "llvm/MC/MCSubtargetInfo.h"
#include "llvm/Object/Binary.h"
#include "llvm/Object/COFF.h"
#include "llvm/Object/ObjectFile.h"
#include "llvm/Support/Casting.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Support/MemoryBuffer.h"
#include "llvm/Support/TargetRegistry.h"
#include "llvm/Support/TargetSelect.h"
#include "llvm/Support/raw_ostream.h"

#include <cassert>
#include <cstdlib>

using namespace llvm;
using namespace llvm::object;

cl::opt<bool> ArgDumpSymbols("sym", cl::desc("Dump the symbol table."));
cl::opt<std::string> InputFilename(cl::Positional, cl::desc("<input file>"),
                                   cl::Required);

// Print every symbol in the object file with its value, name, and section.
static void printSymbols(const ObjectFile *Object) {
  for (const SymbolRef &Symbol : Object->symbols()) {
    outs() << "Symbol [" << format_hex_no_prefix(Symbol.getValue(), 2)
           << "] = ";

    auto SymbolName = Symbol.getName();
    if (SymbolName) {
      outs() << *SymbolName;
    } else {
      // An Expected<> in the error state must have its error consumed.
      consumeError(SymbolName.takeError());
      outs() << "UNKNOWN";
    }

    if (Symbol.getFlags() & SymbolRef::SF_Hidden)
      outs() << " .hidden";

    outs() << " (Section = ";

    auto SymbolSection = Symbol.getSection();
    if (SymbolSection) {
      StringRef SymbolSectionName;
      if ((*SymbolSection)->getName(SymbolSectionName))
        outs() << "UNKNOWN)";
      else
        outs() << SymbolSectionName << ")";
    } else {
      // As above, consume the error before discarding the Expected<>.
      consumeError(SymbolSection.takeError());
      outs() << "N/A)";
    }

    outs() << "\n";
  }
}

int main(int argc, char **argv) {
  cl::ParseCommandLineOptions(argc, argv);

  InitializeAllTargetInfos();
  InitializeAllTargetMCs();
  InitializeAllAsmParsers();
  InitializeAllDisassemblers();

  Expected<OwningBinary<Binary>> BinaryOrErr = createBinary(InputFilename);
  if (!BinaryOrErr) {
    // toString() consumes the error, which must not be left unhandled.
    errs() << "Failed to open file: " << toString(BinaryOrErr.takeError())
           << "\n";
    return EXIT_FAILURE;
  }

  Binary &Binary = *BinaryOrErr.get().getBinary();
  ObjectFile *Object = dyn_cast<ObjectFile>(&Binary);
  if (!Object) {
    errs() << "Disassembling of non-objects not currently supported.\n";
    return EXIT_FAILURE;
  }

  Triple TheTriple = Object->makeTriple();
  std::string TripleName = TheTriple.getTriple();
  std::string ArchName = "";
  std::string ErrorString;

  const Target *TheTarget =
      TargetRegistry::lookupTarget(ArchName, TheTriple, ErrorString);

  if (!TheTarget) {
    errs() << "Couldn't find target \"" << TheTriple.getTriple()
           << "\", failed with error: " << ErrorString << ".\n";
    return EXIT_FAILURE;
  }

  SubtargetFeatures Features = Object->getFeatures();

  std::unique_ptr<const MCRegisterInfo> RegisterInfo(
      TheTarget->createMCRegInfo(TripleName));
  if (!RegisterInfo) {
    errs() << "Failed to initialise RegisterInfo.\n";
    return EXIT_FAILURE;
  }

  std::unique_ptr<const MCAsmInfo> AsmInfo(
      TheTarget->createMCAsmInfo(*RegisterInfo, TripleName));
  if (!AsmInfo) {
    errs() << "Failed to initialise AsmInfo.\n";
    return EXIT_FAILURE;
  }

  std::string MCPU = "";
  std::unique_ptr<MCSubtargetInfo> SubtargetInfo(
      TheTarget->createMCSubtargetInfo(TripleName, MCPU, Features.getString()));
  if (!SubtargetInfo) {
    errs() << "Failed to initialise SubtargetInfo.\n";
    return EXIT_FAILURE;
  }

  std::unique_ptr<const MCInstrInfo> MII(TheTarget->createMCInstrInfo());
  if (!MII) {
    errs() << "Failed to initialise MII.\n";
    return EXIT_FAILURE;
  }

  MCObjectFileInfo MOFI;
  MCContext Context(AsmInfo.get(), RegisterInfo.get(), &MOFI);

  std::unique_ptr<const MCDisassembler> Disassembler(
      TheTarget->createMCDisassembler(*SubtargetInfo, Context));

  if (!Disassembler) {
    errs() << "No disassembler available for target.\n";
    return EXIT_FAILURE;
  }

  std::unique_ptr<const MCInstrAnalysis> MIA(
      TheTarget->createMCInstrAnalysis(MII.get()));

  std::unique_ptr<MCInstPrinter> Printer(
      TheTarget->createMCInstPrinter(TheTriple, AsmInfo->getAssemblerDialect(),
                                     *AsmInfo, *MII, *RegisterInfo));

  if (ArgDumpSymbols)
    printSymbols(Object);

  for (const SectionRef &Section : Object->sections()) {
    outs() << "Section [" << format_hex_no_prefix(Section.getAddress(), 2)
           << "] = ";
    StringRef SectionName;

    if (Section.getName(SectionName))
      outs() << "UNKNOWN.\n";
    else
      outs() << SectionName << "\n";

    StringRef SectionContents;
    if (Section.getContents(SectionContents)) {
      errs() << "Failed to retrieve section contents.\n";
      return EXIT_FAILURE;
    }

    MCInst Instruction;
    uint64_t InstructionSize = 0;

    ArrayRef<uint8_t> SectionBytes((const uint8_t *)SectionContents.data(),
                                   Section.getSize());

    for (uint64_t Byte = 0; Byte < Section.getSize();) {
      bool BadInstruction = false;

      // Disassemble the instruction.
      if (Disassembler->getInstruction(
              Instruction, InstructionSize, SectionBytes.drop_front(Byte), 0,
              nulls(), outs()) != MCDisassembler::Success) {
        BadInstruction = true;
      }

      // A failed decode may leave the size at zero. Always advance by at
      // least one byte so that the loop is guaranteed to terminate.
      if (InstructionSize == 0)
        InstructionSize = 1;

      Byte += InstructionSize;

      if (BadInstruction)
        continue;

      // Skip instructions that do not affect the control flow.
      const auto &InstrDesc = MII->get(Instruction.getOpcode());
      if (!InstrDesc.mayAffectControlFlow(Instruction, *RegisterInfo))
        continue;

      // Skip instructions that do not operate on register operands.
      bool UsesRegisterOperand = false;
      for (const auto &Operand : Instruction) {
        if (Operand.isReg())
          UsesRegisterOperand = true;
      }

      if (!UsesRegisterOperand)
        continue;

      // Print the instruction address.
      outs() << "    "
             << format_hex(Section.getAddress() + Byte - InstructionSize, 2)
             << ": ";

      // Print the instruction bytes.
      for (uint64_t i = 0; i < InstructionSize; ++i) {
        outs() << format_hex_no_prefix(SectionBytes[Byte - InstructionSize + i],
                                       2)
               << " ";
      }

      // Print the instruction.
      outs() << " | " << MII->getName(Instruction.getOpcode()) << " ";
      Instruction.dump_pretty(outs(), Printer.get());

      outs() << "\n";
    }
  }

  return EXIT_SUCCESS;
}
