Implement CFI for indirect calls via a member function pointer.
As with CFI on virtual and indirect calls, this implementation
tries to use program type information to make the checks as precise
as possible. It works as follows, where `C` is the name of the class
being defined or the target of a call, and the function type is
assumed to be `void()`.

For virtual calls:
- Attach type metadata to the addresses of function pointers in vtables
  (not the functions themselves) of type `void (B::*)()` for each `B`
  that is a recursive dynamic base class of `C`, including `C` itself.
  This type metadata has an annotation that the type is for virtual
  calls (to distinguish it from the non-virtual case).
- At the call site, check that the computed address of the function
  pointer in the vtable has type `void (C::*)()`.
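
As a rough illustration of the virtual case, consider a call through a pointer to a virtual member function. This is a sketch only: `Shape`, `Square`, and `callVirtual` are hypothetical names that do not appear in this commit, and the comments restate the scheme described above rather than any verified compiler output.

```cpp
#include <cassert>

// Hypothetical hierarchy, used only for illustration.
struct Shape {
  virtual ~Shape() = default;
  virtual int sides() { return 0; }
};

struct Square : Shape {
  int sides() override { return 4; }
};

// `pmf` has type `int (Shape::*)()`. Because `sides` is virtual, the call
// loads a function pointer from the object's vtable. Under the scheme
// described above, the address of that vtable slot would be checked for
// virtual-call type metadata of `int (Shape::*)()`. The slot in Square's
// vtable would be expected to carry it, since Shape is a dynamic base
// class of Square.
int callVirtual(Shape &s, int (Shape::*pmf)()) {
  assert(pmf != nullptr); // calling through a null member pointer is undefined
  return (s.*pmf)();
}
```

Calling `callVirtual` with a `Square` and `&Shape::sides` dispatches to `Square::sides`, which is the kind of call the virtual check is meant to allow.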

For non-virtual calls:
- Attach type metadata to each non-virtual member function whose address
  can be taken with a member function pointer. A function of type `void()`
  in class `C` is given each of the types `void (B::*)()` where `B` is a
  most-base class of `C`. A most-base class of `C` is defined as a
  recursive base class of `C`, including `C` itself, that does not have
  any bases.
- At the call site, check that the function pointer has one of the types
  `void (B::*)()` where `B` is a most-base class of `C`.
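
The "most-base class" rule can be sketched on a toy model of a class hierarchy. This is a simplified, standalone illustration, not the Clang implementation: `Record`, `collectMostBases`, and `mostBaseClasses` are hypothetical stand-ins for `CXXRecordDecl` and `CodeGenModule::getMostBaseClasses`, and a linear duplicate search stands in for `llvm::SetVector`.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a class declaration (hypothetical, not a Clang type).
struct Record {
  std::string name;
  std::vector<const Record *> bases; // direct bases, in declaration order
};

// Depth-first walk: a record with no bases is a most-base class. Each one
// is recorded once, in the order it is first reached.
static void collectMostBases(const Record *rd,
                             std::vector<const Record *> &out) {
  assert(rd != nullptr);
  if (rd->bases.empty() &&
      std::find(out.begin(), out.end(), rd) == out.end())
    out.push_back(rd);
  for (const Record *base : rd->bases)
    collectMostBases(base, out);
}

// Returns the most-base classes of `rd`, including `rd` itself if it has
// no bases.
std::vector<const Record *> mostBaseClasses(const Record &rd) {
  std::vector<const Record *> result;
  collectMostBases(&rd, result);
  return result;
}
```

For a hierarchy where `D` derives from `C` and `B`, and `C` derives from `A`, the most-base classes of `D` are `A` and `B`. Under the scheme above, a non-virtual `void()` member of `D` would then be given the types `void (A::*)()` and `void (B::*)()`.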

Differential Revision: https://reviews.llvm.org/D47567

llvm-svn: 335569
pcc committed Jun 26, 2018
1 parent 689e363 commit e44acad
Showing 20 changed files with 495 additions and 66 deletions.
30 changes: 30 additions & 0 deletions clang/docs/ControlFlowIntegrity.rst
@@ -66,6 +66,8 @@ Available schemes are:
 wrong dynamic type.
 - ``-fsanitize=cfi-icall``: Indirect call of a function with wrong dynamic
 type.
+- ``-fsanitize=cfi-mfcall``: Indirect call via a member function pointer with
+wrong dynamic type.

 You can use ``-fsanitize=cfi`` to enable all the schemes and use
 ``-fno-sanitize`` flag to narrow down the set of schemes as desired.
@@ -255,6 +257,34 @@ the identity of function pointers is maintained, and calls across shared
 library boundaries are no different from calls within a single program or
 shared library.

+Member Function Pointer Call Checking
+=====================================
+
+This scheme checks that indirect calls via a member function pointer
+take place using an object of the correct dynamic type. Specifically, we
+check that the dynamic type of the member function referenced by the member
+function pointer matches the "function pointer" part of the member function
+pointer, and that the member function's class type is related to the base
+type of the member function. This CFI scheme can be enabled on its own using
+``-fsanitize=cfi-mfcall``.
+
+The compiler will only emit a full CFI check if the member function pointer's
+base type is complete. This is because the complete definition of the base
+type contains information that is necessary to correctly compile the CFI
+check. To ensure that the compiler always emits a full CFI check, it is
+recommended to also pass the flag ``-fcomplete-member-pointers``, which
+enables a non-conforming language extension that requires member pointer
+base types to be complete if they may be used for a call.
+
+For this scheme to work, all translation units containing the definition
+of a virtual member function (whether inline or not), other than members
+of :ref:`blacklisted <cfi-blacklist>` types or types with public :doc:`LTO
+visibility <LTOVisibility>`, must be compiled with ``-flto`` or ``-flto=thin``
+enabled and be statically linked into the program.
+
+This scheme is currently not compatible with cross-DSO CFI or the
+Microsoft ABI.
+
 .. _cfi-blacklist:

 Blacklist
6 changes: 3 additions & 3 deletions clang/docs/LTOVisibility.rst
@@ -11,9 +11,9 @@ linkage unit's LTO unit is empty. Each linkage unit has only a single LTO unit.

 The LTO visibility of a class is used by the compiler to determine which
 classes the whole-program devirtualization (``-fwhole-program-vtables``) and
-control flow integrity (``-fsanitize=cfi-vcall``) features apply to. These
-features use whole-program information, so they require the entire class
-hierarchy to be visible in order to work correctly.
+control flow integrity (``-fsanitize=cfi-vcall`` and ``-fsanitize=cfi-mfcall``)
+features apply to. These features use whole-program information, so they
+require the entire class hierarchy to be visible in order to work correctly.

 If any translation unit in the program uses either of the whole-program
 devirtualization or control flow integrity features, it is effectively an ODR
5 changes: 3 additions & 2 deletions clang/include/clang/Basic/Sanitizers.def
@@ -104,12 +104,13 @@ SANITIZER("dataflow", DataFlow)
 SANITIZER("cfi-cast-strict", CFICastStrict)
 SANITIZER("cfi-derived-cast", CFIDerivedCast)
 SANITIZER("cfi-icall", CFIICall)
+SANITIZER("cfi-mfcall", CFIMFCall)
 SANITIZER("cfi-unrelated-cast", CFIUnrelatedCast)
 SANITIZER("cfi-nvcall", CFINVCall)
 SANITIZER("cfi-vcall", CFIVCall)
 SANITIZER_GROUP("cfi", CFI,
-CFIDerivedCast | CFIICall | CFIUnrelatedCast | CFINVCall |
-CFIVCall)
+CFIDerivedCast | CFIICall | CFIMFCall | CFIUnrelatedCast |
+CFINVCall | CFIVCall)

 // Safe Stack
 SANITIZER("safe-stack", SafeStack)
4 changes: 3 additions & 1 deletion clang/lib/CodeGen/CGClass.cpp
@@ -2688,7 +2688,9 @@ void CodeGenFunction::EmitVTablePtrCheck(const CXXRecordDecl *RD,
 SSK = llvm::SanStat_CFI_UnrelatedCast;
 break;
 case CFITCK_ICall:
-llvm_unreachable("not expecting CFITCK_ICall");
+case CFITCK_NVMFCall:
+case CFITCK_VMFCall:
+llvm_unreachable("unexpected sanitizer kind");
 }

 std::string TypeName = RD->getQualifiedNameAsString();
43 changes: 29 additions & 14 deletions clang/lib/CodeGen/CGVTables.cpp
@@ -1012,41 +1012,56 @@ void CodeGenModule::EmitVTableTypeMetadata(llvm::GlobalVariable *VTable,
 CharUnits PointerWidth =
 Context.toCharUnitsFromBits(Context.getTargetInfo().getPointerWidth(0));

-typedef std::pair<const CXXRecordDecl *, unsigned> TypeMetadata;
-std::vector<TypeMetadata> TypeMetadatas;
-// Create type metadata for each address point.
+typedef std::pair<const CXXRecordDecl *, unsigned> AddressPoint;
+std::vector<AddressPoint> AddressPoints;
 for (auto &&AP : VTLayout.getAddressPoints())
-TypeMetadatas.push_back(std::make_pair(
+AddressPoints.push_back(std::make_pair(
 AP.first.getBase(), VTLayout.getVTableOffset(AP.second.VTableIndex) +
 AP.second.AddressPointIndex));

-// Sort the type metadata for determinism.
-llvm::sort(TypeMetadatas.begin(), TypeMetadatas.end(),
-[this](const TypeMetadata &M1, const TypeMetadata &M2) {
-if (&M1 == &M2)
+// Sort the address points for determinism.
+llvm::sort(AddressPoints.begin(), AddressPoints.end(),
+[this](const AddressPoint &AP1, const AddressPoint &AP2) {
+if (&AP1 == &AP2)
 return false;

 std::string S1;
 llvm::raw_string_ostream O1(S1);
 getCXXABI().getMangleContext().mangleTypeName(
-QualType(M1.first->getTypeForDecl(), 0), O1);
+QualType(AP1.first->getTypeForDecl(), 0), O1);
 O1.flush();

 std::string S2;
 llvm::raw_string_ostream O2(S2);
 getCXXABI().getMangleContext().mangleTypeName(
-QualType(M2.first->getTypeForDecl(), 0), O2);
+QualType(AP2.first->getTypeForDecl(), 0), O2);
 O2.flush();

 if (S1 < S2)
 return true;
 if (S1 != S2)
 return false;

-return M1.second < M2.second;
+return AP1.second < AP2.second;
 });

-for (auto TypeMetadata : TypeMetadatas)
-AddVTableTypeMetadata(VTable, PointerWidth * TypeMetadata.second,
-TypeMetadata.first);
+ArrayRef<VTableComponent> Comps = VTLayout.vtable_components();
+for (auto AP : AddressPoints) {
+// Create type metadata for the address point.
+AddVTableTypeMetadata(VTable, PointerWidth * AP.second, AP.first);
+
+// The class associated with each address point could also potentially be
+// used for indirect calls via a member function pointer, so we need to
+// annotate the address of each function pointer with the appropriate member
+// function pointer type.
+for (unsigned I = 0; I != Comps.size(); ++I) {
+if (Comps[I].getKind() != VTableComponent::CK_FunctionPointer)
+continue;
+llvm::Metadata *MD = CreateMetadataIdentifierForVirtualMemPtrType(
+Context.getMemberPointerType(
+Comps[I].getFunctionDecl()->getType(),
+Context.getRecordType(AP.first).getTypePtr()));
+VTable->addTypeMetadata((PointerWidth * I).getQuantity(), MD);
+}
+}
 }
2 changes: 2 additions & 0 deletions clang/lib/CodeGen/CodeGenFunction.h
@@ -1765,6 +1765,8 @@ class CodeGenFunction : public CodeGenTypeCache {
 CFITCK_DerivedCast,
 CFITCK_UnrelatedCast,
 CFITCK_ICall,
+CFITCK_NVMFCall,
+CFITCK_VMFCall,
 };

 /// Derived is the presumed address of an object of type T after a
89 changes: 63 additions & 26 deletions clang/lib/CodeGen/CodeGenModule.cpp
@@ -1132,6 +1132,34 @@ static bool hasUnwindExceptions(const LangOptions &LangOpts) {
 return true;
 }

+static bool requiresMemberFunctionPointerTypeMetadata(CodeGenModule &CGM,
+const CXXMethodDecl *MD) {
+// Check that the type metadata can ever actually be used by a call.
+if (!CGM.getCodeGenOpts().LTOUnit ||
+!CGM.HasHiddenLTOVisibility(MD->getParent()))
+return false;
+
+// Only functions whose address can be taken with a member function pointer
+// need this sort of type metadata.
+return !MD->isStatic() && !MD->isVirtual() && !isa<CXXConstructorDecl>(MD) &&
+!isa<CXXDestructorDecl>(MD);
+}
+
+std::vector<const CXXRecordDecl *>
+CodeGenModule::getMostBaseClasses(const CXXRecordDecl *RD) {
+llvm::SetVector<const CXXRecordDecl *> MostBases;
+
+std::function<void (const CXXRecordDecl *)> CollectMostBases;
+CollectMostBases = [&](const CXXRecordDecl *RD) {
+if (RD->getNumBases() == 0)
+MostBases.insert(RD);
+for (const CXXBaseSpecifier &B : RD->bases())
+CollectMostBases(B.getType()->getAsCXXRecordDecl());
+};
+CollectMostBases(RD);
+return MostBases.takeVector();
+}
+
 void CodeGenModule::SetLLVMFunctionAttributesForDefinition(const Decl *D,
 llvm::Function *F) {
 llvm::AttrBuilder B;
@@ -1257,7 +1285,20 @@ void CodeGenModule::SetLLVMFunctionAttributesForDefinition(const Decl *D,
 // In the cross-dso CFI mode, we want !type attributes on definitions only.
 if (CodeGenOpts.SanitizeCfiCrossDso)
 if (auto *FD = dyn_cast<FunctionDecl>(D))
-CreateFunctionTypeMetadata(FD, F);
+CreateFunctionTypeMetadataForIcall(FD, F);
+
+// Emit type metadata on member functions for member function pointer checks.
+// These are only ever necessary on definitions; we're guaranteed that the
+// definition will be present in the LTO unit as a result of LTO visibility.
+auto *MD = dyn_cast<CXXMethodDecl>(D);
+if (MD && requiresMemberFunctionPointerTypeMetadata(*this, MD)) {
+for (const CXXRecordDecl *Base : getMostBaseClasses(MD->getParent())) {
+llvm::Metadata *Id =
+CreateMetadataIdentifierForType(Context.getMemberPointerType(
+MD->getType(), Context.getRecordType(Base).getTypePtr()));
+F->addTypeMetadata(0, Id);
+}
+}
 }

 void CodeGenModule::SetCommonAttributes(GlobalDecl GD, llvm::GlobalValue *GV) {
@@ -1378,13 +1419,14 @@ static void setLinkageForGV(llvm::GlobalValue *GV, const NamedDecl *ND) {
 GV->setLinkage(llvm::GlobalValue::ExternalWeakLinkage);
 }

-void CodeGenModule::CreateFunctionTypeMetadata(const FunctionDecl *FD,
-llvm::Function *F) {
+void CodeGenModule::CreateFunctionTypeMetadataForIcall(const FunctionDecl *FD,
+llvm::Function *F) {
 // Only if we are checking indirect calls.
 if (!LangOpts.Sanitize.has(SanitizerKind::CFIICall))
 return;

-// Non-static class methods are handled via vtable pointer checks elsewhere.
+// Non-static class methods are handled via vtable or member function pointer
+// checks elsewhere.
 if (isa<CXXMethodDecl>(FD) && !cast<CXXMethodDecl>(FD)->isStatic())
 return;

@@ -1476,7 +1518,7 @@ void CodeGenModule::SetFunctionAttributes(GlobalDecl GD, llvm::Function *F,
 // Don't emit entries for function declarations in the cross-DSO mode. This
 // is handled with better precision by the receiving DSO.
 if (!CodeGenOpts.SanitizeCfiCrossDso)
-CreateFunctionTypeMetadata(FD, F);
+CreateFunctionTypeMetadataForIcall(FD, F);

 if (getLangOpts().OpenMP && FD->hasAttr<OMPDeclareSimdDeclAttr>())
 getOpenMPRuntime().emitDeclareSimdFunction(FD, F);
@@ -4925,15 +4967,18 @@ void CodeGenModule::EmitOMPThreadPrivateDecl(const OMPThreadPrivateDecl *D) {
 }
 }

-llvm::Metadata *CodeGenModule::CreateMetadataIdentifierForType(QualType T) {
-llvm::Metadata *&InternalId = MetadataIdMap[T.getCanonicalType()];
+llvm::Metadata *
+CodeGenModule::CreateMetadataIdentifierImpl(QualType T, MetadataTypeMap &Map,
+StringRef Suffix) {
+llvm::Metadata *&InternalId = Map[T.getCanonicalType()];
 if (InternalId)
 return InternalId;

 if (isExternallyVisible(T->getLinkage())) {
 std::string OutName;
 llvm::raw_string_ostream Out(OutName);
 getCXXABI().getMangleContext().mangleTypeName(T, Out);
+Out << Suffix;

 InternalId = llvm::MDString::get(getLLVMContext(), Out.str());
 } else {
@@ -4944,6 +4989,15 @@ llvm::Metadata *CodeGenModule::CreateMetadataIdentifierForType(QualType T) {
 return InternalId;
 }

+llvm::Metadata *CodeGenModule::CreateMetadataIdentifierForType(QualType T) {
+return CreateMetadataIdentifierImpl(T, MetadataIdMap, "");
+}
+
+llvm::Metadata *
+CodeGenModule::CreateMetadataIdentifierForVirtualMemPtrType(QualType T) {
+return CreateMetadataIdentifierImpl(T, VirtualMetadataIdMap, ".virtual");
+}
+
 // Generalize pointer types to a void pointer with the qualifiers of the
 // originally pointed-to type, e.g. 'const char *' and 'char * const *'
 // generalize to 'const void *' while 'char *' and 'const char **' generalize to
@@ -4977,25 +5031,8 @@ static QualType GeneralizeFunctionType(ASTContext &Ctx, QualType Ty) {
 }

 llvm::Metadata *CodeGenModule::CreateMetadataIdentifierGeneralized(QualType T) {
-T = GeneralizeFunctionType(getContext(), T);
-
-llvm::Metadata *&InternalId = GeneralizedMetadataIdMap[T.getCanonicalType()];
-if (InternalId)
-return InternalId;
-
-if (isExternallyVisible(T->getLinkage())) {
-std::string OutName;
-llvm::raw_string_ostream Out(OutName);
-getCXXABI().getMangleContext().mangleTypeName(T, Out);
-Out << ".generalized";
-
-InternalId = llvm::MDString::get(getLLVMContext(), Out.str());
-} else {
-InternalId = llvm::MDNode::getDistinct(getLLVMContext(),
-llvm::ArrayRef<llvm::Metadata *>());
-}
-
-return InternalId;
+return CreateMetadataIdentifierImpl(GeneralizeFunctionType(getContext(), T),
+GeneralizedMetadataIdMap, ".generalized");
 }

 /// Returns whether this module needs the "all-vtables" type identifier.
19 changes: 18 additions & 1 deletion clang/lib/CodeGen/CodeGenModule.h
@@ -503,6 +503,7 @@ class CodeGenModule : public CodeGenTypeCache {
 /// MDNodes.
 typedef llvm::DenseMap<QualType, llvm::Metadata *> MetadataTypeMap;
 MetadataTypeMap MetadataIdMap;
+MetadataTypeMap VirtualMetadataIdMap;
 MetadataTypeMap GeneralizedMetadataIdMap;

 public:
@@ -1232,13 +1233,18 @@ class CodeGenModule : public CodeGenTypeCache {
 /// internal identifiers).
 llvm::Metadata *CreateMetadataIdentifierForType(QualType T);

+/// Create a metadata identifier that is intended to be used to check virtual
+/// calls via a member function pointer.
+llvm::Metadata *CreateMetadataIdentifierForVirtualMemPtrType(QualType T);
+
 /// Create a metadata identifier for the generalization of the given type.
 /// This may either be an MDString (for external identifiers) or a distinct
 /// unnamed MDNode (for internal identifiers).
 llvm::Metadata *CreateMetadataIdentifierGeneralized(QualType T);

 /// Create and attach type metadata to the given function.
-void CreateFunctionTypeMetadata(const FunctionDecl *FD, llvm::Function *F);
+void CreateFunctionTypeMetadataForIcall(const FunctionDecl *FD,
+llvm::Function *F);

 /// Returns whether this module needs the "all-vtables" type identifier.
 bool NeedAllVtablesTypeId() const;
@@ -1247,6 +1253,14 @@ class CodeGenModule : public CodeGenTypeCache {
 void AddVTableTypeMetadata(llvm::GlobalVariable *VTable, CharUnits Offset,
 const CXXRecordDecl *RD);

+/// Return a vector of most-base classes for RD. This is used to implement
+/// control flow integrity checks for member function pointers.
+///
+/// A most-base class of a class C is defined as a recursive base class of C,
+/// including C itself, that does not have any bases.
+std::vector<const CXXRecordDecl *>
+getMostBaseClasses(const CXXRecordDecl *RD);
+
 /// Get the declaration of std::terminate for the platform.
 llvm::Constant *getTerminateFn();

@@ -1408,6 +1422,9 @@ class CodeGenModule : public CodeGenTypeCache {
 void ConstructDefaultFnAttrList(StringRef Name, bool HasOptnone,
 bool AttrOnCallSite,
 llvm::AttrBuilder &FuncAttrs);
+
+llvm::Metadata *CreateMetadataIdentifierImpl(QualType T, MetadataTypeMap &Map,
+StringRef Suffix);
 };

 } // end namespace CodeGen