Index: docs/LangRef.rst =================================================================== --- docs/LangRef.rst +++ docs/LangRef.rst @@ -628,6 +628,20 @@ Variables and aliases can have a :ref:`Thread Local Storage Model `. +If :ref:`splitpoint metadata ` is attached, a global variable may +be split into multiple objects, which are not required to appear consecutively +in memory. An *object reference* is a reference to a global variable, or a +constant ``bitcast`` or ``getelementptr`` expression whose first operand is +an object reference. If a global variable is split, any object references +pointing into the global variable will point into the appropriate object, +and any object reference pointing past the end of the global will point past +the end of the last object. + +Any pointer derived from an object reference that is not itself an object +reference may only be reliably used to access that object. As a consequence +of this, no optimization pass may transform any object reference into an +object reference that refers to a different object. + Syntax:: @ = [Linkage] [Visibility] [DLLStorageClass] [ThreadLocal] @@ -4889,6 +4903,75 @@ !1 = !{!"other ptr"} +'``type``' Metadata +^^^^^^^^^^^^^^^^^^^ + +See :doc:`TypeMetadata`. + +.. _splitpoint: + +'``splitpoint``' Metadata +^^^^^^^^^^^^^^^^^^^^^^^^^ + +This metadata can be used to annotate :ref:`global variables ` +with split points. Each split point specifies a byte offset within the +global, and allows the global variable to be split into multiple objects, +which are not required to appear consecutively in memory. + +Under the Itanium C++ ABI, Clang attaches split points at each boundary +between virtual tables in a virtual table group, because the virtual tables +are independent entities as far as a user of a virtual table within an object +is concerned, provided that the dynamic type is unknown.
A compiler cannot +legally generate code to adjust a virtual table pointer to point to another +virtual table in the virtual table group, as the primary virtual table will +have an unknown number of virtual functions. + +The optimizer will split a global variable at split points only if possible +and beneficial. + +A virtual table group can be split if it has local linkage. This guarantees +that nothing outside of the module can reference the virtual table group +directly. Typically, when LTO is used, the linker is able to internalize +most virtual tables. + +A split is beneficial if either of the whole-program devirtualization or +control flow integrity features is being used. In the former case, under +virtual constant propagation we are able to place propagated constants +directly in front of virtual tables of classes with multiple bases. In the +latter case, we can arrange virtual tables with multiple bases in a more +hierarchical order, which reduces the required amount of runtime data and +simplifies the required checks. + +Example: + +.. code-block:: llvm + + @global = internal constant [3 x i8* ()*] [i8* ()* @f, i8* ()* @g, i8* ()* @h], !splitpoint !0 + + define i8* @f() { + ret i8* bitcast (i8* ()** getelementptr ([3 x i8* ()*], [3 x i8* ()*]* @global, i32 0, i32 0) to i8*) + } + + define i8* @g() { + ret i8* bitcast (i8* ()** getelementptr ([3 x i8* ()*], [3 x i8* ()*]* @global, i32 0, i32 2) to i8*) + } + + !0 = !{i32 16} + +may be transformed into: + +.. code-block:: llvm + + @0 = private constant [2 x i8* ()*] [i8* ()* @f, i8* ()* @g] + @1 = private constant [1 x i8* ()*] [i8* ()* @h] + + define i8* @f() { + ret i8* bitcast ([2 x i8* ()*]* @0 to i8*) + } + + define i8* @g() { + ret i8* bitcast ([1 x i8* ()*]* @1 to i8*) + } Module Flags Metadata ===================== @@ -7535,6 +7618,11 @@ :ref:`Pointer Aliasing Rules ` section for more information. +Note that a global variable may be split into multiple objects if +``!splitpoint`` metadata is attached.
For more details, see the documentation +for :ref:`splitpoint metadata ` and :ref:`global variables +`. + The getelementptr instruction is often confusing. For some more insight into how it works, see :doc:`the getelementptr FAQ `. Index: include/llvm/IR/LLVMContext.h =================================================================== --- include/llvm/IR/LLVMContext.h +++ include/llvm/IR/LLVMContext.h @@ -69,6 +69,7 @@ MD_align = 17, // "align" MD_loop = 18, // "llvm.loop" MD_type = 19, // "type" + MD_splitpoint = 20, // "splitpoint" }; /// Known operand bundle tag IDs, which always have the same value. All Index: include/llvm/InitializePasses.h =================================================================== --- include/llvm/InitializePasses.h +++ include/llvm/InitializePasses.h @@ -137,6 +137,7 @@ void initializeGlobalDCELegacyPassPass(PassRegistry&); void initializeGlobalMergePass(PassRegistry&); void initializeGlobalOptLegacyPassPass(PassRegistry&); +void initializeGlobalSplitPass(PassRegistry&); void initializeGlobalsAAWrapperPassPass(PassRegistry&); void initializeGuardWideningLegacyPassPass(PassRegistry&); void initializeIPCPPass(PassRegistry&); Index: include/llvm/Transforms/IPO.h =================================================================== --- include/llvm/Transforms/IPO.h +++ include/llvm/Transforms/IPO.h @@ -225,6 +225,10 @@ /// metadata. ModulePass *createWholeProgramDevirtPass(); +/// This pass splits globals into pieces for the benefit of whole-program +/// devirtualization and control-flow integrity. +ModulePass *createGlobalSplitPass(); + //===----------------------------------------------------------------------===// // SampleProfilePass - Loads sample profile data from disk and generates // IR metadata to reflect the profile. 
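Note on the LangRef text above: the way byte-offset split points partition an array global can be sketched in plain C++. This is only an illustration under stated assumptions, not code from the patch. The helper name `computePieces` and its parameters are hypothetical, and the array is modeled by its element size and element count alone.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <set>
#include <utility>
#include <vector>

// Hypothetical helper, not part of the patch: converts !splitpoint byte
// offsets into half-open element ranges [begin, end), one range per piece.
// Returns std::nullopt when the global would be left unsplit.
std::optional<std::vector<std::pair<uint64_t, uint64_t>>>
computePieces(const std::vector<uint64_t> &ByteOffsets, uint64_t ElemSize,
              uint64_t NumElems) {
  assert(ElemSize > 0 && "element size must be nonzero");

  std::set<uint64_t> SplitPoints;
  for (uint64_t ByteOffset : ByteOffsets) {
    // A split point must fall on an element boundary...
    if (ByteOffset % ElemSize != 0)
      return std::nullopt;
    // ...and must lie inside the array.
    uint64_t Index = ByteOffset / ElemSize;
    if (Index >= NumElems)
      return std::nullopt;
    SplitPoints.insert(Index);
  }
  if (SplitPoints.empty())
    return std::nullopt;

  // The element count terminates the final piece.
  SplitPoints.insert(NumElems);

  std::vector<std::pair<uint64_t, uint64_t>> Pieces;
  uint64_t Last = 0;
  for (uint64_t SplitPoint : SplitPoints) {
    Pieces.emplace_back(Last, SplitPoint);
    Last = SplitPoint;
  }
  return Pieces;
}
```

For the LangRef example (three function pointers, assuming 8-byte pointers, and `!{i32 16}`), `computePieces({16}, 8, 3)` yields the ranges [0, 2) and [2, 3). Offsets such as 3 (not a multiple of the element size) or 48 (past the last element) yield no split. A split point at offset 0 would produce an empty first piece in this sketch.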
Index: lib/IR/LLVMContext.cpp =================================================================== --- lib/IR/LLVMContext.cpp +++ lib/IR/LLVMContext.cpp @@ -138,6 +138,10 @@ assert(TypeID == MD_type && "type kind id drifted"); (void)TypeID; + unsigned SplitPointID = getMDKindID("splitpoint"); + assert(SplitPointID == MD_splitpoint && "splitpoint kind id drifted"); + (void)SplitPointID; + auto *DeoptEntry = pImpl->getOrInsertBundleTag("deopt"); assert(DeoptEntry->second == LLVMContext::OB_deopt && "deopt operand bundle id drifted!"); Index: lib/Transforms/IPO/CMakeLists.txt =================================================================== --- lib/Transforms/IPO/CMakeLists.txt +++ lib/Transforms/IPO/CMakeLists.txt @@ -11,6 +11,7 @@ FunctionImport.cpp GlobalDCE.cpp GlobalOpt.cpp + GlobalSplit.cpp IPConstantPropagation.cpp IPO.cpp InferFunctionAttrs.cpp Index: lib/Transforms/IPO/GlobalSplit.cpp =================================================================== --- /dev/null +++ lib/Transforms/IPO/GlobalSplit.cpp @@ -0,0 +1,202 @@ +//===- GlobalSplit.cpp - global variable splitter -------------------------===// +// +// The LLVM Compiler Infrastructure +// +// This file is distributed under the University of Illinois Open Source +// License. See LICENSE.TXT for details. +// +//===----------------------------------------------------------------------===// +// +// This pass uses !splitpoint metadata to split globals where beneficial. Clang +// currently attaches this metadata to virtual table globals under the Itanium +// ABI for the benefit of the whole-program virtual call optimization and +// control flow integrity passes. 
+// +//===----------------------------------------------------------------------===// + +#include "llvm/Transforms/IPO.h" +#include "llvm/IR/Constants.h" +#include "llvm/IR/GlobalVariable.h" +#include "llvm/IR/Intrinsics.h" +#include "llvm/IR/Module.h" +#include "llvm/IR/Operator.h" +#include "llvm/Pass.h" + +#include + +using namespace llvm; + +namespace { + +bool splitGlobal(GlobalVariable &GV) { + // If the address of the global is taken outside of the module, we cannot + // apply this transformation. + if (!GV.hasLocalLinkage()) + return false; + + // We currently only know how to split ConstantArrays. + auto *Init = dyn_cast_or_null(GV.getInitializer()); + if (!Init) + return false; + + const DataLayout &DL = GV.getParent()->getDataLayout(); + + Type *ElemTy = Init->getType()->getElementType(); + uint64_t ElemSize = DL.getTypeAllocSize(ElemTy); + + SmallVector SplitPointMD, Types; + GV.getMetadata(LLVMContext::MD_splitpoint, SplitPointMD); + GV.getMetadata(LLVMContext::MD_type, Types); + + // Build the set of offsets at which we need to split. + std::set SplitPoints; + for (MDNode *SplitPoint : SplitPointMD) { + uint64_t ByteOffset = + cast( + cast(SplitPoint->getOperand(0))->getValue()) + ->getZExtValue(); + if (ByteOffset % ElemSize) + return false; + size_t Index = ByteOffset / ElemSize; + if (Index >= Init->getNumOperands()) + return false; + SplitPoints.insert(Index); + } + + if (SplitPoints.empty()) + return false; + + IntegerType *Int8Ty = Type::getInt8Ty(GV.getContext()); + PointerType *Int8PtrTy = Type::getInt8PtrTy(GV.getContext()); + IntegerType *Int32Ty = Type::getInt32Ty(GV.getContext()); + + // Each element of SplitPoints corresponds to the last operand of a "piece" + // of the original initializer. This element terminates the final piece. + SplitPoints.insert(Init->getNumOperands()); + + std::map SplitGlobals; + uint64_t LastSplitPoint = 0; + for (auto SplitPoint : SplitPoints) { + // Build a global representing this split piece. 
+ std::vector Ops; + for (unsigned Op = LastSplitPoint; Op != SplitPoint; ++Op) + Ops.push_back(Init->getOperand(Op)); + ArrayType *Ty = ArrayType::get(ElemTy, Ops.size()); + Constant *NewInit = ConstantArray::get(Ty, Ops); + auto *SplitGV = new GlobalVariable(*GV.getParent(), Ty, GV.isConstant(), + GlobalValue::PrivateLinkage, NewInit); + SplitGlobals[LastSplitPoint] = SplitGV; + + // Rebuild type metadata, adjusting by the split offset. + for (MDNode *Type : Types) { + uint64_t ByteOffset = cast( + cast(Type->getOperand(0))->getValue()) + ->getZExtValue(); + if (ByteOffset < LastSplitPoint * ElemSize || + ByteOffset >= SplitPoint * ElemSize) + continue; + SplitGV->addMetadata( + LLVMContext::MD_type, + *MDNode::get(GV.getContext(), + {ConstantAsMetadata::get(ConstantInt::get( + Int32Ty, ByteOffset - (LastSplitPoint * ElemSize))), + Type->getOperand(1)})); + } + + LastSplitPoint = SplitPoint; + } + + // Now, replace all uses of the original global with references to the split + // globals. This is a little tricky because we need to walk use lists in order + // to calculate offsets. + + std::function ApplyReplacements; + ApplyReplacements = [&](Constant *C, uint64_t Offset) { + // Replace all users first in order to ensure that we remain valid. + for (User *U : C->users()) { + // We only handle constant-offset references here. Any non-constant + // offsets are presumed to refer to the same global. + ConstantExpr *CE = dyn_cast(U); + if (!CE) + continue; + + switch (CE->getOpcode()) { + case Instruction::BitCast: + ApplyReplacements(CE, Offset); + break; + + case Instruction::GetElementPtr: { + APInt APOffset(DL.getPointerSizeInBits(), 0); + cast(CE)->accumulateConstantOffset(DL, APOffset); + ApplyReplacements(CE, Offset + APOffset.getZExtValue()); + break; + } + } + } + + // Find the global corresponding to the offset calculated so far.
+ auto I = SplitGlobals.upper_bound(Offset / ElemSize); + --I; + + // Build an offset reference to the global, adjusting for the calculated + // offset and the split offset. + Constant *Replacement = ConstantExpr::getBitCast( + ConstantExpr::getGetElementPtr( + Int8Ty, ConstantExpr::getBitCast(I->second, Int8PtrTy), + ArrayRef{ + ConstantInt::get(Int32Ty, Offset - I->first * ElemSize)}), + C->getType()); + + // Finally, perform the replacement. + C->replaceAllUsesWith(Replacement); + }; + + ApplyReplacements(&GV, 0); + + // Finally, remove the original global. + GV.eraseFromParent(); + return true; +} + +bool splitGlobals(Module &M) { + // First, see if the module uses either of the llvm.type.test or + // llvm.type.checked.load intrinsics, which indicates that splitting globals + // may be beneficial. + Function *TypeTestFunc = + M.getFunction(Intrinsic::getName(Intrinsic::type_test)); + Function *TypeCheckedLoadFunc = + M.getFunction(Intrinsic::getName(Intrinsic::type_checked_load)); + if ((!TypeTestFunc || TypeTestFunc->use_empty()) && + (!TypeCheckedLoadFunc || TypeCheckedLoadFunc->use_empty())) + return false; + + bool Changed = false; + for (auto I = M.global_begin(); I != M.global_end();) { + GlobalVariable &GV = *I; + ++I; + Changed |= splitGlobal(GV); + } + return Changed; +} + +struct GlobalSplit : public ModulePass { + static char ID; + GlobalSplit() : ModulePass(ID) { + initializeGlobalSplitPass(*PassRegistry::getPassRegistry()); + } + bool runOnModule(Module &M) { + if (skipModule(M)) + return false; + + return splitGlobals(M); + } +}; + +} + +INITIALIZE_PASS(GlobalSplit, "globalsplit", "Global splitter", false, false) +char GlobalSplit::ID = 0; + +ModulePass *llvm::createGlobalSplitPass() { + return new GlobalSplit; +} Index: lib/Transforms/IPO/IPO.cpp =================================================================== --- lib/Transforms/IPO/IPO.cpp +++ lib/Transforms/IPO/IPO.cpp @@ -31,6 +31,7 @@ initializeForceFunctionAttrsLegacyPassPass(Registry);
initializeGlobalDCELegacyPassPass(Registry); initializeGlobalOptLegacyPassPass(Registry); + initializeGlobalSplitPass(Registry); initializeIPCPPass(Registry); initializeAlwaysInlinerPass(Registry); initializeSimpleInlinerPass(Registry); Index: lib/Transforms/IPO/PassManagerBuilder.cpp =================================================================== --- lib/Transforms/IPO/PassManagerBuilder.cpp +++ lib/Transforms/IPO/PassManagerBuilder.cpp @@ -618,6 +618,11 @@ PM.add(createPostOrderFunctionAttrsLegacyPass()); PM.add(createReversePostOrderFunctionAttrsPass()); + // Split globals using !splitpoint metadata. This can help improve the quality + // of generated code when virtual constant propagation or control flow + // integrity are enabled. + PM.add(createGlobalSplitPass()); + // Apply whole-program devirtualization and virtual constant propagation. PM.add(createWholeProgramDevirtPass()); Index: test/Transforms/GlobalSplit/basic.ll =================================================================== --- /dev/null +++ test/Transforms/GlobalSplit/basic.ll @@ -0,0 +1,48 @@ +; RUN: opt -S -globalsplit %s | FileCheck %s + +target datalayout = "e-p:64:64" +target triple = "x86_64-unknown-linux-gnu" + +; CHECK: @vtt = constant [3 x i8*] [i8* bitcast ([2 x i8* ()*]* @0 to i8*), i8* getelementptr (i8, i8* bitcast ([2 x i8* ()*]* @0 to i8*), i32 8), i8* bitcast ([1 x i8* ()*]* @1 to i8*)] +@vtt = constant [3 x i8*] [ + i8* bitcast (i8* ()** getelementptr ([3 x i8* ()*], [3 x i8* ()*]* @global, i32 0, i32 0) to i8*), + i8* bitcast (i8* ()** getelementptr ([3 x i8* ()*], [3 x i8* ()*]* @global, i32 0, i32 1) to i8*), + i8* bitcast (i8* ()** getelementptr ([3 x i8* ()*], [3 x i8* ()*]* @global, i32 0, i32 2) to i8*) +] + +; CHECK-NOT: @global +; CHECK: @0 = private constant [2 x i8* ()*] [i8* ()* @f, i8* ()* @g], !type [[T1:![0-9]+$]] +; CHECK: @1 = private constant [1 x i8* ()*] [i8* ()* @h], !type [[T2:![0-9]+$]] +; CHECK-NOT: @global +@global = internal constant [3 x i8* ()*] 
[i8* ()* @f, i8* ()* @g, i8* ()* @h], !type !0, !type !1, !splitpoint !2 + +; CHECK: define i8* @f() +define i8* @f() { + ; CHECK-NEXT: ret i8* bitcast ([2 x i8* ()*]* @0 to i8*) + ret i8* bitcast (i8* ()** getelementptr ([3 x i8* ()*], [3 x i8* ()*]* @global, i32 0, i32 0) to i8*) +} + +; CHECK: define i8* @g() +define i8* @g() { + ; CHECK-NEXT: ret i8* getelementptr (i8, i8* bitcast ([2 x i8* ()*]* @0 to i8*), i32 8) + ret i8* bitcast (i8* ()** getelementptr ([3 x i8* ()*], [3 x i8* ()*]* @global, i32 0, i32 1) to i8*) +} + +; CHECK: define i8* @h() +define i8* @h() { + ; CHECK-NEXT: ret i8* bitcast ([1 x i8* ()*]* @1 to i8*) + ret i8* bitcast (i8* ()** getelementptr ([3 x i8* ()*], [3 x i8* ()*]* @global, i32 0, i32 2) to i8*) +} + +define void @foo() { + %p = call i1 @llvm.type.test(i8* null, metadata !"") + ret void +} + +declare i1 @llvm.type.test(i8*, metadata) nounwind readnone + +; CHECK: [[T1]] = !{i32 8, !"foo"} +; CHECK: [[T2]] = !{i32 0, !"bar"} +!0 = !{i32 8, !"foo"} +!1 = !{i32 16, !"bar"} +!2 = !{i32 16} Index: test/Transforms/GlobalSplit/invalid-offset.ll =================================================================== --- /dev/null +++ test/Transforms/GlobalSplit/invalid-offset.ll @@ -0,0 +1,32 @@ +; RUN: opt -S -globalsplit %s | FileCheck %s + +target datalayout = "e-p:64:64" +target triple = "x86_64-unknown-linux-gnu" + +; CHECK: @global1 = +@global1 = constant [3 x i8* ()*] [i8* ()* @f, i8* ()* @g, i8* ()* @h], !splitpoint !0 + +; CHECK: @global2 = +@global2 = constant [3 x i8* ()*] [i8* ()* @f, i8* ()* @g, i8* ()* @h], !splitpoint !1 + +define i8* @f() { + ret i8* null +} + +define i8* @g() { + ret i8* null +} + +define i8* @h() { + ret i8* null +} + +define void @foo() { + %p = call i1 @llvm.type.test(i8* null, metadata !"") + ret void +} + +declare i1 @llvm.type.test(i8*, metadata) nounwind readnone + +!0 = !{i32 3} +!1 = !{i32 48} Index: test/Transforms/GlobalSplit/non-beneficial.ll 
=================================================================== --- /dev/null +++ test/Transforms/GlobalSplit/non-beneficial.ll @@ -0,0 +1,21 @@ +; RUN: opt -S -globalsplit %s | FileCheck %s + +target datalayout = "e-p:64:64" +target triple = "x86_64-unknown-linux-gnu" + +; CHECK: @global = +@global = constant [3 x i8* ()*] [i8* ()* @f, i8* ()* @g, i8* ()* @h], !splitpoint !0 + +define i8* @f() { + ret i8* null +} + +define i8* @g() { + ret i8* null +} + +define i8* @h() { + ret i8* null +} + +!0 = !{i32 16} Index: test/Transforms/GlobalSplit/nonlocal.ll =================================================================== --- /dev/null +++ test/Transforms/GlobalSplit/nonlocal.ll @@ -0,0 +1,28 @@ +; RUN: opt -S -globalsplit %s | FileCheck %s + +target datalayout = "e-p:64:64" +target triple = "x86_64-unknown-linux-gnu" + +; CHECK: @global = +@global = constant [3 x i8* ()*] [i8* ()* @f, i8* ()* @g, i8* ()* @h], !splitpoint !0 + +define i8* @f() { + ret i8* null +} + +define i8* @g() { + ret i8* null +} + +define i8* @h() { + ret i8* null +} + +define void @foo() { + %p = call i1 @llvm.type.test(i8* null, metadata !"") + ret void +} + +declare i1 @llvm.type.test(i8*, metadata) nounwind readnone + +!0 = !{i32 16}
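Note on the reference rewriting in GlobalSplit.cpp: the lookup that `ApplyReplacements` performs with `SplitGlobals.upper_bound` can be sketched without any LLVM types. This is a minimal illustration under assumptions, not the pass itself. `PieceMap` and `locate` are hypothetical names, and each piece is represented by a plain integer id rather than a `GlobalVariable`.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>

// Hypothetical stand-in for the pass's SplitGlobals map: the key is the
// element index at which a piece starts, the value is an id for that piece.
using PieceMap = std::map<uint64_t, unsigned>;

// Maps a byte offset into the original global to
// {piece id, byte offset within that piece}.
std::pair<unsigned, uint64_t> locate(const PieceMap &Pieces,
                                     uint64_t ByteOffset, uint64_t ElemSize) {
  // The first piece always starts at element 0, so the decrement below can
  // never step before begin().
  assert(!Pieces.empty() && Pieces.begin()->first == 0 && ElemSize > 0);

  // upper_bound returns the first piece starting strictly after the element
  // being referenced; the piece just before it contains that element.
  auto I = Pieces.upper_bound(ByteOffset / ElemSize);
  --I;
  return {I->second, ByteOffset - I->first * ElemSize};
}
```

With the layout from basic.ll (pieces starting at elements 0 and 2, 8-byte elements), offsets 0 and 8 land in the first piece at offsets 0 and 8, and offset 16 lands at offset 0 of the second piece. An offset past the end of the original global, such as 24, resolves to an offset past the end of the last piece, which matches the LangRef wording about object references pointing past the end of the global.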