Index: docs/LangRef.rst =================================================================== --- docs/LangRef.rst +++ docs/LangRef.rst @@ -562,6 +562,8 @@ LLVM allows an explicit section to be specified for globals. If the target supports it, it will emit globals to the section specified. +Additionally, the global can placed in a comdat if the target has the necessary +support. By default, global initializers are optimized by assuming that global variables defined within the module are not modified from their @@ -627,8 +629,9 @@ :ref:`parameter attribute ` for the return type, a function name, a (possibly empty) argument list (each with optional :ref:`parameter attributes `), optional :ref:`function attributes `, -an optional section, an optional alignment, an optional :ref:`garbage -collector name `, an optional :ref:`prefix `, an opening +an optional section, an optional alignment, +an optional :ref:`comdat `, +an optional :ref:`garbage collector name `, an optional :ref:`prefix `, an opening curly brace, a list of basic blocks, and a closing curly brace. LLVM function declarations consist of the "``declare``" keyword, an @@ -658,6 +661,7 @@ LLVM allows an explicit section to be specified for functions. If the target supports it, it will emit functions to the section specified. +Additionally, the function can placed in a COMDAT. An explicit alignment may be specified for a function. If not present, or if the alignment is set to zero, the alignment of the function is set @@ -673,8 +677,8 @@ define [linkage] [visibility] [DLLStorageClass] [cconv] [ret attrs] @ ([argument list]) - [unnamed_addr] [fn Attrs] [section "name"] [align N] - [gc] [prefix Constant] { ... } + [unnamed_addr] [fn Attrs] [section "name"] [comdat $] + [align N] [gc] [prefix Constant] { ... } .. _langref_aliases: @@ -716,6 +720,80 @@ * No global value in the expression can be a declaration, since that would require a relocation, which is not possible. +.. _langref_comdats: + +Comdats +------- + +Comdat IR provides access to COFF and ELF object file COMDAT functionality. + +Comdats have a name which represents the COMDAT key. All global objects which +specify this key will only end up in the final object file if the linker chooses +that key over some other key. Aliases are placed in the same COMDAT that their +aliasee computes to, if any. The global object may not have local linkage. + +Comdats have a selection kind to provide input on how the linker should +choose between keys in two different object files. + +Syntax:: + + $ = comdat SelectionKind + +The selection kind must be one of the following: + +``any`` + The linker may choose any COMDAT key, the choice is arbitrary. +``exactmatch`` + The linker may choose any COMDAT key but the sections must contain the + same data. +``largest`` + The linker will choose the section containing the largest COMDAT key. +``noduplicates`` + The linker requires that only section with this COMDAT key exist. +``samesize`` + The linker may choose any COMDAT key but the sections must contain the + same amount of data. + +Note that the Mach-O platform doesn't support COMDATs and ELF only supports +``any`` as a selection kind. + +Here is an example of a COMDAT group where a function will only be selected if +the COMDAT key's section is the largest: + +.. code-block:: llvm + + $foo = comdat largest + @foo = global i32 2, comdat $foo + + define void @bar() comdat $foo { + ret void + } + +In a COFF object file, this will create a COMDAT section with selection kind +``IMAGE_COMDAT_SELECT_LARGEST`` containing the contents of the ``@foo`` symbol +and another COMDAT section with selection kind +``IMAGE_COMDAT_SELECT_ASSOCIATIVE`` which is associated with the first COMDAT +section and contains the contents of the ``@baz`` symbol. + +The combined use of COMDATS and section attributes may yield surprising results. +For example: + +.. code-block:: llvm + + $foo = comdat any + $bar = comdat any + @g1 = global i32 42, section "sec", comdat $foo + @g2 = global i32 42, section "sec", comdat $bar + +From the object file perspective, this requires the creation of two sections +with the same name. This is necessary because both globals belong to different +COMDAT groups and COMDATs, at the object file level, are represented by +sections. + +Note that certain IR constructs like global variables and functions may create +COMDATs in the object file in addition to any which are specified using COMDAT +IR. This arises, for example, when a global variable has linkonce_odr linkage. + .. _namedmetadatastructure: Named Metadata Index: include/llvm/ADT/UniqueVector.h =================================================================== --- include/llvm/ADT/UniqueVector.h +++ include/llvm/ADT/UniqueVector.h @@ -22,13 +22,18 @@ /// class should have an implementation of operator== and of operator<. /// Entries can be fetched using operator[] with the entry ID. template class UniqueVector { +public: + typedef typename std::vector VectorType; + typedef typename VectorType::iterator iterator; + typedef typename VectorType::const_iterator const_iterator; + private: // Map - Used to handle the correspondence of entry to ID. std::map Map; // Vector - ID ordered vector of entries. Entries can be indexed by ID - 1. // - std::vector Vector; + VectorType Vector; public: /// insert - Append entry to the vector if it doesn't already exist. Returns @@ -68,6 +73,18 @@ return Vector[ID - 1]; } + /// \brief Return an iterator to the start of the vector. + iterator begin() { return Vector.begin(); } + + /// \brief Return an iterator to the start of the vector. + const_iterator begin() const { return Vector.begin(); } + + /// \brief Return an iterator to the end of the vector. + iterator end() { return Vector.end(); } + + /// \brief Return an iterator to the end of the vector. + const_iterator end() const { return Vector.end(); } + /// size - Returns the number of entries in the vector. /// size_t size() const { return Vector.size(); } Index: include/llvm/Bitcode/LLVMBitCodes.h =================================================================== --- include/llvm/Bitcode/LLVMBitCodes.h +++ include/llvm/Bitcode/LLVMBitCodes.h @@ -71,7 +71,8 @@ // MODULE_CODE_PURGEVALS: [numvals] MODULE_CODE_PURGEVALS = 10, - MODULE_CODE_GCNAME = 11 // GCNAME: [strchr x N] + MODULE_CODE_GCNAME = 11, // GCNAME: [strchr x N] + MODULE_CODE_COMDAT = 12, // COMDAT: [selection_kind, name] }; /// PARAMATTR blocks have code for defining a parameter attribute set. @@ -376,6 +377,14 @@ ATTR_KIND_JUMP_TABLE = 40 }; + enum ComdatSelectionKindCodes { + COMDAT_SELECTION_KIND_ANY = 1, + COMDAT_SELECTION_KIND_EXACT_MATCH = 2, + COMDAT_SELECTION_KIND_LARGEST = 3, + COMDAT_SELECTION_KIND_NO_DUPLICATES = 4, + COMDAT_SELECTION_KIND_SAME_SIZE = 5, + }; + } // End bitc namespace } // End llvm namespace Index: include/llvm/IR/Comdat.h =================================================================== --- /dev/null +++ include/llvm/IR/Comdat.h @@ -0,0 +1,66 @@ +//===-- llvm/IR/Comdat.h - Comdat definitions -------------------*- C++ -*-===// +// +// The LLVM Compiler Infrastructure +// +// This file is distributed under the University of Illinois Open Source +// License. See LICENSE.TXT for details. +// +//===----------------------------------------------------------------------===// +// +/// @file +/// This file contains the declaration of the Comdat class, which represents a +/// single COMDAT in LLVM. +// +//===----------------------------------------------------------------------===// + +#ifndef LLVM_IR_COMDAT_H +#define LLVM_IR_COMDAT_H + +#include "llvm/ADT/StringRef.h" +#include "llvm/Support/Compiler.h" + +namespace llvm { + +class raw_ostream; +template class StringMapEntry; + +// This is a Name X SelectionKind pair. The reason for having this be an +// independent object instead of just adding the name and the SelectionKind +// to a GlobalObject is that it is invalid to have two Comdats with the same +// name but different SelectionKind. This structure makes that unrepresentable. +class Comdat { +public: + enum SelectionKind { + Any, ///< The linker may choose any COMDAT. + ExactMatch, ///< The data referenced by the COMDAT must be the same. + Largest, ///< The linker will choose the largest COMDAT. + NoDuplicates, ///< No other Module may specify this COMDAT. + SameSize, ///< The data referenced by the COMDAT must be the same size. + }; + + Comdat(Comdat &&C); + SelectionKind getSelectionKind() const { return SK; } + void setSelectionKind(SelectionKind Val) { SK = Val; } + StringRef getName() const; + void print(raw_ostream &OS) const; + void dump() const; + +private: + friend class Module; + Comdat(); + Comdat(SelectionKind SK, StringMapEntry *Name); + Comdat(const Comdat &) LLVM_DELETED_FUNCTION; + + // Points to the map in Module. + StringMapEntry *Name; + SelectionKind SK; +}; + +inline raw_ostream &operator<<(raw_ostream &OS, const Comdat &C) { + C.print(OS); + return OS; +} + +} // end llvm namespace + +#endif Index: include/llvm/IR/GlobalAlias.h =================================================================== --- include/llvm/IR/GlobalAlias.h +++ include/llvm/IR/GlobalAlias.h @@ -87,6 +87,21 @@ return getOperand(0); } + const GlobalObject *getBaseObject() const { + return const_cast(this)->getBaseObject(); + } + GlobalObject *getBaseObject() { + return dyn_cast(getAliasee()->stripInBoundsOffsets()); + } + + const GlobalObject *getBaseObject(const DataLayout &DL, APInt &Offset) const { + return const_cast(this)->getBaseObject(DL, Offset); + } + GlobalObject *getBaseObject(const DataLayout &DL, APInt &Offset) { + return dyn_cast( + getAliasee()->stripAndAccumulateInBoundsConstantOffsets(DL, Offset)); + } + static bool isValidLinkage(LinkageTypes L) { return isExternalLinkage(L) || isLocalLinkage(L) || isWeakLinkage(L) || isLinkOnceLinkage(L); Index: include/llvm/IR/GlobalObject.h =================================================================== --- include/llvm/IR/GlobalObject.h +++ include/llvm/IR/GlobalObject.h @@ -20,7 +20,7 @@ #include "llvm/IR/GlobalValue.h" namespace llvm { - +class Comdat; class Module; class GlobalObject : public GlobalValue { @@ -29,11 +29,12 @@ protected: GlobalObject(Type *Ty, ValueTy VTy, Use *Ops, unsigned NumOps, LinkageTypes Linkage, const Twine &Name) - : GlobalValue(Ty, VTy, Ops, NumOps, Linkage, Name) { + : GlobalValue(Ty, VTy, Ops, NumOps, Linkage, Name), ObjComdat(nullptr) { setGlobalValueSubClassData(0); } std::string Section; // Section to emit this into, empty means default + Comdat *ObjComdat; public: unsigned getAlignment() const { return (1u << getGlobalValueSubClassData()) >> 1; @@ -44,6 +45,11 @@ const char *getSection() const { return Section.c_str(); } void setSection(StringRef S); + bool hasComdat() const { return getComdat() != nullptr; } + const Comdat *getComdat() const { return ObjComdat; } + Comdat *getComdat() { return ObjComdat; } + void setComdat(Comdat *C) { ObjComdat = C; } + void copyAttributesFrom(const GlobalValue *Src) override; // Methods for support type inquiry through isa, cast, and dyn_cast: Index: include/llvm/IR/GlobalValue.h =================================================================== --- include/llvm/IR/GlobalValue.h +++ include/llvm/IR/GlobalValue.h @@ -23,6 +23,7 @@ namespace llvm { +class Comdat; class PointerType; class Module; @@ -110,6 +111,12 @@ bool hasUnnamedAddr() const { return UnnamedAddr; } void setUnnamedAddr(bool Val) { UnnamedAddr = Val; } + bool hasComdat() const { return getComdat() != nullptr; } + Comdat *getComdat(); + const Comdat *getComdat() const { + return const_cast(this)->getComdat(); + } + VisibilityTypes getVisibility() const { return VisibilityTypes(Visibility); } bool hasDefaultVisibility() const { return Visibility == DefaultVisibility; } bool hasHiddenVisibility() const { return Visibility == HiddenVisibility; } Index: include/llvm/IR/Module.h =================================================================== --- include/llvm/IR/Module.h +++ include/llvm/IR/Module.h @@ -16,6 +16,7 @@ #define LLVM_IR_MODULE_H #include "llvm/ADT/iterator_range.h" +#include "llvm/IR/Comdat.h" #include "llvm/IR/DataLayout.h" #include "llvm/IR/Function.h" #include "llvm/IR/GlobalAlias.h" @@ -123,6 +124,8 @@ typedef iplist AliasListType; /// The type for the list of named metadata. typedef ilist NamedMDListType; + /// The type of the comdat "symbol" table. + typedef StringMap ComdatSymTabType; /// The Global Variable iterator. typedef GlobalListType::iterator global_iterator; @@ -197,6 +200,7 @@ NamedMDListType NamedMDList; ///< The named metadata in the module std::string GlobalScopeAsm; ///< Inline Asm at global scope. ValueSymbolTable *ValSymTab; ///< Symbol table for values + ComdatSymTabType ComdatSymTab; ///< Symbol table for COMDATs std::unique_ptr Materializer; ///< Used to materialize GlobalValues std::string ModuleID; ///< Human readable identifier for the module @@ -404,6 +408,14 @@ void eraseNamedMetadata(NamedMDNode *NMD); /// @} +/// @name Comdat Accessors +/// @{ + + /// Return the Comdat in the module with the specified name. It is created + /// if it didn't already exist. + Comdat *getOrInsertComdat(StringRef Name); + +/// @} /// @name Module Flags Accessors /// @{ @@ -504,6 +516,10 @@ const ValueSymbolTable &getValueSymbolTable() const { return *ValSymTab; } /// Get the Module's symbol table of global variable and function identifiers. ValueSymbolTable &getValueSymbolTable() { return *ValSymTab; } + /// Get the Module's symbol table for COMDATs (constant). + const ComdatSymTabType &getComdatSymbolTable() const { return ComdatSymTab; } + /// Get the Module's symbol table for COMDATs. + ComdatSymTabType &getComdatSymbolTable() { return ComdatSymTab; } /// @} /// @name Global Variable Iteration Index: include/llvm/Linker/Linker.h =================================================================== --- include/llvm/Linker/Linker.h +++ include/llvm/Linker/Linker.h @@ -15,6 +15,8 @@ namespace llvm { +class Comdat; +class GlobalValue; class Module; class StringRef; class StructType; Index: include/llvm/MC/MCContext.h =================================================================== --- include/llvm/MC/MCContext.h +++ include/llvm/MC/MCContext.h @@ -22,6 +22,7 @@ #include "llvm/Support/Compiler.h" #include "llvm/Support/raw_ostream.h" #include +#include #include // FIXME: Shouldn't be needed. namespace llvm { @@ -160,10 +161,11 @@ unsigned DwarfCompileUnitID; typedef std::pair SectionGroupPair; + typedef std::tuple SectionGroupTriple; StringMap MachOUniquingMap; std::map ELFUniquingMap; - std::map COFFUniquingMap; + std::map COFFUniquingMap; /// Do automatic reset in destructor bool AutoReset; Index: lib/AsmParser/LLLexer.h =================================================================== --- lib/AsmParser/LLLexer.h +++ lib/AsmParser/LLLexer.h @@ -81,6 +81,7 @@ lltok::Kind LexDigitOrNegative(); lltok::Kind LexPositive(); lltok::Kind LexAt(); + lltok::Kind LexDollar(); lltok::Kind LexExclaim(); lltok::Kind LexPercent(); lltok::Kind LexQuote(); Index: lib/AsmParser/LLLexer.cpp =================================================================== --- lib/AsmParser/LLLexer.cpp +++ lib/AsmParser/LLLexer.cpp @@ -209,6 +209,7 @@ return LexToken(); case '+': return LexPositive(); case '@': return LexAt(); + case '$': return LexDollar(); case '%': return LexPercent(); case '"': return LexQuote(); case '.': @@ -222,13 +223,6 @@ return lltok::dotdotdot; } return lltok::Error; - case '$': - if (const char *Ptr = isLabelTail(CurPtr)) { - CurPtr = Ptr; - StrVal.assign(TokStart, CurPtr-1); - return lltok::LabelStr; - } - return lltok::Error; case ';': SkipLineComment(); return LexToken(); @@ -307,6 +301,43 @@ return lltok::Error; } +lltok::Kind LLLexer::LexDollar() { + if (const char *Ptr = isLabelTail(TokStart)) { + CurPtr = Ptr; + StrVal.assign(TokStart, CurPtr - 1); + return lltok::LabelStr; + } + + // Handle DollarStringConstant: $\"[^\"]*\" + if (CurPtr[0] == '"') { + ++CurPtr; + + while (1) { + int CurChar = getNextChar(); + + if (CurChar == EOF) { + Error("end of file in COMDAT variable name"); + return lltok::Error; + } + if (CurChar == '"') { + StrVal.assign(TokStart + 2, CurPtr - 1); + UnEscapeLexed(StrVal); + if (StringRef(StrVal).find_first_of(0) != StringRef::npos) { + Error("Null bytes are not allowed in names"); + return lltok::Error; + } + return lltok::ComdatVar; + } + } + } + + // Handle ComdatVarName: $[-a-zA-Z$._][-a-zA-Z$._0-9]* + if (ReadVarName()) + return lltok::ComdatVar; + + return lltok::Error; +} + /// ReadString - Read a string until the closing quote. lltok::Kind LLLexer::ReadString(lltok::Kind kind) { const char *Start = CurPtr; @@ -618,6 +649,15 @@ KEYWORD(type); KEYWORD(opaque); + KEYWORD(comdat); + + // Comdat types + KEYWORD(any); + KEYWORD(exactmatch); + KEYWORD(largest); + KEYWORD(noduplicates); + KEYWORD(samesize); + KEYWORD(eq); KEYWORD(ne); KEYWORD(slt); KEYWORD(sgt); KEYWORD(sle); KEYWORD(sge); KEYWORD(ult); KEYWORD(ugt); KEYWORD(ule); KEYWORD(uge); KEYWORD(oeq); KEYWORD(one); KEYWORD(olt); KEYWORD(ogt); KEYWORD(ole); Index: lib/AsmParser/LLParser.h =================================================================== --- lib/AsmParser/LLParser.h +++ lib/AsmParser/LLParser.h @@ -34,6 +34,7 @@ class Instruction; class Constant; class GlobalValue; + class Comdat; class MDString; class MDNode; class StructType; @@ -122,6 +123,9 @@ std::map > ForwardRefValIDs; std::vector NumberedVals; + // Comdat forward reference information. + std::map ForwardRefComdats; + // References to blockaddress. The key is the function ValID, the value is // a list of references to blocks in that function. std::map > > @@ -154,6 +158,10 @@ GlobalValue *GetGlobalVal(const std::string &N, Type *Ty, LocTy Loc); GlobalValue *GetGlobalVal(unsigned ID, Type *Ty, LocTy Loc); + /// Get a Comdat with the specified name, creating a forward reference + /// record if needed. + Comdat *getComdat(const std::string &N, LocTy Loc); + // Helper Routines. bool ParseToken(lltok::Kind T, const char *ErrMsg); bool EatIfPresent(lltok::Kind T) { @@ -247,6 +255,7 @@ bool ParseAlias(const std::string &Name, LocTy Loc, unsigned Visibility, unsigned DLLStorageClass, GlobalVariable::ThreadLocalMode TLM, bool UnnamedAddr); + bool parseComdat(); bool ParseStandaloneMetadata(); bool ParseNamedMetadata(); bool ParseMDString(MDString *&Result); @@ -358,6 +367,7 @@ bool ParseGlobalValue(Type *Ty, Constant *&V); bool ParseGlobalTypeAndValue(Constant *&V); bool ParseGlobalValueVector(SmallVectorImpl &Elts); + bool parseOptionalComdat(Comdat *&C); bool ParseMetadataListValue(ValID &ID, PerFunctionState *PFS); bool ParseMetadataValue(ValID &ID, PerFunctionState *PFS); bool ParseMDNodeVector(SmallVectorImpl &, PerFunctionState *PFS); Index: lib/AsmParser/LLParser.cpp =================================================================== --- lib/AsmParser/LLParser.cpp +++ lib/AsmParser/LLParser.cpp @@ -162,6 +162,11 @@ return Error(I->second.second, "use of undefined type named '" + I->getKey() + "'"); + if (!ForwardRefComdats.empty()) + return Error(ForwardRefComdats.begin()->second, + "use of undefined comdat '$" + + ForwardRefComdats.begin()->first + "'"); + if (!ForwardRefVals.empty()) return Error(ForwardRefVals.begin()->second.second, "use of undefined value '@" + ForwardRefVals.begin()->first + @@ -237,6 +242,7 @@ case lltok::LocalVar: if (ParseNamedType()) return true; break; case lltok::GlobalID: if (ParseUnnamedGlobal()) return true; break; case lltok::GlobalVar: if (ParseNamedGlobal()) return true; break; + case lltok::ComdatVar: if (parseComdat()) return true; break; case lltok::exclaim: if (ParseStandaloneMetadata()) return true; break; case lltok::MetadataVar:if (ParseNamedMetadata()) return true; break; @@ -512,6 +518,56 @@ UnnamedAddr); } +bool LLParser::parseComdat() { + assert(Lex.getKind() == lltok::ComdatVar); + std::string Name = Lex.getStrVal(); + LocTy NameLoc = Lex.getLoc(); + Lex.Lex(); + + if (ParseToken(lltok::equal, "expected '=' here")) + return true; + + if (ParseToken(lltok::kw_comdat, "expected comdat keyword")) + return TokError("expected comdat type"); + + Comdat::SelectionKind SK; + switch (Lex.getKind()) { + default: + return TokError("unknown selection kind"); + case lltok::kw_any: + SK = Comdat::Any; + break; + case lltok::kw_exactmatch: + SK = Comdat::ExactMatch; + break; + case lltok::kw_largest: + SK = Comdat::Largest; + break; + case lltok::kw_noduplicates: + SK = Comdat::NoDuplicates; + break; + case lltok::kw_samesize: + SK = Comdat::SameSize; + break; + } + Lex.Lex(); + + // See if the comdat was forward referenced, if so, use the comdat. + Module::ComdatSymTabType &ComdatSymTab = M->getComdatSymbolTable(); + Module::ComdatSymTabType::iterator I = ComdatSymTab.find(Name); + if (I != ComdatSymTab.end() && !ForwardRefComdats.erase(Name)) + return Error(NameLoc, "redefinition of comdat '$" + Name + "'"); + + Comdat *C; + if (I != ComdatSymTab.end()) + C = &I->second; + else + C = M->getOrInsertComdat(Name); + C->setSelectionKind(SK); + + return false; +} + // MDString: // ::= '!' STRINGCONSTANT bool LLParser::ParseMDString(MDString *&Result) { @@ -837,7 +893,13 @@ if (ParseOptionalAlignment(Alignment)) return true; GV->setAlignment(Alignment); } else { - TokError("unknown global variable property!"); + Comdat *C; + if (parseOptionalComdat(C)) + return true; + if (C) + GV->setComdat(C); + else + return TokError("unknown global variable property!"); } } @@ -1096,6 +1158,24 @@ //===----------------------------------------------------------------------===// +// Comdat Reference/Resolution Routines. +//===----------------------------------------------------------------------===// + +Comdat *LLParser::getComdat(const std::string &Name, LocTy Loc) { + // Look this name up in the comdat symbol table. + Module::ComdatSymTabType &ComdatSymTab = M->getComdatSymbolTable(); + Module::ComdatSymTabType::iterator I = ComdatSymTab.find(Name); + if (I != ComdatSymTab.end()) + return &I->second; + + // Otherwise, create a new forward reference for this value and remember it. + Comdat *C = M->getOrInsertComdat(Name); + ForwardRefComdats[Name] = Loc; + return C; +} + + +//===----------------------------------------------------------------------===// // Helper Routines. //===----------------------------------------------------------------------===// @@ -2789,6 +2869,19 @@ ParseGlobalValue(Ty, V); } +bool LLParser::parseOptionalComdat(Comdat *&C) { + C = nullptr; + if (!EatIfPresent(lltok::kw_comdat)) + return false; + if (Lex.getKind() != lltok::ComdatVar) + return TokError("expected comdat variable"); + LocTy Loc = Lex.getLoc(); + StringRef Name = Lex.getStrVal(); + C = getComdat(Name, Loc); + Lex.Lex(); + return false; +} + /// ParseGlobalValueVector /// ::= /*empty*/ /// ::= TypeAndValue (',' TypeAndValue)* @@ -3089,6 +3182,7 @@ bool UnnamedAddr; LocTy UnnamedAddrLoc; Constant *Prefix = nullptr; + Comdat *C; if (ParseArgumentList(ArgList, isVarArg) || ParseOptionalToken(lltok::kw_unnamed_addr, UnnamedAddr, @@ -3097,6 +3191,7 @@ BuiltinLoc) || (EatIfPresent(lltok::kw_section) && ParseStringConstant(Section)) || + parseOptionalComdat(C) || ParseOptionalAlignment(Alignment) || (EatIfPresent(lltok::kw_gc) && ParseStringConstant(GC)) || @@ -3199,6 +3294,7 @@ Fn->setUnnamedAddr(UnnamedAddr); Fn->setAlignment(Alignment); Fn->setSection(Section); + Fn->setComdat(C); if (!GC.empty()) Fn->setGC(GC.c_str()); Fn->setPrefixData(Prefix); ForwardRefAttrGroups[Fn] = FwdRefAttrGrps; Index: lib/AsmParser/LLToken.h =================================================================== --- lib/AsmParser/LLToken.h +++ lib/AsmParser/LLToken.h @@ -142,6 +142,15 @@ kw_type, kw_opaque, + kw_comdat, + + // Comdat types + kw_any, + kw_exactmatch, + kw_largest, + kw_noduplicates, + kw_samesize, + kw_eq, kw_ne, kw_slt, kw_sgt, kw_sle, kw_sge, kw_ult, kw_ugt, kw_ule, kw_uge, kw_oeq, kw_one, kw_olt, kw_ogt, kw_ole, kw_oge, kw_ord, kw_uno, kw_ueq, kw_une, @@ -180,6 +189,7 @@ // String valued tokens (StrVal). LabelStr, // foo: GlobalVar, // @foo @"foo" + ComdatVar, // $foo LocalVar, // %foo %"foo" MetadataVar, // !foo StringConstant, // "foo" Index: lib/Bitcode/Reader/BitcodeReader.h =================================================================== --- lib/Bitcode/Reader/BitcodeReader.h +++ lib/Bitcode/Reader/BitcodeReader.h @@ -26,6 +26,7 @@ #include namespace llvm { + class Comdat; class MemoryBuffer; class LLVMContext; @@ -135,6 +136,7 @@ std::vector TypeList; BitcodeReaderValueList ValueList; BitcodeReaderMDValueList MDValueList; + std::vector ComdatList; SmallVector InstructionList; SmallVector, 64> UseListRecords; Index: lib/Bitcode/Reader/BitcodeReader.cpp =================================================================== --- lib/Bitcode/Reader/BitcodeReader.cpp +++ lib/Bitcode/Reader/BitcodeReader.cpp @@ -43,6 +43,7 @@ std::vector().swap(TypeList); ValueList.clear(); MDValueList.clear(); + std::vector().swap(ComdatList); std::vector().swap(MAttributes); std::vector().swap(FunctionBBs); @@ -203,6 +204,22 @@ } } +static Comdat::SelectionKind getDecodedComdatSelectionKind(unsigned Val) { + switch (Val) { + default: // Map unknown selection kinds to any. + case bitc::COMDAT_SELECTION_KIND_ANY: + return Comdat::Any; + case bitc::COMDAT_SELECTION_KIND_EXACT_MATCH: + return Comdat::ExactMatch; + case bitc::COMDAT_SELECTION_KIND_LARGEST: + return Comdat::Largest; + case bitc::COMDAT_SELECTION_KIND_NO_DUPLICATES: + return Comdat::NoDuplicates; + case bitc::COMDAT_SELECTION_KIND_SAME_SIZE: + return Comdat::SameSize; + } +} + static void UpgradeDLLImportExportLinkage(llvm::GlobalValue *GV, unsigned Val) { switch (Val) { case 5: GV->setDLLStorageClass(GlobalValue::DLLImportStorageClass); break; @@ -1838,6 +1855,20 @@ GCTable.push_back(S); break; } + case bitc::MODULE_CODE_COMDAT: { // COMDAT: [selection_kind, name] + if (Record.size() < 2) + return Error(InvalidRecord); + Comdat::SelectionKind SK = getDecodedComdatSelectionKind(Record[0]); + unsigned ComdatNameSize = Record[1]; + std::string ComdatName; + ComdatName.reserve(ComdatNameSize); + for (unsigned i = 0; i != ComdatNameSize; ++i) + ComdatName += (char)Record[2 + i]; + Comdat *C = TheModule->getOrInsertComdat(ComdatName); + C->setSelectionKind(SK); + ComdatList.push_back(C); + break; + } // GLOBALVAR: [pointer type, isconst, initid, // linkage, alignment, section, visibility, threadlocal, // unnamed_addr, dllstorageclass] @@ -1898,6 +1929,12 @@ // Remember which value to use for the global initializer. if (unsigned InitID = Record[2]) GlobalInits.push_back(std::make_pair(NewGV, InitID-1)); + + if (Record.size() > 11) + if (unsigned ComdatID = Record[11]) { + assert(ComdatID <= ComdatList.size()); + NewGV->setComdat(ComdatList[ComdatID - 1]); + } break; } // FUNCTION: [type, callingconv, isproto, linkage, paramattr, @@ -1951,6 +1988,12 @@ else UpgradeDLLImportExportLinkage(Func, Record[3]); + if (Record.size() > 12) + if (unsigned ComdatID = Record[12]) { + assert(ComdatID <= ComdatList.size()); + Func->setComdat(ComdatList[ComdatID - 1]); + } + ValueList.push_back(Func); // If this is a function with a body, remember the prototype we are Index: lib/Bitcode/Writer/BitcodeWriter.cpp =================================================================== --- lib/Bitcode/Writer/BitcodeWriter.cpp +++ lib/Bitcode/Writer/BitcodeWriter.cpp @@ -524,6 +524,35 @@ llvm_unreachable("Invalid TLS model"); } +static unsigned getEncodedComdatSelectionKind(const Comdat &C) { + switch (C.getSelectionKind()) { + case Comdat::Any: + return bitc::COMDAT_SELECTION_KIND_ANY; + case Comdat::ExactMatch: + return bitc::COMDAT_SELECTION_KIND_EXACT_MATCH; + case Comdat::Largest: + return bitc::COMDAT_SELECTION_KIND_LARGEST; + case Comdat::NoDuplicates: + return bitc::COMDAT_SELECTION_KIND_NO_DUPLICATES; + case Comdat::SameSize: + return bitc::COMDAT_SELECTION_KIND_SAME_SIZE; + } + llvm_unreachable("Invalid selection kind"); +} + +static void writeComdats(const ValueEnumerator &VE, BitstreamWriter &Stream) { + SmallVector Vals; + for (const Comdat *C : VE.getComdats()) { + // COMDAT: [selection_kind, name] + Vals.push_back(getEncodedComdatSelectionKind(*C)); + Vals.push_back(C->getName().size()); + for (char Chr : C->getName()) + Vals.push_back((unsigned char)Chr); + Stream.EmitRecord(bitc::MODULE_CODE_COMDAT, Vals, /*AbbrevToUse=*/0); + Vals.clear(); + } +} + // Emit top-level description of module, including target triple, inline asm, // descriptors for global variables, and function prototype info. static void WriteModuleInfo(const Module *M, const ValueEnumerator &VE, @@ -625,12 +654,14 @@ if (GV.isThreadLocal() || GV.getVisibility() != GlobalValue::DefaultVisibility || GV.hasUnnamedAddr() || GV.isExternallyInitialized() || - GV.getDLLStorageClass() != GlobalValue::DefaultStorageClass) { + GV.getDLLStorageClass() != GlobalValue::DefaultStorageClass || + GV.hasComdat()) { Vals.push_back(getEncodedVisibility(GV)); Vals.push_back(getEncodedThreadLocalMode(GV)); Vals.push_back(GV.hasUnnamedAddr()); Vals.push_back(GV.isExternallyInitialized()); Vals.push_back(getEncodedDLLStorageClass(GV)); + Vals.push_back(GV.hasComdat() ? VE.getComdatID(GV.getComdat()) : 0); } else { AbbrevToUse = SimpleGVarAbbrev; } @@ -656,6 +687,7 @@ Vals.push_back(F.hasPrefixData() ? (VE.getValueID(F.getPrefixData()) + 1) : 0); Vals.push_back(getEncodedDLLStorageClass(F)); + Vals.push_back(F.hasComdat() ? VE.getComdatID(F.getComdat()) : 0); unsigned AbbrevToUse = 0; Stream.EmitRecord(bitc::MODULE_CODE_FUNCTION, Vals, AbbrevToUse); @@ -1915,6 +1947,8 @@ // Emit information describing all of the types in the module. WriteTypeTable(VE, Stream); + writeComdats(VE, Stream); + // Emit top-level description of module, including target triple, inline asm, // descriptors for global variables, and function prototype info. WriteModuleInfo(M, VE, Stream); Index: lib/Bitcode/Writer/ValueEnumerator.h =================================================================== --- lib/Bitcode/Writer/ValueEnumerator.h +++ lib/Bitcode/Writer/ValueEnumerator.h @@ -16,6 +16,7 @@ #include "llvm/ADT/DenseMap.h" #include "llvm/ADT/SmallVector.h" +#include "llvm/ADT/UniqueVector.h" #include "llvm/IR/Attributes.h" #include @@ -25,6 +26,7 @@ class Value; class Instruction; class BasicBlock; +class Comdat; class Function; class Module; class MDNode; @@ -48,6 +50,10 @@ typedef DenseMap ValueMapType; ValueMapType ValueMap; ValueList Values; + + typedef UniqueVector ComdatSetType; + ComdatSetType Comdats; + ValueList MDValues; SmallVector FunctionLocalMDs; ValueMapType MDValueMap; @@ -139,6 +145,9 @@ return AttributeGroups; } + const ComdatSetType &getComdats() const { return Comdats; } + unsigned getComdatID(const Comdat *C) const; + /// getGlobalBasicBlockID - This returns the function-specific ID for the /// specified basic block. This is relatively expensive information, so it /// should only be used by rare constructs such as address-of-label. Index: lib/Bitcode/Writer/ValueEnumerator.cpp =================================================================== --- lib/Bitcode/Writer/ValueEnumerator.cpp +++ lib/Bitcode/Writer/ValueEnumerator.cpp @@ -117,6 +117,12 @@ return I->second; } +unsigned ValueEnumerator::getComdatID(const Comdat *C) const { + unsigned ComdatID = Comdats.idFor(C); + assert(ComdatID && "Comdat not found!"); + return ComdatID; +} + void ValueEnumerator::setInstructionID(const Instruction *I) { InstructionMap[I] = InstructionCount++; } @@ -307,6 +313,10 @@ return; } + if (auto *GO = dyn_cast(V)) + if (const Comdat *C = GO->getComdat()) + Comdats.insert(C); + // Enumerate the type of this value. EnumerateType(V->getType()); Index: lib/CodeGen/TargetLoweringObjectFileImpl.cpp =================================================================== --- lib/CodeGen/TargetLoweringObjectFileImpl.cpp +++ lib/CodeGen/TargetLoweringObjectFileImpl.cpp @@ -192,6 +192,18 @@ return Flags; } +static const Comdat *getELFComdat(const GlobalValue *GV) { + const Comdat *C = GV->getComdat(); + if (!C) + return nullptr; + + if (C->getSelectionKind() != Comdat::Any) + report_fatal_error("ELF COMDATs only support SelectionKind::Any, '" + + C->getName() + "' cannot be lowered."); + + return C; +} + const MCSection *TargetLoweringObjectFileELF::getExplicitSectionGlobal( const GlobalValue *GV, SectionKind Kind, Mangler &Mang, const TargetMachine &TM) const { @@ -200,9 +212,15 @@ // Infer section flags from the section name if we can. Kind = getELFKindForNamedSection(SectionName, Kind); + StringRef Group = ""; + unsigned Flags = getELFSectionFlags(Kind); + if (const Comdat *C = getELFComdat(GV)) { + Group = C->getName(); + Flags |= ELF::SHF_GROUP; + } return getContext().getELFSection(SectionName, - getELFSectionType(SectionName, Kind), - getELFSectionFlags(Kind), Kind); + getELFSectionType(SectionName, Kind), Flags, + Kind, /*EntrySize=*/0, Group); } /// getSectionPrefixForGlobal - Return the section prefix name used by options @@ -224,7 +242,6 @@ return ".data.rel.ro."; } - const MCSection *TargetLoweringObjectFileELF:: SelectSectionForGlobal(const GlobalValue *GV, SectionKind Kind, Mangler &Mang, const TargetMachine &TM) const { @@ -238,7 +255,7 @@ // If this global is linkonce/weak and the target handles this by emitting it // into a 'uniqued' section name, create and return the section now. - if ((GV->isWeakForLinker() || EmitUniquedSection) && + if ((GV->isWeakForLinker() || EmitUniquedSection || GV->hasComdat()) && !Kind.isCommon()) { StringRef Prefix = getSectionPrefixForGlobal(Kind); @@ -247,8 +264,11 @@ StringRef Group = ""; unsigned Flags = getELFSectionFlags(Kind); - if (GV->isWeakForLinker()) { - Group = Name.substr(Prefix.size()); + if (GV->isWeakForLinker() || GV->hasComdat()) { + if (const Comdat *C = getELFComdat(GV)) + Group = C->getName(); + else + Group = Name.substr(Prefix.size()); Flags |= ELF::SHF_GROUP; } @@ -482,6 +502,15 @@ Streamer.AddBlankLine(); } +static void checkMachOComdat(const GlobalValue *GV) { + const Comdat *C = GV->getComdat(); + if (!C) + return; + + report_fatal_error("MachO doesn't support COMDATs, '" + C->getName() + + "' cannot be lowered."); +} + const MCSection *TargetLoweringObjectFileMachO::getExplicitSectionGlobal( const GlobalValue *GV, SectionKind Kind, Mangler &Mang, const TargetMachine &TM) const { @@ -489,6 +518,9 @@ StringRef Segment, Section; unsigned TAA = 0, StubSize = 0; bool TAAParsed; + + checkMachOComdat(GV); + std::string ErrorCode = MCSectionMachO::ParseSectionSpecifier(GV->getSection(), Segment, Section, TAA, TAAParsed, StubSize); @@ -559,6 +591,7 @@ const MCSection *TargetLoweringObjectFileMachO:: SelectSectionForGlobal(const GlobalValue *GV, SectionKind Kind, Mangler &Mang, const TargetMachine &TM) const { + checkMachOComdat(GV); // Handle thread local data. if (Kind.isThreadBSS()) return TLSBSSSection; @@ -727,6 +760,50 @@ return Flags; } +const GlobalValue *getComdatGVForCOFF(const GlobalValue *GV) { + const Comdat *C = GV->getComdat(); + assert(C && "expected GV to have a Comdat!"); + + StringRef ComdatGVName = C->getName(); + const GlobalValue *ComdatGV = GV->getParent()->getNamedValue(ComdatGVName); + if (!ComdatGV) + report_fatal_error("Associative COMDAT symbol '" + ComdatGVName + + "' does not exist."); + + if (ComdatGV->getComdat() != C) + report_fatal_error("Associative COMDAT symbol '" + ComdatGVName + + "' is not a key for it's COMDAT."); + + return ComdatGV; +} + +static int getSelectionForCOFF(const GlobalValue *GV) { + if (const Comdat *C = GV->getComdat()) { + const GlobalValue *ComdatKey = getComdatGVForCOFF(GV); + if (const auto *GA = dyn_cast(ComdatKey)) + ComdatKey = GA->getBaseObject(); + if (ComdatKey == GV) { + switch (C->getSelectionKind()) { + case Comdat::Any: + return COFF::IMAGE_COMDAT_SELECT_ANY; + case Comdat::ExactMatch: + return COFF::IMAGE_COMDAT_SELECT_EXACT_MATCH; + case Comdat::Largest: + return COFF::IMAGE_COMDAT_SELECT_LARGEST; + case Comdat::NoDuplicates: + return COFF::IMAGE_COMDAT_SELECT_NODUPLICATES; + case Comdat::SameSize: + return COFF::IMAGE_COMDAT_SELECT_SAME_SIZE; + } + } else { + return COFF::IMAGE_COMDAT_SELECT_ASSOCIATIVE; + } + } else if (GV->isWeakForLinker()) { + return COFF::IMAGE_COMDAT_SELECT_ANY; + } + return 0; +} + const MCSection *TargetLoweringObjectFileCOFF::getExplicitSectionGlobal( const GlobalValue *GV, SectionKind Kind, Mangler &Mang, const TargetMachine &TM) const { @@ -734,11 +811,21 @@ unsigned Characteristics = getCOFFSectionFlags(Kind); StringRef Name = GV->getSection(); StringRef COMDATSymName = ""; - if (GV->isWeakForLinker()) { - Selection = COFF::IMAGE_COMDAT_SELECT_ANY; - Characteristics |= COFF::IMAGE_SCN_LNK_COMDAT; - MCSymbol *Sym = TM.getSymbol(GV, Mang); - COMDATSymName = Sym->getName(); + if ((GV->isWeakForLinker() || GV->hasComdat()) && !Kind.isCommon()) { + Selection = getSelectionForCOFF(GV); + if (Selection == COFF::IMAGE_COMDAT_SELECT_ASSOCIATIVE || + !GV->hasPrivateLinkage()) { + const GlobalValue *ComdatGV; + if (Selection == COFF::IMAGE_COMDAT_SELECT_ASSOCIATIVE) + ComdatGV = getComdatGVForCOFF(GV); + else + ComdatGV = GV; + MCSymbol *Sym = TM.getSymbol(ComdatGV, Mang); + COMDATSymName = Sym->getName(); + Characteristics |= COFF::IMAGE_SCN_LNK_COMDAT; + } else { + Selection = 0; + } } return getContext().getCOFFSection(Name, Characteristics, @@ -775,17 +862,27 @@ // into a 'uniqued' section name, create and return the section now. // Section names depend on the name of the symbol which is not feasible if the // symbol has private linkage. - if ((GV->isWeakForLinker() || EmitUniquedSection) && - !GV->hasPrivateLinkage() && !Kind.isCommon()) { + if ((GV->isWeakForLinker() || EmitUniquedSection || GV->hasComdat()) && + !Kind.isCommon()) { const char *Name = getCOFFSectionNameForUniqueGlobal(Kind); unsigned Characteristics = getCOFFSectionFlags(Kind); Characteristics |= COFF::IMAGE_SCN_LNK_COMDAT; - MCSymbol *Sym = TM.getSymbol(GV, Mang); - return getContext().getCOFFSection( - Name, Characteristics, Kind, Sym->getName(), - GV->isWeakForLinker() ? COFF::IMAGE_COMDAT_SELECT_ANY - : COFF::IMAGE_COMDAT_SELECT_NODUPLICATES); + int Selection = getSelectionForCOFF(GV); + if (!Selection) + Selection = COFF::IMAGE_COMDAT_SELECT_NODUPLICATES; + if (Selection == COFF::IMAGE_COMDAT_SELECT_ASSOCIATIVE || + !GV->hasPrivateLinkage()) { + const GlobalValue *ComdatGV; + if (GV->hasComdat()) + ComdatGV = getComdatGVForCOFF(GV); + else + ComdatGV = GV; + MCSymbol *Sym = TM.getSymbol(ComdatGV, Mang); + StringRef COMDATSymName = Sym->getName(); + return getContext().getCOFFSection(Name, Characteristics, Kind, + COMDATSymName, Selection); + } } if (Kind.isText()) Index: lib/IR/AsmWriter.h =================================================================== --- lib/IR/AsmWriter.h +++ lib/IR/AsmWriter.h @@ -16,6 +16,7 @@ #define LLVM_IR_ASSEMBLYWRITER_H #include "llvm/ADT/DenseMap.h" +#include "llvm/ADT/SetVector.h" #include "llvm/IR/Attributes.h" #include "llvm/IR/Instructions.h" #include "llvm/IR/TypeFinder.h" @@ -26,6 +27,7 @@ class BasicBlock; class Function; class GlobalValue; +class Comdat; class Module; class NamedMDNode; class Value; @@ -70,6 +72,7 @@ SlotTracker &Machine; TypePrinting TypePrinter; AssemblyAnnotationWriter *AnnotationWriter; + SetVector Comdats; public: /// Construct an AssemblyWriter with an external SlotTracker @@ -101,6 +104,7 @@ void printTypeIdentities(); void printGlobal(const GlobalVariable *GV); void printAlias(const GlobalAlias *GV); + void printComdat(const Comdat *C); void printFunction(const Function *F); void printArgument(const Argument *FA, AttributeSet Attrs, unsigned Idx); void printBasicBlock(const BasicBlock *BB); Index: lib/IR/AsmWriter.cpp =================================================================== --- lib/IR/AsmWriter.cpp +++ lib/IR/AsmWriter.cpp @@ -106,6 +106,7 @@ enum PrefixType { GlobalPrefix, + ComdatPrefix, LabelPrefix, LocalPrefix, NoPrefix @@ -119,6 +120,7 @@ switch (Prefix) { case NoPrefix: break; case GlobalPrefix: OS << '@'; break; + case ComdatPrefix: OS << '$'; break; case LabelPrefix: break; case LocalPrefix: OS << '%'; break; } @@ -1165,8 +1167,15 @@ } void AssemblyWriter::init() { - if (TheModule) - TypePrinter.incorporateTypes(*TheModule); + if (!TheModule) + return; + TypePrinter.incorporateTypes(*TheModule); + for (const Function &F : *TheModule) + if (const Comdat *C = F.getComdat()) + Comdats.insert(C); + for (const GlobalVariable &GV : TheModule->globals()) + if (const Comdat *C = GV.getComdat()) + Comdats.insert(C); } @@ -1308,6 +1317,15 @@ printTypeIdentities(); + // Output all comdats. + if (!Comdats.empty()) + Out << '\n'; + for (const Comdat *C : Comdats) { + printComdat(C); + if (C != Comdats.back()) + Out << '\n'; + } + // Output all globals. if (!M->global_empty()) Out << '\n'; for (Module::const_global_iterator I = M->global_begin(), E = M->global_end(); @@ -1470,6 +1488,10 @@ PrintEscapedString(GV->getSection(), Out); Out << '"'; } + if (GV->hasComdat()) { + Out << ", comdat "; + PrintLLVMName(Out, GV->getComdat()->getName(), ComdatPrefix); + } if (GV->getAlignment()) Out << ", align " << GV->getAlignment(); @@ -1506,10 +1528,19 @@ writeOperand(Aliasee, !isa(Aliasee)); } + if (GA->hasComdat()) { + Out << ", comdat "; + PrintLLVMName(Out, GA->getComdat()->getName(), ComdatPrefix); + } + printInfoComment(*GA); Out << '\n'; } +void AssemblyWriter::printComdat(const Comdat *C) { + C->print(Out); +} + void AssemblyWriter::printTypeIdentities() { if (TypePrinter.NumberedTypes.empty() && TypePrinter.NamedTypes.empty()) @@ -1647,6 +1678,10 @@ PrintEscapedString(F->getSection(), Out); Out << '"'; } + if (F->hasComdat()) { + Out << " comdat "; + PrintLLVMName(Out, F->getComdat()->getName(), ComdatPrefix); + } if (F->getAlignment()) Out << " align " << F->getAlignment(); if (F->hasGC()) @@ -2158,6 +2193,31 @@ W.printNamedMDNode(this); } +void Comdat::print(raw_ostream &ROS) const { + PrintLLVMName(ROS, getName(), ComdatPrefix); + ROS << " = comdat "; + + switch (getSelectionKind()) { + case Comdat::Any: + ROS << "any"; + break; + case Comdat::ExactMatch: + ROS << "exactmatch"; + break; + case Comdat::Largest: + ROS << "largest"; + break; + case Comdat::NoDuplicates: + ROS << "noduplicates"; + break; + case Comdat::SameSize: + ROS << "samesize"; + break; + } + + ROS << '\n'; +} + void Type::print(raw_ostream &OS) const { TypePrinting TP; TP.print(const_cast(this), OS); @@ -2241,5 +2301,8 @@ // Module::dump() - Allow printing of Modules from the debugger. void Module::dump() const { print(dbgs(), nullptr); } +// \brief Allow printing of Comdats from the debugger. +void Comdat::dump() const { print(dbgs()); } + // NamedMDNode::dump() - Allow printing of NamedMDNodes from the debugger. void NamedMDNode::dump() const { print(dbgs()); } Index: lib/IR/CMakeLists.txt =================================================================== --- lib/IR/CMakeLists.txt +++ lib/IR/CMakeLists.txt @@ -3,6 +3,7 @@ Attributes.cpp AutoUpgrade.cpp BasicBlock.cpp + Comdat.cpp ConstantFold.cpp ConstantRange.cpp Constants.cpp Index: lib/IR/Comdat.cpp =================================================================== --- /dev/null +++ lib/IR/Comdat.cpp @@ -0,0 +1,25 @@ +//===-- Comdat.cpp - Implement Metadata classes --------------------------===// +// +// The LLVM Compiler Infrastructure +// +// This file is distributed under the University of Illinois Open Source +// License. See LICENSE.TXT for details. +// +//===----------------------------------------------------------------------===// +// +// This file implements the Comdat class. +// +//===----------------------------------------------------------------------===// + +#include "llvm/IR/Comdat.h" +#include "llvm/ADT/StringMap.h" +using namespace llvm; + +Comdat::Comdat(SelectionKind SK, StringMapEntry *Name) + : Name(Name), SK(SK) {} + +Comdat::Comdat(Comdat &&C) : Name(C.Name), SK(C.SK) {} + +Comdat::Comdat() : Name(nullptr), SK(Comdat::Any) {} + +StringRef Comdat::getName() const { return Name->first(); } Index: lib/IR/Globals.cpp =================================================================== --- lib/IR/Globals.cpp +++ lib/IR/Globals.cpp @@ -59,15 +59,10 @@ setDLLStorageClass(Src->getDLLStorageClass()); } -static const GlobalObject *getBaseObject(const Constant &C) { - // FIXME: We should probably return a base + offset pair for non-zero GEPs. - return dyn_cast(C.stripPointerCasts()); -} - unsigned GlobalValue::getAlignment() const { if (auto *GA = dyn_cast(this)) { // In general we cannot compute this at the IR level, but we try. - if (const GlobalObject *GO = getBaseObject(*GA->getAliasee())) + if (const GlobalObject *GO = GA->getBaseObject()) return GO->getAlignment(); // FIXME: we should also be able to handle: @@ -96,13 +91,23 @@ const char *GlobalValue::getSection() const { if (auto *GA = dyn_cast(this)) { // In general we cannot compute this at the IR level, but we try. - if (const GlobalObject *GO = getBaseObject(*GA->getAliasee())) + if (const GlobalObject *GO = GA->getBaseObject()) return GO->getSection(); return ""; } return cast(this)->getSection(); } +Comdat *GlobalValue::getComdat() { + if (auto *GA = dyn_cast(this)) { + // In general we cannot compute this at the IR level, but we try. + if (const GlobalObject *GO = GA->getBaseObject()) + return const_cast(GO)->getComdat(); + return nullptr; + } + return cast(this)->getComdat(); +} + void GlobalObject::setSection(StringRef S) { Section = S; } bool GlobalValue::isDeclaration() const { Index: lib/IR/Module.cpp =================================================================== --- lib/IR/Module.cpp +++ lib/IR/Module.cpp @@ -453,3 +453,11 @@ return dwarf::DWARF_VERSION; return cast(Val)->getZExtValue(); } + +Comdat *Module::getOrInsertComdat(StringRef Name) { + Comdat C; + StringMapEntry &Entry = + ComdatSymTab.GetOrCreateValue(Name, std::move(C)); + Entry.second.Name = &Entry; + return &Entry.second; +} Index: lib/IR/Verifier.cpp =================================================================== --- lib/IR/Verifier.cpp +++ lib/IR/Verifier.cpp @@ -107,6 +107,12 @@ OS << ' ' << *T; } + void WriteComdat(const Comdat *C) { + if (!C) + return; + OS << *C; + } + // CheckFailed - A check failed, so print out the condition and the message // that failed. This provides a nice place to put a breakpoint if you want // to see why something is not correct. @@ -138,6 +144,12 @@ WriteType(T3); Broken = true; } + + void CheckFailed(const Twine &Message, const Comdat *C) { + OS << Message.str() << "\n"; + WriteComdat(C); + Broken = true; + } }; class Verifier : public InstVisitor, VerifierSupport { friend class InstVisitor; @@ -230,6 +242,9 @@ I != E; ++I) visitNamedMDNode(*I); + for (const StringMapEntry &SMEC : M.getComdatSymbolTable()) + visitComdat(SMEC.getValue()); + visitModuleFlags(M); visitModuleIdents(M); @@ -246,6 +261,7 @@ const GlobalAlias &A, const Constant &C); void visitNamedMDNode(const NamedMDNode &NMD); void visitMDNode(MDNode &MD, Function *F); + void visitComdat(const Comdat &C); void visitModuleIdents(const Module &M); void visitModuleFlags(const Module &M); void visitModuleFlag(const MDNode *Op, @@ -387,6 +403,7 @@ "'common' global must have a zero initializer!", &GV); Assert1(!GV.isConstant(), "'common' global may not be marked constant!", &GV); + Assert1(!GV.hasComdat(), "'common' global may not be in a Comdat!", &GV); } } else { Assert1(GV.hasExternalLinkage() || GV.hasExternalWeakLinkage(), @@ -578,6 +595,22 @@ } } +void Verifier::visitComdat(const Comdat &C) { + // All Comdat::SelectionKind values other than Comdat::Any require a + // GlobalValue with the same name as the Comdat. + const GlobalValue *GV = M->getNamedValue(C.getName()); + if (C.getSelectionKind() != Comdat::Any) + Assert1(GV, + "comdat selection kind requires a global value with the same name", + &C); + // The Module is invalid if the GlobalValue has local linkage. Allowing + // otherwise opens us up to seeing the underling global value get renamed if + // collisions occur. + if (GV) + Assert1(!GV->hasLocalLinkage(), "comdat global value has local linkage", + GV); +} + void Verifier::visitModuleIdents(const Module &M) { const NamedMDNode *Idents = M.getNamedMetadata("llvm.ident"); if (!Idents) Index: lib/Linker/LinkModules.cpp =================================================================== --- lib/Linker/LinkModules.cpp +++ lib/Linker/LinkModules.cpp @@ -426,6 +426,18 @@ return true; } + bool getComdatLeader(Module *M, StringRef ComdatName, + const GlobalVariable *&GVar); + bool computeResultingSelectionKind(StringRef ComdatName, + Comdat::SelectionKind Src, + Comdat::SelectionKind Dst, + Comdat::SelectionKind &Result, + bool &LinkFromSrc); + std::map> + ComdatsChosen; + bool getComdatResult(const Comdat *SrcC, Comdat::SelectionKind &SK, + bool &LinkFromSrc); + /// getLinkageResult - This analyzes the two global values and determines /// what the result will look like in the destination module. bool getLinkageResult(GlobalValue *Dest, const GlobalValue *Src, @@ -534,6 +546,115 @@ return DF; } +bool ModuleLinker::getComdatLeader(Module *M, StringRef ComdatName, + const GlobalVariable *&GVar) { + const GlobalValue *GVal = M->getNamedValue(ComdatName); + if (const auto *GA = dyn_cast_or_null(GVal)) { + GVal = GA->getBaseObject(); + if (!GVal) + // We cannot resolve the size of the aliasee yet. + return emitError("Linking COMDATs named '" + ComdatName + + "': COMDAT key involves incomputable alias size."); + } + + GVar = dyn_cast_or_null(GVal); + if (!GVar) + return emitError( + "Linking COMDATs named '" + ComdatName + + "': GlobalVariable required for data dependent selection!"); + + return false; +} + +bool ModuleLinker::computeResultingSelectionKind(StringRef ComdatName, + Comdat::SelectionKind Src, + Comdat::SelectionKind Dst, + Comdat::SelectionKind &Result, + bool &LinkFromSrc) { + // The ability to mix Comdat::SelectionKind::Any with + // Comdat::SelectionKind::Largest is a behavior that comes from COFF. + bool DstAnyOrLargest = Dst == Comdat::SelectionKind::Any || + Dst == Comdat::SelectionKind::Largest; + bool SrcAnyOrLargest = Src == Comdat::SelectionKind::Any || + Src == Comdat::SelectionKind::Largest; + if (DstAnyOrLargest && SrcAnyOrLargest) { + if (Dst == Comdat::SelectionKind::Largest || + Src == Comdat::SelectionKind::Largest) + Result = Comdat::SelectionKind::Largest; + else + Result = Comdat::SelectionKind::Any; + } else if (Src == Dst) { + Result = Dst; + } else { + return emitError("Linking COMDATs named '" + ComdatName + + "': invalid selection kinds!"); + } + + switch (Result) { + case Comdat::SelectionKind::Any: + // Go with Dst. + LinkFromSrc = false; + break; + case Comdat::SelectionKind::NoDuplicates: + return emitError("Linking COMDATs named '" + ComdatName + + "': noduplicates has been violated!"); + case Comdat::SelectionKind::ExactMatch: + case Comdat::SelectionKind::Largest: + case Comdat::SelectionKind::SameSize: { + const GlobalVariable *DstGV; + const GlobalVariable *SrcGV; + if (getComdatLeader(DstM, ComdatName, DstGV) || + getComdatLeader(SrcM, ComdatName, SrcGV)) + return true; + + const DataLayout *DstDL = DstM->getDataLayout(); + const DataLayout *SrcDL = SrcM->getDataLayout(); + if (!DstDL || !SrcDL) { + return emitError( + "Linking COMDATs named '" + ComdatName + + "': can't do size dependent selection without DataLayout!"); + } + uint64_t DstSize = + DstDL->getTypeAllocSize(DstGV->getType()->getPointerElementType()); + uint64_t SrcSize = + SrcDL->getTypeAllocSize(SrcGV->getType()->getPointerElementType()); + if (Result == Comdat::SelectionKind::ExactMatch) { + if (SrcGV->getInitializer() != DstGV->getInitializer()) + return emitError("Linking COMDATs named '" + ComdatName + + "': ExactMatch violated!"); + LinkFromSrc = false; + } else if (Result == Comdat::SelectionKind::Largest) { + LinkFromSrc = SrcSize > DstSize; + } else if (Result == Comdat::SelectionKind::SameSize) { + if (SrcSize != DstSize) + return emitError("Linking COMDATs named '" + ComdatName + + "': SameSize violated!"); + LinkFromSrc = false; + } else { + llvm_unreachable("unknown selection kind"); + } + break; + } + } + + return false; +} + +bool ModuleLinker::getComdatResult(const Comdat *SrcC, + Comdat::SelectionKind &Result, + bool &LinkFromSrc) { + StringRef ComdatName = SrcC->getName(); + Module::ComdatSymTabType &ComdatSymTab = DstM->getComdatSymbolTable(); + Module::ComdatSymTabType::iterator DstCI = ComdatSymTab.find(ComdatName); + if (DstCI != ComdatSymTab.end()) { + const Comdat *DstC = &DstCI->second; + Comdat::SelectionKind SSK = SrcC->getSelectionKind(); + Comdat::SelectionKind DSK = DstC->getSelectionKind(); + if (computeResultingSelectionKind(ComdatName, SSK, DSK, Result, LinkFromSrc)) + return true; + } + return false; +} /// getLinkageResult - This analyzes the two global values and determines what /// the result will look like in the destination module. In particular, it @@ -764,34 +885,47 @@ llvm::Optional NewVisibility; bool HasUnnamedAddr = SGV->hasUnnamedAddr(); + bool LinkFromSrc = false; + Comdat *DC = nullptr; + if (const Comdat *SC = SGV->getComdat()) { + Comdat::SelectionKind SK; + std::tie(SK, LinkFromSrc) = ComdatsChosen[SC]; + DC = DstM->getOrInsertComdat(SC->getName()); + DC->setSelectionKind(SK); + } + if (DGV) { - // Concatenation of appending linkage variables is magic and handled later. - if (DGV->hasAppendingLinkage() || SGV->hasAppendingLinkage()) - return linkAppendingVarProto(cast(DGV), SGV); - - // Determine whether linkage of these two globals follows the source - // module's definition or the destination module's definition. - GlobalValue::LinkageTypes NewLinkage = GlobalValue::InternalLinkage; - GlobalValue::VisibilityTypes NV; - bool LinkFromSrc = false; - if (getLinkageResult(DGV, SGV, NewLinkage, NV, LinkFromSrc)) - return true; - NewVisibility = NV; - HasUnnamedAddr = HasUnnamedAddr && DGV->hasUnnamedAddr(); + if (!DC) { + // Concatenation of appending linkage variables is magic and handled later. + if (DGV->hasAppendingLinkage() || SGV->hasAppendingLinkage()) + return linkAppendingVarProto(cast(DGV), SGV); + + // Determine whether linkage of these two globals follows the source + // module's definition or the destination module's definition. + GlobalValue::LinkageTypes NewLinkage = GlobalValue::InternalLinkage; + GlobalValue::VisibilityTypes NV; + if (getLinkageResult(DGV, SGV, NewLinkage, NV, LinkFromSrc)) + return true; + NewVisibility = NV; + HasUnnamedAddr = HasUnnamedAddr && DGV->hasUnnamedAddr(); + + // If we're not linking from the source, then keep the definition that we + // have. + if (!LinkFromSrc) { + // Special case for const propagation. + if (GlobalVariable *DGVar = dyn_cast(DGV)) + if (DGVar->isDeclaration() && SGV->isConstant() && + !DGVar->isConstant()) + DGVar->setConstant(true); + + // Set calculated linkage, visibility and unnamed_addr. + DGV->setLinkage(NewLinkage); + DGV->setVisibility(*NewVisibility); + DGV->setUnnamedAddr(HasUnnamedAddr); + } + } - // If we're not linking from the source, then keep the definition that we - // have. if (!LinkFromSrc) { - // Special case for const propagation. - if (GlobalVariable *DGVar = dyn_cast(DGV)) - if (DGVar->isDeclaration() && SGV->isConstant() && !DGVar->isConstant()) - DGVar->setConstant(true); - - // Set calculated linkage, visibility and unnamed_addr. - DGV->setLinkage(NewLinkage); - DGV->setVisibility(*NewVisibility); - DGV->setUnnamedAddr(HasUnnamedAddr); - // Make sure to remember this mapping. ValueMap[SGV] = ConstantExpr::getBitCast(DGV,TypeMap.get(SGV->getType())); @@ -803,6 +937,12 @@ } } + // If the Comdat this variable was inside of wasn't selected, skip it. + if (DC && !DGV && !LinkFromSrc) { + DoNotLinkFromSource.insert(SGV); + return false; + } + // No linking to be performed or linking from the source: simply create an // identical version of the symbol over in the dest module... the // initializer will be filled in later by LinkGlobalInits. @@ -818,6 +958,9 @@ NewDGV->setVisibility(*NewVisibility); NewDGV->setUnnamedAddr(HasUnnamedAddr); + if (DC) + NewDGV->setComdat(DC); + if (DGV) { DGV->replaceAllUsesWith(ConstantExpr::getBitCast(NewDGV, DGV->getType())); DGV->eraseFromParent(); @@ -835,21 +978,33 @@ llvm::Optional NewVisibility; bool HasUnnamedAddr = SF->hasUnnamedAddr(); + bool LinkFromSrc = false; + Comdat *DC = nullptr; + if (const Comdat *SC = SF->getComdat()) { + Comdat::SelectionKind SK; + std::tie(SK, LinkFromSrc) = ComdatsChosen[SC]; + DC = DstM->getOrInsertComdat(SC->getName()); + DC->setSelectionKind(SK); + } + if (DGV) { - GlobalValue::LinkageTypes NewLinkage = GlobalValue::InternalLinkage; - bool LinkFromSrc = false; - GlobalValue::VisibilityTypes NV; - if (getLinkageResult(DGV, SF, NewLinkage, NV, LinkFromSrc)) - return true; - NewVisibility = NV; - HasUnnamedAddr = HasUnnamedAddr && DGV->hasUnnamedAddr(); + if (!DC) { + GlobalValue::LinkageTypes NewLinkage = GlobalValue::InternalLinkage; + GlobalValue::VisibilityTypes NV; + if (getLinkageResult(DGV, SF, NewLinkage, NV, LinkFromSrc)) + return true; + NewVisibility = NV; + HasUnnamedAddr = HasUnnamedAddr && DGV->hasUnnamedAddr(); + + if (!LinkFromSrc) { + // Set calculated linkage + DGV->setLinkage(NewLinkage); + DGV->setVisibility(*NewVisibility); + DGV->setUnnamedAddr(HasUnnamedAddr); + } + } if (!LinkFromSrc) { - // Set calculated linkage - DGV->setLinkage(NewLinkage); - DGV->setVisibility(*NewVisibility); - DGV->setUnnamedAddr(HasUnnamedAddr); - // Make sure to remember this mapping. ValueMap[SF] = ConstantExpr::getBitCast(DGV, TypeMap.get(SF->getType())); @@ -869,6 +1024,12 @@ return false; } + // If the Comdat this function was inside of wasn't selected, skip it. + if (DC && !DGV && !LinkFromSrc) { + DoNotLinkFromSource.insert(SF); + return false; + } + // If there is no linkage to be performed or we are linking from the source, // bring SF over. Function *NewDF = Function::Create(TypeMap.get(SF->getFunctionType()), @@ -878,6 +1039,9 @@ NewDF->setVisibility(*NewVisibility); NewDF->setUnnamedAddr(HasUnnamedAddr); + if (DC) + NewDF->setComdat(DC); + if (DGV) { // Any uses of DF need to change to NewDF, with cast. DGV->replaceAllUsesWith(ConstantExpr::getBitCast(NewDF, DGV->getType())); @@ -895,21 +1059,33 @@ llvm::Optional NewVisibility; bool HasUnnamedAddr = SGA->hasUnnamedAddr(); + bool LinkFromSrc = false; + Comdat *DC = nullptr; + if (const Comdat *SC = SGA->getComdat()) { + Comdat::SelectionKind SK; + std::tie(SK, LinkFromSrc) = ComdatsChosen[SC]; + DC = DstM->getOrInsertComdat(SC->getName()); + DC->setSelectionKind(SK); + } + if (DGV) { - GlobalValue::LinkageTypes NewLinkage = GlobalValue::InternalLinkage; - GlobalValue::VisibilityTypes NV; - bool LinkFromSrc = false; - if (getLinkageResult(DGV, SGA, NewLinkage, NV, LinkFromSrc)) - return true; - NewVisibility = NV; - HasUnnamedAddr = HasUnnamedAddr && DGV->hasUnnamedAddr(); + if (!DC) { + GlobalValue::LinkageTypes NewLinkage = GlobalValue::InternalLinkage; + GlobalValue::VisibilityTypes NV; + if (getLinkageResult(DGV, SGA, NewLinkage, NV, LinkFromSrc)) + return true; + NewVisibility = NV; + HasUnnamedAddr = HasUnnamedAddr && DGV->hasUnnamedAddr(); + + if (!LinkFromSrc) { + // Set calculated linkage. + DGV->setLinkage(NewLinkage); + DGV->setVisibility(*NewVisibility); + DGV->setUnnamedAddr(HasUnnamedAddr); + } + } if (!LinkFromSrc) { - // Set calculated linkage. - DGV->setLinkage(NewLinkage); - DGV->setVisibility(*NewVisibility); - DGV->setUnnamedAddr(HasUnnamedAddr); - // Make sure to remember this mapping. ValueMap[SGA] = ConstantExpr::getBitCast(DGV,TypeMap.get(SGA->getType())); @@ -920,6 +1096,12 @@ } } + // If the Comdat this alias was inside of wasn't selected, skip it. + if (DC && !DGV && !LinkFromSrc) { + DoNotLinkFromSource.insert(SGA); + return false; + } + // If there is no linkage to be performed or we're linking from the source, // bring over SGA. auto *PTy = cast(TypeMap.get(SGA->getType())); @@ -1254,6 +1436,18 @@ // Loop over all of the linked values to compute type mappings. computeTypeMapping(); + ComdatsChosen.clear(); + for (const StringMapEntry &SMEC : SrcM->getComdatSymbolTable()) { + const Comdat &C = SMEC.getValue(); + if (ComdatsChosen.count(&C)) + continue; + Comdat::SelectionKind SK; + bool LinkFromSrc; + if (getComdatResult(&C, SK, LinkFromSrc)) + return true; + ComdatsChosen[&C] = std::make_pair(SK, LinkFromSrc); + } + // Insert all of the globals in src into the DstM module... without linking // initializers (which could refer to functions not yet mapped over). for (Module::global_iterator I = SrcM->global_begin(), Index: lib/MC/MCContext.cpp =================================================================== --- lib/MC/MCContext.cpp +++ lib/MC/MCContext.cpp @@ -282,8 +282,8 @@ int Selection) { // Do the lookup, if we have a hit, return it. - SectionGroupPair P(Section, COMDATSymName); - auto IterBool = COFFUniquingMap.insert(std::make_pair(P, nullptr)); + SectionGroupTriple T(Section, COMDATSymName, Selection); + auto IterBool = COFFUniquingMap.insert(std::make_pair(T, nullptr)); auto Iter = IterBool.first; if (!IterBool.second) return Iter->second; @@ -292,7 +292,7 @@ if (!COMDATSymName.empty()) COMDATSymbol = GetOrCreateSymbol(COMDATSymName); - StringRef CachedName = Iter->first.first; + StringRef CachedName = std::get<0>(Iter->first); MCSectionCOFF *Result = new (*this) MCSectionCOFF(CachedName, Characteristics, COMDATSymbol, Selection, Kind); @@ -307,8 +307,8 @@ } const MCSectionCOFF *MCContext::getCOFFSection(StringRef Section) { - SectionGroupPair P(Section, ""); - auto Iter = COFFUniquingMap.find(P); + SectionGroupTriple T(Section, "", 0); + auto Iter = COFFUniquingMap.find(T); if (Iter == COFFUniquingMap.end()) return nullptr; return Iter->second; Index: lib/MC/MCParser/COFFAsmParser.cpp =================================================================== --- lib/MC/MCParser/COFFAsmParser.cpp +++ lib/MC/MCParser/COFFAsmParser.cpp @@ -292,8 +292,7 @@ bool COFFAsmParser::ParseSectionSwitch(StringRef Section, unsigned Characteristics, SectionKind Kind) { - return ParseSectionSwitch(Section, Characteristics, Kind, "", - COFF::IMAGE_COMDAT_SELECT_ANY); + return ParseSectionSwitch(Section, Characteristics, Kind, "", (COFF::COMDATType)0); } bool COFFAsmParser::ParseSectionSwitch(StringRef Section, @@ -357,9 +356,10 @@ return true; } - COFF::COMDATType Type = COFF::IMAGE_COMDAT_SELECT_ANY; + COFF::COMDATType Type = (COFF::COMDATType)0; StringRef COMDATSymName; if (getLexer().is(AsmToken::Comma)) { + Type = COFF::IMAGE_COMDAT_SELECT_ANY;; Lex(); Flags |= COFF::IMAGE_SCN_LNK_COMDAT; Index: lib/Transforms/IPO/GlobalDCE.cpp =================================================================== --- lib/Transforms/IPO/GlobalDCE.cpp +++ lib/Transforms/IPO/GlobalDCE.cpp @@ -77,13 +77,19 @@ // Remove empty functions from the global ctors list. Changed |= optimizeGlobalCtorsList(M, isEmptyFunction); + typedef std::multimap ComdatGVPairsTy; + ComdatGVPairsTy ComdatGVPairs; + // Loop over the module, adding globals which are obviously necessary. for (Module::iterator I = M.begin(), E = M.end(); I != E; ++I) { Changed |= RemoveUnusedGlobalValue(*I); // Functions with external linkage are needed if they have a body - if (!I->isDiscardableIfUnused() && - !I->isDeclaration() && !I->hasAvailableExternallyLinkage()) - GlobalIsNeeded(I); + if (!I->isDeclaration() && !I->hasAvailableExternallyLinkage()) { + if (!I->isDiscardableIfUnused()) + GlobalIsNeeded(I); + else if (const Comdat *C = I->getComdat()) + ComdatGVPairs.insert(std::make_pair(C, I)); + } } for (Module::global_iterator I = M.global_begin(), E = M.global_end(); @@ -91,17 +97,38 @@ Changed |= RemoveUnusedGlobalValue(*I); // Externally visible & appending globals are needed, if they have an // initializer. - if (!I->isDiscardableIfUnused() && - !I->isDeclaration() && !I->hasAvailableExternallyLinkage()) - GlobalIsNeeded(I); + if (!I->isDeclaration() && !I->hasAvailableExternallyLinkage()) { + if (!I->isDiscardableIfUnused()) + GlobalIsNeeded(I); + else if (const Comdat *C = I->getComdat()) + ComdatGVPairs.insert(std::make_pair(C, I)); + } } for (Module::alias_iterator I = M.alias_begin(), E = M.alias_end(); I != E; ++I) { Changed |= RemoveUnusedGlobalValue(*I); // Externally visible aliases are needed. - if (!I->isDiscardableIfUnused()) + if (!I->isDiscardableIfUnused()) { GlobalIsNeeded(I); + } else if (const Comdat *C = I->getComdat()) { + ComdatGVPairs.insert(std::make_pair(C, I)); + } + } + + for (ComdatGVPairsTy::iterator I = ComdatGVPairs.begin(), + E = ComdatGVPairs.end(); + I != E;) { + ComdatGVPairsTy::iterator UB = ComdatGVPairs.upper_bound(I->first); + bool CanDiscard = std::all_of(I, UB, [](ComdatGVPairsTy::value_type Pair) { + return Pair.second->isDiscardableIfUnused(); + }); + if (!CanDiscard) { + std::for_each(I, UB, [this](ComdatGVPairsTy::value_type Pair) { + GlobalIsNeeded(Pair.second); + }); + } + I = UB; } // Now that all globals which are needed are in the AliveGlobals set, we loop Index: lib/Transforms/IPO/GlobalOpt.cpp =================================================================== --- lib/Transforms/IPO/GlobalOpt.cpp +++ lib/Transforms/IPO/GlobalOpt.cpp @@ -17,6 +17,7 @@ #include "llvm/ADT/DenseMap.h" #include "llvm/ADT/STLExtras.h" #include "llvm/ADT/SmallPtrSet.h" +#include "llvm/ADT/SmallSet.h" #include "llvm/ADT/SmallVector.h" #include "llvm/ADT/Statistic.h" #include "llvm/Analysis/ConstantFolding.h" @@ -1699,9 +1700,6 @@ /// possible. If we make a change, return true. bool GlobalOpt::ProcessGlobal(GlobalVariable *GV, Module::global_iterator &GVI) { - if (!GV->isDiscardableIfUnused()) - return false; - // Do more involved optimizations if the global is internal. GV->removeDeadConstantUsers(); @@ -1944,6 +1942,13 @@ bool GlobalOpt::OptimizeGlobalVars(Module &M) { bool Changed = false; + + SmallSet NotDiscardableComdats; + for (const GlobalVariable &GV : M.globals()) + if (const Comdat *C = GV.getComdat()) + if (!GV.isDiscardableIfUnused()) + NotDiscardableComdats.insert(C); + for (Module::global_iterator GVI = M.global_begin(), E = M.global_end(); GVI != E; ) { GlobalVariable *GV = GVI++; @@ -1958,7 +1963,12 @@ GV->setInitializer(New); } - Changed |= ProcessGlobal(GV, GVI); + if (GV->isDiscardableIfUnused()) { + if (const Comdat *C = GV->getComdat()) + if (NotDiscardableComdats.count(C)) + continue; + Changed |= ProcessGlobal(GV, GVI); + } } return Changed; } Index: test/Assembler/invalid-comdat.ll =================================================================== --- /dev/null +++ test/Assembler/invalid-comdat.ll @@ -0,0 +1,4 @@ +; RUN: not llvm-as < %s 2>&1 | FileCheck %s + +@v = global i32 0, comdat $v +; CHECK: use of undefined comdat '$v' Index: test/Assembler/invalid-comdat2.ll =================================================================== --- /dev/null +++ test/Assembler/invalid-comdat2.ll @@ -0,0 +1,5 @@ +; RUN: not llvm-as < %s 2>&1 | FileCheck %s + +$v = comdat any +$v = comdat any +; CHECK: redefinition of comdat '$v' Index: test/CodeGen/X86/coff-comdat.ll =================================================================== --- /dev/null +++ test/CodeGen/X86/coff-comdat.ll @@ -0,0 +1,92 @@ +; RUN: llc -mtriple i386-pc-win32 < %s | FileCheck %s + +$f1 = comdat any +@v1 = global i32 0, comdat $f1 +define void @f1() comdat $f1 { + ret void +} + +$f2 = comdat exactmatch +@v2 = global i32 0, comdat $f2 +define void @f2() comdat $f2 { + ret void +} + +$f3 = comdat largest +@v3 = global i32 0, comdat $f3 +define void @f3() comdat $f3 { + ret void +} + +$f4 = comdat noduplicates +@v4 = global i32 0, comdat $f4 +define void @f4() comdat $f4 { + ret void +} + +$f5 = comdat samesize +@v5 = global i32 0, comdat $f5 +define void @f5() comdat $f5 { + ret void +} + +$f6 = comdat samesize +@v6 = global i32 0, comdat $f6 +@f6 = global i32 0, comdat $f6 + +$"\01@f7@0" = comdat any +define x86_fastcallcc void @"\01@v7@0"() comdat $"\01@f7@0" { + ret void +} +define x86_fastcallcc void @"\01@f7@0"() comdat $"\01@f7@0" { + ret void +} + +$f8 = comdat any +define x86_fastcallcc void @v8() comdat $f8 { + ret void +} +define x86_fastcallcc void @f8() comdat $f8 { + ret void +} + +$vftable = comdat largest + +@some_name = linkonce_odr unnamed_addr constant [2 x i8*] zeroinitializer, comdat $vftable +@vftable = alias getelementptr([2 x i8*]* @some_name, i32 0, i32 1) + +; CHECK: .section .text,"xr",discard,_f1 +; CHECK: .globl _f1 +; CHECK: .section .text,"xr",same_contents,_f2 +; CHECK: .globl _f2 +; CHECK: .section .text,"xr",largest,_f3 +; CHECK: .globl _f3 +; CHECK: .section .text,"xr",one_only,_f4 +; CHECK: .globl _f4 +; CHECK: .section .text,"xr",same_size,_f5 +; CHECK: .globl _f5 +; CHECK: .section .text,"xr",associative,@f7@0 +; CHECK: .globl @v7@0 +; CHECK: .section .text,"xr",discard,@f7@0 +; CHECK: .globl @f7@0 +; CHECK: .section .text,"xr",associative,@f8@0 +; CHECK: .globl @v8@0 +; CHECK: .section .text,"xr",discard,@f8@0 +; CHECK: .globl @f8@0 +; CHECK: .section .bss,"bw",associative,_f1 +; CHECK: .globl _v1 +; CHECK: .section .bss,"bw",associative,_f2 +; CHECK: .globl _v2 +; CHECK: .section .bss,"bw",associative,_f3 +; CHECK: .globl _v3 +; CHECK: .section .bss,"bw",associative,_f4 +; CHECK: .globl _v4 +; CHECK: .section .bss,"bw",associative,_f5 +; CHECK: .globl _v5 +; CHECK: .section .bss,"bw",associative,_f6 +; CHECK: .globl _v6 +; CHECK: .section .bss,"bw",same_size,_f6 +; CHECK: .globl _f6 +; CHECK: .section .rdata,"rd",largest,_vftable +; CHECK: .globl _vftable +; CHECK: _vftable = _some_name+4 Index: test/CodeGen/X86/coff-comdat2.ll =================================================================== --- /dev/null +++ test/CodeGen/X86/coff-comdat2.ll @@ -0,0 +1,9 @@ +; RUN: not llc %s -o /dev/null 2>&1 | FileCheck %s + +target datalayout = "e-m:w-p:32:32-i64:64-f80:32-n8:16:32-S32" +target triple = "i686-pc-windows-msvc" + +$foo = comdat largest +@foo = global i32 0 +@bar = global i32 0, comdat $foo +; CHECK: Associative COMDAT symbol 'foo' is not a key for it's COMDAT. Index: test/CodeGen/X86/coff-comdat3.ll =================================================================== --- /dev/null +++ test/CodeGen/X86/coff-comdat3.ll @@ -0,0 +1,8 @@ +; RUN: not llc %s -o /dev/null 2>&1 | FileCheck %s + +target datalayout = "e-m:w-p:32:32-i64:64-f80:32-n8:16:32-S32" +target triple = "i686-pc-windows-msvc" + +$foo = comdat largest +@bar = global i32 0, comdat $foo +; CHECK: Associative COMDAT symbol 'foo' does not exist. Index: test/CodeGen/X86/elf-comdat.ll =================================================================== --- /dev/null +++ test/CodeGen/X86/elf-comdat.ll @@ -0,0 +1,11 @@ +; RUN: llc -mtriple x86_64-pc-linux-gnu < %s | FileCheck %s + +$f = comdat any +@v = global i32 0, comdat $f +define void @f() comdat $f { + ret void +} +; CHECK: .section .text.f,"axG",@progbits,f,comdat +; CHECK: .globl f +; CHECK: .section .bss.v,"aGw",@nobits,f,comdat +; CHECK: .globl v Index: test/CodeGen/X86/elf-comdat2.ll =================================================================== --- /dev/null +++ test/CodeGen/X86/elf-comdat2.ll @@ -0,0 +1,12 @@ +; RUN: llc -mtriple x86_64-pc-linux-gnu < %s | FileCheck %s + +$foo = comdat any +@bar = global i32 42, comdat $foo +@foo = global i32 42 + +; CHECK: .type bar,@object +; CHECK-NEXT: .section .data.bar,"aGw",@progbits,foo,comdat +; CHECK-NEXT: .globl bar +; CHECK: .type foo,@object +; CHECK-NEXT: .data +; CHECK-NEXT: .globl foo Index: test/CodeGen/X86/macho-comdat.ll =================================================================== --- /dev/null +++ test/CodeGen/X86/macho-comdat.ll @@ -0,0 +1,6 @@ +; RUN: not llc -mtriple x86_64-apple-darwin < %s 2> %t +; RUN: FileCheck < %t %s + +$f = comdat any +@v = global i32 0, comdat $f +; CHECK: LLVM ERROR: MachO doesn't support COMDATs, 'f' cannot be lowered. Index: test/Feature/comdat.ll =================================================================== --- /dev/null +++ test/Feature/comdat.ll @@ -0,0 +1,15 @@ +; RUN: llvm-as < %s | llvm-dis | FileCheck %s + +$f = comdat any +; CHECK: $f = comdat any + +$f2 = comdat any +; CHECK-NOT: f2 + +@v = global i32 0, comdat $f +; CHECK: @v = global i32 0, comdat $f + +define void @f() comdat $f { + ret void +} +; CHECK: define void @f() comdat $f Index: test/Instrumentation/AddressSanitizer/do-not-instrument-llvm-metadata.ll =================================================================== --- test/Instrumentation/AddressSanitizer/do-not-instrument-llvm-metadata.ll +++ test/Instrumentation/AddressSanitizer/do-not-instrument-llvm-metadata.ll @@ -5,7 +5,7 @@ target triple = "x86_64-unknown-linux-gnu" @.str_noinst = private unnamed_addr constant [4 x i8] c"aaa\00", section "llvm.metadata" -@.str_inst = private unnamed_addr constant [4 x i8] c"aaa\00", +@.str_inst = private unnamed_addr constant [4 x i8] c"aaa\00" ; CHECK-NOT: {{asan_gen.*str_noinst}} ; CHECK: {{asan_gen.*str_inst}} Index: test/Linker/Inputs/comdat.ll =================================================================== --- /dev/null +++ test/Linker/Inputs/comdat.ll @@ -0,0 +1,20 @@ +target datalayout = "e-m:w-p:32:32-i64:64-f80:32-n8:16:32-S32" +target triple = "i686-pc-windows-msvc" + +$foo = comdat largest +@foo = global i64 43, comdat $foo + +define i32 @bar() comdat $foo { + ret i32 43 +} + +$qux = comdat largest +@qux = global i32 13, comdat $qux +@in_unselected_group = global i32 13, comdat $qux + +define i32 @baz() comdat $qux { + ret i32 13 +} + +$any = comdat any +@any = global i64 7, comdat $any Index: test/Linker/Inputs/comdat2.ll =================================================================== --- /dev/null +++ test/Linker/Inputs/comdat2.ll @@ -0,0 +1,2 @@ +$foo = comdat largest +@foo = global i64 43, comdat $foo Index: test/Linker/Inputs/comdat3.ll =================================================================== --- /dev/null +++ test/Linker/Inputs/comdat3.ll @@ -0,0 +1,2 @@ +$foo = comdat noduplicates +@foo = global i64 43, comdat $foo Index: test/Linker/Inputs/comdat4.ll =================================================================== --- /dev/null +++ test/Linker/Inputs/comdat4.ll @@ -0,0 +1,5 @@ +target datalayout = "e-m:w-p:32:32-i64:64-f80:32-n8:16:32-S32" +target triple = "i686-pc-windows-msvc" + +$foo = comdat samesize +@foo = global i64 42, comdat $foo Index: test/Linker/Inputs/comdat5.ll =================================================================== --- /dev/null +++ test/Linker/Inputs/comdat5.ll @@ -0,0 +1,15 @@ +target datalayout = "e-m:w-p:32:32-i64:64-f80:32-n8:16:32-S32" +target triple = "i686-pc-windows-msvc" + +%MSRTTICompleteObjectLocator = type { i32, i32, i32, i8*, %MSRTTIClassHierarchyDescriptor* } +%MSRTTIClassHierarchyDescriptor = type { i32, i32, i32, %MSRTTIBaseClassDescriptor** } +%MSRTTIBaseClassDescriptor = type { i8*, i32, i32, i32, i32, i32, %MSRTTIClassHierarchyDescriptor* } +%struct.S = type { i32 (...)** } + +$"\01??_7S@@6B@" = comdat largest + +@"\01??_R4S@@6B@" = external constant %MSRTTICompleteObjectLocator +@some_name = linkonce_odr unnamed_addr constant [2 x i8*] [i8* bitcast (%MSRTTICompleteObjectLocator* @"\01??_R4S@@6B@" to i8*), i8* bitcast (void (%struct.S*, i32)* @"\01??_GS@@UAEPAXI@Z" to i8*)], comdat $"\01??_7S@@6B@" +@"\01??_7S@@6B@" = alias getelementptr([2 x i8*]* @some_name, i32 0, i32 1) + +declare x86_thiscallcc void @"\01??_GS@@UAEPAXI@Z"(%struct.S*, i32) unnamed_addr Index: test/Linker/comdat.ll =================================================================== --- /dev/null +++ test/Linker/comdat.ll @@ -0,0 +1,32 @@ +; RUN: llvm-link %s %p/Inputs/comdat.ll -S -o - | FileCheck %s +target datalayout = "e-m:w-p:32:32-i64:64-f80:32-n8:16:32-S32" +target triple = "i686-pc-windows-msvc" + +$foo = comdat largest +@foo = global i32 42, comdat $foo + +define i32 @bar() comdat $foo { + ret i32 42 +} + +$qux = comdat largest +@qux = global i64 12, comdat $qux + +define i32 @baz() comdat $qux { + ret i32 12 +} + +$any = comdat any +@any = global i64 6, comdat $any + +; CHECK: $qux = comdat largest +; CHECK: $foo = comdat largest +; CHECK: $any = comdat any + +; CHECK: @qux = global i64 12, comdat $qux +; CHECK: @any = global i64 6, comdat $any +; CHECK: @foo = global i64 43, comdat $foo +; CHECK-NOT: @in_unselected_group = global i32 13, comdat $qux + +; CHECK: define i32 @baz() comdat $qux +; CHECK: define i32 @bar() comdat $foo Index: test/Linker/comdat2.ll =================================================================== --- /dev/null +++ test/Linker/comdat2.ll @@ -0,0 +1,7 @@ +; RUN: not llvm-link %s %p/Inputs/comdat.ll -S -o - 2>&1 | FileCheck %s +target datalayout = "e-m:w-p:32:32-i64:64-f80:32-n8:16:32-S32" +target triple = "i686-pc-windows-msvc" + +$foo = comdat samesize +@foo = global i32 42, comdat $foo +; CHECK: Linking COMDATs named 'foo': invalid selection kinds! Index: test/Linker/comdat3.ll =================================================================== --- /dev/null +++ test/Linker/comdat3.ll @@ -0,0 +1,5 @@ +; RUN: not llvm-link %s %p/Inputs/comdat2.ll -S -o - 2>&1 | FileCheck %s + +$foo = comdat largest +@foo = global i32 43, comdat $foo +; CHECK: Linking COMDATs named 'foo': can't do size dependent selection without DataLayout! Index: test/Linker/comdat4.ll =================================================================== --- /dev/null +++ test/Linker/comdat4.ll @@ -0,0 +1,5 @@ +; RUN: not llvm-link %s %p/Inputs/comdat3.ll -S -o - 2>&1 | FileCheck %s + +$foo = comdat noduplicates +@foo = global i64 43, comdat $foo +; CHECK: Linking COMDATs named 'foo': noduplicates has been violated! Index: test/Linker/comdat5.ll =================================================================== --- /dev/null +++ test/Linker/comdat5.ll @@ -0,0 +1,7 @@ +; RUN: not llvm-link %s %p/Inputs/comdat4.ll -S -o - 2>&1 | FileCheck %s +target datalayout = "e-m:w-p:32:32-i64:64-f80:32-n8:16:32-S32" +target triple = "i686-pc-windows-msvc" + +$foo = comdat samesize +@foo = global i32 42, comdat $foo +; CHECK: Linking COMDATs named 'foo': SameSize violated! Index: test/Linker/comdat6.ll =================================================================== --- /dev/null +++ test/Linker/comdat6.ll @@ -0,0 +1,13 @@ +; RUN: llvm-link %s %p/Inputs/comdat5.ll -S -o - 2>&1 | FileCheck %s +; RUN: llvm-link %p/Inputs/comdat5.ll %s -S -o - 2>&1 | FileCheck %s +target datalayout = "e-m:w-p:32:32-i64:64-f80:32-n8:16:32-S32" +target triple = "i686-pc-windows-msvc" + +%struct.S = type { i32 (...)** } + +$"\01??_7S@@6B@" = comdat largest +@"\01??_7S@@6B@" = linkonce_odr unnamed_addr constant [1 x i8*] [i8* bitcast (void (%struct.S*, i32)* @"\01??_GS@@UAEPAXI@Z" to i8*)], comdat $"\01??_7S@@6B@" + +; CHECK: @"\01??_7S@@6B@" = alias getelementptr inbounds ([2 x i8*]* @some_name, i32 0, i32 1) + +declare x86_thiscallcc void @"\01??_GS@@UAEPAXI@Z"(%struct.S*, i32) unnamed_addr Index: test/Verifier/comdat.ll =================================================================== --- /dev/null +++ test/Verifier/comdat.ll @@ -0,0 +1,5 @@ +; RUN: not llvm-as %s -o /dev/null 2>&1 | FileCheck %s + +$v = comdat any +@v = common global i32 0, comdat $v +; CHECK: 'common' global may not be in a Comdat! Index: test/Verifier/comdat2.ll =================================================================== --- /dev/null +++ test/Verifier/comdat2.ll @@ -0,0 +1,5 @@ +; RUN: not llvm-as %s -o /dev/null 2>&1 | FileCheck %s + +$v = comdat any +@v = private global i32 0, comdat $v +; CHECK: comdat global value has local linkage