This is an archive of the discontinued LLVM Phabricator instance.


Differential D58749

[index-while-building] IndexRecordHasher
Abandoned · Public

Authored by jkorous on Feb 27 2019, 4:39 PM.

Details

Reviewers

nathawes
akyrtzi
arphaman
dexonsmith
ioeric
malaperle
kadircet
gribozavr

Summary

Another piece of index-while-building functionality.
RFC: http://lists.llvm.org/pipermail/cfe-dev/2019-February/061432.html

Originally part of review https://reviews.llvm.org/D39050

This implementation is covered by lit tests that exercise it through c-index-test in an upcoming patch. It is split off only to make the patch smaller and easier to review.

Diff Detail

Event Timeline

jkorous created this revision. · Feb 27 2019, 4:39 PM

Herald added a project: Restricted Project. · Feb 27 2019, 4:39 PM

Herald added subscribers: cfe-commits, jdoerfert, mgorny.

Adding clangd folks in case they want to take a look.

kadircet added inline comments. · Mar 8 2019, 2:35 AM

clang/lib/Index/IndexRecordHasher.h
42	Why expose hashing functionality for other types?

I left some comments, but it is difficult for me to review without understanding what the requirements for this hasher are, why some information is hashed in, and some is left out, could you clarify? See detailed comments.

> This implementation is covered by lit tests using it through c-index-test in an upcoming patch.

This code looks very unit-testable. It also handles many corner cases (what information is hashed and what is left out). c-index-tests are integration tests that are not as good at that, and also they would be testing this code quite indirectly.

clang/lib/Index/IndexRecordHasher.cpp
175 ↗	(On Diff #188646)	"caching all decls" => "caching hashes for all decls"
191 ↗	(On Diff #188646)	I don't think that hiding a "CreateUnsafe" in an operation with a benign name "asCanon" is a good idea. Why not inline this lambda everywhere, it looks like it is defined to only save typing a few characters.
207 ↗	(On Diff #188646)	What about other qualifiers? Why not use `Qualifiers::getAsOpaqueValue()` instead of manually converting a subset of qualifiers to unsigned?
209 ↗	(On Diff #188646)	Is this a FIXME or?..
215 ↗	(On Diff #188646)	Why not hash in `T->getTypeClass()` uniformly for all types, instead of inventing a random sigil?
219 ↗	(On Diff #188646)	There is LValueReferenceType and RValueReferenceType, do we care about the difference? If not, why not?
254 ↗	(On Diff #188646)	What about all other types -- array type, function type, decltype(), ...? What about attributes?
277 ↗	(On Diff #188646)	I had to read the implementation of this function to understand what it does. How about renaming it to "cachedHash()" ?
291 ↗	(On Diff #188646)	hashImpl => cachedHashImpl
409 ↗	(On Diff #188646)	So what is the failure mode for unhandled types, what is the effect on the whole system?
476 ↗	(On Diff #188646)	I'd suggest removing this comment. It is more confusing than helpful. It makes the code look like there's some processing (like a "break" in other cases), but on close inspection it turns out to be a comment. This kind of fallthrough is pretty common and I don't think it requires a comment. (For example, hashImpl above has this kind of fallthrough, but does not have comments like this.)
clang/lib/Index/IndexRecordHasher.h
32	Could you add some explanation about the information that is being hashed? What is the underlying principle behind choosing to hash some aspect of the AST node (or to skip it). For example, I see you're hashing names, but not, say source locations. Or that QualTypes are first translated into canonical types. What are the completeness constraints for this hasher? What happens if we don't hash something that we should have? "Caching all produced hashes" seems like an implementation comment. Especially since the implementation chooses not to cache some of the hashes.
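The point about ad hoc sigils versus a uniform type-class tag can be illustrated with a toy model. In the diff, both PointerType and ObjCObjectPointerType are combined with the character '*'. The sketch below is not Clang code: `ToyTypeClass`, `mix`, `hashWithSigil` and `hashWithClass` are hypothetical names, and the mixing step is a simple FNV-1a style function standing in for `llvm::hash_combine`.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-ins for illustration only; these are not Clang's type classes.
enum class ToyTypeClass : std::uint64_t {
  Pointer = 1,
  ObjCObjectPointer = 2,
  LValueReference = 3
};

// FNV-1a style mixing step (an assumption of this sketch, not hash_combine).
inline std::uint64_t mix(std::uint64_t H, std::uint64_t V) {
  return (H ^ V) * 1099511628211ULL;
}

// Hand-picked sigils: two different classes share '*', as in the diff.
inline char sigilFor(ToyTypeClass C) {
  switch (C) {
  case ToyTypeClass::Pointer:           return '*';
  case ToyTypeClass::ObjCObjectPointer: return '*';
  case ToyTypeClass::LValueReference:   return '&';
  }
  return '?';
}

// Variant 1: hash the hand-picked sigil, then the pointee hash.
inline std::uint64_t hashWithSigil(ToyTypeClass C, std::uint64_t PointeeHash) {
  return mix(mix(5381, static_cast<unsigned char>(sigilFor(C))), PointeeHash);
}

// Variant 2: hash the class tag itself, uniformly for every class.
inline std::uint64_t hashWithClass(ToyTypeClass C, std::uint64_t PointeeHash) {
  return mix(mix(5381, static_cast<std::uint64_t>(C)), PointeeHash);
}

inline void checkSigilExample() {
  // Shared sigil: the two classes are indistinguishable to the hasher.
  assert(hashWithSigil(ToyTypeClass::Pointer, 42) ==
         hashWithSigil(ToyTypeClass::ObjCObjectPointer, 42));
  // Uniform class tag: the two classes always hash differently here.
  assert(hashWithClass(ToyTypeClass::Pointer, 42) !=
         hashWithClass(ToyTypeClass::ObjCObjectPointer, 42));
}
```

Whether two classes should share a hash is a specification question; the sketch only shows that a uniform tag makes that decision explicit instead of depending on which characters happen to be picked.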

Based on Kadir's comment, I refactored the code.

Herald added a subscriber: jfb. · Mar 12 2019, 1:24 PM

jkorous marked an inline comment as done. · Mar 12 2019, 1:25 PM

In D58749#1426270, @gribozavr wrote:

> I left some comments, but it is difficult for me to review without understanding what the requirements for this hasher are, why some information is hashed in, and some is left out, could you clarify? See detailed comments.

Will do.

>> This implementation is covered by lit tests using it through c-index-test in an upcoming patch.

> This code looks very unit-testable. It also handles many corner cases (what information is hashed and what is left out). c-index-tests are integration tests that are not as good at that, and also they would be testing this code quite indirectly.

I basically didn't really like the idea of testing against hard-coded hash values in unittests as I consider it to be an implementation detail. I was thinking about integration tests that would work around this by both writing index and reading index and writing assertions against that data. What do you think?

> I basically didn't really like the idea of testing against hard-coded hash values in unittests as I consider it to be an implementation detail. I was thinking about integration tests that would work around this by both writing index and reading index and writing assertions against that data. What do you think?

Instead of testing against hardcoded values, maybe you can test for differences? That is, make sure hashes differ or stay the same when the corresponding changes are made to the contents.

> I basically didn't really like the idea of testing against hard-coded hash values in unittests as I consider it to be an implementation detail.

Sorry, that's not what I was suggesting. There are better ways to test hashing. For example, write a piece of source code and check that the hash codes for two declarations that differ in only one aspect are the same or different.

For example,

void f(int &&);
void f(int &);

and then assert that the hashes for the two decls are the same or different, depending on the desired specification.
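A differential test of this kind might look like the sketch below. It does not use Clang at all: `ToyFunction`, `ToyParam`, `RefKind` and `hashFunction` are hypothetical stand-ins for a parsed declaration and the hasher, and the `DistinguishRefKinds` flag plays the role of "the desired specification".

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Toy model of a declaration; not Clang's AST.
enum class RefKind : std::uint64_t { None = 0, LValue = 1, RValue = 2 };
struct ToyParam { std::string Base; RefKind Ref; };
struct ToyFunction { std::string Name; std::vector<ToyParam> Params; };

// FNV-1a style mixing step, used only for this illustration.
inline std::uint64_t mix(std::uint64_t H, std::uint64_t V) {
  return (H ^ V) * 1099511628211ULL;
}

inline std::uint64_t hashString(std::uint64_t H, const std::string &S) {
  for (unsigned char C : S)
    H = mix(H, C);
  return mix(H, S.size()); // length keeps adjacent strings apart
}

// DistinguishRefKinds selects the specification under test.
inline std::uint64_t hashFunction(const ToyFunction &F,
                                  bool DistinguishRefKinds) {
  std::uint64_t H = hashString(5381, F.Name);
  for (const ToyParam &P : F.Params) {
    H = hashString(H, P.Base);
    RefKind K = P.Ref;
    if (!DistinguishRefKinds && K == RefKind::RValue)
      K = RefKind::LValue; // treat both reference kinds alike
    H = mix(H, static_cast<std::uint64_t>(K));
  }
  return H;
}

inline void checkReferenceKinds() {
  ToyFunction RValueF{"f", {{"int", RefKind::RValue}}}; // void f(int &&);
  ToyFunction LValueF{"f", {{"int", RefKind::LValue}}}; // void f(int &);
  // Specification A: reference kinds matter, so the hashes must differ.
  assert(hashFunction(RValueF, true) != hashFunction(LValueF, true));
  // Specification B: reference kinds are ignored, so the hashes must match.
  assert(hashFunction(RValueF, false) == hashFunction(LValueF, false));
}
```

No hash value is hard-coded; the assertions only compare two hashes against each other, so they keep passing if the mixing function changes.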

> I was thinking about integration tests that would work around this by both writing index and reading index and writing assertions against that data.

How is hashing used in this process?

Addressed some comments, going to update the diff.

clang/lib/Index/IndexRecordHasher.cpp
291 ↗	(On Diff #188646)	Honestly, I am not sure this would be better. I added a comment about this set of methods in the header file. Wouldn't mind renaming them but think `hashImpl` is actually quite accurate.
409 ↗	(On Diff #188646)	Seems like just the `InitialHash` is returned at the moment. I guess using llvm::Optional<hash_code> as a return type would be better. WDYT?
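The optional-return idea can be sketched with `std::optional`, used here as a standard-library stand-in for `llvm::Optional<hash_code>`. The enum `ToyTypeClass` and the function `hashToyType` are hypothetical, and which classes count as handled is chosen only for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Toy type classes; which ones are "handled" is an assumption of the sketch.
enum class ToyTypeClass : std::uint64_t { Builtin = 1, Pointer = 2, Vector = 3 };

// Returns a hash for handled classes and std::nullopt for unhandled ones,
// so a caller can tell "no hash available" apart from a real hash value.
inline std::optional<std::uint64_t> hashToyType(ToyTypeClass C) {
  const std::uint64_t Seed = 5381;
  switch (C) {
  case ToyTypeClass::Builtin:
  case ToyTypeClass::Pointer:
    return (Seed ^ static_cast<std::uint64_t>(C)) * 1099511628211ULL;
  case ToyTypeClass::Vector:
    return std::nullopt; // unhandled: report it instead of returning Seed
  }
  return std::nullopt;
}

inline void checkOptionalHash() {
  assert(hashToyType(ToyTypeClass::Builtin).has_value());
  assert(!hashToyType(ToyTypeClass::Vector).has_value());
}
```

With a bare seed as the fallback, every unhandled type hashes to the same value and the caller cannot detect it; with an optional, the caller has to decide explicitly what to do.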

Addressed some of Dmitri's comments.

In D58749#1426769, @kadircet

In D58749#1426778, @gribozavr

I see what you mean now. That's a good idea. I'll add some unit tests.

We ran performance tests of an alternative approach: hashing the serialized bitcode representation directly. It is a regression: the current implementation adds approximately 2.2% to build time, while the alternative approach adds 3.8%.

We are not happy about the regression but the best way to fix the current implementation seems to be using the alternative approach as a temporary solution. The plan is to move on with upstreaming the rest of the index-while-building so we can optimize the hasher after i-w-b lands - possibly using the ODR violation hasher.

This means the whole implementation will become just:

hash_code RecordHash = hash_combine_range(State.Buffer.begin(), State.Buffer.end());

where

SmallString<512> Buffer;
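As a rough illustration of hashing a serialized buffer in one pass, the sketch below hashes the bytes of a `std::string` with FNV-1a. This is a stand-in, not the planned implementation: the real code would call `hash_combine_range` on the `SmallString`, and `toyHashBytes` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// FNV-1a over the raw bytes of an already-serialized record.
// Stand-in for hash_combine_range(Buffer.begin(), Buffer.end()).
inline std::uint64_t toyHashBytes(const std::string &Buffer) {
  std::uint64_t H = 14695981039346656037ULL; // FNV-1a 64-bit offset basis
  for (unsigned char C : Buffer) {
    H ^= C;
    H *= 1099511628211ULL; // FNV-1a 64-bit prime
  }
  return H;
}

inline void checkBufferHash() {
  // Identical serialized bytes give identical hashes.
  assert(toyHashBytes("record-v1") == toyHashBytes("record-v1"));
  // A change in a single byte changes the hash.
  assert(toyHashBytes("record-v1") != toyHashBytes("record-v2"));
}
```

The appeal of this approach is that the hasher needs no knowledge of the AST: anything that changes the serialized record changes the hash.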

Because this requires a change to the interface of IndexRecordWriter, I'll abandon this patch and land the new hasher later.

jkorous abandoned this revision. · Apr 12 2019, 2:42 PM

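Revision Contents

Path	Size
clang/lib/Index/CMakeLists.txt	3 lines
clang/lib/Index/IndexRecordHasher.h	25 lines
clang/lib/Index/RecordHasher/CMakeLists.txt	15 lines
clang/lib/Index/RecordHasher/CachingHasher.h	60 lines
clang/lib/Index/RecordHasher/CachingHasher.cpp	337 lines
clang/lib/Index/RecordHasher/DeclHasher.h	51 lines
clang/lib/Index/RecordHasher/DeclHasher.cpp	140 lines
clang/lib/Index/RecordHasher/IndexRecordHasher.cpp	33 lines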

Diff 190342

clang/lib/Index/CMakeLists.txt

(22 unchanged lines not shown; context: add_clang_library(clangIndex)
  clangAST
  clangBasic
  clangFormat
  clangFrontend
  clangLex
  clangRewrite
  clangSerialization
  clangToolingCore
+ clangIndexRecordHasher
  )

+ add_subdirectory(RecordHasher)

clang/lib/Index/IndexRecordHasher.h

This file was added.

				//===--- IndexRecordHasher.h - Hashing of FileIndexRecord -------- C++ --===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_CLANG_LIB_INDEX_INDEXRECORDHASHER_H
				#define LLVM_CLANG_LIB_INDEX_INDEXRECORDHASHER_H

				#include "clang/AST/ASTContext.h"
				#include "llvm/ADT/Hashing.h"

				namespace clang {
				namespace index {
				class FileIndexRecord;

				/// \returns hash of the \p Record
				llvm::hash_code hashRecord(ASTContext &Ctx, const FileIndexRecord &Record);

				} // end namespace index
				} // end namespace clang

				#endif // LLVM_CLANG_LIB_INDEX_INDEXRECORDHASHER_H
Inline comment (kadircet, Done): Why expose hashing functionality for other types?
Inline comment (gribozavr, Not Done): Could you add some explanation about the information that is being hashed? What is the underlying principle behind choosing to hash some aspect of the AST node (or to skip it). For example, I see you're hashing names, but not, say source locations. Or that QualTypes are first translated into canonical types. What are the completeness constraints for this hasher? What happens if we don't hash something that we should have? "Caching all produced hashes" seems like an implementation comment. Especially since the implementation chooses not to cache some of the hashes.

clang/lib/Index/RecordHasher/CMakeLists.txt

This file was added.

				include_directories(${CMAKE_CURRENT_SOURCE_DIR}/..)

				set(LLVM_LINK_COMPONENTS
				Support
				)

				add_clang_library(clangIndexRecordHasher
				IndexRecordHasher.cpp
				DeclHasher.cpp
				CachingHasher.cpp

				LINK_LIBS
				clangAST
				clangBasic
				)

clang/lib/Index/RecordHasher/CachingHasher.h

This file was added.

				//===--- CachingHasher.h - Hashing of indexed entities ----------- C++ --===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_CLANG_LIB_INDEX_RECORDHASHER_CACHINGHASHER_H
				#define LLVM_CLANG_LIB_INDEX_RECORDHASHER_CACHINGHASHER_H

				#include "clang/AST/ASTContext.h"
				#include "clang/Basic/LLVM.h"
				#include "llvm/ADT/DenseMap.h"
				#include "llvm/ADT/Hashing.h"

				namespace clang {
				namespace index {

				constexpr size_t InitialHash = 5381;

				/// Utility class implementing hashing and caching of hashes.
				class CachingHasher {
				ASTContext &Ctx;
				llvm::DenseMap<const void *, llvm::hash_code> HashByPtr;

				public:
				explicit CachingHasher(ASTContext &Ctx) : Ctx(Ctx) {}
				ASTContext &getASTContext() { return Ctx; }

				/// Public interface that implements caching strategy.
				llvm::hash_code hash(const Decl *D);
				llvm::hash_code hash(QualType Ty);
				llvm::hash_code hash(CanQualType Ty);
				llvm::hash_code hash(DeclarationName Name);
				llvm::hash_code hash(const NestedNameSpecifier *NNS);
				llvm::hash_code hash(const TemplateArgument &Arg);

				private:
				/// \returns hash of \p Obj.
				/// Uses cached value if it exists otherwise calculates the hash, adds it to
				/// the cache and returns.
				template <typename T> llvm::hash_code getCachedHash(const void *Ptr, T Obj);

				// Private methods implement hashing itself. Intentionally hidden from client
				// (DeclHasher) to prevent accidental caching bypass.
				llvm::hash_code hashImpl(const Decl *D);
				llvm::hash_code hashImpl(CanQualType Ty);
				llvm::hash_code hashImpl(DeclarationName Name);
				llvm::hash_code hashImpl(const NestedNameSpecifier *NNS);
				llvm::hash_code hashImpl(const TemplateArgument &Arg);
				llvm::hash_code hashImpl(const IdentifierInfo *II);
				llvm::hash_code hashImpl(Selector Sel);
				llvm::hash_code hashImpl(TemplateName Name);
				};

				} // end namespace index
				} // end namespace clang

				#endif // LLVM_CLANG_LIB_INDEX_RECORDHASHER_CACHINGHASHER_H

clang/lib/Index/RecordHasher/CachingHasher.cpp

This file was added.

				//===--- CachingHasher.cpp - Hashing of indexed entities --------- C++ --===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#include "RecordHasher/CachingHasher.h"
				#include "RecordHasher/DeclHasher.h"

				namespace clang {
				namespace index {

				using llvm::hash_code;

				hash_code CachingHasher::hash(const Decl *D) {
				assert(D->isCanonicalDecl());

				if (isa<TagDecl>(D) || isa<ObjCContainerDecl>(D)) {
				return getCachedHash(D, D);
				} else if (auto *NS = dyn_cast<NamespaceDecl>(D)) {
				if (NS->isAnonymousNamespace())
				return hash_value(StringRef("@aN"));
				return getCachedHash(D, D);
				} else {
				// There's a balance between caching results and not growing the cache too
				// much. Measurements showed that avoiding caching hashes for all decls is
				// beneficial particularly when including all of Cocoa.
				return hashImpl(D);
				}
				}

				hash_code CachingHasher::hash(QualType NonCanTy) {
				CanQualType CanTy = Ctx.getCanonicalType(NonCanTy);
				return hash(CanTy);
				}

				hash_code CachingHasher::hash(CanQualType CT) {
				// Do some hashing without going to the cache, for example we can avoid
				// storing the hash for both the type and its const-qualified version.
				hash_code Hash = InitialHash;

				while (true) {
				Qualifiers Q = CT.getQualifiers();
				CT = CT.getUnqualifiedType();
				const Type *T = CT.getTypePtr();
				unsigned qVal = 0;
				if (Q.hasConst())
				qVal |= 0x1;
				if (Q.hasVolatile())
				qVal |= 0x2;
				if (Q.hasRestrict())
				qVal |= 0x4;
				if (qVal)
				Hash = hash_combine(Hash, qVal);

				// FIXME: Hash in ObjC GC qualifiers

				if (const BuiltinType *BT = dyn_cast<BuiltinType>(T)) {
				return hash_combine(Hash, BT->getKind());
				}
				if (const PointerType *PT = dyn_cast<PointerType>(T)) {
				Hash = hash_combine(Hash, '*');
				CT = CanQualType::CreateUnsafe(PT->getPointeeType());
				continue;
				}
				if (const ReferenceType *RT = dyn_cast<ReferenceType>(T)) {
				Hash = hash_combine(Hash, '&');
				CT = CanQualType::CreateUnsafe(RT->getPointeeType());
				continue;
				}
				if (const BlockPointerType *BT = dyn_cast<BlockPointerType>(T)) {
				Hash = hash_combine(Hash, 'B');
				CT = CanQualType::CreateUnsafe(BT->getPointeeType());
				continue;
				}
				if (const ObjCObjectPointerType *OPT = dyn_cast<ObjCObjectPointerType>(T)) {
				Hash = hash_combine(Hash, '*');
				CT = CanQualType::CreateUnsafe(OPT->getPointeeType());
				continue;
				}
				if (const TagType *TT = dyn_cast<TagType>(T)) {
				return hash_combine(Hash, '$', hash(TT->getDecl()->getCanonicalDecl()));
				}
				if (const ObjCInterfaceType *OIT = dyn_cast<ObjCInterfaceType>(T)) {
				return hash_combine(Hash, '$', hash(OIT->getDecl()->getCanonicalDecl()));
				}
				if (const ObjCObjectType *OIT = dyn_cast<ObjCObjectType>(T)) {
				for (auto *Prot : OIT->getProtocols())
				Hash = hash_combine(Hash, hash(Prot));
				CT = CanQualType::CreateUnsafe(OIT->getBaseType());
				continue;
				}
				if (const TemplateTypeParmType *TTP = dyn_cast<TemplateTypeParmType>(T)) {
				return hash_combine(Hash, 't', TTP->getDepth(), TTP->getIndex());
				}
				if (const InjectedClassNameType *InjT = dyn_cast<InjectedClassNameType>(T)) {
				CT = CanQualType::CreateUnsafe(InjT->getInjectedSpecializationType().getCanonicalType());
				continue;
				}

				break;
				}

				return hash_combine(Hash, getCachedHash(CT.getAsOpaquePtr(), CT));
				}

				hash_code CachingHasher::hash(DeclarationName Name) {
				assert(!Name.isEmpty());
				// Measurements for using cache or not here, showed significant slowdown when
				// using the cache for all DeclarationNames when parsing Cocoa, and minor
				// improvement or no difference for a couple of C++ single translation unit
				// files. So we avoid caching DeclarationNames.
				return hashImpl(Name);
				}

				hash_code CachingHasher::hash(const NestedNameSpecifier *NNS) {
				assert(NNS);
				// Measurements for the C++ single translation unit files did not show much
				// difference here; choosing to cache them currently.
				return getCachedHash(NNS, NNS);
				}

				hash_code CachingHasher::hash(const TemplateArgument &Arg) {
				// No caching.
				return hashImpl(Arg);
				}

				template <typename T>
				hash_code CachingHasher::getCachedHash(const void *Ptr, T Obj) {
				auto It = HashByPtr.find(Ptr);
				if (It != HashByPtr.end())
				return It->second;

				hash_code Hash = hashImpl(Obj);
				// hashImpl() may call into getCachedHash recursively and mutate
				// HashByPtr, so we use find() earlier and insert the hash with another
				// lookup here instead of calling insert() earlier and utilizing the iterator
				// that insert() returns.
				HashByPtr[Ptr] = Hash;
				return Hash;
				}

				hash_code CachingHasher::hashImpl(const Decl *D) {
				return DeclHasher(*this).Visit(D);
				}

				hash_code CachingHasher::hashImpl(const IdentifierInfo *II) {
				return hash_value(II->getName());
				}

				hash_code CachingHasher::hashImpl(Selector Sel) {
				unsigned N = Sel.getNumArgs();
				if (N == 0)
				++N;
				hash_code Hash = InitialHash;
				for (unsigned I = 0; I != N; ++I)
				if (IdentifierInfo *II = Sel.getIdentifierInfoForSlot(I))
				Hash = hash_combine(Hash, hashImpl(II));
				return Hash;
				}

				hash_code CachingHasher::hashImpl(TemplateName Name) {
				hash_code Hash = InitialHash;
				if (TemplateDecl *Template = Name.getAsTemplateDecl()) {
				if (TemplateTemplateParmDecl *TTP
				= dyn_cast<TemplateTemplateParmDecl>(Template)) {
				return hash_combine(Hash, 't', TTP->getDepth(), TTP->getIndex());
				}

				return hash_combine(Hash, hash(Template->getCanonicalDecl()));
				}

				// FIXME: Hash dependent template names.
				return Hash;
				}

				hash_code CachingHasher::hashImpl(const TemplateArgument &Arg) {
				hash_code Hash = InitialHash;

				switch (Arg.getKind()) {
				case TemplateArgument::Null:
				break;

				case TemplateArgument::Declaration:
				Hash = hash_combine(Hash, hash(Arg.getAsDecl()));
				break;

				case TemplateArgument::NullPtr:
				break;

				case TemplateArgument::TemplateExpansion:
				Hash = hash_combine(Hash, 'P'); // pack expansion of...
				LLVM_FALLTHROUGH;
				case TemplateArgument::Template:
				Hash = hash_combine(Hash, hashImpl(Arg.getAsTemplateOrTemplatePattern()));
				break;

				case TemplateArgument::Expression:
				// FIXME: Hash expressions.
				break;

				case TemplateArgument::Pack:
				Hash = hash_combine(Hash, 'p');
				for (const auto &P : Arg.pack_elements())
				Hash = hash_combine(Hash, hashImpl(P));
				break;

				case TemplateArgument::Type:
				Hash = hash_combine(Hash, hash(Arg.getAsType()));
				break;

				case TemplateArgument::Integral:
				Hash = hash_combine(Hash, 'V', hash(Arg.getIntegralType()),
				Arg.getAsIntegral());
				break;
				}

				return Hash;
				}

				hash_code CachingHasher::hashImpl(CanQualType CQT) {
				hash_code Hash = InitialHash;

				auto asCanon = [](QualType Ty) -> CanQualType {
				return CanQualType::CreateUnsafe(Ty);
				};

				const Type *T = CQT.getTypePtr();

				if (const PackExpansionType *Expansion = dyn_cast<PackExpansionType>(T)) {
				return hash_combine(Hash, 'P', hash(asCanon(Expansion->getPattern())));
				}
				if (const RValueReferenceType *RT = dyn_cast<RValueReferenceType>(T)) {
				return hash_combine(Hash, '%', hash(asCanon(RT->getPointeeType())));
				}
				if (const FunctionProtoType *FT = dyn_cast<FunctionProtoType>(T)) {
				Hash = hash_combine(Hash, 'F', hash(asCanon(FT->getReturnType())));
				for (const auto &I : FT->param_types())
				Hash = hash_combine(Hash, hash(asCanon(I)));
				return hash_combine(Hash, FT->isVariadic());
				}
				if (const ComplexType *CT = dyn_cast<ComplexType>(T)) {
				return hash_combine(Hash, '<', hash(asCanon(CT->getElementType())));
				}
				if (const TemplateSpecializationType *Spec
				= dyn_cast<TemplateSpecializationType>(T)) {
				Hash = hash_combine(Hash, '>', hashImpl(Spec->getTemplateName()));
				for (unsigned I = 0, N = Spec->getNumArgs(); I != N; ++I)
				Hash = hash_combine(Hash, hashImpl(Spec->getArg(I)));
				return Hash;
				}
				if (const DependentNameType *DNT = dyn_cast<DependentNameType>(T)) {
				Hash = hash_combine(Hash, '^');
				if (const NestedNameSpecifier *NNS = DNT->getQualifier())
				Hash = hash_combine(Hash, hash(NNS));
				return hash_combine(Hash, hashImpl(DNT->getIdentifier()));
				}

				// Unhandled type.
				return Hash;
				}

				hash_code CachingHasher::hashImpl(DeclarationName Name) {
				hash_code Hash = InitialHash;
				Hash = hash_combine(Hash, Name.getNameKind());

				switch (Name.getNameKind()) {
				case DeclarationName::Identifier:
				Hash = hash_combine(Hash, hashImpl(Name.getAsIdentifierInfo()));
				break;
				case DeclarationName::ObjCZeroArgSelector:
				case DeclarationName::ObjCOneArgSelector:
				case DeclarationName::ObjCMultiArgSelector:
				Hash = hash_combine(Hash, hashImpl(Name.getObjCSelector()));
				break;
				case DeclarationName::CXXConstructorName:
				case DeclarationName::CXXDestructorName:
				case DeclarationName::CXXConversionFunctionName:
				break;
				case DeclarationName::CXXOperatorName:
				Hash = hash_combine(Hash, Name.getCXXOverloadedOperator());
				break;
				case DeclarationName::CXXLiteralOperatorName:
				Hash = hash_combine(Hash, hashImpl(Name.getCXXLiteralIdentifier()));
				break;
				case DeclarationName::CXXUsingDirective:
				break;
				case DeclarationName::CXXDeductionGuideName:
				Hash = hash_combine(Hash, hashImpl(Name.getCXXDeductionGuideTemplate()
				->getDeclName()
				.getAsIdentifierInfo()));
				break;
				}

				return Hash;
				}

				hash_code CachingHasher::hashImpl(const NestedNameSpecifier *NNS) {
				hash_code Hash = InitialHash;
				if (auto *Pre = NNS->getPrefix())
				Hash = hash_combine(Hash, hash(Pre));

				Hash = hash_combine(Hash, NNS->getKind());

				switch (NNS->getKind()) {
				case NestedNameSpecifier::Identifier:
				Hash = hash_combine(Hash, hashImpl(NNS->getAsIdentifier()));
				break;

				case NestedNameSpecifier::Namespace:
				Hash = hash_combine(Hash, hash(NNS->getAsNamespace()->getCanonicalDecl()));
				break;

				case NestedNameSpecifier::NamespaceAlias:
				Hash = hash_combine(Hash,
				hash(NNS->getAsNamespaceAlias()->getCanonicalDecl()));
				break;

				case NestedNameSpecifier::Global:
				break;

				case NestedNameSpecifier::Super:
				break;

				case NestedNameSpecifier::TypeSpecWithTemplate:
				case NestedNameSpecifier::TypeSpec:
				Hash = hash_combine(Hash, hash(QualType(NNS->getAsType(), 0)));
				break;
				}

				return Hash;
				}

				} // end namespace index
				} // end namespace clang
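
The find-then-insert pattern in getCachedHash above can be shown in isolation. The sketch below is a toy: `ToyNode` and `ToyCachingHasher` are hypothetical, and `std::unordered_map` stands in for `llvm::DenseMap`. The point is that hashImpl may recurse into the cache and insert entries, so the result is stored with a second lookup instead of through an iterator or reference obtained before the recursion.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Toy node with an optional parent; hashing a node hashes its parent first.
struct ToyNode {
  std::uint64_t Value;
  const ToyNode *Parent;
};

class ToyCachingHasher {
  std::unordered_map<const void *, std::uint64_t> HashByPtr;
  unsigned Computations = 0;

public:
  std::uint64_t hash(const ToyNode *N) {
    auto It = HashByPtr.find(N);
    if (It != HashByPtr.end())
      return It->second;
    // hashImpl may call hash() recursively and insert into HashByPtr,
    // so no iterator into the map is kept across this call.
    std::uint64_t H = hashImpl(N);
    HashByPtr[N] = H; // second lookup, performed after the recursion
    return H;
  }

  unsigned computations() const { return Computations; }

private:
  std::uint64_t hashImpl(const ToyNode *N) {
    ++Computations;
    std::uint64_t H = 5381;
    if (N->Parent)
      H = (H ^ hash(N->Parent)) * 1099511628211ULL;
    return (H ^ N->Value) * 1099511628211ULL;
  }
};

inline void checkCaching() {
  ToyNode Root{1, nullptr};
  ToyNode Child{2, &Root};
  ToyCachingHasher Hasher;
  std::uint64_t First = Hasher.hash(&Child);
  assert(Hasher.computations() == 2); // Child and Root, once each
  assert(Hasher.hash(&Child) == First);
  assert(Hasher.computations() == 2); // second call is served from the cache
}
```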

clang/lib/Index/RecordHasher/DeclHasher.h

This file was added.

				//===--- DeclHasher.h - Hashing of Decl nodes in AST ------------- C++ --===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_CLANG_LIB_INDEX_RECORDHASHER_DECLHASHER_H
				#define LLVM_CLANG_LIB_INDEX_RECORDHASHER_DECLHASHER_H

				#include "clang/AST/Decl.h"
				#include "clang/AST/DeclVisitor.h"
				#include "llvm/Support/Path.h"

				namespace clang {
				namespace index {

				class CachingHasher;

				/// Implements hashing for declaration nodes in AST.
				/// This is just a convenient way to avoid writing a huge switch for various
				/// types derived from Decl. Uses CachingHasher for hashing of atomic entities.

				class DeclHasher : public ConstDeclVisitor<DeclHasher, llvm::hash_code> {
				CachingHasher &Hasher;

				public:
				DeclHasher(CachingHasher &Hasher) : Hasher(Hasher) {}

				llvm::hash_code VisitDecl(const Decl *D);
				llvm::hash_code VisitNamedDecl(const NamedDecl *D);
				llvm::hash_code VisitTagDecl(const TagDecl *D);
				llvm::hash_code VisitClassTemplateSpecializationDecl(
				const ClassTemplateSpecializationDecl *D);
				llvm::hash_code VisitObjCContainerDecl(const ObjCContainerDecl *D);
				llvm::hash_code VisitObjCImplDecl(const ObjCImplDecl *D);
				llvm::hash_code VisitObjCCategoryDecl(const ObjCCategoryDecl *D);
				llvm::hash_code VisitFunctionDecl(const FunctionDecl *D);
				llvm::hash_code
				VisitUnresolvedUsingTypenameDecl(const UnresolvedUsingTypenameDecl *D);
				llvm::hash_code
				VisitUnresolvedUsingValueDecl(const UnresolvedUsingValueDecl *D);
				llvm::hash_code VisitDeclContext(const DeclContext *DC);
				llvm::hash_code hashLoc(SourceLocation Loc, bool IncludeOffset);
				};

				} // end namespace index
				} // end namespace clang

				#endif // LLVM_CLANG_LIB_INDEX_RECORDHASHER_DECLHASHER_H

clang/lib/Index/RecordHasher/DeclHasher.cpp

This file was added.

				//===--- DeclHasher.cpp - Hashing of Decl nodes in AST ----------- C++ --===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#include "RecordHasher/DeclHasher.h"
				#include "RecordHasher/CachingHasher.h"

				namespace clang {
				namespace index {

				using llvm::hash_code;

				hash_code DeclHasher::VisitDecl(const Decl *D) {
				return VisitDeclContext(D->getDeclContext());
				}

				hash_code DeclHasher::VisitNamedDecl(const NamedDecl *D) {
				hash_code Hash = VisitDecl(D);
				if (auto *attr = D->getExternalSourceSymbolAttr()) {
				Hash = hash_combine(Hash, hash_value(attr->getDefinedIn()));
				}
				return hash_combine(Hash, Hasher.hash(D->getDeclName()));
				}

				hash_code DeclHasher::VisitTagDecl(const TagDecl *D) {
				if (D->getDeclName().isEmpty()) {
				if (const TypedefNameDecl *TD = D->getTypedefNameForAnonDecl())
				return Visit(TD);

				hash_code Hash = VisitDeclContext(D->getDeclContext());
				if (D->isEmbeddedInDeclarator() && !D->isFreeStanding()) {
				Hash =
				hash_combine(Hash, hashLoc(D->getLocation(), /*IncludeOffset=*/true));
				} else
				Hash = hash_combine(Hash, 'a');
				return Hash;
				}

				hash_code Hash = VisitTypeDecl(D);
				return hash_combine(Hash, 'T');
				}

				hash_code DeclHasher::VisitClassTemplateSpecializationDecl(
				const ClassTemplateSpecializationDecl *D) {
				hash_code Hash = VisitCXXRecordDecl(D);
				const TemplateArgumentList &Args = D->getTemplateArgs();
				Hash = hash_combine(Hash, '>');
				for (unsigned I = 0, N = Args.size(); I != N; ++I) {
				Hash = hash_combine(Hash, Hasher.hash(Args.get(I)));
				}
				return Hash;
				}

				hash_code DeclHasher::VisitObjCContainerDecl(const ObjCContainerDecl *D) {
				hash_code Hash = VisitNamedDecl(D);
				return hash_combine(Hash, 'I');
				}

				hash_code DeclHasher::VisitObjCImplDecl(const ObjCImplDecl *D) {
				if (auto *ID = D->getClassInterface())
				return VisitObjCInterfaceDecl(ID);
				else
				return 0;
				}

				hash_code DeclHasher::VisitObjCCategoryDecl(const ObjCCategoryDecl *D) {
				// FIXME: Differentiate between category and the interface ?
				if (auto *ID = D->getClassInterface())
				return VisitObjCInterfaceDecl(ID);
				else
				return 0;
				}

				hash_code DeclHasher::VisitFunctionDecl(const FunctionDecl *D) {
				hash_code Hash = VisitNamedDecl(D);
				ASTContext &Ctx = Hasher.getASTContext();
				if ((!Ctx.getLangOpts().CPlusPlus && !D->hasAttr<OverloadableAttr>()) ||
				D->isExternC())
				return Hash;

				for (auto param : D->parameters()) {
				Hash = hash_combine(Hash, Hasher.hash(param->getType()));
				}
				return Hash;
				}

				hash_code DeclHasher::VisitUnresolvedUsingTypenameDecl(
				const UnresolvedUsingTypenameDecl *D) {
				hash_code Hash = VisitNamedDecl(D);
				Hash = hash_combine(Hash, Hasher.hash(D->getQualifier()));
				return Hash;
				}

				hash_code
				DeclHasher::VisitUnresolvedUsingValueDecl(const UnresolvedUsingValueDecl *D) {
				hash_code Hash = VisitNamedDecl(D);
				Hash = hash_combine(Hash, Hasher.hash(D->getQualifier()));
				return Hash;
				}

				hash_code DeclHasher::VisitDeclContext(const DeclContext *DC) {
				// FIXME: Add location if this is anonymous namespace ?
				DC = DC->getRedeclContext();
				const Decl *D = cast<Decl>(DC)->getCanonicalDecl();
				if (auto *ND = dyn_cast<NamedDecl>(D))
				return Hasher.hash(ND);
				else
				return 0;
				}

				hash_code DeclHasher::hashLoc(SourceLocation Loc, bool IncludeOffset) {
				if (Loc.isInvalid()) {
				return 0;
				}
				hash_code Hash = InitialHash;
				const SourceManager &SM = Hasher.getASTContext().getSourceManager();
				Loc = SM.getFileLoc(Loc);
				const std::pair<FileID, unsigned> &Decomposed = SM.getDecomposedLoc(Loc);
				const FileEntry *FE = SM.getFileEntryForID(Decomposed.first);
				if (FE) {
				Hash = hash_combine(Hash, llvm::sys::path::filename(FE->getName()));
				} else {
				// This case really isn't interesting.
				return 0;
				}
				if (IncludeOffset) {
				// Use the offset into the FileID to represent the location. Using
				// a line/column can cause us to look back at the original source file,
				// which is expensive.
				Hash = hash_combine(Hash, Decomposed.second);
				}
				return Hash;
				}

				} // end namespace index
				} // end namespace clang
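
The effect of hashLoc hashing only the file name (not the full path) plus an optional offset can be seen in a toy version. `toyHashLoc` and `baseName` are hypothetical helpers operating on plain strings; no SourceManager is involved, and paths are assumed to use '/' separators.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Last path component, assuming '/' separators (an assumption of the sketch).
inline std::string baseName(const std::string &Path) {
  std::string::size_type Pos = Path.find_last_of('/');
  return Pos == std::string::npos ? Path : Path.substr(Pos + 1);
}

// Hashes the file name and, optionally, the byte offset within the file.
inline std::uint64_t toyHashLoc(const std::string &Path, std::uint64_t Offset,
                                bool IncludeOffset) {
  std::uint64_t H = 5381;
  for (unsigned char C : baseName(Path))
    H = (H ^ C) * 1099511628211ULL;
  if (IncludeOffset)
    H = (H ^ Offset) * 1099511628211ULL;
  return H;
}

inline void checkHashLoc() {
  // Same file name in different directories: the directory is not hashed.
  assert(toyHashLoc("/a/x.h", 10, true) == toyHashLoc("/b/x.h", 10, true));
  // Different offsets in the same file are distinguished when requested.
  assert(toyHashLoc("/a/x.h", 10, true) != toyHashLoc("/a/x.h", 11, true));
  // Without the offset, only the file name contributes.
  assert(toyHashLoc("/a/x.h", 10, false) == toyHashLoc("/a/x.h", 11, false));
}
```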

clang/lib/Index/RecordHasher/IndexRecordHasher.cpp

This file was added.

				//===--- IndexRecordHasher.cpp - Hashing of FileIndexRecord ------ C++ --===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#include "IndexRecordHasher.h"
				#include "FileIndexRecord.h"
				#include "RecordHasher/CachingHasher.h"

				namespace clang {
				namespace index {

				using llvm::hash_code;

				hash_code hashRecord(ASTContext &Ctx, const FileIndexRecord &Record) {

				CachingHasher Hasher(Ctx);

				hash_code Hash = InitialHash;
				for (auto &Info : Record.getDeclOccurrencesSortedByOffset()) {
				Hash = hash_combine(Hash, Info.Roles, Info.Offset, Hasher.hash(Info.Dcl));
				for (auto &Rel : Info.Relations) {
				Hash = hash_combine(Hash, Hasher.hash(Rel.RelatedSymbol));
				}
				}
				return Hash;
				}

				} // end namespace index
				} // end namespace clang
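
The shape of hashRecord above, a single fold over the occurrences in offset order, can be mimicked with plain integers. In this sketch `ToyOccurrence` and `hashToyRecord` are hypothetical, the per-declaration hashes are assumed to be precomputed, and the mixing step is an FNV-1a style stand-in for `hash_combine`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy occurrence: roles, offset, and precomputed hashes of the declaration
// and of its related symbols (all plain integers in this sketch).
struct ToyOccurrence {
  std::uint64_t Roles;
  std::uint64_t Offset;
  std::uint64_t DeclHash;
  std::vector<std::uint64_t> RelatedHashes;
};

inline std::uint64_t mix(std::uint64_t H, std::uint64_t V) {
  return (H ^ V) * 1099511628211ULL;
}

// Folds every occurrence, in the order given, into one running hash.
inline std::uint64_t hashToyRecord(const std::vector<ToyOccurrence> &Occs) {
  std::uint64_t H = 5381;
  for (const ToyOccurrence &O : Occs) {
    H = mix(mix(mix(H, O.Roles), O.Offset), O.DeclHash);
    for (std::uint64_t R : O.RelatedHashes)
      H = mix(H, R);
  }
  return H;
}

inline void checkRecordHash() {
  std::vector<ToyOccurrence> A = {{1, 0, 100, {}}, {2, 8, 200, {300}}};
  std::vector<ToyOccurrence> B = A;
  assert(hashToyRecord(A) == hashToyRecord(B)); // same content, same hash
  B[1].Offset = 9;                              // move one occurrence
  assert(hashToyRecord(A) != hashToyRecord(B)); // the record hash changes
}
```

Because offsets are folded in, moving a declaration by one byte changes the record hash even if nothing else about the declaration changed.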