This is an archive of the discontinued LLVM Phabricator instance.

Differential D28707

PDB: Add a class to create the /names stream contents.
ClosedPublic

Authored by ruiu on Jan 13 2017, 3:06 PM.

Details

Reviewers

zturner

Commits

rGdcd32937dc39: PDB: Add a class to create the /names stream contents.
rL292040: PDB: Add a class to create the /names stream contents.

Summary

This patch adds a new class, NameHashTableBuilder, which creates /names streams.
It also contains a test to confirm that a stream created by
NameHashTableBuilder can be read by the NameHashTable reader class.

Diff Detail

Repository: rL LLVM

Event Timeline

ruiu updated this revision to Diff 84394. Jan 13 2017, 3:06 PM

ruiu retitled this revision from to PDB: Add a class to create the /names stream contents..

ruiu updated this object.

ruiu added a reviewer: zturner.

ruiu added a subscriber: llvm-commits.

Herald added a subscriber: mgorny. Jan 13 2017, 3:06 PM

It seems like this would be more appropriate as a YAML test, since that is the way we're testing everything else.

I will eventually do that, but that's not doable at the moment. In order to test this output with YAML, we need to plug this into other parts so that PDB files get this string table. Currently doing that doesn't make much sense because no one is using the string table, and we have no code for that.

Also I think this unit test is useful by itself. This is compact and does only what it should do.

zturner added inline comments. Jan 13 2017, 4:59 PM

llvm/lib/DebugInfo/PDB/Raw/NameHashTableBuilder.cpp
22 ↗	(On Diff #84394)	`NameHashTable` has a structure for reading the header into. I think we should re-use that rather than hardcoding the number. Can you move the `Header` structure from `NameHashTable.cpp` into the `.h` file, and then re-use it here by filling out the fields and then just calling `Writer.writeObject(H)`?
33–34 ↗	(On Diff #84394)	Nitpick: This looks like it's wrapped too soon, well before 80 characters.
40 ↗	(On Diff #84394)	Can you call this `computeBucketCount`? compute entries sounds like you might compute the actual values of some entries rather than the number of them.
49 ↗	(On Diff #84394)	I would prefer that this method be called `commit()` and take the same signature as the other methods (i.e. a `StreamWriter&`). You will also need to add a `calculateSerializedLength()` method. If you want to write to a buffer and return it, you can do something like this:

  std::vector<uint32_t> Buffer(Builder.calculateSerializedLength());
  msf::MutableByteStream Stream(Buffer);
  msf::StreamWriter Writer(Stream);
  if (auto EC = Builder.commit(Writer))
    return EC;
  // The bytes of `Buffer` should be filled out now.

This also makes the body of the method cleaner, because it gets rid of a lot of the offset bookkeeping and manual incrementing, and it handles endianness for you as well.

  if (auto EC = Writer.writeObject(H))
    return EC;
  for (auto Pair : Strings) {
    Writer.setOffset(Pair.second);
    if (auto EC = Writer.writeFixedString(S))
      return EC;
  }
  Writer.setOffset(StringSize);
  if (auto EC = Writer.writeInteger(NumEntries))
    return EC;

etc etc.

Updated as per Zach's comments.

lgtm aside from the mentioned changes

llvm/include/llvm/DebugInfo/PDB/Raw/NameHashTableBuilder.h
38 ↗	(On Diff #84415)	How about `StringMap`, which is quite a bit more efficient than `DenseMap` when the keys are strings?
llvm/lib/DebugInfo/PDB/Raw/NameHashTableBuilder.cpp
96 ↗	(On Diff #84415)	Is the conversion to `ArrayRef` necessary? It should get converted implicitly I think.

This revision is now accepted and ready to land. Jan 13 2017, 7:51 PM

ruiu added inline comments. Jan 14 2017, 4:45 PM

llvm/lib/DebugInfo/PDB/Raw/NameHashTableBuilder.cpp
96 ↗	(On Diff #84415)	Seems like I need to convert to `ArrayRef` since `writeArray` is an overloaded function.

Closed by commit rL292040: PDB: Add a class to create the /names stream contents. (authored by ruiu). Jan 14 2017, 4:47 PM

This revision was automatically updated to reflect the committed changes.

Sorry for not mentioning that in the previous message. I tried to use StringMap, but it seems that the class doesn't provide an iterator to iterate both keys and values at the same time, so it didn't work well. Also I think StringMap copies strings into a map, so it needs more memory.

Revision Contents

Path	Size
llvm/trunk/include/llvm/DebugInfo/PDB/Raw/NameHashTableBuilder.h	45 lines
llvm/trunk/include/llvm/DebugInfo/PDB/Raw/RawTypes.h	9 lines
llvm/trunk/lib/DebugInfo/PDB/CMakeLists.txt	1 line
llvm/trunk/lib/DebugInfo/PDB/Raw/NameHashTable.cpp	11 lines
llvm/trunk/lib/DebugInfo/PDB/Raw/NameHashTableBuilder.cpp	101 lines
llvm/trunk/unittests/DebugInfo/PDB/CMakeLists.txt	1 line
llvm/trunk/unittests/DebugInfo/PDB/NameHashTableBuilderTest.cpp	54 lines

Diff 84467

llvm/trunk/include/llvm/DebugInfo/PDB/Raw/NameHashTableBuilder.h

				//===- NameHashTableBuilder.h - PDB Name Hash Table Builder -----*- C++ -*-===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// This file creates the "/names" stream.
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_DEBUGINFO_PDB_RAW_NAMEHASHTABLEBUILDER_H
				#define LLVM_DEBUGINFO_PDB_RAW_NAMEHASHTABLEBUILDER_H

				#include "llvm/ADT/DenseMap.h"
				#include "llvm/ADT/StringRef.h"
				#include "llvm/Support/Error.h"
				#include <vector>

				namespace llvm {
				namespace msf {
				class StreamWriter;
				}
				namespace pdb {

				class NameHashTableBuilder {
				public:
				  // If string S does not exist in the string table, insert it.
				  // Returns the ID for S.
				  uint32_t insert(StringRef S);

				  uint32_t calculateSerializedLength() const;
				  Error commit(msf::StreamWriter &Writer) const;

				private:
				  DenseMap<StringRef, uint32_t> Strings;
				  uint32_t StringSize = 1;
				};

				} // end namespace pdb
				} // end namespace llvm

				#endif // LLVM_DEBUGINFO_PDB_RAW_NAMEHASHTABLEBUILDER_H

llvm/trunk/include/llvm/DebugInfo/PDB/Raw/RawTypes.h

	[296 lines not shown]
	/// The header preceeding the global PDB Stream (Stream 1)			/// The header preceeding the global PDB Stream (Stream 1)
	struct InfoStreamHeader {			struct InfoStreamHeader {
	support::ulittle32_t Version;			support::ulittle32_t Version;
	support::ulittle32_t Signature;			support::ulittle32_t Signature;
	support::ulittle32_t Age;			support::ulittle32_t Age;
	PDB_UniqueId Guid;			PDB_UniqueId Guid;
	};			};

				/// The header preceeding the /names stream.
				struct NameHashTableHeader {
				support::ulittle32_t Signature;
				support::ulittle32_t HashVersion;
				support::ulittle32_t ByteSize;
				};

				const uint32_t NameHashTableSignature = 0xEFFEEFFE;

	} // namespace pdb			} // namespace pdb
	} // namespace llvm			} // namespace llvm

	#endif			#endif

llvm/trunk/lib/DebugInfo/PDB/CMakeLists.txt

[33 lines not shown]
add_pdb_impl_folder(Raw
Raw/GlobalsStream.cpp		Raw/GlobalsStream.cpp
Raw/GSI.cpp		Raw/GSI.cpp
Raw/Hash.cpp		Raw/Hash.cpp
Raw/InfoStream.cpp		Raw/InfoStream.cpp
Raw/InfoStreamBuilder.cpp		Raw/InfoStreamBuilder.cpp
Raw/ModInfo.cpp		Raw/ModInfo.cpp
Raw/ModStream.cpp		Raw/ModStream.cpp
Raw/NameHashTable.cpp		Raw/NameHashTable.cpp
		Raw/NameHashTableBuilder.cpp
Raw/NameMap.cpp		Raw/NameMap.cpp
Raw/NameMapBuilder.cpp		Raw/NameMapBuilder.cpp
Raw/PDBFile.cpp		Raw/PDBFile.cpp
Raw/PDBFileBuilder.cpp		Raw/PDBFileBuilder.cpp
Raw/PublicsStream.cpp		Raw/PublicsStream.cpp
Raw/RawError.cpp		Raw/RawError.cpp
Raw/RawSession.cpp		Raw/RawSession.cpp
Raw/SymbolStream.cpp		Raw/SymbolStream.cpp
[54 lines not shown]

llvm/trunk/lib/DebugInfo/PDB/Raw/NameHashTable.cpp

	//===- NameHashTable.cpp - PDB Name Hash Table ------------------*- C++ -*-===//			//===- NameHashTable.cpp - PDB Name Hash Table ------------------*- C++ -*-===//
	//			//
	// The LLVM Compiler Infrastructure			// The LLVM Compiler Infrastructure
	//			//
	// This file is distributed under the University of Illinois Open Source			// This file is distributed under the University of Illinois Open Source
	// License. See LICENSE.TXT for details.			// License. See LICENSE.TXT for details.
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	#include "llvm/DebugInfo/PDB/Raw/NameHashTable.h"			#include "llvm/DebugInfo/PDB/Raw/NameHashTable.h"

	#include "llvm/ADT/ArrayRef.h"			#include "llvm/ADT/ArrayRef.h"
	#include "llvm/DebugInfo/MSF/StreamReader.h"			#include "llvm/DebugInfo/MSF/StreamReader.h"
	#include "llvm/DebugInfo/PDB/Raw/Hash.h"			#include "llvm/DebugInfo/PDB/Raw/Hash.h"
	#include "llvm/DebugInfo/PDB/Raw/RawError.h"			#include "llvm/DebugInfo/PDB/Raw/RawError.h"
				#include "llvm/DebugInfo/PDB/Raw/RawTypes.h"
	#include "llvm/Support/Endian.h"			#include "llvm/Support/Endian.h"

	using namespace llvm;			using namespace llvm;
	using namespace llvm::msf;			using namespace llvm::msf;
	using namespace llvm::support;			using namespace llvm::support;
	using namespace llvm::pdb;			using namespace llvm::pdb;

	NameHashTable::NameHashTable() : Signature(0), HashVersion(0), NameCount(0) {}			NameHashTable::NameHashTable() : Signature(0), HashVersion(0), NameCount(0) {}

	Error NameHashTable::load(StreamReader &Stream) {			Error NameHashTable::load(StreamReader &Stream) {
	struct Header {			const NameHashTableHeader *H;
	support::ulittle32_t Signature;
	support::ulittle32_t HashVersion;
	support::ulittle32_t ByteSize;
	};

	const Header *H;
	if (auto EC = Stream.readObject(H))			if (auto EC = Stream.readObject(H))
	return EC;			return EC;

	if (H->Signature != 0xEFFEEFFE)			if (H->Signature != NameHashTableSignature)
	return make_error<RawError>(raw_error_code::corrupt_file,			return make_error<RawError>(raw_error_code::corrupt_file,
	"Invalid hash table signature");			"Invalid hash table signature");
	if (H->HashVersion != 1 && H->HashVersion != 2)			if (H->HashVersion != 1 && H->HashVersion != 2)
	return make_error<RawError>(raw_error_code::corrupt_file,			return make_error<RawError>(raw_error_code::corrupt_file,
	"Unsupported hash version");			"Unsupported hash version");

	Signature = H->Signature;			Signature = H->Signature;
	HashVersion = H->HashVersion;			HashVersion = H->HashVersion;
	[60 lines not shown]

llvm/trunk/lib/DebugInfo/PDB/Raw/NameHashTableBuilder.cpp

				//===- NameHashTable.cpp - PDB Name Hash Table ------------------*- C++ -*-===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//

				#include "llvm/DebugInfo/PDB/Raw/NameHashTableBuilder.h"
				#include "llvm/ADT/ArrayRef.h"
				#include "llvm/DebugInfo/MSF/StreamWriter.h"
				#include "llvm/DebugInfo/PDB/Raw/Hash.h"
				#include "llvm/DebugInfo/PDB/Raw/RawTypes.h"
				#include "llvm/Support/Endian.h"

				using namespace llvm;
				using namespace llvm::support;
				using namespace llvm::support::endian;
				using namespace llvm::pdb;

				uint32_t NameHashTableBuilder::insert(StringRef S) {
				  auto P = Strings.insert({S, StringSize});

				  // If a given string didn't exist in the string table, we want to increment
				  // the string table size.
				  if (P.second)
				    StringSize += S.size() + 1; // +1 for '\0'
				  return P.first->second;
				}

				static uint32_t computeBucketCount(uint32_t NumStrings) {
				  // The /names stream is basically an on-disk open-addressing hash table.
				  // Hash collisions are resolved by linear probing. We cannot make
				  // utilization 100% because it will make the linear probing extremely
				  // slow. But lower utilization wastes disk space. As a reasonable
				  // load factor, we choose 80%. We need +1 because slot 0 is reserved.
				  return (NumStrings + 1) * 1.25;
				}

				uint32_t NameHashTableBuilder::calculateSerializedLength() const {
				  uint32_t Size = 0;
				  Size += sizeof(NameHashTableHeader);
				  Size += StringSize;
				  Size += 4; // Hash table begins with 4-byte size field.

				  uint32_t BucketCount = computeBucketCount(Strings.size());
				  Size += BucketCount * 4;

				  Size += 4; // The /names stream ends with the number of strings.
				  return Size;
				}

				Error NameHashTableBuilder::commit(msf::StreamWriter &Writer) const {
				  // Write a header
				  NameHashTableHeader H;
				  H.Signature = NameHashTableSignature;
				  H.HashVersion = 1;
				  H.ByteSize = StringSize;
				  if (auto EC = Writer.writeObject(H))
				    return EC;

				  // Write a string table.
				  uint32_t StringStart = Writer.getOffset();
				  for (auto Pair : Strings) {
				    StringRef S = Pair.first;
				    uint32_t Offset = Pair.second;
				    Writer.setOffset(StringStart + Offset);
				    if (auto EC = Writer.writeZeroString(S))
				      return EC;
				  }
				  Writer.setOffset(StringStart + StringSize);

				  // Write a hash table.
				  uint32_t BucketCount = computeBucketCount(Strings.size());
				  if (auto EC = Writer.writeInteger(BucketCount))
				    return EC;
				  std::vector<ulittle32_t> Buckets(BucketCount);

				  for (auto Pair : Strings) {
				    StringRef S = Pair.first;
				    uint32_t Offset = Pair.second;
				    uint32_t Hash = hashStringV1(S);

				    for (uint32_t I = 0; I != BucketCount; ++I) {
				      uint32_t Slot = (Hash + I) % BucketCount;
				      if (Slot == 0)
				        continue; // Skip reserved slot
				      if (Buckets[Slot] != 0)
				        continue;
				      Buckets[Slot] = Offset;
				      break;
				    }
				  }

				  if (auto EC = Writer.writeArray(ArrayRef<ulittle32_t>(Buckets)))
				    return EC;
				  if (auto EC = Writer.writeInteger(static_cast<uint32_t>(Strings.size())))
				    return EC;
				  return Error::success();
				}
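
To illustrate the bucket placement done in `commit()`, here is a minimal, std-only sketch; it is not the patch's code. `bucketCount` mirrors `computeBucketCount`, while `placeInBucket` is a hypothetical helper name introduced only for this example. The hash values used with it below are made-up small integers, not results of `hashStringV1`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Same sizing rule as computeBucketCount in the patch: an 80% load factor,
// plus one because slot 0 is reserved.
uint32_t bucketCount(uint32_t NumStrings) {
  return static_cast<uint32_t>((NumStrings + 1) * 1.25);
}

// Hypothetical helper: stores Offset in the first free slot found by linear
// probing from Hash, never using slot 0. Returns the chosen slot, or 0 if no
// free slot exists.
uint32_t placeInBucket(std::vector<uint32_t> &Buckets, uint32_t Hash,
                       uint32_t Offset) {
  assert(Offset != 0 && "a bucket value of 0 means the bucket is empty");
  const uint32_t N = static_cast<uint32_t>(Buckets.size());
  for (uint32_t I = 0; I != N; ++I) {
    uint32_t Slot = (Hash + I) % N;
    if (Slot == 0 || Buckets[Slot] != 0)
      continue; // reserved or already occupied: probe the next slot
    Buckets[Slot] = Offset;
    return Slot;
  }
  return 0;
}
```

With three strings the table has five buckets, four of them usable, so every probe sequence finds a free slot after at most a few steps.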

llvm/trunk/unittests/DebugInfo/PDB/CMakeLists.txt

	set(LLVM_LINK_COMPONENTS			set(LLVM_LINK_COMPONENTS
	DebugInfoCodeView			DebugInfoCodeView
	DebugInfoMSF			DebugInfoMSF
	DebugInfoPDB			DebugInfoPDB
	)			)

	set(DebugInfoPDBSources			set(DebugInfoPDBSources
	MappedBlockStreamTest.cpp			MappedBlockStreamTest.cpp
				NameHashTableBuilderTest.cpp
	MSFBuilderTest.cpp			MSFBuilderTest.cpp
	PDBApiTest.cpp			PDBApiTest.cpp
	)			)

	add_llvm_unittest(DebugInfoPDBTests			add_llvm_unittest(DebugInfoPDBTests
	${DebugInfoPDBSources}			${DebugInfoPDBSources}
	)			)

llvm/trunk/unittests/DebugInfo/PDB/NameHashTableBuilderTest.cpp

				//===- NameHashTableBuilderTest.cpp ---------------------------------------===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//

				#include "ErrorChecking.h"

				#include "llvm/DebugInfo/MSF/ByteStream.h"
				#include "llvm/DebugInfo/MSF/StreamReader.h"
				#include "llvm/DebugInfo/MSF/StreamWriter.h"
				#include "llvm/DebugInfo/PDB/Raw/NameHashTable.h"
				#include "llvm/DebugInfo/PDB/Raw/NameHashTableBuilder.h"

				#include "gtest/gtest.h"

				using namespace llvm;
				using namespace llvm::pdb;

				namespace {
				class NameHashTableBuilderTest : public ::testing::Test {};
				}

				TEST_F(NameHashTableBuilderTest, Simple) {
				  // Create /names table contents.
				  NameHashTableBuilder Builder;
				  EXPECT_EQ(1U, Builder.insert("foo"));
				  EXPECT_EQ(5U, Builder.insert("bar"));
				  EXPECT_EQ(1U, Builder.insert("foo"));
				  EXPECT_EQ(9U, Builder.insert("baz"));

				  std::vector<uint8_t> Buffer(Builder.calculateSerializedLength());
				  msf::MutableByteStream OutStream(Buffer);
				  msf::StreamWriter Writer(OutStream);
				  EXPECT_NO_ERROR(Builder.commit(Writer));

				  // Reads the contents back.
				  msf::ByteStream InStream(Buffer);
				  msf::StreamReader Reader(InStream);
				  NameHashTable Table;
				  EXPECT_NO_ERROR(Table.load(Reader));

				  EXPECT_EQ(3U, Table.getNameCount());
				  EXPECT_EQ(1U, Table.getHashVersion());
				  EXPECT_EQ("foo", Table.getStringForID(1));
				  EXPECT_EQ("bar", Table.getStringForID(5));
				  EXPECT_EQ("baz", Table.getStringForID(9));
				  EXPECT_EQ(1U, Table.getIDForString("foo"));
				  EXPECT_EQ(5U, Table.getIDForString("bar"));
				  EXPECT_EQ(9U, Table.getIDForString("baz"));
				}
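
The IDs the test expects (1, 5, 9) are byte offsets into the string data, and the buffer size follows from the same bookkeeping. The sketch below reproduces that arithmetic with the standard library only. `OffsetTable` is a hypothetical stand-in for `NameHashTableBuilder`: it owns its keys as `std::string`, unlike the patch's `DenseMap<StringRef, uint32_t>`, and it hardcodes the header as 12 bytes on the assumption that `NameHashTableHeader` consists of three 32-bit fields, as shown in `RawTypes.h`.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical, std-only stand-in for the builder's offset bookkeeping.
struct OffsetTable {
  std::unordered_map<std::string, uint32_t> Offsets;
  uint32_t StringSize = 1; // starts at 1 as in the patch, so 0 is never an ID

  // Returns the offset of S, assigning the next free offset on first insert.
  uint32_t insert(const std::string &S) {
    assert(S.find('\0') == std::string::npos &&
           "strings are stored zero-terminated");
    auto P = Offsets.insert({S, StringSize});
    if (P.second)
      StringSize += static_cast<uint32_t>(S.size()) + 1; // +1 for '\0'
    return P.first->second;
  }

  // Mirrors calculateSerializedLength: header, string data, bucket count,
  // buckets, and the trailing number of strings.
  uint32_t serializedLength() const {
    uint32_t BucketCount =
        static_cast<uint32_t>((Offsets.size() + 1) * 1.25);
    return 12 + StringSize + 4 + BucketCount * 4 + 4;
  }
};
```

For "foo", "bar", "baz" this gives offsets 1, 5, 9 and a string area of 13 bytes, so the serialized length works out to 12 + 13 + 4 + 5 * 4 + 4 = 53 bytes.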
