This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contents

- clang-tools-extra/trunk/
- trunk/
- CMakeLists.txt
- clang-doc/
- BitcodeWriter.h
- BitcodeWriter.cpp
- CMakeLists.txt
- ClangDoc.h
- ClangDoc.cpp
- Mapper.h
- Mapper.cpp
- Representation.h
- Serialize.h
- Serialize.cpp
- tool/
- CMakeLists.txt
- ClangDocMain.cpp
- docs/
- clang-doc.rst
- test/
- CMakeLists.txt
- clang-doc/
- mapper-class-in-class.cpp
- mapper-class-in-function.cpp
- mapper-class.cpp
- mapper-comments.cpp
- mapper-enum.cpp
- mapper-function.cpp
- mapper-method.cpp
- mapper-namespace.cpp
- mapper-struct.cpp
- mapper-union.cpp

Differential D41102

Setup clang-doc frontend framework
ClosedPublic

Authored by juliehockett on Dec 11 2017, 5:36 PM.


Details

Reviewers

klimek
jakehehrlich
sammccall
lebedev.ri

Commits

Summary

Setting up the mapper part of the frontend framework for a clang-doc tool. It creates a series of relevant matchers for declarations, and uses the ToolExecutor to traverse the AST and extract the matching declarations and comments. The mapper serializes the extracted information to individual records for reducing and eventually doc generation.

For a more detailed overview of the tool, see the design document on the mailing list: RFC: clang-doc proposal

Diff Detail

Repository: rL LLVM

Event Timeline

There are a very large number of changes, so older changes are hidden.

Refactoring bitcode writer

Next, i suggest looking into code self-debugging, see comments.
Also, i have added a few questions; it would be great to know whether my understanding is correct.

I'm sorry that it seems like we are going over and over the same code again, but
this is the very base of the tool, and i think it is important to get it as close to great as possible.
I *think* these review comments move it in that direction, not in the opposite direction?

clang-doc/BitcodeWriter.cpp
47 ↗	(On Diff #135559)	So in other words this is making an assumption that no file with more than 65535 lines will be analyzed, correct? Can you add that as comment please?
56 ↗	(On Diff #135559)	AbbrevDsc Abbrev = nullptr;
57 ↗	(On Diff #135559)
        // Is this 'description' valid?
        operator bool() const {
          return Abbrev != nullptr && Name.data() != nullptr && !Name.empty();
        }
137 ↗	(On Diff #135559)	So `FUNCTION_MANGLED_NAME` is phased out, and is thus missing, as far as i understand?
148 ↗	(On Diff #135559)	+`assert(RecordIdNameMap[ID] && "Unknown Abbreviation");`
153 ↗	(On Diff #135559)	+`assert(RecordIdNameMap[ID] && "Unknown Abbreviation");`
158 ↗	(On Diff #135559)	Called only once, and that call does nothing. I'd drop it.
175 ↗	(On Diff #135559)
        /// \brief Emits a block ID and the block name to the BLOCKINFO block.
        void ClangDocBitcodeWriter::emitBlockID(BlockId ID) {
          const auto &BlockIdName = BlockIdNameMap[ID];
          assert(BlockIdName.data() && BlockIdName.size() && "Unknown BlockId!");
          Record.clear();
          Record.push_back(ID);
          Stream.EmitRecord(llvm::bitc::BLOCKINFO_CODE_SETBID, Record);
          Record.clear();
          for (const char C : BlockIdName)
            Record.push_back(C);
          Stream.EmitRecord(llvm::bitc::BLOCKINFO_CODE_BLOCKNAME, Record);
        }
187 ↗	(On Diff #135559)
        /// \brief Emits a record name to the BLOCKINFO block.
        void ClangDocBitcodeWriter::emitRecordID(RecordId ID) {
          assert(RecordIdNameMap[ID] && "Unknown Abbreviation");
          prepRecordData(ID);
    (Yes, `prepRecordData()` will have the same code. It should get optimized away.)
194 ↗	(On Diff #135559)
        void ClangDocBitcodeWriter::emitAbbrev(RecordId ID, BlockId Block) {
          assert(RecordIdNameMap[ID] && "Unknown Abbreviation");
          auto Abbrev = std::make_shared<BitCodeAbbrev>();
204 ↗	(On Diff #135559)	So remember that in a previous iteration, seemingly useless `AbbrevDsc` stuff was added to the `RecordIdNameMap`? It is going to pay off now:
        void ClangDocBitcodeWriter::emitRecord(StringRef Str, RecordId ID) {
          assert(RecordIdNameMap[ID] && "Unknown Abbreviation");
          assert(RecordIdNameMap[ID].Abbrev == &StringAbbrev && "Abbrev type mismatch");
          if (!prepRecordData(ID, !Str.empty()))
            return;
          ...
    And if we did not add a `RecordIdNameMap` entry for this `RecordId`, then i believe that will also be detected because `Abbrev` will be a `nullptr`.
205 ↗	(On Diff #135559)
        assert(Str.size() < (1U << BitCodeConstants::StringLengthSize));
        Record.push_back(Str.size());
210 ↗	(On Diff #135559)
        void ClangDocBitcodeWriter::emitRecord(const Location &Loc, RecordId ID) {
          assert(RecordIdNameMap[ID] && "Unknown Abbreviation");
          assert(RecordIdNameMap[ID].Abbrev == &LocationAbbrev && "Abbrev type mismatch");
          if (!prepRecordData(ID, !OmitFilenames))
            return;
          ...
211 ↗	(On Diff #135559)	Call me paranoid, but:
        assert(Loc.LineNumber < (1U << BitCodeConstants::LineNumberSize));
        Record.push_back(Loc.LineNumber);
        assert(Loc.Filename.size() < (1U << BitCodeConstants::StringLengthSize));
        Record.push_back(Loc.Filename.size());
217 ↗	(On Diff #135559)
        void ClangDocBitcodeWriter::emitRecord(int Val, RecordId ID) {
          assert(RecordIdNameMap[ID] && "Unknown Abbreviation");
          assert(RecordIdNameMap[ID].Abbrev == &IntAbbrev && "Abbrev type mismatch");
          if (!prepRecordData(ID, Val))
            return;
218 ↗	(On Diff #135559)
        assert(Val < (1U << BitCodeConstants::IntSize));
        Record.push_back(Val);
222 ↗	(On Diff #135559)
        bool ClangDocBitcodeWriter::prepRecordData(RecordId ID, bool ShouldEmit) {
          assert(RecordIdNameMap[ID] && "Unknown Abbreviation");
          if (!ShouldEmit)
            return false;
232 ↗	(On Diff #135559)	Since `ClangDocBitcodeWriter` is not re-used, but re-constructed* each time, `Abbrevs.clear();` does nothing. Hmm, i wonder if that will be a bad thing. Benchmarking will tell i guess :/
236 ↗	(On Diff #135559)	https://godbolt.org/g/rD6BWK also suggests it should be `static const`
276 ↗	(On Diff #135559)	Uhm, do you plan on calling `emitBlockInfo()` from anywhere else other than `emitBlockInfoBlock()`? Since it takes `const std::vector<RecordId>` instead of a `const std::initializer_list<RecordId>&`, a memory copy will happen... https://godbolt.org/g/rD6BWK
clang-doc/BitcodeWriter.h
35 ↗	(On Diff #135559)	`LineNumFixedSize` is used for different things. Given such a specific name, i think it may be confusing? Also, looking at http://llvm.org/doxygen/classllvm_1_1BitstreamWriter.html#ae6a40b4a5ea89bb8b5076c26e0d0b638 i guess these all should be `unsigned`. I think this would be better, albeit more verbose:
        struct BitCodeConstants {
          static constexpr unsigned SignatureBitSize = 8U;
          static constexpr unsigned SubblockIDSize = 5U;
          static constexpr unsigned IntSize = 16U;
          static constexpr unsigned StringLengthSize = 16U;
          static constexpr unsigned LineNumberSize = 16U;
        };
53 ↗	(On Diff #135559)	So what exactly does `BitCodeConstants::SubblockIDSize` mean?
        static_assert(BI_LAST < (1U << BitCodeConstants::SubblockIDSize), "Too many block id's!");
    ?
94 ↗	(On Diff #135559)	So i have a question: if something (`FUNCTION_MANGLED_NAME` in this case) is phased out, does it have to stay in this enum? That will introduce holes in `RecordIdNameMap`. Are the actual numerical id's of enumerators stored in the bitcode, or the string (abbrev, `RecordIdNameMap[].Name`)? Looking at tests, i guess these enums are internal detail, and they can be changed freely, including removing enumerators. Am i wrong? I think that should be explained in a comment before this `enum`.
100 ↗	(On Diff #135559)	If `AbbreviationMap` comment makes sense, i guess that common code should be moved here, i.e. static constexpr unsigned RecordIdCount = RI_LAST - RI_FIRST + 1; and use this new variable in those two places.
163 ↗	(On Diff #135559)	We know we will have at most `RI_LAST - RI_FIRST + 1` abbreviations. Right now that results in just ~40 abbreviations. Would it make sense to AbbreviationMap() : Abbrevs(RI_LAST - RI_FIRST + 1) {} ? (or `llvm::DenseMap<unsigned, unsigned> Abbrevs = llvm::DenseMap<unsigned, unsigned>(RI_LAST - RI_FIRST + 1);` but that looks uglier to me..)
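
As a rough illustration of the `RecordIdCount` and `static_assert` suggestions above, here is a minimal standalone sketch. The enumerators and their values are shortened, hypothetical stand-ins (the real ones live in `clang-doc/BitcodeWriter.h`), and `maxValueFor()` is a helper invented for this example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, shortened stand-ins for the enums in clang-doc/BitcodeWriter.h.
// The starting values are arbitrary choices for this sketch.
enum BlockId {
  BI_FIRST = 8,
  BI_VERSION_BLOCK_ID = BI_FIRST,
  BI_COMMENT_BLOCK_ID,
  BI_LAST = BI_COMMENT_BLOCK_ID
};

enum RecordId {
  RI_FIRST = 1,
  VERSION = RI_FIRST,
  COMMENT_KIND,
  COMMENT_TEXT,
  RI_LAST = COMMENT_TEXT
};

struct BitCodeConstants {
  static constexpr unsigned SubblockIDSize = 5U;
  static constexpr unsigned LineNumberSize = 16U;
};

// Single definition of the record count, to be shared by every user.
constexpr unsigned RecordIdCount = RI_LAST - RI_FIRST + 1;

// Compile-time guard in the spirit of the suggested static_assert.
static_assert(static_cast<unsigned>(BI_LAST) <
                  (1U << BitCodeConstants::SubblockIDSize),
              "Too many block id's!");

// Largest value that fits into a fixed-width field of `Bits` bits (Bits < 64).
constexpr std::uint64_t maxValueFor(unsigned Bits) {
  return (std::uint64_t(1) << Bits) - 1;
}
```

With `LineNumberSize` at 16 bits, `maxValueFor(16)` is 65535, which is where the "no file with more than 65535 lines" assumption comes from.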

The change to USR seems like quite an improvement already! That being said, I do think that it might be preferable to opt out of the use of strings for linking things together. What we did with our clang-doc is that we directly used pointers to refer to other types. So for example, our class for storing Record/CXX related information has something like:

std::vector<Function*>	mMethods;
std::vector<Variable*>	mVariables;
std::vector<Enum*>	mEnums;
std::vector<Typedef*>	mTypedefs;

Only upon serialization do we fetch some kind of USR that would uniquely identify the type. This is especially useful to us for the conversion to HTML and I think the same would go for this backend, as it seems this way you'll have to do string lookups to get to the actual types, which would be inefficient in multiple aspects. It can make the backend a little more of a one-on-one conversion, e.g. with one of our HTML template definitions (note: this is a Jinja2 template in Python):

{%- for enum in inEntry.GetMemberEnums() -%}
	<tr class="separator">
		<td class="memSeparator" colspan="3"></td>
	</tr>
	<tr class="memitem:EAllocatorStrategy">
		<td class="memItemLeft" align="right">{{- Modifiers.RenderAccessModifier(enum.GetAccessModifier()) -}}</td>
		<td class="memItemMiddle" align="left">enum <a href="{{ enum.GetID() }}.html">{{- enum.GetName().GetName()|e -}}</a></td>
		<td class="memItemRight" valign="bottom">{{- Descriptions.RenderDescription(enum.GetBriefDescription()) -}}</td>
	</tr>
{%- endfor -%}

The disadvantage is of course that you add complexity to certain parts of the deserialization (/serialization) for nested types and inheritance, by either having to do so in the correct order or having to defer the process of initializing these pointers. But see this as just some thought sharing. I do think this would improve the interaction in the backend (assuming you use the same representation as currently in the frontend). Also, we didn't apply this to our Type representation (which we use to store the type of a member, parameter etc.), which stores the name of the type rather than a pointer to it (since it can also be a built-in), though it embeds pretty much every possible modifier on said type, like this:

EntryName			mName;									
bool				mIsConst = false;						
EReferenceType			mReferenceType = EReferenceType::None;	
std::vector<bool>		mPointerConstnessMask;					
std::vector<std::string>	mArraySizes;							
bool				mIsAtomic = false;						
std::vector<Attribute>		mAttributes;							
bool				mIsExpansion = false;					
std::vector<TemplateArgument>	mTemplateArguments;						
std::unique_ptr<FunctionTypeProperties>     mFunctionTypeProperties = nullptr;		
EntryName			mParentCXXEntry;

The last member refers to the case where a pointer is a pointer to member, though some other fields may require some explaining too. Anyway, this is just to give some insight into how we structured our representation, where we largely omitted string representations where possible.

Have you actually started work already on some backend? Developing backend and frontend in tandem can provide some additional insights as to how things should be structured, especially representation-wise!

clang-doc/Representation.h
113 ↗	(On Diff #135559)	How come these are actually unique ptrs? They can be stored directly in the vector, right? (same for CommentInfo children, FunctionInfo params etc.)

Please run Clang-format and Clang-tidy modernize.

clang-doc/Representation.h
80 ↗	(On Diff #135559)	Please separate constructors from data members with empty line.

Continued refactoring the bitcode writer
Added a USR attribute to infos
Created a Reference struct to replace the string references to other infos

In D41102#1017499, @Athosvk wrote:

The disadvantage is of course that you add complexity to certain parts of the deserialization (/serialization) for nested types and inheritance, by either having to do so in the correct order or having to defer the process of initializing these pointers. But see this as just some thought sharing. I do think this would improve the interaction in the backend (assuming you use the same representation as currently in the frontend).

I agree that the pointer approach would be much more efficient on the backend, but the issue here is that the mapper has no idea where the representation of anything other than the decl it's currently looking at will be, since it sees each decl and serializes it immediately. The reducer, on the other hand, will be able to see everything, and so such pointers could be added as a pass over the final reduced data structure.
So, as an idea (as this diff implements), I updated the string references to be a struct, which holds the USR of the referenced type (for serialization, both here in the mapper and for the dump option in the reducer), as well as a pointer to an Info struct. This pointer is not used at this point, but would be populated by the reducer. Thoughts?
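
To make the USR-plus-pointer idea concrete, here is a minimal sketch of what such a `Reference` and a later resolving pass could look like. It is not the code from this diff: the field names, the `resolveReferences()` helper, and the USR index are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical, stripped-down Info: just enough to be referenced.
struct Info {
  std::string USR;
  std::string Name;
};

// Holds the USR for serialization plus a pointer that stays null in the
// mapper and could be filled in later, once all infos are known.
struct Reference {
  explicit Reference(std::string USR) : USR(std::move(USR)) {}
  std::string USR;
  Info *Ref = nullptr;
};

// Hypothetical reducer-side pass: look up each USR in an index of reduced
// infos and set the pointer. Returns how many references stayed unresolved.
std::size_t
resolveReferences(std::vector<Reference> &Refs,
                  const std::unordered_map<std::string, Info *> &Index) {
  std::size_t Unresolved = 0;
  for (Reference &R : Refs) {
    auto It = Index.find(R.USR);
    if (It == Index.end()) {
      ++Unresolved;
      continue;
    }
    R.Ref = It->second;
  }
  return Unresolved;
}

// Small demonstration: one reference resolves, one USR is unknown.
std::size_t demoUnresolvedCount() {
  Info X{"c:@S@X", "X"};
  Info Y{"c:@S@X@S@Y", "Y"};
  std::unordered_map<std::string, Info *> Index{{X.USR, &X}, {Y.USR, &Y}};
  std::vector<Reference> Refs;
  Refs.emplace_back("c:@S@X");
  Refs.emplace_back("c:@S@Missing");
  return resolveReferences(Refs, Index);
}
```

The string lookup happens once per reference in the resolving pass; after that, a backend could follow `Ref` directly.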

Have you actually started work already on some backend? Developing backend and frontend in tandem can provide some additional insights as to how things should be structured, especially representation-wise!

I added you as a subscriber on the follow-up patches (the reducer, YAML/MD formats) -- would love to hear your thoughts! As of now, the MD output is very rough, but I'm hoping to keep moving forward on that in the next few days.

clang-doc/BitcodeWriter.h
53 ↗	(On Diff #135559)	It's the current abbrev id width for the block (described here), so it's the max id width for the block's abbrevs.
94 ↗	(On Diff #135559)	Yes, the enum is an implementation detail (`FUNCTION_MANGLED_NAME` should have been removed earlier). I'll put the comment describing how it works!

Fixing CMakeLists formatting

Could you please add a bit more tests? In particular, i'd like to see how blocks-in-blocks work.
I.e. class-in-class, class-in-function, ...

Is there some (internal to BitstreamWriter) logic that would 'assert()' if trying to output some recordid
which, according to the BLOCKINFO_BLOCK, should not be there?
E.g. outputting VERSION in BI_COMMENT_BLOCK_ID?

clang-doc/BitcodeWriter.cpp
30 ↗	(On Diff #135682)	Ok, these three functions still look off, how about this?
        // Yes, not by reference, https://godbolt.org/g/T52Vcj
        static void AbbrevGen(std::shared_ptr<llvm::BitCodeAbbrev> &Abbrev,
                              const std::initializer_list<llvm::BitCodeAbbrevOp> Ops) {
          for (const auto &Op : Ops)
            Abbrev->Add(Op);
        }
        static void IntAbbrev(std::shared_ptr<llvm::BitCodeAbbrev> &Abbrev) {
          AbbrevGen(Abbrev, {
              // 0. Fixed-size integer
              {llvm::BitCodeAbbrevOp::Fixed, BitCodeConstants::IntSize}});
        }
        static void StringAbbrev(std::shared_ptr<llvm::BitCodeAbbrev> &Abbrev) {
          AbbrevGen(Abbrev, {
              // 0. Fixed-size integer (length of the following string)
              {llvm::BitCodeAbbrevOp::Fixed, BitCodeConstants::StringLengthSize},
              // 1. The string blob
              {llvm::BitCodeAbbrevOp::Blob}});
        }
        // Assumes that the file will not have more than 65535 lines.
        static void LocationAbbrev(std::shared_ptr<llvm::BitCodeAbbrev> &Abbrev) {
          AbbrevGen(Abbrev, {
              // 0. Fixed-size integer (line number)
              {llvm::BitCodeAbbrevOp::Fixed, BitCodeConstants::LineNumberSize},
              // 1. Fixed-size integer (length of the following string (filename))
              {llvm::BitCodeAbbrevOp::Fixed, BitCodeConstants::StringLengthSize},
              // 2. the string blob
              {llvm::BitCodeAbbrevOp::Blob}});
        }
    Though i bet clang-format will mess-up the formatting again :/
108 ↗	(On Diff #135682)	Some of these `IntAbbrev`'s are actually `bool`s. Would it make sense to already think about being bitcode-size-conservative and introduce `BoolAbbrev` from the get go?
        static void BoolAbbrev(std::shared_ptr<llvm::BitCodeAbbrev> &Abbrev) {
          AbbrevGen(Abbrev, {
              // 0. Fixed-size boolean
              {llvm::BitCodeAbbrevOp::Fixed, BitCodeConstants::BoolSize}});
        }
    where `BitCodeConstants::BoolSize` = `1U` ? Or is there some internal padding that would make that pointless?
156 ↗	(On Diff #135682)	Uh, oh, i'm sorry, all(?) these `"Unknown Abbreviation"` are likely copypaste gone wrong. I'm not sure why i wrote that comment. `"Unknown RecordId"` might make more sense?
240 ↗	(On Diff #135682)	Ok, now that i think about it, it can't be that easy. Maybe
        // FIXME: assumes 8 bits per byte
        assert(llvm::APInt(8U * sizeof(Val), Val, /*isSigned=*/true).getBitWidth() <= BitCodeConstants::IntSize);
    Not sure whether `getBitWidth()` is really the right function to ask though. (Not sure how this all works for negative numbers)
clang-doc/BitcodeWriter.h
53 ↗	(On Diff #135559)	So in other words that `static_assert()` is doing the right thing? Add it after the `enum BlockId{}` then please, will both document things, and ensure that things remain in a sane state.
172 ↗	(On Diff #135682)	Newline after constructor
216 ↗	(On Diff #135682)	`// Emission of appropriate abbreviation type`
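
The `AbbrevGen`/`BoolAbbrev` suggestion can be tried out without LLVM using stand-in types. The sketch below is only an illustration: `AbbrevOp` and `Abbrev` are hypothetical replacements for `llvm::BitCodeAbbrevOp` and `llvm::BitCodeAbbrev`, and `fixedBitsOf()` is a helper invented here. It only adds up the declared fixed-width fields; it says nothing about padding or alignment inside the real bitstream.

```cpp
#include <cassert>
#include <initializer_list>
#include <vector>

// Hypothetical stand-in for llvm::BitCodeAbbrevOp.
struct AbbrevOp {
  enum Encoding { Fixed, Blob };
  Encoding Enc;
  unsigned Width; // Only meaningful for Fixed operands.
};

// Hypothetical stand-in for llvm::BitCodeAbbrev.
struct Abbrev {
  std::vector<AbbrevOp> Ops;
  void Add(const AbbrevOp &Op) { Ops.push_back(Op); }
};

namespace BitCodeConstants {
constexpr unsigned BoolSize = 1U;
constexpr unsigned IntSize = 16U;
constexpr unsigned StringLengthSize = 16U;
} // namespace BitCodeConstants

// Shared generator: appends every operand in order.
void AbbrevGen(Abbrev &A, std::initializer_list<AbbrevOp> Ops) {
  for (const AbbrevOp &Op : Ops)
    A.Add(Op);
}

void BoolAbbrev(Abbrev &A) {
  AbbrevGen(A, {{AbbrevOp::Fixed, BitCodeConstants::BoolSize}});
}

void IntAbbrev(Abbrev &A) {
  AbbrevGen(A, {{AbbrevOp::Fixed, BitCodeConstants::IntSize}});
}

void StringAbbrev(Abbrev &A) {
  AbbrevGen(A, {{AbbrevOp::Fixed, BitCodeConstants::StringLengthSize},
                {AbbrevOp::Blob, 0}});
}

// Sum of the declared fixed-width operand sizes of one abbreviation.
unsigned fixedBitsOf(void (*Gen)(Abbrev &)) {
  Abbrev A;
  Gen(A);
  unsigned Bits = 0;
  for (const AbbrevOp &Op : A.Ops)
    if (Op.Enc == AbbrevOp::Fixed)
      Bits += Op.Width;
  return Bits;
}
```

In this model a boolean record declares 1 bit of payload instead of 16; whether that translates into a smaller file depends on the padding question raised above.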

Thank you for working on this!
Some more thoughts.

clang-doc/BitcodeWriter.cpp
191 ↗	(On Diff #135682)	Why do we have this indirection? Is there a need to first (inefficiently?) copy to `Record`, and then emit from there? Wouldn't this work just as well?
        Record.clear();
        Stream.EmitRecord(llvm::bitc::BLOCKINFO_CODE_BLOCKNAME, BlockIdNameMap[ID]);
196 ↗	(On Diff #135682)	Hmm, so i've been staring at this and http://llvm.org/doxygen/classllvm_1_1BitstreamWriter.html and i must say i'm not fond of this indirection. What i don't understand is, in the previous function, we don't store `BlockId`, so why do we want to store `RecordId`? Aren't they both unstable, and an implementation detail? Do we want to store it (`RecordId`)? If yes, please explain it as a new comment in code. If no, i guess this would work too?
        assert(RecordIdNameMap[ID] && "Unknown Abbreviation");
        Record.clear();
        Stream.EmitRecord(llvm::bitc::BLOCKINFO_CODE_SETRECORDNAME, RecordIdNameMap[ID].Name);
    And after that you can lower the default size of `SmallVector<> Record` down to, hm, `4`?
clang-doc/BitcodeWriter.h
161 ↗	(On Diff #135682)	This alias is used exactly once, for `Record` member variable in this class. Is there any point in having this alias?
161 ↗	(On Diff #135682)	Also, why is `uint64_t` used? We either push `char`, or `enum`, or `int`. Do we ever need 64-bit?
clang-doc/ClangDoc.h
47 ↗	(On Diff #135682)	Please add space before `{}`, and drop unneeded `;`
clang-doc/Mapper.h
56 ↗	(On Diff #135682)	`ClangDocMapper` class is starting to look like a god-class. I would recommend:
    - Rename `ClangDocMapper` to `ClangDocASTVisitor`. It's kind-of conventional to name `RecursiveASTVisitor`-based classes like that.
    - Move `ClangDocCommentVisitor` out of the `ClangDocMapper`, into `namespace {}` in `clang-doc/Mapper.cpp`
    - Split `ClangDocSerializer` into new .h/.cpp
    - Replace `ClangDocSerializer Serializer;` with `ClangDocSerializer& Serializer;`
    - Instantiate `ClangDocSerializer` (in `MapperActionFactory`, i think?) before `ClangDocMapper`
    - Pass `ClangDocSerializer&` into `ClangDocMapper` ctor.

lebedev.ri mentioned this in D43779: [Tooling] [0/1] Refactor FrontendActionFactory::create() to return std::unique_ptr<>. Feb 26 2018, 12:47 PM

Moved the serialization logic out of the Mapper class and into its own namespace
Updated tests
Addressing comments

In D41102#1017918, @lebedev.ri wrote:

Is there some (internal to BitstreamWriter) logic that would 'assert()' if trying to output some recordid
which, according to the BLOCKINFO_BLOCK, should not be there?
E.g. outputting VERSION in BI_COMMENT_BLOCK_ID?

Yes -- it will fail an assertion:
Assertion 'V == Op.getLiteralValue() && "Invalid abbrev for record!"' failed.

clang-doc/BitcodeWriter.cpp
191 ↗	(On Diff #135682)	No, since `BlockIdNameMap[ID]` returns a `StringRef`, which can be manipulated into an `std::string` or a `const char*`, but the `Stream` wants an `unsigned char`. So, the copying is to satisfy that. Unless there's a better way to convert a `StringRef` into an array of `unsigned char`?
196 ↗	(On Diff #135682)	I'm not entirely certain what you mean -- in `emitBlockId()`, we are storing both the block id and the block name in separate records (`BLOCKINFO_CODE_SETBID`, `BLOCKINFO_CODE_BLOCKNAME`, respectively). In `emitRecordId()`, we're doing something slightly different, in that we emit one record with both the record id and the record name (in record `BLOCKINFO_CODE_SETRECORDNAME`). Replacing the copy loop here has the same issue as above, namely that there isn't an easy way to convert between a `StringRef` and an array of `unsigned char`.
240 ↗	(On Diff #135682)	That assertion fails :/ I could do something like `static_cast<int64_t>(Val) == Val` but that would a) require IntSize being a power of 2, b) require updating the assert anytime IntSize is updated, and c) still throw a warning about comparing a signed to an unsigned int...
clang-doc/BitcodeWriter.h
53 ↗	(On Diff #135559)	No...it's the (max) number of the abbrevs relevant to the block itself, which is to say some subset of the RecordIds for any given block (e.g. for a `BI_COMMENT_BLOCK`, the number of abbrevs would be 12 and so the abbrev width would be 4). To assert for it we could put block start/end markers on the RecordIds and then use that to calculate the bitwidth, if you think the assertion should be there.
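
The "12 abbrevs, so width 4" arithmetic can be sketched as follows. This assumes, as I understand the LLVM bitstream format, that abbreviation IDs 0 to 3 are reserved and application-defined abbreviations start at 4 (which matches the `abbrevid=4` seen in the test output). The function names are made up for this example.

```cpp
#include <cassert>

// Assumption: IDs 0..3 are reserved by the bitstream format, so the first
// abbreviation a block defines gets ID 4.
constexpr unsigned FirstApplicationAbbrev = 4;

// Number of bits needed to represent MaxValue (at least 1).
unsigned bitsNeeded(unsigned MaxValue) {
  unsigned Bits = 0;
  while (MaxValue != 0) {
    ++Bits;
    MaxValue >>= 1;
  }
  return Bits == 0 ? 1 : Bits;
}

// Abbrev ID width a block needs if it defines NumAbbrevs abbreviations.
unsigned abbrevWidthFor(unsigned NumAbbrevs) {
  // The highest ID in use is the last application abbrev; with no
  // application abbrevs it is the last reserved ID (3).
  unsigned MaxId = FirstApplicationAbbrev + NumAbbrevs - 1;
  return bitsNeeded(MaxId);
}
```

With 12 abbreviations the highest ID is 15, which fits in 4 bits; a 13th abbreviation would push the highest ID to 16 and require 5 bits.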

Diffusion mentioned this in rC326201: [Tooling] [0/1] Refactor FrontendActionFactory::create() to return std…. Feb 27 2018, 7:22 AM

Diffusion mentioned this in rL326201: [Tooling] [0/1] Refactor FrontendActionFactory::create() to return std….

Tried fixing tooling::FrontendActionFactory::create() in D43779/D43780, but had to revert due to gcc4.8 issues :/

Thank you for working on this, some more review notes.

In D41102#1020107, @juliehockett wrote:

In D41102#1017918, @lebedev.ri wrote:

Is there some (internal to BitstreamWriter) logic that would 'assert()' if trying to output some recordid
which, according to the BLOCKINFO_BLOCK, should not be there?
E.g. outputting VERSION in BI_COMMENT_BLOCK_ID?

Yes -- it will fail an assertion:
Assertion 'V == Op.getLiteralValue() && "Invalid abbrev for record!"' failed.

Ok, great.
And it will also complain if you try to output a block within block?

clang-doc/BitcodeWriter.cpp
191 ↗	(On Diff #135682)	Aha, i see, did not think of that. But there is a `bytes()` function in `StringRef`, which returns `iterator_range<const unsigned char *>`. Would it help? http://llvm.org/doxygen/classllvm_1_1StringRef.html#a5e8f22c3553e341404b445430a3b075b
240 ↗	(On Diff #135682)	I see. Let's not have this assertion for now, just a `FIXME`.
184 ↗	(On Diff #136010)	That comment seems wrong. If the namespace is indeed supposed to be closed, it should happen after the lambda is called, i.e. assert(RecordIdNameMap.size() == RecordIdCount); return RecordIdNameMap; }(); } // namespace doc // AbbreviationMap
265 ↗	(On Diff #136010)	I think it is as simple as assert(Loc.LineNumber < (1U << BitCodeConstants::LineNumberSize)); ?
367 ↗	(On Diff #136010)	So i guess this should be:
        void ClangDocBitcodeWriter::emitBlockInfo(
            BlockId BID, const std::initializer_list<RecordId> &RIDs) {
          assert(RIDs.size() < (1U << BitCodeConstants::SubblockIDSize) &&
                 "Too many records in a block!");
          emitBlockID(BID);
          ...
    ?
clang-doc/BitcodeWriter.h
53 ↗	(On Diff #135559)	Aha, i see, so that should go into `ClangDocBitcodeWriter::emitBlockInfoBlock()`, since that already has that info. (On a related note, it feels like this all should be somehow tablegen-generated, but that is for some later, post-commit cleanup.)

Fixing comments

In D41102#1020808, @lebedev.ri wrote:

Ok, great.
And it will also complain if you try to output a block within block?

Um...no. Since you can have subblocks within blocks.

clang-doc/BitcodeWriter.cpp
191 ↗	(On Diff #135682)	Replaced it with an ArrayRef to the `bytes_begin()` and `bytes_end()`, but that only works for the block id, not the record id, since `emitRecordId()` also has to emit the ID number in addition to the name in the same record.
265 ↗	(On Diff #136010)	`LineNumber` is a signed int, so the compiler complains that we're comparing signed and unsigned ints.
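
One possible way around the signed/unsigned comparison is to reject negative values first and only then compare in an unsigned type. The sketch below is a suggestion, not what the patch does; `fitsInFixedField()` is a hypothetical helper, and treating negative values as "does not fit" is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if Val can be stored in an unsigned fixed-width field of
// `Bits` bits (Bits must be less than 64). Negative values never fit.
bool fitsInFixedField(int Val, unsigned Bits) {
  if (Val < 0)
    return false;
  // Both sides are unsigned 64-bit here, so no sign-compare warning.
  return static_cast<std::uint64_t>(Val) < (std::uint64_t(1) << Bits);
}
```

A call site could then read `assert(fitsInFixedField(Loc.LineNumber, BitCodeConstants::LineNumberSize));`, which keeps the check independent of whether the field width is a power of two.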

lebedev.ri added inline comments. Feb 28 2018, 7:23 AM

clang-doc/BitcodeWriter.h
37 ↗	(On Diff #136161)	Hmm, you build with asserts enabled, right?
I tried testing this, and three tests fail with

clang-doc: /build/llvm/include/llvm/Bitcode/BitstreamWriter.h:122: void llvm::BitstreamWriter::Emit(uint32_t, unsigned int): Assertion `(Val & ~(~0U >> (32-NumBits))) == 0 && "High bits set!"' failed.

Failing Tests (3):
    Clang Tools :: clang-doc/mapper-class-in-function.cpp
    Clang Tools :: clang-doc/mapper-function.cpp
    Clang Tools :: clang-doc/mapper-method.cpp

  Expected Passes    : 6
  Unexpected Failures: 3

At least one failure is because of BoolSize, so i'd suspect the assertion itself is wrong...

Running clang-format and fixing newlines

clang-doc/BitcodeWriter.h
37 ↗	(On Diff #136161)	I do, and I've definitely seen that one triggered before but it's been because something was off in how the data was being outputted as I was shifting things around. That said, I'm not seeing it in my local build with this diff though -- I'll update it again just to make sure they're in sync.

Thank you for working on this!
Some more review notes.
Please look into adding a bit more tests.

clang-doc/BitcodeWriter.cpp
196 ↗	(On Diff #135682)	Tried locally, and yes, we do need to output record id. What we could actually do, is simply inline that `EmitRecord()`, first emitting the RID, and then the name.
        template <typename Container>
        void EmitRecord(unsigned Code, int ID, const Container &Vals) {
          // If we don't have an abbrev to use, emit this in its fully unabbreviated
          // form.
          auto Count = static_cast<uint32_t>(makeArrayRef(Vals).size());
          EmitCode(bitc::UNABBREV_RECORD);
          EmitVBR(Code, 6);
          EmitVBR(Count + 1, 6); // Including ID
          EmitVBR64(ID, 6);      // 'Prefix' with ID
          for (unsigned i = 0, e = Count; i != e; ++i)
            EmitVBR64(Vals[i], 6);
        }
    But that will result in rather ugly code. So given that the record names are quite short, and all the other strings we output directly, maybe leave it as it is for now, until it shows in profiles?
179 ↗	(On Diff #136303)	Since this is the only string we ever push to `Record`, can we add an assertion to make sure we always have enough room for it? E.g.
        for (const auto &Init : Inits) {
          RecordId RID = Init.first;
          RecordIdNameMap[RID] = Init.second;
          assert((1 + RecordIdNameMap[RID].size()) <= Record.size());
          // Since record was just created, it should not have any dynamic size.
          // Or move the small size into a variable and use it when declaring the Record and here.
        }
230 ↗	(On Diff #136303)	Sadly, i can not prove it via godbolt (can't add LLVM as library), but i'd expect streamlining this should at least not hurt, i.e. something like Record.append(RecordIdNameMap[ID].Name.begin(), RecordIdNameMap[ID].Name.end()); ?
clang-doc/BitcodeWriter.h
37 ↗	(On Diff #136161)	I did not retry with updated tree/patch, but i'm quite sure i did hit those asserts. My current build line:
        -DCMAKE_BUILD_TYPE:STRING=RelWithDebInfo
        -DLLVM_BINUTILS_INCDIR:PATH=/usr/include
        -DLLVM_BUILD_TESTS:BOOL=ON
        -DLLVM_ENABLE_ASSERTIONS:BOOL=ON
        -DLLVM_ENABLE_LLD:BOOL=ON
        -DLLVM_ENABLE_PROJECTS:STRING=clang;libcxx;libcxxabi;compiler-rt;lld
        -DLLVM_ENABLE_SPHINX:BOOL=ON
        -DLLVM_ENABLE_WERROR:BOOL=ON
        -DLLVM_PARALLEL_LINK_JOBS:STRING=1
        -DLLVM_TARGETS_TO_BUILD:STRING=X86
        -DLLVM_USE_SANITIZER:STRING=Address
    Additional env variables:
        export MALLOC_CHECK_=3
        export MALLOC_PERTURB_=$(($RANDOM % 255 + 1))
        export ASAN_OPTIONS=abort_on_error=1
        export UBSAN_OPTIONS=print_stacktrace=1
226 ↗	(On Diff #136303)	Needs a comment about the choice of static size of Record. I.e. the maximal amount of stuff we expect to push there is recordname string (right now `IsDefinition` is the longest at `13` chars) + 1 integer. And add a newline // Notes SmallVector<uint32_t, 16> Record; llvm::BitstreamWriter &Stream; ...
clang-doc/Mapper.cpp
28 ↗	(On Diff #136303)	+// If we should ignore this declaration, exit this decl ?
clang-doc/Mapper.h
30 ↗	(On Diff #136303)	I wonder if we could reflect the usage of `RecursiveASTVisitor` in the class name. Though `ClangDocMapperASTVisitor` sounds too long?
clang-doc/Representation.h
27 ↗	(On Diff #136303)	Is there an intentional decision to minimize `sizeof()` of these structs? Many(?) of those could be `SmallString`'s
test/CMakeLists.txt
44 ↗	(On Diff #136303)	There are no tests with `CommentBlock` blocks.
test/clang-doc/mapper-class-in-class.cpp
6 ↗	(On Diff #136161)	Ok, so this actually produced `c:@S@X.bc` and `c:@S@X@S@Y.bc`. Please do something like:
        // RUN: llvm-bcanalyzer %t/docs/c:@S@X.bc --dump | FileCheck %s --check-prefix CHECK-X
        // RUN: llvm-bcanalyzer %t/docs/c:@S@X@S@Y.bc --dump | FileCheck %s --check-prefix CHECK-X-Y
        // CHECK-X: <BLOCKINFO_BLOCK/>
        // CHECK-X: <VersionBlock NumWords=1 BlockCodeSize=4>
        // CHECK-X: <Version abbrevid=4 op0=1/>
        // CHECK-X: </VersionBlock>
        // CHECK-X: <RecordBlock NumWords=6 BlockCodeSize=4>
        // CHECK-X: <USR abbrevid=4 op0=6/> blob data = 'c:@S@X'
        // CHECK-X: <Name abbrevid=5 op0=1/> blob data = 'X'
        // CHECK-X: <IsDefinition abbrevid=7 op0=1/>
        // CHECK-X: <TagType abbrevid=10 op0=3/>
        // CHECK-X: </RecordBlock>
        // CHECK-X-Y: <BLOCKINFO_BLOCK/>
        // CHECK-X-Y: <VersionBlock NumWords=1 BlockCodeSize=4>
        // CHECK-X-Y: <Version abbrevid=4 op0=1/>
        // CHECK-X-Y: </VersionBlock>
        // CHECK-X-Y: <RecordBlock NumWords=11 BlockCodeSize=4>
        // CHECK-X-Y: <USR abbrevid=4 op0=10/> blob data = 'c:@S@X@S@Y'
        // CHECK-X-Y: <Name abbrevid=5 op0=1/> blob data = 'Y'
        // CHECK-X-Y: <Namespace abbrevid=6 op0=1 op1=6/> blob data = 'c:@S@X'
        // CHECK-X-Y: <IsDefinition abbrevid=7 op0=1/>
        // CHECK-X-Y: <TagType abbrevid=10 op0=3/>
        // CHECK-X-Y: </RecordBlock>
    On a related note, is there any way to auto-generate these `CHECK` lines? There is this `llvm/utils/update_test_checks.py`, but i doubt it will work here.
test/clang-doc/mapper-class-in-function.cpp
8 ↗	(On Diff #136161)	Here too, i suppose
test/clang-doc/mapper-enum.cpp
7–8 ↗	(On Diff #136303)	Could you please also add a similar `enum class` test?
17 ↗	(On Diff #136303)	Can `TypeBlock` be on the same depth as `VersionBlock`? Via `using`/`typename`? If yes, please add such a test.
test/clang-doc/mapper-method.cpp
8 ↗	(On Diff #136161)	And here
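
The record-name layout discussed for `emitRecordId()` (the ID first, then the name's characters, all in one record that should fit the vector's static size) can be modelled without LLVM. This is a simplified sketch: `std::vector` stands in for `SmallVector<uint32_t, 16>`, and `makeRecordNameRecord()` and `fitsStaticSize()` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Mirrors the inline size of SmallVector<uint32_t, 16> mentioned above.
constexpr std::size_t RecordSize = 16;

// Builds the payload of a record-name record: the record ID followed by
// one element per character of the name.
std::vector<std::uint32_t> makeRecordNameRecord(unsigned ID,
                                                const std::string &Name) {
  std::vector<std::uint32_t> Record;
  Record.reserve(RecordSize);
  Record.push_back(ID);
  for (unsigned char C : Name)
    Record.push_back(C);
  return Record;
}

// True if the ID plus the name fits without growing past the static size.
bool fitsStaticSize(const std::string &Name) {
  return 1 + Name.size() <= RecordSize;
}
```

A check like `fitsStaticSize()` could run once per entry while the name map is being built, which is roughly what the suggested assertion is after.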

Fixing comments and adding tests

Thank you for working on this!
Some more nitpicking.

Please consider adding even more tests (ideally, all this code should have 100% test coverage)

clang-doc/BitcodeWriter.cpp
139 ↗	(On Diff #136520)	This change is not covered by tests. (I've actually found that out the hard way, by trying to find why it didn't trigger any assertions, oh well)
325 ↗	(On Diff #136520)	I think it would be cleaner to move it (at least the enterblock, it might make sense to leave the header at the very top) after the static variable
363 ↗	(On Diff #136520)	I.e.
        ... , FUNCTION_IS_METHOD}}};
        Stream.EnterBlockInfoBlock();
        for (const auto &Block : TheBlocks) {
          assert(Block.second.size() < (1U << BitCodeConstants::SubblockIDSize));
          emitBlockInfo(Block.first, Block.second);
        }
        Stream.ExitBlock();
        emitVersion();
        }
clang-doc/BitcodeWriter.h
19 ↗	(On Diff #136520)	Please sort includes, clang-tidy complains.
32 ↗	(On Diff #136520)
        /build/clang-tools-extra/clang-doc/BitcodeWriter.h:32:23: warning: invalid case style for variable 'VERSION_NUMBER' [readability-identifier-naming]
        static const unsigned VERSION_NUMBER = 1;
                              ^~~~~~~~~~~~~~
                              VersionNumber
163 ↗	(On Diff #136520)	The simplest solution would be
        #ifndef NDEBUG // Don't want explicit dtor unless needed
        ~ClangDocBitcodeWriter() {
          // Check that the static size is large-enough.
          assert(Record.capacity() == BitCodeConstants::RecordSize);
        }
        #endif
228 ↗	(On Diff #136520)	So you want to be really definitive with this. I wanted to avoid that, actually.. Then i'm afraid one more assert is needed, to make sure this is actually true. I'm not seeing any way to make `SmallVector` completely static, so you could either add one more wrapper around it (rather ugly), or check the final size in the `ClangDocBitcodeWriter` destructor (will not pinpoint when the size has 'overflowed')
246 ↗	(On Diff #136520)	Does it ever make sense to output `BlockInfoBlock` anywhere else other than once at the very beginning? I'd think you should drop the boolean param, and unconditionally call `emitBlockInfoBlock();` from the `ClangDocBitcodeWriter::ClangDocBitcodeWriter()` ctor.
248 ↗	(On Diff #136520)	The naming choices confuse me. There is `writeBitstream()` and `emitBlock()`, which is called from `writeBitstream()` to write the actual contents of the block. Why is one `write` and the other `emit`? To match the `BitstreamWriter` naming choices (which uses the `Emit` prefix)? To avoid the confusion of which one outputs the actual content, and which one outputs the whole block? I think it should be:

- void emitBlock(const NamespaceInfo &I);
+ void emitBlockContent(const NamespaceInfo &I);
- void ClangDocBitcodeWriter::writeBitstream(const T &I, bool WriteBlockInfo);
+ void ClangDocBitcodeWriter::emitBlock(const T &I, bool EmitBlockInfo);

This way, i think their names would state more clearly what they do, and won't be weirdly different. What do you think?
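The static-size concern raised above for `Record` can be illustrated without LLVM. The following is only a sketch under stated assumptions: `FixedRecord` is a hypothetical wrapper around `std::array`, not `llvm::SmallVector`, and it corresponds to the "one more wrapper" alternative, reporting an overflow at the moment it happens instead of checking the final capacity in a destructor.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a vector that must never outgrow its inline
// storage. This is not part of LLVM or clang-doc.
template <std::size_t N> class FixedRecord {
  std::array<std::uint32_t, N> Data{};
  std::size_t Size = 0;

public:
  // Returns false instead of growing when the static size is exhausted,
  // so the caller learns about the overflow at the point it occurs.
  bool push_back(std::uint32_t V) {
    if (Size == N)
      return false;
    Data[Size++] = V;
    return true;
  }

  void clear() { Size = 0; }
  std::size_t size() const { return Size; }
  static constexpr std::size_t capacity() { return N; }
};
```

A destructor-only check is cheaper to write, but as the comment notes it cannot say which push caused the overflow; a wrapper like this one can.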
clang-doc/Representation.h
18 ↗	(On Diff #136520)	Please sort includes, clang-tidy complains.
clang-doc/Serialize.cpp
88 ↗	(On Diff #136520)	/build/clang-tools-extra/clang-doc/Serialize.cpp:88:17: warning: invalid case style for variable 'i' [readability-identifier-naming] for (unsigned i = 0, e = C->getNumArgs(); i < e; ++i) ^ ~ ~~ I I I /build/clang-tools-extra/clang-doc/Serialize.cpp:88:24: warning: invalid case style for variable 'e' [readability-identifier-naming] for (unsigned i = 0, e = C->getNumArgs(); i < e; ++i) ^ ~~ E E
107 ↗	(On Diff #136520)	/build/clang-tools-extra/clang-doc/Serialize.cpp:107:19: warning: invalid case style for variable 'i' [readability-identifier-naming] for (unsigned i = 0, e = C->getDepth(); i < e; ++i) ^ ~ ~~ I I I /build/clang-tools-extra/clang-doc/Serialize.cpp:107:26: warning: invalid case style for variable 'e' [readability-identifier-naming] for (unsigned i = 0, e = C->getDepth(); i < e; ++i) ^ ~~ E E
clang-doc/Serialize.h
19 ↗	(On Diff #136520)	Please sort includes, clang-tidy complains.
clang-doc/tool/ClangDocMain.cpp
80 ↗	(On Diff #136520)	Why at the beginning though? Couldn't the user pass `-extra-arg=-fno-parse-all-comments`, which could override this?
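The ordering question can be made concrete with a small model. This sketch assumes a "last flag wins" rule for the `-fparse-all-comments` / `-fno-parse-all-comments` pair; `parseAllCommentsEnabled` and `withDefault` are hypothetical helpers for illustration and do not reflect how the tool or the compiler driver actually processes arguments.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns whether comment parsing ends up enabled, assuming the last
// occurrence of the flag pair decides the outcome.
bool parseAllCommentsEnabled(const std::vector<std::string> &Args) {
  bool Enabled = false;
  for (const std::string &Arg : Args) {
    if (Arg == "-fparse-all-comments")
      Enabled = true;
    else if (Arg == "-fno-parse-all-comments")
      Enabled = false;
  }
  return Enabled;
}

// Hypothetical helper: adds the tool's default either before or after the
// arguments supplied by the user.
std::vector<std::string> withDefault(std::vector<std::string> UserArgs,
                                     bool AtBeginning) {
  const std::string Default = "-fparse-all-comments";
  if (AtBeginning)
    UserArgs.insert(UserArgs.begin(), Default);
  else
    UserArgs.push_back(Default);
  return UserArgs;
}
```

Under that assumption, a default placed at the beginning can be overridden by a later user argument, while a default appended at the end cannot.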

Adding tests, fixing comments, and removing an (as-of-yet) unused element of the CommentInfo struct.

clang-doc/BitcodeWriter.cpp
196 ↗	(On Diff #135682)	If that makes sense to you, sounds good to me!
139 ↗	(On Diff #136520)	So after some digging, this particular field can't be tested right now as the mapper doesn't look at any `TemplateDecl`s (something that definitely needs to be implemented, but in a follow-on patch). I've removed it for now, until it can be properly used/tested.
clang-doc/BitcodeWriter.h
37 ↗	(On Diff #136161)	Figured it out -- the `Reference` struct didn't have a default for the enum, and so if it wasn't initialized it was undefined. Should be fixed now.
test/clang-doc/mapper-enum.cpp
17 ↗	(On Diff #136303)	Not currently -- I'm planning to add that functionality in the future, but right now it ignores typedef or using decls.
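The `Reference` fix mentioned above amounts to giving the enum member a default. A minimal sketch of that idea follows; the enumerator names and the simplified `Reference` layout are assumptions for illustration, not the definitions from `Representation.h`.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical, simplified kinds of referenced declarations.
enum class InfoType { IT_default, IT_namespace, IT_record, IT_function, IT_enum };

struct Reference {
  Reference() = default;
  Reference(std::string USR, InfoType IT)
      : USR(std::move(USR)), RefType(IT) {}

  std::string USR;
  // Default member initializer: a default-constructed Reference now has a
  // well-defined kind instead of an indeterminate value.
  InfoType RefType = InfoType::IT_default;
};
```

Without the initializer, reading `RefType` from a default-constructed object would yield an indeterminate value, which matches the symptom described in the reply.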

Could some other people please review this differential, too?
I'm sure i have missed things.

Some more nitpicking.

For this differential as standalone, i've mostly run out of things to nitpick.
Some things can probably be done better (the blockid/recordid stuff could probably be nicer if tablegen-ed, but that is for later).

I'll try to look at the next differential, and at them combined.

clang-doc/BitcodeWriter.cpp
120 ↗	(On Diff #136650)	We don't actually push these strings to the `Record` (but instead output them directly), so this assertion is not really meaningful, i think?
clang-doc/BitcodeWriter.h
21 ↗	(On Diff #136650)	+DenseMap
21 ↗	(On Diff #136650)	+StringRef
197 ↗	(On Diff #136650)	Humm, you could avoid this constant, and conserve a few bits, if you move the init-list out of `emitBlockInfoBlock()` to somewhere e.g. after the `enum RecordId`, and then since the `BlockId ID` is already passed, you could compute it on-the-fly the same way the `BitCodeConstants::SubblockIDSize` is asserted in `emitBlockInfo*()`. Not sure if it's worth doing though. Maybe just add it as a `NOTE` here.
249 ↗	(On Diff #136650)	Stale comment
clang-doc/Representation.h
60 ↗	(On Diff #136650)	`Info *Ref;` isn't used anywhere
117 ↗	(On Diff #136650)	`llvm::Optional<Location> DefLoc;` ?

Addressing comments

lebedev.ri added inline comments. Mar 2 2018, 10:38 AM

clang-doc/Representation.h
117 ↗	(On Diff #136791)	I meant that `IsDefinition` controls whether `DefLoc` will be set/used or not. So with `llvm::Optional<Location> DefLoc`, you don't need the `bool IsDefinition`.
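The suggestion can be sketched with `std::optional` standing in for `llvm::Optional` (an assumption made so the example needs only the standard library and a C++17 compiler). `SymbolInfo` and `recordLocation` are hypothetical names; the point is only that the presence of `DefLoc` carries the same information as a separate `IsDefinition` flag.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>
#include <vector>

struct Location {
  int LineNumber;
  std::string Filename;
};

// Simplified, hypothetical symbol record.
struct SymbolInfo {
  std::optional<Location> DefLoc; // Set only when a definition was seen.
  std::vector<Location> Loc;      // Locations of other declarations.

  // No separate bool is needed: the optional already records this.
  bool isDefinition() const { return DefLoc.has_value(); }
};

// Hypothetical helper that files a location in the right place.
void recordLocation(SymbolInfo &I, Location L, bool IsDefinition) {
  if (IsDefinition)
    I.DefLoc = std::move(L);
  else
    I.Loc.push_back(std::move(L));
}
```

This removes one way for the data to become inconsistent, since a flag can no longer claim a definition exists while the location is unset.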

Removing IsDefinition field.

clang-doc/Representation.h
117 ↗	(On Diff #136791)	That...makes so much sense. Oops. Thank you!

Eugene.Zelenko added inline comments. Mar 5 2018, 6:15 PM

clang-doc/BitcodeWriter.h
160 ↗	(On Diff #136809)	Looks like Clang-format was applied incorrectly, because this is Google, not LLVM style. Please note that it doesn't modify the file, it just outputs the formatted code to the terminal. Please reformat other files, including those in dependent patches.

My apologies for getting back on this so late!

In D41102#1017683, @juliehockett wrote:

So, as an idea (as this diff implements), I updated the string references to be a struct, which holds the USR of the referenced type (for serialization, both here in the mapper and for the dump option in the reducer), as well as a pointer to an Info struct. This pointer is not used at this point, but would be populated by the reducer. Thoughts?

This seems like quite a decent approach! That being said, I don't see the pointer yet? I assume you mean that you will be adding this? Additionally, a slight disadvantage of doing this generic approach is that you need to do bookkeeping on what it is referencing, but I guess there's no helping that due to the architecture which makes you rely upon the USR? Personally I'd prefer having the explicit types if and where possible. So for now a RecordInfo has a vector of Reference's to its parents, but we know the parents can only be of certain kinds (more than just a RecordType, but you get the point); it won't be an enum, namespace or function.

As I mentioned, we did this the other way around, which also has the slight advantage that I only had to create and save the USR once per info instance (as in, 10 references to a class only add the overhead of 10 pointers, rather than each having the USR as well), but our disadvantage was of course that we had delayed serialization (although we could arguably do both simultaneously). It seems each method has its merits :).

In D41102#1028228, @Athosvk wrote:

This seems like quite a decent approach! That being said, I don't see the pointer yet? I assume you mean that you will be adding this? Additionally, a slight disadvantage of doing this generic approach is that you need to do bookkeeping on what it is referencing, but I guess there's no helping that due to the architecture which makes you rely upon the USR? Personally I'd prefer having the explicit types if and where possible. So for now a RecordInfo has a vector of Reference's to its parents, but we know the parents can only be of certain kinds (more than just a RecordType, but you get the point); it won't be an enum, namespace or function.

If you take a look at the follow-on patch to this (D43341), you'll see that that is where the pointer is added in (since it is irrelevant to the mapper portion, as it cannot be filled out until the information has been reduced). The back references to children and whatnot are also added there.

As I mentioned, we did this the other way around, which also has the slight advantage that I only had to create and save the USR once per info instance (as in, 10 references to a class only add the overhead of 10 pointers, rather than each having the USR as well), but our disadvantage was of course that we had delayed serialization (although we could arguably do both simultaneously). It seems each method has its merits :).

The USRs are kept for serialization purposes -- given the modular nature of the design, the goal is to be able to write out the bitstream and have it be consumable with all necessary information. Since we can't write out pointers (and it would be useless if we did, since they would change as soon as the file was read in), we maintain the USRs to have a means of re-finding the referenced declaration.

That said, I was looking at the Clangd symbol indexing code yesterday, and noticed that they're hashing the USRs (since they get a little lengthy, particularly when you have nested and/or overloaded functions). I'm going to take a look at that today to try to make the USRs more space-efficient here.

Adding hashing to reduce the size of USRs and updating tests.

Nice!
Some further notes based on the SHA1 nature.

clang-doc/BitcodeWriter.cpp
74 ↗	(On Diff #137244)	Those are mixed up. `USRLengthSize` is definitively supposed to be second.
81 ↗	(On Diff #137244)	The sha1 is all-printable, so how about using `BitCodeAbbrevOp::Encoding::Char6` ? Char4 would work best, but it is not there.
149 ↗	(On Diff #137244)	Ha, and all the `*_USR` are actually `StringAbbrev`'s, not confusing at all :)
309 ↗	(On Diff #137244)	Now it would make sense to also assert that this sha1(usr).strlen() == 20
clang-doc/BitcodeWriter.h
46 ↗	(On Diff #137244)	Can definitively lower this to `5U` (2^5 == 32, which is more than the 20 8-bit chars of sha1)
clang-doc/Representation.h
59 ↗	(On Diff #137244)	Now that USR is sha1'd, this is always 20 8-bit characters long.
107 ↗	(On Diff #137244)	`20` Maybe place `using USRString = SmallString<20>; // SHA1 of USR` somewhere and use it everywhere?

In D41102#1028760, @juliehockett wrote:

If you take a look at the follow-on patch to this (D43341), you'll see that that is where the pointer is added in (since it is irrelevant to the mapper portion, as it cannot be filled out until the information has been reduced). The back references to children and whatnot are also added there.

Oops! I'll have a look!

In D41102#1028760, @juliehockett wrote:

The USRs are kept for serialization purposes -- given the modular nature of the design, the goal is to be able to write out the bitstream and have it be consumable with all necessary information. Since we can't write out pointers (and it would be useless if we did, since they would change as soon as the file was read in), we maintain the USRs to have a means of re-finding the referenced declaration.

What I was referring to was the storing of a USR per reference. Of course, serializing pointers wouldn't work, but what I mean is that what we used as a USR was stored in what was pointed to, not in the reference that tells what we are pointing to. To be a little more concise, a RecordInfo has pointers to the FunctionInfo for its member functions. Upon serialization, the RecordInfo queries the USR of those functions. A function that is referenced multiple times still has its USR stored only once. If I understand correctly, you currently save the USR each time an InfoType references another InfoType.

Anyhow, don't pay too much attention to that comment, it's all meant as a minor thing. It sure is looking good so far!

In D41102#1028995, @lebedev.ri wrote:

Some further notes based on the SHA1 nature.

I'm sorry, brainfreeze, i meant 40 chars, not 20.
Updated comments...

clang-doc/BitcodeWriter.cpp
309 ↗	(On Diff #137244)	40 that is
clang-doc/BitcodeWriter.h
46 ↗	(On Diff #137244)	Edit: to 6U (2^6 == 64, which is more than the 40 8-bit chars of sha1)
clang-doc/Representation.h
59 ↗	(On Diff #137244)	40 that is
107 ↗	(On Diff #137244)	40
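The field-width arithmetic in these comments (and in the 16-bit line-number limit of 65535) can be checked mechanically. The helpers below are hypothetical names introduced for this sketch; they only compute the largest value a fixed-width field can hold and the smallest width that can hold a given value.

```cpp
#include <cassert>
#include <cstdint>

// Largest value representable in a fixed-width field of Bits bits.
constexpr std::uint64_t maxValueForBits(unsigned Bits) {
  return Bits >= 64 ? ~std::uint64_t(0) : (std::uint64_t(1) << Bits) - 1;
}

// Smallest field width (in bits) that can represent Value.
constexpr unsigned bitsNeeded(std::uint64_t Value) {
  unsigned Bits = 0;
  while (Value != 0) {
    ++Bits;
    Value >>= 1;
  }
  return Bits == 0 ? 1 : Bits;
}

static_assert(maxValueForBits(16) == 65535, "16-bit line numbers stop at 65535");
static_assert(bitsNeeded(20) == 5, "a length of 20 fits in 5 bits");
static_assert(bitsNeeded(40) == 6, "a length of 40 needs 6 bits");
```

So a length field of 5 bits would suffice for a 20-element value, while the 40-character hex form needs the `6U` mentioned in the updated comment.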

Updating bitcode writer for hashed USRs, and re-running clang-format. Also cleaning up a couple of unused fields.

Hmm, i'm missing something about the way we store sha1...

clang-doc/BitcodeWriter.cpp
53 ↗	(On Diff #137457)	This is VBR because USRLengthSize is of such strange size, to conserve the bits?
57 ↗	(On Diff #137457)	Looking at the `NumWords` changes (decrease!) in the tests, and this is bugging me. I have now realized what we do with the USR: we first compute the SHA1 and get 20x uint8_t, which we store/use internally; then we hex-ify it, getting 40x char (assuming 8-bit char); then we convert to char6, winning back two bits per char, but we still lose 2 bits per char. Question: why do we store the sha1 of the USR as a string? Why can't we just store that USRString (aka USRSha1 binary) directly? That would be just 20 bytes, and you just couldn't go any lower than that.
clang-doc/Representation.h
29 ↗	(On Diff #137457)	Right, of course, internally this is kept in the binary format, which is just 20 chars. This is not the string (the hex-ified version of sha1), but the raw sha1, the binary. This should somehow convey that. This should be something closer to `USRSha1`.
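The size comparison in the comments above can be spelled out. This sketch assumes a 20-byte raw digest and an uppercase hex encoding; `RawDigest` and `toHex` are hypothetical names, and no actual SHA1 computation is performed here.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

using RawDigest = std::array<std::uint8_t, 20>; // raw SHA1 output: 20 bytes

// Hex-ify a raw digest: every byte becomes two characters.
std::string toHex(const RawDigest &D) {
  static const char Digits[] = "0123456789ABCDEF";
  std::string Out;
  Out.reserve(D.size() * 2);
  for (std::uint8_t B : D) {
    Out.push_back(Digits[B >> 4]);
    Out.push_back(Digits[B & 0x0F]);
  }
  return Out;
}

// Payload sizes for the three encodings discussed in the review.
constexpr std::size_t RawBits = 20 * 8;      // raw digest, 8 bits per byte
constexpr std::size_t HexChar8Bits = 40 * 8; // hex string, 8 bits per char
constexpr std::size_t HexChar6Bits = 40 * 6; // hex string, 6 bits per char
```

Char6 saves 2 bits per character relative to 8-bit characters, but a hex digit carries only 4 bits of information, so the hex form still costs 80 bits more than the raw digest.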

There are a few places where we can trim some of the boilerplate, which I think is important - it's hard to find the "real code" among all the plumbing in places.
Other than that, this seems OK to me.

clang-doc/BitcodeWriter.h
116 ↗	(On Diff #137457)	I think you don't want to declare ID in the unspecialized template, so you get a compile error if you try to use it. (Using traits for this sort of thing seems a bit overboard to me, but YMMV)
154 ↗	(On Diff #137457)	Hmm, you spend a lot of effort plumbing this variable around! Why is it so important? Filesize? (I'm not that familiar with LLVM bitcode, but surely we'll end up with a string table anyway?) If it really is an important option people will want, the command-line arg should probably say why.
241 ↗	(On Diff #137457)	OK, I don't get this at all. We have to declare emitBlockContent(NamespaceInfo) and the specialization of MapFromInfoToBlockId<NamespaceInfo>, and deal with the public interface emitBlock being a template function where you can't tell what's legal to pass, instead of writing:

  void emitBlock(const NamespaceInfo &I) {
    SubStreamBlockGuard Block(Stream, BI_NAMESPACE_BLOCK_ID); // <-- this one line
    ...
  }

This really seems like templates for the sake of templates :(
clang-doc/ClangDoc.h
10 ↗	(On Diff #137457)	This comment doesn't seem accurate - there's no main() in this file. There's a FrontendActionFactory, but nothing in this file uses it.
37 ↗	(On Diff #137457)	nit: seems odd to put all this implementation in the header. (personally I'd just expose a function returning unique_ptr<FrontendActionFactory> from the header, but up to you...)
38 ↗	(On Diff #137457)	for ASTConsumers implemented by ASTVisitors, there seems a fairly strong convention to just make the same class extend both (MapASTVisitor, here). That would eliminate one plumbing class...
clang-doc/Mapper.cpp
33 ↗	(On Diff #137457)	It seems a bit of a poor fit to use a complete bitcode file (header, version, block info) as your value format when you know the format, and know there'll be no version skew. Is it easy just to emit the block we care about?
clang-doc/Representation.h
29 ↗	(On Diff #137457)	I'm not sure that any of the implementation (either USR or SHA) belongs in the type name. In clangd we called this type SymbolID, which seems like a reasonable name here too.
44 ↗	(On Diff #137457)	this is probably the right place to document these fields - what are the legal kinds? what's the name of a comment, direction, etc?

This revision is now accepted and ready to land. Mar 8 2018, 4:51 PM

Closed by commit rL327102: [clang-doc] Setup clang-doc frontend framework (authored by juliehockett). Mar 8 2018, 7:21 PM

This revision was automatically updated to reflect the committed changes.

juliehockett marked 11 inline comments as done.

Herald added a subscriber: llvm-commits. Mar 8 2018, 7:21 PM

Might have been better to not start landing until all the differentials are understood/accepted, but i understand that it is not really up to me to decide.
Let's hope nothing in the next differentials will require changes to this initial code :)

clang-doc/BitcodeWriter.h
241 ↗	(On Diff #137457)	If you want to add a new block, in one case you just need to add one

  template <> struct MapFromInfoToBlockId<???Info> {
    static const BlockId ID = BI_???_BLOCK_ID;
  };

In the other case you need to add the whole

  void ClangDocBitcodeWriter::emitBlock(const ???Info &I) {
    StreamSubBlockGuard Block(Stream, BI_???_BLOCK_ID);
    emitBlockContent(I);
  }

(and it was even longer initially). It seems just templating one static variable is shorter than duplicating `emitBlock()` each time, no? Do compare the current diff with the original diff state. I think these templates helped move much of the duplication to simplify the code overall.
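The trait-based variant being defended here can be sketched in isolation. The code below is an illustration under assumptions, not the writer from the patch: `SketchWriter` is hypothetical and only logs what it would emit, and the enum values are invented. It also shows the earlier suggestion of leaving the primary template undefined so that an unmapped type fails to compile.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum BlockId { BI_NAMESPACE_BLOCK_ID = 1, BI_ENUM_BLOCK_ID };

struct NamespaceInfo {};
struct EnumInfo {};

// The primary template is declared but never defined, so using a type
// without a mapping is a compile-time error rather than a silent default.
template <typename T> struct MapFromInfoToBlockId;

template <> struct MapFromInfoToBlockId<NamespaceInfo> {
  static const BlockId ID = BI_NAMESPACE_BLOCK_ID;
};
template <> struct MapFromInfoToBlockId<EnumInfo> {
  static const BlockId ID = BI_ENUM_BLOCK_ID;
};

// Hypothetical writer that records the calls it would make.
class SketchWriter {
public:
  std::vector<std::string> Log;

  // One generic wrapper handles entering and leaving the block.
  template <typename T> void emitBlock(const T &I) {
    Log.push_back("enter " +
                  std::to_string(static_cast<int>(MapFromInfoToBlockId<T>::ID)));
    emitBlockContent(I);
    Log.push_back("exit");
  }

private:
  // Per-type content still needs its own overload.
  void emitBlockContent(const NamespaceInfo &) { Log.push_back("namespace records"); }
  void emitBlockContent(const EnumInfo &) { Log.push_back("enum records"); }
};
```

The trade-off debated in the thread is visible here: the wrapper is written once, but callers cannot tell from the `emitBlock` signature alone which types are legal to pass.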

Since the commit was reverted, did you mean to either recommit it, or reopen this (with updated diff), so it does not get lost?

In D41102#1034919, @lebedev.ri wrote:

Since the commit was reverted, did you mean to either recommit it, or reopen this (with updated diff), so it does not get lost?

Relanded in r327295.

clang-doc/BitcodeWriter.h
154 ↗	(On Diff #137457)	It was for testing purposes (so that the tests aren't flaky on filenames), but I replaced it with regex.
241 ↗	(On Diff #137457)	You'd still have to add the appropriate `emitBlock()` function for any new block, since it would have different attributes.
clang-doc/Mapper.cpp
33 ↗	(On Diff #137457)	Ideally, yes, but right now in the clang BitstreamWriter there's no way to tell the instance what all the abbreviations are without also emitting the blockinfo to the output stream, though I'm thinking about taking a stab at separating the two. Also, this relies on the llvm-bcanalyzer for testing, which requires both the header and the blockinfo in order to read the data :/

lebedev.ri added inline comments. Mar 14 2018, 1:44 PM

clang-doc/BitcodeWriter.cpp
230 ↗	(On Diff #136303)	And https://github.com/mattgodbolt/compiler-explorer/issues/841 is done, so now we can see that `SmallVector::append()` at least results in less code: https://godbolt.org/g/xJQ59c

So what part is failing, specifically?
The SHA1 blobs of USR's differ in the llvm-bcanalyzer dumps?
The actual filenames %t/docs/bc/<sha1-to-text> differ?
I guess both?

First one you should be able to handle by replacing the actual values with a regex
(i'd guess <USR abbrevid=4 op0=20 op1=11 <...> op19=226 op20=232/> -> <USR abbrevid=4 .*/>, but did not try)
I'm not sure we care about the actual values here, do we?
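The idea of masking the digest operands can be shown with `std::regex`. This is only an illustration of the pattern: `normalizeUSRRecords` is a hypothetical helper, and in the test files themselves the match would be written in the test tool's own pattern syntax rather than in C++.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Replace the operand list of every <USR .../> record with a placeholder,
// so dumps that differ only in digest bytes compare equal.
std::string normalizeUSRRecords(const std::string &Dump) {
  static const std::regex USRRecord(R"(<USR abbrevid=([0-9]+)[^>]*/>)");
  return std::regex_replace(Dump, USRRecord, "<USR abbrevid=$1 .../>");
}
```

Records other than `USR` are left untouched, so the rest of the dump is still compared exactly.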

Second one is interesting.
If we assume that the order in which those are generated is the same, which i think is a safer assumption,
then you could just use result id, not key (sha1-to-text of USR), i.e. %t/docs/bc/00.bc, %t/docs/bc/01.bc and so on.
I.e. something like:

  if (DumpMapperResult) {
+   unsigned id = 0;
    Exec->get()->getToolResults()->forEachResult([&](StringRef Key,
                                                     StringRef Value) {
      SmallString<128> IRRootPath;
      llvm::sys::path::native(OutDirectory, IRRootPath);
      llvm::sys::path::append(IRRootPath, "bc");
      std::error_code DirectoryStatus =
          llvm::sys::fs::create_directories(IRRootPath);
      if (DirectoryStatus != OK) {
        llvm::errs() << "Unable to create documentation directories.\n";
        return;
      }
-     llvm::sys::path::append(IRRootPath, Key + ".bc");
+     llvm::sys::path::append(IRRootPath, std::to_string(id) + ".bc");
      std::error_code OutErrorInfo;
      llvm::raw_fd_ostream OS(IRRootPath, OutErrorInfo, llvm::sys::fs::F_None);
      if (OutErrorInfo != OK) {
        llvm::errs() << "Error opening documentation file.\n";
        return;
      }
      OS << Value;
      OS.close();
+     id++;
    });
  }
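The file names mentioned above (`00.bc`, `01.bc`) are zero-padded, whereas `std::to_string(id)` in the proposed change would produce `0.bc`, `1.bc`. A small sketch of the padded form follows; `indexedBitcodePath` is a hypothetical helper, and it joins paths with a plain `/` rather than using `llvm::sys::path`.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Build "<Dir>/<NN>.bc" from a running result index, zero-padded to at
// least two digits.
std::string indexedBitcodePath(const std::string &Dir, unsigned Index) {
  char Name[32];
  std::snprintf(Name, sizeof(Name), "%02u.bc", Index);
  return Dir + "/" + Name;
}
```

Either naming scheme is stable only if the results are always produced in the same order, which is the assumption stated in the comment.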

Hm, or possibly you could just pass the triple to clang?

I was just thinking of disabling the one test that has an issue (class-in-function) on Windows -- the filename is only used in generating *some* USRs, so all of the other ones are fine. We ran into some issues with that though, since UNSUPPORTED: system-windows didn't seem to disable the test on the machine I have access to. Thoughts?

In D41102#1041773, @juliehockett wrote:

I was just thinking of disabling the one test that has an issue (class-in-function) on Windows -- the filename is only used in generating *some* USRs, so all of the other ones are fine. We ran into some issues with that though, since UNSUPPORTED: system-windows didn't seem to disable the test on the machine I have access to. Thoughts?

UNSUPPORTED: system-windows

Perhaps that is only for msvc?

Have you tried something more broad, like
UNSUPPORTED: mingw32,win32
?

In D41102#1041791, @lebedev.ri wrote:

Have you tried something more broad, like
UNSUPPORTED: mingw32,win32
?

That wasn't working either, confusingly, at least on the local windows machine I have.

Huh, something weird is going on there.
What about the other way around, REQUIRES: linux ?

After much digging, it looks like the lit config is never initialized in clang-tools-extra like it is in the other projects. REQUIRES et al. work properly once that's in there (see D44708). Once that lands I'll reland this and *hopefully* that'll be that!

hintonda removed a subscriber: hintonda. Mar 24 2018, 11:57 AM

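Revision Contents

Path	Size
clang-tools-extra/trunk/CMakeLists.txt	1 line
clang-tools-extra/trunk/clang-doc/BitcodeWriter.h	201 lines
clang-tools-extra/trunk/clang-doc/BitcodeWriter.cpp	517 lines
clang-tools-extra/trunk/clang-doc/CMakeLists.txt	23 lines
clang-tools-extra/trunk/clang-doc/ClangDoc.h	33 lines
clang-tools-extra/trunk/clang-doc/ClangDoc.cpp	61 lines
clang-tools-extra/trunk/clang-doc/Mapper.h	57 lines
clang-tools-extra/trunk/clang-doc/Mapper.cpp	86 lines
clang-tools-extra/trunk/clang-doc/Representation.h	184 lines
clang-tools-extra/trunk/clang-doc/Serialize.h	53 lines
clang-tools-extra/trunk/clang-doc/Serialize.cpp	336 lines
clang-tools-extra/trunk/clang-doc/tool/CMakeLists.txt	17 lines
clang-tools-extra/trunk/clang-doc/tool/ClangDocMain.cpp	114 lines
clang-tools-extra/trunk/docs/clang-doc.rst	62 lines
clang-tools-extra/trunk/test/CMakeLists.txt	1 line
clang-tools-extra/trunk/test/clang-doc/mapper-class-in-class.cpp	35 lines
clang-tools-extra/trunk/test/clang-doc/mapper-class-in-function.cpp	38 lines
clang-tools-extra/trunk/test/clang-doc/mapper-class.cpp	19 lines
clang-tools-extra/trunk/test/clang-doc/mapper-comments.cpp	172 lines
clang-tools-extra/trunk/test/clang-doc/mapper-enum.cpp	36 lines
clang-tools-extra/trunk/test/clang-doc/mapper-function.cpp	25 lines
clang-tools-extra/trunk/test/clang-doc/mapper-method.cpp	43 lines
clang-tools-extra/trunk/test/clang-doc/mapper-namespace.cpp	17 lines
clang-tools-extra/trunk/test/clang-doc/mapper-struct.cpp	23 lines
clang-tools-extra/trunk/test/clang-doc/mapper-union.cpp	29 lines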

Diff 137689

clang-tools-extra/trunk/CMakeLists.txt

  add_subdirectory(clang-apply-replacements)
  add_subdirectory(clang-reorder-fields)
  add_subdirectory(modularize)
  if(CLANG_ENABLE_STATIC_ANALYZER)
  add_subdirectory(clang-tidy)
  add_subdirectory(clang-tidy-vs)
  endif()

  add_subdirectory(change-namespace)
+ add_subdirectory(clang-doc)
  add_subdirectory(clang-query)
  add_subdirectory(clang-move)
  add_subdirectory(clangd)
  add_subdirectory(include-fixer)
  add_subdirectory(pp-trace)
  add_subdirectory(tool-template)

  # Add the common testsuite after all the tools.

clang-tools-extra/trunk/clang-doc/BitcodeWriter.h

				//===-- BitcodeWriter.h - ClangDoc Bitcode Writer --------------- C++ --===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// This file implements a writer for serializing the clang-doc internal
				// representation to LLVM bitcode. The writer takes in a stream and emits the
				// generated bitcode to that stream.
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_CLANG_TOOLS_EXTRA_CLANG_DOC_BITCODEWRITER_H
				#define LLVM_CLANG_TOOLS_EXTRA_CLANG_DOC_BITCODEWRITER_H

				#include "Representation.h"
				#include "clang/AST/AST.h"
				#include "llvm/ADT/DenseMap.h"
				#include "llvm/ADT/SmallVector.h"
				#include "llvm/ADT/StringRef.h"
				#include "llvm/Bitcode/BitstreamWriter.h"
				#include <initializer_list>
				#include <vector>

				namespace clang {
				namespace doc {

				// Current version number of clang-doc bitcode.
				// Should be bumped when removing or changing BlockIds, RecordIds, or
				// BitCodeConstants, though they can be added without breaking it.
				static const unsigned VersionNumber = 1;

				struct BitCodeConstants {
				static constexpr unsigned RecordSize = 16U;
				static constexpr unsigned SignatureBitSize = 8U;
				static constexpr unsigned SubblockIDSize = 4U;
				static constexpr unsigned BoolSize = 1U;
				static constexpr unsigned IntSize = 16U;
				static constexpr unsigned StringLengthSize = 16U;
				static constexpr unsigned FilenameLengthSize = 16U;
				static constexpr unsigned LineNumberSize = 16U;
				static constexpr unsigned ReferenceTypeSize = 8U;
				static constexpr unsigned USRLengthSize = 6U;
				static constexpr unsigned USRBitLengthSize = 8U;
				};

				// New Ids need to be added to both the enum here and the relevant IdNameMap in
				// the implementation file.
				enum BlockId {
				BI_VERSION_BLOCK_ID = llvm::bitc::FIRST_APPLICATION_BLOCKID,
				BI_NAMESPACE_BLOCK_ID,
				BI_ENUM_BLOCK_ID,
				BI_TYPE_BLOCK_ID,
				BI_FIELD_TYPE_BLOCK_ID,
				BI_MEMBER_TYPE_BLOCK_ID,
				BI_RECORD_BLOCK_ID,
				BI_FUNCTION_BLOCK_ID,
				BI_COMMENT_BLOCK_ID,
				BI_FIRST = BI_VERSION_BLOCK_ID,
				BI_LAST = BI_COMMENT_BLOCK_ID
				};

				// New Ids need to be added to the enum here, and to the relevant IdNameMap and
				// initialization list in the implementation file.
				#define INFORECORDS(X) X##_USR, X##_NAME, X##_NAMESPACE

				enum RecordId {
				VERSION = 1,
				INFORECORDS(FUNCTION),
				FUNCTION_DEFLOCATION,
				FUNCTION_LOCATION,
				FUNCTION_PARENT,
				FUNCTION_ACCESS,
				FUNCTION_IS_METHOD,
				COMMENT_KIND,
				COMMENT_TEXT,
				COMMENT_NAME,
				COMMENT_DIRECTION,
				COMMENT_PARAMNAME,
				COMMENT_CLOSENAME,
				COMMENT_SELFCLOSING,
				COMMENT_EXPLICIT,
				COMMENT_ATTRKEY,
				COMMENT_ATTRVAL,
				COMMENT_ARG,
				TYPE_REF,
				FIELD_TYPE_REF,
				FIELD_TYPE_NAME,
				MEMBER_TYPE_REF,
				MEMBER_TYPE_NAME,
				MEMBER_TYPE_ACCESS,
				INFORECORDS(NAMESPACE),
				INFORECORDS(ENUM),
				ENUM_DEFLOCATION,
				ENUM_LOCATION,
				ENUM_MEMBER,
				ENUM_SCOPED,
				INFORECORDS(RECORD),
				RECORD_DEFLOCATION,
				RECORD_LOCATION,
				RECORD_TAG_TYPE,
				RECORD_PARENT,
				RECORD_VPARENT,
				RI_FIRST = VERSION,
				RI_LAST = RECORD_VPARENT
				};

				static constexpr unsigned BlockIdCount = BI_LAST - BI_FIRST + 1;
				static constexpr unsigned RecordIdCount = RI_LAST - RI_FIRST + 1;

				#undef INFORECORDS

				class ClangDocBitcodeWriter {
				public:
				ClangDocBitcodeWriter(llvm::BitstreamWriter &Stream) : Stream(Stream) {
				emitHeader();
				emitBlockInfoBlock();
				emitVersionBlock();
				}

				#ifndef NDEBUG // Don't want explicit dtor unless needed.
				~ClangDocBitcodeWriter() {
				// Check that the static size is large-enough.
				assert(Record.capacity() > BitCodeConstants::RecordSize);
				}
				#endif

				// Block emission of different info types.
				void emitBlock(const NamespaceInfo &I);
				void emitBlock(const RecordInfo &I);
				void emitBlock(const FunctionInfo &I);
				void emitBlock(const EnumInfo &I);
				void emitBlock(const TypeInfo &B);
				void emitBlock(const FieldTypeInfo &B);
				void emitBlock(const MemberTypeInfo &B);
				void emitBlock(const CommentInfo &B);

				private:
				class AbbreviationMap {
				llvm::DenseMap<unsigned, unsigned> Abbrevs;

				public:
				AbbreviationMap() : Abbrevs(RecordIdCount) {}

				void add(RecordId RID, unsigned AbbrevID);
				unsigned get(RecordId RID) const;
				};

				class StreamSubBlockGuard {
				llvm::BitstreamWriter &Stream;

				public:
				StreamSubBlockGuard(llvm::BitstreamWriter &Stream_, BlockId ID)
				: Stream(Stream_) {
				// NOTE: SubBlockIDSize could theoretically be calculated on the fly,
				// based on the initialization list of records in each block.
				Stream.EnterSubblock(ID, BitCodeConstants::SubblockIDSize);
				}

				StreamSubBlockGuard() = default;
				StreamSubBlockGuard(const StreamSubBlockGuard &) = delete;
				StreamSubBlockGuard &operator=(const StreamSubBlockGuard &) = delete;

				~StreamSubBlockGuard() { Stream.ExitBlock(); }
				};

				// Emission of validation and overview blocks.
				void emitHeader();
				void emitVersionBlock();
				void emitRecordID(RecordId ID);
				void emitBlockID(BlockId ID);
				void emitBlockInfoBlock();
				void emitBlockInfo(BlockId BID, const std::initializer_list<RecordId> &RIDs);

				// Emission of individual record types.
				void emitRecord(StringRef Str, RecordId ID);
				void emitRecord(const SymbolID &Str, RecordId ID);
				void emitRecord(const Location &Loc, RecordId ID);
				void emitRecord(const Reference &Ref, RecordId ID);
				void emitRecord(bool Value, RecordId ID);
				void emitRecord(int Value, RecordId ID);
				void emitRecord(unsigned Value, RecordId ID);
				bool prepRecordData(RecordId ID, bool ShouldEmit = true);

				// Emission of appropriate abbreviation type.
				void emitAbbrev(RecordId ID, BlockId Block);

				// Static size is the maximum length of the block/record names we're pushing
				// to this + 1. Longest is currently `MemberTypeBlock` at 15 chars.
				SmallVector<uint32_t, BitCodeConstants::RecordSize> Record;
				llvm::BitstreamWriter &Stream;
				AbbreviationMap Abbrevs;
				};

				} // namespace doc
				} // namespace clang

				#endif // LLVM_CLANG_TOOLS_EXTRA_CLANG_DOC_BITCODEWRITER_H
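The `StreamSubBlockGuard` in the header above is an RAII scope guard: entering the sub-block in the constructor and leaving it in the destructor. A standalone sketch of that pattern follows, under the assumption of a `MockStream` that merely logs calls; it is not `llvm::BitstreamWriter`, and `emitNested` is a hypothetical function added to show the ordering.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a bitstream writer that only logs calls.
struct MockStream {
  std::vector<std::string> Calls;
  void EnterSubblock(unsigned ID) { Calls.push_back("enter " + std::to_string(ID)); }
  void ExitBlock() { Calls.push_back("exit"); }
};

// Enters a sub-block on construction and exits it on destruction, so every
// early return or nested scope still closes the block exactly once.
class SubBlockGuard {
  MockStream &Stream;

public:
  SubBlockGuard(MockStream &S, unsigned ID) : Stream(S) {
    Stream.EnterSubblock(ID);
  }
  SubBlockGuard(const SubBlockGuard &) = delete;
  SubBlockGuard &operator=(const SubBlockGuard &) = delete;
  ~SubBlockGuard() { Stream.ExitBlock(); }
};

// Nested guards close in reverse order of construction.
void emitNested(MockStream &S) {
  SubBlockGuard Outer(S, 1);
  S.Calls.push_back("outer record");
  {
    SubBlockGuard Inner(S, 2);
    S.Calls.push_back("inner record");
  }
}
```

Deleting the copy operations matters for this pattern, since a copied guard would exit the same block twice.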

clang-tools-extra/trunk/clang-doc/BitcodeWriter.cpp

				//===-- BitcodeWriter.cpp - ClangDoc Bitcode Writer ------------- C++ --===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//

				#include "BitcodeWriter.h"
				#include "llvm/ADT/IndexedMap.h"

				namespace clang {
				namespace doc {

				// Since id enums are not zero-indexed, we need to transform the given id into
				// its associated index.
				struct BlockIdToIndexFunctor {
				using argument_type = unsigned;
				unsigned operator()(unsigned ID) const { return ID - BI_FIRST; }
				};

				struct RecordIdToIndexFunctor {
				using argument_type = unsigned;
				unsigned operator()(unsigned ID) const { return ID - RI_FIRST; }
				};

				using AbbrevDsc = void (*)(std::shared_ptr<llvm::BitCodeAbbrev> &Abbrev);

				static void AbbrevGen(std::shared_ptr<llvm::BitCodeAbbrev> &Abbrev,
				const std::initializer_list<llvm::BitCodeAbbrevOp> Ops) {
				for (const auto &Op : Ops)
				Abbrev->Add(Op);
				}

				static void BoolAbbrev(std::shared_ptr<llvm::BitCodeAbbrev> &Abbrev) {
				AbbrevGen(Abbrev,
				{// 0. Boolean
				llvm::BitCodeAbbrevOp(llvm::BitCodeAbbrevOp::Fixed,
				BitCodeConstants::BoolSize)});
				}

				static void IntAbbrev(std::shared_ptr<llvm::BitCodeAbbrev> &Abbrev) {
				AbbrevGen(Abbrev,
				{// 0. Fixed-size integer
				llvm::BitCodeAbbrevOp(llvm::BitCodeAbbrevOp::Fixed,
				BitCodeConstants::IntSize)});
				}

				static void SymbolIDAbbrev(std::shared_ptr<llvm::BitCodeAbbrev> &Abbrev) {
				AbbrevGen(Abbrev,
				{// 0. Fixed-size integer (length of the sha1'd USR)
				llvm::BitCodeAbbrevOp(llvm::BitCodeAbbrevOp::Fixed,
				BitCodeConstants::USRLengthSize),
				// 1. Fixed-size array of Char6 (USR)
				llvm::BitCodeAbbrevOp(llvm::BitCodeAbbrevOp::Array),
				llvm::BitCodeAbbrevOp(llvm::BitCodeAbbrevOp::Fixed,
				BitCodeConstants::USRBitLengthSize)});
				}

				static void StringAbbrev(std::shared_ptr<llvm::BitCodeAbbrev> &Abbrev) {
				AbbrevGen(Abbrev,
				{// 0. Fixed-size integer (length of the following string)
				llvm::BitCodeAbbrevOp(llvm::BitCodeAbbrevOp::Fixed,
				BitCodeConstants::StringLengthSize),
				// 1. The string blob
				llvm::BitCodeAbbrevOp(llvm::BitCodeAbbrevOp::Blob)});
				}

				// Assumes that the file will not have more than 65535 lines.
				static void LocationAbbrev(std::shared_ptr<llvm::BitCodeAbbrev> &Abbrev) {
				AbbrevGen(
				Abbrev,
				{// 0. Fixed-size integer (line number)
				llvm::BitCodeAbbrevOp(llvm::BitCodeAbbrevOp::Fixed,
				BitCodeConstants::LineNumberSize),
				// 1. Fixed-size integer (length of the following string (filename))
				llvm::BitCodeAbbrevOp(llvm::BitCodeAbbrevOp::Fixed,
				BitCodeConstants::StringLengthSize),
				// 2. The string blob
				llvm::BitCodeAbbrevOp(llvm::BitCodeAbbrevOp::Blob)});
				}

				static void ReferenceAbbrev(std::shared_ptr<llvm::BitCodeAbbrev> &Abbrev) {
				AbbrevGen(Abbrev,
				{// 0. Fixed-size integer (ref type)
				llvm::BitCodeAbbrevOp(llvm::BitCodeAbbrevOp::Fixed,
				BitCodeConstants::ReferenceTypeSize),
				// 1. Fixed-size integer (length of the USR or UnresolvedName)
				llvm::BitCodeAbbrevOp(llvm::BitCodeAbbrevOp::Fixed,
				BitCodeConstants::StringLengthSize),
				// 2. The string blob
				llvm::BitCodeAbbrevOp(llvm::BitCodeAbbrevOp::Blob)});
				}

				struct RecordIdDsc {
				llvm::StringRef Name;
				AbbrevDsc Abbrev = nullptr;

				RecordIdDsc() = default;
				RecordIdDsc(llvm::StringRef Name, AbbrevDsc Abbrev)
				: Name(Name), Abbrev(Abbrev) {}

				// Is this 'description' valid?
				operator bool() const {
				return Abbrev != nullptr && Name.data() != nullptr && !Name.empty();
				}
				};

				static const llvm::IndexedMap<llvm::StringRef, BlockIdToIndexFunctor>
				BlockIdNameMap = []() {
				llvm::IndexedMap<llvm::StringRef, BlockIdToIndexFunctor> BlockIdNameMap;
				BlockIdNameMap.resize(BlockIdCount);

				// There is no init-list constructor for the IndexedMap, so have to
				// improvise
				static constexpr std::initializer_list<
				std::pair<BlockId, const char *const>>
				Inits = {{BI_VERSION_BLOCK_ID, "VersionBlock"},
				{BI_NAMESPACE_BLOCK_ID, "NamespaceBlock"},
				{BI_ENUM_BLOCK_ID, "EnumBlock"},
				{BI_TYPE_BLOCK_ID, "TypeBlock"},
				{BI_FIELD_TYPE_BLOCK_ID, "FieldTypeBlock"},
				{BI_MEMBER_TYPE_BLOCK_ID, "MemberTypeBlock"},
				{BI_RECORD_BLOCK_ID, "RecordBlock"},
				{BI_FUNCTION_BLOCK_ID, "FunctionBlock"},
				{BI_COMMENT_BLOCK_ID, "CommentBlock"}};
				static_assert(Inits.size() == BlockIdCount,
				"unexpected count of initializers");
				for (const auto &Init : Inits)
				BlockIdNameMap[Init.first] = Init.second;
				assert(BlockIdNameMap.size() == BlockIdCount);
				return BlockIdNameMap;
				}();

				static const llvm::IndexedMap<RecordIdDsc, RecordIdToIndexFunctor>
				RecordIdNameMap = []() {
				llvm::IndexedMap<RecordIdDsc, RecordIdToIndexFunctor> RecordIdNameMap;
				RecordIdNameMap.resize(RecordIdCount);

				// There is no init-list constructor for the IndexedMap, so have to
				// improvise
				static std::initializer_list<std::pair<RecordId, RecordIdDsc>> Inits = {
				{VERSION, {"Version", &IntAbbrev}},
				{COMMENT_KIND, {"Kind", &StringAbbrev}},
				{COMMENT_TEXT, {"Text", &StringAbbrev}},
				{COMMENT_NAME, {"Name", &StringAbbrev}},
				{COMMENT_DIRECTION, {"Direction", &StringAbbrev}},
				{COMMENT_PARAMNAME, {"ParamName", &StringAbbrev}},
				{COMMENT_CLOSENAME, {"CloseName", &StringAbbrev}},
				{COMMENT_SELFCLOSING, {"SelfClosing", &BoolAbbrev}},
				{COMMENT_EXPLICIT, {"Explicit", &BoolAbbrev}},
				{COMMENT_ATTRKEY, {"AttrKey", &StringAbbrev}},
				{COMMENT_ATTRVAL, {"AttrVal", &StringAbbrev}},
				{COMMENT_ARG, {"Arg", &StringAbbrev}},
				{TYPE_REF, {"Type", &ReferenceAbbrev}},
				{FIELD_TYPE_REF, {"Type", &ReferenceAbbrev}},
				{FIELD_TYPE_NAME, {"Name", &StringAbbrev}},
				{MEMBER_TYPE_REF, {"Type", &ReferenceAbbrev}},
				{MEMBER_TYPE_NAME, {"Name", &StringAbbrev}},
				{MEMBER_TYPE_ACCESS, {"Access", &IntAbbrev}},
				{NAMESPACE_USR, {"USR", &SymbolIDAbbrev}},
				{NAMESPACE_NAME, {"Name", &StringAbbrev}},
				{NAMESPACE_NAMESPACE, {"Namespace", &ReferenceAbbrev}},
				{ENUM_USR, {"USR", &SymbolIDAbbrev}},
				{ENUM_NAME, {"Name", &StringAbbrev}},
				{ENUM_NAMESPACE, {"Namespace", &ReferenceAbbrev}},
				{ENUM_DEFLOCATION, {"DefLocation", &LocationAbbrev}},
				{ENUM_LOCATION, {"Location", &LocationAbbrev}},
				{ENUM_MEMBER, {"Member", &StringAbbrev}},
				{ENUM_SCOPED, {"Scoped", &BoolAbbrev}},
				{RECORD_USR, {"USR", &SymbolIDAbbrev}},
				{RECORD_NAME, {"Name", &StringAbbrev}},
				{RECORD_NAMESPACE, {"Namespace", &ReferenceAbbrev}},
				{RECORD_DEFLOCATION, {"DefLocation", &LocationAbbrev}},
				{RECORD_LOCATION, {"Location", &LocationAbbrev}},
				{RECORD_TAG_TYPE, {"TagType", &IntAbbrev}},
				{RECORD_PARENT, {"Parent", &ReferenceAbbrev}},
				{RECORD_VPARENT, {"VParent", &ReferenceAbbrev}},
				{FUNCTION_USR, {"USR", &SymbolIDAbbrev}},
				{FUNCTION_NAME, {"Name", &StringAbbrev}},
				{FUNCTION_NAMESPACE, {"Namespace", &ReferenceAbbrev}},
				{FUNCTION_DEFLOCATION, {"DefLocation", &LocationAbbrev}},
				{FUNCTION_LOCATION, {"Location", &LocationAbbrev}},
				{FUNCTION_PARENT, {"Parent", &ReferenceAbbrev}},
				{FUNCTION_ACCESS, {"Access", &IntAbbrev}},
				{FUNCTION_IS_METHOD, {"IsMethod", &BoolAbbrev}}};
				// assert(Inits.size() == RecordIdCount);
				for (const auto &Init : Inits) {
				RecordIdNameMap[Init.first] = Init.second;
				assert((Init.second.Name.size() + 1) <= BitCodeConstants::RecordSize);
				}
				// assert(RecordIdNameMap.size() == RecordIdCount);
				return RecordIdNameMap;
				}();

				static const std::initializer_list<
				std::pair<BlockId, std::initializer_list<RecordId>>>
				RecordsByBlock{
				// Version Block
				{BI_VERSION_BLOCK_ID, {VERSION}},
				// Comment Block
				{BI_COMMENT_BLOCK_ID,
				{COMMENT_KIND, COMMENT_TEXT, COMMENT_NAME, COMMENT_DIRECTION,
				COMMENT_PARAMNAME, COMMENT_CLOSENAME, COMMENT_SELFCLOSING,
				COMMENT_EXPLICIT, COMMENT_ATTRKEY, COMMENT_ATTRVAL, COMMENT_ARG}},
				// Type Block
				{BI_TYPE_BLOCK_ID, {TYPE_REF}},
				// FieldType Block
				{BI_FIELD_TYPE_BLOCK_ID, {FIELD_TYPE_REF, FIELD_TYPE_NAME}},
				// MemberType Block
				{BI_MEMBER_TYPE_BLOCK_ID,
				{MEMBER_TYPE_REF, MEMBER_TYPE_NAME, MEMBER_TYPE_ACCESS}},
				// Enum Block
				{BI_ENUM_BLOCK_ID,
				{ENUM_USR, ENUM_NAME, ENUM_NAMESPACE, ENUM_DEFLOCATION, ENUM_LOCATION,
				ENUM_MEMBER, ENUM_SCOPED}},
				// Namespace Block
				{BI_NAMESPACE_BLOCK_ID,
				{NAMESPACE_USR, NAMESPACE_NAME, NAMESPACE_NAMESPACE}},
				// Record Block
				{BI_RECORD_BLOCK_ID,
				{RECORD_USR, RECORD_NAME, RECORD_NAMESPACE, RECORD_DEFLOCATION,
				RECORD_LOCATION, RECORD_TAG_TYPE, RECORD_PARENT, RECORD_VPARENT}},
				// Function Block
				{BI_FUNCTION_BLOCK_ID,
				{FUNCTION_USR, FUNCTION_NAME, FUNCTION_NAMESPACE, FUNCTION_DEFLOCATION,
				FUNCTION_LOCATION, FUNCTION_PARENT, FUNCTION_ACCESS,
				FUNCTION_IS_METHOD}}};

				// AbbreviationMap

				void ClangDocBitcodeWriter::AbbreviationMap::add(RecordId RID,
				unsigned AbbrevID) {
				assert(RecordIdNameMap[RID] && "Unknown RecordId.");
				assert(Abbrevs.find(RID) == Abbrevs.end() && "Abbreviation already added.");
				Abbrevs[RID] = AbbrevID;
				}

				unsigned ClangDocBitcodeWriter::AbbreviationMap::get(RecordId RID) const {
				assert(RecordIdNameMap[RID] && "Unknown RecordId.");
				assert(Abbrevs.find(RID) != Abbrevs.end() && "Unknown abbreviation.");
				return Abbrevs.lookup(RID);
				}

				// Validation and Overview Blocks

				/// \brief Emits the magic number header to check that it's the right format,
				/// in this case, 'DOCS'.
				void ClangDocBitcodeWriter::emitHeader() {
				for (char C : llvm::StringRef("DOCS"))
				Stream.Emit((unsigned)C, BitCodeConstants::SignatureBitSize);
				}

				void ClangDocBitcodeWriter::emitVersionBlock() {
				StreamSubBlockGuard Block(Stream, BI_VERSION_BLOCK_ID);
				emitRecord(VersionNumber, VERSION);
				}

				/// \brief Emits a block ID and the block name to the BLOCKINFO block.
				void ClangDocBitcodeWriter::emitBlockID(BlockId BID) {
				const auto &BlockIdName = BlockIdNameMap[BID];
				assert(BlockIdName.data() && BlockIdName.size() && "Unknown BlockId.");

				Record.clear();
				Record.push_back(BID);
				Stream.EmitRecord(llvm::bitc::BLOCKINFO_CODE_SETBID, Record);
				Stream.EmitRecord(llvm::bitc::BLOCKINFO_CODE_BLOCKNAME,
				ArrayRef<unsigned char>(BlockIdNameMap[BID].bytes_begin(),
				BlockIdNameMap[BID].bytes_end()));
				}

				/// \brief Emits a record name to the BLOCKINFO block.
				void ClangDocBitcodeWriter::emitRecordID(RecordId ID) {
				assert(RecordIdNameMap[ID] && "Unknown RecordId.");
				prepRecordData(ID);
				Record.append(RecordIdNameMap[ID].Name.begin(),
				RecordIdNameMap[ID].Name.end());
				Stream.EmitRecord(llvm::bitc::BLOCKINFO_CODE_SETRECORDNAME, Record);
				}

				// Abbreviations

				void ClangDocBitcodeWriter::emitAbbrev(RecordId ID, BlockId Block) {
				assert(RecordIdNameMap[ID] && "Unknown abbreviation.");
				auto Abbrev = std::make_shared<llvm::BitCodeAbbrev>();
				Abbrev->Add(llvm::BitCodeAbbrevOp(ID));
				RecordIdNameMap[ID].Abbrev(Abbrev);
				Abbrevs.add(ID, Stream.EmitBlockInfoAbbrev(Block, std::move(Abbrev)));
				}

				// Records

				void ClangDocBitcodeWriter::emitRecord(const SymbolID &Sym, RecordId ID) {
				assert(RecordIdNameMap[ID] && "Unknown RecordId.");
				assert(RecordIdNameMap[ID].Abbrev == &SymbolIDAbbrev &&
				"Abbrev type mismatch.");
				if (!prepRecordData(ID, !Sym.empty()))
				return;
				assert(Sym.size() == 20);
				// std::string Out = llvm::toHex(llvm::toStringRef(Str));
				Record.push_back(Sym.size());
				// for (unsigned I = 0, E = Sym.size(); I != E; ++I) {
				// assert(llvm::BitCodeAbbrevOp::isFixed(Sym[I]));
				// Record.push_back(Sym[I]);
				// }
				Record.append(Sym.begin(), Sym.end());
				Stream.EmitRecordWithAbbrev(Abbrevs.get(ID), Record);
				}

				void ClangDocBitcodeWriter::emitRecord(llvm::StringRef Str, RecordId ID) {
				assert(RecordIdNameMap[ID] && "Unknown RecordId.");
				assert(RecordIdNameMap[ID].Abbrev == &StringAbbrev &&
				"Abbrev type mismatch.");
				if (!prepRecordData(ID, !Str.empty()))
				return;
				assert(Str.size() < (1U << BitCodeConstants::StringLengthSize));
				Record.push_back(Str.size());
				Stream.EmitRecordWithBlob(Abbrevs.get(ID), Record, Str);
				}

				void ClangDocBitcodeWriter::emitRecord(const Location &Loc, RecordId ID) {
				assert(RecordIdNameMap[ID] && "Unknown RecordId.");
				assert(RecordIdNameMap[ID].Abbrev == &LocationAbbrev &&
				"Abbrev type mismatch.");
				if (!prepRecordData(ID, true))
				return;
				// FIXME: Assert that the line number is of the appropriate size.
				Record.push_back(Loc.LineNumber);
				assert(Loc.Filename.size() < (1U << BitCodeConstants::StringLengthSize));
				// Record.push_back(Loc.Filename.size());
				// Stream.EmitRecordWithBlob(Abbrevs.get(ID), Record, Loc.Filename);
				Record.push_back(4);
				Stream.EmitRecordWithBlob(Abbrevs.get(ID), Record, "test");
				}

				void ClangDocBitcodeWriter::emitRecord(const Reference &Ref, RecordId ID) {
				assert(RecordIdNameMap[ID] && "Unknown RecordId.");
				assert(RecordIdNameMap[ID].Abbrev == &ReferenceAbbrev &&
				"Abbrev type mismatch.");
				SmallString<40> StringUSR;
				StringRef OutString;
				if (Ref.RefType == InfoType::IT_default)
				OutString = Ref.UnresolvedName;
				else {
				StringUSR = llvm::toHex(llvm::toStringRef(Ref.USR));
				OutString = StringUSR;
				}
				if (!prepRecordData(ID, !OutString.empty()))
				return;
				assert(OutString.size() < (1U << BitCodeConstants::StringLengthSize));
				Record.push_back((int)Ref.RefType);
				Record.push_back(OutString.size());
				Stream.EmitRecordWithBlob(Abbrevs.get(ID), Record, OutString);
				}

				void ClangDocBitcodeWriter::emitRecord(bool Val, RecordId ID) {
				assert(RecordIdNameMap[ID] && "Unknown RecordId.");
				assert(RecordIdNameMap[ID].Abbrev == &BoolAbbrev && "Abbrev type mismatch.");
				if (!prepRecordData(ID, Val))
				return;
				Record.push_back(Val);
				Stream.EmitRecordWithAbbrev(Abbrevs.get(ID), Record);
				}

				void ClangDocBitcodeWriter::emitRecord(int Val, RecordId ID) {
				assert(RecordIdNameMap[ID] && "Unknown RecordId.");
				assert(RecordIdNameMap[ID].Abbrev == &IntAbbrev && "Abbrev type mismatch.");
				if (!prepRecordData(ID, Val))
				return;
				// FIXME: Assert that the integer is of the appropriate size.
				Record.push_back(Val);
				Stream.EmitRecordWithAbbrev(Abbrevs.get(ID), Record);
				}

				void ClangDocBitcodeWriter::emitRecord(unsigned Val, RecordId ID) {
				assert(RecordIdNameMap[ID] && "Unknown RecordId.");
				assert(RecordIdNameMap[ID].Abbrev == &IntAbbrev && "Abbrev type mismatch.");
				if (!prepRecordData(ID, Val))
				return;
				assert(Val < (1U << BitCodeConstants::IntSize));
				Record.push_back(Val);
				Stream.EmitRecordWithAbbrev(Abbrevs.get(ID), Record);
				}

				bool ClangDocBitcodeWriter::prepRecordData(RecordId ID, bool ShouldEmit) {
				assert(RecordIdNameMap[ID] && "Unknown RecordId.");
				if (!ShouldEmit)
				return false;
				Record.clear();
				Record.push_back(ID);
				return true;
				}

				// BlockInfo Block

				void ClangDocBitcodeWriter::emitBlockInfoBlock() {
				Stream.EnterBlockInfoBlock();
				for (const auto &Block : RecordsByBlock) {
				assert(Block.second.size() < (1U << BitCodeConstants::SubblockIDSize));
				emitBlockInfo(Block.first, Block.second);
				}
				Stream.ExitBlock();
				}

				void ClangDocBitcodeWriter::emitBlockInfo(
				BlockId BID, const std::initializer_list<RecordId> &RIDs) {
				assert(RIDs.size() < (1U << BitCodeConstants::SubblockIDSize));
				emitBlockID(BID);
				for (RecordId RID : RIDs) {
				emitRecordID(RID);
				emitAbbrev(RID, BID);
				}
				}

				// Block emission

				void ClangDocBitcodeWriter::emitBlock(const TypeInfo &T) {
				StreamSubBlockGuard Block(Stream, BI_TYPE_BLOCK_ID);
				emitRecord(T.Type, TYPE_REF);
				}

				void ClangDocBitcodeWriter::emitBlock(const FieldTypeInfo &T) {
				StreamSubBlockGuard Block(Stream, BI_FIELD_TYPE_BLOCK_ID);
				emitRecord(T.Type, FIELD_TYPE_REF);
				emitRecord(T.Name, FIELD_TYPE_NAME);
				}

				void ClangDocBitcodeWriter::emitBlock(const MemberTypeInfo &T) {
				StreamSubBlockGuard Block(Stream, BI_MEMBER_TYPE_BLOCK_ID);
				emitRecord(T.Type, MEMBER_TYPE_REF);
				emitRecord(T.Name, MEMBER_TYPE_NAME);
				emitRecord(T.Access, MEMBER_TYPE_ACCESS);
				}

				void ClangDocBitcodeWriter::emitBlock(const CommentInfo &I) {
				StreamSubBlockGuard Block(Stream, BI_COMMENT_BLOCK_ID);
				for (const auto &L :
				std::initializer_list<std::pair<llvm::StringRef, RecordId>>{
				{I.Kind, COMMENT_KIND},
				{I.Text, COMMENT_TEXT},
				{I.Name, COMMENT_NAME},
				{I.Direction, COMMENT_DIRECTION},
				{I.ParamName, COMMENT_PARAMNAME},
				{I.CloseName, COMMENT_CLOSENAME}})
				emitRecord(L.first, L.second);
				emitRecord(I.SelfClosing, COMMENT_SELFCLOSING);
				emitRecord(I.Explicit, COMMENT_EXPLICIT);
				for (const auto &A : I.AttrKeys)
				emitRecord(A, COMMENT_ATTRKEY);
				for (const auto &A : I.AttrValues)
				emitRecord(A, COMMENT_ATTRVAL);
				for (const auto &A : I.Args)
				emitRecord(A, COMMENT_ARG);
				for (const auto &C : I.Children)
				emitBlock(*C);
				}

				#define EMITINFO(X) \
				emitRecord(I.USR, X##_USR); \
				emitRecord(I.Name, X##_NAME); \
				for (const auto &N : I.Namespace) \
				emitRecord(N, X##_NAMESPACE); \
				for (const auto &CI : I.Description) \
				emitBlock(CI);

				void ClangDocBitcodeWriter::emitBlock(const NamespaceInfo &I) {
				StreamSubBlockGuard Block(Stream, BI_NAMESPACE_BLOCK_ID);
				EMITINFO(NAMESPACE)
				}

				void ClangDocBitcodeWriter::emitBlock(const EnumInfo &I) {
				StreamSubBlockGuard Block(Stream, BI_ENUM_BLOCK_ID);
				EMITINFO(ENUM)
				if (I.DefLoc)
				emitRecord(I.DefLoc.getValue(), ENUM_DEFLOCATION);
				for (const auto &L : I.Loc)
				emitRecord(L, ENUM_LOCATION);
				emitRecord(I.Scoped, ENUM_SCOPED);
				for (const auto &N : I.Members)
				emitRecord(N, ENUM_MEMBER);
				}

				void ClangDocBitcodeWriter::emitBlock(const RecordInfo &I) {
				StreamSubBlockGuard Block(Stream, BI_RECORD_BLOCK_ID);
				EMITINFO(RECORD)
				if (I.DefLoc)
				emitRecord(I.DefLoc.getValue(), RECORD_DEFLOCATION);
				for (const auto &L : I.Loc)
				emitRecord(L, RECORD_LOCATION);
				emitRecord(I.TagType, RECORD_TAG_TYPE);
				for (const auto &N : I.Members)
				emitBlock(N);
				for (const auto &P : I.Parents)
				emitRecord(P, RECORD_PARENT);
				for (const auto &P : I.VirtualParents)
				emitRecord(P, RECORD_VPARENT);
				}

				void ClangDocBitcodeWriter::emitBlock(const FunctionInfo &I) {
				StreamSubBlockGuard Block(Stream, BI_FUNCTION_BLOCK_ID);
				EMITINFO(FUNCTION)
				emitRecord(I.IsMethod, FUNCTION_IS_METHOD);
				if (I.DefLoc)
				emitRecord(I.DefLoc.getValue(), FUNCTION_DEFLOCATION);
				for (const auto &L : I.Loc)
				emitRecord(L, FUNCTION_LOCATION);
				emitRecord(I.Parent, FUNCTION_PARENT);
				emitBlock(I.ReturnType);
				for (const auto &N : I.Params)
				emitBlock(N);
				}

				#undef EMITINFO

				} // namespace doc
				} // namespace clang
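
The two FIXMEs in the emitRecord overloads above ask for an assertion that a value fits its fixed-width field. A minimal sketch of such a check follows. The helper names and AssumedLineNumberSize are hypothetical and not part of this patch; the 16-bit width is only inferred from the "65535 lines" comment on LocationAbbrev, since BitCodeConstants is defined in BitcodeWriter.h, which is not shown in this listing.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of the patch): true if Value can be stored in
// an unsigned fixed-width field that is BitWidth bits wide.
constexpr bool fitsInFixedField(std::uint64_t Value, unsigned BitWidth) {
  // Shifting a 64-bit value by 64 or more is undefined, so such widths are
  // treated as "always fits" and the shift is never evaluated for them.
  return BitWidth >= 64 || Value < (std::uint64_t{1} << BitWidth);
}

// Signed inputs (for example an int line number) fit only when non-negative.
constexpr bool signedFitsInFixedField(std::int64_t Value, unsigned BitWidth) {
  return Value >= 0 &&
         fitsInFixedField(static_cast<std::uint64_t>(Value), BitWidth);
}

// Assumed width, inferred from the "65535 lines" comment; the real constant
// lives in BitcodeWriter.h.
constexpr unsigned AssumedLineNumberSize = 16;

// Appends a line number to a record buffer, checking the width first.
inline void pushLineNumber(std::vector<std::uint64_t> &Record, int Line) {
  assert(signedFitsInFixedField(Line, AssumedLineNumberSize) &&
         "line number does not fit in the fixed-width field");
  Record.push_back(static_cast<std::uint64_t>(Line));
}
```

With a helper along these lines, the existing checks of the form `Val < (1U << Size)` and the two FIXMEs could share one code path, and the signed variant would also reject negative line numbers.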

clang-tools-extra/trunk/clang-doc/CMakeLists.txt

				set(LLVM_LINK_COMPONENTS
				support
				)

				add_clang_library(clangDoc
				BitcodeWriter.cpp
				ClangDoc.cpp
				Mapper.cpp
				Serialize.cpp

				LINK_LIBS
				clangAnalysis
				clangAST
				clangASTMatchers
				clangBasic
				clangFrontend
				clangIndex
				clangLex
				clangTooling
				clangToolingCore
				)

				add_subdirectory(tool)

clang-tools-extra/trunk/clang-doc/ClangDoc.h

				//===-- ClangDoc.h - ClangDoc ------------------------------------ C++ --===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// This file exposes a method to create the FrontendActionFactory for the
				// clang-doc tool. The factory runs the clang-doc mapper on a given set of
				// source code files, storing the results as key-value pairs in its
				// ExecutionContext.
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_CLANG_TOOLS_EXTRA_CLANG_DOC_CLANGDOC_H
				#define LLVM_CLANG_TOOLS_EXTRA_CLANG_DOC_CLANGDOC_H

				#include "clang/Tooling/Execution.h"
				#include "clang/Tooling/StandaloneExecution.h"
				#include "clang/Tooling/Tooling.h"

				namespace clang {
				namespace doc {

				std::unique_ptr<tooling::FrontendActionFactory>
				newMapperActionFactory(tooling::ExecutionContext *ECtx);

				} // namespace doc
				} // namespace clang

				#endif // LLVM_CLANG_TOOLS_EXTRA_CLANG_DOC_CLANGDOC_H

clang-tools-extra/trunk/clang-doc/ClangDoc.cpp

				//===-- ClangDoc.cpp - ClangDoc ---------------------------------- C++ --===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// This file implements the main entry point for the clang-doc tool. It runs
				// the clang-doc mapper on a given set of source code files using a
				// FrontendActionFactory.
				//
				//===----------------------------------------------------------------------===//

				#include "ClangDoc.h"
				#include "Mapper.h"
				#include "clang/AST/AST.h"
				#include "clang/AST/ASTConsumer.h"
				#include "clang/AST/ASTContext.h"
				#include "clang/AST/RecursiveASTVisitor.h"
				#include "clang/Frontend/ASTConsumers.h"
				#include "clang/Frontend/CompilerInstance.h"
				#include "clang/Frontend/FrontendActions.h"

				namespace clang {
				namespace doc {

				class MapperActionFactory : public tooling::FrontendActionFactory {
				public:
				MapperActionFactory(tooling::ExecutionContext *ECtx) : ECtx(ECtx) {}
				clang::FrontendAction *create() override;

				private:
				tooling::ExecutionContext *ECtx;
				};

				clang::FrontendAction *MapperActionFactory::create() {
				class ClangDocAction : public clang::ASTFrontendAction {
				public:
				ClangDocAction(ExecutionContext *ECtx) : ECtx(ECtx) {}

				std::unique_ptr<clang::ASTConsumer>
				CreateASTConsumer(clang::CompilerInstance &Compiler,
				llvm::StringRef InFile) override {
				return llvm::make_unique<MapASTVisitor>(&Compiler.getASTContext(), ECtx);
				}

				private:
				ExecutionContext *ECtx;
				};
				return new ClangDocAction(ECtx);
				}

				std::unique_ptr<tooling::FrontendActionFactory>
				newMapperActionFactory(tooling::ExecutionContext *ECtx) {
				return llvm::make_unique<MapperActionFactory>(ECtx);
				}

				} // namespace doc
				} // namespace clang
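
The factory above hands the same ExecutionContext pointer to every action it creates, so results from all translation units end up in one place. The sketch below shows that shape without any Clang dependency. ResultSink, Action, and ActionFactory are hypothetical stand-ins for tooling::ExecutionContext, ClangDocAction, and MapperActionFactory, and the sketch returns a std::unique_ptr where the patch returns a raw pointer from create().

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for tooling::ExecutionContext: collects key-value
// results reported by any action.
struct ResultSink {
  std::vector<std::pair<std::string, std::string>> Results;
  void reportResult(std::string Key, std::string Value) {
    Results.emplace_back(std::move(Key), std::move(Value));
  }
};

// Hypothetical stand-in for the frontend action: it does not own the sink.
class Action {
public:
  explicit Action(ResultSink *Sink) : Sink(Sink) {
    assert(Sink && "sink must not be null");
  }
  // Pretends to map one file and reports a placeholder value for it.
  void run(const std::string &File) { Sink->reportResult(File, "mapped"); }

private:
  ResultSink *Sink;
};

// Creates a fresh action per call; every action shares the same sink.
class ActionFactory {
public:
  explicit ActionFactory(ResultSink *Sink) : Sink(Sink) {}
  std::unique_ptr<Action> create() const {
    return std::unique_ptr<Action>(new Action(Sink));
  }

private:
  ResultSink *Sink;
};
```

Because the actions only borrow the sink, the caller has to keep it alive for as long as any action may run.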

clang-tools-extra/trunk/clang-doc/Mapper.h

				//===-- Mapper.h - ClangDoc Mapper ------------------------------- C++ --===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// This file implements the Mapper piece of the clang-doc tool. It implements
				// a RecursiveASTVisitor to look at each declaration and populate the info
				// into the internal representation. Each seen declaration is serialized to
				// bitcode and written out to the ExecutionContext as a KV pair where the
				// key is the declaration's USR and the value is the serialized bitcode.
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_CLANG_TOOLS_EXTRA_CLANG_DOC_MAPPER_H
				#define LLVM_CLANG_TOOLS_EXTRA_CLANG_DOC_MAPPER_H

				#include "clang/AST/RecursiveASTVisitor.h"
				#include "clang/Tooling/Execution.h"

				using namespace clang::comments;
				using namespace clang::tooling;

				namespace clang {
				namespace doc {

				class MapASTVisitor : public clang::RecursiveASTVisitor<MapASTVisitor>,
				public ASTConsumer {
				public:
				explicit MapASTVisitor(ASTContext *Ctx, ExecutionContext *ECtx)
				: ECtx(ECtx) {}

				void HandleTranslationUnit(ASTContext &Context) override;
				bool VisitNamespaceDecl(const NamespaceDecl *D);
				bool VisitRecordDecl(const RecordDecl *D);
				bool VisitEnumDecl(const EnumDecl *D);
				bool VisitCXXMethodDecl(const CXXMethodDecl *D);
				bool VisitFunctionDecl(const FunctionDecl *D);

				private:
				template <typename T> bool mapDecl(const T *D);

				int getLine(const NamedDecl *D, const ASTContext &Context) const;
				StringRef getFile(const NamedDecl *D, const ASTContext &Context) const;
				comments::FullComment *getComment(const NamedDecl *D,
				const ASTContext &Context) const;

				ExecutionContext *ECtx;
				};

				} // namespace doc
				} // namespace clang

				#endif // LLVM_CLANG_TOOLS_EXTRA_CLANG_DOC_MAPPER_H

clang-tools-extra/trunk/clang-doc/Mapper.cpp

				//===-- Mapper.cpp - ClangDoc Mapper ----------------------------- C++ --===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//

				#include "Mapper.h"
				#include "BitcodeWriter.h"
				#include "Serialize.h"
				#include "clang/AST/Comment.h"
				#include "clang/Index/USRGeneration.h"
				#include "llvm/ADT/StringExtras.h"

				using clang::comments::FullComment;

				namespace clang {
				namespace doc {

				void MapASTVisitor::HandleTranslationUnit(ASTContext &Context) {
				TraverseDecl(Context.getTranslationUnitDecl());
				}

				template <typename T> bool MapASTVisitor::mapDecl(const T *D) {
				// If we're looking at a decl not in user files, skip this decl.
				if (D->getASTContext().getSourceManager().isInSystemHeader(D->getLocation()))
				return true;

				llvm::SmallString<128> USR;
				// If there is an error generating a USR for the decl, skip this decl.
				if (index::generateUSRForDecl(D, USR))
				return true;

				ECtx->reportResult(llvm::toHex(llvm::toStringRef(serialize::hashUSR(USR))),
				serialize::emitInfo(D, getComment(D, D->getASTContext()),
				getLine(D, D->getASTContext()),
				getFile(D, D->getASTContext())));
				return true;
				}

				bool MapASTVisitor::VisitNamespaceDecl(const NamespaceDecl *D) {
				return mapDecl(D);
				}

				bool MapASTVisitor::VisitRecordDecl(const RecordDecl *D) { return mapDecl(D); }

				bool MapASTVisitor::VisitEnumDecl(const EnumDecl *D) { return mapDecl(D); }

				bool MapASTVisitor::VisitCXXMethodDecl(const CXXMethodDecl *D) {
				return mapDecl(D);
				}

				bool MapASTVisitor::VisitFunctionDecl(const FunctionDecl *D) {
				// Don't visit CXXMethodDecls twice
				if (dyn_cast<CXXMethodDecl>(D))
				return true;
				return mapDecl(D);
				}

				comments::FullComment *
				MapASTVisitor::getComment(const NamedDecl *D, const ASTContext &Context) const {
				RawComment *Comment = Context.getRawCommentForDeclNoCache(D);
				// FIXME: Move setAttached to the initial comment parsing.
				if (Comment) {
				Comment->setAttached();
				return Comment->parse(Context, nullptr, D);
				}
				return nullptr;
				}

				int MapASTVisitor::getLine(const NamedDecl *D,
				const ASTContext &Context) const {
				return Context.getSourceManager().getPresumedLoc(D->getLocStart()).getLine();
				}

				llvm::StringRef MapASTVisitor::getFile(const NamedDecl *D,
				const ASTContext &Context) const {
				return Context.getSourceManager()
				.getPresumedLoc(D->getLocStart())
				.getFilename();
				}

				} // namespace doc
				} // namespace clang
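
mapDecl reports each declaration under a key built by hex-encoding the hashed USR. A standalone sketch of that encoding is below. toHexKey is a hypothetical name, and the uppercase output is an assumption about llvm::toHex rather than something this listing confirms. SymbolID mirrors the 20-byte std::array alias from Representation.h.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <string>

// Same shape as the SymbolID alias in Representation.h: a 20-byte hash.
using SymbolID = std::array<std::uint8_t, 20>;

// Hypothetical helper: encodes each byte as two hex digits, so a 20-byte ID
// becomes a 40-character key. Uppercase digits are an assumption here.
inline std::string toHexKey(const SymbolID &ID) {
  static const char Digits[] = "0123456789ABCDEF";
  std::string Out;
  Out.reserve(ID.size() * 2);
  for (std::uint8_t Byte : ID) {
    Out.push_back(Digits[Byte >> 4]);   // high nibble
    Out.push_back(Digits[Byte & 0x0F]); // low nibble
  }
  assert(Out.size() == ID.size() * 2);
  return Out;
}
```

The 40-character result matches the SmallString<40> buffer that the Reference overload of emitRecord uses for the same conversion.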

clang-tools-extra/trunk/clang-doc/Representation.h

				//===-- Representation.h - ClangDoc Representation --------------- C++ --===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// This file defines the internal representations of different declaration
				// types for the clang-doc tool.
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_CLANG_TOOLS_EXTRA_CLANG_DOC_REPRESENTATION_H
				#define LLVM_CLANG_TOOLS_EXTRA_CLANG_DOC_REPRESENTATION_H

				#include "clang/AST/Type.h"
				#include "clang/Basic/Specifiers.h"
				#include "llvm/ADT/Optional.h"
				#include "llvm/ADT/SmallVector.h"
				#include "llvm/ADT/StringExtras.h"
				#include <array>
				#include <string>

				namespace clang {
				namespace doc {

				using SymbolID = std::array<uint8_t, 20>;

				struct Info;
				enum class InfoType {
				IT_namespace,
				IT_record,
				IT_function,
				IT_enum,
				IT_default
				};

				// A representation of a parsed comment.
				struct CommentInfo {
				CommentInfo() = default;
				CommentInfo(CommentInfo &&Other) : Children(std::move(Other.Children)) {}

				SmallString<16>
				Kind; // Kind of comment (TextComment, InlineCommandComment,
				// HTMLStartTagComment, HTMLEndTagComment, BlockCommandComment,
				// ParamCommandComment, TParamCommandComment, VerbatimBlockComment,
				// VerbatimBlockLineComment, VerbatimLineComment).
				SmallString<64> Text; // Text of the comment.
				SmallString<16> Name; // Name of the comment (for Verbatim and HTML).
				SmallString<8> Direction; // Parameter direction (for (T)ParamCommand).
				SmallString<16> ParamName; // Parameter name (for (T)ParamCommand).
				SmallString<16> CloseName; // Closing tag name (for VerbatimBlock).
				bool SelfClosing = false; // Indicates if tag is self-closing (for HTML).
				bool Explicit = false; // Indicates if the direction of a param is explicit
				// (for (T)ParamCommand).
				llvm::SmallVector<SmallString<16>, 4>
				AttrKeys; // List of attribute keys (for HTML).
				llvm::SmallVector<SmallString<16>, 4>
				AttrValues; // List of attribute values for each key (for HTML).
				llvm::SmallVector<SmallString<16>, 4>
				Args; // List of arguments to commands (for InlineCommand).
				std::vector<std::unique_ptr<CommentInfo>>
				Children; // List of child comments for this CommentInfo.
				};

				struct Reference {
				Reference() = default;
				Reference(llvm::StringRef Name) : UnresolvedName(Name) {}
				Reference(SymbolID USR, InfoType IT) : USR(USR), RefType(IT) {}

				SymbolID USR; // Unique identifier for referenced decl
				SmallString<16> UnresolvedName; // Name of unresolved type.
				InfoType RefType =
				InfoType::IT_default; // Indicates the type of this Reference (namespace,
				// record, function, enum, default).
				};

				// A base struct for TypeInfos
				struct TypeInfo {
				TypeInfo() = default;
				TypeInfo(SymbolID &Type, InfoType IT) : Type(Type, IT) {}
				TypeInfo(llvm::StringRef RefName) : Type(RefName) {}

				Reference Type; // Referenced type in this info.
				};

				// Info for field types.
				struct FieldTypeInfo : public TypeInfo {
				FieldTypeInfo() = default;
				FieldTypeInfo(SymbolID &Type, InfoType IT, llvm::StringRef Name)
				: TypeInfo(Type, IT), Name(Name) {}
				FieldTypeInfo(llvm::StringRef RefName, llvm::StringRef Name)
				: TypeInfo(RefName), Name(Name) {}

				SmallString<16> Name; // Name associated with this info.
				};

				// Info for member types.
				struct MemberTypeInfo : public FieldTypeInfo {
				MemberTypeInfo() = default;
				MemberTypeInfo(SymbolID &Type, InfoType IT, llvm::StringRef Name)
				: FieldTypeInfo(Type, IT, Name) {}
				MemberTypeInfo(llvm::StringRef RefName, llvm::StringRef Name)
				: FieldTypeInfo(RefName, Name) {}

				AccessSpecifier Access =
				clang::AccessSpecifier::AS_none; // Access level associated with this
				// info (public, protected, private,
				// none).
				};

				struct Location {
				Location() = default;
				Location(int LineNumber, SmallString<16> Filename)
				: LineNumber(LineNumber), Filename(std::move(Filename)) {}

				int LineNumber; // Line number of this Location.
				SmallString<32> Filename; // File for this Location.
				};

				/// A base struct for Infos.
				struct Info {
				Info() = default;
				Info(Info &&Other) : Description(std::move(Other.Description)) {}
				virtual ~Info() = default;

				SymbolID USR; // Unique identifier for the decl described by this Info.
				SmallString<16> Name; // Unqualified name of the decl.
				llvm::SmallVector<Reference, 4>
				Namespace; // List of parent namespaces for this decl.
				std::vector<CommentInfo> Description; // Comment description of this decl.
				};

				// Info for namespaces.
				struct NamespaceInfo : public Info {};

				// Info for symbols.
				struct SymbolInfo : public Info {
				llvm::Optional<Location> DefLoc; // Location where this decl is defined.
				llvm::SmallVector<Location, 2> Loc; // Locations where this decl is declared.
				};

				// TODO: Expand to allow for documenting templating and default args.
				// Info for functions.
				struct FunctionInfo : public SymbolInfo {
				bool IsMethod = false; // Indicates whether this function is a class method.
				Reference Parent; // Reference to the parent class decl for this method.
				TypeInfo ReturnType; // Info about the return type of this function.
				llvm::SmallVector<FieldTypeInfo, 4> Params; // List of parameters.
				AccessSpecifier Access =
				AccessSpecifier::AS_none; // Access level for this method (public,
				// private, protected, none).
				};

				// TODO: Expand to allow for documenting templating, inheritance access,
				// friend classes
				// Info for types.
				struct RecordInfo : public SymbolInfo {
				TagTypeKind TagType = TagTypeKind::TTK_Struct; // Type of this record (struct,
				// class, union, interface).
				llvm::SmallVector<MemberTypeInfo, 4>
				Members; // List of info about record members.
				llvm::SmallVector<Reference, 4> Parents; // List of base/parent records (does
				// not include virtual parents).
				llvm::SmallVector<Reference, 4>
				VirtualParents; // List of virtual base/parent records.
				};

				// TODO: Expand to allow for documenting templating.
				// Info for enums.
				struct EnumInfo : public SymbolInfo {
				bool Scoped =
				false; // Indicates whether this enum is scoped (e.g. enum class).
				llvm::SmallVector<SmallString<16>, 4> Members; // List of enum members.
				};

				// TODO: Add functionality to include separate markdown pages.

				} // namespace doc
				} // namespace clang

				#endif // LLVM_CLANG_TOOLS_EXTRA_CLANG_DOC_REPRESENTATION_H

clang-tools-extra/trunk/clang-doc/Serialize.h

				//===-- Serializer.h - ClangDoc Serializer ----------------------*- C++ -*-===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// This file implements the serializing functions for the clang-doc tool. Given
				// a particular declaration, it collects the appropriate information and returns
				// a serialized bitcode string for the declaration.
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_CLANG_TOOLS_EXTRA_CLANG_DOC_SERIALIZE_H
				#define LLVM_CLANG_TOOLS_EXTRA_CLANG_DOC_SERIALIZE_H

				#include "Representation.h"
				#include "clang/AST/AST.h"
				#include "clang/AST/CommentVisitor.h"
				#include <string>
				#include <vector>

				using namespace clang::comments;

				namespace clang {
				namespace doc {
				namespace serialize {

				std::string emitInfo(const NamespaceDecl *D, const FullComment *FC,
				int LineNumber, StringRef File);
				std::string emitInfo(const RecordDecl *D, const FullComment *FC, int LineNumber,
				StringRef File);
				std::string emitInfo(const EnumDecl *D, const FullComment *FC, int LineNumber,
				StringRef File);
				std::string emitInfo(const FunctionDecl *D, const FullComment *FC,
				int LineNumber, StringRef File);
				std::string emitInfo(const CXXMethodDecl *D, const FullComment *FC,
				int LineNumber, StringRef File);

				// Function to hash a given USR value for storage.
				// As USRs (Unified Symbol Resolution) could be large, especially for functions
				// with long type arguments, we use 160-bit SHA1(USR) values to
				// guarantee the uniqueness of symbols while using a relatively small amount of
				// memory (vs storing USRs directly).
				SymbolID hashUSR(llvm::StringRef USR);

				} // namespace serialize
				} // namespace doc
				} // namespace clang

				#endif // LLVM_CLANG_TOOLS_EXTRA_CLANG_DOC_SERIALIZE_H

clang-tools-extra/trunk/clang-doc/Serialize.cpp

				//===-- Serializer.cpp - ClangDoc Serializer --------------------*- C++ -*-===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//

				#include "Serialize.h"
				#include "BitcodeWriter.h"
				#include "clang/AST/Comment.h"
				#include "clang/Index/USRGeneration.h"
				#include "llvm/ADT/Hashing.h"
				#include "llvm/ADT/StringExtras.h"
				#include "llvm/Support/SHA1.h"

				using clang::comments::FullComment;

				namespace clang {
				namespace doc {
				namespace serialize {

				SymbolID hashUSR(llvm::StringRef USR) {
				return llvm::SHA1::hash(arrayRefFromStringRef(USR));
				}

				class ClangDocCommentVisitor
				: public ConstCommentVisitor<ClangDocCommentVisitor> {
				public:
				ClangDocCommentVisitor(CommentInfo &CI) : CurrentCI(CI) {}

				void parseComment(const comments::Comment *C);

				void visitTextComment(const TextComment *C);
				void visitInlineCommandComment(const InlineCommandComment *C);
				void visitHTMLStartTagComment(const HTMLStartTagComment *C);
				void visitHTMLEndTagComment(const HTMLEndTagComment *C);
				void visitBlockCommandComment(const BlockCommandComment *C);
				void visitParamCommandComment(const ParamCommandComment *C);
				void visitTParamCommandComment(const TParamCommandComment *C);
				void visitVerbatimBlockComment(const VerbatimBlockComment *C);
				void visitVerbatimBlockLineComment(const VerbatimBlockLineComment *C);
				void visitVerbatimLineComment(const VerbatimLineComment *C);

				private:
				std::string getCommandName(unsigned CommandID) const;
				bool isWhitespaceOnly(StringRef S) const;

				CommentInfo &CurrentCI;
				};

				void ClangDocCommentVisitor::parseComment(const comments::Comment *C) {
				CurrentCI.Kind = C->getCommentKindName();
				ConstCommentVisitor<ClangDocCommentVisitor>::visit(C);
				for (comments::Comment *Child :
				llvm::make_range(C->child_begin(), C->child_end())) {
				CurrentCI.Children.emplace_back(llvm::make_unique<CommentInfo>());
				ClangDocCommentVisitor Visitor(*CurrentCI.Children.back());
				Visitor.parseComment(Child);
				}
				}

				void ClangDocCommentVisitor::visitTextComment(const TextComment *C) {
				if (!isWhitespaceOnly(C->getText()))
				CurrentCI.Text = C->getText();
				}

				void ClangDocCommentVisitor::visitInlineCommandComment(
				const InlineCommandComment *C) {
				CurrentCI.Name = getCommandName(C->getCommandID());
				for (unsigned I = 0, E = C->getNumArgs(); I != E; ++I)
				CurrentCI.Args.push_back(C->getArgText(I));
				}

				void ClangDocCommentVisitor::visitHTMLStartTagComment(
				const HTMLStartTagComment *C) {
				CurrentCI.Name = C->getTagName();
				CurrentCI.SelfClosing = C->isSelfClosing();
				for (unsigned I = 0, E = C->getNumAttrs(); I < E; ++I) {
				const HTMLStartTagComment::Attribute &Attr = C->getAttr(I);
				CurrentCI.AttrKeys.push_back(Attr.Name);
				CurrentCI.AttrValues.push_back(Attr.Value);
				}
				}

				void ClangDocCommentVisitor::visitHTMLEndTagComment(
				const HTMLEndTagComment *C) {
				CurrentCI.Name = C->getTagName();
				CurrentCI.SelfClosing = true;
				}

				void ClangDocCommentVisitor::visitBlockCommandComment(
				const BlockCommandComment *C) {
				CurrentCI.Name = getCommandName(C->getCommandID());
				for (unsigned I = 0, E = C->getNumArgs(); I < E; ++I)
				CurrentCI.Args.push_back(C->getArgText(I));
				}

				void ClangDocCommentVisitor::visitParamCommandComment(
				const ParamCommandComment *C) {
				CurrentCI.Direction =
				ParamCommandComment::getDirectionAsString(C->getDirection());
				CurrentCI.Explicit = C->isDirectionExplicit();
				if (C->hasParamName())
				CurrentCI.ParamName = C->getParamNameAsWritten();
				}

				void ClangDocCommentVisitor::visitTParamCommandComment(
				const TParamCommandComment *C) {
				if (C->hasParamName())
				CurrentCI.ParamName = C->getParamNameAsWritten();
				}

				void ClangDocCommentVisitor::visitVerbatimBlockComment(
				const VerbatimBlockComment *C) {
				CurrentCI.Name = getCommandName(C->getCommandID());
				CurrentCI.CloseName = C->getCloseName();
				}

				void ClangDocCommentVisitor::visitVerbatimBlockLineComment(
				const VerbatimBlockLineComment *C) {
				if (!isWhitespaceOnly(C->getText()))
				CurrentCI.Text = C->getText();
				}

				void ClangDocCommentVisitor::visitVerbatimLineComment(
				const VerbatimLineComment *C) {
				if (!isWhitespaceOnly(C->getText()))
				CurrentCI.Text = C->getText();
				}

				bool ClangDocCommentVisitor::isWhitespaceOnly(llvm::StringRef S) const {
				return std::all_of(S.begin(), S.end(), isspace);
				}

				std::string ClangDocCommentVisitor::getCommandName(unsigned CommandID) const {
				const CommandInfo *Info = CommandTraits::getBuiltinCommandInfo(CommandID);
				if (Info)
				return Info->Name;
				// TODO: Add parsing for \file command.
				return "<not a builtin command>";
				}

				// Serializing functions.

				template <typename T> static std::string serialize(T &I) {
				SmallString<2048> Buffer;
				llvm::BitstreamWriter Stream(Buffer);
				ClangDocBitcodeWriter Writer(Stream);
				Writer.emitBlock(I);
				return Buffer.str().str();
				}

				static void parseFullComment(const FullComment *C, CommentInfo &CI) {
				ClangDocCommentVisitor Visitor(CI);
				Visitor.parseComment(C);
				}

				static SymbolID getUSRForDecl(const Decl *D) {
				llvm::SmallString<128> USR;
				if (index::generateUSRForDecl(D, USR))
				return SymbolID();
				return hashUSR(USR);
				}

				static RecordDecl *getDeclForType(const QualType &T) {
				auto *Ty = T->getAs<RecordType>();
				if (!Ty)
				return nullptr;
				return Ty->getDecl()->getDefinition();
				}

				static void parseFields(RecordInfo &I, const RecordDecl *D) {
				for (const FieldDecl *F : D->fields()) {
				// FIXME: Set Access to the appropriate value.
				SymbolID Type;
				std::string Name;
				InfoType RefType;
				if (const auto *T = getDeclForType(F->getTypeSourceInfo()->getType())) {
				Type = getUSRForDecl(T);
				if (dyn_cast<EnumDecl>(T))
				RefType = InfoType::IT_enum;
				else if (dyn_cast<RecordDecl>(T))
				RefType = InfoType::IT_record;
				I.Members.emplace_back(Type, RefType, F->getQualifiedNameAsString());
				} else {
				Name = F->getTypeSourceInfo()->getType().getAsString();
				I.Members.emplace_back(Name, F->getQualifiedNameAsString());
				}
				}
				}

				static void parseEnumerators(EnumInfo &I, const EnumDecl *D) {
				for (const EnumConstantDecl *E : D->enumerators())
				I.Members.emplace_back(E->getNameAsString());
				}

				static void parseParameters(FunctionInfo &I, const FunctionDecl *D) {
				for (const ParmVarDecl *P : D->parameters()) {
				SymbolID Type;
				std::string Name;
				InfoType RefType;
				if (const auto *T = getDeclForType(P->getOriginalType())) {
				Type = getUSRForDecl(T);
				if (dyn_cast<EnumDecl>(T))
				RefType = InfoType::IT_enum;
				else if (dyn_cast<RecordDecl>(T))
				RefType = InfoType::IT_record;
				I.Params.emplace_back(Type, RefType, P->getQualifiedNameAsString());
				} else {
				Name = P->getOriginalType().getAsString();
				I.Params.emplace_back(Name, P->getQualifiedNameAsString());
				}
				}
				}

				static void parseBases(RecordInfo &I, const CXXRecordDecl *D) {
				for (const CXXBaseSpecifier &B : D->bases()) {
				if (B.isVirtual())
				continue;
				if (const auto *P = getDeclForType(B.getType()))
				I.Parents.emplace_back(getUSRForDecl(P), InfoType::IT_record);
				else
				I.Parents.emplace_back(B.getType().getAsString());
				}
				for (const CXXBaseSpecifier &B : D->vbases()) {
				if (const auto *P = getDeclForType(B.getType()))
				I.VirtualParents.emplace_back(getUSRForDecl(P), InfoType::IT_record);
				else
				I.VirtualParents.emplace_back(B.getType().getAsString());
				}
				}

				template <typename T>
				static void
				populateParentNamespaces(llvm::SmallVector<Reference, 4> &Namespaces,
				const T *D) {
				const auto *DC = dyn_cast<DeclContext>(D);
				while ((DC = DC->getParent())) {
				if (const auto *N = dyn_cast<NamespaceDecl>(DC))
				Namespaces.emplace_back(getUSRForDecl(N), InfoType::IT_namespace);
				else if (const auto *N = dyn_cast<RecordDecl>(DC))
				Namespaces.emplace_back(getUSRForDecl(N), InfoType::IT_record);
				else if (const auto *N = dyn_cast<FunctionDecl>(DC))
				Namespaces.emplace_back(getUSRForDecl(N), InfoType::IT_function);
				else if (const auto *N = dyn_cast<EnumDecl>(DC))
				Namespaces.emplace_back(getUSRForDecl(N), InfoType::IT_enum);
				}
				}

				template <typename T>
				static void populateInfo(Info &I, const T *D, const FullComment *C) {
				I.USR = getUSRForDecl(D);
				I.Name = D->getNameAsString();
				populateParentNamespaces(I.Namespace, D);
				if (C) {
				I.Description.emplace_back();
				parseFullComment(C, I.Description.back());
				}
				}

				template <typename T>
				static void populateSymbolInfo(SymbolInfo &I, const T *D, const FullComment *C,
				int LineNumber, StringRef Filename) {
				populateInfo(I, D, C);
				if (D->isThisDeclarationADefinition())
				I.DefLoc.emplace(LineNumber, Filename);
				else
				I.Loc.emplace_back(LineNumber, Filename);
				}

				static void populateFunctionInfo(FunctionInfo &I, const FunctionDecl *D,
				const FullComment *FC, int LineNumber,
				StringRef Filename) {
				populateSymbolInfo(I, D, FC, LineNumber, Filename);
				if (const auto *T = getDeclForType(D->getReturnType())) {
				I.ReturnType.Type.USR = getUSRForDecl(T);
				if (dyn_cast<EnumDecl>(T))
				I.ReturnType.Type.RefType = InfoType::IT_enum;
				else if (dyn_cast<RecordDecl>(T))
				I.ReturnType.Type.RefType = InfoType::IT_record;
				} else {
				I.ReturnType.Type.UnresolvedName = D->getReturnType().getAsString();
				}
				parseParameters(I, D);
				}

				std::string emitInfo(const NamespaceDecl *D, const FullComment *FC,
				int LineNumber, llvm::StringRef File) {
				NamespaceInfo I;
				populateInfo(I, D, FC);
				return serialize(I);
				}

				std::string emitInfo(const RecordDecl *D, const FullComment *FC, int LineNumber,
				llvm::StringRef File) {
				RecordInfo I;
				populateSymbolInfo(I, D, FC, LineNumber, File);
				I.TagType = D->getTagKind();
				parseFields(I, D);
				if (const auto *C = dyn_cast<CXXRecordDecl>(D))
				parseBases(I, C);
				return serialize(I);
				}

				std::string emitInfo(const FunctionDecl *D, const FullComment *FC,
				int LineNumber, llvm::StringRef File) {
				FunctionInfo I;
				populateFunctionInfo(I, D, FC, LineNumber, File);
				I.Access = clang::AccessSpecifier::AS_none;
				return serialize(I);
				}

				std::string emitInfo(const CXXMethodDecl *D, const FullComment *FC,
				int LineNumber, llvm::StringRef File) {
				FunctionInfo I;
				populateFunctionInfo(I, D, FC, LineNumber, File);
				I.IsMethod = true;
				I.Parent = Reference(getUSRForDecl(D->getParent()), InfoType::IT_record);
				I.Access = D->getAccess();
				return serialize(I);
				}

				std::string emitInfo(const EnumDecl *D, const FullComment *FC, int LineNumber,
				llvm::StringRef File) {
				EnumInfo I;
				populateSymbolInfo(I, D, FC, LineNumber, File);
				I.Scoped = D->isScoped();
				parseEnumerators(I, D);
				return serialize(I);
				}

				} // namespace serialize
				} // namespace doc
				} // namespace clang

clang-tools-extra/trunk/clang-doc/tool/CMakeLists.txt

				include_directories(${CMAKE_CURRENT_SOURCE_DIR}/..)

				add_clang_executable(clang-doc
				ClangDocMain.cpp
				)

				target_link_libraries(clang-doc
				PRIVATE
				clangAST
				clangASTMatchers
				clangBasic
				clangFrontend
				clangDoc
				clangTooling
				clangToolingCore
				)


clang-tools-extra/trunk/clang-doc/tool/ClangDocMain.cpp

				//===-- ClangDocMain.cpp - ClangDoc -----------------------------*- C++ -*-===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// This tool generates C and C++ documentation from source code
				// and comments. Generally, it runs a LibTooling FrontendAction on source files,
				// mapping each declaration in those files to its USR and serializing relevant
				// information into LLVM bitcode. It then runs a pass over the collected
				// declaration information, reducing by USR. There is an option to dump this
				// intermediate result to bitcode. Finally, it hands the reduced information
				// off to a generator, which does the final parsing from the intermediate
				// representation to the desired output format.
				//
				//===----------------------------------------------------------------------===//

				#include "ClangDoc.h"
				#include "clang/AST/AST.h"
				#include "clang/AST/Decl.h"
				#include "clang/ASTMatchers/ASTMatchFinder.h"
				#include "clang/ASTMatchers/ASTMatchersInternal.h"
				#include "clang/Driver/Options.h"
				#include "clang/Frontend/FrontendActions.h"
				#include "clang/Tooling/CommonOptionsParser.h"
				#include "clang/Tooling/Execution.h"
				#include "clang/Tooling/StandaloneExecution.h"
				#include "clang/Tooling/Tooling.h"
				#include "llvm/ADT/APFloat.h"
				#include "llvm/Support/FileSystem.h"
				#include "llvm/Support/Path.h"
				#include "llvm/Support/Process.h"
				#include "llvm/Support/Signals.h"
				#include "llvm/Support/raw_ostream.h"
				#include <string>

				using namespace clang::ast_matchers;
				using namespace clang::tooling;
				using namespace clang;

				static llvm::cl::extrahelp CommonHelp(CommonOptionsParser::HelpMessage);
				static llvm::cl::OptionCategory ClangDocCategory("clang-doc options");

				static llvm::cl::opt<std::string>
				OutDirectory("output",
				llvm::cl::desc("Directory for outputting generated files."),
				llvm::cl::init("docs"), llvm::cl::cat(ClangDocCategory));

				static llvm::cl::opt<bool>
				DumpMapperResult("dump-mapper",
				llvm::cl::desc("Dump mapper results to bitcode file."),
				llvm::cl::init(false), llvm::cl::cat(ClangDocCategory));

				static llvm::cl::opt<bool> DoxygenOnly(
				"doxygen",
				llvm::cl::desc("Use only doxygen-style comments to generate docs."),
				llvm::cl::init(false), llvm::cl::cat(ClangDocCategory));

				int main(int argc, const char **argv) {
				llvm::sys::PrintStackTraceOnErrorSignal(argv[0]);
				std::error_code OK;

				auto Exec = clang::tooling::createExecutorFromCommandLineArgs(
				argc, argv, ClangDocCategory);

				if (!Exec) {
				llvm::errs() << toString(Exec.takeError()) << "\n";
				return 1;
				}

				ArgumentsAdjuster ArgAdjuster;
				if (!DoxygenOnly)
				ArgAdjuster = combineAdjusters(
				getInsertArgumentAdjuster("-fparse-all-comments",
				tooling::ArgumentInsertPosition::END),
				ArgAdjuster);

				// Mapping phase
				llvm::outs() << "Mapping decls...\n";
				auto Err = Exec->get()->execute(doc::newMapperActionFactory(
				Exec->get()->getExecutionContext()),
				ArgAdjuster);
				if (Err)
				llvm::errs() << toString(std::move(Err)) << "\n";

				if (DumpMapperResult) {
				Exec->get()->getToolResults()->forEachResult([&](StringRef Key,
				StringRef Value) {
				SmallString<128> IRRootPath;
				llvm::sys::path::native(OutDirectory, IRRootPath);
				llvm::sys::path::append(IRRootPath, "bc");
				std::error_code DirectoryStatus =
				llvm::sys::fs::create_directories(IRRootPath);
				if (DirectoryStatus != OK) {
				llvm::errs() << "Unable to create documentation directories.\n";
				return;
				}
				llvm::sys::path::append(IRRootPath, Key + ".bc");
				std::error_code OutErrorInfo;
				llvm::raw_fd_ostream OS(IRRootPath, OutErrorInfo, llvm::sys::fs::F_None);
				if (OutErrorInfo != OK) {
				llvm::errs() << "Error opening documentation file.\n";
				return;
				}
				OS << Value;
				OS.close();
				});
				}

				return 0;
				}

clang-tools-extra/trunk/docs/clang-doc.rst

				===================
				Clang-Doc
				===================

				.. contents::

				:program:`clang-doc` is a tool for generating C and C++ documentation from
				source code and comments.

				The tool is in a very early development stage, so you might encounter bugs and
				crashes. Submitting reports with information about how to reproduce the issue
				to `the LLVM bugtracker <https://llvm.org/bugs>`_ will definitely help the
				project. If you have any ideas or suggestions, please file a feature request
				there.

				Use
				=====

				:program:`clang-doc` is a `LibTooling
				<http://clang.llvm.org/docs/LibTooling.html>`_-based tool, and so requires a
				compile command database for your project (for an example of how to do this
				see `How To Setup Tooling For LLVM
				<http://clang.llvm.org/docs/HowToSetupToolingForLLVM.html>`_).

				The tool can be used on a single file or multiple files as defined in
				the compile commands database:

				.. code-block:: console

				$ clang-doc /path/to/file.cpp -p /path/to/compile/commands

				This generates an intermediate representation of the declarations and their
				associated information in the specified TUs, serialized to LLVM bitcode.

				As currently implemented, the tool is only able to parse TUs that can be
				stored in-memory. Future additions will extend the current framework to use
				map-reduce frameworks to allow for use with large codebases.

				:program:`clang-doc` offers the following options:

				.. code-block:: console

				$ clang-doc --help
				USAGE: clang-doc [options] <source0> [... <sourceN>]

				OPTIONS:

				Generic Options:

				-help - Display available options (-help-hidden for more)
				-help-list - Display list of available options (-help-list-hidden for more)
				-version - Display the version of this program

				clang-doc options:

				-doxygen - Use only doxygen-style comments to generate docs.
				-dump-mapper - Dump mapper results to bitcode file.
				-extra-arg=<string> - Additional argument to append to the compiler command line
				-extra-arg-before=<string> - Additional argument to prepend to the compiler command line
				-omit-filenames - Omit filenames in output.
				-output=<string> - Directory for outputting generated files.
				-p=<string> - Build path

clang-tools-extra/trunk/test/CMakeLists.txt

				set(CLANG_TOOLS_TEST_DEPS
				# For the clang-apply-replacements test that uses clang-rename.
				clang-rename

				# Individual tools we test.
				clang-apply-replacements
				clang-change-namespace
				clangd
				clang-doc
				clang-include-fixer
				clang-move
				clang-query
				clang-reorder-fields
				find-all-symbols
				modularize
				pp-trace

clang-tools-extra/trunk/test/clang-doc/mapper-class-in-class.cpp

				// RUN: rm -rf %t
				// RUN: mkdir %t
				// RUN: echo "" > %t/compile_flags.txt
				// RUN: cp "%s" "%t/test.cpp"
				// RUN: clang-doc --dump-mapper -doxygen -p %t %t/test.cpp -output=%t/docs
				// RUN: llvm-bcanalyzer %t/docs/bc/641AB4A3D36399954ACDE29C7A8833032BF40472.bc --dump | FileCheck %s --check-prefix CHECK-X-Y
				// RUN: llvm-bcanalyzer %t/docs/bc/CA7C7935730B5EACD25F080E9C83FA087CCDC75E.bc --dump | FileCheck %s --check-prefix CHECK-X

				class X {
				class Y {};
				};

				// CHECK-X: <BLOCKINFO_BLOCK/>
				// CHECK-X-NEXT: <VersionBlock NumWords=1 BlockCodeSize=4>
				// CHECK-X-NEXT: <Version abbrevid=4 op0=1/>
				// CHECK-X-NEXT: </VersionBlock>
				// CHECK-X-NEXT: <RecordBlock NumWords={{[0-9]*}} BlockCodeSize=4>
				// CHECK-X-NEXT: <USR abbrevid=4 op0=20 op1=202 op2=124 op3=121 op4=53 op5=115 op6=11 op7=94 op8=172 op9=210 op10=95 op11=8 op12=14 op13=156 op14=131 op15=250 op16=8 op17=124 op18=205 op19=199 op20=94/>
				// CHECK-X-NEXT: <Name abbrevid=5 op0=1/> blob data = 'X'
				// CHECK-X-NEXT: <DefLocation abbrevid=7 op0=9 op1={{[0-9]*}}/> blob data = '{{.*}}'
				// CHECK-X-NEXT: <TagType abbrevid=9 op0=3/>
				// CHECK-X-NEXT: </RecordBlock>


				// CHECK-X-Y: <BLOCKINFO_BLOCK/>
				// CHECK-X-Y-NEXT: <VersionBlock NumWords=1 BlockCodeSize=4>
				// CHECK-X-Y-NEXT: <Version abbrevid=4 op0=1/>
				// CHECK-X-Y-NEXT: </VersionBlock>
				// CHECK-X-Y-NEXT: <RecordBlock NumWords={{[0-9]*}} BlockCodeSize=4>
				// CHECK-X-Y-NEXT: <USR abbrevid=4 op0=20 op1=100 op2=26 op3=180 op4=163 op5=211 op6=99 op7=153 op8=149 op9=74 op10=205 op11=226 op12=156 op13=122 op14=136 op15=51 op16=3 op17=43 op18=244 op19=4 op20=114/>
				// CHECK-X-Y-NEXT: <Name abbrevid=5 op0=1/> blob data = 'Y'
				// CHECK-X-Y-NEXT: <Namespace abbrevid=6 op0=1 op1=40/> blob data = 'CA7C7935730B5EACD25F080E9C83FA087CCDC75E'
				// CHECK-X-Y-NEXT: <DefLocation abbrevid=7 op0=10 op1={{[0-9]*}}/> blob data = '{{.*}}'
				// CHECK-X-Y-NEXT: <TagType abbrevid=9 op0=3/>
				// CHECK-X-Y-NEXT: </RecordBlock>

clang-tools-extra/trunk/test/clang-doc/mapper-class-in-function.cpp

				// RUN: rm -rf %t
				// RUN: mkdir %t
				// RUN: echo "" > %t/compile_flags.txt
				// RUN: cp "%s" "%t/test.cpp"
				// RUN: clang-doc --dump-mapper -doxygen -p %t %t/test.cpp -output=%t/docs
				// RUN: llvm-bcanalyzer %t/docs/bc/B6AC4C5C9F2EA3F2B3ECE1A33D349F4EE502B24E.bc --dump | FileCheck %s --check-prefix CHECK-H
				// RUN: llvm-bcanalyzer %t/docs/bc/E03E804368784360D86C757B549D14BB84A94415.bc --dump | FileCheck %s --check-prefix CHECK-H-I

				void H() {
				class I {};
				}

				// CHECK-H: <BLOCKINFO_BLOCK/>
				// CHECK-H-NEXT: <VersionBlock NumWords=1 BlockCodeSize=4>
				// CHECK-H-NEXT: <Version abbrevid=4 op0=1/>
				// CHECK-H-NEXT: </VersionBlock>
				// CHECK-H-NEXT: <FunctionBlock NumWords={{[0-9]*}} BlockCodeSize=4>
				// CHECK-H-NEXT: <USR abbrevid=4 op0=20 op1=182 op2=172 op3=76 op4=92 op5=159 op6=46 op7=163 op8=242 op9=179 op10=236 op11=225 op12=163 op13=61 op14=52 op15=159 op16=78 op17=229 op18=2 op19=178 op20=78/>
				// CHECK-H-NEXT: <Name abbrevid=5 op0=1/> blob data = 'H'
				// CHECK-H-NEXT: <DefLocation abbrevid=7 op0=9 op1={{[0-9]*}}/> blob data = '{{.*}}'
				// CHECK-H-NEXT: <TypeBlock NumWords=4 BlockCodeSize=4>
				// CHECK-H-NEXT: <Type abbrevid=4 op0=4 op1=4/> blob data = 'void'
				// CHECK-H-NEXT: </TypeBlock>
				// CHECK-H-NEXT: </FunctionBlock>

				// CHECK-H-I: <BLOCKINFO_BLOCK/>
				// CHECK-H-I-NEXT: <VersionBlock NumWords=1 BlockCodeSize=4>
				// CHECK-H-I-NEXT: <Version abbrevid=4 op0=1/>
				// CHECK-H-I-NEXT: </VersionBlock>
				// CHECK-H-I-NEXT: <RecordBlock NumWords={{[0-9]*}} BlockCodeSize=4>
				// CHECK-H-I-NEXT: <USR abbrevid=4 op0=20 op1=224 op2=62 op3=128 op4=67 op5=104 op6=120 op7=67 op8=96 op9=216 op10=108 op11=117 op12=123 op13=84 op14=157 op15=20 op16=187 op17=132 op18=169 op19=68 op20=21/>
				// CHECK-H-I-NEXT: <Name abbrevid=5 op0=1/> blob data = 'I'
				// CHECK-H-I-NEXT: <Namespace abbrevid=6 op0=2 op1=40/> blob data = 'B6AC4C5C9F2EA3F2B3ECE1A33D349F4EE502B24E'
				// CHECK-H-I-NEXT: <DefLocation abbrevid=7 op0=10 op1={{[0-9]*}}/> blob data = '{{.*}}'
				// CHECK-H-I-NEXT: <TagType abbrevid=9 op0=3/>
				// CHECK-H-I-NEXT: </RecordBlock>

clang-tools-extra/trunk/test/clang-doc/mapper-class.cpp

				// RUN: rm -rf %t
				// RUN: mkdir %t
				// RUN: echo "" > %t/compile_flags.txt
				// RUN: cp "%s" "%t/test.cpp"
				// RUN: clang-doc --dump-mapper -doxygen -p %t %t/test.cpp -output=%t/docs
				// RUN: llvm-bcanalyzer %t/docs/bc/289584A8E0FF4178A794622A547AA622503967A1.bc --dump | FileCheck %s

				class E {};

				// CHECK: <BLOCKINFO_BLOCK/>
				// CHECK-NEXT: <VersionBlock NumWords=1 BlockCodeSize=4>
				// CHECK-NEXT: <Version abbrevid=4 op0=1/>
				// CHECK-NEXT: </VersionBlock>
				// CHECK-NEXT: <RecordBlock NumWords={{[0-9]*}} BlockCodeSize=4>
				// CHECK-NEXT: <USR abbrevid=4 op0=20 op1=40 op2=149 op3=132 op4=168 op5=224 op6=255 op7=65 op8=120 op9=167 op10=148 op11=98 op12=42 op13=84 op14=122 op15=166 op16=34 op17=80 op18=57 op19=103 op20=161/>
				// CHECK-NEXT: <Name abbrevid=5 op0=1/> blob data = 'E'
				// CHECK-NEXT: <DefLocation abbrevid=7 op0=8 op1={{[0-9]*}}/> blob data = '{{.*}}'
				// CHECK-NEXT: <TagType abbrevid=9 op0=3/>
				// CHECK-NEXT: </RecordBlock>

clang-tools-extra/trunk/test/clang-doc/mapper-comments.cpp

				// RUN: rm -rf %t
				// RUN: mkdir %t
				// RUN: echo "" > %t/compile_flags.txt
				// RUN: cp "%s" "%t/test.cpp"
				// RUN: clang-doc --dump-mapper -doxygen -p %t %t/test.cpp -output=%t/docs
				// RUN: llvm-bcanalyzer %t/docs/bc/7574630614A535710E5A6ABCFFF98BCA2D06A4CA.bc --dump | FileCheck %s

				/// \brief Brief description.
				///
				/// Extended description that
				/// continues onto the next line.
				///
				/// <ul> class="test">
				/// <li> Testing.
				/// </ul>
				///
				/// \verbatim
				/// The description continues.
				/// \endverbatim
				///
				/// \param [out] I is a parameter.
				/// \param J is a parameter.
				/// \return int
				int F(int I, int J);

				// CHECK: <BLOCKINFO_BLOCK/>
				// CHECK-NEXT: <VersionBlock NumWords=1 BlockCodeSize=4>
				// CHECK-NEXT: <Version abbrevid=4 op0=1/>
				// CHECK-NEXT: </VersionBlock>
				// CHECK-NEXT: <FunctionBlock NumWords={{[0-9]*}} BlockCodeSize=4>
				// CHECK-NEXT: <USR abbrevid=4 op0=20 op1=117 op2=116 op3=99 op4=6 op5=20 op6=165 op7=53 op8=113 op9=14 op10=90 op11=106 op12=188 op13=255 op14=249 op15=139 op16=202 op17=45 op18=6 op19=164 op20=202/>
				// CHECK-NEXT: <Name abbrevid=5 op0=1/> blob data = 'F'
				// CHECK-NEXT: <CommentBlock NumWords=351 BlockCodeSize=4>
				// CHECK-NEXT: <Kind abbrevid=4 op0=11/> blob data = 'FullComment'
				// CHECK-NEXT: <CommentBlock NumWords=13 BlockCodeSize=4>
				// CHECK-NEXT: <Kind abbrevid=4 op0=16/> blob data = 'ParagraphComment'
				// CHECK-NEXT: <CommentBlock NumWords=5 BlockCodeSize=4>
				// CHECK-NEXT: <Kind abbrevid=4 op0=11/> blob data = 'TextComment'
				// CHECK-NEXT: </CommentBlock>
				// CHECK-NEXT: </CommentBlock>
				// CHECK-NEXT: <CommentBlock NumWords=31 BlockCodeSize=4>
				// CHECK-NEXT: <Kind abbrevid=4 op0=19/> blob data = 'BlockCommandComment'
				// CHECK-NEXT: <Name abbrevid=6 op0=5/> blob data = 'brief'
				// CHECK-NEXT: <CommentBlock NumWords=19 BlockCodeSize=4>
				// CHECK-NEXT: <Kind abbrevid=4 op0=16/> blob data = 'ParagraphComment'
				// CHECK-NEXT: <CommentBlock NumWords=11 BlockCodeSize=4>
				// CHECK-NEXT: <Kind abbrevid=4 op0=11/> blob data = 'TextComment'
				// CHECK-NEXT: <Text abbrevid=5 op0=19/> blob data = ' Brief description.'
				// CHECK-NEXT: </CommentBlock>
				// CHECK-NEXT: </CommentBlock>
				// CHECK-NEXT: </CommentBlock>
				// CHECK-NEXT: <CommentBlock NumWords=37 BlockCodeSize=4>
				// CHECK-NEXT: <Kind abbrevid=4 op0=16/> blob data = 'ParagraphComment'
				// CHECK-NEXT: <CommentBlock NumWords=13 BlockCodeSize=4>
				// CHECK-NEXT: <Kind abbrevid=4 op0=11/> blob data = 'TextComment'
				// CHECK-NEXT: <Text abbrevid=5 op0=26/> blob data = ' Extended description that'
				// CHECK-NEXT: </CommentBlock>
				// CHECK-NEXT: <CommentBlock NumWords=14 BlockCodeSize=4>
				// CHECK-NEXT: <Kind abbrevid=4 op0=11/> blob data = 'TextComment'
				// CHECK-NEXT: <Text abbrevid=5 op0=30/> blob data = ' continues onto the next line.'
				// CHECK-NEXT: </CommentBlock>
				// CHECK-NEXT: </CommentBlock>
				// CHECK-NEXT: <CommentBlock NumWords=83 BlockCodeSize=4>
				// CHECK-NEXT: <Kind abbrevid=4 op0=16/> blob data = 'ParagraphComment'
				// CHECK-NEXT: <CommentBlock NumWords=5 BlockCodeSize=4>
				// CHECK-NEXT: <Kind abbrevid=4 op0=11/> blob data = 'TextComment'
				// CHECK-NEXT: </CommentBlock>
				// CHECK-NEXT: <CommentBlock NumWords=9 BlockCodeSize=4>
				// CHECK-NEXT: <Kind abbrevid=4 op0=19/> blob data = 'HTMLStartTagComment'
				// CHECK-NEXT: <Name abbrevid=6 op0=2/> blob data = 'ul'
				// CHECK-NEXT: </CommentBlock>
				// CHECK-NEXT: <CommentBlock NumWords=10 BlockCodeSize=4>
				// CHECK-NEXT: <Kind abbrevid=4 op0=11/> blob data = 'TextComment'
				// CHECK-NEXT: <Text abbrevid=5 op0=14/> blob data = ' class="test">'
				// CHECK-NEXT: </CommentBlock>
				// CHECK-NEXT: <CommentBlock NumWords=5 BlockCodeSize=4>
				// CHECK-NEXT: <Kind abbrevid=4 op0=11/> blob data = 'TextComment'
				// CHECK-NEXT: </CommentBlock>
				// CHECK-NEXT: <CommentBlock NumWords=9 BlockCodeSize=4>
				// CHECK-NEXT: <Kind abbrevid=4 op0=19/> blob data = 'HTMLStartTagComment'
				// CHECK-NEXT: <Name abbrevid=6 op0=2/> blob data = 'li'
				// CHECK-NEXT: </CommentBlock>
				// CHECK-NEXT: <CommentBlock NumWords=9 BlockCodeSize=4>
				// CHECK-NEXT: <Kind abbrevid=4 op0=11/> blob data = 'TextComment'
				// CHECK-NEXT: <Text abbrevid=5 op0=9/> blob data = ' Testing.'
				// CHECK-NEXT: </CommentBlock>
				// CHECK-NEXT: <CommentBlock NumWords=5 BlockCodeSize=4>
				// CHECK-NEXT: <Kind abbrevid=4 op0=11/> blob data = 'TextComment'
				// CHECK-NEXT: </CommentBlock>
				// CHECK-NEXT: <CommentBlock NumWords=9 BlockCodeSize=4>
				// CHECK-NEXT: <Kind abbrevid=4 op0=17/> blob data = 'HTMLEndTagComment'
				// CHECK-NEXT: <Name abbrevid=6 op0=2/> blob data = 'ul'
				// CHECK-NEXT: <SelfClosing abbrevid=10 op0=1/>
				// CHECK-NEXT: </CommentBlock>
				// CHECK-NEXT: </CommentBlock>
				// CHECK-NEXT: <CommentBlock NumWords=13 BlockCodeSize=4>
				// CHECK-NEXT: <Kind abbrevid=4 op0=16/> blob data = 'ParagraphComment'
				// CHECK-NEXT: <CommentBlock NumWords=5 BlockCodeSize=4>
				// CHECK-NEXT: <Kind abbrevid=4 op0=11/> blob data = 'TextComment'
				// CHECK-NEXT: </CommentBlock>
				// CHECK-NEXT: </CommentBlock>
				// CHECK-NEXT: <CommentBlock NumWords=32 BlockCodeSize=4>
				// CHECK-NEXT: <Kind abbrevid=4 op0=20/> blob data = 'VerbatimBlockComment'
				// CHECK-NEXT: <Name abbrevid=6 op0=8/> blob data = 'verbatim'
				// CHECK-NEXT: <CloseName abbrevid=9 op0=11/> blob data = 'endverbatim'
				// CHECK-NEXT: <CommentBlock NumWords=16 BlockCodeSize=4>
				// CHECK-NEXT: <Kind abbrevid=4 op0=24/> blob data = 'VerbatimBlockLineComment'
				// CHECK-NEXT: <Text abbrevid=5 op0=27/> blob data = ' The description continues.'
				// CHECK-NEXT: </CommentBlock>
				// CHECK-NEXT: </CommentBlock>
				// CHECK-NEXT: <CommentBlock NumWords=13 BlockCodeSize=4>
				// CHECK-NEXT: <Kind abbrevid=4 op0=16/> blob data = 'ParagraphComment'
				// CHECK-NEXT: <CommentBlock NumWords=5 BlockCodeSize=4>
				// CHECK-NEXT: <Kind abbrevid=4 op0=11/> blob data = 'TextComment'
				// CHECK-NEXT: </CommentBlock>
				// CHECK-NEXT: </CommentBlock>
				// CHECK-NEXT: <CommentBlock NumWords=39 BlockCodeSize=4>
				// CHECK-NEXT: <Kind abbrevid=4 op0=19/> blob data = 'ParamCommandComment'
				// CHECK-NEXT: <Direction abbrevid=7 op0=5/> blob data = '[out]'
				// CHECK-NEXT: <ParamName abbrevid=8 op0=1/> blob data = 'I'
				// CHECK-NEXT: <Explicit abbrevid=11 op0=1/>
				// CHECK-NEXT: <CommentBlock NumWords=25 BlockCodeSize=4>
				// CHECK-NEXT: <Kind abbrevid=4 op0=16/> blob data = 'ParagraphComment'
				// CHECK-NEXT: <CommentBlock NumWords=10 BlockCodeSize=4>
				// CHECK-NEXT: <Kind abbrevid=4 op0=11/> blob data = 'TextComment'
				// CHECK-NEXT: <Text abbrevid=5 op0=16/> blob data = ' is a parameter.'
				// CHECK-NEXT: </CommentBlock>
				// CHECK-NEXT: <CommentBlock NumWords=5 BlockCodeSize=4>
				// CHECK-NEXT: <Kind abbrevid=4 op0=11/> blob data = 'TextComment'
				// CHECK-NEXT: </CommentBlock>
				// CHECK-NEXT: </CommentBlock>
				// CHECK-NEXT: </CommentBlock>
				// CHECK-NEXT: <CommentBlock NumWords=38 BlockCodeSize=4>
				// CHECK-NEXT: <Kind abbrevid=4 op0=19/> blob data = 'ParamCommandComment'
				// CHECK-NEXT: <Direction abbrevid=7 op0=4/> blob data = '[in]'
				// CHECK-NEXT: <ParamName abbrevid=8 op0=1/> blob data = 'J'
				// CHECK-NEXT: <CommentBlock NumWords=25 BlockCodeSize=4>
				// CHECK-NEXT: <Kind abbrevid=4 op0=16/> blob data = 'ParagraphComment'
				// CHECK-NEXT: <CommentBlock NumWords=10 BlockCodeSize=4>
				// CHECK-NEXT: <Kind abbrevid=4 op0=11/> blob data = 'TextComment'
				// CHECK-NEXT: <Text abbrevid=5 op0=16/> blob data = ' is a parameter.'
				// CHECK-NEXT: </CommentBlock>
				// CHECK-NEXT: <CommentBlock NumWords=5 BlockCodeSize=4>
				// CHECK-NEXT: <Kind abbrevid=4 op0=11/> blob data = 'TextComment'
				// CHECK-NEXT: </CommentBlock>
				// CHECK-NEXT: </CommentBlock>
				// CHECK-NEXT: </CommentBlock>
				// CHECK-NEXT: <CommentBlock NumWords=27 BlockCodeSize=4>
				// CHECK-NEXT: <Kind abbrevid=4 op0=19/> blob data = 'BlockCommandComment'
				// CHECK-NEXT: <Name abbrevid=6 op0=6/> blob data = 'return'
				// CHECK-NEXT: <CommentBlock NumWords=15 BlockCodeSize=4>
				// CHECK-NEXT: <Kind abbrevid=4 op0=16/> blob data = 'ParagraphComment'
				// CHECK-NEXT: <CommentBlock NumWords=7 BlockCodeSize=4>
				// CHECK-NEXT: <Kind abbrevid=4 op0=11/> blob data = 'TextComment'
				// CHECK-NEXT: <Text abbrevid=5 op0=4/> blob data = ' int'
				// CHECK-NEXT: </CommentBlock>
				// CHECK-NEXT: </CommentBlock>
				// CHECK-NEXT: </CommentBlock>
				// CHECK-NEXT: </CommentBlock>
				// CHECK-NEXT: <Location abbrevid=8 op0=24 op1={{[0-9]}}/> blob data = '{{.}}'
				// CHECK-NEXT: <TypeBlock NumWords=4 BlockCodeSize=4>
				// CHECK-NEXT: <Type abbrevid=4 op0=4 op1=3/> blob data = 'int'
				// CHECK-NEXT: </TypeBlock>
				// CHECK-NEXT: <FieldTypeBlock NumWords=6 BlockCodeSize=4>
				// CHECK-NEXT: <Type abbrevid=4 op0=4 op1=3/> blob data = 'int'
				// CHECK-NEXT: <Name abbrevid=5 op0=1/> blob data = 'I'
				// CHECK-NEXT: </FieldTypeBlock>
				// CHECK-NEXT: <FieldTypeBlock NumWords=6 BlockCodeSize=4>
				// CHECK-NEXT: <Type abbrevid=4 op0=4 op1=3/> blob data = 'int'
				// CHECK-NEXT: <Name abbrevid=5 op0=1/> blob data = 'J'
				// CHECK-NEXT: </FieldTypeBlock>
				// CHECK-NEXT: </FunctionBlock>

clang-tools-extra/trunk/test/clang-doc/mapper-enum.cpp

				// RUN: rm -rf %t
				// RUN: mkdir %t
				// RUN: echo "" > %t/compile_flags.txt
				// RUN: cp "%s" "%t/test.cpp"
				// RUN: clang-doc --dump-mapper -doxygen -p %t %t/test.cpp -output=%t/docs
				// RUN: llvm-bcanalyzer %t/docs/bc/FC07BD34D5E77782C263FA944447929EA8753740.bc --dump | FileCheck %s --check-prefix CHECK-B
				// RUN: llvm-bcanalyzer %t/docs/bc/020E6C32A700C3170C009FCCD41671EDDBEAF575.bc --dump | FileCheck %s --check-prefix CHECK-C

				enum B { X, Y };

				// CHECK-B: <BLOCKINFO_BLOCK/>
				// CHECK-B-NEXT: <VersionBlock NumWords=1 BlockCodeSize=4>
				// CHECK-B-NEXT: <Version abbrevid=4 op0=1/>
				// CHECK-B-NEXT: </VersionBlock>
				// CHECK-B-NEXT: <EnumBlock NumWords={{[0-9]*}} BlockCodeSize=4>
				// CHECK-B-NEXT: <USR abbrevid=4 op0=20 op1=252 op2=7 op3=189 op4=52 op5=213 op6=231 op7=119 op8=130 op9=194 op10=99 op11=250 op12=148 op13=68 op14=71 op15=146 op16=158 op17=168 op18=117 op19=55 op20=64/>
				// CHECK-B-NEXT: <Name abbrevid=5 op0=1/> blob data = 'B'
				// CHECK-B-NEXT: <DefLocation abbrevid=7 op0=9 op1={{[0-9]}}/> blob data = '{{.}}'
				// CHECK-B-NEXT: <Member abbrevid=9 op0=1/> blob data = 'X'
				// CHECK-B-NEXT: <Member abbrevid=9 op0=1/> blob data = 'Y'
				// CHECK-B-NEXT: </EnumBlock>

				enum class C { A, B };

				// CHECK-C: <BLOCKINFO_BLOCK/>
				// CHECK-C-NEXT: <VersionBlock NumWords=1 BlockCodeSize=4>
				// CHECK-C-NEXT: <Version abbrevid=4 op0=1/>
				// CHECK-C-NEXT: </VersionBlock>
				// CHECK-C-NEXT: <EnumBlock NumWords={{[0-9]*}} BlockCodeSize=4>
				// CHECK-C-NEXT: <USR abbrevid=4 op0=20 op1=2 op2=14 op3=108 op4=50 op5=167 op6=0 op7=195 op8=23 op9=12 op10=0 op11=159 op12=204 op13=212 op14=22 op15=113 op16=237 op17=219 op18=234 op19=245 op20=117/>
				// CHECK-C-NEXT: <Name abbrevid=5 op0=1/> blob data = 'C'
				// CHECK-C-NEXT: <DefLocation abbrevid=7 op0=23 op1={{[0-9]}}/> blob data = '{{.}}'
				// CHECK-C-NEXT: <Scoped abbrevid=10 op0=1/>
				// CHECK-C-NEXT: <Member abbrevid=9 op0=1/> blob data = 'A'
				// CHECK-C-NEXT: <Member abbrevid=9 op0=1/> blob data = 'B'
				// CHECK-C-NEXT: </EnumBlock>
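
Note on reading the USR records in these tests: in every test shown here, the 40-character hex stem of the `.bc` file name matches the 20 byte operands (`op1`..`op20`) of the `USR` record, with `op0=20` giving the byte count. A minimal sketch of that correspondence follows, assuming only what the expected output shows. `usrBytesFromHex` is a hypothetical helper written for illustration and is not part of the patch.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not from the patch): parse a 40-character hex file
// stem into the 20 raw bytes that the tests expect as op1..op20 of <USR>.
// Returns false if the input is not exactly 40 hex digits.
bool usrBytesFromHex(const std::string &Hex,
                     std::array<std::uint8_t, 20> &Out) {
  if (Hex.size() != 2 * Out.size())
    return false;
  // Map one hex digit to its value, or -1 if it is not a hex digit.
  auto Nibble = [](char C) -> int {
    if (C >= '0' && C <= '9') return C - '0';
    if (C >= 'A' && C <= 'F') return C - 'A' + 10;
    if (C >= 'a' && C <= 'f') return C - 'a' + 10;
    return -1;
  };
  for (std::size_t I = 0; I < Out.size(); ++I) {
    const int Hi = Nibble(Hex[2 * I]);
    const int Lo = Nibble(Hex[2 * I + 1]);
    if (Hi < 0 || Lo < 0)
      return false;
    Out[I] = static_cast<std::uint8_t>(Hi * 16 + Lo);
  }
  return true;
}
```

For `enum B`, the stem `FC07BD34D5E77782C263FA944447929EA8753740` starts with `FC 07 BD`, which is 252, 7, 189 in decimal, the same values as `op1=252 op2=7 op3=189` in the `CHECK-B` line.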

clang-tools-extra/trunk/test/clang-doc/mapper-function.cpp

				// RUN: rm -rf %t
				// RUN: mkdir %t
				// RUN: echo "" > %t/compile_flags.txt
				// RUN: cp "%s" "%t/test.cpp"
				// RUN: clang-doc --dump-mapper -doxygen -p %t %t/test.cpp -output=%t/docs
				// RUN: llvm-bcanalyzer %t/docs/bc/A44B32CC3C087C9AF75DAF50DE193E85E7B2C16B.bc --dump | FileCheck %s

				int F(int param) { return param; }

				// CHECK: <BLOCKINFO_BLOCK/>
				// CHECK-NEXT: <VersionBlock NumWords=1 BlockCodeSize=4>
				// CHECK-NEXT: <Version abbrevid=4 op0=1/>
				// CHECK-NEXT: </VersionBlock>
				// CHECK-NEXT: <FunctionBlock NumWords={{[0-9]*}} BlockCodeSize=4>
				// CHECK-NEXT: <USR abbrevid=4 op0=20 op1=164 op2=75 op3=50 op4=204 op5=60 op6=8 op7=124 op8=154 op9=247 op10=93 op11=175 op12=80 op13=222 op14=25 op15=62 op16=133 op17=231 op18=178 op19=193 op20=107/>
				// CHECK-NEXT: <Name abbrevid=5 op0=1/> blob data = 'F'
				// CHECK-NEXT: <DefLocation abbrevid=7 op0=8 op1={{[0-9]}}/> blob data = '{{.}}'
				// CHECK-NEXT: <TypeBlock NumWords=4 BlockCodeSize=4>
				// CHECK-NEXT: <Type abbrevid=4 op0=4 op1=3/> blob data = 'int'
				// CHECK-NEXT: </TypeBlock>
				// CHECK-NEXT: <FieldTypeBlock NumWords=7 BlockCodeSize=4>
				// CHECK-NEXT: <Type abbrevid=4 op0=4 op1=3/> blob data = 'int'
				// CHECK-NEXT: <Name abbrevid=5 op0=5/> blob data = 'param'
				// CHECK-NEXT: </FieldTypeBlock>
				// CHECK-NEXT: </FunctionBlock>

clang-tools-extra/trunk/test/clang-doc/mapper-method.cpp

				// RUN: rm -rf %t
				// RUN: mkdir %t
				// RUN: echo "" > %t/compile_flags.txt
				// RUN: cp "%s" "%t/test.cpp"
				// RUN: clang-doc --dump-mapper -doxygen -p %t %t/test.cpp -output=%t/docs
				// RUN: llvm-bcanalyzer %t/docs/bc/F0F9FC65FC90F54F690144A7AFB15DFC3D69B6E6.bc --dump | FileCheck %s --check-prefix CHECK-G-F
				// RUN: llvm-bcanalyzer %t/docs/bc/4202E8BF0ECB12AE354C8499C52725B0EE30AED5.bc --dump | FileCheck %s --check-prefix CHECK-G

				class G {
				public:
				int Method(int param) { return param; }
				};

				// CHECK-G: <BLOCKINFO_BLOCK/>
				// CHECK-G-NEXT: <VersionBlock NumWords=1 BlockCodeSize=4>
				// CHECK-G-NEXT: <Version abbrevid=4 op0=1/>
				// CHECK-G-NEXT: </VersionBlock>
				// CHECK-G-NEXT: <RecordBlock NumWords={{[0-9]*}} BlockCodeSize=4>
				// CHECK-G-NEXT: <USR abbrevid=4 op0=20 op1=66 op2=2 op3=232 op4=191 op5=14 op6=203 op7=18 op8=174 op9=53 op10=76 op11=132 op12=153 op13=197 op14=39 op15=37 op16=176 op17=238 op18=48 op19=174 op20=213/>
				// CHECK-G-NEXT: <Name abbrevid=5 op0=1/> blob data = 'G'
				// CHECK-G-NEXT: <DefLocation abbrevid=7 op0=9 op1={{[0-9]}}/> blob data = '{{.}}'
				// CHECK-G-NEXT: <TagType abbrevid=9 op0=3/>
				// CHECK-G-NEXT: </RecordBlock>

				// CHECK-G-F: <BLOCKINFO_BLOCK/>
				// CHECK-G-F-NEXT: <VersionBlock NumWords=1 BlockCodeSize=4>
				// CHECK-G-F-NEXT: <Version abbrevid=4 op0=1/>
				// CHECK-G-F-NEXT: </VersionBlock>
				// CHECK-G-F-NEXT: <FunctionBlock NumWords={{[0-9]*}} BlockCodeSize=4>
				// CHECK-G-F-NEXT: <USR abbrevid=4 op0=20 op1=240 op2=249 op3=252 op4=101 op5=252 op6=144 op7=245 op8=79 op9=105 op10=1 op11=68 op12=167 op13=175 op14=177 op15=93 op16=252 op17=61 op18=105 op19=182 op20=230/>
				// CHECK-G-F-NEXT: <Name abbrevid=5 op0=6/> blob data = 'Method'
				// CHECK-G-F-NEXT: <Namespace abbrevid=6 op0=1 op1=40/> blob data = '4202E8BF0ECB12AE354C8499C52725B0EE30AED5'
				// CHECK-G-F-NEXT: <IsMethod abbrevid=11 op0=1/>
				// CHECK-G-F-NEXT: <DefLocation abbrevid=7 op0=11 op1={{[0-9]}}/> blob data = '{{.}}'
				// CHECK-G-F-NEXT: <Parent abbrevid=9 op0=1 op1=40/> blob data = '4202E8BF0ECB12AE354C8499C52725B0EE30AED5'
				// CHECK-G-F-NEXT: <TypeBlock NumWords=4 BlockCodeSize=4>
				// CHECK-G-F-NEXT: <Type abbrevid=4 op0=4 op1=3/> blob data = 'int'
				// CHECK-G-F-NEXT: </TypeBlock>
				// CHECK-G-F-NEXT: <FieldTypeBlock NumWords=7 BlockCodeSize=4>
				// CHECK-G-F-NEXT: <Type abbrevid=4 op0=4 op1=3/> blob data = 'int'
				// CHECK-G-F-NEXT: <Name abbrevid=5 op0=5/> blob data = 'param'
				// CHECK-G-F-NEXT: </FieldTypeBlock>
				// CHECK-G-F-NEXT: </FunctionBlock>
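
Note on the `Namespace` and `Parent` records above: their blob is a 40-character uppercase hex string, and in this test it equals the hex form of the `USR` bytes expected for class `G` under `CHECK-G`. The sketch below shows that encoding under this assumption only. `usrHexFromBytes` is a hypothetical name chosen for illustration, not a function from the patch.

```cpp
#include <array>
#include <cstdint>
#include <string>

// Hypothetical helper (not from the patch): render 20 USR bytes as the
// 40-character uppercase hex string seen in the Namespace/Parent blobs
// and in the .bc file names.
std::string usrHexFromBytes(const std::array<std::uint8_t, 20> &Bytes) {
  static const char Digits[] = "0123456789ABCDEF";
  std::string Hex;
  Hex.reserve(Bytes.size() * 2);
  for (std::uint8_t B : Bytes) {
    Hex.push_back(Digits[B >> 4]);   // high nibble first
    Hex.push_back(Digits[B & 0x0F]); // then low nibble
  }
  return Hex;
}
```

Feeding in the bytes from the `CHECK-G` `USR` line (66, 2, 232, 191, ...) yields the string that the `Parent` record of `Method` is expected to carry, which is how the method record refers back to its enclosing class.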

clang-tools-extra/trunk/test/clang-doc/mapper-namespace.cpp

				// RUN: rm -rf %t
				// RUN: mkdir %t
				// RUN: echo "" > %t/compile_flags.txt
				// RUN: cp "%s" "%t/test.cpp"
				// RUN: clang-doc --dump-mapper -doxygen -p %t %t/test.cpp -output=%t/docs
				// RUN: llvm-bcanalyzer %t/docs/bc/8D042EFFC98B373450BC6B5B90A330C25A150E9C.bc --dump | FileCheck %s

				namespace A {}

				// CHECK: <BLOCKINFO_BLOCK/>
				// CHECK-NEXT: <VersionBlock NumWords=1 BlockCodeSize=4>
				// CHECK-NEXT: <Version abbrevid=4 op0=1/>
				// CHECK-NEXT: </VersionBlock>
				// CHECK-NEXT: <NamespaceBlock NumWords={{[0-9]*}} BlockCodeSize=4>
				// CHECK-NEXT: <USR abbrevid=4 op0=20 op1=141 op2=4 op3=46 op4=255 op5=201 op6=139 op7=55 op8=52 op9=80 op10=188 op11=107 op12=91 op13=144 op14=163 op15=48 op16=194 op17=90 op18=21 op19=14 op20=156/>
				// CHECK-NEXT: <Name abbrevid=5 op0=1/> blob data = 'A'
				// CHECK-NEXT: </NamespaceBlock>

clang-tools-extra/trunk/test/clang-doc/mapper-struct.cpp

				// RUN: rm -rf %t
				// RUN: mkdir %t
				// RUN: echo "" > %t/compile_flags.txt
				// RUN: cp "%s" "%t/test.cpp"
				// RUN: clang-doc --dump-mapper -doxygen -p %t %t/test.cpp -output=%t/docs
				// RUN: llvm-bcanalyzer %t/docs/bc/06B5F6A19BA9F6A832E127C9968282B94619B210.bc --dump | FileCheck %s

				struct C { int i; };

				// CHECK: <BLOCKINFO_BLOCK/>
				// CHECK-NEXT: <VersionBlock NumWords=1 BlockCodeSize=4>
				// CHECK-NEXT: <Version abbrevid=4 op0=1/>
				// CHECK-NEXT: </VersionBlock>
				// CHECK-NEXT: <RecordBlock NumWords={{[0-9]*}} BlockCodeSize=4>
				// CHECK-NEXT: <USR abbrevid=4 op0=20 op1=6 op2=181 op3=246 op4=161 op5=155 op6=169 op7=246 op8=168 op9=50 op10=225 op11=39 op12=201 op13=150 op14=130 op15=130 op16=185 op17=70 op18=25 op19=178 op20=16/>
				// CHECK-NEXT: <Name abbrevid=5 op0=1/> blob data = 'C'
				// CHECK-NEXT: <DefLocation abbrevid=7 op0=8 op1={{[0-9]}}/> blob data = '{{.}}'
				// CHECK-NEXT: <MemberTypeBlock NumWords=6 BlockCodeSize=4>
				// CHECK-NEXT: <Type abbrevid=4 op0=4 op1=3/> blob data = 'int'
				// CHECK-NEXT: <Name abbrevid=5 op0=4/> blob data = 'C::i'
				// CHECK-NEXT: <Access abbrevid=6 op0=3/>
				// CHECK-NEXT: </MemberTypeBlock>
				// CHECK-NEXT: </RecordBlock>

clang-tools-extra/trunk/test/clang-doc/mapper-union.cpp

				// RUN: rm -rf %t
				// RUN: mkdir %t
				// RUN: echo "" > %t/compile_flags.txt
				// RUN: cp "%s" "%t/test.cpp"
				// RUN: clang-doc --dump-mapper -doxygen -p %t %t/test.cpp -output=%t/docs
				// RUN: llvm-bcanalyzer %t/docs/bc/0B8A6B938B939B77C6325CCCC8AA3E938BF9E2E8.bc --dump | FileCheck %s

				union D { int X; int Y; };

				// CHECK: <BLOCKINFO_BLOCK/>
				// CHECK-NEXT: <VersionBlock NumWords=1 BlockCodeSize=4>
				// CHECK-NEXT: <Version abbrevid=4 op0=1/>
				// CHECK-NEXT: </VersionBlock>
				// CHECK-NEXT: <RecordBlock NumWords={{[0-9]*}} BlockCodeSize=4>
				// CHECK-NEXT: <USR abbrevid=4 op0=20 op1=11 op2=138 op3=107 op4=147 op5=139 op6=147 op7=155 op8=119 op9=198 op10=50 op11=92 op12=204 op13=200 op14=170 op15=62 op16=147 op17=139 op18=249 op19=226 op20=232/>
				// CHECK-NEXT: <Name abbrevid=5 op0=1/> blob data = 'D'
				// CHECK-NEXT: <DefLocation abbrevid=7 op0=8 op1={{[0-9]}}/> blob data = '{{.}}'
				// CHECK-NEXT: <TagType abbrevid=9 op0=2/>
				// CHECK-NEXT: <MemberTypeBlock NumWords=6 BlockCodeSize=4>
				// CHECK-NEXT: <Type abbrevid=4 op0=4 op1=3/> blob data = 'int'
				// CHECK-NEXT: <Name abbrevid=5 op0=4/> blob data = 'D::X'
				// CHECK-NEXT: <Access abbrevid=6 op0=3/>
				// CHECK-NEXT: </MemberTypeBlock>
				// CHECK-NEXT: <MemberTypeBlock NumWords=6 BlockCodeSize=4>
				// CHECK-NEXT: <Type abbrevid=4 op0=4 op1=3/> blob data = 'int'
				// CHECK-NEXT: <Name abbrevid=5 op0=4/> blob data = 'D::Y'
				// CHECK-NEXT: <Access abbrevid=6 op0=3/>
				// CHECK-NEXT: </MemberTypeBlock>
				// CHECK-NEXT: </RecordBlock>

Revision Contents

Diff 137689
