This is an archive of the discontinued LLVM Phabricator instance.

Refactor bitcode reader to simplify control.
Abandoned (Public)

Authored by kschimpf on Apr 1 2015, 2:32 PM.

Details

Reviewers

dschuff
filcab
• rafael
jvoung

Summary

Modifies the bitcode reader such that the same logic is used for
both memory buffers and data streams. The incremental parsing
was factored into startParse, continueParse, and finishParse.
All parses (incremental or non-incremental) begin with startParse.
Then zero (or more) calls to continueParse incrementally read more
input, picking up from where the last call left off. finishParse
materializes any additional parts, based on the flags passed to startParse.
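
As a rough illustration of the control flow described in this summary (not the patch's code), here is a minimal C++ sketch. ToyReader, parseAll, and the string "blocks" are hypothetical stand-ins; only the names startParse, continueParse, and finishParse come from the summary.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the reader. It models input as a list of named
// blocks instead of real bitcode.
class ToyReader {
public:
  explicit ToyReader(std::vector<std::string> Blocks)
      : Blocks(std::move(Blocks)) {}

  // Every parse begins here; the flag is remembered for finishParse().
  void startParse(bool ShouldMaterializeAll) {
    MaterializeAll = ShouldMaterializeAll;
    Next = 0;
    Started = true;
  }

  // Reads one more block, resuming where the previous call stopped.
  // Returns false once there is no more input.
  bool continueParse() {
    assert(Started && "startParse must be called first");
    if (Next == Blocks.size())
      return false;
    Parsed.push_back(Blocks[Next++]);
    return true;
  }

  // Materializes whatever the flag passed to startParse() asked for.
  void finishParse() {
    if (MaterializeAll)
      Materialized = Parsed;
  }

  std::size_t numParsed() const { return Parsed.size(); }
  std::size_t numMaterialized() const { return Materialized.size(); }

private:
  std::vector<std::string> Blocks;
  std::vector<std::string> Parsed, Materialized;
  std::size_t Next = 0;
  bool MaterializeAll = false;
  bool Started = false;
};

// A non-incremental parse is the incremental API driven to completion.
inline void parseAll(ToyReader &R, bool ShouldMaterializeAll) {
  R.startParse(ShouldMaterializeAll);
  while (R.continueParse()) {
  }
  R.finishParse();
}
```

A streaming caller would instead call continueParse only as often as it needs and could return to its own caller between calls.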

Event Timeline

kschimpf updated this revision to Diff 23088 on Apr 1 2015, 2:32 PM.

kschimpf retitled this revision to "Refactor bitcode reader to simplify control."

kschimpf updated this object.

kschimpf edited the test plan for this revision.

kschimpf added reviewers: dschuff, jvoung, • rafael.

kschimpf added a subscriber: Unknown Object (MLST).

A couple nits for now, but still trying to read through and understand what's happening in the streaming and lazy cases...

lib/Bitcode/Reader/BitcodeReader.cpp
285–286	http://llvm.org/docs/CodingStandards.html#doxygen-use-in-documentation-comments says "\brief" instead of @brief
285–286	\returns
293	lowercase first letter of function name -- should probably do it for these new functions since you're touching it (but leave existing functions alone?)
404	lowercase first letter for new function (and I guess update the commit message if you do)
408	same
414	extra space in between updateParseState and (
416	In the review, I've tried to look for where ParseState gets set, and there are various ways to grep for that... one is ParseState = X... another is updateParseState(Y, ...), or updateParseState(Z); Is there any way you can make the number of variations smaller?
431	This is a bit weird to me... you have a bunch of function called "updateParseState" but some variants modify ParseState and some variants don't.
2795	NextUnreadBit is no longer set -- wanted to check if that is okay now (and why)?
2863	NOte -> Note
3300	I don't quite understand how "ShouldMaterializeAll = false" is supposed to work for the streaming case, if this isn't checked until after: while (ParseState < NoMoreInput) { if (std::error_code EC = ContinueParse()) { return EC; } } How do you delay reading until materialize(GV) for streaming?
3304	This used to be early, in "getLazyBitcodeModuleImpl". This looks like it is now happening late in FinishParse after the loop until NoMoreInput. Why is this okay now? Was the early call to "materializeForwardReferencedFunctions" actually extraneous because of the call in materialize(GV), or what? Make sure to check the lazy case with blockaddresses for computed gotos, if there isn't already a unittest for that.
4639	no need for extra space

jvoung added inline comments on Apr 1 2015, 4:49 PM.

lib/Bitcode/Reader/BitcodeReader.cpp
251	be consistent about capitalization in comments one line starts with "parsed input" and another line starts with "Parsed input" =)
267–268	Don't need explicit anymore, though that transition was a while back so not really related to this CL.
294	Variable name is different from comment "MaterializeAll" vs "ShouldMaterializeAll" -- make them the same? I see that in the actual definition you're trying to avoid conflicting with the field name...

kschimpf updated this object on Apr 6 2015, 10:04 AM.

kschimpf edited edge metadata.

Merge branch 'master' of http://llvm.org/git/llvm into readfac1
Working version. Save state.
Cleanup startParse.
Cleaned up code.
Fix nit.
Merge branch 'master' of http://llvm.org/git/llvm into readfac1

Did all changes except adding a test that forward block address references get resolved on lazy loads.

lib/Bitcode/Reader/BitcodeReader.cpp
251	Done.
267–268	Done.
285–286	Sorry, my fault. I followed the syntax of ConstantPlaceHolder below. Changing to follow coding standards.
285–286	Done.
293	Good point. Fixing.
294	Fixing the names to be consistent. Fixing name conflict by prefixing assignments of field names with "this->".
404	Done.
408	Done.
416	I guess the first part of the problem is that there are 2 notions of parse state: The field ParseState that names the state of the parser, and NextUnreadBit which defines where to continue the parse on return (in case a function body gets parsed between calls). I also overloaded the return value with this update. Refactoring to do less and be more clear.
431	The refactoring is a bit better now. Hopefully good enough.
2863	Done.
3300	After talking to Derek, I realized what I was missing. I'll summarize what I understand: When streaming, we want to "return" as soon as possible, without forcing all of the bitcode to be scanned. This reduces the cost of (potentially) blocking calls to the data streamer. Control can return to the caller before the parse has completed. However, the parsed portions must be consistent (i.e. forward block address references must have been resolved). Based on this, I've modified the code to lift the call to materializeForwardReferencedFunctions into startParse.
3304	I agree that there should be some type of forward reference unit test to verify that we can lazily evaluate these forward-referenced block addresses. I also agree that, for the use by llvm-dis, my original code worked because it eventually calls materializeAllPermanently. I think that is why I was confused about the full expectations of "streamed" (or lazy) parsing.
4639	Done.
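
To make the "two notions of parse state" from the reply on line 416 concrete, here is a minimal C++ sketch under stated assumptions. ToyCursor, ToyParser, the fixed 32-bit record size, and materializeAt are all hypothetical; only ParseState, NextUnreadBit, continueParse, and the idea of jumping to a saved bit come from the review. The point is that continueParse seeks to the saved position itself, so it does not depend on where other code left the cursor.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical cursor that only tracks a bit position.
struct ToyCursor {
  std::uint64_t Pos = 0;
  void jumpToBit(std::uint64_t Bit) { Pos = Bit; }
  void advance(std::uint64_t Bits) { Pos += Bits; }
};

class ToyParser {
public:
  enum State { AtStart, InsideModule, NoMoreInput };

  explicit ToyParser(std::uint64_t TotalBits) : TotalBits(TotalBits) {}

  // Parses one (assumed) 32-bit record. It always seeks to the saved position
  // first, so an interleaved materializeAt() cannot disturb the parse.
  State continueParse() {
    if (ParseState == NoMoreInput)
      return ParseState;
    Cursor.jumpToBit(NextUnreadBit);
    Cursor.advance(32);
    NextUnreadBit = Cursor.Pos;
    ParseState = NextUnreadBit >= TotalBits ? NoMoreInput : InsideModule;
    return ParseState;
  }

  // Stand-in for materializing a function body stored elsewhere in the file.
  void materializeAt(std::uint64_t Bit) {
    Cursor.jumpToBit(Bit);
    Cursor.advance(64);
  }

  std::uint64_t nextUnreadBit() const { return NextUnreadBit; }

private:
  ToyCursor Cursor;
  std::uint64_t TotalBits;
  std::uint64_t NextUnreadBit = 0; // where the next continueParse() resumes
  State ParseState = AtStart;      // named state of the parse
};
```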

• rafael added inline comments on Apr 8 2015, 5:27 PM.

include/llvm/Support/StreamingMemoryObject.h
75 ↗	(On Diff #23431)	Why do you need this? The streamer will return how many bytes were read and can handle a larger request. Also, why does it need to be part of this patch? It looks like this patch has many independent changes in it.
88 ↗	(On Diff #23431)	Why the extra logic? If objectsize is known it is the same as BytesRead, no?

kschimpf added inline comments on Apr 9 2015, 9:41 AM.

include/llvm/Support/StreamingMemoryObject.h
75 ↗	(On Diff #23431)	This was added to handle the case of parsing a wrapped bitcode file. In such cases, you do not need to do another read (which may block until it succeeds). That was the intent of this change. However, a simpler approach would be to allow the extra read, and then not set ObjectSize (below) if it is already set. I will remove this change, and add the conditional assignment to ObjectSize below. I changed it in this CL because it didn't cause a problem until I fixed the bug that materializing a module (when streaming) didn't actually read all of the bitcode file. When that fix was added, tests failed and this issue was exposed. I will remove the changes to StreamingMemoryObject.{h,cpp} and put them in a separate CL.
88 ↗	(On Diff #23431)	No, they aren't necessarily the same. The problem happens when you have a wrapped bitcode file, and was not exposed until I fixed the case that we weren't reading the entire bitcode when materializing lazily. Then, a bunch of test cases failed. When I looked into it, this is what I discovered: The wrapped bitcode was smaller than kChunkSize. Hence, the initial read set BytesRead to the size of the wrapped file on first read. The wrapper was then read, and set ObjectSize, which corresponded to 4 bytes smaller than BytesRead. This is the reason I changed this file as I did.
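
The wrapped-file scenario above can be sketched in C++ as follows. This is a simplified model, not the real StreamingMemoryObject: ToyStreamingObject, fetchChunk, the 4096-byte kChunkSize, and the "0 means unknown" convention are assumptions made for the example. It shows the conditional assignment mentioned in the reply: an end-of-stream read must not overwrite an ObjectSize that the wrapper header already supplied.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Assumed chunk size for the sketch; not taken from the patch.
constexpr std::size_t kChunkSize = 4096;

// Hypothetical model of a streamed object.
class ToyStreamingObject {
public:
  explicit ToyStreamingObject(std::size_t FileSize) : FileSize(FileSize) {}

  // Called when a wrapper header reveals where the bitcode really ends.
  void setKnownObjectSize(std::size_t Size) { ObjectSize = Size; }

  // Requests one more chunk. An empty read signals end of stream.
  void fetchChunk() {
    std::size_t Got = std::min(kChunkSize, FileSize - BytesRead);
    BytesRead += Got;
    if (Got == 0 && ObjectSize == 0)
      ObjectSize = BytesRead; // conditional: keep a size the wrapper set
  }

  std::size_t objectSize() const { return ObjectSize; }
  std::size_t bytesRead() const { return BytesRead; }

private:
  std::size_t FileSize;
  std::size_t BytesRead = 0;
  std::size_t ObjectSize = 0; // 0 means "not known yet"
};
```

With a 100-byte wrapped file, the first fetch reads all 100 bytes; if the wrapper then reports a 96-byte object, a later empty read leaves the object size at 96 while BytesRead stays at 100.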

jvoung added inline comments on Apr 9 2015, 2:33 PM.

lib/Bitcode/Reader/BitcodeReader.cpp
259	nit: This is usually 0 or 1, but it seems unexpected for this field to be named "NumModulesParsed", and yet have the type be "bool". Rename or change type?
431	Thanks -- this is better. For a while I was also wondering how many places need to be aware of setting the state to ParseError, but I think it's just continueParse() because most/all searching for bit position, etc. goes through that.
3178	Does this need to be cleanupOnError(EC) also?
3216	This covers two states? InsideModule and AtTopLevel? It might be more clear if you list them out so it's clear what the "break" corresponds to (AtTopLevel). Previously, the Stream.JumpToBit(NextUnreadBit); was only needed when InsideModule... is it now needed for AtTopLevel too?
3315	"NoMorInput" -> "NoMoreInput" It could also be that more states >= NoMoreInput are added as code evolves, but not handled here. Can the compiler accept/handle a "static_assert(ParseState < NoMoreInput, "...") to catch what happens if more states are added after NoMoreInput but not handled by this switch?
4687	Is this necessary at this point? Should that already be covered by the " // Iterate over the module, deserializing any functions that are still on disk" loop?
4687–4688	The "promise" comment from "above" is removed now, so you could update this comment.

Fix issues raised by jvoung, and remove changes now in D8907.

kschimpf added a parent revision on Apr 9 2015, 3:37 PM: D8931: Add test showing error in StreamingMemoryObject.setKnownObjectSize().

kschimpf added inline comments.

lib/Bitcode/Reader/BitcodeReader.cpp
259	Good catch. I did mean to use size_t. Fixing.
431	That is correct and was the intent. State updates (and bit positioning) are now intentionally localized to continueParse. The only exception is in ParseModule, which updates the state to indicate whether it returned without completing.
3178	Yes. Good catch. Fixing.
3216	The jumpToBit is needed because various "materialize" methods may be called between calls to continueParse. By forcing a jumpToBit to happen at all calls to continueParse, we no longer need to know where the materialize methods leave the bitcursor. While I did not see an example of an error caused by interleaved calls to materialize, I was very suspicious that they could occur, and wanted to make sure that this would not happen. Hence, I made sure that continueParse always resets the position to where it left off. I will fix to not use default, so that a corresponding warning will be generated if a new value is added to the enumeration.
3315	Fixed string. Also removed "default" case and made all states explicit. This will force a warning if a new state is added.
4674	Removing the comment about being after a function body. This is no longer true. A call to materializeMetadata would put us some place else in the bitcode file.
4687	In correct bitcode files, you are right. However, if the function doesn't define any function blocks, but (incorrectly) references function block addresses, this code will cause the error to be generated. However, looking at the following instruction, this is checked anyway. Removing.
4687–4688	Done.
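
The "no default case" approach from the replies on lines 3216 and 3315 can be sketched like this. The enumerator list is a hypothetical subset of the reader's states and canContinueParse is an invented helper; the technique itself is standard C++: when a switch over an enum lists every enumerator and has no default label, GCC and Clang can warn (-Wswitch) if an enumerator is added later without a matching case.

```cpp
#include <cassert>

// Hypothetical subset of the reader's states, for illustration only.
enum class ParseState {
  AtStart,
  AtTopLevel,
  InsideModule,
  ReachedEof,
  FinishedParse,
  ParseError
};

// True while more input may be parsed. There is deliberately no default
// label, so a newly added enumerator without a case here can be diagnosed.
inline bool canContinueParse(ParseState S) {
  switch (S) {
  case ParseState::AtStart:
  case ParseState::AtTopLevel:
  case ParseState::InsideModule:
    return true;
  case ParseState::ReachedEof:
  case ParseState::FinishedParse:
  case ParseState::ParseError:
    return false;
  }
  // Only reachable if S holds a value that is not a declared enumerator.
  assert(false && "invalid ParseState value");
  return false;
}
```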

Fix issues in diff 23431.

Merge branch 'master' into readfac1
Fix issues raised by merge.
Merge branch 'master' of http://llvm.org/git/llvm into readfac1
Fix tests to use old-style parser.

Fix nits.

Now that the issues with the streaming memory object have been fixed, I have updated this CL for review.

Note that I added a command-line flag "-old-lazy-bitcode-parser". This was done to deal with a bug fixed by this CL. That is, in the old code, when you materialized a module, it didn't check whether there was any additional data in the bitcode file. The new code fixes this by calling "finishParse". However, there are a couple of (bitcode binary) tests that were generated with this violation. Hence, the flag was added to keep those tests working.

I'm willing to remove this flag in either (1) a later review, or (2) in a later revision. However, for this review I made the issues explicit so that the problem can be seen.

In D8786#179791, @kschimpf wrote:

However, there are a couple of (bitcode binary) tests that were generated with this violation. Hence, the flag was added to fix this problem.

I'm willing to remove this flag in either (1) a later review, or (2) in a later revision. However, for this review I made the issues explicit so that the problem can be seen.

Let me know which ones have bugs. They might be easy-ish to reconstruct, especially since I now have some additional practice in fiddling with bc files. (Can't promise to deal with them very quickly, though.)

lib/Bitcode/Reader/BitcodeReader.cpp
300–301	Why the empty line?
312	http://llvm.org/docs/CodingStandards.html#doxygen-use-in-documentation-comments ^^ This changed recently. Omit \brief if the brief description is just a sentence. (You might want to add the '.' at the end, though)
789	Omit \brief.
803	Omit \brief.
3254	We're already at top level, no? (line 3234) I might be missing something, but it looks like we're at top level, and saw an EndBlock. Shouldn't this be an error?

kschimpf added filcab as a reviewer on May 28 2015, 1:54 PM.

Merge branch 'master' of http://llvm.org/git/llvm into readfac1
Fix issues raised by filcab.

Fixes based on feedback by Filipe.

lib/Bitcode/Reader/BitcodeReader.cpp
300–301	Removed.
789	Done.
803	Done.
3254	I agree that we should be at top level. I also agree that it appears weird that we allow extra (unmatched) EndBlocks. This has been allowed by the bitcode reader/writer for years. I just wasn't willing to make the leap that I should remove this. However, I tried removing it (and making it an error), and no tests failed. Hence, converting this to an error.
4679	Moved the iteration over functions to before the call to finishParse. This deals with the problems I was having with tests in test/Bitcode/invalid.test (i.e. Inputs/invalid-fwdref-type-mismatch-2.bc and Inputs/invalid-load-ptr-type.bc). These two files had multiple errors (the one they intended, which was inside a function body, and one probably not intended: extraneous stuff at the end of the bitcode file). This removes the need for the command line flag UseOldLazyBitcodeParser, and I have deleted it.

Hi Karl,

The files came straight from the fuzzer, so they are likely to have more than one error. If, in order to support them (where support means: keep the test working and diagnosing what we want), you have to change the code in a convoluted way, I would prefer to change the test.

If the change is minimal and not a problem (doesn't impact legibility or architecture), then keeping the tests as they are is not a problem either. I just want to avoid having worse code just so we don't have to re-do some tests.

Of course, if we started crashing on the tests, that's a problem :-)

Thanks,

Filipe

(Phab butchered the comment in my email. Editing it to get a complete history in Phabricator)

Fix invalid bitcode tests with more than one error.
Merge branch 'master' into readfac1

In addition to moving the code back in materializeModule, I fixed three tests. Two of them were fuzz tests where I "truncated" the file to the end of the module block. The other test did not have anything after a bad abbreviation definition, and the code file was incomplete. So I generated a replacement test that was well structured otherwise (i.e. only had that one error in it).

include/llvm/Bitcode/BitstreamReader.h
328 ↗	(On Diff #26923)	This code fixes the state of the bit streamer when no more input is found. As a result, method AtEndOfStream now works correctly.
lib/Bitcode/Reader/BitcodeReader.cpp
3253–3255	Discovered that the Bitstream::EndBlock was "hiding" a bug in the bitstream reader when processing a data stream. That is, when using a data stream, the size is not set until after the eof is reached. Hence, when Stream.AtEndOfStream() was called above, it would return false even when at the eof. The actual problem was in FillCurWord, which did not set the bit position correctly when there was no more input. The old code worked because the read (at eof) would return zero, which is understood as an end block. By returning success for this value, it would hide this problem. I also improved the error message so that one can see where the reader thought the eof should be, if there is miscellaneous stuff at the end of the bitcode file. This makes it easier to know where to cut a test file in such cases.
4680	Now that the eof checking is fixed, I moved this back where it was in an earlier version of this CL.
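
The end-of-stream problem described for lines 3253–3255 can be modelled with a small C++ sketch. ToyWordReader and countWords are hypothetical and much simpler than the real bitstream cursor; the assumption is only that a read past the end yields zero, which is also a legal data value (abbreviation ID 0 is END_BLOCK in the LLVM bitstream format). The reader therefore has to record that it hit the end, so callers can tell "zero because of eof" apart from "zero in the data".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical word reader over a stream whose size is unknown until a read
// comes back empty.
class ToyWordReader {
public:
  explicit ToyWordReader(std::vector<std::uint32_t> Words)
      : Words(std::move(Words)) {}

  // Returns the next word, or 0 when no input is left. On an empty read the
  // reader records that it is at the end instead of silently returning 0.
  std::uint32_t read() {
    if (Pos == Words.size()) {
      AtEof = true;
      return 0;
    }
    return Words[Pos++];
  }

  bool atEndOfStream() const { return AtEof; }

private:
  std::vector<std::uint32_t> Words;
  std::size_t Pos = 0;
  bool AtEof = false;
};

// Counts data words. A zero is treated as data unless the reader reports
// that the stream has ended.
inline std::size_t countWords(ToyWordReader &R) {
  std::size_t N = 0;
  for (;;) {
    R.read();
    if (R.atEndOfStream())
      return N;
    ++N;
  }
}
```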

Hi Karl,

Really sorry for the delay.
LGTM on my part, as long as you add the test for the error message and do the fix.

Thank you,

Filipe

lib/Bitcode/Reader/BitcodeReader.cpp
790	Nit: Put more words on the first line.
794	Nit: If it's for docs, it's probably best to start with an uppercase letter.
3258	Thank you!
3262	errs()? Or StrBuf? Please also add a test for this error message.

Merge branch 'master' of http://llvm.org/git/llvm into readfac1
Fixes associated with review by Filipe.

Fixed issues raised by Filipe.

lib/Bitcode/Reader/BitcodeReader.cpp
790	Done.
794	Done.
3262	Good catch. I meant StrBuf, so that we can use the same API for all errors.

Applying the patch locally to take a better look.

lib/Bitcode/Reader/BitcodeReader.cpp
286–299	Please commit the pure cleanup bits first: using \ instead of @ starting functions with lowercase names.
3263	This looks a bit much to be honest. Corrupted files are not that common and it is trivial to set a breakpoint to find the state.
3266	This is always 0 or 1. Use a boolean instead.

I got the following test failures locally:

LLVM :: Bitcode/invalid.test
LLVM :: tools/gold/invalid.ll

This CL has gotten a bit long and hard to read (too many versions). Moved to a new CL: http://reviews.llvm.org/D10518

lib/Bitcode/Reader/BitcodeReader.cpp
286–299	I assume this has already been done. The new CL doesn't have such cases anymore.
3263	Simplified in new CL to do same as before.
3266	This was already fixed in master.

Revision Contents

Path	Size
lib/Bitcode/Reader/BitcodeReader.cpp	340 lines
lib/Bitcode/Reader/BitstreamReader.cpp	1 line
test/Bitcode/invalid.test	6 lines

Commit	Tree	Parents	Author	Summary	Date
33fac955326a	f4eafba3ba14	ebb2942e3600	Karl Schimpf	Fix tests to use old-style parser.	May 27 2015, 3:28 PM
ebb2942e3600	fc2e92854f6a	b247e435f281 344593ce6c91	Karl Schimpf	Merge branch 'master' of http://llvm.org/git/llvm into readfac1	May 27 2015, 3:00 PM
b247e435f281	4107d4f57346	94061b821ca8	Karl Schimpf	Fix code due to merge, and report issue with invalid bitcode test.	May 27 2015, 2:59 PM
94061b821ca8	5421b155116b	d80585f197c5	Karl Schimpf	Save state.	May 27 2015, 1:49 PM
d80585f197c5	222a2e884a6b	a8ad769f4f7e	Karl Schimpf	Move check for TheModule being defined.	May 27 2015, 1:31 PM
a8ad769f4f7e	d37cfb3bee2b	ab35075a9118	Karl Schimpf	Fix issues raised by merge.	May 27 2015, 12:56 PM
ab35075a9118	75eb612a2483	8f22bd42f4dc 890a876e0e16	Karl Schimpf	Merge branch 'master' into readfac1	May 27 2015, 11:03 AM
8f22bd42f4dc	483c907c9f8a	cedc6147f3be	Karl Schimpf	Fix nit.	Apr 13 2015, 10:34 AM
cedc6147f3be	3685c77bf21c	437feddf04b5 332adac427ca	Karl Schimpf	Merge branch 'master' of http://llvm.org/git/llvm into readfac1	Apr 13 2015, 10:14 AM
437feddf04b5	49c35e14280e	8644fc8e6366	Karl Schimpf	Remove changes now in http://reviews.llvm.org/D8931.	Apr 9 2015, 3:28 PM
8644fc8e6366	0c564b770339	672b0e7b2d15	Karl Schimpf	Fix issues raised by jvoung.	Apr 9 2015, 3:26 PM
672b0e7b2d15	027579062840	19e24c9e77c0 6b5c9d5dd290	Karl Schimpf	Merge branch 'master' of http://llvm.org/git/llvm into readfac1	Apr 9 2015, 9:51 AM
19e24c9e77c0	b654d403b674	7a117b906196	Karl Schimpf	Fix StreamingMemoryObject based on Rafael's comments.	Apr 9 2015, 9:43 AM
7a117b906196	6c3cfdecb837	8eddbbc603dc cd13a3808a22	Karl Schimpf	Merge branch 'master' of http://llvm.org/git/llvm into readfac1	Apr 8 2015, 11:12 AM
8eddbbc603dc	15947275b1c6	fc1c6a8e568a	Karl Schimpf	Fix nit.	Apr 8 2015, 11:11 AM
fc1c6a8e568a	c8d19fa5f221	db835c00f8bb	Karl Schimpf	Cleaned up code.	Apr 8 2015, 11:07 AM
db835c00f8bb	9caa425c456f	45396af2fdfd	Karl Schimpf	Cleanup startParse.	Apr 8 2015, 9:47 AM
45396af2fdfd	ac2e8173502f	8daa676f74d2	Karl Schimpf	Working version. Save state.	Apr 8 2015, 8:51 AM
8daa676f74d2	2775d9b566c3	07dd5b215789 e17e7a2400df	Karl Schimpf	Merge branch 'master' of http://llvm.org/git/llvm into readfac1	Apr 6 2015, 9:17 AM
07dd5b215789	ae70c05d27df	012b38438e94	Karl Schimpf	Refactor bitcode reader to simplify control.	Apr 1 2015, 2:23 PM
012b38438e94	0bd1f195df6c	f279e19a390c a066ed09db37	Karl Schimpf	Merge branch 'master' of http://llvm.org/git/llvm into readfac1	Apr 1 2015, 1:48 PM
f279e19a390c	d8b4f0ff2a90	aa223716873a	Karl Schimpf	Finish code review.	Apr 1 2015, 12:57 PM
aa223716873a	cfae04b4b9fa	7016aa1cc1c8	Karl Schimpf	Clean up saving state.	Apr 1 2015, 12:12 PM
7016aa1cc1c8	e1f3d9ce6f28	da1b585479bf	Karl Schimpf	Make MaterializeModule not be recursive.	Apr 1 2015, 11:03 AM
da1b585479bf	da68d2183094	4b12b9cc327f	Karl Schimpf	Remove tracing code.	Mar 31 2015, 3:42 PM
4b12b9cc327f	a237c5cf18f7	5e41a7399c2a	Karl Schimpf	Working code for all tests in check.	Mar 31 2015, 3:33 PM
5e41a7399c2a	b3ff737b87af	0e8bafe3e45b	Karl Schimpf	Save state to test what is happening in master.	Mar 31 2015, 12:41 PM
0e8bafe3e45b	ef1f136464a0	9864b616c701	Karl Schimpf	Modify parsing of bitcode files to stop after first (safe) skipped block.	Mar 27 2015, 11:13 AM
9864b616c701	030c35e525f8	3f972ab1bf37	Karl Schimpf	Save start.	Mar 25 2015, 3:14 PM

Diff 26641

lib/Bitcode/Reader/BitcodeReader.cpp

(22 lines not shown)
#include "llvm/IR/GVMaterializer.h"		#include "llvm/IR/GVMaterializer.h"
#include "llvm/IR/InlineAsm.h"		#include "llvm/IR/InlineAsm.h"
#include "llvm/IR/IntrinsicInst.h"		#include "llvm/IR/IntrinsicInst.h"
#include "llvm/IR/LLVMContext.h"		#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"		#include "llvm/IR/Module.h"
#include "llvm/IR/OperandTraits.h"		#include "llvm/IR/OperandTraits.h"
#include "llvm/IR/Operator.h"		#include "llvm/IR/Operator.h"
#include "llvm/IR/ValueHandle.h"		#include "llvm/IR/ValueHandle.h"
		#include "llvm/Support/CommandLine.h"
#include "llvm/Support/DataStream.h"		#include "llvm/Support/DataStream.h"
#include "llvm/Support/ManagedStatic.h"		#include "llvm/Support/ManagedStatic.h"
#include "llvm/Support/MathExtras.h"		#include "llvm/Support/MathExtras.h"
#include "llvm/Support/MemoryBuffer.h"		#include "llvm/Support/MemoryBuffer.h"
#include "llvm/Support/raw_ostream.h"		#include "llvm/Support/raw_ostream.h"
#include <deque>		#include <deque>
using namespace llvm;		using namespace llvm;

namespace {		namespace {
enum {		enum {
SWITCH_INST_MAGIC = 0x4B5 // May 2012 => 1205 => Hex		SWITCH_INST_MAGIC = 0x4B5 // May 2012 => 1205 => Hex
};		};

		// Flag to deal with the fact that previous implementations of the
		// bitcode parser did not check for more input when lazily
		// materializing a module (This has been fixed by the call to
		// finishParse in materializeModule). Allows old bitcode invalid
		// tests to continue to work.

		// TODO(kschimpf): Remove this check once invalid tests are correctly
		// generated.
		cl::opt<bool>
		UseOldLazyBitcodeParser(
		"old-lazy-bitcode-parser",
		cl::desc("use old bitcode parser that doesn't check for more "
		"after materialization"),
		cl::init(false));

class BitcodeReaderValueList {		class BitcodeReaderValueList {
std::vector<WeakVH> ValuePtrs;		std::vector<WeakVH> ValuePtrs;

/// ResolveConstants - As we resolve forward-referenced constants, we add		/// ResolveConstants - As we resolve forward-referenced constants, we add
/// information about them to this vector. This allows us to resolve them in		/// information about them to this vector. This allows us to resolve them in
/// bulk instead of resolving each reference at a time. See the code in		/// bulk instead of resolving each reference at a time. See the code in
/// ResolveConstantForwardRefs for more information about this.		/// ResolveConstantForwardRefs for more information about this.
///		///
(78 lines not shown)
void AssignValue(Metadata *MD, unsigned Idx);		void AssignValue(Metadata *MD, unsigned Idx);
void tryToResolveCycles();		void tryToResolveCycles();
};		};

class BitcodeReader : public GVMaterializer {		class BitcodeReader : public GVMaterializer {
LLVMContext &Context;		LLVMContext &Context;
DiagnosticHandlerFunction DiagnosticHandler;		DiagnosticHandlerFunction DiagnosticHandler;
Module *TheModule;		Module *TheModule;
		// The following two fields define the type of memory to parse.
std::unique_ptr<MemoryBuffer> Buffer;		std::unique_ptr<MemoryBuffer> Buffer;
		DataStreamer *Streamer;
std::unique_ptr<BitstreamReader> StreamFile;		std::unique_ptr<BitstreamReader> StreamFile;
BitstreamCursor Stream;		BitstreamCursor Stream;
DataStreamer *LazyStreamer;
uint64_t NextUnreadBit;
bool SeenValueSymbolTable;		bool SeenValueSymbolTable;

std::vector<Type*> TypeList;		std::vector<Type*> TypeList;
BitcodeReaderValueList ValueList;		BitcodeReaderValueList ValueList;
BitcodeReaderMDValueList MDValueList;		BitcodeReaderMDValueList MDValueList;
std::vector<Comdat *> ComdatList;		std::vector<Comdat *> ComdatList;
SmallVector<Instruction *, 64> InstructionList;		SmallVector<Instruction *, 64> InstructionList;

(55 lines not shown)
/// for a more compact encoding. Some instruction operands are not		/// for a more compact encoding. Some instruction operands are not
/// relative to the instruction ID: basic block numbers, and types.		/// relative to the instruction ID: basic block numbers, and types.
/// Once the old style function blocks have been phased out, we would		/// Once the old style function blocks have been phased out, we would
/// not need this flag.		/// not need this flag.
bool UseRelativeIDs;		bool UseRelativeIDs;

/// True if all functions will be materialized, negating the need to process		/// True if all functions will be materialized, negating the need to process
/// (e.g.) blockaddress forward references.		/// (e.g.) blockaddress forward references.
bool WillMaterializeAllForwardRefs;		bool WillMaterializeAllForwardRefs = false;

/// Functions that have block addresses taken. This is usually empty.		/// Functions that have block addresses taken. This is usually empty.
SmallPtrSet<const Function *, 4> BlockAddressesTaken;		SmallPtrSet<const Function *, 4> BlockAddressesTaken;

/// True if any Metadata block has been materialized.		/// True if any Metadata block has been materialized.
bool IsMetadataMaterialized;		bool IsMetadataMaterialized = false;

		/// True if meta data should initially be skipped.
		bool ShouldLazyLoadMetadata = false;

		/// The name of state of the parse. Along with NextUnreadBit, they
		/// define the state of the parse between calls to continueParse().
		enum BitcodeReaderState {
		AtStart,
		AtTopLevel, // Processing top-level records.
		InsideModule, // Processing records inside a module block.
		// All states below here represent cases where input shouldn't be parsed.
		NoMoreInput, // Generic marker for having parsed input.
		ReachedEof, // Parsed input, but not necessary materializations.
		FinishedParse, // Parsed input and materialized necessary parts.
		ParseError, // An error has occurred, stop parsing.
		} ParseState = AtStart;

		/// The position (within the bitcode) where continueParse() left off, and used
		/// to set input position on the next call to continueParse().
		uint64_t NextUnreadBit = 0;

		/// The number of modules read at the top level.
		size_t NumModulesParsed = 0;

bool StripDebugInfo = false;		bool StripDebugInfo = false;

public:		public:
std::error_code Error(BitcodeError E, const Twine &Message);		std::error_code Error(BitcodeError E, const Twine &Message);
std::error_code Error(BitcodeError E);		std::error_code Error(BitcodeError E);
std::error_code Error(const Twine &Message);		std::error_code Error(const Twine &Message);

explicit BitcodeReader(MemoryBuffer *buffer, LLVMContext &C,		BitcodeReader(MemoryBuffer *Buffer, LLVMContext &C,
DiagnosticHandlerFunction DiagnosticHandler);		DiagnosticHandlerFunction DiagnosticHandler);
explicit BitcodeReader(DataStreamer *streamer, LLVMContext &C,		BitcodeReader(DataStreamer *Streamer, LLVMContext &C,
DiagnosticHandlerFunction DiagnosticHandler);		DiagnosticHandlerFunction DiagnosticHandler);
~BitcodeReader() override { FreeState(); }		~BitcodeReader() override { FreeState(); }

std::error_code materializeForwardReferencedFunctions();		std::error_code materializeForwardReferencedFunctions();

void FreeState();		void FreeState();

void releaseBuffer();		void releaseBuffer();

bool isDematerializable(const GlobalValue *GV) const override;		bool isDematerializable(const GlobalValue *GV) const override;
std::error_code materialize(GlobalValue *GV) override;		std::error_code materialize(GlobalValue *GV) override;
std::error_code materializeModule(Module *M) override;		std::error_code materializeModule(Module *M) override;
std::vector<StructType *> getIdentifiedStructTypes() const override;		std::vector<StructType *> getIdentifiedStructTypes() const override;
void dematerialize(GlobalValue *GV) override;		void dematerialize(GlobalValue *GV) override;

/// @brief Main interface to parsing a bitcode buffer.		/// \brief Starts parse of bitcode. Materializes during parse based on flags.
/// @returns true if an error occurred.		///
std::error_code ParseBitcodeInto(Module *M,		/// \param M the module to build.
bool ShouldLazyLoadMetadata = false);
		/// \param ShouldMaterializeAll true when the module should be materialized
		/// completely before returning. Otherwise, function bodies are only loaded on
		/// demand.
		/// \param ShouldLazyLoadMetadata true when the metadata blocks should be
		/// parsed.
		///
		/// \returns true if an error occurred.
		std::error_code parseBitcodeInto(Module *M,
		bool ShouldMaterializeAll,
		bool ShouldLazyLoadMetadata);
		rafael: Please commit the pure cleanup bits first: using \ instead of @, and starting functions with lowercase names.
		kschimpf: I assume this has already been done. The new CL doesn't have such cases anymore.

/// @brief Cheap mechanism to just extract module triple		/// \brief Cheap mechanism to just extract module triple
		filcab: Why the empty line?
		kschimpf: Removed.
/// @returns true if an error occurred.		/// \returns true if an error occurred.
ErrorOr<std::string> parseTriple();		ErrorOr<std::string> parseTriple();

static uint64_t decodeSignRotatedValue(uint64_t V);		static uint64_t decodeSignRotatedValue(uint64_t V);

/// Materialize any deferred Metadata block.		/// Materialize any deferred Metadata block.
std::error_code materializeMetadata() override;		std::error_code materializeMetadata() override;

void setStripDebugInfo() override;		void setStripDebugInfo() override;

private:		private:
		filcab: http://llvm.org/docs/CodingStandards.html#doxygen-use-in-documentation-comments ^^ This changed recently. Omit \brief if the brief description is just a sentence. (You might want to add the '.' at the end, though)
std::vector<StructType *> IdentifiedStructTypes;		std::vector<StructType *> IdentifiedStructTypes;
StructType *createIdentifiedStructType(LLVMContext &Context, StringRef Name);		StructType *createIdentifiedStructType(LLVMContext &Context, StringRef Name);
StructType *createIdentifiedStructType(LLVMContext &Context);		StructType *createIdentifiedStructType(LLVMContext &Context);

Type *getTypeByID(unsigned ID);		Type *getTypeByID(unsigned ID);
Value getFnValueByID(unsigned ID, Type Ty) {		Value getFnValueByID(unsigned ID, Type Ty) {
if (Ty && Ty->isMetadataTy())		if (Ty && Ty->isMetadataTy())
return MetadataAsValue::get(Ty->getContext(), getFnMetadataByID(ID));		return MetadataAsValue::get(Ty->getContext(), getFnMetadataByID(ID));
▲ Show 20 Lines • Show All 73 Lines • ▼ Show 20 Lines	Value *getValueSigned(SmallVectorImpl<uint64_t> &Record, unsigned Slot,
if (Slot == Record.size()) return nullptr;		if (Slot == Record.size()) return nullptr;
unsigned ValNo = (unsigned)decodeSignRotatedValue(Record[Slot]);		unsigned ValNo = (unsigned)decodeSignRotatedValue(Record[Slot]);
// Adjust the ValNo, if it was encoded relative to the InstNum.		// Adjust the ValNo, if it was encoded relative to the InstNum.
if (UseRelativeIDs)		if (UseRelativeIDs)
ValNo = InstNum - ValNo;		ValNo = InstNum - ValNo;
return getFnValueByID(ValNo, Ty);		return getFnValueByID(ValNo, Ty);
}		}

		/// \name Functions that parses bitcode files, other than skipped blocks based
		/// on flags to parseBitcodeInto().
		/// @{
		jvoung: lowercase first letter for new function (and I guess update the commit message if you do)
		kschimpf: Done.
		std::error_code startParse();
		std::error_code continueParse();
		std::error_code finishParse();
		/// @}
		jvoung: same
		kschimpf: Done.

		// Changes the parse state to the new value.
		void setParseState(BitcodeReaderState NewValue) {
		NextUnreadBit = Stream.GetCurrentBitNo();
		ParseState = NewValue;
		}
		jvoung: extra space in between updateParseState and (

		// Changes the parse state to ParseError if given an error.
		jvoung: In the review, I've tried to look for where ParseState gets set, and there are various ways to grep for that... one is ParseState = X... another is updateParseState(Y, ...), or updateParseState(Z); Is there any way you can make the number of variations smaller?
		kschimpf: I guess the first part of the problem is that there are 2 notions of parse state: 1) the field ParseState, which names the state of the parser, and 2) NextUnreadBit, which defines where to continue the parse on return (in case a function body gets parsed between calls). I also overloaded the return value with this update. Refactoring to do less and be more clear.
		void setParseStateIfError(std::error_code EC) {
		NextUnreadBit = Stream.GetCurrentBitNo();
		if (EC)
		ParseState = ParseError;
		}

/// Converts alignment exponent (i.e. power of two (or zero)) to the		/// Converts alignment exponent (i.e. power of two (or zero)) to the
/// corresponding alignment to use. If alignment is too large, returns		/// corresponding alignment to use. If alignment is too large, returns
/// a corresponding error code.		/// a corresponding error code.
std::error_code parseAlignmentValue(uint64_t Exponent, unsigned &Alignment);		std::error_code parseAlignmentValue(uint64_t Exponent, unsigned &Alignment);
std::error_code ParseAttrKind(uint64_t Code, Attribute::AttrKind *Kind);		std::error_code ParseAttrKind(uint64_t Code, Attribute::AttrKind *Kind);
std::error_code ParseModule(bool Resume, bool ShouldLazyLoadMetadata = false);		std::error_code ParseModule();
std::error_code ParseAttributeBlock();		std::error_code ParseAttributeBlock();
std::error_code ParseAttributeGroupBlock();		std::error_code ParseAttributeGroupBlock();
std::error_code ParseTypeTable();		std::error_code ParseTypeTable();
		jvoung: This is a bit weird to me... you have a bunch of functions called "updateParseState" but some variants modify ParseState and some variants don't.
		kschimpf: The refactoring is a bit better now. Hopefully good enough.
		jvoung: Thanks -- this is better. For a while I was also wondering how many places need to be aware of setting the state to ParseError, but I think it's just continueParse() because most/all searching for bit position, etc. goes through that.
		kschimpf: That is correct and was the intent. State updates (and bit positioning) are intentionally now localized to continueParse. The only exception is in ParseModule, which updates the state to record whether it returned without completing.
std::error_code ParseTypeTableBody();		std::error_code ParseTypeTableBody();

std::error_code ParseValueSymbolTable();		std::error_code ParseValueSymbolTable();
std::error_code ParseConstants();		std::error_code ParseConstants();
std::error_code RememberAndSkipFunctionBody();		std::error_code RememberAndSkipFunctionBody();
/// Save the positions of the Metadata blocks and skip parsing the blocks.		/// Save the positions of the Metadata blocks and skip parsing the blocks.
std::error_code rememberAndSkipMetadata();		std::error_code rememberAndSkipMetadata();
std::error_code ParseFunctionBody(Function *F);		std::error_code ParseFunctionBody(Function *F);
Show All 33 Lines

static std::error_code Error(DiagnosticHandlerFunction DiagnosticHandler,		static std::error_code Error(DiagnosticHandlerFunction DiagnosticHandler,
const Twine &Message) {		const Twine &Message) {
return Error(DiagnosticHandler,		return Error(DiagnosticHandler,
make_error_code(BitcodeError::CorruptedBitcode), Message);		make_error_code(BitcodeError::CorruptedBitcode), Message);
}		}

std::error_code BitcodeReader::Error(BitcodeError E, const Twine &Message) {		std::error_code BitcodeReader::Error(BitcodeError E, const Twine &Message) {
		setParseState(ParseError);
return ::Error(DiagnosticHandler, make_error_code(E), Message);		return ::Error(DiagnosticHandler, make_error_code(E), Message);
}		}

std::error_code BitcodeReader::Error(const Twine &Message) {		std::error_code BitcodeReader::Error(const Twine &Message) {
		setParseState(ParseError);
return ::Error(DiagnosticHandler,		return ::Error(DiagnosticHandler,
make_error_code(BitcodeError::CorruptedBitcode), Message);		make_error_code(BitcodeError::CorruptedBitcode), Message);
}		}

std::error_code BitcodeReader::Error(BitcodeError E) {		std::error_code BitcodeReader::Error(BitcodeError E) {
		setParseState(ParseError);
return ::Error(DiagnosticHandler, make_error_code(E));		return ::Error(DiagnosticHandler, make_error_code(E));
}		}

static DiagnosticHandlerFunction getDiagHandler(DiagnosticHandlerFunction F,		static DiagnosticHandlerFunction getDiagHandler(DiagnosticHandlerFunction F,
LLVMContext &C) {		LLVMContext &C) {
if (F)		if (F)
return F;		return F;
return [&C](const DiagnosticInfo &DI) { C.diagnose(DI); };		return [&C](const DiagnosticInfo &DI) { C.diagnose(DI); };
}		}

BitcodeReader::BitcodeReader(MemoryBuffer *buffer, LLVMContext &C,		BitcodeReader::BitcodeReader(MemoryBuffer *Buffer, LLVMContext &C,
DiagnosticHandlerFunction DiagnosticHandler)		DiagnosticHandlerFunction DiagnosticHandler)
: Context(C), DiagnosticHandler(getDiagHandler(DiagnosticHandler, C)),		: Context(C), DiagnosticHandler(getDiagHandler(DiagnosticHandler, C)),
TheModule(nullptr), Buffer(buffer), LazyStreamer(nullptr),		TheModule(nullptr), Buffer(Buffer), Streamer(nullptr),
NextUnreadBit(0), SeenValueSymbolTable(false), ValueList(C),		SeenValueSymbolTable(false), ValueList(C),
MDValueList(C), SeenFirstFunctionBody(false), UseRelativeIDs(false),		MDValueList(C), SeenFirstFunctionBody(false), UseRelativeIDs(false) {}
WillMaterializeAllForwardRefs(false), IsMetadataMaterialized(false) {}

BitcodeReader::BitcodeReader(DataStreamer *streamer, LLVMContext &C,		BitcodeReader::BitcodeReader(DataStreamer *Streamer, LLVMContext &C,
DiagnosticHandlerFunction DiagnosticHandler)		DiagnosticHandlerFunction DiagnosticHandler)
: Context(C), DiagnosticHandler(getDiagHandler(DiagnosticHandler, C)),		: Context(C), DiagnosticHandler(getDiagHandler(DiagnosticHandler, C)),
TheModule(nullptr), Buffer(nullptr), LazyStreamer(streamer),		TheModule(nullptr), Buffer(nullptr), Streamer(Streamer),
NextUnreadBit(0), SeenValueSymbolTable(false), ValueList(C),		SeenValueSymbolTable(false), ValueList(C),
MDValueList(C), SeenFirstFunctionBody(false), UseRelativeIDs(false),		MDValueList(C), SeenFirstFunctionBody(false), UseRelativeIDs(false) {}
WillMaterializeAllForwardRefs(false), IsMetadataMaterialized(false) {}

std::error_code BitcodeReader::materializeForwardReferencedFunctions() {		std::error_code BitcodeReader::materializeForwardReferencedFunctions() {
if (WillMaterializeAllForwardRefs)		if (WillMaterializeAllForwardRefs)
return std::error_code();		return std::error_code();

// Prevent recursion.		// Prevent recursion.
WillMaterializeAllForwardRefs = true;		WillMaterializeAllForwardRefs = true;

▲ Show 20 Lines • Show All 257 Lines • ▼ Show 20 Lines	static void UpgradeDLLImportExportLinkage(llvm::GlobalValue *GV, unsigned Val) {
switch (Val) {		switch (Val) {
case 5: GV->setDLLStorageClass(GlobalValue::DLLImportStorageClass); break;		case 5: GV->setDLLStorageClass(GlobalValue::DLLImportStorageClass); break;
case 6: GV->setDLLStorageClass(GlobalValue::DLLExportStorageClass); break;		case 6: GV->setDLLStorageClass(GlobalValue::DLLExportStorageClass); break;
}		}
}		}

namespace llvm {		namespace llvm {
namespace {		namespace {
/// @brief A class for maintaining the slot number definition		/// \brief A class for maintaining the slot number definition
		filcab: Omit \brief.
		kschimpf: Done.
/// as a placeholder for the actual definition for forward constants defs.		/// as a placeholder for the actual definition for forward constants defs.
		filcab: Nit: Put more words on the first line.
		kschimpf: Done.
class ConstantPlaceHolder : public ConstantExpr {		class ConstantPlaceHolder : public ConstantExpr {
void operator=(const ConstantPlaceHolder &) = delete;		void operator=(const ConstantPlaceHolder &) = delete;
public:		public:
// allocate space for exactly one operand		// allocate space for exactly one operand
		filcab: Nit: If it's for docs, it's probably best to start with an uppercase letter.
		kschimpf: Done.
void *operator new(size_t s) {		void *operator new(size_t s) {
return User::operator new(s, 1);		return User::operator new(s, 1);
}		}
explicit ConstantPlaceHolder(Type *Ty, LLVMContext& Context)		explicit ConstantPlaceHolder(Type *Ty, LLVMContext& Context)
: ConstantExpr(Ty, Instruction::UserOp1, &Op<0>(), 1) {		: ConstantExpr(Ty, Instruction::UserOp1, &Op<0>(), 1) {
Op<0>() = UndefValue::get(Type::getInt32Ty(Context));		Op<0>() = UndefValue::get(Type::getInt32Ty(Context));
}		}

/// @brief Methods to support type inquiry through isa, cast, and dyn_cast.		/// \brief Methods to support type inquiry through isa, cast, and dyn_cast.
		filcab: Omit \brief.
		kschimpf: Done.
static bool classof(const Value *V) {		static bool classof(const Value *V) {
return isa<ConstantExpr>(V) &&		return isa<ConstantExpr>(V) &&
cast<ConstantExpr>(V)->getOpcode() == Instruction::UserOp1;		cast<ConstantExpr>(V)->getOpcode() == Instruction::UserOp1;
}		}


/// Provide fast operand accessors		/// Provide fast operand accessors
DECLARE_TRANSPARENT_OPERAND_ACCESSORS(Value);		DECLARE_TRANSPARENT_OPERAND_ACCESSORS(Value);
▲ Show 20 Lines • Show All 1,963 Lines • ▼ Show 20 Lines	std::error_code BitcodeReader::GlobalCleanup() {

// Force deallocation of memory for these vectors to favor the client that		// Force deallocation of memory for these vectors to favor the client that
// want lazy deserialization.		// want lazy deserialization.
std::vector<std::pair<GlobalVariable*, unsigned> >().swap(GlobalInits);		std::vector<std::pair<GlobalVariable*, unsigned> >().swap(GlobalInits);
std::vector<std::pair<GlobalAlias*, unsigned> >().swap(AliasInits);		std::vector<std::pair<GlobalAlias*, unsigned> >().swap(AliasInits);
return std::error_code();		return std::error_code();
}		}

std::error_code BitcodeReader::ParseModule(bool Resume,		std::error_code BitcodeReader::ParseModule() {
bool ShouldLazyLoadMetadata) {		if (ParseState == AtTopLevel) {
if (Resume)		if (Stream.EnterSubBlock(bitc::MODULE_BLOCK_ID))
Stream.JumpToBit(NextUnreadBit);
else if (Stream.EnterSubBlock(bitc::MODULE_BLOCK_ID))
return Error("Invalid record");		return Error("Invalid record");
		setParseState(InsideModule);
		} else {
		assert(ParseState == InsideModule);
		}

SmallVector<uint64_t, 64> Record;		SmallVector<uint64_t, 64> Record;
std::vector<std::string> SectionTable;		std::vector<std::string> SectionTable;
std::vector<std::string> GCTable;		std::vector<std::string> GCTable;

// Read all the records for this module.		// Read all the records for this module.
while (1) {		while (1) {
BitstreamEntry Entry = Stream.advance();		BitstreamEntry Entry = Stream.advance();

switch (Entry.Kind) {		switch (Entry.Kind) {
case BitstreamEntry::Error:		case BitstreamEntry::Error:
return Error("Malformed block");		return Error("Malformed block");
case BitstreamEntry::EndBlock:		case BitstreamEntry::EndBlock:
		setParseState(AtTopLevel);
return GlobalCleanup();		return GlobalCleanup();

case BitstreamEntry::SubBlock:		case BitstreamEntry::SubBlock:
switch (Entry.ID) {		switch (Entry.ID) {
default: // Skip unknown content.		default: // Skip unknown content.
if (Stream.SkipBlock())		if (Stream.SkipBlock())
return Error("Invalid record");		return Error("Invalid record");
break;		break;
▲ Show 20 Lines • Show All 41 Lines • ▼ Show 20 Lines	case BitstreamEntry::SubBlock:
std::reverse(FunctionsWithBodies.begin(), FunctionsWithBodies.end());		std::reverse(FunctionsWithBodies.begin(), FunctionsWithBodies.end());
if (std::error_code EC = GlobalCleanup())		if (std::error_code EC = GlobalCleanup())
return EC;		return EC;
SeenFirstFunctionBody = true;		SeenFirstFunctionBody = true;
}		}

if (std::error_code EC = RememberAndSkipFunctionBody())		if (std::error_code EC = RememberAndSkipFunctionBody())
return EC;		return EC;
// For streaming bitcode, suspend parsing when we reach the function		// Suspend parsing when we reach a function body, assuming we
// bodies. Subsequent materialization calls will resume it when		// have already associated names with global values. Note: If
		jvoung: NOte -> Note
		kschimpf: Done.
// necessary. For streaming, the function bodies must be at the end of		// the bitcode file is old, the symbol table will be at the
// the bitcode. If the bitcode file is old, the symbol table will be		// end instead and will not have been seen yet.
// at the end instead and will not have been seen yet. In this case,		if (SeenValueSymbolTable)
// just finish the parse now.
if (LazyStreamer && SeenValueSymbolTable) {
NextUnreadBit = Stream.GetCurrentBitNo();
jvoung: NextUnreadBit is no longer set -- wanted to check if that is okay now (and why)?
return std::error_code();		return std::error_code();
}
break;		break;
case bitc::USELIST_BLOCK_ID:		case bitc::USELIST_BLOCK_ID:
if (std::error_code EC = ParseUseLists())		if (std::error_code EC = ParseUseLists())
return EC;		return EC;
break;		break;
}		}
continue;		continue;

▲ Show 20 Lines • Show All 231 Lines • ▼ Show 20 Lines	case bitc::MODULE_CODE_FUNCTION: {

ValueList.push_back(Func);		ValueList.push_back(Func);

// If this is a function with a body, remember the prototype we are		// If this is a function with a body, remember the prototype we are
// creating now, so that we can match up the body with them later.		// creating now, so that we can match up the body with them later.
if (!isProto) {		if (!isProto) {
Func->setIsMaterializable(true);		Func->setIsMaterializable(true);
FunctionsWithBodies.push_back(Func);		FunctionsWithBodies.push_back(Func);
if (LazyStreamer)
DeferredFunctionInfo[Func] = 0;		DeferredFunctionInfo[Func] = 0;
}		}
break;		break;
}		}
// ALIAS: [alias type, aliasee val#, linkage]		// ALIAS: [alias type, aliasee val#, linkage]
// ALIAS: [alias type, aliasee val#, linkage, visibility, dllstorageclass]		// ALIAS: [alias type, aliasee val#, linkage, visibility, dllstorageclass]
case bitc::MODULE_CODE_ALIAS: {		case bitc::MODULE_CODE_ALIAS: {
if (Record.size() < 3)		if (Record.size() < 3)
return Error("Invalid record");		return Error("Invalid record");
Show All 30 Lines	case bitc::MODULE_CODE_PURGEVALS:
return Error("Invalid record");		return Error("Invalid record");
ValueList.shrinkTo(Record[0]);		ValueList.shrinkTo(Record[0]);
break;		break;
}		}
Record.clear();		Record.clear();
}		}
}		}

std::error_code BitcodeReader::ParseBitcodeInto(Module *M,		std::error_code BitcodeReader::parseBitcodeInto(Module *M,
		bool ShouldMaterializeAll,
bool ShouldLazyLoadMetadata) {		bool ShouldLazyLoadMetadata) {
TheModule = nullptr;		auto cleanupOnError = [&](std::error_code EC) {
		releaseBuffer(); // Never take ownership on error.
		return EC;
		};

		TheModule = M;
		this->ShouldLazyLoadMetadata = ShouldLazyLoadMetadata;

		if (std::error_code EC = startParse())
		return cleanupOnError(EC);

		if (ShouldMaterializeAll) {
		if (std::error_code EC = materializeModule(TheModule))
		return cleanupOnError(EC);
		jvoung: Does this need to be cleanupOnError(EC) also?
		kschimpf: Yes. Good catch. Fixing.
		} else {
		if (std::error_code EC = materializeForwardReferencedFunctions())
		return cleanupOnError(EC);
		}

		return std::error_code();
		}
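		Note on the cleanupOnError lambda above: a minimal sketch of the same pattern, assuming a toy reader. ToyReader, OwnsBuffer, FailStart, FailMaterialize and step() are hypothetical names for illustration and are not part of BitcodeReader. Routing every error return through one lambda is what keeps a path like the materializeModule one from skipping the release.

```cpp
#include <cassert>
#include <system_error>

// Hypothetical stand-in for the reader; only the ownership logic is modelled.
struct ToyReader {
  bool OwnsBuffer = true;       // true while the reader is responsible for the buffer
  bool FailStart = false;       // test knob: make the start step fail
  bool FailMaterialize = false; // test knob: make the materialize step fail

  void releaseBuffer() { OwnsBuffer = false; }

  // One parsing step that either succeeds or reports an error.
  std::error_code step(bool Fail) {
    return Fail ? std::make_error_code(std::errc::illegal_byte_sequence)
                : std::error_code();
  }

  std::error_code parseInto(bool ShouldMaterializeAll) {
    // Every error return goes through this lambda, so ownership is never
    // kept after a failure.
    auto cleanupOnError = [&](std::error_code EC) {
      releaseBuffer();
      return EC;
    };

    if (std::error_code EC = step(FailStart))
      return cleanupOnError(EC);
    if (ShouldMaterializeAll) {
      if (std::error_code EC = step(FailMaterialize))
        return cleanupOnError(EC);
    }
    return std::error_code();
  }
};
```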

		std::error_code BitcodeReader::startParse() {
		assert(ParseState == AtStart);

if (std::error_code EC = InitStream())		if (std::error_code EC = InitStream())
return EC;		return EC;

// Sniff for the signature.		// Sniff for the signature.
if (Stream.Read(8) != 'B' \|\|		if (Stream.Read(8) != 'B' \|\|
Stream.Read(8) != 'C' \|\|		Stream.Read(8) != 'C' \|\|
Stream.Read(4) != 0x0 \|\|		Stream.Read(4) != 0x0 \|\|
Stream.Read(4) != 0xC \|\|		Stream.Read(4) != 0xC \|\|
Stream.Read(4) != 0xE \|\|		Stream.Read(4) != 0xE \|\|
Stream.Read(4) != 0xD)		Stream.Read(4) != 0xD) {
return Error("Invalid bitcode signature");		return Error("Invalid bitcode signature");
		}

		return continueParse();
		}
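		Note on the signature check above: the reads of 'B', 'C', 0x0, 0xC, 0xE, 0xD correspond to the byte sequence 0x42 0x43 0xC0 0xDE, assuming the bitstream consumes the low bits of each byte first. A byte-level sketch of the same check follows; hasBitcodeSignature is a hypothetical helper, not an LLVM function, and it ignores any wrapper header that InitStream may handle.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns true when the first four bytes are 'B', 'C', 0xC0, 0xDE.
bool hasBitcodeSignature(const std::vector<std::uint8_t> &Bytes) {
  static const std::uint8_t Magic[4] = {'B', 'C', 0xC0, 0xDE};
  if (Bytes.size() < 4)
    return false;
  for (std::size_t I = 0; I < 4; ++I)
    if (Bytes[I] != Magic[I])
      return false;
  return true;
}
```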

		std::error_code BitcodeReader::continueParse() {
		switch (ParseState) {
		case AtStart:
		setParseState(AtTopLevel);
		break;
		case AtTopLevel:
		// Restore input position to saved position on last call.
		Stream.JumpToBit(NextUnreadBit);
		break;
		case InsideModule: {
		// Restore input position to saved position on last call,
		jvoung: This covers two states? InsideModule and AtTopLevel? It might be more clear if you list them out so it's clear what the "break" corresponds to (AtTopLevel). Previously, the Stream.JumpToBit(NextUnreadBit); was only needed when InsideModule... is it now needed for AtTopLevel too?
		kschimpf: The JumpToBit is needed because various "materialize" methods may be called between calls to continueParse. By forcing a JumpToBit to happen at all calls to continueParse, we no longer need to know where the materialize methods leave the bit cursor. While I did not see an example of an error caused by interleaved calls to materialize, I was very suspicious that they could occur, and wanted to make sure that this would not happen. Hence, I made sure that continueParse always resets the position to where it left off. I will fix to not use default, so that a corresponding warning will be generated if a new value is added to the enumeration.
		// and then continue parsing module.
		Stream.JumpToBit(NextUnreadBit);
		std::error_code EC = ParseModule();
		setParseStateIfError(EC);
		return EC;
		}
		case NoMoreInput:
		case ReachedEof:
		case FinishedParse:
		return std::error_code();
		case ParseError:
		return Error("Can't continue, bitcode error already found");
		}

// We expect a number of well-defined blocks, though we don't necessarily		// We expect a number of well-defined blocks, though we don't necessarily
// need to understand them all.		// need to understand them all.
while (1) {		while (1) {
		assert(ParseState == AtTopLevel);

if (Stream.AtEndOfStream()) {		if (Stream.AtEndOfStream()) {
if (TheModule)		setParseState(ReachedEof);
		return std::error_code();
		}

		if (UseOldLazyBitcodeParser && NumModulesParsed == 1) {
		// Fake at end.
		setParseState(ReachedEof);
return std::error_code();		return std::error_code();
// We didn't really read a proper Module.
return Error("Malformed IR file");
}		}

BitstreamEntry Entry =		BitstreamEntry Entry =
Stream.advance(BitstreamCursor::AF_DontAutoprocessAbbrevs);		Stream.advance(BitstreamCursor::AF_DontAutoprocessAbbrevs);

switch (Entry.Kind) {		switch (Entry.Kind) {
case BitstreamEntry::Error:		case BitstreamEntry::Error:
return Error("Malformed block");		return Error("Malformed IR file");
case BitstreamEntry::EndBlock:		case BitstreamEntry::EndBlock:
		setParseState(AtTopLevel);
		filcab: We're already at top level, no? (line 3234) I might be missing something, but it looks like we're at top level, and saw an EndBlock. Shouldn't this be an error?
		kschimpf: I agree that we should be at top level. I also agree that it appears weird that we allow extra (unmatched) EndBlocks. This has been allowed by the bitcode reader/writer for years. I just wasn't willing to make the leap that I should remove this. However, I tried removing it (and making it an error), and no tests failed. Hence, converting this to an error.
return std::error_code();		return std::error_code();
		kschimpf: Discovered that the BitstreamEntry::EndBlock case was "hiding" a bug in the bitstream reader when processing a data stream. That is, when using a data stream, the size is not set until after the eof is reached. Hence, when Stream.AtEndOfStream() was called above, it would return false even when at the eof. The actual problem was in FillCurWord, which did not set the bit position correctly when there was no more input. The old code worked because the read (at eof) would return zero, which is understood as an end block. By returning success for this value, it would hide this problem. I also improved the error message so that one can see where the reader thought the eof should be, if there is miscellaneous stuff at the end of the bitcode file. This makes it easier to know where to cut a test file in such cases.

case BitstreamEntry::SubBlock:		case BitstreamEntry::SubBlock:
switch (Entry.ID) {		switch (Entry.ID) {
		filcab: Thank you!
case bitc::BLOCKINFO_BLOCK_ID:		case bitc::BLOCKINFO_BLOCK_ID:
if (Stream.ReadBlockInfoBlock())		if (Stream.ReadBlockInfoBlock()) {
return Error("Malformed block");		return Error("Malformed block");
		}
		filcab: errs()? Or StrBuf? Please also add a test for this error message.
		kschimpf: Good catch. I meant StrBuf, so that we can use the same API for all errors.
break;		break;
		rafael: This looks a bit much to be honest. Corrupted files are not that common and it is trivial to set a breakpoint to find the state.
		kschimpf: Simplified in new CL to do same as before.
case bitc::MODULE_BLOCK_ID:		case bitc::MODULE_BLOCK_ID: {
// Reject multiple MODULE_BLOCK's in a single bitstream.		// Reject multiple MODULE_BLOCK's in a single bitstream.
if (TheModule)		if (NumModulesParsed++) {
		rafael: This is always 0 or 1. Use a boolean instead.
		kschimpf: This was already fixed in master.
return Error("Invalid multiple blocks");		return Error("Invalid multiple blocks");
TheModule = M;		}
if (std::error_code EC = ParseModule(false, ShouldLazyLoadMetadata))		std::error_code EC = ParseModule();
		setParseStateIfError(EC);
return EC;		return EC;
if (LazyStreamer)		}
return std::error_code();
break;
default:		default:
if (Stream.SkipBlock())		if (Stream.SkipBlock()) {
return Error("Invalid record");		return Error("Invalid record");
		}
break;		break;
}		}
continue;		continue;
case BitstreamEntry::Record:		case BitstreamEntry::Record:
// There should be no records in the top-level of blocks.		// There should be no records in the top-level of blocks.

// The ranlib in Xcode 4 will align archive members by appending newlines		// The ranlib in Xcode 4 will align archive members by appending newlines
// to the end of them. If this file size is a multiple of 4 but not 8, we		// to the end of them. If this file size is a multiple of 4 but not 8, we
// have to read and ignore these final 4 bytes :-(		// have to read and ignore these final 4 bytes :-(
if (Stream.getAbbrevIDWidth() == 2 && Entry.ID == 2 &&		if (Stream.getAbbrevIDWidth() == 2 && Entry.ID == 2 &&
Stream.Read(6) == 2 && Stream.Read(24) == 0xa0a0a &&		Stream.Read(6) == 2 && Stream.Read(24) == 0xa0a0a &&
Stream.AtEndOfStream())		Stream.AtEndOfStream()) {
		setParseState(ReachedEof);
return std::error_code();		return std::error_code();
		}

return Error("Invalid record");		return Error("Invalid record");
}		}
}		}
}		}
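		Note on the JumpToBit discussion in continueParse above: a minimal sketch of the idea, assuming a toy parser over a vector of integers. ToyParser, ToyState, Cursor, NextUnread and materialize() are hypothetical names for illustration. A 0 in the input stands in for a function body that suspends parsing. Because continueParse() always restores the saved position first, it does not matter where an interleaved materialize() call leaves the shared cursor.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class ToyState { AtTopLevel, ReachedEof };

struct ToyParser {
  std::vector<int> Input;     // a 0 marks a point where parsing suspends
  std::size_t Cursor = 0;     // shared cursor, also moved by materialize()
  std::size_t NextUnread = 0; // saved resume position
  ToyState State = ToyState::AtTopLevel;
  int Sum = 0;                // stands in for "what has been parsed so far"

  // Records the resume position together with the new state.
  void setState(ToyState S) {
    NextUnread = Cursor;
    State = S;
  }

  // Stands in for a materialize call that leaves the cursor elsewhere.
  void materialize(std::size_t Pos) { Cursor = Pos; }

  // Consumes items until a 0 (suspend) or the end of the input.
  void continueParse() {
    if (State == ToyState::ReachedEof)
      return;
    Cursor = NextUnread; // always restore, whatever happened in between
    while (Cursor < Input.size()) {
      int V = Input[Cursor++];
      if (V == 0) {
        setState(ToyState::AtTopLevel);
        return;
      }
      Sum += V;
    }
    setState(ToyState::ReachedEof);
  }
};
```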

		std::error_code BitcodeReader::finishParse() {
		assert(TheModule);

		jvoung: I don't quite understand how "ShouldMaterializeAll = false" is supposed to work for the streaming case, if this isn't checked until after: while (ParseState < NoMoreInput) { if (std::error_code EC = ContinueParse()) { return EC; } } How do you delay reading until materialize(GV) for streaming?
		kschimpf: After talking to Derek, I realized what the issue I was missing was. I'll summarize what I understand: When streaming, we want to "return" as soon as possible, without having to force all bitcode to be scanned. This reduces the cost of (potential) blocking calls to the data streamer. Control can return to the caller without having completed the parse. However, the parsed portions must be consistent (i.e. forward block address references have been resolved). Based on this, I've modified the code to lift the materializeForwardReferencedFunctions into startParse.
		while (ParseState < NoMoreInput) {
		if (std::error_code EC = continueParse())
		return EC;
		}
		jvoung: This used to be early, in "getLazyBitcodeModuleImpl". This looks like it is now happening late in FinishParse after the loop until NoMoreInput. Why is this okay now? Was the early call to "materializeForwardReferencedFunctions" actually extraneous because of the call in materialize(GV), or what? Make sure to check the lazy case with blockaddresses for computed gotos, if there isn't already a unittest for that.
		kschimpf: I agree that there should be some type of forward reference unit test to verify we can lazy evaluate these forward-referenced block addresses. I also agree that for the use by llvm-dis, my original code worked because it eventually calls materializeAllPermanently. I think this is why I was confused about the full expectation of "streamed" (or lazy).

		switch (ParseState) {
		case AtStart:
		case AtTopLevel:
		case InsideModule:
		llvm_unreachable("finishParse exits with ParseState < NoMoreInput");
		case NoMoreInput:
		case ReachedEof:
		setParseState(FinishedParse);
		break;
		case FinishedParse:
		jvoung: "NoMorInput" -> "NoMoreInput". It could also be that more states >= NoMoreInput are added as code evolves, but not handled here. Can the compiler accept/handle a static_assert(ParseState < NoMoreInput, "...") to catch what happens if more states are added after NoMoreInput but not handled by this switch?
		kschimpf (author): Fixed string. Also removed "default" case and made all states explicit. This will force a warning if a new state is added.
		break;
		case ParseError:
		return Error("Can't continue, bitcode error already found");
		}
		if (NumModulesParsed == 1)
		return std::error_code();
		// We didn't really read a proper Module.
		return Error("Malformed IR file");
		}
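
The drain loop and the default-free switch discussed in the comments above can be sketched in isolation. This is a toy model, not the patch's code: `ToyReader`, `BlocksLeft`, `BlocksRead`, and the string return value are hypothetical stand-ins, and only the state names are taken from the diff.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the reader's states. The declaration order
// matters because the drain loop compares against NoMoreInput.
enum class ParseState {
  AtStart, AtTopLevel, InsideModule,
  NoMoreInput, ReachedEof, FinishedParse, ParseError
};

struct ToyReader {
  ParseState State = ParseState::AtStart;
  int BlocksLeft;     // pretend input: number of blocks still unread
  int BlocksRead = 0; // how many blocks continueParse has consumed

  explicit ToyReader(int Blocks) : BlocksLeft(Blocks) {}

  // Read one more block, or record that the input is exhausted.
  void continueParse() {
    if (BlocksLeft == 0) {
      State = ParseState::NoMoreInput;
      return;
    }
    --BlocksLeft;
    ++BlocksRead;
    State = ParseState::InsideModule;
  }

  // Drain the remaining input, then settle the final state.
  // Returns an empty string on success and a message on failure.
  std::string finishParse() {
    while (State < ParseState::NoMoreInput)
      continueParse();

    // No default case: a compiler with -Wswitch enabled warns when a new
    // enumerator is added to ParseState but not handled here.
    switch (State) {
    case ParseState::AtStart:
    case ParseState::AtTopLevel:
    case ParseState::InsideModule:
      return "unreachable: the loop only exits at NoMoreInput or later";
    case ParseState::NoMoreInput:
    case ParseState::ReachedEof:
      State = ParseState::FinishedParse;
      return "";
    case ParseState::FinishedParse:
      return "";
    case ParseState::ParseError:
      return "Can't continue, bitcode error already found";
    }
    return ""; // not reached; every enumerator returns above
  }
};
```

A `static_assert` cannot test the runtime value of a member such as `ParseState`, because it requires a constant expression, so listing every enumerator explicitly is the closest compile-time check available here.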

ErrorOr<std::string> BitcodeReader::parseModuleTriple() {		ErrorOr<std::string> BitcodeReader::parseModuleTriple() {
if (Stream.EnterSubBlock(bitc::MODULE_BLOCK_ID))		if (Stream.EnterSubBlock(bitc::MODULE_BLOCK_ID))
return Error("Invalid record");		return Error("Invalid record");

SmallVector<uint64_t, 64> Record;		SmallVector<uint64_t, 64> Record;

std::string Triple;		std::string Triple;
// Read all the records for this module.		// Read all the records for this module.
▲ Show 20 Lines • Show All 1,240 Lines • ▼ Show 20 Lines	OutOfRecordLoop:
return std::error_code();		return std::error_code();
}		}

/// Find the function body in the bitcode stream		/// Find the function body in the bitcode stream
std::error_code BitcodeReader::FindFunctionInStream(		std::error_code BitcodeReader::FindFunctionInStream(
Function *F,		Function *F,
DenseMap<Function *, uint64_t>::iterator DeferredFunctionInfoIterator) {		DenseMap<Function *, uint64_t>::iterator DeferredFunctionInfoIterator) {
while (DeferredFunctionInfoIterator->second == 0) {		while (DeferredFunctionInfoIterator->second == 0) {
if (Stream.AtEndOfStream())		if (ParseState >= NoMoreInput) {
return Error("Could not find function in stream");		return Error("Could not find function in stream");
// ParseModule will parse the next body in the stream and set its		}
// position in the DeferredFunctionInfo map.		if (std::error_code EC = continueParse()) {
if (std::error_code EC = ParseModule(true))
return EC;		return EC;
}		}
		}
return std::error_code();		return std::error_code();
}		}
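
As a rough illustration of the new FindFunctionInStream logic, the sketch below keeps scanning input until the requested body's offset is recorded, and gives up once no input remains. Everything here is a hypothetical model: `ToyLazyReader`, `Body`, `noMoreInput`, and the string keys are invented for the example; the real code keys on `Function *` and tests `ParseState >= NoMoreInput`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical model: each streamed function body has a name and a bit offset.
struct Body {
  std::string Name;
  uint64_t Offset;
};

struct ToyLazyReader {
  std::vector<Body> Stream;                 // bodies in stream order
  std::size_t Next = 0;                     // index of the next unread body
  std::map<std::string, uint64_t> Deferred; // offset 0 means "not seen yet"

  bool noMoreInput() const { return Next == Stream.size(); }

  // Scan one more body and record where it lives.
  void continueParse() {
    const Body &B = Stream[Next++];
    Deferred[B.Name] = B.Offset;
  }

  // Keep scanning until the requested body's offset is known.
  // Returns false if the input runs out first.
  bool findFunctionInStream(const std::string &Name) {
    while (Deferred[Name] == 0) {
      if (noMoreInput())
        return false;
      continueParse();
    }
    return true;
  }
};
```

Only as much of the stream is consumed as the request needs: asking for the second body leaves the third unread, which is the property the streaming case relies on.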

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// GVMaterializer implementation		// GVMaterializer implementation
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

void BitcodeReader::releaseBuffer() { Buffer.release(); }		void BitcodeReader::releaseBuffer() { Buffer.release(); }

std::error_code BitcodeReader::materialize(GlobalValue *GV) {		std::error_code BitcodeReader::materialize(GlobalValue *GV) {
if (std::error_code EC = materializeMetadata())		if (std::error_code EC = materializeMetadata())
return EC;		return EC;

Function *F = dyn_cast<Function>(GV);		Function *F = dyn_cast<Function>(GV);
// If it's not a function or is already material, ignore the request.		// If it's not a function or is already material, ignore the request.
if (!F || !F->isMaterializable())		if (!F || !F->isMaterializable())
return std::error_code();		return std::error_code();

DenseMap<Function*, uint64_t>::iterator DFII = DeferredFunctionInfo.find(F);		DenseMap<Function*, uint64_t>::iterator DFII = DeferredFunctionInfo.find(F);
assert(DFII != DeferredFunctionInfo.end() && "Deferred function not found!");		assert(DFII != DeferredFunctionInfo.end() && "Deferred function not found!");
// If its position is recorded as 0, its body is somewhere in the stream		// If its position is recorded as 0, its body is somewhere in the stream
// but we haven't seen it yet.		// but we haven't seen it yet.
if (DFII->second == 0 && LazyStreamer)		if (DFII->second == 0)
if (std::error_code EC = FindFunctionInStream(F, DFII))		if (std::error_code EC = FindFunctionInStream(F, DFII))
return EC;		return EC;

// Move the bit stream to the saved position of the deferred function body.		// Move the bit stream to the saved position of the deferred function body.
Stream.JumpToBit(DFII->second);		Stream.JumpToBit(DFII->second);

if (std::error_code EC = ParseFunctionBody(F))		if (std::error_code EC = ParseFunctionBody(F))
return EC;		return EC;
Show All 11 Lines	if (I->first != I->second) {
if (CallInst* CI = dyn_cast<CallInst>(*UI++))		if (CallInst* CI = dyn_cast<CallInst>(*UI++))
UpgradeIntrinsicCall(CI, I->second);		UpgradeIntrinsicCall(CI, I->second);
}		}
}		}
}		}

// Bring in any functions that this function forward-referenced via		// Bring in any functions that this function forward-referenced via
// blockaddresses.		// blockaddresses.
return materializeForwardReferencedFunctions();		return materializeForwardReferencedFunctions();
		jvoung: no need for extra space
		kschimpf (author): Done.
}		}

bool BitcodeReader::isDematerializable(const GlobalValue *GV) const {		bool BitcodeReader::isDematerializable(const GlobalValue *GV) const {
const Function *F = dyn_cast<Function>(GV);		const Function *F = dyn_cast<Function>(GV);
if (!F || F->isDeclaration())		if (!F || F->isDeclaration())
return false;		return false;

// Dematerializing F would leave dangling references that wouldn't be		// Dematerializing F would leave dangling references that wouldn't be
Show All 16 Lines	void BitcodeReader::dematerialize(GlobalValue *GV) {
F->dropAllReferences();		F->dropAllReferences();
F->setIsMaterializable(true);		F->setIsMaterializable(true);
}		}

std::error_code BitcodeReader::materializeModule(Module *M) {		std::error_code BitcodeReader::materializeModule(Module *M) {
assert(M == TheModule &&		assert(M == TheModule &&
"Can only Materialize the Module this BitcodeReader is attached to.");		"Can only Materialize the Module this BitcodeReader is attached to.");

if (std::error_code EC = materializeMetadata())		// Make sure the rest of the bits in the module (excluding materializable)
		// have been read.
		if (std::error_code EC = finishParse())
		kschimpf (author): Removing the comment about being after a function body. This is no longer true. A call to materializeMetadata would put us some place else in the bitcode file.
return EC;		return EC;

// Promise to materialize all forward references.		if (std::error_code EC = materializeMetadata())
WillMaterializeAllForwardRefs = true;		return EC;

		kschimpf (author): Moved the iterating of functions to before the call to finishParse. This deals with the problems I was having with tests in test/Bitcode/invalid.test (i.e. Inputs/invalid-fwdref-type-mismatch-2.bc and Inputs/invalid-load-ptr-type.bc). These two files had multiple errors (the one they intended, which was inside a function body, and the one probably not intended: extraneous stuff at the end of the bitcode file). This removes the need for the command line flag UseOldLazyBitcodeParse, and I have deleted it.
// Iterate over the module, deserializing any functions that are still on		// Iterate over the module, deserializing any functions that are still on
		kschimpf (author): Now that the eof checking is fixed, I moved this back where it was in an earlier version of this CL.
// disk.		// disk.
for (Module::iterator F = TheModule->begin(), E = TheModule->end();		for (Module::iterator F = TheModule->begin(), E = TheModule->end();
F != E; ++F) {		F != E; ++F) {
if (std::error_code EC = materialize(F))		if (std::error_code EC = materialize(F))
return EC;		return EC;
}		}
// At this point, if there are any function bodies, the current bit is
// pointing to the END_BLOCK record after them. Now make sure the rest
// of the bits in the module have been read.
if (NextUnreadBit)
ParseModule(true);

		jvoung: Is this necessary at this point? Should that already be covered by the "// Iterate over the module, deserializing any functions that are still on disk" loop?
		kschimpf (author): In correct bitcode files, you are right. However, if the function doesn't define any function blocks, but (incorrectly) references function block addresses, this code will cause the error to be generated. However, looking at the following instruction, this is checked anyway. Removing.
// Check that all block address forward references got resolved (as we		// Check that all block address forward references got resolved.
		jvoung: The "promise" comment from "above" is removed now, so you could update this comment.
		kschimpf (author): Done.
// promised above).
if (!BasicBlockFwdRefs.empty())		if (!BasicBlockFwdRefs.empty())
return Error("Never resolved function from blockaddress");		return Error("Never resolved function from blockaddress");

// Upgrade any intrinsic calls that slipped through (should not happen!) and		// Upgrade any intrinsic calls that slipped through (should not happen!) and
// delete the old functions to clean up. We can't do this unless the entire		// delete the old functions to clean up. We can't do this unless the entire
// module is materialized because there could always be another function body		// module is materialized because there could always be another function body
// with calls to the old function.		// with calls to the old function.
for (std::vector<std::pair<Function*, Function*> >::iterator I =		for (std::vector<std::pair<Function*, Function*> >::iterator I =
Show All 18 Lines	std::error_code BitcodeReader::materializeModule(Module *M) {
return std::error_code();		return std::error_code();
}		}

std::vector<StructType *> BitcodeReader::getIdentifiedStructTypes() const {		std::vector<StructType *> BitcodeReader::getIdentifiedStructTypes() const {
return IdentifiedStructTypes;		return IdentifiedStructTypes;
}		}

std::error_code BitcodeReader::InitStream() {		std::error_code BitcodeReader::InitStream() {
if (LazyStreamer)		if (Streamer)
return InitLazyStream();		return InitLazyStream();
return InitStreamFromBuffer();		return InitStreamFromBuffer();
}		}

std::error_code BitcodeReader::InitStreamFromBuffer() {		std::error_code BitcodeReader::InitStreamFromBuffer() {
const unsigned char *BufPtr = (const unsigned char*)Buffer->getBufferStart();		const unsigned char *BufPtr = (const unsigned char*)Buffer->getBufferStart();
const unsigned char *BufEnd = BufPtr+Buffer->getBufferSize();		const unsigned char *BufEnd = BufPtr+Buffer->getBufferSize();

Show All 10 Lines	std::error_code BitcodeReader::InitStreamFromBuffer() {
Stream.init(&*StreamFile);		Stream.init(&*StreamFile);

return std::error_code();		return std::error_code();
}		}

std::error_code BitcodeReader::InitLazyStream() {		std::error_code BitcodeReader::InitLazyStream() {
// Check and strip off the bitcode wrapper; BitstreamReader expects never to		// Check and strip off the bitcode wrapper; BitstreamReader expects never to
// see it.		// see it.
auto OwnedBytes = llvm::make_unique<StreamingMemoryObject>(LazyStreamer);		auto OwnedBytes = llvm::make_unique<StreamingMemoryObject>(Streamer);
StreamingMemoryObject &Bytes = *OwnedBytes;		StreamingMemoryObject &Bytes = *OwnedBytes;
StreamFile = llvm::make_unique<BitstreamReader>(std::move(OwnedBytes));		StreamFile = llvm::make_unique<BitstreamReader>(std::move(OwnedBytes));
Stream.init(&*StreamFile);		Stream.init(&*StreamFile);

unsigned char buf[16];		unsigned char buf[16];
if (Bytes.readBytes(buf, 16, 0) != 16)		if (Bytes.readBytes(buf, 16, 0) != 16)
return Error("Invalid bitcode signature");		return Error("Invalid bitcode signature");

▲ Show 20 Lines • Show All 51 Lines • ▼ Show 20 Lines	getLazyBitcodeModuleImpl(std::unique_ptr<MemoryBuffer> &&Buffer,
LLVMContext &Context, bool WillMaterializeAll,		LLVMContext &Context, bool WillMaterializeAll,
DiagnosticHandlerFunction DiagnosticHandler,		DiagnosticHandlerFunction DiagnosticHandler,
bool ShouldLazyLoadMetadata = false) {		bool ShouldLazyLoadMetadata = false) {
Module *M = new Module(Buffer->getBufferIdentifier(), Context);		Module *M = new Module(Buffer->getBufferIdentifier(), Context);
BitcodeReader *R =		BitcodeReader *R =
new BitcodeReader(Buffer.get(), Context, DiagnosticHandler);		new BitcodeReader(Buffer.get(), Context, DiagnosticHandler);
M->setMaterializer(R);		M->setMaterializer(R);

auto cleanupOnError = [&](std::error_code EC) {		if (std::error_code EC =
R->releaseBuffer(); // Never take ownership on error.		R->parseBitcodeInto(M, WillMaterializeAll, ShouldLazyLoadMetadata)) {
delete M; // Also deletes R.		delete M; // Also deletes R.
return EC;		return EC;
};		}

// Delay parsing Metadata if ShouldLazyLoadMetadata is true.
if (std::error_code EC = R->ParseBitcodeInto(M, ShouldLazyLoadMetadata))
return cleanupOnError(EC);

if (!WillMaterializeAll)
// Resolve forward references from blockaddresses.
if (std::error_code EC = R->materializeForwardReferencedFunctions())
return cleanupOnError(EC);

Buffer.release(); // The BitcodeReader owns it now.		Buffer.release(); // The BitcodeReader owns it now.
return M;		return M;
}		}

ErrorOr<Module *>		ErrorOr<Module *>
llvm::getLazyBitcodeModule(std::unique_ptr<MemoryBuffer> &&Buffer,		llvm::getLazyBitcodeModule(std::unique_ptr<MemoryBuffer> &&Buffer,
LLVMContext &Context,		LLVMContext &Context,
DiagnosticHandlerFunction DiagnosticHandler,		DiagnosticHandlerFunction DiagnosticHandler,
bool ShouldLazyLoadMetadata) {		bool ShouldLazyLoadMetadata) {
return getLazyBitcodeModuleImpl(std::move(Buffer), Context, false,		return getLazyBitcodeModuleImpl(std::move(Buffer), Context, false,
DiagnosticHandler, ShouldLazyLoadMetadata);		DiagnosticHandler, ShouldLazyLoadMetadata);
}		}

ErrorOr<std::unique_ptr<Module>>		ErrorOr<std::unique_ptr<Module>>
llvm::getStreamedBitcodeModule(StringRef Name, DataStreamer *Streamer,		llvm::getStreamedBitcodeModule(StringRef Name, DataStreamer *Streamer,
LLVMContext &Context,		LLVMContext &Context,
DiagnosticHandlerFunction DiagnosticHandler) {		DiagnosticHandlerFunction DiagnosticHandler) {
std::unique_ptr<Module> M = make_unique<Module>(Name, Context);		std::unique_ptr<Module> M = make_unique<Module>(Name, Context);
BitcodeReader *R = new BitcodeReader(Streamer, Context, DiagnosticHandler);		BitcodeReader *R = new BitcodeReader(Streamer, Context, DiagnosticHandler);
M->setMaterializer(R);		M->setMaterializer(R);
if (std::error_code EC = R->ParseBitcodeInto(M.get()))		if (std::error_code EC = R->parseBitcodeInto(M.get(), false, false))
return EC;		return EC;
return std::move(M);		return std::move(M);
}		}

ErrorOr<Module *>		ErrorOr<Module *>
llvm::parseBitcodeFile(MemoryBufferRef Buffer, LLVMContext &Context,		llvm::parseBitcodeFile(MemoryBufferRef Buffer, LLVMContext &Context,
DiagnosticHandlerFunction DiagnosticHandler) {		DiagnosticHandlerFunction DiagnosticHandler) {
std::unique_ptr<MemoryBuffer> Buf = MemoryBuffer::getMemBuffer(Buffer, false);		std::unique_ptr<MemoryBuffer> Buf = MemoryBuffer::getMemBuffer(Buffer, false);
ErrorOr<Module *> ModuleOrErr = getLazyBitcodeModuleImpl(		ErrorOr<Module *> ModuleOrErr = getLazyBitcodeModuleImpl(
std::move(Buf), Context, true, DiagnosticHandler);		std::move(Buf), Context, true, DiagnosticHandler);
if (!ModuleOrErr)		if (!ModuleOrErr)
return ModuleOrErr;		return ModuleOrErr;
Module *M = ModuleOrErr.get();		Module *M = ModuleOrErr.get();
// Read in the entire module, and destroy the BitcodeReader.
if (std::error_code EC = M->materializeAllPermanently()) {
delete M;
return EC;
}

// TODO: Restore the use-lists to the in-memory state when the bitcode was		// TODO: Restore the use-lists to the in-memory state when the bitcode was
// written. We must defer until the Module has been fully materialized.		// written. We must defer until the Module has been fully materialized.

return M;		return M;
}		}

std::string		std::string
Show All 10 Lines

lib/Bitcode/Reader/BitstreamReader.cpp

Show First 20 Lines • Show All 352 Lines • ▼ Show 20 Lines	switch (readRecord(Entry.ID, Record)) {
Name += (char)Record[i];		Name += (char)Record[i];
CurBlockInfo->RecordNames.push_back(std::make_pair((unsigned)Record[0],		CurBlockInfo->RecordNames.push_back(std::make_pair((unsigned)Record[0],
Name));		Name));
break;		break;
}		}
}		}
}		}
}		}

test/Bitcode/invalid.test


	INVALID-TYPE: Invalid type for value			INVALID-TYPE: Invalid type for value

	RUN: not llvm-dis -disable-output %p/Inputs/invalid-fwdref-type-mismatch.bc 2>&1 | \			RUN: not llvm-dis -disable-output %p/Inputs/invalid-fwdref-type-mismatch.bc 2>&1 | \
	RUN: FileCheck --check-prefix=FWDREF-TYPE %s			RUN: FileCheck --check-prefix=FWDREF-TYPE %s

	FWDREF-TYPE: Invalid record			FWDREF-TYPE: Invalid record

	RUN: not llvm-dis -disable-output %p/Inputs/invalid-fwdref-type-mismatch-2.bc 2>&1 | \			RUN: not llvm-dis -disable-output %p/Inputs/invalid-fwdref-type-mismatch-2.bc 2>&1 \
				RUN: -old-lazy-bitcode-parser | \
	RUN: FileCheck --check-prefix=FWDREF-TYPE-MISMATCH %s			RUN: FileCheck --check-prefix=FWDREF-TYPE-MISMATCH %s

	FWDREF-TYPE-MISMATCH: Type mismatch in constant table!			FWDREF-TYPE-MISMATCH: Type mismatch in constant table!

	RUN: not llvm-dis -disable-output %p/Inputs/invalid-array-element-type.bc 2>&1 | \			RUN: not llvm-dis -disable-output %p/Inputs/invalid-array-element-type.bc 2>&1 | \
	RUN: FileCheck --check-prefix=ELEMENT-TYPE %s			RUN: FileCheck --check-prefix=ELEMENT-TYPE %s
	RUN: not llvm-dis -disable-output %p/Inputs/invalid-vector-element-type.bc 2>&1 | \			RUN: not llvm-dis -disable-output %p/Inputs/invalid-vector-element-type.bc 2>&1 | \
	RUN: FileCheck --check-prefix=ELEMENT-TYPE %s			RUN: FileCheck --check-prefix=ELEMENT-TYPE %s

	INSERT-0-IDXS: INSERTVAL: Invalid instruction with 0 indices			INSERT-0-IDXS: INSERTVAL: Invalid instruction with 0 indices

	RUN: not llvm-dis -disable-output %p/Inputs/invalid-extract-0-indices.bc 2>&1 | \			RUN: not llvm-dis -disable-output %p/Inputs/invalid-extract-0-indices.bc 2>&1 | \
	RUN: FileCheck --check-prefix=EXTRACT-0-IDXS %s			RUN: FileCheck --check-prefix=EXTRACT-0-IDXS %s

	EXTRACT-0-IDXS: EXTRACTVAL: Invalid instruction with 0 indices			EXTRACT-0-IDXS: EXTRACTVAL: Invalid instruction with 0 indices

	RUN: not llvm-dis -disable-output %p/Inputs/invalid-load-ptr-type.bc 2>&1 | \			RUN: not llvm-dis -disable-output %p/Inputs/invalid-load-ptr-type.bc 2>&1 \
				RUN: -old-lazy-bitcode-parser | \
	RUN: FileCheck --check-prefix=BAD-LOAD-PTR-TYPE %s			RUN: FileCheck --check-prefix=BAD-LOAD-PTR-TYPE %s

	BAD-LOAD-PTR-TYPE: Cannot load/store from pointer			BAD-LOAD-PTR-TYPE: Cannot load/store from pointer

	RUN: not llvm-dis -disable-output %p/Inputs/invalid-inserted-value-type-mismatch.bc 2>&1 | \			RUN: not llvm-dis -disable-output %p/Inputs/invalid-inserted-value-type-mismatch.bc 2>&1 | \
	RUN: FileCheck --check-prefix=INSERT-TYPE-MISMATCH %s			RUN: FileCheck --check-prefix=INSERT-TYPE-MISMATCH %s

	INSERT-TYPE-MISMATCH: Inserted value type doesn't match aggregate type			INSERT-TYPE-MISMATCH: Inserted value type doesn't match aggregate type