This is an archive of the discontinued LLVM Phabricator instance.

Refactor bitcode reader to simplify control.
Abandoned, Public

Authored by kschimpf on Apr 1 2015, 2:32 PM.

Details

Reviewers

dschuff
filcab
rafael
jvoung

Summary

Modifies the bitcode reader such that the same logic is used for
both memory buffers and data streams. The incremental parsing
was factored into startParse, continueParse, and finishParse.
All parses (incremental or non-incremental) begin with startParse.
Then zero (or more) calls to continueParse incrementally read more
input, picking up from where the last call left off. finishParse
materializes any additional parts, based on the flags passed to startParse.
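The control flow described in the summary can be sketched roughly as follows. This is a toy illustration, not the code from the patch: `ToyReader`, its string input, and the "one character per parsed unit" model are hypothetical. Only the method names startParse, continueParse, and finishParse come from the summary.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical toy reader. The input is a plain string and each character
// stands in for one parsed unit; none of this is the real BitcodeReader.
class ToyReader {
public:
  explicit ToyReader(std::string Input) : Input(std::move(Input)) {}

  // Every parse begins here. The flag is remembered for finishParse().
  void startParse(bool MaterializeAll) {
    ShouldMaterializeAll = MaterializeAll;
    NextUnread = 0;
    Parsed.clear();
    Started = true;
  }

  // Reads up to MaxUnits more units, resuming where the previous call
  // stopped. Returns true while unread input remains.
  bool continueParse(std::size_t MaxUnits) {
    assert(Started && "startParse() must be called first");
    for (std::size_t I = 0; I < MaxUnits && NextUnread < Input.size(); ++I)
      Parsed.push_back(Input[NextUnread++]);
    return NextUnread < Input.size();
  }

  // Completes the parse. Remaining input is consumed only when the flag
  // passed to startParse() asked for everything to be materialized.
  void finishParse() {
    assert(Started && "startParse() must be called first");
    if (ShouldMaterializeAll)
      while (continueParse(1)) {
      }
  }

  const std::string &parsed() const { return Parsed; }

private:
  std::string Input;
  std::string Parsed;
  std::size_t NextUnread = 0;
  bool ShouldMaterializeAll = false;
  bool Started = false;
};
```

In this sketch a caller would invoke startParse once, continueParse zero or more times, and finishParse last. A lazy caller passes false to startParse and is left with only what the continueParse calls consumed.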

Diff Detail

Event Timeline

jvoung added inline comments. Apr 1 2015, 4:49 PM

lib/Bitcode/Reader/BitcodeReader.cpp
234	Be consistent about capitalization in comments: one line starts with "parsed input" and another line starts with "Parsed input" =)
250–251	Don't need explicit anymore, though that transition was a while back so not really related to this CL.
277	Variable name is different from comment "MaterializeAll" vs "ShouldMaterializeAll" -- make them the same? I see that in the actual definition you're trying to avoid conflicting with the field name...

kschimpf updated this object. Apr 6 2015, 10:04 AM

kschimpf edited edge metadata.

Merge branch 'master' of http://llvm.org/git/llvm into readfac1
Working version. Save state.
Cleanup startParse.
Cleaned up code.
Fix nit.
Merge branch 'master' of http://llvm.org/git/llvm into readfac1

Did all changes except adding test that forward block address references get resolved on lazy loads.

lib/Bitcode/Reader/BitcodeReader.cpp
234	Done.
250–251	Done.
268–269	Sorry, my fault. I followed the syntax of ConstantPlaceHolder below. Changing to follow coding standards.
268–269	Done.
276	Good point. Fixing.
277	Fixing the names to be consistent. Fixing name conflict by prefixing assignments of field names with "this->".
386	Done.
390	Done.
398	I guess the first part of the problem is that there are 2 notions of parse state: The field ParseState that names the state of the parser, and NextUnreadBit which defines where to continue the parse on return (in case a function body gets parsed between calls). I also overloaded the return value with this update. Refactoring to do less and be more clear.
413	The refactoring is a bit better now. Hopefully good enough.
2845	Done.
3277	After talking to Derek, I realized what the issue was that I was missing. I'll summarize what I understand: When streaming, we want to "return" as soon as possible, without having to force all bitcode to be scanned. This reduces the cost of (potential) blocking calls to the data streamer. Control can return to the caller without having completed the parse. However, the parsed portions must be consistent (i.e. forward block address references have been resolved). Based on this, I've modified the code to lift the materializeForwardReferencedFunctions into startParse.
3281	I agree that there should be some type of forward reference unit test to verify we can lazy evaluate these forward-referenced block addresses. I also agree that for the use by llvm-dis, my original code worked because it eventually calls materializeAllPermanently. I think this is why I was confused about the full expectation of "streamed" (or lazy) parsing.
4616	Done.
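The reply on line 277 mentions resolving the parameter/field name clash by prefixing field assignments with "this->". A minimal sketch of that idiom follows. `ParseOptions` and `set` are hypothetical names used only for illustration; the two flag names are taken from the review.

```cpp
#include <cassert>

// Hypothetical holder for the two flags discussed in the review.
struct ParseOptions {
  bool ShouldMaterializeAll = false;
  bool ShouldLazyLoadMetadata = false;

  // The parameters deliberately share the fields' names. Inside the body the
  // bare name refers to the parameter, so the field must be reached through
  // this-> for the assignment to have any effect on the object.
  void set(bool ShouldMaterializeAll, bool ShouldLazyLoadMetadata) {
    this->ShouldMaterializeAll = ShouldMaterializeAll;
    this->ShouldLazyLoadMetadata = ShouldLazyLoadMetadata;
  }
};
```

Keeping the names identical lets the documentation comment and the signature agree, at the cost of having to remember the `this->` prefix on every assignment.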

rafael added inline comments. Apr 8 2015, 5:27 PM

include/llvm/Support/StreamingMemoryObject.h
75 ↗	(On Diff #23431)	Why do you need this? The streamer will return how many bytes were read and can handle a larger request. Also, why does it need to be part of this patch? It looks like this patch has many independent changes in it.
88 ↗	(On Diff #23431)	Why the extra logic? If objectsize is known it is the same as BytesRead, no?

kschimpf added inline comments. Apr 9 2015, 9:41 AM

include/llvm/Support/StreamingMemoryObject.h
75 ↗	(On Diff #23431)	This was added to handle the case of parsing a wrapped bitcode file. In such cases, you do not need to do another read (which may block until it succeeds). That was the intent of this change. However, a simpler approach would be to allow the extra read, and then not set ObjectSize (below) if already set. I will remove this change, and add the conditional assignment to ObjectSize below. I made the change in this CL because the problem only surfaced once I fixed the bug where materializing a module (when streaming) did not read the whole bitcode file. When that fix was added, tests failed and this issue was exposed. I will remove the changes to StreamingMemoryObject.{h,cpp} and put them in a separate CL.
88 ↗	(On Diff #23431)	No, they aren't necessarily the same. The problem happens when you have a wrapped bitcode file, and was not exposed until I fixed the case that we weren't reading the entire bitcode when materializing lazily. Then, a bunch of test cases failed. When I looked into it, this is what I discovered: The wrapped bitcode was smaller than kChunkSize. Hence, the initial read set BytesRead to the size of the wrapped file. The wrapper was then read, and set ObjectSize, which was 4 bytes smaller than BytesRead. This is the reason I changed this file as I did.
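The interaction described above can be modelled with a small toy, shown here as a sketch under stated assumptions rather than the real StreamingMemoryObject. The class `ToyStreamingObject`, the chunk size of 16, and the rule that a zero-byte read marks end of stream are all assumptions made for illustration. The names kChunkSize, BytesRead, ObjectSize, and setKnownObjectSize come from the discussion.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical chunk size; the real value is not given in the discussion.
constexpr std::size_t kChunkSize = 16;

// Toy model of a streamed object whose total length is unknown up front.
class ToyStreamingObject {
public:
  explicit ToyStreamingObject(std::size_t StreamLength)
      : StreamLength(StreamLength) {}

  // Pulls up to one chunk from the stream. In this toy, a read of zero bytes
  // marks end of stream. ObjectSize is then set from BytesRead, but only if
  // nothing (such as a wrapper header) has already supplied it.
  bool fetchChunk() {
    std::size_t N = std::min(kChunkSize, StreamLength - BytesRead);
    BytesRead += N;
    if (N == 0 && !ObjectSizeKnown) {
      ObjectSize = BytesRead;
      ObjectSizeKnown = true;
    }
    return N != 0;
  }

  // Called once the wrapper header has been parsed and states the real size.
  void setKnownObjectSize(std::size_t Size) {
    ObjectSize = Size;
    ObjectSizeKnown = true;
  }

  std::size_t bytesRead() const { return BytesRead; }
  std::size_t objectSize() const { return ObjectSize; }

private:
  std::size_t StreamLength;
  std::size_t BytesRead = 0;
  std::size_t ObjectSize = 0;
  bool ObjectSizeKnown = false;
};
```

With a 12-byte stream, the first fetch reads all 12 bytes. If a wrapper then declares a size of 8, the later end-of-stream read leaves ObjectSize at 8, so BytesRead and ObjectSize differ by 4, matching the situation described in the reply.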

jvoung added inline comments. Apr 9 2015, 2:33 PM

lib/Bitcode/Reader/BitcodeReader.cpp
242	nit: This is usually 0 or 1, but it seems unexpected for this field to be named "NumModulesParsed", and yet have the type be "bool". Rename or change type?
413	Thanks -- this is better. For a while I was also wondering how many places need to be aware of setting the state to ParseError, but I think it's just continueParse() because most/all searching for bit position, etc. goes through that.
3159	Does this need to be cleanupOnError(EC) also?
3197	This covers two states? InsideModule and AtTopLevel? It might be more clear if you list them out so it's clear what the "break" corresponds to (AtTopLevel). Previously, the Stream.JumpToBit(NextUnreadBit); was only needed when InsideModule... is it now needed for AtTopLevel too?
3292	"NoMorInput" -> "NoMoreInput" It could also be that more states >= NoMoreInput are added as code evolves, but not handled here. Can the compiler accept/handle a "static_assert(ParseState < NoMoreInput, "...") to catch what happens if more states are added after NoMoreInput but not handled by this switch?
4664	Is this necessary at this point? Should that already be covered by the " // Iterate over the module, deserializing any functions that are still on disk" loop?
4664–4665	The "promise" comment from "above" is removed now, so you could update this comment.

Fix issues raised by jvoung, and remove changes now in D8907.

kschimpf added a parent revision: D8931: Add test showing error in StreamingMemoryObject.setKnownObjectSize(). Apr 9 2015, 3:37 PM

kschimpf added inline comments.

lib/Bitcode/Reader/BitcodeReader.cpp
242	Good catch. I did mean to use size_t. Fixing.
413	That is correct and was the intent. State updates (and bit positioning) are intentionally now localized to continueParse. The only exception is in ParseModule, which updates the state to record whether it returned without completing.
3159	Yes. Good catch. Fixing.
3197	The jumpToBit is needed because various "materialize" methods may be called between calls to continueParse. By forcing a jumpToBit to happen at all calls to continueParse, we no longer need to know where the materialize methods leave the bitcursor. While I did not see an example of an error caused by interleaved calls to materialize, I was very suspicious that they could occur, and wanted to make sure that this would not happen. Hence, I made sure that continueParse always resets the position to where it left off. I will fix to not use default, so that a corresponding warning will be generated if a new value is added to the enumeration.
3292	Fixed string. Also removed "default" case and made all states explicit. This will force a warning if a new state is added.
4651	Removing the comment about being after a function body. This is no longer true. A call to materializeMetadata would put us some place else in the bitcode file.
4664	In correct bitcode files, you are right. However, if the function doesn't define any function blocks, but (incorrectly) references function block addresses, this code will cause the error to be generated. However, looking at the following instruction, this is checked anyway. Removing.
4664–4665	Done.
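Two points from the replies on lines 3197 and 3292 can be sketched together: continueParse repositions the cursor on every call, and the state switch lists every enumerator without a default so that compilers such as Clang and GCC can warn (-Wswitch) when a new state is left unhandled. The code below is a toy under stated assumptions. `ToyCursor`, `ToyParser`, the reduced set of states, and the fixed 32-bit step are hypothetical, and only the ideas of NextUnreadBit, jumpToBit, and an explicit switch come from the review.

```cpp
#include <cassert>
#include <cstdint>

// Reduced, hypothetical set of states; the patch defines more.
enum class ParseState { AtStart, AtTopLevel, InsideModule, ReachedEof, ParseError };

// Toy cursor that only tracks a bit position.
struct ToyCursor {
  std::uint64_t Bit = 0;
  void jumpToBit(std::uint64_t B) { Bit = B; }
  void advance(std::uint64_t N) { Bit += N; }
};

class ToyParser {
public:
  // Consumes one 32-bit unit per call and returns true while input remains.
  bool continueParse(std::uint64_t EndBit) {
    // Always resume from the saved position, wherever other code left the
    // shared cursor in the meantime.
    Cursor.jumpToBit(NextUnreadBit);
    switch (State) { // No default: a new enumerator triggers -Wswitch.
    case ParseState::AtStart:
      State = ParseState::AtTopLevel;
      break;
    case ParseState::AtTopLevel:
    case ParseState::InsideModule:
      break;
    case ParseState::ReachedEof:
    case ParseState::ParseError:
      return false;
    }
    Cursor.advance(32);
    NextUnreadBit = Cursor.Bit;
    if (NextUnreadBit >= EndBit)
      State = ParseState::ReachedEof;
    return State != ParseState::ReachedEof;
  }

  // Stand-in for a materialize call that moves the shared cursor elsewhere.
  void materializeSomething(std::uint64_t Where) {
    Cursor.jumpToBit(Where);
    Cursor.advance(8);
  }

  std::uint64_t nextUnreadBit() const { return NextUnreadBit; }
  ParseState state() const { return State; }

private:
  ToyCursor Cursor;
  std::uint64_t NextUnreadBit = 0;
  ParseState State = ParseState::AtStart;
};
```

Because the position is restored at the top of continueParse, an interleaved materializeSomething call cannot change where the next parse step resumes.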

Fix issues in diff 23431.

Merge branch 'master' into readfac1
Fix issues raised by merge.
Merge branch 'master' of http://llvm.org/git/llvm into readfac1
Fix tests to use old-style parser.

Fix nits.

Now that the issues with the streaming memory object have been fixed, I have updated this CL for review.

Note that I added a CL flag "-old-lazy-bitcode-parser". This was done to deal with a bug fixed by this CL. That is, in the old code, when you materialized a module, it didn't check if there was any additional data in the bitcode file. The new code fixes this by calling "finishParse". However, there are a couple of (bitcode binary) tests that were generated with this violation. Hence, the flag was added to fix this problem.

I'm willing to remove this flag in either (1) a later review, or (2) in a later revision. However, for this review I made the issues explicit so that the problem can be seen.

In D8786#179791, @kschimpf wrote:

However, there are a couple of (bitcode binary) tests that were generated with this violation. Hence, the flag was added to fix this problem.

I'm willing to remove this flag in either (1) a later review, or (2) in a later revision. However, for this review I made the issues explicit so that the problem can be seen.

Let me know which ones have bugs. They might be easy-ish to reconstruct, especially since I now have some additional practice in fiddling with bc files. (Can't promise to deal with them very quickly, though.)

lib/Bitcode/Reader/BitcodeReader.cpp
282–283	Why the empty line?
294	http://llvm.org/docs/CodingStandards.html#doxygen-use-in-documentation-comments ^^ This changed recently. Omit \brief if the brief description is just a sentence. (You might want to add the '.' at the end, though)
771	Omit \brief.
785	Omit \brief.
3227	We're already at top level, no? (line 3234) I might be missing something, but it looks like we're at top level, and saw an EndBlock. Shouldn't this be an error?

kschimpf added a reviewer: filcab. May 28 2015, 1:54 PM

Merge branch 'master' of http://llvm.org/git/llvm into readfac1
Fix issues raised by filcab.

Fixes based on feedback by Filipe.

lib/Bitcode/Reader/BitcodeReader.cpp
282–283	Removed.
771	Done.
785	Done.
3227	I agree that we should be at top level. I also agree that it appears weird that we allow extra (unmatched) EndBlocks. This has been allowed by the bitcode reader/writer for years. I just wasn't willing to make the leap that I should remove this. However, I tried removing it (and making it an error), and no tests failed. Hence, converting this to an error.
4656	Moved the iterating of functions to before the call to finishParse. This deals with the problems I was having with tests in test/Bitcode/invalid.test (i.e. Inputs/invalid-fwdref-type-mismatch-2.bc and Inputs/invalid-load-ptr-type.bc). These two files had multiple errors (the one they intended which was inside a function body, and the one probably not intended - extraneous stuff at the end of the bitcode file). This removes the need for the command line flag UseOldLazyBitcodeParse, and I have deleted it.

Hi Karl,

The files came straight from the fuzzer, so they are likely to have more than one error. If, in order to support them (where support is: keep the test working and diagnosing what we want), you have to change the code in a convoluted way, I would prefer to change the test.

If the change is minimal and not a problem (doesn't impact legibility or architecture), then keeping the tests as they are is not a problem either. I just want to avoid having worse code just so we don't have to re-do some tests.

Of course, if we started crashing on the tests, that's a problem :-)

Thanks,

Filipe

(Phab butchered the comment in my email. Editing it to get a complete history in Phabricator)

Fix invalid bitcode tests with more than one error.
Merge branch 'master' into readfac1

In addition to moving the code back in materializeModule, I fixed three tests. Two of them were fuzz tests where I "truncated" the file to the end of the module block. The other test did not have anything after a bad abbreviation definition, and the code file was incomplete. So I generated a replacement test that was well structured otherwise (i.e. only had that one error in it).

include/llvm/Bitcode/BitstreamReader.h
328	This code fixes the state of the bit streamer when no more input is found. As a result, method AtEndOfStream now works correctly.
lib/Bitcode/Reader/BitcodeReader.cpp
3226	Discovered that the Bitstream::EndBlock was "hiding" a bug in the bitstream reader when processing a data stream. That is, when using a data stream, the size is not set until after the eof is reached. Hence, when Stream.AtEndOfStream() was called above, it would return false even when at the eof. The actual problem was in FillCurWord, which did not set the bit position correctly when there was no more input. The old code worked because the read (at eof) would return zero, which is understood as an end block. By returning success for this value, it would hide this problem. I also improved the error message so that one can see where the reader thought the eof should be, if there is miscellaneous stuff at the end of the bitcode file. This makes it easier to know where to cut a test file in such cases.
4657	Now that the eof checking is fixed, I moved this back where it was in an earlier version of this CL.
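The end-of-stream fix can be illustrated with a toy bit cursor. This is a sketch under assumptions, not the real BitstreamCursor: the class `ToyBitCursor`, its 8-bit word, the `ClearOnEof` switch, and the convention that Size of 0 means "unknown" are hypothetical. The field names Size, NextChar, CurWord, and BitsInCurWord follow the diff, and whether the stale-bits scenario below matches the exact failure seen in the patch is also an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Toy cursor over a byte stream whose length is unknown until a refill
// returns no data. One "word" is 8 bits here.
class ToyBitCursor {
public:
  ToyBitCursor(std::string Bytes, bool ClearOnEof)
      : Bytes(std::move(Bytes)), ClearOnEof(ClearOnEof) {}

  // Reads NumBits (1 to 8) bits, low bits first. Returns 0 when input runs out.
  unsigned read(unsigned NumBits) {
    assert(NumBits >= 1 && NumBits <= 8);
    if (BitsInCurWord >= NumBits)
      return take(NumBits);
    unsigned Low = CurWord; // Bits still buffered from the previous word.
    unsigned LowBits = BitsInCurWord;
    unsigned BitsLeft = NumBits - LowBits;
    fillCurWord();
    if (BitsLeft > BitsInCurWord)
      return 0; // Ran out of input.
    return Low | (take(BitsLeft) << LowBits);
  }

  bool atEndOfStream() const {
    return BitsInCurWord == 0 && Size != 0 && NextChar == Size;
  }

private:
  void fillCurWord() {
    if (NextChar >= Bytes.size()) { // The stream returned zero bytes.
      Size = NextChar;
      if (ClearOnEof) { // Mirrors the two lines added in the diff.
        CurWord = 0;
        BitsInCurWord = 0;
      }
      return;
    }
    CurWord = static_cast<unsigned char>(Bytes[NextChar++]);
    BitsInCurWord = 8;
  }

  unsigned take(unsigned N) {
    unsigned R = CurWord & ((1u << N) - 1u);
    CurWord >>= N;
    BitsInCurWord -= N;
    return R;
  }

  std::string Bytes;
  bool ClearOnEof;
  std::size_t Size = 0; // 0 means "not yet known" in this toy.
  std::size_t NextChar = 0;
  unsigned CurWord = 0;
  unsigned BitsInCurWord = 0;
};
```

In this model, a read that runs past the end leaves stale bits behind unless the word state is cleared, and those stale bits make atEndOfStream report false even though every byte has been consumed.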

Hi Karl,

Really sorry for the delay.
LGTM on my part, as long as you add the test for the error message and do the fix.

Thank you,

Filipe

lib/Bitcode/Reader/BitcodeReader.cpp
772	Nit: Put more words on the first line.
776	Nit: If it's for docs, it's probably best to start with an uppercase letter.
3229	Thank you!
3233	errs()? Or StrBuf? Please also add a test for this error message.

Merge branch 'master' of http://llvm.org/git/llvm into readfac1
Fixes associated with review by Filipe.

Fixed issues raised by Filipe.

lib/Bitcode/Reader/BitcodeReader.cpp
772	Done.
776	Done.
3233	Good catch. I meant StrBuf, so that we can use the same API for all errors.

Applying the patch locally to take a better look.

lib/Bitcode/Reader/BitcodeReader.cpp
269–281	Please commit the pure cleanup bits first: using \ instead of @, and starting functions with lowercase names.
3234	This looks a bit much to be honest. Corrupted files are not that common and it is trivial to set a breakpoint to find the state.
3245	This is always 0 or 1. Use a boolean instead.

I got the following test failures locally:

LLVM :: Bitcode/invalid.test
LLVM :: tools/gold/invalid.ll

This CL has gotten a bit long and hard to read (too many versions). Moved to a new CL: http://reviews.llvm.org/D10518

lib/Bitcode/Reader/BitcodeReader.cpp
269–281	I assume this has already been done. The new CL doesn't have such cases anymore.
3234	Simplified in new CL to do same as before.
3245	This was already fixed in master.

Revision Contents

Path	Size
include/llvm/Bitcode/BitstreamReader.h	2 lines
lib/Bitcode/Reader/BitcodeReader.cpp	321 lines
test/Bitcode/Inputs/invalid-abbrev.bc
test/Bitcode/Inputs/invalid-fwdref-type-mismatch-2.bc
test/Bitcode/Inputs/invalid-load-ptr-type.bc

Commit	Tree	Parents	Author	Summary	Date
9fd0697d3371	090f105b1f8d	62f309a622b8	Karl Schimpf	Fix nit.	Jun 1 2015, 1:37 PM
62f309a622b8	33f6c031bea0	8066f41b81ef 3fcf5a99cbe7	Karl Schimpf	Merge branch 'master' into readfac1	Jun 1 2015, 1:29 PM
8066f41b81ef	870bbaf9f1a0	3944bbc6d418	Karl Schimpf	Fix nits.	Jun 1 2015, 1:20 PM
3944bbc6d418	98e7d1f6d2d6	d485b6b36010	Karl Schimpf	Fix invalid bitcode tests with more than one error. (Show More…)	Jun 1 2015, 1:04 PM
d485b6b36010	30b62cad8716	c341e5b362bf	Karl Schimpf	Fix issues raised by filcab.	May 28 2015, 3:06 PM
c341e5b362bf	2d6243e9d40c	ada1d858509e 20e00576af0d	Karl Schimpf	Merge branch 'master' of http://llvm.org/git/llvm into readfac1	May 28 2015, 2:05 PM
ada1d858509e	908bfa88dbe2	33fac955326a	Karl Schimpf	Fix nits.	May 27 2015, 3:39 PM
33fac955326a	f4eafba3ba14	ebb2942e3600	Karl Schimpf	Fix tests to use old-style parser.	May 27 2015, 3:28 PM
ebb2942e3600	fc2e92854f6a	b247e435f281 344593ce6c91	Karl Schimpf	Merge branch 'master' of http://llvm.org/git/llvm into readfac1	May 27 2015, 3:00 PM
b247e435f281	4107d4f57346	94061b821ca8	Karl Schimpf	Fix code due to merge, and report issue with invalid bitcode test.	May 27 2015, 2:59 PM
94061b821ca8	5421b155116b	d80585f197c5	Karl Schimpf	Save state.	May 27 2015, 1:49 PM
d80585f197c5	222a2e884a6b	a8ad769f4f7e	Karl Schimpf	Move check for TheModule being defined.	May 27 2015, 1:31 PM
a8ad769f4f7e	d37cfb3bee2b	ab35075a9118	Karl Schimpf	Fix issues raised by merge.	May 27 2015, 12:56 PM
ab35075a9118	75eb612a2483	8f22bd42f4dc 890a876e0e16	Karl Schimpf	Merge branch 'master' into readfac1	May 27 2015, 11:03 AM
8f22bd42f4dc	483c907c9f8a	cedc6147f3be	Karl Schimpf	Fix nit.	Apr 13 2015, 10:34 AM
cedc6147f3be	3685c77bf21c	437feddf04b5 332adac427ca	Karl Schimpf	Merge branch 'master' of http://llvm.org/git/llvm into readfac1	Apr 13 2015, 10:14 AM
437feddf04b5	49c35e14280e	8644fc8e6366	Karl Schimpf	Remove changes now in http://reviews.llvm.org/D8931.	Apr 9 2015, 3:28 PM
8644fc8e6366	0c564b770339	672b0e7b2d15	Karl Schimpf	Fix issues raised by jvoung.	Apr 9 2015, 3:26 PM
672b0e7b2d15	027579062840	19e24c9e77c0 6b5c9d5dd290	Karl Schimpf	Merge branch 'master' of http://llvm.org/git/llvm into readfac1	Apr 9 2015, 9:51 AM
19e24c9e77c0	b654d403b674	7a117b906196	Karl Schimpf	Fix StreamingMemoryObject based on Rafael's comments.	Apr 9 2015, 9:43 AM
7a117b906196	6c3cfdecb837	8eddbbc603dc cd13a3808a22	Karl Schimpf	Merge branch 'master' of http://llvm.org/git/llvm into readfac1	Apr 8 2015, 11:12 AM
8eddbbc603dc	15947275b1c6	fc1c6a8e568a	Karl Schimpf	Fix nit.	Apr 8 2015, 11:11 AM
fc1c6a8e568a	c8d19fa5f221	db835c00f8bb	Karl Schimpf	Cleaned up code.	Apr 8 2015, 11:07 AM
db835c00f8bb	9caa425c456f	45396af2fdfd	Karl Schimpf	Cleanup startParse.	Apr 8 2015, 9:47 AM
45396af2fdfd	ac2e8173502f	8daa676f74d2	Karl Schimpf	Working version. Save state.	Apr 8 2015, 8:51 AM
8daa676f74d2	2775d9b566c3	07dd5b215789 e17e7a2400df	Karl Schimpf	Merge branch 'master' of http://llvm.org/git/llvm into readfac1	Apr 6 2015, 9:17 AM
07dd5b215789	ae70c05d27df	012b38438e94	Karl Schimpf	Refactor bitcode reader to simplify control. (Show More…)	Apr 1 2015, 2:23 PM
012b38438e94	0bd1f195df6c	f279e19a390c a066ed09db37	Karl Schimpf	Merge branch 'master' of http://llvm.org/git/llvm into readfac1	Apr 1 2015, 1:48 PM
f279e19a390c	d8b4f0ff2a90	aa223716873a	Karl Schimpf	Finish code review.	Apr 1 2015, 12:57 PM
aa223716873a	cfae04b4b9fa	7016aa1cc1c8	Karl Schimpf	Clean up saving state.	Apr 1 2015, 12:12 PM
7016aa1cc1c8	e1f3d9ce6f28	da1b585479bf	Karl Schimpf	Make MaterializeModule not be recursive.	Apr 1 2015, 11:03 AM
da1b585479bf	da68d2183094	4b12b9cc327f	Karl Schimpf	Remove tracing code.	Mar 31 2015, 3:42 PM
4b12b9cc327f	a237c5cf18f7	5e41a7399c2a	Karl Schimpf	Working code for all tests in check.	Mar 31 2015, 3:33 PM
5e41a7399c2a	b3ff737b87af	0e8bafe3e45b	Karl Schimpf	Save state to test what is happening in master.	Mar 31 2015, 12:41 PM
0e8bafe3e45b	ef1f136464a0	9864b616c701	Karl Schimpf	Modify parsing of bitcode files to stop after first (safe) skipped block.	Mar 27 2015, 11:13 AM
9864b616c701	030c35e525f8	3f972ab1bf37	Karl Schimpf	Save start.	Mar 25 2015, 3:14 PM

Diff 26923

include/llvm/Bitcode/BitstreamReader.h

[...]	void fillCurWord() {
// Read the next word from the stream.		// Read the next word from the stream.
uint8_t Array[sizeof(word_t)] = {0};		uint8_t Array[sizeof(word_t)] = {0};

uint64_t BytesRead =		uint64_t BytesRead =
BitStream->getBitcodeBytes().readBytes(Array, sizeof(Array), NextChar);		BitStream->getBitcodeBytes().readBytes(Array, sizeof(Array), NextChar);

// If we run out of data, stop at the end of the stream.		// If we run out of data, stop at the end of the stream.
if (BytesRead == 0) {		if (BytesRead == 0) {
Size = NextChar;		Size = NextChar;
		kschimpf: This code fixes the state of the bit streamer when no more input is found. As a result, method AtEndOfStream now works correctly.
		CurWord = 0;
		BitsInCurWord = 0;
return;		return;
}		}

CurWord =		CurWord =
support::endian::read<word_t, support::little, support::unaligned>(		support::endian::read<word_t, support::little, support::unaligned>(
Array);		Array);
NextChar += BytesRead;		NextChar += BytesRead;
BitsInCurWord = BytesRead * 8;		BitsInCurWord = BytesRead * 8;
[...]

lib/Bitcode/Reader/BitcodeReader.cpp

[...]
#include "llvm/IR/OperandTraits.h"		#include "llvm/IR/OperandTraits.h"
#include "llvm/IR/Operator.h"		#include "llvm/IR/Operator.h"
#include "llvm/IR/ValueHandle.h"		#include "llvm/IR/ValueHandle.h"
#include "llvm/Support/DataStream.h"		#include "llvm/Support/DataStream.h"
#include "llvm/Support/ManagedStatic.h"		#include "llvm/Support/ManagedStatic.h"
#include "llvm/Support/MathExtras.h"		#include "llvm/Support/MathExtras.h"
#include "llvm/Support/MemoryBuffer.h"		#include "llvm/Support/MemoryBuffer.h"
#include "llvm/Support/raw_ostream.h"		#include "llvm/Support/raw_ostream.h"
		#include "llvm/Support/Format.h"
#include <deque>		#include <deque>
using namespace llvm;		using namespace llvm;

namespace {		namespace {
enum {		enum {
SWITCH_INST_MAGIC = 0x4B5 // May 2012 => 1205 => Hex		SWITCH_INST_MAGIC = 0x4B5 // May 2012 => 1205 => Hex
};		};

[...]	public:
void AssignValue(Metadata *MD, unsigned Idx);		void AssignValue(Metadata *MD, unsigned Idx);
void tryToResolveCycles();		void tryToResolveCycles();
};		};

class BitcodeReader : public GVMaterializer {		class BitcodeReader : public GVMaterializer {
LLVMContext &Context;		LLVMContext &Context;
DiagnosticHandlerFunction DiagnosticHandler;		DiagnosticHandlerFunction DiagnosticHandler;
Module *TheModule;		Module *TheModule;
		// The following two fields define the type of memory to parse.
std::unique_ptr<MemoryBuffer> Buffer;		std::unique_ptr<MemoryBuffer> Buffer;
		DataStreamer *Streamer;
std::unique_ptr<BitstreamReader> StreamFile;		std::unique_ptr<BitstreamReader> StreamFile;
BitstreamCursor Stream;		BitstreamCursor Stream;
DataStreamer *LazyStreamer;
uint64_t NextUnreadBit;
bool SeenValueSymbolTable;		bool SeenValueSymbolTable;

std::vector<Type*> TypeList;		std::vector<Type*> TypeList;
BitcodeReaderValueList ValueList;		BitcodeReaderValueList ValueList;
BitcodeReaderMDValueList MDValueList;		BitcodeReaderMDValueList MDValueList;
std::vector<Comdat *> ComdatList;		std::vector<Comdat *> ComdatList;
SmallVector<Instruction *, 64> InstructionList;		SmallVector<Instruction *, 64> InstructionList;

[...]	class BitcodeReader : public GVMaterializer {
/// for a more compact encoding. Some instruction operands are not		/// for a more compact encoding. Some instruction operands are not
/// relative to the instruction ID: basic block numbers, and types.		/// relative to the instruction ID: basic block numbers, and types.
/// Once the old style function blocks have been phased out, we would		/// Once the old style function blocks have been phased out, we would
/// not need this flag.		/// not need this flag.
bool UseRelativeIDs;		bool UseRelativeIDs;

/// True if all functions will be materialized, negating the need to process		/// True if all functions will be materialized, negating the need to process
/// (e.g.) blockaddress forward references.		/// (e.g.) blockaddress forward references.
bool WillMaterializeAllForwardRefs;		bool WillMaterializeAllForwardRefs = false;

/// Functions that have block addresses taken. This is usually empty.		/// Functions that have block addresses taken. This is usually empty.
SmallPtrSet<const Function *, 4> BlockAddressesTaken;		SmallPtrSet<const Function *, 4> BlockAddressesTaken;

/// True if any Metadata block has been materialized.		/// True if any Metadata block has been materialized.
bool IsMetadataMaterialized;		bool IsMetadataMaterialized = false;

		/// True if meta data should initially be skipped.
		bool ShouldLazyLoadMetadata = false;

		/// The name of state of the parse. Along with NextUnreadBit, they
		/// define the state of the parse between calls to continueParse().
		enum BitcodeReaderState {
		AtStart,
		AtTopLevel, // Processing top-level records.
		InsideModule, // Processing records inside a module block.
		// All states below here represent cases where input shouldn't be parsed.
		NoMoreInput, // Generic marker for having parsed input.
		ReachedEof, // Parsed input, but not necessary materializations.
		FinishedParse, // Parsed input and materialized necessary parts.
		ParseError, // An error has occurred, stop parsing.
		jvoung: Be consistent about capitalization in comments: one line starts with "parsed input" and another line starts with "Parsed input" =)
		kschimpf: Done.
		} ParseState = AtStart;

		/// The position (within the bitcode) where continueParse() left off, and used
		/// to set input position on the next call to continueParse().
		uint64_t NextUnreadBit = 0;

		/// The number of modules read at the top level.
		size_t NumModulesParsed = 0;
		jvoung: nit: This is usually 0 or 1, but it seems unexpected for this field to be named "NumModulesParsed", and yet have the type be "bool". Rename or change type?
		kschimpf: Good catch. I did mean to use size_t. Fixing.

bool StripDebugInfo = false;		bool StripDebugInfo = false;

public:		public:
std::error_code Error(BitcodeError E, const Twine &Message);		std::error_code Error(BitcodeError E, const Twine &Message);
std::error_code Error(BitcodeError E);		std::error_code Error(BitcodeError E);
std::error_code Error(const Twine &Message);		std::error_code Error(const Twine &Message);

explicit BitcodeReader(MemoryBuffer *buffer, LLVMContext &C,		BitcodeReader(MemoryBuffer *Buffer, LLVMContext &C,
		jvoung: Don't need explicit anymore, though that transition was a while back so not really related to this CL.
		kschimpf: Done.
DiagnosticHandlerFunction DiagnosticHandler);		DiagnosticHandlerFunction DiagnosticHandler);
explicit BitcodeReader(DataStreamer *streamer, LLVMContext &C,		BitcodeReader(DataStreamer *Streamer, LLVMContext &C,
DiagnosticHandlerFunction DiagnosticHandler);		DiagnosticHandlerFunction DiagnosticHandler);
~BitcodeReader() override { FreeState(); }		~BitcodeReader() override { FreeState(); }

std::error_code materializeForwardReferencedFunctions();		std::error_code materializeForwardReferencedFunctions();

void FreeState();		void FreeState();

void releaseBuffer();		void releaseBuffer();

bool isDematerializable(const GlobalValue *GV) const override;		bool isDematerializable(const GlobalValue *GV) const override;
std::error_code materialize(GlobalValue *GV) override;		std::error_code materialize(GlobalValue *GV) override;
std::error_code materializeModule(Module *M) override;		std::error_code materializeModule(Module *M) override;
std::vector<StructType *> getIdentifiedStructTypes() const override;		std::vector<StructType *> getIdentifiedStructTypes() const override;
void dematerialize(GlobalValue *GV) override;		void dematerialize(GlobalValue *GV) override;

/// @brief Main interface to parsing a bitcode buffer.		/// \brief Starts parse of bitcode. Materializes during parse based on flags.
		jvoung: http://llvm.org/docs/CodingStandards.html#doxygen-use-in-documentation-comments says "\brief" instead of @brief
		jvoung: \returns
		kschimpf: Sorry, my fault. I followed the syntax of ConstantPlaceHolder below. Changing to follow coding standards.
		kschimpf: Done.
/// @returns true if an error occurred.		///
std::error_code ParseBitcodeInto(Module *M,		/// \param M the module to build.
bool ShouldLazyLoadMetadata = false);		/// \param ShouldMaterializeAll true when the module should be materialized
		/// completely before returning. Otherwise, function bodies are only loaded on
		/// demand.
		/// \param ShouldLazyLoadMetadata true when the metadata blocks should be
		/// parsed.
		jvoung: lowercase first letter of function name -- should probably do it for these new functions since you're touching it (but leave existing functions alone?)
		kschimpf: Good point. Fixing.
		///
		jvoung: Variable name is different from comment "MaterializeAll" vs "ShouldMaterializeAll" -- make them the same? I see that in the actual definition you're trying to avoid conflicting with the field name...
		kschimpf: Fixing the names to be consistent. Fixing name conflict by prefixing assignments of field names with "this->".
		/// \returns true if an error occurred.
		std::error_code parseBitcodeInto(Module *M,
		bool ShouldMaterializeAll,
		bool ShouldLazyLoadMetadata);
		rafael: Please commit the pure cleanup bits first: using \ instead of @, and starting functions with lowercase names.
		kschimpf: I assume this has already been done. The new CL doesn't have such cases anymore.

/// @brief Cheap mechanism to just extract module triple		/// Cheap mechanism to just extract module triple.
		filcab: Why the empty line?
		kschimpf: Removed.
/// @returns true if an error occurred.		/// \returns true if an error occurred.
ErrorOr<std::string> parseTriple();		ErrorOr<std::string> parseTriple();

static uint64_t decodeSignRotatedValue(uint64_t V);		static uint64_t decodeSignRotatedValue(uint64_t V);

/// Materialize any deferred Metadata block.		/// Materialize any deferred Metadata block.
std::error_code materializeMetadata() override;		std::error_code materializeMetadata() override;

void setStripDebugInfo() override;		void setStripDebugInfo() override;

private:		private:
		filcab: http://llvm.org/docs/CodingStandards.html#doxygen-use-in-documentation-comments ^^ This changed recently. Omit \brief if the brief description is just a sentence. (You might want to add the '.' at the end, though.)
std::vector<StructType *> IdentifiedStructTypes;		std::vector<StructType *> IdentifiedStructTypes;
StructType *createIdentifiedStructType(LLVMContext &Context, StringRef Name);		StructType *createIdentifiedStructType(LLVMContext &Context, StringRef Name);
StructType *createIdentifiedStructType(LLVMContext &Context);		StructType *createIdentifiedStructType(LLVMContext &Context);

Type *getTypeByID(unsigned ID);		Type *getTypeByID(unsigned ID);
Value *getFnValueByID(unsigned ID, Type *Ty) {		Value *getFnValueByID(unsigned ID, Type *Ty) {
if (Ty && Ty->isMetadataTy())		if (Ty && Ty->isMetadataTy())
return MetadataAsValue::get(Ty->getContext(), getFnMetadataByID(ID));		return MetadataAsValue::get(Ty->getContext(), getFnMetadataByID(ID));
[73 lines hidden]	Value *getValueSigned(SmallVectorImpl<uint64_t> &Record, unsigned Slot,
if (Slot == Record.size()) return nullptr;		if (Slot == Record.size()) return nullptr;
unsigned ValNo = (unsigned)decodeSignRotatedValue(Record[Slot]);		unsigned ValNo = (unsigned)decodeSignRotatedValue(Record[Slot]);
// Adjust the ValNo, if it was encoded relative to the InstNum.		// Adjust the ValNo, if it was encoded relative to the InstNum.
if (UseRelativeIDs)		if (UseRelativeIDs)
ValNo = InstNum - ValNo;		ValNo = InstNum - ValNo;
return getFnValueByID(ValNo, Ty);		return getFnValueByID(ValNo, Ty);
}		}

		/// \name Functions that parses bitcode files, other than skipped blocks based
		/// on flags to parseBitcodeInto().
		/// @{
		jvoung: lowercase first letter for new function (and I guess update the commit message if you do)
		kschimpf: Done.
		std::error_code startParse();
		std::error_code continueParse();
		std::error_code finishParse();
		/// @}
		jvoung: same
		kschimpf: Done.

		// Changes the parse state to the new value.
		void setParseState(BitcodeReaderState NewValue) {
		NextUnreadBit = Stream.GetCurrentBitNo();
		ParseState = NewValue;
		}
		jvoung: extra space in between updateParseState and (

		// Changes the parse state to ParseError if given an error.
		jvoung: In the review, I've tried to look for where ParseState gets set, and there are various ways to grep for that... one is ParseState = X... another is updateParseState(Y, ...), or updateParseState(Z); Is there any way you can make the number of variations smaller?
		kschimpf: I guess the first part of the problem is that there are 2 notions of parse state: 1) the field ParseState, which names the state of the parser, and 2) NextUnreadBit, which defines where to continue the parse on return (in case a function body gets parsed between calls). I also overloaded the return value with this update. Refactoring to do less and be more clear.
		void setParseStateIfError(std::error_code EC) {
		NextUnreadBit = Stream.GetCurrentBitNo();
		if (EC)
		ParseState = ParseError;
		}
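
The two notions of parse state discussed above (the ParseState field and NextUnreadBit) can be kept in step by routing every change through one or two setters. The following is a minimal, self-contained sketch of that idea, not the patch's actual code: `MiniReader`, `MiniCursor`, and `demoSetters` are hypothetical names, and the enum lists only a subset of states for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <system_error>

// Hypothetical subset of the reader states named in the diff.
enum class MiniState { AtStart, AtTopLevel, InsideModule, ParseError };

// Hypothetical cursor that only tracks a bit position.
struct MiniCursor {
  uint64_t BitNo = 0;
  uint64_t getCurrentBitNo() const { return BitNo; }
};

struct MiniReader {
  MiniCursor Stream;
  MiniState State = MiniState::AtStart;
  uint64_t NextUnreadBit = 0;

  // Changes the state and records where the next call should resume.
  void setParseState(MiniState NewValue) {
    NextUnreadBit = Stream.getCurrentBitNo();
    State = NewValue;
  }

  // Records the resume position; moves to ParseError only on failure.
  void setParseStateIfError(std::error_code EC) {
    NextUnreadBit = Stream.getCurrentBitNo();
    if (EC)
      State = MiniState::ParseError;
  }
};

// Small usage example: both setters refresh NextUnreadBit.
inline void demoSetters() {
  MiniReader R;
  R.Stream.BitNo = 32;
  R.setParseState(MiniState::AtTopLevel);
  assert(R.State == MiniState::AtTopLevel && R.NextUnreadBit == 32);

  R.Stream.BitNo = 96;
  R.setParseStateIfError(std::error_code()); // success: state unchanged
  assert(R.State == MiniState::AtTopLevel && R.NextUnreadBit == 96);
}
```

With only these two entry points, a search for state changes has two spellings to look for instead of several.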

/// Converts alignment exponent (i.e. power of two (or zero)) to the		/// Converts alignment exponent (i.e. power of two (or zero)) to the
/// corresponding alignment to use. If alignment is too large, returns		/// corresponding alignment to use. If alignment is too large, returns
/// a corresponding error code.		/// a corresponding error code.
std::error_code parseAlignmentValue(uint64_t Exponent, unsigned &Alignment);		std::error_code parseAlignmentValue(uint64_t Exponent, unsigned &Alignment);
std::error_code ParseAttrKind(uint64_t Code, Attribute::AttrKind *Kind);		std::error_code ParseAttrKind(uint64_t Code, Attribute::AttrKind *Kind);
std::error_code ParseModule(bool Resume, bool ShouldLazyLoadMetadata = false);		std::error_code ParseModule();
std::error_code ParseAttributeBlock();		std::error_code ParseAttributeBlock();
std::error_code ParseAttributeGroupBlock();		std::error_code ParseAttributeGroupBlock();
std::error_code ParseTypeTable();		std::error_code ParseTypeTable();
		jvoung: This is a bit weird to me... you have a bunch of functions called "updateParseState" but some variants modify ParseState and some variants don't.
		kschimpf: The refactoring is a bit better now. Hopefully good enough.
		jvoung: Thanks -- this is better. For a while I was also wondering how many places need to be aware of setting the state to ParseError, but I think it's just continueParse() because most/all searching for bit position, etc. goes through that.
		kschimpf: That is correct and was the intent. State updates (and bit positioning) are intentionally now localized to continueParse. The only exception is in ParseModule, which updates the state to indicate whether it returned without completing.
std::error_code ParseTypeTableBody();		std::error_code ParseTypeTableBody();

std::error_code ParseValueSymbolTable();		std::error_code ParseValueSymbolTable();
std::error_code ParseConstants();		std::error_code ParseConstants();
std::error_code RememberAndSkipFunctionBody();		std::error_code RememberAndSkipFunctionBody();
/// Save the positions of the Metadata blocks and skip parsing the blocks.		/// Save the positions of the Metadata blocks and skip parsing the blocks.
std::error_code rememberAndSkipMetadata();		std::error_code rememberAndSkipMetadata();
std::error_code ParseFunctionBody(Function *F);		std::error_code ParseFunctionBody(Function *F);
[33 lines hidden]

static std::error_code Error(DiagnosticHandlerFunction DiagnosticHandler,		static std::error_code Error(DiagnosticHandlerFunction DiagnosticHandler,
const Twine &Message) {		const Twine &Message) {
return Error(DiagnosticHandler,		return Error(DiagnosticHandler,
make_error_code(BitcodeError::CorruptedBitcode), Message);		make_error_code(BitcodeError::CorruptedBitcode), Message);
}		}

std::error_code BitcodeReader::Error(BitcodeError E, const Twine &Message) {		std::error_code BitcodeReader::Error(BitcodeError E, const Twine &Message) {
		setParseState(ParseError);
return ::Error(DiagnosticHandler, make_error_code(E), Message);		return ::Error(DiagnosticHandler, make_error_code(E), Message);
}		}

std::error_code BitcodeReader::Error(const Twine &Message) {		std::error_code BitcodeReader::Error(const Twine &Message) {
		setParseState(ParseError);
return ::Error(DiagnosticHandler,		return ::Error(DiagnosticHandler,
make_error_code(BitcodeError::CorruptedBitcode), Message);		make_error_code(BitcodeError::CorruptedBitcode), Message);
}		}

std::error_code BitcodeReader::Error(BitcodeError E) {		std::error_code BitcodeReader::Error(BitcodeError E) {
		setParseState(ParseError);
return ::Error(DiagnosticHandler, make_error_code(E));		return ::Error(DiagnosticHandler, make_error_code(E));
}		}

static DiagnosticHandlerFunction getDiagHandler(DiagnosticHandlerFunction F,		static DiagnosticHandlerFunction getDiagHandler(DiagnosticHandlerFunction F,
LLVMContext &C) {		LLVMContext &C) {
if (F)		if (F)
return F;		return F;
return [&C](const DiagnosticInfo &DI) { C.diagnose(DI); };		return [&C](const DiagnosticInfo &DI) { C.diagnose(DI); };
}		}

BitcodeReader::BitcodeReader(MemoryBuffer *buffer, LLVMContext &C,		BitcodeReader::BitcodeReader(MemoryBuffer *Buffer, LLVMContext &C,
DiagnosticHandlerFunction DiagnosticHandler)		DiagnosticHandlerFunction DiagnosticHandler)
: Context(C), DiagnosticHandler(getDiagHandler(DiagnosticHandler, C)),		: Context(C), DiagnosticHandler(getDiagHandler(DiagnosticHandler, C)),
TheModule(nullptr), Buffer(buffer), LazyStreamer(nullptr),		TheModule(nullptr), Buffer(Buffer), Streamer(nullptr),
NextUnreadBit(0), SeenValueSymbolTable(false), ValueList(C),		SeenValueSymbolTable(false), ValueList(C),
MDValueList(C), SeenFirstFunctionBody(false), UseRelativeIDs(false),		MDValueList(C), SeenFirstFunctionBody(false), UseRelativeIDs(false) {}
WillMaterializeAllForwardRefs(false), IsMetadataMaterialized(false) {}

BitcodeReader::BitcodeReader(DataStreamer *streamer, LLVMContext &C,		BitcodeReader::BitcodeReader(DataStreamer *Streamer, LLVMContext &C,
DiagnosticHandlerFunction DiagnosticHandler)		DiagnosticHandlerFunction DiagnosticHandler)
: Context(C), DiagnosticHandler(getDiagHandler(DiagnosticHandler, C)),		: Context(C), DiagnosticHandler(getDiagHandler(DiagnosticHandler, C)),
TheModule(nullptr), Buffer(nullptr), LazyStreamer(streamer),		TheModule(nullptr), Buffer(nullptr), Streamer(Streamer),
NextUnreadBit(0), SeenValueSymbolTable(false), ValueList(C),		SeenValueSymbolTable(false), ValueList(C),
MDValueList(C), SeenFirstFunctionBody(false), UseRelativeIDs(false),		MDValueList(C), SeenFirstFunctionBody(false), UseRelativeIDs(false) {}
WillMaterializeAllForwardRefs(false), IsMetadataMaterialized(false) {}

std::error_code BitcodeReader::materializeForwardReferencedFunctions() {		std::error_code BitcodeReader::materializeForwardReferencedFunctions() {
if (WillMaterializeAllForwardRefs)		if (WillMaterializeAllForwardRefs)
return std::error_code();		return std::error_code();

// Prevent recursion.		// Prevent recursion.
WillMaterializeAllForwardRefs = true;		WillMaterializeAllForwardRefs = true;

[257 lines hidden]	static void UpgradeDLLImportExportLinkage(llvm::GlobalValue *GV, unsigned Val) {
switch (Val) {		switch (Val) {
case 5: GV->setDLLStorageClass(GlobalValue::DLLImportStorageClass); break;		case 5: GV->setDLLStorageClass(GlobalValue::DLLImportStorageClass); break;
case 6: GV->setDLLStorageClass(GlobalValue::DLLExportStorageClass); break;		case 6: GV->setDLLStorageClass(GlobalValue::DLLExportStorageClass); break;
}		}
}		}

namespace llvm {		namespace llvm {
namespace {		namespace {
/// @brief A class for maintaining the slot number definition		/// A class for maintaining the slot number definition
		filcab: Omit \brief.
		kschimpf: Done.
/// as a placeholder for the actual definition for forward constants defs.		/// as a placeholder for the actual definition for forward constants defs.
		filcab: Nit: Put more words on the first line.
		kschimpf: Done.
class ConstantPlaceHolder : public ConstantExpr {		class ConstantPlaceHolder : public ConstantExpr {
void operator=(const ConstantPlaceHolder &) = delete;		void operator=(const ConstantPlaceHolder &) = delete;
public:		public:
// allocate space for exactly one operand		/// allocate space for exactly one operand
		filcab: Nit: If it's for docs, it's probably best to start with an uppercase letter.
		kschimpf: Done.
void *operator new(size_t s) {		void *operator new(size_t s) {
return User::operator new(s, 1);		return User::operator new(s, 1);
}		}
explicit ConstantPlaceHolder(Type *Ty, LLVMContext& Context)		explicit ConstantPlaceHolder(Type *Ty, LLVMContext& Context)
: ConstantExpr(Ty, Instruction::UserOp1, &Op<0>(), 1) {		: ConstantExpr(Ty, Instruction::UserOp1, &Op<0>(), 1) {
Op<0>() = UndefValue::get(Type::getInt32Ty(Context));		Op<0>() = UndefValue::get(Type::getInt32Ty(Context));
}		}

/// @brief Methods to support type inquiry through isa, cast, and dyn_cast.		/// Methods to support type inquiry through isa, cast, and dyn_cast.
		filcab: Omit \brief.
		kschimpf: Done.
static bool classof(const Value *V) {		static bool classof(const Value *V) {
return isa<ConstantExpr>(V) &&		return isa<ConstantExpr>(V) &&
cast<ConstantExpr>(V)->getOpcode() == Instruction::UserOp1;		cast<ConstantExpr>(V)->getOpcode() == Instruction::UserOp1;
}		}


/// Provide fast operand accessors		/// Provide fast operand accessors
DECLARE_TRANSPARENT_OPERAND_ACCESSORS(Value);		DECLARE_TRANSPARENT_OPERAND_ACCESSORS(Value);
[1,963 lines hidden]	std::error_code BitcodeReader::GlobalCleanup() {

// Force deallocation of memory for these vectors to favor the client that		// Force deallocation of memory for these vectors to favor the client that
// want lazy deserialization.		// want lazy deserialization.
std::vector<std::pair<GlobalVariable*, unsigned> >().swap(GlobalInits);		std::vector<std::pair<GlobalVariable*, unsigned> >().swap(GlobalInits);
std::vector<std::pair<GlobalAlias*, unsigned> >().swap(AliasInits);		std::vector<std::pair<GlobalAlias*, unsigned> >().swap(AliasInits);
return std::error_code();		return std::error_code();
}		}

std::error_code BitcodeReader::ParseModule(bool Resume,		std::error_code BitcodeReader::ParseModule() {
bool ShouldLazyLoadMetadata) {		if (ParseState == AtTopLevel) {
if (Resume)		if (Stream.EnterSubBlock(bitc::MODULE_BLOCK_ID))
Stream.JumpToBit(NextUnreadBit);
else if (Stream.EnterSubBlock(bitc::MODULE_BLOCK_ID))
return Error("Invalid record");		return Error("Invalid record");
		setParseState(InsideModule);
		} else {
		assert(ParseState == InsideModule);
		}

SmallVector<uint64_t, 64> Record;		SmallVector<uint64_t, 64> Record;
std::vector<std::string> SectionTable;		std::vector<std::string> SectionTable;
std::vector<std::string> GCTable;		std::vector<std::string> GCTable;

// Read all the records for this module.		// Read all the records for this module.
while (1) {		while (1) {
BitstreamEntry Entry = Stream.advance();		BitstreamEntry Entry = Stream.advance();

switch (Entry.Kind) {		switch (Entry.Kind) {
case BitstreamEntry::Error:		case BitstreamEntry::Error:
return Error("Malformed block");		return Error("Malformed block");
case BitstreamEntry::EndBlock:		case BitstreamEntry::EndBlock:
		setParseState(AtTopLevel);
return GlobalCleanup();		return GlobalCleanup();

case BitstreamEntry::SubBlock:		case BitstreamEntry::SubBlock:
switch (Entry.ID) {		switch (Entry.ID) {
default: // Skip unknown content.		default: // Skip unknown content.
if (Stream.SkipBlock())		if (Stream.SkipBlock())
return Error("Invalid record");		return Error("Invalid record");
break;		break;
[41 lines hidden]	case BitstreamEntry::SubBlock:
std::reverse(FunctionsWithBodies.begin(), FunctionsWithBodies.end());		std::reverse(FunctionsWithBodies.begin(), FunctionsWithBodies.end());
if (std::error_code EC = GlobalCleanup())		if (std::error_code EC = GlobalCleanup())
return EC;		return EC;
SeenFirstFunctionBody = true;		SeenFirstFunctionBody = true;
}		}

if (std::error_code EC = RememberAndSkipFunctionBody())		if (std::error_code EC = RememberAndSkipFunctionBody())
return EC;		return EC;
// For streaming bitcode, suspend parsing when we reach the function		// Suspend parsing when we reach a function body, assuming we
// bodies. Subsequent materialization calls will resume it when		// have already associated names with global values. Note: If
		jvoung: NOte -> Note
		kschimpf: Done.
// necessary. For streaming, the function bodies must be at the end of		// the bitcode file is old, the symbol table will be at the
// the bitcode. If the bitcode file is old, the symbol table will be		// end instead and will not have been seen yet.
// at the end instead and will not have been seen yet. In this case,		if (SeenValueSymbolTable)
// just finish the parse now.
if (LazyStreamer && SeenValueSymbolTable) {
NextUnreadBit = Stream.GetCurrentBitNo();
jvoung: NextUnreadBit is no longer set -- wanted to check if that is okay now (and why)?
return std::error_code();		return std::error_code();
}
break;		break;
case bitc::USELIST_BLOCK_ID:		case bitc::USELIST_BLOCK_ID:
if (std::error_code EC = ParseUseLists())		if (std::error_code EC = ParseUseLists())
return EC;		return EC;
break;		break;
}		}
continue;		continue;

case BitstreamEntry::Record:		case BitstreamEntry::Record:
// The interesting case.		// The interesting case.
break;		break;
}		}


// Read a record.		// Read a record.
switch (Stream.readRecord(Entry.ID, Record)) {		switch (Stream.readRecord(Entry.ID, Record)) {
default: break; // Default behavior, ignore unknown content.		default: break; // Default behavior, ignore unknown content.
case bitc::MODULE_CODE_VERSION: { // VERSION: [version#]		case bitc::MODULE_CODE_VERSION: { // VERSION: [version#]
if (Record.size() < 1)		if (Record.size() < 1)
return Error("Invalid record");		return Error("Invalid record");
// Only version #0 and #1 are supported so far.		// Only version #0 and #1 are supported so far.
unsigned module_version = Record[0];		unsigned module_version = Record[0];
[217 lines hidden]	case bitc::MODULE_CODE_FUNCTION: {

ValueList.push_back(Func);		ValueList.push_back(Func);

// If this is a function with a body, remember the prototype we are		// If this is a function with a body, remember the prototype we are
// creating now, so that we can match up the body with them later.		// creating now, so that we can match up the body with them later.
if (!isProto) {		if (!isProto) {
Func->setIsMaterializable(true);		Func->setIsMaterializable(true);
FunctionsWithBodies.push_back(Func);		FunctionsWithBodies.push_back(Func);
if (LazyStreamer)
DeferredFunctionInfo[Func] = 0;		DeferredFunctionInfo[Func] = 0;
}		}
break;		break;
}		}
// ALIAS: [alias type, aliasee val#, linkage]		// ALIAS: [alias type, aliasee val#, linkage]
// ALIAS: [alias type, aliasee val#, linkage, visibility, dllstorageclass]		// ALIAS: [alias type, aliasee val#, linkage, visibility, dllstorageclass]
case bitc::MODULE_CODE_ALIAS: {		case bitc::MODULE_CODE_ALIAS: {
if (Record.size() < 3)		if (Record.size() < 3)
return Error("Invalid record");		return Error("Invalid record");
[30 lines hidden]	case bitc::MODULE_CODE_PURGEVALS:
return Error("Invalid record");		return Error("Invalid record");
ValueList.shrinkTo(Record[0]);		ValueList.shrinkTo(Record[0]);
break;		break;
}		}
Record.clear();		Record.clear();
}		}
}		}

std::error_code BitcodeReader::ParseBitcodeInto(Module *M,		std::error_code BitcodeReader::parseBitcodeInto(Module *M,
		bool ShouldMaterializeAll,
bool ShouldLazyLoadMetadata) {		bool ShouldLazyLoadMetadata) {
TheModule = nullptr;		auto cleanupOnError = [&](std::error_code EC) {
		releaseBuffer(); // Never take ownership on error.
		return EC;
		};

		TheModule = M;
		this->ShouldLazyLoadMetadata = ShouldLazyLoadMetadata;

		if (std::error_code EC = startParse())
		return cleanupOnError(EC);

		if (ShouldMaterializeAll) {
		if (std::error_code EC = materializeModule(TheModule))
		return cleanupOnError(EC);
		jvoung: Does this need to be cleanupOnError(EC) also?
		kschimpf: Yes. Good catch. Fixing.
		} else {
		if (std::error_code EC = materializeForwardReferencedFunctions())
		return cleanupOnError(EC);
		}

		return std::error_code();
		}
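
The cleanupOnError lambda above funnels every failing return through one place, which is why a return that bypasses it (as raised in the inline comment) is easy to spot. Below is a small standalone sketch of the same pattern. `MiniLoader`, its `Released` flag, and `demoCleanup` are hypothetical, and each parse step is reduced to an error code passed in by the caller.

```cpp
#include <cassert>
#include <system_error>

// Hypothetical loader; Released records whether the buffer was given back.
struct MiniLoader {
  bool Released = false;
  void releaseBuffer() { Released = true; }

  // StartResult and MaterializeResult stand in for the results of the
  // real parsing steps, so the control flow can be exercised in isolation.
  std::error_code load(std::error_code StartResult,
                       std::error_code MaterializeResult) {
    // Every error return goes through this lambda, so ownership of the
    // buffer is never kept on failure.
    auto cleanupOnError = [&](std::error_code EC) {
      releaseBuffer();
      return EC;
    };

    if (StartResult)
      return cleanupOnError(StartResult);
    if (MaterializeResult)
      return cleanupOnError(MaterializeResult);
    return std::error_code();
  }
};

// Small usage example: success keeps the buffer, failure releases it.
inline void demoCleanup() {
  MiniLoader Ok;
  bool OkFailed = static_cast<bool>(Ok.load(std::error_code(), std::error_code()));
  assert(!OkFailed && !Ok.Released);

  MiniLoader Bad;
  bool BadFailed = static_cast<bool>(
      Bad.load(std::error_code(), std::make_error_code(std::errc::io_error)));
  assert(BadFailed && Bad.Released);
}
```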

		std::error_code BitcodeReader::startParse() {
		assert(ParseState == AtStart);

if (std::error_code EC = InitStream())		if (std::error_code EC = InitStream())
return EC;		return EC;

// Sniff for the signature.		// Sniff for the signature.
if (Stream.Read(8) != 'B' ||		if (Stream.Read(8) != 'B' ||
Stream.Read(8) != 'C' ||		Stream.Read(8) != 'C' ||
Stream.Read(4) != 0x0 ||		Stream.Read(4) != 0x0 ||
Stream.Read(4) != 0xC ||		Stream.Read(4) != 0xC ||
Stream.Read(4) != 0xE ||		Stream.Read(4) != 0xE ||
Stream.Read(4) != 0xD)		Stream.Read(4) != 0xD)
return Error("Invalid bitcode signature");		return Error("Invalid bitcode signature");

		return continueParse();
		}

		std::error_code BitcodeReader::continueParse() {
		switch (ParseState) {
		case AtStart:
		setParseState(AtTopLevel);
		break;
		case AtTopLevel:
		// Restore input position to saved position on last call.
		Stream.JumpToBit(NextUnreadBit);
		break;
		case InsideModule: {
		// Restore input position to saved position on last call,
		// and then continue parsing module.
		jvoung: This covers two states? InsideModule and AtTopLevel? It might be more clear if you list them out so it's clear what the "break" corresponds to (AtTopLevel). Previously, the Stream.JumpToBit(NextUnreadBit); was only needed when InsideModule... is it now needed for AtTopLevel too?
		kschimpf: The JumpToBit is needed because various "materialize" methods may be called between calls to continueParse. By forcing a JumpToBit to happen at all calls to continueParse, we no longer need to know where the materialize methods leave the bit cursor. While I did not see an example of an error caused by interleaved calls to materialize, I was very suspicious that they could occur, and wanted to make sure that this would not happen. Hence, I made sure that continueParse always resets the position to where it left off. I will fix to not use default, so that a corresponding warning will be generated if a new value is added to the enumeration.
		Stream.JumpToBit(NextUnreadBit);
		std::error_code EC = ParseModule();
		setParseStateIfError(EC);
		return EC;
		}
		case NoMoreInput:
		case ReachedEof:
		case FinishedParse:
		return std::error_code();
		case ParseError:
		return Error("Can't continue, bitcode error already found");
		}

// We expect a number of well-defined blocks, though we don't necessarily		// We expect a number of well-defined blocks, though we don't necessarily
// need to understand them all.		// need to understand them all.
while (1) {		while (1) {
		assert(ParseState == AtTopLevel);

if (Stream.AtEndOfStream()) {		if (Stream.AtEndOfStream()) {
if (TheModule)		setParseState(ReachedEof);
return std::error_code();		return std::error_code();
// We didn't really read a proper Module.
return Error("Malformed IR file");
}		}

BitstreamEntry Entry =		BitstreamEntry Entry =
Stream.advance(BitstreamCursor::AF_DontAutoprocessAbbrevs);		Stream.advance(BitstreamCursor::AF_DontAutoprocessAbbrevs);

switch (Entry.Kind) {		switch (Entry.Kind) {
case BitstreamEntry::Error:		case BitstreamEntry::Error:
return Error("Malformed block");
case BitstreamEntry::EndBlock:		case BitstreamEntry::EndBlock:
		kschimpf: Discovered that the Bitstream::EndBlock was "hiding" a bug in the bitstream reader when processing a data stream. That is, when using a data stream, the size is not set until after the eof is reached. Hence, when Stream.AtEndOfStream() was called above, it would return false even when at the eof. The actual problem was in FillCurWord, which did not set the bit position correctly when there was no more input. The old code worked because the read (at eof) would return zero, which is understood as an end block. By returning success for this value, it would hide this problem. I also improved the error message so that one can see where the reader thought the eof should be, if there is miscellaneous stuff at the end of the bitcode file. This makes it easier to know where to cut a test file in such cases.
return std::error_code();		{
		filcab: We're already at top level, no? (line 3234) I might be missing something, but it looks like we're at top level, and saw an EndBlock. Shouldn't this be an error?
		kschimpf: I agree that we should be at top level. I also agree that it appears weird that we allow extra (unmatched) EndBlocks. This has been allowed by the bitcode reader/writer for years. I just wasn't willing to make the leap that I should remove this. However, I tried removing it (and making it an error), and no tests failed. Hence, converting this to an error.
		// Give bit address where error is found, so that it can be
		// easily repaired if is in an invalid test file.
		filcab: Thank you!
		std::string Buffer;
		raw_string_ostream StrBuf(Buffer);
		uint64_t Bit = Stream.GetCurrentBitNo();
		errs() << "Malformed IR file at bit " << format("%x", (Bit / CHAR_BIT))
		filcab: errs()? Or StrBuf? Please also add a test for this error message.
		kschimpf: Good catch. I meant StrBuf, so that we can use the same API for all errors.
		<< ":" << (Bit % CHAR_BIT);
		rafael: This looks a bit much to be honest. Corrupted files are not that common and it is trivial to set a breakpoint to find the state.
		kschimpf: Simplified in new CL to do same as before.
		return Error(StrBuf.str());
		}
case BitstreamEntry::SubBlock:		case BitstreamEntry::SubBlock:
switch (Entry.ID) {		switch (Entry.ID) {
case bitc::BLOCKINFO_BLOCK_ID:		case bitc::BLOCKINFO_BLOCK_ID:
if (Stream.ReadBlockInfoBlock())		if (Stream.ReadBlockInfoBlock())
return Error("Malformed block");		return Error("Malformed block");
break;		break;
case bitc::MODULE_BLOCK_ID:		case bitc::MODULE_BLOCK_ID: {
// Reject multiple MODULE_BLOCK's in a single bitstream.		// Reject multiple MODULE_BLOCK's in a single bitstream.
if (TheModule)		if (NumModulesParsed++)
		rafael: This is always 0 or 1. Use a boolean instead.
		kschimpf: This was already fixed in master.
return Error("Invalid multiple blocks");		return Error("Invalid multiple blocks");
TheModule = M;		std::error_code EC = ParseModule();
if (std::error_code EC = ParseModule(false, ShouldLazyLoadMetadata))		setParseStateIfError(EC);
return EC;		return EC;
if (LazyStreamer)		}
return std::error_code();
break;
default:		default:
if (Stream.SkipBlock())		if (Stream.SkipBlock())
return Error("Invalid record");		return Error("Invalid record");
break;		break;
}		}
continue;		continue;
case BitstreamEntry::Record:		case BitstreamEntry::Record:
// There should be no records in the top-level of blocks.		// There should be no records in the top-level of blocks.

// The ranlib in Xcode 4 will align archive members by appending newlines		// The ranlib in Xcode 4 will align archive members by appending newlines
// to the end of them. If this file size is a multiple of 4 but not 8, we		// to the end of them. If this file size is a multiple of 4 but not 8, we
// have to read and ignore these final 4 bytes :-(		// have to read and ignore these final 4 bytes :-(
if (Stream.getAbbrevIDWidth() == 2 && Entry.ID == 2 &&		if (Stream.getAbbrevIDWidth() == 2 && Entry.ID == 2 &&
Stream.Read(6) == 2 && Stream.Read(24) == 0xa0a0a &&		Stream.Read(6) == 2 && Stream.Read(24) == 0xa0a0a &&
Stream.AtEndOfStream())		Stream.AtEndOfStream()) {
		setParseState(ReachedEof);
return std::error_code();		return std::error_code();
		}

return Error("Invalid record");		return Error("Invalid record");
}		}
}		}
}		}
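
The reasoning in the inline reply above (always jump back to the saved position on entry, so it no longer matters where a materialize call left the cursor) can be illustrated without any LLVM types. This is a hedged sketch under simplifying assumptions: `MiniParser`, `peekAt`, and `demoResume` are hypothetical, the input is a vector of ints, and a `0` token stands in for "suspend here".

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical resumable parser over a token vector.
class MiniParser {
public:
  explicit MiniParser(std::vector<int> Tokens) : Input(std::move(Tokens)) {}

  // Parses until a 0 token or end of input. It seeks to the saved position
  // first, mirroring Stream.JumpToBit(NextUnreadBit) in the diff.
  // Returns true while unread input remains.
  bool continueParse() {
    Cursor = NextUnread;
    while (Cursor < Input.size()) {
      int Tok = Input[Cursor++];
      if (Tok == 0)
        break; // suspend, like stopping at a function body
      Sum += Tok;
    }
    NextUnread = Cursor; // remember where to pick up next time
    return NextUnread < Input.size();
  }

  // Stand-in for a materialize call that moves the shared cursor.
  int peekAt(std::size_t Pos) {
    Cursor = Pos;
    return Input[Cursor++];
  }

  int Sum = 0;

private:
  std::vector<int> Input;
  std::size_t Cursor = 0;
  std::size_t NextUnread = 0;
};

// Small usage example: an interleaved read does not disturb the resume point.
inline void demoResume() {
  MiniParser P({1, 2, 0, 3, 4});
  bool More = P.continueParse(); // consumes 1, 2 and the pause marker
  assert(More && P.Sum == 3);
  assert(P.peekAt(0) == 1);      // cursor now points somewhere unrelated
  More = P.continueParse();      // still resumes at the saved position
  assert(!More && P.Sum == 10);
}
```

Had continueParse trusted the cursor instead of the saved position, the second call would have re-read tokens from index 1.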

		std::error_code BitcodeReader::finishParse() {
		assert(TheModule);

		jvoung: I don't quite understand how "ShouldMaterializeAll = false" is supposed to work for the streaming case, if this isn't checked until after: while (ParseState < NoMoreInput) { if (std::error_code EC = ContinueParse()) { return EC; } } How do you delay reading until materialize(GV) for streaming?
		kschimpf: After talking to Derek, I realized what was the issue I was missing. I'll summarize what I understand: When streaming, we want to "return" as soon as possible, without having to force all bitcode to be scanned. This reduces the cost of (potential) blocking calls to the data streamer. Control can return to the caller without having completed the parse. However, the parsed portions must be consistent (i.e. forward block address references have been resolved). Based on this, I've modified the code to lift the materializeForwardReferencedFunctions into startParse.
		while (ParseState < NoMoreInput) {
		if (std::error_code EC = continueParse())
		return EC;
		}
		jvoung: This used to be early, in "getLazyBitcodeModuleImpl". This looks like it is now happening late in FinishParse after the loop until NoMoreInput. Why is this okay now? Was the early call to "materializeForwardReferencedFunctions" actually extraneous because of the call in materialize(GV), or what? Make sure to check the lazy case with blockaddresses for computed gotos, if there isn't already a unittest for that.
		kschimpf: I agree that there should be some type of forward reference unit test to verify we can lazy evaluate these forward-referenced block addresses. I also agree that for the use by llvm-dis, my original code worked because it eventually calls materializeAllPermanently. I think my confusion about the full expectation of "streamed" (or lazy) came from this.

		switch (ParseState) {
		case AtStart:
		case AtTopLevel:
		case InsideModule:
		llvm_unreachable("finishParse exits with ParseState < NoMoreInput");
		case NoMoreInput:
		case ReachedEof:
		setParseState(FinishedParse);
		break;
		case FinishedParse:
		jvoung: "NoMorInput" -> "NoMoreInput". It could also be that more states >= NoMoreInput are added as code evolves, but not handled here. Can the compiler accept/handle a "static_assert(ParseState < NoMoreInput, "...")" to catch what happens if more states are added after NoMoreInput but not handled by this switch?
		kschimpf (author): Fixed string. Also removed "default" case and made all states explicit. This will force a warning if a new state is added.
		break;
		case ParseError:
		return Error("Can't continue, bitcode error already found");
		}
		if (NumModulesParsed == 1)
		return std::error_code();
		// We didn't really read a proper Module.
		return Error("Malformed IR file");
		}
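
The exchange above about dropping the "default" case can be illustrated with a minimal, self-contained sketch. It is not the LLVM code: `ToyReader`, its `ChunksLeft` counter, and the string-based error return are hypothetical stand-ins, and only the state names are borrowed from the diff. The point it shows is that a switch listing every enumerator explicitly lets GCC and Clang flag a newly added state through `-Wswitch`.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the reader's states. The names mirror the diff,
// but the ordering and semantics here are assumptions for illustration.
enum class ParseState {
  AtStart, AtTopLevel, InsideModule,
  NoMoreInput, ReachedEof, FinishedParse, ParseError
};

// Toy reader: each continueParse() call consumes one "chunk" of input.
struct ToyReader {
  ParseState State = ParseState::AtStart;
  int ChunksLeft;
  explicit ToyReader(int Chunks) : ChunksLeft(Chunks) {}

  // Advance by one step; returns false if an error was already recorded.
  bool continueParse() {
    if (State == ParseState::ParseError)
      return false;
    if (State >= ParseState::NoMoreInput)
      return true; // Nothing left to read.
    if (ChunksLeft == 0) {
      State = ParseState::NoMoreInput;
      return true;
    }
    --ChunksLeft;
    State = ParseState::InsideModule;
    return true;
  }

  // Returns an empty string on success, otherwise an error message.
  std::string finishParse() {
    while (State < ParseState::NoMoreInput)
      if (!continueParse())
        return "parse error";
    // No "default" label: every enumerator is listed, so adding a new
    // state without handling it here triggers a -Wswitch warning.
    switch (State) {
    case ParseState::AtStart:
    case ParseState::AtTopLevel:
    case ParseState::InsideModule:
      assert(false && "loop above exits only with State >= NoMoreInput");
      return "unreachable";
    case ParseState::NoMoreInput:
    case ParseState::ReachedEof:
      State = ParseState::FinishedParse;
      return "";
    case ParseState::FinishedParse:
      return "";
    case ParseState::ParseError:
      return "Can't continue, bitcode error already found";
    }
    return "unreachable"; // Keeps compilers quiet about falling off the end.
  }
};
```

Calling `finishParse()` on a fresh `ToyReader` drains the remaining chunks and moves the state to `FinishedParse`; a second call is a no-op, and a reader already in `ParseError` reports the error without reading further.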

ErrorOr<std::string> BitcodeReader::parseModuleTriple() {		ErrorOr<std::string> BitcodeReader::parseModuleTriple() {
if (Stream.EnterSubBlock(bitc::MODULE_BLOCK_ID))		if (Stream.EnterSubBlock(bitc::MODULE_BLOCK_ID))
return Error("Invalid record");		return Error("Invalid record");

SmallVector<uint64_t, 64> Record;		SmallVector<uint64_t, 64> Record;

std::string Triple;		std::string Triple;
// Read all the records for this module.		// Read all the records for this module.
▲ Show 20 Lines • Show All 1,240 Lines • ▼ Show 20 Lines	OutOfRecordLoop:
return std::error_code();		return std::error_code();
}		}

/// Find the function body in the bitcode stream		/// Find the function body in the bitcode stream
std::error_code BitcodeReader::FindFunctionInStream(		std::error_code BitcodeReader::FindFunctionInStream(
Function *F,		Function *F,
DenseMap<Function *, uint64_t>::iterator DeferredFunctionInfoIterator) {		DenseMap<Function *, uint64_t>::iterator DeferredFunctionInfoIterator) {
while (DeferredFunctionInfoIterator->second == 0) {		while (DeferredFunctionInfoIterator->second == 0) {
if (Stream.AtEndOfStream())		if (ParseState >= NoMoreInput) {
return Error("Could not find function in stream");		return Error("Could not find function in stream");
// ParseModule will parse the next body in the stream and set its		}
// position in the DeferredFunctionInfo map.		if (std::error_code EC = continueParse()) {
if (std::error_code EC = ParseModule(true))
return EC;		return EC;
}		}
		}
return std::error_code();		return std::error_code();
}		}
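
As a rough illustration of the lookup loop in FindFunctionInStream, the sketch below models "keep parsing until the requested body's offset is recorded or the input runs out". Everything in it is hypothetical: `ToyLazyReader`, the string keys, and the made-up offsets stand in for the reader, `Function *` keys, and bit positions in the real code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical toy model: function bodies appear in stream order, and an
// offset of 0 means "declared, but body not seen yet".
struct ToyLazyReader {
  std::vector<std::string> BodiesInStream; // Body names, in stream order.
  std::size_t NextBody = 0;                // Index of the next unread body.
  std::map<std::string, std::uint64_t> DeferredOffset;

  bool noMoreInput() const { return NextBody >= BodiesInStream.size(); }

  // Read one more body from the "stream" and record a nonzero offset for it.
  void continueParse() {
    assert(!noMoreInput() && "caller must check for remaining input");
    const std::string &Name = BodiesInStream[NextBody];
    DeferredOffset[Name] = 100 * (NextBody + 1); // Made-up offset.
    ++NextBody;
  }

  // Returns true once Name's offset is known, false if the stream ran out.
  bool findFunctionInStream(const std::string &Name) {
    while (DeferredOffset[Name] == 0) {
      if (noMoreInput())
        return false; // "Could not find function in stream"
      continueParse();
    }
    return true;
  }
};
```

With bodies `f`, `g`, `h` in the stream, asking for `g` reads only `f` and `g` and leaves `h` unread, which is the behaviour the lazy path relies on; asking for a name that never appears drains the stream and fails.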

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// GVMaterializer implementation		// GVMaterializer implementation
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

void BitcodeReader::releaseBuffer() { Buffer.release(); }		void BitcodeReader::releaseBuffer() { Buffer.release(); }

std::error_code BitcodeReader::materialize(GlobalValue *GV) {		std::error_code BitcodeReader::materialize(GlobalValue *GV) {
if (std::error_code EC = materializeMetadata())		if (std::error_code EC = materializeMetadata())
return EC;		return EC;

Function *F = dyn_cast<Function>(GV);		Function *F = dyn_cast<Function>(GV);
// If it's not a function or is already material, ignore the request.		// If it's not a function or is already material, ignore the request.
if (!F || !F->isMaterializable())		if (!F || !F->isMaterializable())
return std::error_code();		return std::error_code();

DenseMap<Function*, uint64_t>::iterator DFII = DeferredFunctionInfo.find(F);		DenseMap<Function*, uint64_t>::iterator DFII = DeferredFunctionInfo.find(F);
assert(DFII != DeferredFunctionInfo.end() && "Deferred function not found!");		assert(DFII != DeferredFunctionInfo.end() && "Deferred function not found!");
// If its position is recorded as 0, its body is somewhere in the stream		// If its position is recorded as 0, its body is somewhere in the stream
// but we haven't seen it yet.		// but we haven't seen it yet.
if (DFII->second == 0 && LazyStreamer)		if (DFII->second == 0)
if (std::error_code EC = FindFunctionInStream(F, DFII))		if (std::error_code EC = FindFunctionInStream(F, DFII))
return EC;		return EC;

// Move the bit stream to the saved position of the deferred function body.		// Move the bit stream to the saved position of the deferred function body.
Stream.JumpToBit(DFII->second);		Stream.JumpToBit(DFII->second);

if (std::error_code EC = ParseFunctionBody(F))		if (std::error_code EC = ParseFunctionBody(F))
return EC;		return EC;
Show All 11 Lines	if (I->first != I->second) {
if (CallInst* CI = dyn_cast<CallInst>(*UI++))		if (CallInst* CI = dyn_cast<CallInst>(*UI++))
UpgradeIntrinsicCall(CI, I->second);		UpgradeIntrinsicCall(CI, I->second);
}		}
}		}
}		}

// Bring in any functions that this function forward-referenced via		// Bring in any functions that this function forward-referenced via
// blockaddresses.		// blockaddresses.
return materializeForwardReferencedFunctions();		return materializeForwardReferencedFunctions();
		jvoung: no need for extra space
		kschimpf (author): Done.
}		}

bool BitcodeReader::isDematerializable(const GlobalValue *GV) const {		bool BitcodeReader::isDematerializable(const GlobalValue *GV) const {
const Function *F = dyn_cast<Function>(GV);		const Function *F = dyn_cast<Function>(GV);
if (!F || F->isDeclaration())		if (!F || F->isDeclaration())
return false;		return false;

// Dematerializing F would leave dangling references that wouldn't be		// Dematerializing F would leave dangling references that wouldn't be
Show All 16 Lines	void BitcodeReader::dematerialize(GlobalValue *GV) {
F->dropAllReferences();		F->dropAllReferences();
F->setIsMaterializable(true);		F->setIsMaterializable(true);
}		}

std::error_code BitcodeReader::materializeModule(Module *M) {		std::error_code BitcodeReader::materializeModule(Module *M) {
assert(M == TheModule &&		assert(M == TheModule &&
"Can only Materialize the Module this BitcodeReader is attached to.");		"Can only Materialize the Module this BitcodeReader is attached to.");

if (std::error_code EC = materializeMetadata())		// Make sure the rest of the bits in the module (excluding materializable)
		// have been read.
		if (std::error_code EC = finishParse())
		kschimpf (author): Removing the comment about being after a function body. This is no longer true. A call to materializeMetadata would put us some place else in the bitcode file.
return EC;		return EC;

// Promise to materialize all forward references.		if (std::error_code EC = materializeMetadata())
WillMaterializeAllForwardRefs = true;		return EC;

		kschimpf (author): Moved the iteration over functions to before the call to finishParse. This deals with the problems I was having with tests in test/Bitcode/invalid.test (i.e. Inputs/invalid-fwdref-type-mismatch-2.bc and Inputs/invalid-load-ptr-type.bc). These two files had multiple errors: the intended one, inside a function body, and a probably unintended one, extraneous data at the end of the bitcode file. This removes the need for the command line flag UseOldLazyBitcodeParse, and I have deleted it.
// Iterate over the module, deserializing any functions that are still on		// Iterate over the module, deserializing any functions that are still on
		kschimpf (author): Now that the eof checking is fixed, I moved this back where it was in an earlier version of this CL.
// disk.		// disk.
for (Module::iterator F = TheModule->begin(), E = TheModule->end();		for (Module::iterator F = TheModule->begin(), E = TheModule->end();
F != E; ++F) {		F != E; ++F) {
if (std::error_code EC = materialize(F))		if (std::error_code EC = materialize(F))
return EC;		return EC;
}		}
// At this point, if there are any function bodies, the current bit is
// pointing to the END_BLOCK record after them. Now make sure the rest
// of the bits in the module have been read.
if (NextUnreadBit)
ParseModule(true);

		jvoung: Is this necessary at this point? Should that already be covered by the "// Iterate over the module, deserializing any functions that are still on disk" loop?
		kschimpf (author): In correct bitcode files, you are right. However, if the function doesn't define any function blocks, but (incorrectly) references function block addresses, this code will cause the error to be generated. However, looking at the following instruction, this is checked anyway. Removing.
// Check that all block address forward references got resolved (as we		// Check that all block address forward references got resolved.
		jvoung: The "promise" comment from "above" is removed now, so you could update this comment.
		kschimpf (author): Done.
// promised above).
if (!BasicBlockFwdRefs.empty())		if (!BasicBlockFwdRefs.empty())
return Error("Never resolved function from blockaddress");		return Error("Never resolved function from blockaddress");

// Upgrade any intrinsic calls that slipped through (should not happen!) and		// Upgrade any intrinsic calls that slipped through (should not happen!) and
// delete the old functions to clean up. We can't do this unless the entire		// delete the old functions to clean up. We can't do this unless the entire
// module is materialized because there could always be another function body		// module is materialized because there could always be another function body
// with calls to the old function.		// with calls to the old function.
for (std::vector<std::pair<Function*, Function*> >::iterator I =		for (std::vector<std::pair<Function*, Function*> >::iterator I =
Show All 18 Lines	std::error_code BitcodeReader::materializeModule(Module *M) {
return std::error_code();		return std::error_code();
}		}

std::vector<StructType *> BitcodeReader::getIdentifiedStructTypes() const {		std::vector<StructType *> BitcodeReader::getIdentifiedStructTypes() const {
return IdentifiedStructTypes;		return IdentifiedStructTypes;
}		}

std::error_code BitcodeReader::InitStream() {		std::error_code BitcodeReader::InitStream() {
if (LazyStreamer)		if (Streamer)
return InitLazyStream();		return InitLazyStream();
return InitStreamFromBuffer();		return InitStreamFromBuffer();
}		}

std::error_code BitcodeReader::InitStreamFromBuffer() {		std::error_code BitcodeReader::InitStreamFromBuffer() {
const unsigned char *BufPtr = (const unsigned char*)Buffer->getBufferStart();		const unsigned char *BufPtr = (const unsigned char*)Buffer->getBufferStart();
const unsigned char *BufEnd = BufPtr+Buffer->getBufferSize();		const unsigned char *BufEnd = BufPtr+Buffer->getBufferSize();

Show All 10 Lines	std::error_code BitcodeReader::InitStreamFromBuffer() {
Stream.init(&*StreamFile);		Stream.init(&*StreamFile);

return std::error_code();		return std::error_code();
}		}

std::error_code BitcodeReader::InitLazyStream() {		std::error_code BitcodeReader::InitLazyStream() {
// Check and strip off the bitcode wrapper; BitstreamReader expects never to		// Check and strip off the bitcode wrapper; BitstreamReader expects never to
// see it.		// see it.
auto OwnedBytes = llvm::make_unique<StreamingMemoryObject>(LazyStreamer);		auto OwnedBytes = llvm::make_unique<StreamingMemoryObject>(Streamer);
StreamingMemoryObject &Bytes = *OwnedBytes;		StreamingMemoryObject &Bytes = *OwnedBytes;
StreamFile = llvm::make_unique<BitstreamReader>(std::move(OwnedBytes));		StreamFile = llvm::make_unique<BitstreamReader>(std::move(OwnedBytes));
Stream.init(&*StreamFile);		Stream.init(&*StreamFile);

unsigned char buf[16];		unsigned char buf[16];
if (Bytes.readBytes(buf, 16, 0) != 16)		if (Bytes.readBytes(buf, 16, 0) != 16)
return Error("Invalid bitcode signature");		return Error("Invalid bitcode signature");

▲ Show 20 Lines • Show All 51 Lines • ▼ Show 20 Lines	getLazyBitcodeModuleImpl(std::unique_ptr<MemoryBuffer> &&Buffer,
LLVMContext &Context, bool WillMaterializeAll,		LLVMContext &Context, bool WillMaterializeAll,
DiagnosticHandlerFunction DiagnosticHandler,		DiagnosticHandlerFunction DiagnosticHandler,
bool ShouldLazyLoadMetadata = false) {		bool ShouldLazyLoadMetadata = false) {
Module *M = new Module(Buffer->getBufferIdentifier(), Context);		Module *M = new Module(Buffer->getBufferIdentifier(), Context);
BitcodeReader *R =		BitcodeReader *R =
new BitcodeReader(Buffer.get(), Context, DiagnosticHandler);		new BitcodeReader(Buffer.get(), Context, DiagnosticHandler);
M->setMaterializer(R);		M->setMaterializer(R);

auto cleanupOnError = [&](std::error_code EC) {		if (std::error_code EC =
R->releaseBuffer(); // Never take ownership on error.		R->parseBitcodeInto(M, WillMaterializeAll, ShouldLazyLoadMetadata)) {
delete M; // Also deletes R.		delete M; // Also deletes R.
return EC;		return EC;
};		}

// Delay parsing Metadata if ShouldLazyLoadMetadata is true.
if (std::error_code EC = R->ParseBitcodeInto(M, ShouldLazyLoadMetadata))
return cleanupOnError(EC);

if (!WillMaterializeAll)
// Resolve forward references from blockaddresses.
if (std::error_code EC = R->materializeForwardReferencedFunctions())
return cleanupOnError(EC);

Buffer.release(); // The BitcodeReader owns it now.		Buffer.release(); // The BitcodeReader owns it now.
return M;		return M;
}		}

ErrorOr<Module *>		ErrorOr<Module *>
llvm::getLazyBitcodeModule(std::unique_ptr<MemoryBuffer> &&Buffer,		llvm::getLazyBitcodeModule(std::unique_ptr<MemoryBuffer> &&Buffer,
LLVMContext &Context,		LLVMContext &Context,
DiagnosticHandlerFunction DiagnosticHandler,		DiagnosticHandlerFunction DiagnosticHandler,
bool ShouldLazyLoadMetadata) {		bool ShouldLazyLoadMetadata) {
return getLazyBitcodeModuleImpl(std::move(Buffer), Context, false,		return getLazyBitcodeModuleImpl(std::move(Buffer), Context, false,
DiagnosticHandler, ShouldLazyLoadMetadata);		DiagnosticHandler, ShouldLazyLoadMetadata);
}		}

ErrorOr<std::unique_ptr<Module>>		ErrorOr<std::unique_ptr<Module>>
llvm::getStreamedBitcodeModule(StringRef Name, DataStreamer *Streamer,		llvm::getStreamedBitcodeModule(StringRef Name, DataStreamer *Streamer,
LLVMContext &Context,		LLVMContext &Context,
DiagnosticHandlerFunction DiagnosticHandler) {		DiagnosticHandlerFunction DiagnosticHandler) {
std::unique_ptr<Module> M = make_unique<Module>(Name, Context);		std::unique_ptr<Module> M = make_unique<Module>(Name, Context);
BitcodeReader *R = new BitcodeReader(Streamer, Context, DiagnosticHandler);		BitcodeReader *R = new BitcodeReader(Streamer, Context, DiagnosticHandler);
M->setMaterializer(R);		M->setMaterializer(R);
if (std::error_code EC = R->ParseBitcodeInto(M.get()))		if (std::error_code EC = R->parseBitcodeInto(M.get(), false, false))
return EC;		return EC;
return std::move(M);		return std::move(M);
}		}

ErrorOr<Module *>		ErrorOr<Module *>
llvm::parseBitcodeFile(MemoryBufferRef Buffer, LLVMContext &Context,		llvm::parseBitcodeFile(MemoryBufferRef Buffer, LLVMContext &Context,
DiagnosticHandlerFunction DiagnosticHandler) {		DiagnosticHandlerFunction DiagnosticHandler) {
std::unique_ptr<MemoryBuffer> Buf = MemoryBuffer::getMemBuffer(Buffer, false);		std::unique_ptr<MemoryBuffer> Buf = MemoryBuffer::getMemBuffer(Buffer, false);
ErrorOr<Module *> ModuleOrErr = getLazyBitcodeModuleImpl(		ErrorOr<Module *> ModuleOrErr = getLazyBitcodeModuleImpl(
std::move(Buf), Context, true, DiagnosticHandler);		std::move(Buf), Context, true, DiagnosticHandler);
if (!ModuleOrErr)		if (!ModuleOrErr)
return ModuleOrErr;		return ModuleOrErr;
Module *M = ModuleOrErr.get();		Module *M = ModuleOrErr.get();
// Read in the entire module, and destroy the BitcodeReader.
if (std::error_code EC = M->materializeAllPermanently()) {
delete M;
return EC;
}

// TODO: Restore the use-lists to the in-memory state when the bitcode was		// TODO: Restore the use-lists to the in-memory state when the bitcode was
// written. We must defer until the Module has been fully materialized.		// written. We must defer until the Module has been fully materialized.

return M;		return M;
}		}

std::string		std::string
Show All 10 Lines

test/Bitcode/Inputs/invalid-abbrev.bc

This is a binary file.

Property	Old Value	New Value
File Size	129 B	476 B

test/Bitcode/Inputs/invalid-fwdref-type-mismatch-2.bc

This is a binary file.

Property	Old Value	New Value
File Size	617 B	452 B

test/Bitcode/Inputs/invalid-load-ptr-type.bc

Revision Contents

Diff 26923

include/llvm/Bitcode/BitstreamReader.h

lib/Bitcode/Reader/BitcodeReader.cpp

test/Bitcode/Inputs/invalid-abbrev.bc

test/Bitcode/Inputs/invalid-fwdref-type-mismatch-2.bc

test/Bitcode/Inputs/invalid-load-ptr-type.bc