This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contents

clang-tools-extra/clang-doc/Serialize.cpp
clang-tools-extra/unittests/clang-doc/BitcodeTest.cpp
clang/include/clang/Serialization/ASTWriter.h
clang/include/clang/Serialization/PCHContainerOperations.h
clang/lib/CodeGen/ObjectFilePCHContainerOperations.cpp
clang/lib/Frontend/ASTUnit.cpp
clang/lib/Frontend/PrecompiledPreamble.cpp
clang/lib/Frontend/SerializedDiagnosticPrinter.cpp
clang/lib/Serialization/ASTWriter.cpp
clang/lib/Serialization/GlobalModuleIndex.cpp
clang/lib/Serialization/PCHContainerOperations.cpp
llvm/include/llvm/Bitcode/BitcodeWriter.h
llvm/include/llvm/Bitstream/BitstreamWriter.h
llvm/include/llvm/Remarks/BitstreamRemarkSerializer.h
llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
llvm/lib/ExecutionEngine/Orc/ThreadSafeModule.cpp
llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp
llvm/tools/llvm-cat/llvm-cat.cpp
llvm/tools/llvm-modextract/llvm-modextract.cpp
llvm/unittests/Bitstream/BitstreamReaderTest.cpp
llvm/unittests/Bitstream/BitstreamWriterTest.cpp
Differential D77621

ADT: SmallVector size/capacity use word-size integers when elements are small
Closed, Public

Authored by browneee on Apr 6 2020, 7:15 PM.


Details

Reviewers

dblaikie
dexonsmith
nikic

Commits

rGb8d08e961df1: ADT: SmallVector size/capacity use word-size integers when elements are small

Summary

SmallVector currently uses 32-bit integers for size and capacity to reduce
sizeof(SmallVector). This limits the number of elements to UINT32_MAX.

For a SmallVector<char>, this limits the SmallVector size to only 4GB.
Buffering bitcode output uses SmallVector<char>, but needs >4GB output.

This changes SmallVector size and capacity to conditionally use word-size
integers if the element type is small (<4 bytes). For larger element types,
the vector size can reach ~16GB with a 32-bit size.

Making this conditional on the element type provides both the smaller
sizeof(SmallVector) for larger types which are unlikely to grow so large,
and supports larger capacities for smaller element types.
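
As a rough illustration of the limits quoted in the summary, the sketch below computes how many bytes a vector can hold when its size field is a 32-bit integer. The helper name `maxBytesWith32BitSize` is invented for this example and is not part of LLVM.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not an LLVM API): the largest payload, in bytes, that a
// vector of ElemSize-byte elements can hold if its size field is a uint32_t.
constexpr std::uint64_t maxBytesWith32BitSize(std::uint64_t ElemSize) {
  return std::uint64_t(std::numeric_limits<std::uint32_t>::max()) * ElemSize;
}

// One-byte elements (char): just under 4 GiB.
static_assert(maxBytesWith32BitSize(1) == 4294967295ULL, "about 4 GiB");
// Four-byte elements: just under 16 GiB, the figure quoted for larger types.
static_assert(maxBytesWith32BitSize(4) == 17179869180ULL, "about 16 GiB");
```

The byte limit scales with the element size, which is why only element types under 4 bytes are given the wider size field.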

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

browneee created this revision. Apr 6 2020, 7:15 PM

Herald added a project: Restricted Project. Apr 6 2020, 7:15 PM

Herald added subscribers: llvm-commits, dexonsmith, steven_wu, hiraditya.

Harbormaster failed remote builds in B52095: Diff 255567! Apr 6 2020, 8:11 PM

RKSimon added a subscriber: RKSimon. Apr 7 2020, 3:06 AM

RKSimon added inline comments.

llvm/include/llvm/Bitstream/BitstreamWriter.h
19	Can this be dropped?

There are a few places where smaller-scoped BitcodeWriters are used where a SmallVector may be suitable, but I'm OK sacrificing those for the broader goal. If someone finds these to be performance sensitive, we can revisit/try to find some way to support both goals.

This revision is now accepted and ready to land. Apr 7 2020, 11:25 AM

Fix build.

Herald added a project: Restricted Project. Apr 7 2020, 12:39 PM

Herald added subscribers: cfe-commits, arphaman.

Harbormaster failed remote builds in B52211: Diff 255773! Apr 7 2020, 1:36 PM

Fix more build errors.

Harbormaster failed remote builds in B52249: Diff 255831! Apr 7 2020, 3:49 PM

Fix build errors. Missed -DLLVM_ENABLE_PROJECTS in previous local test builds.

Harbormaster failed remote builds in B52267: Diff 255856! Apr 7 2020, 5:28 PM

This is thanks to a commit of mine that shaved a word off of SmallVector. Some options to consider:

1. Revert to word-size integers (size_t? uintptr_t?) for Size and Capacity for small-enough types. Could be just if sizeof(T)==1. Or maybe just for char and unsigned char.
2. Revert my patch entirely and go back to words (these used to be void*).
3. (Your patch, stop using SmallVector<char>.)

I think I would prefer some variation of (1) over (3).

Requesting changes just to be sure we consider the other options. I don't think it's good that SmallVector is no longer useful for large byte streams; I would prefer to fix that than stop using the type.

This revision now requires changes to proceed. Apr 7 2020, 6:09 PM

Build fixes in additional projects.

Harbormaster failed remote builds in B52280: Diff 255875! Apr 7 2020, 7:37 PM

Fix formatting.

browneee added inline comments. Apr 7 2020, 9:48 PM

llvm/include/llvm/Bitstream/BitstreamWriter.h
19	It is still used to construct records (line 512).

Harbormaster completed remote builds in B52291: Diff 255891. Apr 7 2020, 10:18 PM

In D77621#1968437, @dexonsmith wrote:

This is thanks to a commit of mine that shaved a word off of SmallVector. Some options to consider:

1. Revert to word-size integers (size_t? uintptr_t?) for Size and Capacity for small-enough types. Could be just if sizeof(T)==1. Or maybe just for char and unsigned char.

2. Revert my patch entirely and go back to words (these used to be void*).

3. (Your patch, stop using SmallVector<char>.)

I think I would prefer some variation of (1) over (3).

Hi Duncan, thanks for raising these alternatives.

Links to your prior commit for context: Review, Commit

I agree any of these options would solve the issue I'm experiencing.

Option 1:

I think SmallVectorBase would need to become templated.
The size-related code would need to support two sets of edge cases.
The varying capacity may be surprising, and adds another variation to both small mode and heap mode.

Option 3:

This patch is somewhat widespread. A more localized fix may be desirable.
Some inconvenience of an API change for downstream.

Do we want to increase the complexity of SmallVector somewhat? or do we want to keep the limit and affirm SmallVector is for small things?

In D77621#1968647, @browneee wrote:

Do we want to increase the complexity of SmallVector somewhat? or do we want to keep the limit and affirm SmallVector is for small things?

I don't think we should limit SmallVector to small things. Most std::string implementations also have the small storage optimization, but they're not limited to small things. Note that even SmallVector<T,0> has a number of conveniences for LLVM over std::vector (such as extra API, ability to use SmallVectorImpl APIs, and no pessimizations from exception handling).

Personally, I'm fine with splitting SmallVectorBase into SmallVectorBase<uintptr_t> and SmallVectorBase<uint32_t> (on 32-bit architectures, there's actually no split there). There aren't APIs that take a SmallVectorBase so there's no downside there. It doesn't seem too bad to me to do something like:

template <class SizeT> class SmallVectorBase {
public:
  typedef SizeT size_type;
  // ...
};
template <class T>
using SmallVectorSizeType =
    std::conditional_t<(sizeof(T) < 4), uintptr_t, uint32_t>;
template <class T>
class SmallVectorImpl : public SmallVectorBase<SmallVectorSizeType<T>> {
  // ...
};

If the complexity is too much, I would personally prefer to have my patch reverted (option 2 above) over making SmallVector stop working with large byte arrays.
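
The selection proposed above can be checked at compile time. The following is only a sketch under simplifying assumptions: the `...Sketch` names are stand-ins invented here, and the classes model nothing beyond the data pointer and the two size fields.

```cpp
#include <cstdint>
#include <type_traits>

// Simplified stand-in for the base class: only the data pointer and the
// size/capacity fields are modeled.
template <class SizeT> class SmallVectorBaseSketch {
protected:
  void *BeginX = nullptr;
  SizeT Size = 0, Capacity = 0;
};

// Word-size fields for elements under 4 bytes, 32-bit fields otherwise.
template <class T>
using SizeTypeSketch =
    std::conditional_t<(sizeof(T) < 4), std::uintptr_t, std::uint32_t>;

template <class T>
class SmallVectorImplSketch : public SmallVectorBaseSketch<SizeTypeSketch<T>> {
};

static_assert(std::is_same<SizeTypeSketch<char>, std::uintptr_t>::value,
              "char elements get word-size fields");
static_assert(std::is_same<SizeTypeSketch<std::uint64_t>, std::uint32_t>::value,
              "8-byte elements keep 32-bit fields");
```

On a target where `uintptr_t` is itself 32 bits wide, both branches pick the same width, which is the "no split" case mentioned in the comment.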

In D77621#1970015, @dexonsmith wrote:
In D77621#1968647, @browneee wrote:

Do we want to increase the complexity of SmallVector somewhat? or do we want to keep the limit and affirm SmallVector is for small things?

I don't think we should limit SmallVector to small things. Most std::string implementations also have the small storage optimization, but they're not limited to small things. Note that even SmallVector<T,0> has a number of conveniences for LLVM over std::vector (such as extra API, ability to use SmallVectorImpl APIs, and no pessimizations from exception handling).

Personally, I'm fine with splitting SmallVectorBase into SmallVectorBase<uintptr_t> and SmallVectorBase<uint32_t> (on 32-bit architectures, there's actually no split there). There aren't APIs that take a SmallVectorBase so there's no downside there. It doesn't seem too bad to me to do something like:
template <class SizeT> class SmallVectorBase {
public:
  typedef SizeT size_type;
  // ...
};
template <class T>
using SmallVectorSizeType =
    std::conditional_t<(sizeof(T) < 4), uintptr_t, uint32_t>;
template <class T>
class SmallVectorImpl : public SmallVectorBase<SmallVectorSizeType<T>> {
  // ...
};
If the complexity is too much, I would personally prefer to have my patch reverted (option 2 above) over making SmallVector stop working with large byte arrays.

Fair enough - that complexity seems reasonably acceptable to me if you reckon the memory size benefits are still worthwhile (did you measure them on any particular workloads? Do we have lots of fairly empty SmallVectors, etc?) if they don't apply to smaller types like this?

Change to suggested approach: size and capacity type conditionally larger for small element types.

Also incorporate https://reviews.llvm.org/D77601

Harbormaster failed remote builds in B52703: Diff 256621! Apr 10 2020, 11:53 AM

browneee retitled this revision from "Change BitcodeWriter buffer to std::vector instead of SmallVector." to "ADT: SmallVector size & capacity use word-size integers when elements are small.". Apr 10 2020, 12:46 PM

browneee edited the summary of this revision.

Please update the patch description/subject line.

@dexonsmith I'll leave this to you for final approval, since it was your idea/you've been touching things here. But looks like about the right direction.

llvm/include/llvm/ADT/SmallVector.h
47 ↗	(On Diff #256621)	Don't think this typedef is really pulling its weight - probably just refer to the template type parameter directly?
53 ↗	(On Diff #256621)	I'd probably use numeric_limits here & make this static constexpr
132 ↗	(On Diff #256621)	I'd probably add a "using Base = SmallVectorBase<SmallVectorSizeType<T>>" here, and then use that in the ctor and grow_pod. Also down by the other using decls maybe add "using Base::size/Base::capacity/Base::empty" so you don't have to "this->" everything.

browneee updated this revision to Diff 256803. Apr 11 2020, 2:54 PM

browneee marked 3 inline comments as done.

Address comments from dblaikie.

Harbormaster failed remote builds in B52818: Diff 256803! Apr 11 2020, 4:00 PM

dblaikie added inline comments. Apr 11 2020, 5:30 PM

llvm/include/llvm/ADT/SmallVector.h
53 ↗	(On Diff #256621)	Making it a constexpr /variable/ gets a bit more complicated - do you happen to know off-hand whether this requires a separate/out-of-line definition if it's ODR used? (Ah, seems it does in C++14, which LLVM uses; in C++17 it would be OK: "A constexpr specifier used in a function or static member variable (since C++17) declaration implies inline.") I think it's best not to leave a trap that might catch someone out in the future if SizeMax ends up getting ODR used (e.g. in a call to std::less, rather than an explicit op<), since the lack of a definition would then lead to a linking failure. So best to leave it as a static constexpr function rather than a static constexpr variable.
llvm/lib/Support/SmallVector.cpp
40–47 ↗	(On Diff #256803)	I think it's probably best to assert the size more like the above (but using zero-sized inline buffer), rather than using relative sizes - seems like it'd be more readable to me at least. Maybe @dexonsmith has a perspective here.
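
To make the odr-use concern about SizeMax concrete, here is a minimal sketch assuming C++14. `SizeLimitSketch` and `clamp` are names invented for illustration and are not the patch's code.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <limits>

template <class SizeT> struct SizeLimitSketch {
  // A static constexpr *function*: calling it yields a prvalue, so binding the
  // result to std::min's const-reference parameter needs no out-of-line
  // definition, even in C++14.
  static constexpr std::size_t SizeTypeMax() {
    return std::numeric_limits<SizeT>::max();
  }

  // By contrast, `static constexpr std::size_t Max = ...;` would be odr-used
  // by a std::min call like the one below and, before C++17, would need a
  // separate definition at namespace scope to link reliably.
  static constexpr std::size_t clamp(std::size_t Requested) {
    return std::min(SizeTypeMax(), Requested);
  }
};
```

For example, `SizeLimitSketch<std::uint8_t>::clamp(1000)` evaluates to 255 at compile time.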

Changed SizeMax to static constexpr function.
Changed static asserts.

browneee marked 2 inline comments as done. Apr 13 2020, 11:43 AM

Looks good to me at this point (I have some vague quandaries about whether the report_fatal_error stuff could be improved/made more clear, but couldn't come up with an actionable suggestion so far) - @dexonsmith could you check this over and offer final approval?

Harbormaster failed remote builds in B52964: Diff 257044! Apr 13 2020, 1:00 PM

Thanks for your patience, I missed the updates on Friday.

I have a couple of optional comments inline that I don't feel strongly about. LGTM either way.

In D77621#1972764, @dblaikie wrote:

Fair enough - that complexity seems reasonably acceptable to me if you reckon the memory size benefits are still worthwhile (did you measure them on any particular workloads? Do we have lots of fairly empty SmallVectors, etc?) if they don't apply to smaller types like this?

I haven't measured anything recently. Last I looked there were a number of SmallVectors inside other data structures (sometimes, sadly, SmallVector) on the heap (or stack). In some cases the main reason not to use std::vector is the exception pessimizations. It's nice to keep them small if it's reasonable to.

llvm/include/llvm/ADT/SmallVector.h
52 ↗	(On Diff #257044)	STL data structures have a name for this called `max_size()`. Should we be consistent with that?
179 ↗	(On Diff #257044)	Optionally we could expose `max_size()` as well.

This revision is now accepted and ready to land. Apr 13 2020, 1:36 PM

Rename SizeMax() to SizeTypeMax(). Fix max_size().

I'm open to suggestions to resolve the clang tidy naming warnings. I would prefer to leave grow_pod the same, to minimize changes.

@dexonsmith I am not a committer, if the last changes look good please submit for me. Thanks!

llvm/include/llvm/ADT/SmallVector.h
52 ↗	(On Diff #257044)	Good question. This brought my attention to the existing SmallVectorTemplateCommon::max_size() which also needed to be updated. I'm going to name this new function SizeTypeMax to best describe what it provides, and leave it separate from max_size().
179 ↗	(On Diff #257044)	Not done. Updated existing max_size() instead.

Harbormaster failed remote builds in B53016: Diff 257130! Apr 13 2020, 4:26 PM

In D77621#1979183, @browneee wrote:

@dexonsmith I am not a committer, if the last changes look good please submit for me. Thanks!

You've had a few patches in the past, I suggest you get yourself access.
https://llvm.org/docs/DeveloperPolicy.html#new-contributors

Or, let me know what you want for your GIT_COMMITTER* info.

GIT_COMMITTER_NAME=Andrew Browne
GIT_COMMITTER_EMAIL=browneee@google.com

This would be my second commit. I will request access next time - thanks @dexonsmith!

Rebase to latest HEAD.

Harbormaster failed remote builds in B53437: Diff 257861! Apr 15 2020, 2:55 PM

Rebase to latest HEAD.

Harbormaster failed remote builds in B53797: Diff 258444! Apr 17 2020, 3:55 PM

In D77621#1979769, @browneee wrote:

GIT_COMMITTER_NAME=Andrew Browne
GIT_COMMITTER_EMAIL=browneee@google.com

This would be my second commit. I will request access next time - thanks @dexonsmith!

I should have said GIT_AUTHOR* info, since I'm the committer :). Just landed it as b8d08e961df1d229872c785ebdbc8367432e9752, thanks for waiting!

I have reverted this change, because it causes a 1% compile-time and memory usage regression. The memory usage regression is probably fine given what this change does, but the compile-time regression is not. (For context, this pretty much undoes the wins that the recent removal of waymarking gave us.)

Some notes:

Can you please split out the fix to grow() into a separate revision? It does not seem related to the main change, and reduces surface area.
I don't think the automatic switch of the size/capacity field has been justified well. We have plenty of SmallVectors in LLVM that are, indeed, small. There is no way an MCRelaxableFragment will ever end up storing a single instruction that is 4G large.
Similarly, I'm not really convinced about handling this in SmallVector at all. The original change here just used an std::vector in the one place where this has become an issue. That seems like a pretty good solution until there is evidence that this is really a more widespread problem.

But in any case, my primary concern here is the compile-time regression, and it's not immediately clear which part of the change it comes from.

This revision is now accepted and ready to land. Apr 18 2020, 3:15 AM

Thanks for the revert explanation and notes, nikic.

@dexonsmith what is your current thinking on going back to the original std::vector approach?

browneee mentioned this in D77601: Make SmallVector assert if it cannot grow. Apr 21 2020, 6:49 PM

Switch approach back to std::vector change.

browneee retitled this revision from "ADT: SmallVector size & capacity use word-size integers when elements are small." to "Change BitcodeWriter buffer to std::vector instead of SmallVector.". Apr 22 2020, 11:15 PM

browneee edited the summary of this revision.

Harbormaster failed remote builds in B54355: Diff 259480! Apr 22 2020, 11:18 PM

In D77621#1995673, @browneee wrote:

Thanks for the revert explanation and notes, nikic.

@dexonsmith what is your current thinking on going back to the original std::vector approach?

SmallVector has only been limited to UINT32_MAX size for about a year and I think it’s a pretty major regression that I broke using it for arbitrary char buffers. I don’t think that’s acceptable really. Note that there was pushback when I shrank SmallVector at all for aesthetic reasons.

Note that breaking SmallVector<char> also breaks SmallString and raw_svector_ostream for buffers that are sometimes large. This was certainly not the goal of my original commit and I think it’s the wrong result.

One thing to try I suppose is specializing just when sizeof(T)==1. But even if there’s still a compile time hit, I think making SmallVector functional is more critical. Use cases that really want something tiny can use TinyPtrVector; or if that’s not appropriate we can introduce a TinyVector that works for other types (could make it 8 bytes with small storage for 1 element if the type is 4 bytes or smaller).

This might be worth a thread on llvm-dev. Maybe no one else thinks LLVM should use SmallVectorImpl pervasively in APIs anymore.

(AFAICT MCRelaxableFragment has a SmallVector<MCFixup> and would not have been affected by the reverted commit since sizeof(MCFixup) is quite large, not sure why that was brought up.)

because it causes a 1% compile-time and memory usage regression.

Yeah, some memory regression is expected and, in my opinion, acceptable for the change.

The compile time regression presumably came from the changes to the report_fatal_error handling in SmallVector - perhaps it could be changed/omitted in this commit, and done separately to assess the cost of changes to that error checking?

I resubmitted the report_fatal_error checks again under D77601

http://llvm-compile-time-tracker.com/compare.php?from=7375212172951d2fc283c81d03c1a8588c3280c6&to=a30e7ea88e75568feed020aedae73c52de888835&stat=max-rss
http://llvm-compile-time-tracker.com/compare.php?from=7375212172951d2fc283c81d03c1a8588c3280c6&to=a30e7ea88e75568feed020aedae73c52de888835&stat=instructions

Imo impact from this part is insignificant.

Other pieces I see as possibly impacting compile time are:

This correction to SmallVectorTemplateCommon::max_size(). But SizeTypeMax() is static constexpr, this seems like it could still be optimized to a constant.

-  size_type max_size() const { return size_type(-1) / sizeof(T); }
+  size_type max_size() const {
+    return std::min(this->SizeTypeMax(), size_type(-1) / sizeof(T));
+  }

More function calls. They also appear fairly optimizable to me.

I may not have good insight into the actual optimization behavior here.
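
The corrected max_size() shown in the diff above takes the smaller of two bounds, and both are compile-time constants. A free-function sketch of the same computation (the name `maxSizeSketch` is invented here; it is not the member function from the patch) could look like this:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical free-function version of the corrected max_size(): the element
// count is bounded both by what the size field can represent and by how many
// elements of type T fit in the address space.
template <class T, class SizeT> constexpr std::size_t maxSizeSketch() {
  return static_cast<std::size_t>(std::min<std::uint64_t>(
      std::numeric_limits<SizeT>::max(),
      std::numeric_limits<std::size_t>::max() / sizeof(T)));
}

// With a 32-bit size field and char elements, the field is the tighter bound.
static_assert(maxSizeSketch<char, std::uint32_t>() == 4294967295u,
              "limited by the 32-bit size field");
```

Because the whole expression is constant, an optimizing build should be able to fold it, which matches the expectation voiced in the comment.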

In D77621#1999757, @browneee wrote:

I resubmitted the report_fatal_error checks again under D77601

http://llvm-compile-time-tracker.com/compare.php?from=7375212172951d2fc283c81d03c1a8588c3280c6&to=a30e7ea88e75568feed020aedae73c52de888835&stat=max-rss
http://llvm-compile-time-tracker.com/compare.php?from=7375212172951d2fc283c81d03c1a8588c3280c6&to=a30e7ea88e75568feed020aedae73c52de888835&stat=instructions

Imo impact from this part is insignificant.

Ah, OK - thanks for noting that!

@nikic any sense of the noise floor/level on these measurements? It doesn't /look/ like there's much left in this that would cause problems. & I assume these measurements were made on an optimized build (so we don't have to try to improve the unoptimized code)?

Other pieces I see as possibly impacting compile time are:

This correction to SmallVectorTemplateCommon::max_size(). But SizeTypeMax() is static constexpr, this seems like it could still be optimized to a constant.
-  size_type max_size() const { return size_type(-1) / sizeof(T); }
+  size_type max_size() const {
+    return std::min(this->SizeTypeMax(), size_type(-1) / sizeof(T));
+  }

Perhaps you could move the value computation into a constexpr variable & just return that as needed. (could be a static local constexpr, I guess - to avoid the issues around linkage of constexpr member variables)

More function calls. They also appear fairly optimizable to me.

I may not have good insight into the actual optimization behavior here.

*nod* Didn't seem especially interesting.

I don't think the automatic switch of the size/capacity field has been justified well. We have plenty of SmallVectors in LLVM that are, indeed, small. There is no way an MCRelaxableFragment will ever end up storing a single instruction that is 4G large.

@nikic - can you explain the relevance of this ^ (as @dexonsmith pointed out, MCRelaxableFragment doesn't look like it would be affected by this change - is there something we're missing about that?)

Similarly, I'm not really convinced about handling this in SmallVector at all. The original change here just used an std::vector in the one place where this has become an issue. That seems like a pretty good solution until there is evidence that this is really a more widespread problem.

I'm inclined to go with @dexonsmith's perspective here, as the author of the original change & the general attitude that SmallVector should support this kind of use case.

In D77621#1999957, @dblaikie wrote:

@nikic any sense of the noise floor/level on these measurements? It doesn't /look/ like there's much left in this that would cause problems. & I assume these measurements were made on an optimized build (so we don't have to try to improve the unoptimized code)?

The measurements are on an optimized build (default LLVM release build, so no LTO). The noise level on the "instructions" metric is very low, so that changes above 0.1% tend to be significant. The compile-time regression on the original change definitely wasn't noise (but the change from D77601 is in the noise).

Other pieces I see as possibly impacting compile time are:

This correction to SmallVectorTemplateCommon::max_size(). But SizeTypeMax() is static constexpr, this seems like it could still be optimized to a constant.
-  size_type max_size() const { return size_type(-1) / sizeof(T); }
+  size_type max_size() const {
+    return std::min(this->SizeTypeMax(), size_type(-1) / sizeof(T));
+  }
Perhaps you could move the value computation into a constexpr variable & just return that as needed. (could be a static local constexpr, I guess - to avoid the issues around linkage of constexpr member variables)

The use of a function rather than a static constexpr for SizeTypeMax() was my first thought as well. It seems pretty weird to me, but maybe it's enough to fall on the wrong side of some inlining heuristic.

The only other thing that comes to mind is that grow_pod() moved into the header, which might have negative effects. It should be possible to avoid that by providing explicit template instantiations for uint32_t and uintptr_t in the cpp file.

I'll try to figure out what the cause is, but might take me a few days.

I don't think the automatic switch of the size/capacity field has been justified well. We have plenty of SmallVectors in LLVM that are, indeed, small. There is no way an MCRelaxableFragment will ever end up storing a single instruction that is 4G large.

@nikic - can you explain the relevance of this ^ (as @dexonsmith pointed out, MCRelaxableFragment doesn't look like it would be affected by this change - is there something we're missing about that?)

MCRelaxableFragment also contains a SmallVector<char>. I used this as an example where we use a SmallVector<char> with a very low upper bound on the size. (This example is not great, because the structure is already large for other reasons.)

Similarly, I'm not really convinced about handling this in SmallVector at all. The original change here just used an std::vector in the one place where this has become an issue. That seems like a pretty good solution until there is evidence that this is really a more widespread problem.

I'm inclined to go with @dexonsmith's perspective here, as the author of the original change & the general attitude that SmallVector should support this kind of use case.

Okay, I'm basically fine with that, if it is our stance that SmallVector should always be preferred over std::vector. Not really related to this revision, but it would probably help to do some renaming/aliasing to facilitate that view. Right now, the number of SmallVector<T, 0> uses in LLVM is really small compared to the std::vector<T> uses (100 vs 6000 based on a not very accurate grep). I think part of that is in the name, and calling it using Vector<T> = SmallVector<T, 0> and using VectorImpl<T> = SmallVectorImpl<T> would make it a lot more obvious that this is our preferred general purpose vector type, even if the stored data is not small.

In D77621#2000237, @nikic wrote:

Okay, I'm basically fine with that, if it is our stance that SmallVector should always be preferred over std::vector. Not really related to this revision, but it would probably help to do some renaming/aliasing to facilitate that view. Right now, the number of SmallVector<T, 0> uses in LLVM is really small compared to the std::vector<T> uses (100 vs 6000 based on a not very accurate grep). I think part of that is in the name, and calling it using Vector<T> = SmallVector<T, 0> and using VectorImpl<T> = SmallVectorImpl<T> would make it a lot more obvious that this is our preferred general purpose vector type, even if the stored data is not small.

Those aliases SGTM.

In D77621#2000237, @nikic wrote:

Perhaps you could move the value computation into a constexpr variable & just return that as needed. (could be a static local constexpr, I guess - to avoid the issues around linkage of constexpr member variables)

The use of a function rather than a static constexpr for SizeTypeMax() was my first thought as well. It seems pretty weird to me, but maybe it's enough to fall on the wrong side of some inlining heuristic.

The only other thing that comes to mind is that grow_pod() moved into the header, which might have negative effects. It should be possible to avoid that by providing explicit template instantiations for uint32_t and uintptr_t in the cpp file.

I'll try to figure out what the cause is, but might take me a few days.

I've tried those two things: results. From the bottom, the first commit is a rebased version of the original change, the second one makes SizeTypeMax a constant instead of a function and the last one moves grow_pod back into the C++ file (I forgot to replace the UINT32_MAX references in grow_pod, but I don't think it has an impact on the conclusion). The first change is a +0.75% regression, the second is neutral and the last one is a -0.70% improvement, the remaining difference is likely noise. So it looks like the move of grow_pod into the header was the culprit.

What is rather peculiar is that the picture is similar for the max-rss numbers. I believe this is because max-rss is also influenced by the size of the clang binary itself, and apparently the move of grow_pod into the header increased it a lot. (I should probably collect clang binary size to make this easy to verify.) That means that there doesn't seem to be much of an increase in terms of actually allocated heap memory due to this change.

Taking the max-rss numbers across all three commits, the only part where memory usage increases non-trivially is the LTO -g link step, by about ~1%. Possibly some debuginfo related stuff uses SmallVector<char>.

So tl;dr looks like as long as we keep grow_pod outside the header file, this change seems to be approximately free in terms of compile-time and memory usage both.
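
The technique referred to here, keeping a template member's body out of the header by explicitly instantiating it in one .cpp file, can be sketched in a single file as follows. `GrowBaseSketch`, `grownCapacity` and the doubling policy are placeholders invented for this example; they are not the real grow_pod().

```cpp
#include <cstdint>

// --- what would live in the header ---------------------------------------
template <class SizeT> class GrowBaseSketch {
public:
  // Declared only; the body is deliberately not visible to header users.
  SizeT grownCapacity(SizeT MinSize) const;

  SizeT Capacity = 0;
};

// Promise that these instantiations exist in some translation unit, so
// includers emit a call instead of instantiating the body themselves.
extern template class GrowBaseSketch<std::uint32_t>;
extern template class GrowBaseSketch<std::uint64_t>;

// --- what would live in the .cpp file ------------------------------------
template <class SizeT>
SizeT GrowBaseSketch<SizeT>::grownCapacity(SizeT MinSize) const {
  // Placeholder policy: roughly double, but never return less than MinSize.
  SizeT Doubled = Capacity * 2 + 1;
  return Doubled < MinSize ? MinSize : Doubled;
}

// Explicit instantiation definitions: the only place the body is compiled.
template class GrowBaseSketch<std::uint32_t>;
template class GrowBaseSketch<std::uint64_t>;
```

With this layout, every translation unit that includes the header links against one shared copy of the function per size type rather than carrying its own.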

@nikic, great news! Thanks for doing the detailed analysis.

Switch back to size and capacity type conditionally larger approach (apologies for the noise here).

Apply performance regression solutions from @nikic

browneee retitled this revision from "Change BitcodeWriter buffer to std::vector instead of SmallVector." to "ADT: SmallVector size/capacity use word-size integers when elements are small". Apr 24 2020, 12:06 PM

browneee edited the summary of this revision.

In D77621#2001099, @dexonsmith wrote:

In D77621#2000237, @nikic wrote:

Okay, I'm basically fine with that, if it is our stance that SmallVector should always be preferred over std::vector. Not really related to this revision, but it would probably help to do some renaming/aliasing to facilitate that view. Right now, the number of SmallVector<T, 0> uses in LLVM is really small compared to the std::vector<T> uses (100 vs 6000 based on a not very accurate grep). I think part of that is in the name, and calling it using Vector<T> = SmallVector<T, 0> and using VectorImpl<T> = SmallVectorImpl<T> would make it a lot more obvious that this is our preferred general purpose vector type, even if the stored data is not small.

Those aliases SGTM.

I'd be slightly against, just because having a name that differs from the standard name only in case seems pretty subtle - that and momentum, we've had SmallVector around for a while & I think it's OK. I don't mind some places using std::vector either, though. Don't feel strongly enough that I'd outright stand against such an alias/change, but just expressing this amount of disfavor.

In D77621#2001378, @nikic wrote:

So tl;dr looks like as long as we keep grow_pod outside the header file, this change seems to be approximately free in terms of compile-time and memory usage both.

Awesome - thanks for looking into it!

nikic accepted this revision. Apr 24 2020, 1:11 PM

nikic added inline comments.

llvm/include/llvm/ADT/SmallVector.h
19 ↗	(On Diff #259949)	Is this include still needed?
84 ↗	(On Diff #259949)	Is this needed? I don't think it makes a lot of sense to allow odr-use of `SizeTypeMax`. As it's a protected member, it's only used in the SmallVector implementation, where we control how it is used.
llvm/lib/Support/SmallVector.cpp
48 ↗	(On Diff #259949)	Nit: `if the when` => `if the` or `when`.

Harbormaster failed remote builds in B54603: Diff 259949! Apr 24 2020, 1:33 PM

dblaikie added inline comments. Apr 24 2020, 1:47 PM

llvm/include/llvm/ADT/SmallVector.h
84 ↗	(On Diff #259949)	It's used as a parameter to std::min, so it's already odr used & I'd rather not leave it as a trap to walk around even if we addressed that issue. I assume if it were a constexpr local in a protected inline function it wouldn't hinder optimizations in any real way?

nikic added inline comments. Apr 24 2020, 2:32 PM

llvm/include/llvm/ADT/SmallVector.h
84 ↗	(On Diff #259949)	It's used as a parameter to std::min, so it's already odr used & I'd rather not leave it as a trap to walk around even if we addressed that issue.
Oh, right you are! In that case this seems fine :)
I assume if it were a constexpr local in a protected inline function it wouldn't hinder optimizations in any real way?
The change from constexpr function to constexpr static didn't change anything performance-wise, so either way works for me. Another option is:
enum : size_t { SizeTypeMax = std::numeric_limits<Size_T>::max() };
Kind of sad that in C++14, using an enum is still the only "no nonsense" way to declare a constant.
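
The enum alternative mentioned in this comment can be tried in isolation. This is a sketch only: `EnumLimitSketch` and `clamp` are invented names, and it assumes `Size_T` is no wider than `size_t`, since the enumerator initializer would otherwise be a narrowing conversion.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <limits>

template <class Size_T> struct EnumLimitSketch {
  // An enumerator is a pure value with no storage, so it cannot be odr-used
  // and never needs an out-of-line definition.
  enum : std::size_t { SizeTypeMax = std::numeric_limits<Size_T>::max() };

  static constexpr std::size_t clamp(std::size_t Requested) {
    // The enumerator has its own unnamed enum type, so std::min is given an
    // explicit template argument to convert both operands to std::size_t.
    return std::min<std::size_t>(SizeTypeMax, Requested);
  }
};
```

The cost of this spelling is the extra conversion at each use, because the constant does not have type `size_t` itself.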

Change SizeTypeMax to a static constexpr function.
Fix comment typos.
Add comment to alert others to possible performance loss if that function is moved to the header.

@nikic: Thank you for detecting, analyzing, and solving the performance regression!

Harbormaster failed remote builds in B54631: Diff 260006!Apr 24 2020, 4:16 PM

Comitted: b5f0eae1dc3c09c020cdf9d07238dec9acdacf5f

Reverted in 5cb4c3776a34d48e43d9118921d2191aee0e3d21

Fails on plaforms where uintptr_t is the same type as uint32_t.

browneee reopened this revision. Apr 24 2020, 9:30 PM

This revision is now accepted and ready to land. Apr 24 2020, 9:30 PM

Change uintptr_t to uint64_t to ensure this does not instantiate the same template twice on platforms where uintptr_t is equivalent to uint32_t.
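
A minimal sketch of why the revert happened and why uint64_t avoids it. `VecBase` is a hypothetical stand-in for the real base class, not LLVM code. Explicitly instantiating a class template twice for the same type is an error, and on a target where uintptr_t is a typedef for the same type as uint32_t, an instantiation for each would name one specialization twice.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for the size-type-parameterized vector base.
template <class Size_T> struct VecBase {
  Size_T Size = 0;
  Size_T Capacity = 0;
};

// uint32_t and uint64_t have different widths, so they are always distinct
// types and these two explicit instantiations never collide.
template struct VecBase<uint32_t>;
template struct VecBase<uint64_t>;

// By contrast, adding `template struct VecBase<uintptr_t>;` next to the
// uint32_t line would be a duplicate explicit instantiation on any platform
// where the following constant is true.
constexpr bool UintptrIsUint32 = std::is_same<uintptr_t, uint32_t>::value;

static_assert(!std::is_same<uint32_t, uint64_t>::value,
              "the two instantiated size types must differ");
```

The cost of this choice, raised later in the thread, is that the uint64_t instantiation still exists on 32-bit targets even though nothing there should use it.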

Also considered using the preprocessor to disable the uintptr_t instantiation, but chose to avoid preprocessor use.

browneee marked an inline comment as done. Apr 24 2020, 9:43 PM

browneee added inline comments.

llvm/include/llvm/ADT/SmallVector.h
19 ↗	(On Diff #259949)	SmallVectorTemplateBase<T, TriviallyCopyable>::grow() still remains in the header and uses report_bad_alloc_error().

Harbormaster failed remote builds in B54666: Diff 260064! Apr 24 2020, 10:40 PM

Seems good to me.

@browneee Hi, it seems that you did not attach Differential Revision: to the commit dda3c19a3618dce9492687f8e880e7a73486ee98 so the differential was not closed automatically.

In previous llvm-dev discussions, people agreed that Reviewed-by: and Differential Revision: are useful and should be retained. Many others (Summary:, Subscribers:, Reviewers:, Tags:) are not needed.

I usually do this when committing: arcfilter; git fetch origin master && git rebase origin/master; last-minute-testing && git push origin HEAD:master

where arcfilter is a shell function which drops unneeded Phabricator tags

arcfilter () {
        arc amend
        git log -1 --pretty=%B | awk '/Reviewers:|Subscribers:/{p=1} /Reviewed By:|Differential Revision:/{p=0} !p && !/^Summary:$/ {sub(/^Summary: /,"");print}' | git commit --amend -F -
}

I guess this fixes https://bugs.llvm.org/show_bug.cgi?id=45289 as well

smeenai added a subscriber: smeenai. Apr 29 2020, 11:49 PM

smeenai added inline comments.

llvm/include/llvm/ADT/SmallVector.h
52 ↗	(On Diff #257044)	Was it intentional to make this return a `size_t` rather than a `Size_T`? Clang gives a truncation warning on 32-bit platforms when you try to instantiate the template with `uint64_t` as a result.

dblaikie added inline comments. Apr 30 2020, 8:02 AM

llvm/include/llvm/ADT/SmallVector.h
52 ↗	(On Diff #257044)	I think returning size_t is the correct thing here & the fix is not to use a 64 bit size on a 32 bit machine - that was initially intended to be solved by using uintptr_t, but got lost when that turned out to cause issues on 32 bit machines. @browneee could you fix this differently, so that when sizeof(uintptr_t) == 4 there's only one instantiation/only uint32_t is used?

Thanks for the tips, MaskRay.

Yes, I expect this would fix that issue.

smeenai, SizeTypeMax() is intended to return size_t.

I see a couple of options for fixing the truncation warning on 32-bit platforms:

1. Add an explicit cast to remove the warning.
   - Disadvantage: the second instantiation still exists even though it is unused.

static constexpr size_t SizeTypeMax() {
  return static_cast<size_t>(std::numeric_limits<Size_T>::max());
}

2. Use a std::conditional to swap the type of one instantiation to avoid conflicts. In this case I'd probably swap back to using uintptr_t and disable the uint32_t on 32bit.
   - Disadvantage: the second instantiation still exists even though it is unused.

// Will be unused when instantiated with char.
// This is to avoid instantiation for uint32_t conflicting with uintptr_t on 32-bit systems.
template class llvm::SmallVectorBase<std::conditional<sizeof(void *) != 4, uint32_t, char>::type>;
template class llvm::SmallVectorBase<uintptr_t>;

3. Use preprocessor to disable one of the instantiations on 32-bit platforms. In this case I'd probably swap back to using uintptr_t and disable the uint32_t on 32bit.
   - Disadvantage: uses preprocessor
   - Disadvantage: potential for portability issues with different platforms lacking certain macros

#if __SIZEOF_POINTER__ != 4  && !defined(_WIN32) && !defined(__ILP32)
template class llvm::SmallVectorBase<uint32_t>;
#endif
template class llvm::SmallVectorBase<uintptr_t>;

My order of preference would be 1, 2, 3.

Is there another solution I've missed? Thoughts on which is best? @dblaikie

I don't think (3) needs to be non-portable - it could use SIZE_MAX to test if it's bigger than 2^32? (or == 2^64)? That seems like it'd be a pretty good direct test? & ensure the types written in the explicit specializations are the same types written in the header/std::conditional (seems a bit questionable to rely on uintptr_t being necessarily the same type as either uint32_t or uint64_t - but maybe that's guaranteed/written down somewhere)?

@browneee Looks like LLVM already defines LLVM_PTR_SIZE as a more portable version of __SIZEOF_POINTER__.

In D77621#2013546, @nikic wrote:

@browneee Looks like LLVM already defines LLVM_PTR_SIZE as a more portable version of __SIZEOF_POINTER__.

I saw LLVM_PTR_SIZE, but its definition may be based on sizeof(), so I don't think it should be used in a preprocessor condition.

SIZE_MAX looks like a good option.
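
A rough sketch of what a SIZE_MAX-based guard could look like. The follow-up change is D79214, and its actual contents are not reproduced in this thread, so everything here (`VecBase`, the `VECBASE_HAS_64BIT_SIZE` macro, `has64BitSize`) is hypothetical. SIZE_MAX and UINT32_MAX are integer constant macros from <cstdint> and can be compared in an #if, which is not true of an expression built on sizeof().

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the size-type-parameterized vector base.
template <class Size_T> struct VecBase {
  Size_T Size = 0;
  Size_T Capacity = 0;
};

// The 32-bit size type is always available.
template struct VecBase<uint32_t>;

#if SIZE_MAX > UINT32_MAX
// Only instantiate the 64-bit size type where size_t can actually hold values
// above 2^32 - 1, so 32-bit targets get exactly one instantiation.
template struct VecBase<uint64_t>;
#define VECBASE_HAS_64BIT_SIZE 1
#else
#define VECBASE_HAS_64BIT_SIZE 0
#endif

// Reports which branch of the guard was taken, for illustration only.
constexpr bool has64BitSize() { return VECBASE_HAS_64BIT_SIZE != 0; }
```

With a guard like this, whatever header-side logic picks the size type has to agree with it, so that a 64-bit size type is never requested on a target where the guarded instantiation was skipped.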

browneee mentioned this in D79214: [ADT] Fix SmallVector unused template instantiation on 32-bit systems. Apr 30 2020, 3:25 PM

https://reviews.llvm.org/D79214

browneee mentioned this in rG25e2e92297e2: [ADT] Fix SmallVector unused template instantiation on 32-bit systems. Apr 30 2020, 4:40 PM

In D77621#2002430, @dblaikie wrote:

(seems a bit questionable to rely on uintptr_t being necessarily the same type as either uint32_t or uint64_t - but maybe that's guaranteed/written down somewhere)?

I think in practice uintptr_t will match range and size with one of uint32_t and uint64_t, but as you suggest it might not be equivalent to either (e.g., could use all three of unsigned, unsigned long, and unsigned long long). We need to be consistent between the std::conditional in SmallVectorSizeType and the explicit instantiations (and skimming the current implementation it looks like it is consistent, doesn’t rely on uintptr_t).
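
To illustrate the consistency point, here is a sketch of a size-type selector that only ever names uint32_t or uint64_t, so it cannot disagree with explicit instantiations written in terms of those same two types. The thread mentions a std::conditional in `SmallVectorSizeType` but does not show it. The alias `SizeTypeFor` and its exact condition below are assumptions based on the revision summary (word-size integers when the element type is smaller than 4 bytes).

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical selector: 64-bit size/capacity only for small element types on
// targets with 64-bit pointers; 32-bit otherwise. It never names uintptr_t.
template <class T>
using SizeTypeFor =
    typename std::conditional<(sizeof(T) < 4 && sizeof(void *) >= 8), uint64_t,
                              uint32_t>::type;

// Helper to state the invariant: the result is always one of the two types
// that the explicit instantiations are written for.
template <class T> constexpr bool usesKnownSizeType() {
  return std::is_same<SizeTypeFor<T>, uint32_t>::value ||
         std::is_same<SizeTypeFor<T>, uint64_t>::value;
}

static_assert(usesKnownSizeType<char>(), "char must map to a known size type");
static_assert(usesKnownSizeType<void *>(), "pointers must map to a known size type");
```

Under these assumptions, `SizeTypeFor<int>` is always uint32_t, and `SizeTypeFor<char>` is uint64_t only when pointers are at least 8 bytes wide.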

Revision Contents

Path	Size
clang-tools-extra/clang-doc/Serialize.cpp	7 lines
clang-tools-extra/unittests/clang-doc/BitcodeTest.cpp	5 lines
clang/include/clang/Serialization/ASTWriter.h	6 lines
clang/include/clang/Serialization/PCHContainerOperations.h	4 lines
clang/lib/CodeGen/ObjectFilePCHContainerOperations.cpp	2 lines
clang/lib/Frontend/ASTUnit.cpp	11 lines
clang/lib/Frontend/PrecompiledPreamble.cpp	7 lines
clang/lib/Frontend/SerializedDiagnosticPrinter.cpp	3 lines
clang/lib/Serialization/ASTWriter.cpp	8 lines
clang/lib/Serialization/GlobalModuleIndex.cpp	2 lines
clang/lib/Serialization/PCHContainerOperations.cpp	4 lines
llvm/include/llvm/Bitcode/BitcodeWriter.h	4 lines
llvm/include/llvm/Bitstream/BitstreamWriter.h	8 lines
llvm/include/llvm/Remarks/BitstreamRemarkSerializer.h	2 lines
llvm/lib/Bitcode/Writer/BitcodeWriter.cpp	27 lines
llvm/lib/ExecutionEngine/Orc/ThreadSafeModule.cpp	2 lines
llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp	6 lines
llvm/tools/llvm-cat/llvm-cat.cpp	2 lines
llvm/tools/llvm-modextract/llvm-modextract.cpp	6 lines
llvm/unittests/Bitstream/BitstreamReaderTest.cpp	6 lines
llvm/unittests/Bitstream/BitstreamWriterTest.cpp	24 lines

Diff 255891

clang-tools-extra/clang-doc/Serialize.cpp

//===-- Serialize.cpp - ClangDoc Serializer ---------------------- C++ --===//		//===-- Serialize.cpp - ClangDoc Serializer ---------------------- C++ --===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "Serialize.h"		#include "Serialize.h"
#include "BitcodeWriter.h"		#include "BitcodeWriter.h"
#include "clang/AST/Comment.h"		#include "clang/AST/Comment.h"
#include "clang/Index/USRGeneration.h"		#include "clang/Index/USRGeneration.h"
#include "llvm/ADT/Hashing.h"		#include "llvm/ADT/Hashing.h"
#include "llvm/ADT/StringExtras.h"		#include "llvm/ADT/StringExtras.h"
#include "llvm/Support/SHA1.h"		#include "llvm/Support/SHA1.h"

		#include <vector>

using clang::comments::FullComment;		using clang::comments::FullComment;

namespace clang {		namespace clang {
namespace doc {		namespace doc {
namespace serialize {		namespace serialize {

SymbolID hashUSR(llvm::StringRef USR) {		SymbolID hashUSR(llvm::StringRef USR) {
return llvm::SHA1::hash(arrayRefFromStringRef(USR));		return llvm::SHA1::hash(arrayRefFromStringRef(USR));
▲ Show 20 Lines • Show All 150 Lines • ▼ Show 20 Lines	if (Info)
return Info->Name;		return Info->Name;
// TODO: Add parsing for \file command.		// TODO: Add parsing for \file command.
return "<not a builtin command>";		return "<not a builtin command>";
}		}

// Serializing functions.		// Serializing functions.

template <typename T> static std::string serialize(T &I) {		template <typename T> static std::string serialize(T &I) {
SmallString<2048> Buffer;		std::vector<char> Buffer;
llvm::BitstreamWriter Stream(Buffer);		llvm::BitstreamWriter Stream(Buffer);
ClangDocBitcodeWriter Writer(Stream);		ClangDocBitcodeWriter Writer(Stream);
Writer.emitBlock(I);		Writer.emitBlock(I);
return Buffer.str().str();		std::string Result(Buffer.data(), Buffer.size());
		return Result;
}		}

std::string serialize(std::unique_ptr<Info> &I) {		std::string serialize(std::unique_ptr<Info> &I) {
switch (I->IT) {		switch (I->IT) {
case InfoType::IT_namespace:		case InfoType::IT_namespace:
return serialize(*static_cast<NamespaceInfo *>(I.get()));		return serialize(*static_cast<NamespaceInfo *>(I.get()));
case InfoType::IT_record:		case InfoType::IT_record:
return serialize(*static_cast<RecordInfo *>(I.get()));		return serialize(*static_cast<RecordInfo *>(I.get()));
▲ Show 20 Lines • Show All 471 Lines • Show Last 20 Lines

clang-tools-extra/unittests/clang-doc/BitcodeTest.cpp

	Show All 12 Lines
	#include "llvm/Bitstream/BitstreamReader.h"			#include "llvm/Bitstream/BitstreamReader.h"
	#include "llvm/Bitstream/BitstreamWriter.h"			#include "llvm/Bitstream/BitstreamWriter.h"
	#include "gtest/gtest.h"			#include "gtest/gtest.h"

	namespace clang {			namespace clang {
	namespace doc {			namespace doc {

	template <typename T> static std::string writeInfo(T &I) {			template <typename T> static std::string writeInfo(T &I) {
	SmallString<2048> Buffer;			std::vector<char> Buffer;
	llvm::BitstreamWriter Stream(Buffer);			llvm::BitstreamWriter Stream(Buffer);
	ClangDocBitcodeWriter Writer(Stream);			ClangDocBitcodeWriter Writer(Stream);
	Writer.emitBlock(I);			Writer.emitBlock(I);
	return Buffer.str().str();			std::string Result(Buffer.data(), Buffer.size());
				return Result;
	}			}

	std::string writeInfo(Info *I) {			std::string writeInfo(Info *I) {
	switch (I->IT) {			switch (I->IT) {
	case InfoType::IT_namespace:			case InfoType::IT_namespace:
	return writeInfo(*static_cast<NamespaceInfo *>(I));			return writeInfo(*static_cast<NamespaceInfo *>(I));
	case InfoType::IT_record:			case InfoType::IT_record:
	return writeInfo(*static_cast<RecordInfo *>(I));			return writeInfo(*static_cast<RecordInfo *>(I));
	▲ Show 20 Lines • Show All 246 Lines • Show Last 20 Lines

clang/include/clang/Serialization/ASTWriter.h

Show First 20 Lines • Show All 114 Lines • ▼ Show 20 Lines	private:
/// Keys in the map never have const/volatile qualifiers.		/// Keys in the map never have const/volatile qualifiers.
using TypeIdxMap = llvm::DenseMap<QualType, serialization::TypeIdx,		using TypeIdxMap = llvm::DenseMap<QualType, serialization::TypeIdx,
serialization::UnsafeQualTypeDenseMapInfo>;		serialization::UnsafeQualTypeDenseMapInfo>;

/// The bitstream writer used to emit this precompiled header.		/// The bitstream writer used to emit this precompiled header.
llvm::BitstreamWriter &Stream;		llvm::BitstreamWriter &Stream;

/// The buffer associated with the bitstream.		/// The buffer associated with the bitstream.
const SmallVectorImpl<char> &Buffer;		const std::vector<char> &Buffer;

/// The PCM manager which manages memory buffers for pcm files.		/// The PCM manager which manages memory buffers for pcm files.
InMemoryModuleCache &ModuleCache;		InMemoryModuleCache &ModuleCache;

/// The ASTContext we're writing.		/// The ASTContext we're writing.
ASTContext *Context = nullptr;		ASTContext *Context = nullptr;

/// The preprocessor we're writing.		/// The preprocessor we're writing.
▲ Show 20 Lines • Show All 395 Lines • ▼ Show 20 Lines	private:

ASTFileSignature WriteASTCore(Sema &SemaRef, StringRef isysroot,		ASTFileSignature WriteASTCore(Sema &SemaRef, StringRef isysroot,
const std::string &OutputFile,		const std::string &OutputFile,
Module *WritingModule);		Module *WritingModule);

public:		public:
/// Create a new precompiled header writer that outputs to		/// Create a new precompiled header writer that outputs to
/// the given bitstream.		/// the given bitstream.
ASTWriter(llvm::BitstreamWriter &Stream, SmallVectorImpl<char> &Buffer,		ASTWriter(llvm::BitstreamWriter &Stream, std::vector<char> &Buffer,
InMemoryModuleCache &ModuleCache,		InMemoryModuleCache &ModuleCache,
ArrayRef<std::shared_ptr<ModuleFileExtension>> Extensions,		ArrayRef<std::shared_ptr<ModuleFileExtension>> Extensions,
bool IncludeTimestamps = true);		bool IncludeTimestamps = true);
~ASTWriter() override;		~ASTWriter() override;

const LangOptions &getLangOpts() const;		const LangOptions &getLangOpts() const;

/// Get a timestamp for output into the AST file. The actual timestamp		/// Get a timestamp for output into the AST file. The actual timestamp
▲ Show 20 Lines • Show All 202 Lines • ▼ Show 20 Lines	class PCHGenerator : public SemaConsumer {
llvm::BitstreamWriter Stream;		llvm::BitstreamWriter Stream;
ASTWriter Writer;		ASTWriter Writer;
bool AllowASTWithErrors;		bool AllowASTWithErrors;
bool ShouldCacheASTInMemory;		bool ShouldCacheASTInMemory;

protected:		protected:
ASTWriter &getWriter() { return Writer; }		ASTWriter &getWriter() { return Writer; }
const ASTWriter &getWriter() const { return Writer; }		const ASTWriter &getWriter() const { return Writer; }
SmallVectorImpl<char> &getPCH() const { return Buffer->Data; }		std::vector<char> &getPCH() const { return Buffer->Data; }

public:		public:
PCHGenerator(const Preprocessor &PP, InMemoryModuleCache &ModuleCache,		PCHGenerator(const Preprocessor &PP, InMemoryModuleCache &ModuleCache,
StringRef OutputFile, StringRef isysroot,		StringRef OutputFile, StringRef isysroot,
std::shared_ptr<PCHBuffer> Buffer,		std::shared_ptr<PCHBuffer> Buffer,
ArrayRef<std::shared_ptr<ModuleFileExtension>> Extensions,		ArrayRef<std::shared_ptr<ModuleFileExtension>> Extensions,
bool AllowASTWithErrors = false, bool IncludeTimestamps = true,		bool AllowASTWithErrors = false, bool IncludeTimestamps = true,
bool ShouldCacheASTInMemory = false);		bool ShouldCacheASTInMemory = false);
Show All 12 Lines

clang/include/clang/Serialization/PCHContainerOperations.h

	//===--- Serialization/PCHContainerOperations.h - PCH Containers --- C++ --===//			//===--- Serialization/PCHContainerOperations.h - PCH Containers --- C++ --===//
	//			//
	// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.			// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
	// See https://llvm.org/LICENSE.txt for license information.			// See https://llvm.org/LICENSE.txt for license information.
	// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception			// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	#ifndef LLVM_CLANG_SERIALIZATION_PCHCONTAINEROPERATIONS_H			#ifndef LLVM_CLANG_SERIALIZATION_PCHCONTAINEROPERATIONS_H
	#define LLVM_CLANG_SERIALIZATION_PCHCONTAINEROPERATIONS_H			#define LLVM_CLANG_SERIALIZATION_PCHCONTAINEROPERATIONS_H

	#include "clang/Basic/Module.h"			#include "clang/Basic/Module.h"
	#include "llvm/ADT/SmallVector.h"
	#include "llvm/ADT/StringMap.h"			#include "llvm/ADT/StringMap.h"
	#include "llvm/Support/MemoryBuffer.h"			#include "llvm/Support/MemoryBuffer.h"
	#include <memory>			#include <memory>
				#include <vector>

	namespace llvm {			namespace llvm {
	class raw_pwrite_stream;			class raw_pwrite_stream;
	}			}

	namespace clang {			namespace clang {

	class ASTConsumer;			class ASTConsumer;
	class CodeGenOptions;			class CodeGenOptions;
	class DiagnosticsEngine;			class DiagnosticsEngine;
	class CompilerInstance;			class CompilerInstance;

	struct PCHBuffer {			struct PCHBuffer {
	ASTFileSignature Signature;			ASTFileSignature Signature;
	llvm::SmallVector<char, 0> Data;			std::vector<char> Data;
	bool IsComplete;			bool IsComplete;
	};			};

	/// This abstract interface provides operations for creating			/// This abstract interface provides operations for creating
	/// containers for serialized ASTs (precompiled headers and clang			/// containers for serialized ASTs (precompiled headers and clang
	/// modules).			/// modules).
	class PCHContainerWriter {			class PCHContainerWriter {
	public:			public:
	▲ Show 20 Lines • Show All 77 Lines • Show Last 20 Lines

clang/lib/CodeGen/ObjectFilePCHContainerOperations.cpp

Show First 20 Lines • Show All 302 Lines • ▼ Show 20 Lines	void HandleTranslationUnit(ASTContext &Ctx) override {

// Use the LLVM backend to emit the pch container.		// Use the LLVM backend to emit the pch container.
clang::EmitBackendOutput(Diags, HeaderSearchOpts, CodeGenOpts, TargetOpts,		clang::EmitBackendOutput(Diags, HeaderSearchOpts, CodeGenOpts, TargetOpts,
LangOpts, Ctx.getTargetInfo().getDataLayout(),		LangOpts, Ctx.getTargetInfo().getDataLayout(),
M.get(), BackendAction::Backend_EmitObj,		M.get(), BackendAction::Backend_EmitObj,
std::move(OS));		std::move(OS));

// Free the memory for the temporary buffer.		// Free the memory for the temporary buffer.
llvm::SmallVector<char, 0> Empty;		std::vector<char> Empty;
SerializedAST = std::move(Empty);		SerializedAST = std::move(Empty);
}		}
};		};

} // anonymous namespace		} // anonymous namespace

std::unique_ptr<ASTConsumer>		std::unique_ptr<ASTConsumer>
ObjectFilePCHContainerWriter::CreatePCHContainerGenerator(		ObjectFilePCHContainerWriter::CreatePCHContainerGenerator(
▲ Show 20 Lines • Show All 45 Lines • Show Last 20 Lines

clang/lib/Frontend/ASTUnit.cpp

Show First 20 Lines • Show All 209 Lines • ▼ Show 20 Lines	getBufferForFileHandlingRemapping(const CompilerInvocation &Invocation,
if (BufferOwner)		if (BufferOwner)
return BufferOwner;		return BufferOwner;
if (!Buffer)		if (!Buffer)
return nullptr;		return nullptr;
return llvm::MemoryBuffer::getMemBufferCopy(Buffer->getBuffer(), FilePath);		return llvm::MemoryBuffer::getMemBufferCopy(Buffer->getBuffer(), FilePath);
}		}

struct ASTUnit::ASTWriterData {		struct ASTUnit::ASTWriterData {
SmallString<128> Buffer;		std::vector<char> Buffer;
llvm::BitstreamWriter Stream;		llvm::BitstreamWriter Stream;
ASTWriter Writer;		ASTWriter Writer;

ASTWriterData(InMemoryModuleCache &ModuleCache)		ASTWriterData(InMemoryModuleCache &ModuleCache)
: Stream(Buffer), Writer(Stream, Buffer, ModuleCache, {}) {}		: Stream(Buffer), Writer(Stream, Buffer, ModuleCache, {}) {}
};		};

void ASTUnit::clearFileLevelDecls() {		void ASTUnit::clearFileLevelDecls() {
▲ Show 20 Lines • Show All 2,090 Lines • ▼ Show 20 Lines	if (llvm::Error Err = llvm::writeFileAtomically(
: llvm::Error::success();		: llvm::Error::success();
})) {		})) {
consumeError(std::move(Err));		consumeError(std::move(Err));
return true;		return true;
}		}
return false;		return false;
}		}

static bool serializeUnit(ASTWriter &Writer,		static bool serializeUnit(ASTWriter &Writer, std::vector<char> &Buffer, Sema &S,
SmallVectorImpl<char> &Buffer,		bool hasErrors, raw_ostream &OS) {
Sema &S,
bool hasErrors,
raw_ostream &OS) {
Writer.WriteAST(S, std::string(), nullptr, "", hasErrors);		Writer.WriteAST(S, std::string(), nullptr, "", hasErrors);

// Write the generated bitstream to "Out".		// Write the generated bitstream to "Out".
if (!Buffer.empty())		if (!Buffer.empty())
OS.write(Buffer.data(), Buffer.size());		OS.write(Buffer.data(), Buffer.size());

return false;		return false;
}		}

bool ASTUnit::serialize(raw_ostream &OS) {		bool ASTUnit::serialize(raw_ostream &OS) {
// For serialization we are lenient if the errors were only warn-as-error kind.		// For serialization we are lenient if the errors were only warn-as-error kind.
bool hasErrors = getDiagnostics().hasUncompilableErrorOccurred();		bool hasErrors = getDiagnostics().hasUncompilableErrorOccurred();

if (WriterData)		if (WriterData)
return serializeUnit(WriterData->Writer, WriterData->Buffer,		return serializeUnit(WriterData->Writer, WriterData->Buffer,
getSema(), hasErrors, OS);		getSema(), hasErrors, OS);

SmallString<128> Buffer;		std::vector<char> Buffer;
llvm::BitstreamWriter Stream(Buffer);		llvm::BitstreamWriter Stream(Buffer);
InMemoryModuleCache ModuleCache;		InMemoryModuleCache ModuleCache;
ASTWriter Writer(Stream, Buffer, ModuleCache, {});		ASTWriter Writer(Stream, Buffer, ModuleCache, {});
return serializeUnit(Writer, Buffer, getSema(), hasErrors, OS);		return serializeUnit(Writer, Buffer, getSema(), hasErrors, OS);
}		}

using SLocRemap = ContinuousRangeMap<unsigned, int, 2>;		using SLocRemap = ContinuousRangeMap<unsigned, int, 2>;

▲ Show 20 Lines • Show All 363 Lines • Show Last 20 Lines

clang/lib/Frontend/PrecompiledPreamble.cpp

Show First 20 Lines • Show All 173 Lines • ▼ Show 20 Lines	public:
}		}

void HandleTranslationUnit(ASTContext &Ctx) override {		void HandleTranslationUnit(ASTContext &Ctx) override {
PCHGenerator::HandleTranslationUnit(Ctx);		PCHGenerator::HandleTranslationUnit(Ctx);
if (!hasEmittedPCH())		if (!hasEmittedPCH())
return;		return;

// Write the generated bitstream to "Out".		// Write the generated bitstream to "Out".
*Out << getPCH();		std::vector<char> &PCHBuffer = getPCH();
		Out->write(PCHBuffer.data(), PCHBuffer.size());
// Make sure it hits disk now.		// Make sure it hits disk now.
Out->flush();		Out->flush();
// Free the buffer.		// Free the buffer.
llvm::SmallVector<char, 0> Empty;		std::vector<char> Empty;
getPCH() = std::move(Empty);		PCHBuffer = std::move(Empty);

Action.setEmittedPreamblePCH(getWriter());		Action.setEmittedPreamblePCH(getWriter());
}		}

private:		private:
PrecompilePreambleAction &Action;		PrecompilePreambleAction &Action;
std::unique_ptr<raw_ostream> Out;		std::unique_ptr<raw_ostream> Out;
};		};
▲ Show 20 Lines • Show All 589 Lines • Show Last 20 Lines

clang/lib/Frontend/SerializedDiagnosticPrinter.cpp

Show All 19 Lines
#include "llvm/ADT/STLExtras.h"		#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/SmallString.h"		#include "llvm/ADT/SmallString.h"
#include "llvm/ADT/StringRef.h"		#include "llvm/ADT/StringRef.h"
#include "llvm/Bitstream/BitCodes.h"		#include "llvm/Bitstream/BitCodes.h"
#include "llvm/Bitstream/BitstreamReader.h"		#include "llvm/Bitstream/BitstreamReader.h"
#include "llvm/Support/FileSystem.h"		#include "llvm/Support/FileSystem.h"
#include "llvm/Support/raw_ostream.h"		#include "llvm/Support/raw_ostream.h"
#include <utility>		#include <utility>
		#include <vector>

using namespace clang;		using namespace clang;
using namespace clang::serialized_diags;		using namespace clang::serialized_diags;

namespace {		namespace {

class AbbreviationMap {		class AbbreviationMap {
llvm::DenseMap<unsigned, unsigned> Abbrevs;		llvm::DenseMap<unsigned, unsigned> Abbrevs;
▲ Show 20 Lines • Show All 209 Lines • ▼ Show 20 Lines	struct SharedState {
SharedState(StringRef File, DiagnosticOptions *Diags)		SharedState(StringRef File, DiagnosticOptions *Diags)
: DiagOpts(Diags), Stream(Buffer), OutputFile(File.str()),		: DiagOpts(Diags), Stream(Buffer), OutputFile(File.str()),
EmittedAnyDiagBlocks(false) {}		EmittedAnyDiagBlocks(false) {}

/// Diagnostic options.		/// Diagnostic options.
IntrusiveRefCntPtr<DiagnosticOptions> DiagOpts;		IntrusiveRefCntPtr<DiagnosticOptions> DiagOpts;

/// The byte buffer for the serialized content.		/// The byte buffer for the serialized content.
SmallString<1024> Buffer;		std::vector<char> Buffer;

/// The BitStreamWriter for the serialized diagnostics.		/// The BitStreamWriter for the serialized diagnostics.
llvm::BitstreamWriter Stream;		llvm::BitstreamWriter Stream;

/// The name of the diagnostics file.		/// The name of the diagnostics file.
std::string OutputFile;		std::string OutputFile;

/// The set of constructed record abbreviations.		/// The set of constructed record abbreviations.
▲ Show 20 Lines • Show All 599 Lines • Show Last 20 Lines

clang/lib/Serialization/ASTWriter.cpp

Show First 20 Lines • Show All 1,023 Lines • ▼ Show 20 Lines	ASTFileSignature ASTWriter::writeUnhashedControlBlock(Preprocessor &PP,
// Enter the block and prepare to write records.		// Enter the block and prepare to write records.
RecordData Record;		RecordData Record;
Stream.EnterSubblock(UNHASHED_CONTROL_BLOCK_ID, 5);		Stream.EnterSubblock(UNHASHED_CONTROL_BLOCK_ID, 5);

// For implicit modules, write the hash of the PCM as its signature.		// For implicit modules, write the hash of the PCM as its signature.
ASTFileSignature Signature;		ASTFileSignature Signature;
if (WritingModule &&		if (WritingModule &&
PP.getHeaderSearchInfo().getHeaderSearchOpts().ModulesHashContent) {		PP.getHeaderSearchInfo().getHeaderSearchOpts().ModulesHashContent) {
Signature = createSignature(StringRef(Buffer.begin(), StartOfUnhashedControl));		Signature =
		createSignature(StringRef(Buffer.data(), StartOfUnhashedControl));
Record.append(Signature.begin(), Signature.end());		Record.append(Signature.begin(), Signature.end());
Stream.EmitRecord(SIGNATURE, Record);		Stream.EmitRecord(SIGNATURE, Record);
Record.clear();		Record.clear();
}		}

// Diagnostic options.		// Diagnostic options.
const auto &Diags = Context.getDiagnostics();		const auto &Diags = Context.getDiagnostics();
const DiagnosticOptions &DiagOpts = Diags.getDiagnosticOptions();		const DiagnosticOptions &DiagOpts = Diags.getDiagnosticOptions();
▲ Show 20 Lines • Show All 3,217 Lines • ▼ Show 20 Lines	void ASTWriter::SetSelectorOffset(Selector Sel, uint32_t Offset) {
assert(ID && "Unknown selector");		assert(ID && "Unknown selector");
// Don't record offsets for selectors that are also available in a different		// Don't record offsets for selectors that are also available in a different
// file.		// file.
if (ID < FirstSelectorID)		if (ID < FirstSelectorID)
return;		return;
SelectorOffsets[ID - FirstSelectorID] = Offset;		SelectorOffsets[ID - FirstSelectorID] = Offset;
}		}

ASTWriter::ASTWriter(llvm::BitstreamWriter &Stream,		ASTWriter::ASTWriter(llvm::BitstreamWriter &Stream, std::vector<char> &Buffer,
SmallVectorImpl<char> &Buffer,
InMemoryModuleCache &ModuleCache,		InMemoryModuleCache &ModuleCache,
ArrayRef<std::shared_ptr<ModuleFileExtension>> Extensions,		ArrayRef<std::shared_ptr<ModuleFileExtension>> Extensions,
bool IncludeTimestamps)		bool IncludeTimestamps)
: Stream(Stream), Buffer(Buffer), ModuleCache(ModuleCache),		: Stream(Stream), Buffer(Buffer), ModuleCache(ModuleCache),
IncludeTimestamps(IncludeTimestamps) {		IncludeTimestamps(IncludeTimestamps) {
for (const auto &Ext : Extensions) {		for (const auto &Ext : Extensions) {
if (auto Writer = Ext->createExtensionWriter(*this))		if (auto Writer = Ext->createExtensionWriter(*this))
ModuleFileExtensionWriters.push_back(std::move(Writer));		ModuleFileExtensionWriters.push_back(std::move(Writer));
Show All 40 Lines	ASTFileSignature ASTWriter::WriteAST(Sema &SemaRef,
this->WritingModule = nullptr;		this->WritingModule = nullptr;
this->BaseDirectory.clear();		this->BaseDirectory.clear();

WritingAST = false;		WritingAST = false;
if (ShouldCacheASTInMemory) {		if (ShouldCacheASTInMemory) {
// Construct MemoryBuffer and update buffer manager.		// Construct MemoryBuffer and update buffer manager.
ModuleCache.addBuiltPCM(OutputFile,		ModuleCache.addBuiltPCM(OutputFile,
llvm::MemoryBuffer::getMemBufferCopy(		llvm::MemoryBuffer::getMemBufferCopy(
StringRef(Buffer.begin(), Buffer.size())));		StringRef(Buffer.data(), Buffer.size())));
}		}
return Signature;		return Signature;
}		}

template<typename Vector>		template<typename Vector>
static void AddLazyVectorDecls(ASTWriter &Writer, Vector &Vec,		static void AddLazyVectorDecls(ASTWriter &Writer, Vector &Vec,
ASTWriter::RecordData &Record) {		ASTWriter::RecordData &Record) {
for (typename Vector::iterator I = Vec.begin(nullptr, true), E = Vec.end();		for (typename Vector::iterator I = Vec.begin(nullptr, true), E = Vec.end();
▲ Show 20 Lines • Show All 2,314 Lines • Show Last 20 Lines

clang/lib/Serialization/GlobalModuleIndex.cpp

Show First 20 Lines • Show All 900 Lines • ▼ Show 20 Lines	if (!ModuleFile)
continue;		continue;

// Load this module file.		// Load this module file.
if (llvm::Error Err = Builder.loadModuleFile(*ModuleFile))		if (llvm::Error Err = Builder.loadModuleFile(*ModuleFile))
return Err;		return Err;
}		}

// The output buffer, into which the global index will be written.		// The output buffer, into which the global index will be written.
SmallVector<char, 16> OutputBuffer;		std::vector<char> OutputBuffer;
{		{
llvm::BitstreamWriter OutputStream(OutputBuffer);		llvm::BitstreamWriter OutputStream(OutputBuffer);
if (Builder.writeIndex(OutputStream))		if (Builder.writeIndex(OutputStream))
return llvm::createStringError(std::errc::io_error,		return llvm::createStringError(std::errc::io_error,
"failed writing index");		"failed writing index");
}		}

return llvm::writeFileAtomically(		return llvm::writeFileAtomically(
Show All 34 Lines

clang/lib/Serialization/PCHContainerOperations.cpp

Show All 33 Lines	RawPCHContainerGenerator(std::unique_ptr<llvm::raw_pwrite_stream> OS,
std::shared_ptr<PCHBuffer> Buffer)		std::shared_ptr<PCHBuffer> Buffer)
: Buffer(std::move(Buffer)), OS(std::move(OS)) {}		: Buffer(std::move(Buffer)), OS(std::move(OS)) {}

~RawPCHContainerGenerator() override = default;		~RawPCHContainerGenerator() override = default;

void HandleTranslationUnit(ASTContext &Ctx) override {		void HandleTranslationUnit(ASTContext &Ctx) override {
if (Buffer->IsComplete) {		if (Buffer->IsComplete) {
// Make sure it hits disk now.		// Make sure it hits disk now.
*OS << Buffer->Data;		OS->write(Buffer->Data.data(), Buffer->Data.size());
OS->flush();		OS->flush();
}		}
// Free the space of the temporary buffer.		// Free the space of the temporary buffer.
llvm::SmallVector<char, 0> Empty;		std::vector<char> Empty;
Buffer->Data = std::move(Empty);		Buffer->Data = std::move(Empty);
}		}
};		};

} // anonymous namespace		} // anonymous namespace

std::unique_ptr<ASTConsumer> RawPCHContainerWriter::CreatePCHContainerGenerator(		std::unique_ptr<ASTConsumer> RawPCHContainerWriter::CreatePCHContainerGenerator(
CompilerInstance &CI, const std::string &MainFileName,		CompilerInstance &CI, const std::string &MainFileName,
Show All 14 Lines

llvm/include/llvm/Bitcode/BitcodeWriter.h

	Show All 24 Lines

	namespace llvm {			namespace llvm {

	class BitstreamWriter;			class BitstreamWriter;
	class Module;			class Module;
	class raw_ostream;			class raw_ostream;

	class BitcodeWriter {			class BitcodeWriter {
	SmallVectorImpl<char> &Buffer;			std::vector<char> &Buffer;
	std::unique_ptr<BitstreamWriter> Stream;			std::unique_ptr<BitstreamWriter> Stream;

	StringTableBuilder StrtabBuilder{StringTableBuilder::RAW};			StringTableBuilder StrtabBuilder{StringTableBuilder::RAW};

	// Owns any strings created by the irsymtab writer until we create the			// Owns any strings created by the irsymtab writer until we create the
	// string table.			// string table.
	BumpPtrAllocator Alloc;			BumpPtrAllocator Alloc;

	bool WroteStrtab = false, WroteSymtab = false;			bool WroteStrtab = false, WroteSymtab = false;

	void writeBlob(unsigned Block, unsigned Record, StringRef Blob);			void writeBlob(unsigned Block, unsigned Record, StringRef Blob);

	std::vector<Module *> Mods;			std::vector<Module *> Mods;

	public:			public:
	/// Create a BitcodeWriter that writes to Buffer.			/// Create a BitcodeWriter that writes to Buffer.
	BitcodeWriter(SmallVectorImpl<char> &Buffer);			BitcodeWriter(std::vector<char> &Buffer);

	~BitcodeWriter();			~BitcodeWriter();

	/// Attempt to write a symbol table to the bitcode file. This must be called			/// Attempt to write a symbol table to the bitcode file. This must be called
	/// at most once after all modules have been written.			/// at most once after all modules have been written.
	///			///
	/// A reader does not require a symbol table to interpret a bitcode file;			/// A reader does not require a symbol table to interpret a bitcode file;
	/// the symbol table is needed only to improve link-time performance. So			/// the symbol table is needed only to improve link-time performance. So
	▲ Show 20 Lines • Show All 104 Lines • Show Last 20 Lines

llvm/include/llvm/Bitstream/BitstreamWriter.h

Show All 10 Lines
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#ifndef LLVM_BITSTREAM_BITSTREAMWRITER_H		#ifndef LLVM_BITSTREAM_BITSTREAMWRITER_H
#define LLVM_BITSTREAM_BITSTREAMWRITER_H		#define LLVM_BITSTREAM_BITSTREAMWRITER_H

#include "llvm/ADT/ArrayRef.h"		#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/Optional.h"		#include "llvm/ADT/Optional.h"
#include "llvm/ADT/SmallVector.h"		#include "llvm/ADT/SmallVector.h"
		RKSimon: Can this be dropped?
		browneee (author): It is still used to construct records (line 512).
#include "llvm/ADT/StringRef.h"		#include "llvm/ADT/StringRef.h"
#include "llvm/Bitstream/BitCodes.h"		#include "llvm/Bitstream/BitCodes.h"
#include "llvm/Support/Endian.h"		#include "llvm/Support/Endian.h"
#include <vector>		#include <vector>

namespace llvm {		namespace llvm {

class BitstreamWriter {		class BitstreamWriter {
SmallVectorImpl<char> &Out;		std::vector<char> &Out;

/// CurBit - Always between 0 and 31 inclusive, specifies the next bit to use.		/// CurBit - Always between 0 and 31 inclusive, specifies the next bit to use.
unsigned CurBit;		unsigned CurBit;

/// CurValue - The current value. Only bits < CurBit are valid.		/// CurValue - The current value. Only bits < CurBit are valid.
uint32_t CurValue;		uint32_t CurValue;

/// CurCodeSize - This is the declared size of code values used for the		/// CurCodeSize - This is the declared size of code values used for the
Show All 26 Lines	class BitstreamWriter {
std::vector<BlockInfo> BlockInfoRecords;		std::vector<BlockInfo> BlockInfoRecords;

void WriteByte(unsigned char Value) {		void WriteByte(unsigned char Value) {
Out.push_back(Value);		Out.push_back(Value);
}		}

void WriteWord(unsigned Value) {		void WriteWord(unsigned Value) {
Value = support::endian::byte_swap<uint32_t, support::little>(Value);		Value = support::endian::byte_swap<uint32_t, support::little>(Value);
Out.append(reinterpret_cast<const char *>(&Value),		Out.insert(Out.end(), reinterpret_cast<const char *>(&Value),
reinterpret_cast<const char *>(&Value + 1));		reinterpret_cast<const char *>(&Value + 1));
}		}

size_t GetBufferOffset() const { return Out.size(); }		size_t GetBufferOffset() const { return Out.size(); }

size_t GetWordIndex() const {		size_t GetWordIndex() const {
size_t Offset = GetBufferOffset();		size_t Offset = GetBufferOffset();
assert((Offset & 3) == 0 && "Not 32-bit aligned");		assert((Offset & 3) == 0 && "Not 32-bit aligned");
return Offset / 4;		return Offset / 4;
}		}

public:		public:
explicit BitstreamWriter(SmallVectorImpl<char> &O)		explicit BitstreamWriter(std::vector<char> &O)
: Out(O), CurBit(0), CurValue(0), CurCodeSize(2) {}		: Out(O), CurBit(0), CurValue(0), CurCodeSize(2) {}

~BitstreamWriter() {		~BitstreamWriter() {
assert(CurBit == 0 && "Unflushed data remaining");		assert(CurBit == 0 && "Unflushed data remaining");
assert(BlockScope.empty() && CurAbbrevs.empty() && "Block imbalance");		assert(BlockScope.empty() && CurAbbrevs.empty() && "Block imbalance");
}		}

/// Retrieve the current position in the stream, in bits.		/// Retrieve the current position in the stream, in bits.
uint64_t GetCurrentBitNo() const { return GetBufferOffset() * 8 + CurBit; }		uint64_t GetCurrentBitNo() const { return GetBufferOffset() * 8 + CurBit; }
▲ Show 20 Lines • Show All 454 Lines • Show Last 20 Lines
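
The `WriteWord` change above swaps `SmallVectorImpl::append` for a range `insert` at `end()`, because `std::vector` has no `append` member. A minimal standard-library sketch of the same idea follows. The names `appendWordLE` and `wordIndex` are hypothetical and are not part of LLVM, and the byte order is produced with shifts rather than `support::endian::byte_swap`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not LLVM API): append Value to Out as four
// little-endian bytes, independent of the host byte order.
inline void appendWordLE(std::vector<char> &Out, std::uint32_t Value) {
  const char Bytes[4] = {
      static_cast<char>(Value & 0xFF),
      static_cast<char>((Value >> 8) & 0xFF),
      static_cast<char>((Value >> 16) & 0xFF),
      static_cast<char>((Value >> 24) & 0xFF),
  };
  // std::vector has no append(); a range insert at end() is the equivalent.
  Out.insert(Out.end(), Bytes, Bytes + 4);
}

// Mirrors the alignment check in GetWordIndex: the byte offset must be a
// multiple of four before it can be expressed as a word index.
inline std::size_t wordIndex(const std::vector<char> &Out) {
  assert((Out.size() & 3) == 0 && "Not 32-bit aligned");
  return Out.size() / 4;
}
```

Calling `appendWordLE(Out, 0x12345678)` on an empty vector leaves four bytes in `Out`, lowest byte first, after which `wordIndex(Out)` is 1.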

llvm/include/llvm/Remarks/BitstreamRemarkSerializer.h

	Show First 20 Lines • Show All 43 Lines • ▼ Show 20 Lines
	/// | Remark0			/// | Remark0
	/// | Remark1			/// | Remark1
	/// | Remark2			/// | Remark2
	/// | ...			/// | ...
	///			///
	struct BitstreamRemarkSerializerHelper {			struct BitstreamRemarkSerializerHelper {
	/// Buffer used for encoding the bitstream before writing it to the final			/// Buffer used for encoding the bitstream before writing it to the final
	/// stream.			/// stream.
	SmallVector<char, 1024> Encoded;			std::vector<char> Encoded;
	/// Buffer used to construct records and pass to the bitstream writer.			/// Buffer used to construct records and pass to the bitstream writer.
	SmallVector<uint64_t, 64> R;			SmallVector<uint64_t, 64> R;
	/// The Bitstream writer.			/// The Bitstream writer.
	BitstreamWriter Bitstream;			BitstreamWriter Bitstream;
	/// The type of the container we are serializing.			/// The type of the container we are serializing.
	BitstreamRemarkContainerType ContainerType;			BitstreamRemarkContainerType ContainerType;

	/// Abbrev IDs initialized in the block info block.			/// Abbrev IDs initialized in the block info block.
	▲ Show 20 Lines • Show All 136 Lines • Show Last 20 Lines

llvm/lib/Bitcode/Writer/BitcodeWriter.cpp

Show First 20 Lines • Show All 238 Lines • ▼ Show 20 Lines	private:
}		}

std::map<GlobalValue::GUID, unsigned> &valueIds() { return GUIDToValueIdMap; }		std::map<GlobalValue::GUID, unsigned> &valueIds() { return GUIDToValueIdMap; }
};		};

/// Class to manage the bitcode writing for a module.		/// Class to manage the bitcode writing for a module.
class ModuleBitcodeWriter : public ModuleBitcodeWriterBase {		class ModuleBitcodeWriter : public ModuleBitcodeWriterBase {
/// Pointer to the buffer allocated by caller for bitcode writing.		/// Pointer to the buffer allocated by caller for bitcode writing.
const SmallVectorImpl<char> &Buffer;		const std::vector<char> &Buffer;

/// True if a module hash record should be written.		/// True if a module hash record should be written.
bool GenerateHash;		bool GenerateHash;

/// If non-null, when GenerateHash is true, the resulting hash is written		/// If non-null, when GenerateHash is true, the resulting hash is written
/// into ModHash.		/// into ModHash.
ModuleHash *ModHash;		ModuleHash *ModHash;

SHA1 Hasher;		SHA1 Hasher;

/// The start bit of the identification block.		/// The start bit of the identification block.
uint64_t BitcodeStartBit;		uint64_t BitcodeStartBit;

public:		public:
/// Constructs a ModuleBitcodeWriter object for the given Module,		/// Constructs a ModuleBitcodeWriter object for the given Module,
/// writing to the provided \p Buffer.		/// writing to the provided \p Buffer.
ModuleBitcodeWriter(const Module &M, SmallVectorImpl<char> &Buffer,		ModuleBitcodeWriter(const Module &M, std::vector<char> &Buffer,
StringTableBuilder &StrtabBuilder,		StringTableBuilder &StrtabBuilder,
BitstreamWriter &Stream, bool ShouldPreserveUseListOrder,		BitstreamWriter &Stream, bool ShouldPreserveUseListOrder,
const ModuleSummaryIndex *Index, bool GenerateHash,		const ModuleSummaryIndex *Index, bool GenerateHash,
ModuleHash *ModHash = nullptr)		ModuleHash *ModHash = nullptr)
: ModuleBitcodeWriterBase(M, StrtabBuilder, Stream,		: ModuleBitcodeWriterBase(M, StrtabBuilder, Stream,
ShouldPreserveUseListOrder, Index),		ShouldPreserveUseListOrder, Index),
Buffer(Buffer), GenerateHash(GenerateHash), ModHash(ModHash),		Buffer(Buffer), GenerateHash(GenerateHash), ModHash(ModHash),
BitcodeStartBit(Stream.GetCurrentBitNo()) {}		BitcodeStartBit(Stream.GetCurrentBitNo()) {}
▲ Show 20 Lines • Show All 1,709 Lines • ▼ Show 20 Lines	void ModuleBitcodeWriter::writeMetadataStrings(
if (Strings.empty())		if (Strings.empty())
return;		return;

// Start the record with the number of strings.		// Start the record with the number of strings.
Record.push_back(bitc::METADATA_STRINGS);		Record.push_back(bitc::METADATA_STRINGS);
Record.push_back(Strings.size());		Record.push_back(Strings.size());

// Emit the sizes of the strings in the blob.		// Emit the sizes of the strings in the blob.
SmallString<256> Blob;		std::vector<char> Blob;
{		{
BitstreamWriter W(Blob);		BitstreamWriter W(Blob);
for (const Metadata *MD : Strings)		for (const Metadata *MD : Strings)
W.EmitVBR(cast<MDString>(MD)->getLength(), 6);		W.EmitVBR(cast<MDString>(MD)->getLength(), 6);
W.FlushToWord();		W.FlushToWord();
}		}

// Add the offset to the strings to the record.		// Add the offset to the strings to the record.
Record.push_back(Blob.size());		Record.push_back(Blob.size());

// Add the strings to the blob.		// Add the strings to the blob.
for (const Metadata *MD : Strings)		for (const Metadata *MD : Strings) {
Blob.append(cast<MDString>(MD)->getString());		StringRef MDStr = cast<MDString>(MD)->getString();
		Blob.insert(Blob.end(), MDStr.begin(), MDStr.end());
		}

// Emit the final record.		// Emit the final record.
Stream.EmitRecordWithBlob(createMetadataStringsAbbrev(), Record, Blob);		StringRef BlobStr(Blob.data(), Blob.size());
		Stream.EmitRecordWithBlob(createMetadataStringsAbbrev(), Record, BlobStr);
Record.clear();		Record.clear();
}		}

// Generates an enum to use as an index in the Abbrev array of Metadata record.		// Generates an enum to use as an index in the Abbrev array of Metadata record.
enum MetadataAbbrev : unsigned {		enum MetadataAbbrev : unsigned {
#define HANDLE_MDNODE_LEAF(CLASS) CLASS##AbbrevID,		#define HANDLE_MDNODE_LEAF(CLASS) CLASS##AbbrevID,
#include "llvm/IR/Metadata.def"		#include "llvm/IR/Metadata.def"
LastPlusOne		LastPlusOne
▲ Show 20 Lines • Show All 2,265 Lines • ▼ Show 20 Lines	void ModuleBitcodeWriter::write() {

writeGlobalValueSymbolTable(FunctionToBitcodeIndex);		writeGlobalValueSymbolTable(FunctionToBitcodeIndex);

writeModuleHash(BlockStartPos);		writeModuleHash(BlockStartPos);

Stream.ExitBlock();		Stream.ExitBlock();
}		}

static void writeInt32ToBuffer(uint32_t Value, SmallVectorImpl<char> &Buffer,		static void writeInt32ToBuffer(uint32_t Value, std::vector<char> &Buffer,
uint32_t &Position) {		uint32_t &Position) {
support::endian::write32le(&Buffer[Position], Value);		support::endian::write32le(&Buffer[Position], Value);
Position += 4;		Position += 4;
}		}

/// If generating a bc file on darwin, we have to emit a		/// If generating a bc file on darwin, we have to emit a
/// header and trailer to make it compatible with the system archiver. To do		/// header and trailer to make it compatible with the system archiver. To do
/// this we emit the following header, and then emit a trailer that pads the		/// this we emit the following header, and then emit a trailer that pads the
/// file out to be a multiple of 16 bytes.		/// file out to be a multiple of 16 bytes.
///		///
/// struct bc_header {		/// struct bc_header {
/// uint32_t Magic; // 0x0B17C0DE		/// uint32_t Magic; // 0x0B17C0DE
/// uint32_t Version; // Version, currently always 0.		/// uint32_t Version; // Version, currently always 0.
/// uint32_t BitcodeOffset; // Offset to traditional bitcode file.		/// uint32_t BitcodeOffset; // Offset to traditional bitcode file.
/// uint32_t BitcodeSize; // Size of traditional bitcode file.		/// uint32_t BitcodeSize; // Size of traditional bitcode file.
/// uint32_t CPUType; // CPU specifier.		/// uint32_t CPUType; // CPU specifier.
/// ... potentially more later ...		/// ... potentially more later ...
/// };		/// };
static void emitDarwinBCHeaderAndTrailer(SmallVectorImpl<char> &Buffer,		static void emitDarwinBCHeaderAndTrailer(std::vector<char> &Buffer,
const Triple &TT) {		const Triple &TT) {
unsigned CPUType = ~0U;		unsigned CPUType = ~0U;

// Match x86_64-, i[3-9]86-, powerpc-, powerpc64-, arm-, thumb-,		// Match x86_64-, i[3-9]86-, powerpc-, powerpc64-, arm-, thumb-,
// armv[0-9]-, thumbv[0-9]-, armv5te-, or armv6t2-. The CPUType is a magic		// armv[0-9]-, thumbv[0-9]-, armv5te-, or armv6t2-. The CPUType is a magic
// number from /usr/include/mach/machine.h. It is ok to reproduce the		// number from /usr/include/mach/machine.h. It is ok to reproduce the
// specific constants here because they are implicitly part of the Darwin ABI.		// specific constants here because they are implicitly part of the Darwin ABI.
enum {		enum {
Show All 40 Lines	static void writeBitcodeHeader(BitstreamWriter &Stream) {
Stream.Emit((unsigned)'B', 8);		Stream.Emit((unsigned)'B', 8);
Stream.Emit((unsigned)'C', 8);		Stream.Emit((unsigned)'C', 8);
Stream.Emit(0x0, 4);		Stream.Emit(0x0, 4);
Stream.Emit(0xC, 4);		Stream.Emit(0xC, 4);
Stream.Emit(0xE, 4);		Stream.Emit(0xE, 4);
Stream.Emit(0xD, 4);		Stream.Emit(0xD, 4);
}		}

BitcodeWriter::BitcodeWriter(SmallVectorImpl<char> &Buffer)		BitcodeWriter::BitcodeWriter(std::vector<char> &Buffer)
: Buffer(Buffer), Stream(new BitstreamWriter(Buffer)) {		: Buffer(Buffer), Stream(new BitstreamWriter(Buffer)) {
writeBitcodeHeader(*Stream);		writeBitcodeHeader(*Stream);
}		}

BitcodeWriter::~BitcodeWriter() { assert(WroteStrtab); }		BitcodeWriter::~BitcodeWriter() { assert(WroteStrtab); }

void BitcodeWriter::writeBlob(unsigned Block, unsigned Record, StringRef Blob) {		void BitcodeWriter::writeBlob(unsigned Block, unsigned Record, StringRef Blob) {
Stream->EnterSubblock(Block, 3);		Stream->EnterSubblock(Block, 3);
▲ Show 20 Lines • Show All 86 Lines • ▼ Show 20 Lines	void BitcodeWriter::writeIndex(
IndexWriter.write();		IndexWriter.write();
}		}

/// Write the specified module to the specified output stream.		/// Write the specified module to the specified output stream.
void llvm::WriteBitcodeToFile(const Module &M, raw_ostream &Out,		void llvm::WriteBitcodeToFile(const Module &M, raw_ostream &Out,
bool ShouldPreserveUseListOrder,		bool ShouldPreserveUseListOrder,
const ModuleSummaryIndex *Index,		const ModuleSummaryIndex *Index,
bool GenerateHash, ModuleHash *ModHash) {		bool GenerateHash, ModuleHash *ModHash) {
SmallVector<char, 0> Buffer;		std::vector<char> Buffer;
Buffer.reserve(256*1024);		Buffer.reserve(256*1024);

// If this is darwin or another generic macho target, reserve space for the		// If this is darwin or another generic macho target, reserve space for the
// header.		// header.
Triple TT(M.getTargetTriple());		Triple TT(M.getTargetTriple());
if (TT.isOSDarwin() || TT.isOSBinFormatMachO())		if (TT.isOSDarwin() || TT.isOSBinFormatMachO())
Buffer.insert(Buffer.begin(), BWH_HeaderSize, 0);		Buffer.insert(Buffer.begin(), BWH_HeaderSize, 0);

Show All 26 Lines

// Write the specified module summary index to the given raw output stream,		// Write the specified module summary index to the given raw output stream,
// where it will be written in a new bitcode block. This is used when		// where it will be written in a new bitcode block. This is used when
// writing the combined index file for ThinLTO. When writing a subset of the		// writing the combined index file for ThinLTO. When writing a subset of the
// index for a distributed backend, provide a \p ModuleToSummariesForIndex map.		// index for a distributed backend, provide a \p ModuleToSummariesForIndex map.
void llvm::WriteIndexToFile(		void llvm::WriteIndexToFile(
const ModuleSummaryIndex &Index, raw_ostream &Out,		const ModuleSummaryIndex &Index, raw_ostream &Out,
const std::map<std::string, GVSummaryMapTy> *ModuleToSummariesForIndex) {		const std::map<std::string, GVSummaryMapTy> *ModuleToSummariesForIndex) {
SmallVector<char, 0> Buffer;		std::vector<char> Buffer;
Buffer.reserve(256 * 1024);		Buffer.reserve(256 * 1024);

BitcodeWriter Writer(Buffer);		BitcodeWriter Writer(Buffer);
Writer.writeIndex(&Index, ModuleToSummariesForIndex);		Writer.writeIndex(&Index, ModuleToSummariesForIndex);
Writer.writeStrtab();		Writer.writeStrtab();

Out.write((char *)&Buffer.front(), Buffer.size());		Out.write((char *)&Buffer.front(), Buffer.size());
}		}
▲ Show 20 Lines • Show All 143 Lines • ▼ Show 20 Lines
}		}

// Write the specified thin link bitcode file to the given raw output stream,		// Write the specified thin link bitcode file to the given raw output stream,
// where it will be written in a new bitcode block. This is used when		// where it will be written in a new bitcode block. This is used when
// writing the per-module index file for ThinLTO.		// writing the per-module index file for ThinLTO.
void llvm::WriteThinLinkBitcodeToFile(const Module &M, raw_ostream &Out,		void llvm::WriteThinLinkBitcodeToFile(const Module &M, raw_ostream &Out,
const ModuleSummaryIndex &Index,		const ModuleSummaryIndex &Index,
const ModuleHash &ModHash) {		const ModuleHash &ModHash) {
SmallVector<char, 0> Buffer;		std::vector<char> Buffer;
Buffer.reserve(256 * 1024);		Buffer.reserve(256 * 1024);

BitcodeWriter Writer(Buffer);		BitcodeWriter Writer(Buffer);
Writer.writeThinLinkBitcode(M, Index, ModHash);		Writer.writeThinLinkBitcode(M, Index, ModHash);
Writer.writeSymtab();		Writer.writeSymtab();
Writer.writeStrtab();		Writer.writeStrtab();

Out.write((char *)&Buffer.front(), Buffer.size());		Out.write((char *)&Buffer.front(), Buffer.size());
▲ Show 20 Lines • Show All 123 Lines • Show Last 20 Lines
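
`writeInt32ToBuffer` in this file overwrites bytes that were reserved earlier (for example the Darwin header space inserted at the front of the buffer) rather than appending. That pattern carries over to `std::vector<char>` unchanged, since both containers expose contiguous storage through `operator[]`. The sketch below illustrates it with the standard library only. `patchWordLE` is a hypothetical name, it takes a `std::size_t` position where the original takes `uint32_t &`, and the bounds assertion is an addition that the original function does not have.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for writeInt32ToBuffer: overwrite four already
// allocated bytes at Position with Value in little-endian order, then
// advance Position past them.
inline void patchWordLE(std::vector<char> &Buffer, std::size_t &Position,
                        std::uint32_t Value) {
  // Added safety check (an assumption of this sketch, not in the original).
  assert(Position + 4 <= Buffer.size() && "patch would run past the buffer");
  for (int I = 0; I < 4; ++I)
    Buffer[Position + I] = static_cast<char>((Value >> (8 * I)) & 0xFF);
  Position += 4;
}
```

A caller would first reserve the space, for instance with `Buffer.insert(Buffer.begin(), HeaderSize, 0)`, and later walk a position variable across the header fields with repeated calls.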

llvm/lib/ExecutionEngine/Orc/ThreadSafeModule.cpp

Show All 18 Lines	ThreadSafeModule cloneToNewContext(ThreadSafeModule &TSM,
GVPredicate ShouldCloneDef,		GVPredicate ShouldCloneDef,
GVModifier UpdateClonedDefSource) {		GVModifier UpdateClonedDefSource) {
assert(TSM && "Can not clone null module");		assert(TSM && "Can not clone null module");

if (!ShouldCloneDef)		if (!ShouldCloneDef)
ShouldCloneDef = [](const GlobalValue &) { return true; };		ShouldCloneDef = [](const GlobalValue &) { return true; };

return TSM.withModuleDo([&](Module &M) {		return TSM.withModuleDo([&](Module &M) {
SmallVector<char, 1> ClonedModuleBuffer;		std::vector<char> ClonedModuleBuffer;

{		{
std::set<GlobalValue *> ClonedDefsInSrc;		std::set<GlobalValue *> ClonedDefsInSrc;
ValueToValueMapTy VMap;		ValueToValueMapTy VMap;
auto Tmp = CloneModule(M, VMap, [&](const GlobalValue *GV) {		auto Tmp = CloneModule(M, VMap, [&](const GlobalValue *GV) {
if (ShouldCloneDef(*GV)) {		if (ShouldCloneDef(*GV)) {
ClonedDefsInSrc.insert(const_cast<GlobalValue *>(GV));		ClonedDefsInSrc.insert(const_cast<GlobalValue *>(GV));
return true;		return true;
Show All 29 Lines

llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp

Show First 20 Lines • Show All 391 Lines • ▼ Show 20 Lines	void splitAndWriteThinLTOBitcode(
ModuleSummaryIndex Index = buildModuleSummaryIndex(M, nullptr, &PSI);		ModuleSummaryIndex Index = buildModuleSummaryIndex(M, nullptr, &PSI);

// Mark the merged module as requiring full LTO. We still want an index for		// Mark the merged module as requiring full LTO. We still want an index for
// it though, so that it can participate in summary-based dead stripping.		// it though, so that it can participate in summary-based dead stripping.
MergedM->addModuleFlag(Module::Error, "ThinLTO", uint32_t(0));		MergedM->addModuleFlag(Module::Error, "ThinLTO", uint32_t(0));
ModuleSummaryIndex MergedMIndex =		ModuleSummaryIndex MergedMIndex =
buildModuleSummaryIndex(*MergedM, nullptr, &PSI);		buildModuleSummaryIndex(*MergedM, nullptr, &PSI);

SmallVector<char, 0> Buffer;		std::vector<char> Buffer;

BitcodeWriter W(Buffer);		BitcodeWriter W(Buffer);
// Save the module hash produced for the full bitcode, which will		// Save the module hash produced for the full bitcode, which will
// be used in the backends, and use that in the minimized bitcode		// be used in the backends, and use that in the minimized bitcode
// produced for the full link.		// produced for the full link.
ModuleHash ModHash = {{0}};		ModuleHash ModHash = {{0}};
W.writeModule(M, /*ShouldPreserveUseListOrder=*/false, &Index,		W.writeModule(M, /*ShouldPreserveUseListOrder=*/false, &Index,
/*GenerateHash=*/true, &ModHash);		/*GenerateHash=*/true, &ModHash);
W.writeModule(*MergedM, /*ShouldPreserveUseListOrder=*/false, &MergedMIndex);		W.writeModule(*MergedM, /*ShouldPreserveUseListOrder=*/false, &MergedMIndex);
W.writeSymtab();		W.writeSymtab();
W.writeStrtab();		W.writeStrtab();
OS << Buffer;		OS.write(Buffer.data(), Buffer.size());

// If a minimized bitcode module was requested for the thin link, only		// If a minimized bitcode module was requested for the thin link, only
// the information that is needed by thin link will be written in the		// the information that is needed by thin link will be written in the
// given OS (the merged module will be written as usual).		// given OS (the merged module will be written as usual).
if (ThinLinkOS) {		if (ThinLinkOS) {
Buffer.clear();		Buffer.clear();
BitcodeWriter W2(Buffer);		BitcodeWriter W2(Buffer);
StripDebugInfo(M);		StripDebugInfo(M);
W2.writeThinLinkBitcode(M, Index, ModHash);		W2.writeThinLinkBitcode(M, Index, ModHash);
W2.writeModule(*MergedM, /*ShouldPreserveUseListOrder=*/false,		W2.writeModule(*MergedM, /*ShouldPreserveUseListOrder=*/false,
&MergedMIndex);		&MergedMIndex);
W2.writeSymtab();		W2.writeSymtab();
W2.writeStrtab();		W2.writeStrtab();
*ThinLinkOS << Buffer;		ThinLinkOS->write(Buffer.data(), Buffer.size());
}		}
}		}

// Check if the LTO Unit splitting has been enabled.		// Check if the LTO Unit splitting has been enabled.
bool enableSplitLTOUnit(Module &M) {		bool enableSplitLTOUnit(Module &M) {
bool EnableSplitLTOUnit = false;		bool EnableSplitLTOUnit = false;
if (auto *MD = mdconst::extract_or_null<ConstantInt>(		if (auto *MD = mdconst::extract_or_null<ConstantInt>(
M.getModuleFlag("EnableSplitLTOUnit")))		M.getModuleFlag("EnableSplitLTOUnit")))
▲ Show 20 Lines • Show All 115 Lines • Show Last 20 Lines
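
Several call sites in this diff replace `OS << Buffer` with `OS.write(Buffer.data(), Buffer.size())`. The stream operator used on the left-hand side is not available for `std::vector<char>`, and an explicit pointer-plus-length write also makes it clear that embedded NUL bytes are emitted. The following sketch shows the same pattern using `std::ostream` as a stand-in for `raw_ostream`. `writeBytes` and `toString` are hypothetical helpers introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: emit every byte of Buffer, including embedded NULs.
inline void writeBytes(std::ostream &OS, const std::vector<char> &Buffer) {
  // std::ostream::write takes a signed length, so guard the conversion.
  assert(Buffer.size() <=
         static_cast<std::size_t>(std::numeric_limits<std::streamsize>::max()));
  OS.write(Buffer.data(), static_cast<std::streamsize>(Buffer.size()));
}

// Convenience wrapper for demonstration: collect the written bytes in a string.
inline std::string toString(const std::vector<char> &Buffer) {
  std::ostringstream OS;
  writeBytes(OS, Buffer);
  return OS.str();
}
```

For a buffer holding `{'B', 'C', '\0', 'x'}`, `toString` returns a four-character string whose third character is the NUL byte, which a C-string based write would have truncated.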

llvm/tools/llvm-cat/llvm-cat.cpp

	Show First 20 Lines • Show All 48 Lines • ▼ Show 20 Lines

	int main(int argc, char **argv) {			int main(int argc, char **argv) {
	cl::HideUnrelatedOptions(CatCategory);			cl::HideUnrelatedOptions(CatCategory);
	cl::ParseCommandLineOptions(argc, argv, "Module concatenation");			cl::ParseCommandLineOptions(argc, argv, "Module concatenation");

	ExitOnError ExitOnErr("llvm-cat: ");			ExitOnError ExitOnErr("llvm-cat: ");
	LLVMContext Context;			LLVMContext Context;

	SmallVector<char, 0> Buffer;			std::vector<char> Buffer;
	BitcodeWriter Writer(Buffer);			BitcodeWriter Writer(Buffer);
	if (BinaryCat) {			if (BinaryCat) {
	for (const auto &InputFilename : InputFilenames) {			for (const auto &InputFilename : InputFilenames) {
	std::unique_ptr<MemoryBuffer> MB = ExitOnErr(			std::unique_ptr<MemoryBuffer> MB = ExitOnErr(
	errorOrToExpected(MemoryBuffer::getFileOrSTDIN(InputFilename)));			errorOrToExpected(MemoryBuffer::getFileOrSTDIN(InputFilename)));
	std::vector<BitcodeModule> Mods = ExitOnErr(getBitcodeModuleList(*MB));			std::vector<BitcodeModule> Mods = ExitOnErr(getBitcodeModuleList(*MB));
	for (auto &BitcodeMod : Mods) {			for (auto &BitcodeMod : Mods) {
	Buffer.insert(Buffer.end(), BitcodeMod.getBuffer().begin(),			Buffer.insert(Buffer.end(), BitcodeMod.getBuffer().begin(),
	Show All 32 Lines

llvm/tools/llvm-modextract/llvm-modextract.cpp

Show First 20 Lines • Show All 52 Lines • ▼ Show 20 Lines	int main(int argc, char **argv) {
}		}

std::error_code EC;		std::error_code EC;
std::unique_ptr<ToolOutputFile> Out(		std::unique_ptr<ToolOutputFile> Out(
new ToolOutputFile(OutputFilename, EC, sys::fs::OF_None));		new ToolOutputFile(OutputFilename, EC, sys::fs::OF_None));
ExitOnErr(errorCodeToError(EC));		ExitOnErr(errorCodeToError(EC));

if (BinaryExtract) {		if (BinaryExtract) {
SmallVector<char, 0> Result;		std::vector<char> Result;
BitcodeWriter Writer(Result);		BitcodeWriter Writer(Result);
Result.append(Ms[ModuleIndex].getBuffer().begin(),		Result.insert(Result.end(), Ms[ModuleIndex].getBuffer().begin(),
Ms[ModuleIndex].getBuffer().end());		Ms[ModuleIndex].getBuffer().end());
Writer.copyStrtab(Ms[ModuleIndex].getStrtab());		Writer.copyStrtab(Ms[ModuleIndex].getStrtab());
Out->os() << Result;		Out->os().write(Result.data(), Result.size());
Out->keep();		Out->keep();
return 0;		return 0;
}		}

std::unique_ptr<Module> M = ExitOnErr(Ms[ModuleIndex].parseModule(Context));		std::unique_ptr<Module> M = ExitOnErr(Ms[ModuleIndex].parseModule(Context));
WriteBitcodeToFile(*M, Out->os());		WriteBitcodeToFile(*M, Out->os());

Out->keep();		Out->keep();
return 0;		return 0;
}		}

llvm/unittests/Bitstream/BitstreamReaderTest.cpp

//===- BitstreamReaderTest.cpp - Tests for BitstreamReader ----------------===//		//===- BitstreamReaderTest.cpp - Tests for BitstreamReader ----------------===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "llvm/Bitstream/BitstreamReader.h"		#include "llvm/Bitstream/BitstreamReader.h"
#include "llvm/ADT/STLExtras.h"		#include "llvm/ADT/STLExtras.h"
#include "llvm/Bitstream/BitstreamWriter.h"		#include "llvm/Bitstream/BitstreamWriter.h"
#include "gtest/gtest.h"		#include "gtest/gtest.h"

		#include <vector>

using namespace llvm;		using namespace llvm;

namespace {		namespace {

TEST(BitstreamReaderTest, AtEndOfStream) {		TEST(BitstreamReaderTest, AtEndOfStream) {
uint8_t Bytes[4] = {		uint8_t Bytes[4] = {
0x00, 0x01, 0x02, 0x03		0x00, 0x01, 0x02, 0x03
};		};
▲ Show 20 Lines • Show All 69 Lines • ▼ Show 20 Lines	TEST(BitstreamReaderTest, readRecordWithBlobWhileStreaming) {
const unsigned Magic = 0x12345678;		const unsigned Magic = 0x12345678;
const unsigned BlockID = bitc::FIRST_APPLICATION_BLOCKID;		const unsigned BlockID = bitc::FIRST_APPLICATION_BLOCKID;
const unsigned RecordID = 1;		const unsigned RecordID = 1;
for (unsigned I = 0, BlobSize = 0, E = BlobData.size(); BlobSize < E;		for (unsigned I = 0, BlobSize = 0, E = BlobData.size(); BlobSize < E;
BlobSize += ++I) {		BlobSize += ++I) {
StringRef BlobIn((const char *)BlobData.begin(), BlobSize);		StringRef BlobIn((const char *)BlobData.begin(), BlobSize);

// Write the bitcode.		// Write the bitcode.
SmallVector<char, 1> Buffer;		std::vector<char> Buffer;
unsigned AbbrevID;		unsigned AbbrevID;
{		{
BitstreamWriter Stream(Buffer);		BitstreamWriter Stream(Buffer);
Stream.Emit(Magic, 32);		Stream.Emit(Magic, 32);
Stream.EnterSubblock(BlockID, 3);		Stream.EnterSubblock(BlockID, 3);

auto Abbrev = std::make_shared<BitCodeAbbrev>();		auto Abbrev = std::make_shared<BitCodeAbbrev>();
Abbrev->Add(BitCodeAbbrevOp(RecordID));		Abbrev->Add(BitCodeAbbrevOp(RecordID));
Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Blob));		Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Blob));
AbbrevID = Stream.EmitAbbrev(std::move(Abbrev));		AbbrevID = Stream.EmitAbbrev(std::move(Abbrev));
unsigned Record[] = {RecordID};		unsigned Record[] = {RecordID};
Stream.EmitRecordWithBlob(AbbrevID, makeArrayRef(Record), BlobIn);		Stream.EmitRecordWithBlob(AbbrevID, makeArrayRef(Record), BlobIn);

Stream.ExitBlock();		Stream.ExitBlock();
}		}

// Stream the buffer into the reader.		// Stream the buffer into the reader.
BitstreamCursor Stream(		BitstreamCursor Stream(
ArrayRef<uint8_t>((const uint8_t *)Buffer.begin(), Buffer.size()));		ArrayRef<uint8_t>((const uint8_t *)Buffer.data(), Buffer.size()));

// Header. Included in test so that we can run llvm-bcanalyzer to debug		// Header. Included in test so that we can run llvm-bcanalyzer to debug
// when there are problems.		// when there are problems.
Expected<SimpleBitstreamCursor::word_t> MaybeRead = Stream.Read(32);		Expected<SimpleBitstreamCursor::word_t> MaybeRead = Stream.Read(32);
ASSERT_TRUE((bool)MaybeRead);		ASSERT_TRUE((bool)MaybeRead);
ASSERT_EQ(Magic, MaybeRead.get());		ASSERT_EQ(Magic, MaybeRead.get());

// Block.		// Block.
▲ Show 20 Lines • Show All 41 Lines • Show Last 20 Lines

llvm/unittests/Bitstream/BitstreamWriterTest.cpp

	//===- BitstreamWriterTest.cpp - Tests for BitstreamWriter ----------------===//			//===- BitstreamWriterTest.cpp - Tests for BitstreamWriter ----------------===//
	//			//
	// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.			// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
	// See https://llvm.org/LICENSE.txt for license information.			// See https://llvm.org/LICENSE.txt for license information.
	// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception			// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	#include "llvm/Bitstream/BitstreamWriter.h"			#include "llvm/Bitstream/BitstreamWriter.h"
	#include "llvm/ADT/STLExtras.h"			#include "llvm/ADT/STLExtras.h"
	#include "llvm/ADT/SmallString.h"			#include "gmock/gmock.h"
	#include "gtest/gtest.h"			#include "gtest/gtest.h"

				#include <vector>

	using namespace llvm;			using namespace llvm;

				using ::testing::IsEmpty;

	namespace {			namespace {

	TEST(BitstreamWriterTest, emitBlob) {			TEST(BitstreamWriterTest, emitBlob) {
	SmallString<64> Buffer;			std::vector<char> Buffer;
	BitstreamWriter W(Buffer);			BitstreamWriter W(Buffer);
	W.emitBlob("str", /* ShouldEmitSize */ false);			W.emitBlob("str", /* ShouldEmitSize */ false);
	EXPECT_EQ(StringRef("str\0", 4), Buffer);			EXPECT_EQ(StringRef("str\0", 4), StringRef(Buffer.data(), Buffer.size()));
	}			}

	TEST(BitstreamWriterTest, emitBlobWithSize) {			TEST(BitstreamWriterTest, emitBlobWithSize) {
	SmallString<64> Buffer;			std::vector<char> Buffer;
	{			{
	BitstreamWriter W(Buffer);			BitstreamWriter W(Buffer);
	W.emitBlob("str");			W.emitBlob("str");
	}			}
	SmallString<64> Expected;			std::vector<char> Expected;
	{			{
	BitstreamWriter W(Expected);			BitstreamWriter W(Expected);
	W.EmitVBR(3, 6);			W.EmitVBR(3, 6);
	W.FlushToWord();			W.FlushToWord();
	W.Emit('s', 8);			W.Emit('s', 8);
	W.Emit('t', 8);			W.Emit('t', 8);
	W.Emit('r', 8);			W.Emit('r', 8);
	W.Emit(0, 8);			W.Emit(0, 8);
	}			}
	EXPECT_EQ(StringRef(Expected), Buffer);			EXPECT_EQ(Expected, Buffer);
	}			}

	TEST(BitstreamWriterTest, emitBlobEmpty) {			TEST(BitstreamWriterTest, emitBlobEmpty) {
	SmallString<64> Buffer;			std::vector<char> Buffer;
	BitstreamWriter W(Buffer);			BitstreamWriter W(Buffer);
	W.emitBlob("", /* ShouldEmitSize */ false);			W.emitBlob("", /* ShouldEmitSize */ false);
	EXPECT_EQ(StringRef(""), Buffer);			EXPECT_THAT(Buffer, IsEmpty());
	}			}

	TEST(BitstreamWriterTest, emitBlob4ByteAligned) {			TEST(BitstreamWriterTest, emitBlob4ByteAligned) {
	SmallString<64> Buffer;			std::vector<char> Buffer;
	BitstreamWriter W(Buffer);			BitstreamWriter W(Buffer);
	W.emitBlob("str0", /* ShouldEmitSize */ false);			W.emitBlob("str0", /* ShouldEmitSize */ false);
	EXPECT_EQ(StringRef("str0"), Buffer);			EXPECT_EQ(StringRef("str0"), StringRef(Buffer.data(), Buffer.size()));
	}			}

	} // end namespace			} // end namespace
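
The byte layout that emitBlobWithSize expects can be reproduced without LLVM. What follows is only a minimal sketch under stated assumptions: `TinyBitWriter` and `encodeBlobWithSize` are hypothetical names introduced for illustration, not part of this patch or of LLVM's API. The sketch assumes bits are packed into 32-bit words that are written out least-significant byte first, and it targets `std::vector<char>` as the tests in this diff do.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <vector>

// Hypothetical stand-in for a bitstream writer; NOT llvm::BitstreamWriter.
class TinyBitWriter {
  std::vector<char> &Out;
  uint32_t CurValue = 0; // bits accumulated but not yet written out
  unsigned CurBit = 0;   // number of valid bits in CurValue (0..31)

  // Assumption of this sketch: words are stored least-significant byte first.
  void writeWord(uint32_t V) {
    for (unsigned I = 0; I != 4; ++I)
      Out.push_back(static_cast<char>((V >> (8 * I)) & 0xFF));
  }

public:
  explicit TinyBitWriter(std::vector<char> &O) : Out(O) {}

  // Append the low NumBits bits of Val (1 <= NumBits <= 32).
  void emit(uint32_t Val, unsigned NumBits) {
    assert(NumBits >= 1 && NumBits <= 32);
    assert(NumBits == 32 || (Val >> NumBits) == 0);
    CurValue |= Val << CurBit;
    if (CurBit + NumBits < 32) {
      CurBit += NumBits;
      return;
    }
    writeWord(CurValue);
    // Keep whichever high bits of Val did not fit into the word just written.
    CurValue = CurBit ? Val >> (32 - CurBit) : 0;
    CurBit = (CurBit + NumBits) & 31;
  }

  // Variable-bit-rate encoding: the top bit of each chunk means "more follows".
  void emitVBR(uint32_t Val, unsigned NumBits) {
    const uint32_t Threshold = 1u << (NumBits - 1);
    while (Val >= Threshold) {
      emit((Val & (Threshold - 1)) | Threshold, NumBits);
      Val >>= NumBits - 1;
    }
    emit(Val, NumBits);
  }

  // Pad the pending bits with zeros up to the next 32-bit boundary.
  void flushToWord() {
    if (CurBit) {
      writeWord(CurValue);
      CurValue = 0;
      CurBit = 0;
    }
  }
};

// Size as VBR6, align to a word, raw bytes, then zero padding to 4 bytes.
std::vector<char> encodeBlobWithSize(std::string_view Bytes) {
  std::vector<char> Buffer;
  TinyBitWriter W(Buffer);
  W.emitVBR(static_cast<uint32_t>(Bytes.size()), 6);
  W.flushToWord();
  for (char C : Bytes)
    W.emit(static_cast<unsigned char>(C), 8);
  for (std::size_t I = Bytes.size(); I % 4 != 0; ++I)
    W.emit(0, 8);
  W.flushToWord();
  return Buffer;
}
```

Under these assumptions, `encodeBlobWithSize("str")` yields eight bytes: the size word `03 00 00 00` followed by `'s' 't' 'r' 00`. This is the same sequence of calls the test uses to build its `Expected` buffer. Because the buffer contains embedded NUL bytes, any string-based comparison has to pass an explicit length, as the updated tests do with `StringRef(Buffer.data(), Buffer.size())`.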


Diff 255891
