Also make the summary ref and call graph vectors immutable. This means
a smaller API surface and fewer places to audit for non-determinism.
Details
Diff Detail
- Build Status
Buildable 2246 Build 2246: arc lint + arc unit
Event Timeline
llvm/include/llvm/IR/ModuleSummaryIndex.h | ||
---|---|---|
45 | The comment is outdated: IndirectCalls in per-module index are GUID-based. | |
175 | I didn't know about OwningArrayRef, interesting. | |
180 | It seems to me that this is causing memory allocation and copy that could be avoided: because Refs is an ArrayRef we're allocating and copying into RefEdgeList. However looking at call site, if Refs comes from makeRefList we just created a vector there. If it comes from findRefEdges we just created a SetVector. In both cases it looks like we could take and store a std::vector that would be moved in. | |
llvm/lib/Analysis/ModuleSummaryAnalysis.cpp | ||
92 | Fusing these looks much more sane! | |
llvm/lib/Bitcode/Reader/BitcodeReader.cpp | ||
4798 | Ret.reserve( Record.size()); | |
4806 | Ret.reserve( Record.size()); |
This looks like a good cleanup. One minor suggestion below. I see Mehdi just responded with a couple of suggestions, so will hold off on accepting for now.
llvm/lib/Analysis/ModuleSummaryAnalysis.cpp | ||
---|---|---|
138 | CalleeId doesn't seem like the best name anymore. Just "Callee"? |
- Address review comments
llvm/include/llvm/IR/ModuleSummaryIndex.h | ||
---|---|---|
175 | I guess it saves us a pointer as well as the additional space reserved by the std::vector, since the size is fixed. I haven't measured how important this is in practice, though. | |
180 | I guess there's one way that saves copies and another that saves memory. In the absence of evidence pointing one way or the other, I'd be inclined to go with the slightly simpler approach, i.e. the one I took. | |
llvm/lib/Analysis/ModuleSummaryAnalysis.cpp | ||
138 | I folded it into the only use. |
llvm/include/llvm/IR/ModuleSummaryIndex.h | ||
---|---|---|
175 | Oh makes sense! There is no "capacity" so you save a pointer (2 instead of 3). | |
180 | It is not clear to me why you think that "not simply using std::vector everywhere" is simpler though? This means hitting the heap a lot without good reason (that I perceive right now), which we traditionally avoid as much as possible I believe (reusing SmallVector across records in the BitcodeReader for example). Because of that it does not seem "one way or another to me": adding this extra malloc/free/copy sequence for saving a pointer should be motivated. (And I just saved 8B for this structure in D27970, if memory was a concern we would have started with this) |
llvm/include/llvm/IR/ModuleSummaryIndex.h | ||
---|---|---|
180 |
To avoid the copy we'd need to add a way to move the std::vector out of MapVector/SetVector. Not a big deal but it would make the interface for those classes a little more complicated.
I'd say that copying the vector is the "default position" as it wouldn't require more API surface in MapVector/SetVector. It's also the status quo as we were effectively copying the data structure in using add*Edges before. And while we're copying the vector we might as well use a slightly better data structure here. But I can see how using OwningArrayRef here could be seen as taking a position on whether the memory savings are worth it, which I don't (see also other thread), so I guess I don't have a problem with moving the vector. |
The comment is outdated: IndirectCalls in per-module index are GUID-based.