This is an archive of the discontinued LLVM Phabricator instance.

SourceManager: Don't allocate an SLocEntry until it's loaded
Changes PlannedPublic

Authored by dexonsmith on Oct 19 2020, 4:17 PM.

Details

Summary

Replace SLocEntryLoaded (a bitmap) with a vector of optional indices
into LoadedSLocEntryTable called LoadedSLocEntryIndices, and
allocate an SLocEntry in LoadedSLocEntryTable only once it has been
loaded. The optional index is stored using the new class
LoadedSLocEntryIndex, which has similar semantics to an
Optional<unsigned> without storage overhead. It uses 0 as a sentinel
to allow zero-initialization when loading a new module and stores the
index (if present) off-by-one.

This makes the allocation for an unloaded SLocEntry 6x smaller (24B =>
4B), dramatically improving memory usage with large / many modules. It
increases memory usage for a *loaded* SLocEntry by 4B (24B => 28B).

Diff Detail

Event Timeline

dexonsmith created this revision.Oct 19 2020, 4:17 PM

This should be complementary to https://reviews.llvm.org/D89580, but I think it's independent.

teemperor requested changes to this revision.Oct 26 2020, 8:10 AM
teemperor added a subscriber: v.g.vassilev.

Maybe arc didn't correctly apply the patch, but this causes a few LLDB tests to fail on my machine:

lldb-api :: commands/expression/import-std-module/queue/TestQueueFromStdModule.py
lldb-api :: commands/expression/import-std-module/list/TestListFromStdModule.py
lldb-api :: commands/expression/import-std-module/forward_list-dbg-info-content/TestDbgInfoContentForwardListFromStdModule.py

From the backtrace this seems like the ASTImporter needs to be updated:

* frame #0: 0x00007ffff25f1bb6 _lldb.so`clang::ASTImporter::Import(clang::FileID, bool) + 566
  frame #1: 0x00007ffff25ecc56 _lldb.so`clang::ASTImporter::Import(clang::SourceLocation) + 182
  frame #2: 0x00007ffff25bbaed _lldb.so`clang::SourceLocation clang::ASTNodeImporter::importChecked<clang::SourceLocation>(llvm::Error&, clang::SourceLocation const&) + 61
  frame #3: 0x00007ffff25c34e8 _lldb.so`clang::ASTNodeImporter::VisitFunctionDecl(clang::FunctionDecl*) + 2968
  frame #4: 0x00007ffff25ebb31 _lldb.so`clang::declvisitor::Base<std::add_pointer, clang::ASTNodeImporter, llvm::Expected<clang::Decl*> >::Visit(clang::Decl*) + 81
  frame #5: 0x00007ffff25ebac2 _lldb.so`clang::ASTImporter::ImportImpl(clang::Decl*) + 34
  frame #6: 0x00007ffff0df9cb1 _lldb.so`lldb_private::ClangASTImporter::ASTImporterDelegate::ImportImpl(clang::Decl*) + 1345
  frame #7: 0x00007ffff25ed3f4 _lldb.so`clang::ASTImporter::Import(clang::Decl*) + 532
  frame #8: 0x00007ffff25aeb93 _lldb.so`llvm::Expected<clang::ValueDecl*> clang::ASTNodeImport

The tests are just loading a Clang module and then trying to import some Decls into another ASTContext, so I don't think the error here is specific to LLDB. We just don't have any ASTImporter tests that exercise modules.

FWIW, if you quickly want to see if your SourceManager changes break LLDB, then you can just set the LIT_FILTER to "Modules" and run check-lldb as LLDB's use of Clang modules are the only place where we have any valid SourceLocations within LLDB.

Regarding the patch itself:

  1. The I - 1 everywhere when translating SLocEntryLoaded values to a LoadedSLocEntryTable index is a bit cryptic. We already do similar magic in other places, but here it's really just about expressing an optional integer. Could we maybe have a wrapper function that turns this into a llvm::Optional<unsigned> (that is llvm::None when it's not loaded). Something like this maybe (which hopefully optimizes to something similar to the original code):
llvm::Optional<unsigned> getLoadedSLocEntryTableIndex(size_t Index) const {
  unsigned result = SLocEntryLoaded[Index];
  if (result == /*sentinel value*/0)
    return llvm::None;
  return result - 1U;
}
  1. If we have this as an Optional, we might as well make the sentinel value std::numeric_limits<unsigned>::max() instead of 0 in SLocEntryLoaded. Then we can just all values as-is without translating them to an index.
  1. Now that the primary purpose of SLocEntryLoaded is storing indices and not the loaded status, maybe something like SLocEntryIndices would be a better name?
  1. Do you still have your benchmarking values (or some estimates) for this? I'm just curious how much memory this actually saves.

Otherwise I think this looks good.

This revision now requires changes to proceed.Oct 26 2020, 8:10 AM

Thanks for the patch!! This is a super hot place for us (mostly due to boost). I will try it on our end and let you know!

dexonsmith planned changes to this revision.Oct 26 2020, 8:59 AM

Thanks for the patch!! This is a super hot place for us (mostly due to boost). I will try it on our end and let you know!

Great!

Maybe arc didn't correctly apply the patch, but this causes a few LLDB tests to fail on my machine:

lldb-api :: commands/expression/import-std-module/queue/TestQueueFromStdModule.py
lldb-api :: commands/expression/import-std-module/list/TestListFromStdModule.py
lldb-api :: commands/expression/import-std-module/forward_list-dbg-info-content/TestDbgInfoContentForwardListFromStdModule.py

From the backtrace this seems like the ASTImporter needs to be updated:

Thanks for doing that legwork, I'll take a look.

The tests are just loading a Clang module and then trying to import some Decls into another ASTContext, so I don't think the error here is specific to LLDB. We just don't have any ASTImporter tests that exercise modules.

I'll add one.

FWIW, if you quickly want to see if your SourceManager changes break LLDB, then you can just set the LIT_FILTER to "Modules" and run check-lldb as LLDB's use of Clang modules are the only place where we have any valid SourceLocations within LLDB.

This is helpful, thanks. I also need to deal with whatever entitlements check-lldb needs (new machine since the last time and I remember it being non-trivial) but I'll figure that out.

  1. The I - 1 everywhere when translating SLocEntryLoaded values to a LoadedSLocEntryTable index is a bit cryptic. [...]

Great comment; I think the way I'll do this is to create a simple wrapper type that degrades to Optional<unsigned>.

  1. If we have this as an Optional, we might as well make the sentinel value std::numeric_limits<unsigned>::max() instead of 0 in SLocEntryLoaded. Then we can just all values as-is without translating them to an index.

Could do, but there's a tradeoff since then a resize operation can't/doesn't use zero-initialization. As a result I have a (weak) preference for a 0-sentinel (probably my next patch will keep that semantic), but I'm open to changing it.

  1. Now that the primary purpose of SLocEntryLoaded is storing indices and not the loaded status, maybe something like SLocEntryIndices would be a better name?

Good idea.

  1. Do you still have your benchmarking values (or some estimates) for this? I'm just curious how much memory this actually saves.

I'll find some numbers if Vassil hasn't shared his by the time I've fixed the patch.

The tests are just loading a Clang module and then trying to import some Decls into another ASTContext, so I don't think the error here is specific to LLDB. We just don't have any ASTImporter tests that exercise modules.

I'll add one.

I'm actually not even sure if we have a nice framework where we could test modules + ASTImporter, so don't feel obligated to solve this as part of this patch.

FWIW, if you quickly want to see if your SourceManager changes break LLDB, then you can just set the LIT_FILTER to "Modules" and run check-lldb as LLDB's use of Clang modules are the only place where we have any valid SourceLocations within LLDB.

This is helpful, thanks. I also need to deal with whatever entitlements check-lldb needs (new machine since the last time and I remember it being non-trivial) but I'll figure that out.

  1. The I - 1 everywhere when translating SLocEntryLoaded values to a LoadedSLocEntryTable index is a bit cryptic. [...]

Great comment; I think the way I'll do this is to create a simple wrapper type that degrades to Optional<unsigned>.

Thanks!

  1. If we have this as an Optional, we might as well make the sentinel value std::numeric_limits<unsigned>::max() instead of 0 in SLocEntryLoaded. Then we can just all values as-is without translating them to an index.

Could do, but there's a tradeoff since then a resize operation can't/doesn't use zero-initialization. As a result I have a (weak) preference for a 0-sentinel (probably my next patch will keep that semantic), but I'm open to changing it.

True. Let's keep the 0 sentinel value for now then, I don't think the small advantage of using 2^32 is worth that trouble.

  1. Now that the primary purpose of SLocEntryLoaded is storing indices and not the loaded status, maybe something like SLocEntryIndices would be a better name?

Good idea.

  1. Do you still have your benchmarking values (or some estimates) for this? I'm just curious how much memory this actually saves.

I'll find some numbers if Vassil hasn't shared his by the time I've fixed the patch.

dexonsmith edited the summary of this revision. (Show Details)

Made the planned changes to SourceManager, wrapping the index in LoadedSLocEntryIndex which acts like an Optional<unsigned> (but still using 0 for a sentinel to allow zero-initialization).

dexonsmith planned changes to this revision.Oct 26 2020, 10:39 AM

I still need to investigate the LLDB test failures related to ASTImporter.

Added a unit test for LoadedSLocEntryIndex.

dexonsmith planned changes to this revision.Oct 26 2020, 10:51 AM

I still need to investigate the LLDB test failures related to ASTImporter.

(still to-do)

Fix name of LoadedSLocEntryIndices (I documented it with this name, but was using SLocEntryIndices previously).

dexonsmith planned changes to this revision.Oct 26 2020, 11:00 AM

(still need to investigate ASTImporter)

The tests are just loading a Clang module and then trying to import some Decls into another ASTContext, so I don't think the error here is specific to LLDB. We just don't have any ASTImporter tests that exercise modules.

FWIW, if you quickly want to see if your SourceManager changes break LLDB, then you can just set the LIT_FILTER to "Modules" and run check-lldb as LLDB's use of Clang modules are the only place where we have any valid SourceLocations within LLDB.

I haven't reproduced yet (still building), but I think I found the problem by inspection:

Expected<FileID> ASTImporter::Import(FileID FromID, bool IsBuiltin) {
  llvm::DenseMap<FileID, FileID>::iterator Pos = ImportedFileIDs.find(FromID);
  if (Pos != ImportedFileIDs.end())
    return Pos->second;

  SourceManager &FromSM = FromContext.getSourceManager();
  SourceManager &ToSM = ToContext.getSourceManager();
  const SrcMgr::SLocEntry &FromSLoc = FromSM.getSLocEntry(FromID);

  // Various recursive calls to Import.

After this patch, the address of SLocEntry is potentially invalidated by subsequent calls to getSLocEntry, since a load can increase the capacity of LoadedSLocEntryTable and move its allocation.

That means it's not safe to store an address from getSLocEntry when there will be another call. I can update ASTImporter to use a copy, but I think this is too much of a gotcha... I'll think more about what to do, but here are the ideas I have:

  1. Return an SLocEntry by-value. After https://reviews.llvm.org/D89580 that's just 24B (down from 40B), so maybe this is reasonable.
  2. Return an SLocEntryRef by-value, a new type that is (say) a pointer to the correct SLocEntryTable and an index into it. This would be 16B.

Let me know if you have thoughts.

@teemperor, I don't get test failures:

% (cd build && env LIT_FILTER=Modules ninja -k20 check-lldb)
ld: warning: spew

Testing Time: 10.54s
  Excluded   : 2105
  Unsupported:   17
  Passed     :   13

Could this be because I used -DLLDB_USE_SYSTEM_DEBUGSERVER=ON?

If you're still seeing them, can you try the speculative fix in https://reviews.llvm.org/D90192 to confirm I understand the issue?

That means it's not safe to store an address from getSLocEntry when there will be another call. I can update ASTImporter to use a copy, but I think this is too much of a gotcha... I'll think more about what to do, but here are the ideas I have:

  1. Return an SLocEntry by-value. After https://reviews.llvm.org/D89580 that's just 24B (down from 40B), so maybe this is reasonable.
  2. Return an SLocEntryRef by-value, a new type that is (say) a pointer to the correct SLocEntryTable and an index into it. This would be 16B.

Let me know if you have thoughts.

  1. Use indirection (like https://github.com/Teemperor/llvm/commit/a06b21cbc55c6d2f1d2bf6f39771411ccc17342b, but doesn't have to be lazy) to allocate SLocEntrys in non-contiguous chunks, with stable addresses.

Unfortunately, the patch does not apply against llvm9 (which is what we have as experimental setup) not to speak against llvm5 which is our production setup.

Is the performance overhead for loaded module a % of the overall size of the source files?

Sorry for the naive question but what is a unloaded module?

clang/include/clang/Basic/SourceManager.h
703

We can probably remove the bitvector header include.

Is the performance overhead for loaded module a % of the overall size of the source files?

SourceManager::PrintStats almost has the information you need. One of the lines is:

llvm::errs() << LoadedSLocEntryTable.size()
               << " loaded SLocEntries allocated, "

This is how many SLocEntries have been allocated for imported ASTs (modules / PCH). But unless the corresponding bit in SLocEntryLoaded is set, it hasn't actually been "loaded", a spot has just been reserved. The first call to getSLocEntry for that location will trigger a load from the serialized AST.

Here's how you can estimate the memory usage change:

uint64_t OldMem = 0, NewMem = 0;

// Add each allocated SLocEntry. NextPowerOf2 is an estimate of the SmallVector
// load factor.
OldMem += llvm::capacity_in_bytes(LoadedSLocEntryTable);
NewMem += NextPowerOf2(SLocEntryTableLoaded.count()) * sizeof(SLocEntry);

// Add the side table overhead. We know precisely what the load factor of
// is in this case, since we have one that's the same size.
OldMem += llvm::capacity_in_bytes(SLocEntryTableLoaded);
NewMem += LoadedSLocEntryTable.capacity() * sizeof(unsigned);

If you do that calculation at the end of the compilation, you can see the memory savings.

Sorry for the naive question but what is a unloaded module?

It's an unloaded SLocEntry. After the module has been imported, the SLocEntrys are allocated, but lazily read from disk on first access. If it hasn't been read from disk yet, I'm calling it "unloaded".

clang/include/clang/Basic/SourceManager.h
703

Nice catch, I'll fix that.