This is an archive of the discontinued LLVM Phabricator instance.

SourceManager: Prefer Optional<MemoryBufferRef> over MemoryBuffer*
AbandonedPublic

Authored by dexonsmith on Aug 26 2019, 10:23 PM.

Details

Summary

Change the APIs in SourceManager returning const MemoryBuffer* to
return Optional<MemoryBufferRef> instead, and remove the `bool
*Invalid` flag from same. This removes the need to create dummy
invalid buffers in SrcMgr::ContentCache and clarifies buffer ownership.

A few APIs changed, and a few were added.

  • SrcMgr::ContentCache::getBuffer returns Optional<MemoryBufferRef>.
  • SrcMgr::ContentCache::getBufferIdentifier returns Optional<StringRef>.
  • SrcMgr::ContentCache::getBufferSize returns Optional<unsigned>.
  • SourceManager::getMemoryBufferForFile returns Optional<MemoryBufferRef>.
  • SourceManager::getBuffer returns Optional<MemoryBufferRef>.
  • SourceManager::getBufferOrFake returns MemoryBufferRef, replacing None with fake data.
  • SourceManager::getBufferIdentifier returns Optional<StringRef>.
  • SourceManager::getBufferIdentifierOrEmpty returns StringRef, replacing None with "".
  • SourceManager::getBufferDataOrNone returns Optional<StringRef>

SourceManager::getBufferData did not change to return
Optional<StringRef> because its usage is too pervasive. A possible
follow-up would be to add SourceManager::getBufferDataOrFake (to match
the other APIs) and move components over but I'm not sure it's urgent.

Threading MemoryBufferRef (instead of const MemoryBuffer*) through had
a few small side effects:

  • FrontendInputFile now stores a MemoryBufferRef.
  • SourceManager::createFileID now takes a MemoryBufferRef.
  • llvm::line_iterator now uses an Optional<StringRef>.
  • MemoryBufferRef::operator== now exists and compares pointer identity for comparing Optional<MemoryBufferRef>.

Diff Detail

Event Timeline

dexonsmith created this revision.Aug 26 2019, 10:23 PM
Herald added a reviewer: shafik. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript
shafik added a comment.Oct 6 2020, 5:18 PM

ASTImporter changes looks good.

arphaman added inline comments.Oct 12 2020, 9:43 AM
clang-tools-extra/clangd/SourceCode.cpp
459

I feel like all buffer requests for the main file should succeed and shouldn't need the fake buffer, unless something gone terribly wrong, Do you agree? Do you think it might be valuable to add a method like SM.getMainFileBuffer which has an llvm_unreachable if buffer is invalid?

clang/include/clang/Basic/SourceManager.h
303

This doesn't seem to be used in this patch, is this dead code?

clang/lib/Basic/SourceLocation.cpp
253

Can be simplified to return Buffer ? Buffer->getBuffer() : ""

clang/lib/Basic/SourceManager.cpp
1179

How come you're returning nullptr here instead of "<<<<INVALID BUFFER>>>>" like in the error condition above? It seems that clients will not be able to handle this nullptr.

1467

Should you return "<invalid buffer>" here to be consistent with the "invalid loc" above?

1698

NIT: Please use {} for the outer if

1706

It appears you're not checking if Buffer is None here. Previously this should've been fine as it seems getBuffer created fake recovery buffers.

dexonsmith added inline comments.Oct 12 2020, 10:51 AM
clang-tools-extra/clangd/SourceCode.cpp
459

I agree the main file should really be there. But I'd rather keep this patch as NFC as possible, so I think not in this commit. I also don't see an easy way to add that assertion in a prep commit.

WDYT of doing it in a follow-up?

clang/include/clang/Basic/SourceManager.h
303

I'll check and clean it up.

clang/lib/Basic/SourceManager.cpp
1179

Good question, but I don't think we need to care since this is NFC. See the old return statement:

return Buffer->getBufferStart() + (CharDataInvalid? 0 : LocInfo.second);
1467

To keep this NFC, this should have the same identifier that the old getBuffer call would have had. I'll double check it was empty before.

1706

That's a good point.

It actually adds a concern for me, since this patch is fairly old (rebased after over a year). Could there be calls to getBuffer that I've missed?

I can do a more careful audit.

There's a way to do this more safely: adding a new API with a different name, migrating everyone over, deleting the old API, and then renaming the new API. However, I'm concerned that will create even more churn. Do you think that would be better?

dexonsmith planned changes to this revision.Oct 12 2020, 12:43 PM

I'm going to change my approach here to be less error-prone. The real goal is changing from MemoryBuffer* to MemoryBufferRef. I was adding Optional<> as a cleanup but that's really a nice-to-have, which is better landed independently.

  1. Add getBufferOrNone and getBufferDataOrNone, which return Optional<MemoryBufferRef>.
  2. Move users of getBuffer and getBufferData that use the bool* Invalid out parameter over to that API.
  3. (this patch) Change getBuffer and getBufferData to return MemoryBufferRef, dropping the (now unused) Invalid parameter. Keep the semantics where they return a fake buffer if the real one couldn't be found.
  4. (optional, later) Rename getBuffer to getBufferOrFake (same for Data)
  5. (optional, later) Migrate audited APIs over to a new/clean getBuffer API that asserts on invalid/None

Compare to current approach in the patch:

  • good: callers will still get expected results when invalid (for those that don’t check); no audit necessary
  • good: getBuffer and getBufferData will match
  • good: more incremental
  • bad: a bit more churn for getBuffer users, since pointer semantics (MemoryBuffer*) change to reference semantics (MemoryBufferRef)
  • bad: delay cleanup where callers expecting fake data explicitly use an OrFake API
martong resigned from this revision.Oct 14 2020, 5:41 AM
dexonsmith abandoned this revision.Oct 14 2020, 4:07 PM

I finished reimplementing this patch mostly as discussed, culminating in https://reviews.llvm.org/D89431. I've rebased https://reviews.llvm.org/D89431 on top of that. Abandoning this revision.