This is an archive of the discontinued LLVM Phabricator instance.

Produce a more informative diagnostics when Clang runs out of source locations
ClosedPublic

Authored by rsmith on Nov 9 2022, 3:55 PM.

Details

Reviewers
aaron.ballman
Summary

If Clang runs out of source location address space, produce a more helpful diagnostic than "ran out of source locations" or "translation unit is too large". In addition to that diagnostic, also describe the current source location address space usage, listing the header files that are contributing most to that as well as how many times they were textually entered. The intent is to make it easier to determine if the problem is some kind of misconfiguration (for example, a header isn't properly include-guarded and gets textually entered a lot, or is entered in many AST files), problematic input (for example, a preprocessor metaprogram uses a huge amount of source location space), or a death by a thousand cuts due to the source program just plain being too large.

Also included is a debug pragma to produce the usage report, both to make this more readily testable and to provide visibility into source location address space usage when debugging clang.

Diff Detail

Event Timeline

rsmith created this revision.Nov 9 2022, 3:55 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 9 2022, 3:55 PM
Herald added a subscriber: mgrang. · View Herald Transcript
rsmith requested review of this revision.Nov 9 2022, 3:55 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 9 2022, 3:55 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
rsmith updated this revision to Diff 474384.Nov 9 2022, 4:06 PM
  • Update SourceLocationsOverflow test to expect the improved diagnostics.
  • Slightly change inc2.h to make the test run 22x faster.
rsmith updated this revision to Diff 474385.Nov 9 2022, 4:06 PM
rsmith updated this revision to Diff 474386.
alexfh added a subscriber: alexfh.Nov 10 2022, 3:48 AM

Thanks for adding this diagnostic! I wonder whether we can include SLoc usage information into -print-stats output?

clang/lib/Basic/SourceManager.cpp
2251

Could you add a comment about the meaning of negative IDIndex values?

clang/test/Lexer/SourceLocationsOverflow.c
7

You probably didn't mean to match local paths on your machine. Same below.

Thank you for working on this, I think this will help folks who run out of source locations. Can you also add a release note and public docs for this extension?

clang/include/clang/Basic/SourceManager.h
1695–1696

Not that I'm opposed, but how did you arrive at 50?

clang/lib/Basic/SourceManager.cpp
675

Are you planning to do this as part of this patch, or is this more of an aspirational FIXME for the future?

2281–2282

llvm::copy?

clang/lib/Lex/Pragma.cpp
1187

~0ULL?

1190–1195

These predicates can be combined into one if statement.

clang/test/Misc/sloc-usage.cpp
2

Should we do anything special about source locations from the command line, or do those not contribute source locations? (e.g., -D or -U, force include file, etc flags)

(Testing a forced include would be interesting regardless, just to ensure we catch those the same as we do an #include.)

shafik added a subscriber: shafik.Nov 10 2022, 9:05 PM
shafik added inline comments.
clang/lib/Basic/SourceManager.cpp
2251

Maybe make -(int)LoadedSLocEntryTable.size() a named variable may help readability as well. Perhaps also for (int)LocalSLocEntryTable.size() as well.

rsmith updated this revision to Diff 475684.Nov 15 2022, 10:07 PM
rsmith marked 5 inline comments as done.

I wonder whether we can include SLoc usage information into -print-stats output?

Yeah, we could extend the source manager information there with something like this. It's a bit awkward to reuse this implementation because it's geared around producing notes, but it should be fairly straightforward to factor out some common code and extend -print-stats with it in a follow-up change.

clang/include/clang/Basic/SourceManager.h
1695–1696

I was aiming for a number that's small enough that the diagnostic won't run for pages (will fit into a relatively small scroll buffer) but large enough that it should capture problematic usage.

But, you know what, we should be consistent. A default of 32 would match our default limit for notes in constexpr and template backtraces; let's use that. Plus powers of 2 are the best arbitrary numbers.

clang/lib/Basic/SourceManager.cpp
675

This is aspirational. I think it'll be a lot of work to make all the transitive callers of this be able to handle it failing.

2251

Refactored to make it clearer what's going on, added comments.

2281–2282

This is forming a vector of iterators, not a vector of values. Hm, but... I think I can express this better with MapVector and avoid this copy in the process.

clang/lib/Lex/Pragma.cpp
1187

That would be (unsigned long long)-1, which isn't necessarily the same value.

But I think we can do better than forming a large unsigned number here; changed to use Optional.

clang/test/Misc/sloc-usage.cpp
2

For -D etc, we get output that looks like this:

<built-in>:1:1: note: file entered 2 times using 14778B of space
# 1 "<built-in>" 3
^

... where the <built-in>:1:1 is the location of the first time we enter a FileID with a null corresponding const FileEntry*. I think these will all correspond to predefines and similar things, so the <built-in>:1:1 source location seems pretty reasonable.

I've added a test for -include.

aaron.ballman accepted this revision.Nov 16 2022, 5:24 AM

LGTM! Please add a release note when landing so folks know about the improvements here. Thanks!

clang/include/clang/Basic/SourceManager.h
1695–1696

Thanks, that makes more sense! Someday (not today) we might want to consider making this a named constant and using it rather than magic numbers.

clang/lib/Basic/SourceManager.cpp
675

SGTM, thanks!

clang/lib/Lex/Pragma.cpp
1187

Even better! Thanks

This revision is now accepted and ready to land.Nov 16 2022, 5:24 AM

LGTM! Please add a release note when landing so folks know about the improvements here. Thanks!

Thanks, done and landed.