This is an archive of the discontinued LLVM Phabricator instance.

[CFG] Fix a flaky crash in CFGBlock::getLastCondition().
ClosedPublic

Authored by NoQ on Nov 7 2019, 12:50 PM.

Details

Summary

Using an end iterator of an empty CFG block boiled down to dereferencing a garbage pointer.

This was fun to debug because the actual segfault occurs once in ~20 runs on the original code (on my system; on top of that, each run took several minutes). On the newly added test it crashes even more rarely, roughly once in 500 runs.

CFG uses llvm::BumpVector for storing the list of elements. Its iterators are typedefs for raw pointers, so there's no way to check the correctness of the iterator by injecting assertions into it.

[12:26:29] <@NoQ> I'm about to commit a fix for a flaky crash that's reproducible once in ~1000 compilations. Can we make for-loops in lit?
[12:27:12] <@jdoerfert> @NoQ: @jdenny: has an extension to do that (I think)
[12:36:40] <@NoQ> @jdoerfert: Thanks!
[12:36:59] <@jdoerfert> @NoQ: so, I doubt we have on in-tree
[12:37:21] <@NoQ> Mm, ok. I guess i could copy-paste the run-line :)
[12:37:36] <@lebedev.ri> i remember seeing previous fixes with such idea, but i don't recall how they achieved that
[12:37:37] <@jdoerfert> that is one way, yes ;)
[12:38:16] <@jdoerfert> #include <>; #include<>; #include<>; ... exponential growth!

Diff Detail

Event Timeline

NoQ created this revision.Nov 7 2019, 12:50 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 7 2019, 12:50 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
NoQ marked an inline comment as done.Nov 7 2019, 12:55 PM
NoQ added inline comments.
clang/test/Analysis/a_flaky_crash.cpp
20

This is a trade-off between reliability and not increasing test time too much. This test starts with an "a", so it fires up immediately, and on my machine it's not the last test to finish during check-clang-analysis, so i guess it shouldn't cause too much trouble.

NoQ edited the summary of this revision. (Show Details)Nov 7 2019, 12:58 PM
NoQ added a subscriber: jdenny.

Nice catch! Though, wouldn't the memory sanitizer buildbots break on this reliably?

Szelethus added inline comments.Nov 8 2019, 5:18 AM
clang/lib/Analysis/CFG.cpp
5882 ↗(On Diff #228292)

What would that even be?

NoQ marked an inline comment as done.Nov 8 2019, 2:34 PM

Nice catch! Though, wouldn't the memory sanitizer buildbots break on this reliably?

I kinda hope so; i'll take a look at home.

clang/lib/Analysis/CFG.cpp
5882 ↗(On Diff #228292)

Dunno. Excellent question. I'm, like, curious if there's a better behavior we can implement.

Here's the CFG for the current test case:

FIXME: It's not particularly obvious from the dump that [B5.6] is a MaterializeTemporaryExpr.

I can't say i'm a big fan of how it looks. The temp dtor branch (terminator of [B4]) is unnecessary here, given that the short circuit never happens. But that's our typical problems with how CFG cuts off unreachable branches but never canonicalizes itself after that.

One way or another, we were crashing when calling getLastCondition() for [B2]. Its terminator is the if-statement. It probably doesn't make much sense to return the if-statement, but the answer that you're looking for is most likely [B4.1] rather than nullptr.

NoQ added a comment.Nov 11 2019, 4:34 PM
In D69962#1739440, @NoQ wrote:

Nice catch! Though, wouldn't the memory sanitizer buildbots break on this reliably?

I kinda hope so; i'll take a look at home.

Memory sanitizer at home: *fails with 1700 uninitialized reads in TableGen*.

I guess i could commit test first, then get buildbots to fail, then if they do commit the code, otherwise commit the code and multiply the run-lines.

Szelethus accepted this revision.Nov 12 2019, 7:20 AM
In D69962#1741503, @NoQ wrote:
In D69962#1739440, @NoQ wrote:

Nice catch! Though, wouldn't the memory sanitizer buildbots break on this reliably?

I kinda hope so; i'll take a look at home.

Memory sanitizer at home: *fails with 1700 uninitialized reads in TableGen*.

Try to add a non-sanitizer built tablegen: -DCLANG_TABLEGEN=/path/to/non/sanitized/clang-tblgen -DLLVM_TABLEGEN=/path/to/non/sanitized/llvm-tblgen

I guess i could commit test first, then get buildbots to fail, then if they do commit the code, otherwise commit the code and multiply the run-lines.

This revision is now accepted and ready to land.Nov 12 2019, 7:20 AM

Accepted accidentally. Well, in any case, I trust your judgement on this! I'd prefer not to commit the test file as-is.

NoQ added a comment.Nov 13 2019, 1:29 PM

Try to add a non-sanitizer built tablegen: -DCLANG_TABLEGEN=/path/to/non/sanitized/clang-tblgen -DLLVM_TABLEGEN=/path/to/non/sanitized/llvm-tblgen

Yay nice! The clang that i built this way is still unusable though, i.e. crashes on almost all tests :( I guess i'm doing something wrong.

I'd like to commit the test as is and then keep looking for the solution in the background.

Szelethus added a comment.EditedNov 14 2019, 12:32 AM
In D69962#1744679, @NoQ wrote:

Try to add a non-sanitizer built tablegen: -DCLANG_TABLEGEN=/path/to/non/sanitized/clang-tblgen -DLLVM_TABLEGEN=/path/to/non/sanitized/llvm-tblgen

Yay nice! The clang that i built this way is still unusable though, i.e. crashes on almost all tests :( I guess i'm doing something wrong.

I'd like to commit the test as is and then keep looking for the solution in the background.

Could you just give one last shot to commiting the known-to-be-incorrect test to see how the buildbots react first?

edit: I'd also be happy to take a look during the weekend.

This revision was automatically updated to reflect the committed changes.