This is an archive of the discontinued LLVM Phabricator instance.

Thread safety analysis: Improve documentation for ASSERT_CAPABILITY
ClosedPublic

Authored by ryanofsky on Sep 14 2020, 12:58 PM.

Details

Summary

Previous description didn't actually state the effect the attribute has on
thread safety analysis (causing analysis to assume the capability is held).

Previous description was also ambiguous about (or slightly overstated) the
noreturn assumption made by thread safety analysis, implying the assumption had
to be true about the function's behavior in general, and not just its behavior
in places where it's used. Stating the assumption specifically should avoid a
perceived need to disable thread safety analysis in places where only asserting
that a specific capability is held would be better. (For an example, see
https://github.com/bitcoin/bitcoin/pull/19929)

Diff Detail

Event Timeline

ryanofsky created this revision.Sep 14 2020, 12:58 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 14 2020, 12:58 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
ryanofsky requested review of this revision.Sep 14 2020, 12:58 PM
aaronpuchert added a subscriber: aaronpuchert.

We don't really have a good understanding of ASSERT_CAPABILITY ourselves. For example, there is this loophole:

void h() {
  mu.AssertHeld();
  mu.Unlock();
}

One would expect to get a warning forcing the user to add a RELEASE(mu)annotation on h, but there is none. Should we disallow releasing of asserted capabilities?

In some way, ASSERT_CAPABILITY is a bit of an outsider that doesn't really fit in with the other attributes.

clang/docs/ThreadSafetyAnalysis.rst
450

The assertion is just a promise though, so it could only happen in debug builds, for example.

452

That's an interesting point. We currently do assume that the capability is held even when an exception is thrown, although that's primarily because Clang's source-level CFG doesn't really model exception handling that well:

Mutex mu;
bool a GUARDED_BY(mu);

void except() {
  try {
    mu.AssertHeld();
  } catch (...) {
    a = 0; // No warning.
  }
}

We can't warn there because AssertHeld might terminate in case the mutex isn't held.

Not sure if we need to clarify this here: we just assume that the capability is held on a regular return. The other case is basically UB for us.

453

It's implemented a bit differently:

Mutex mu;
bool a GUARDED_BY(mu);

void f() {
  a = 0; // warning.
  mu.AssertHeld();
  a = 0; // Ok.
}

If we were to assume that mu is held at the call site, it would also be held right before it.

I'm not sure if this is the right behavior, but assuming that it is, calling an ASSERT_CAPABILITY function introduces an assumption which can be used from that point on, and which disappears on join points with branches that don't have the assumption.

456–457

"The function is assumed to return only if the given capability is held."?

aaronpuchert added a comment.EditedSep 14 2020, 5:15 PM

You can view it this way: there is a dynamic set and a static set of capabilities. The static set is always the same at any particular point in a function, regardless of the circumstances we're called from. It's what we determine in the analysis. The dynamic set depends on who called us and could be greater. To illustrate the difference:

void f() {
  // mu is dynamically held when called from g and not held when called from h.
  // It is not held statically.
}
void g() REQUIRES(mu) {
  // Here mu is statically and dynamically held.
  f();
}
void h() EXCLUDES(mu) {
  // Here mu is statically and dynamically not held.
  f();
}

Basically ASSERT_CAPABILITY is an alternative to propagating REQUIRES through the stack. Sometimes that's not feasible for various reasons, so we can't verify at compile-time that a lock is held. Asserting converts a dynamically held capability into a statically held capability. So locks that the programmer is sure should be held can be assumed by the analysis.

In the discussion on the bitcoin repo I got the impression that you want both runtime checks and compile-time checks in parallel, instead of having either of them. Then maybe you shouldn't use ASSERT_CAPABILITY. But EXCLUSIVE_LOCKS_REQUIRED on the assertion is also strange. If you want the assertion to exist in parallel to the Analysis and not influence it, don't annotate it at all. But maybe I misunderstood the discussion. If you want to convert dynamically held capabilities into statically held capabilities, i.e. let the analysis assume a lock is held despite not being able to prove it otherwise, then this is your attribute. (The existing LockAssertion class that the change removed looks like it should use ASSERT_CAPABILITY instead—please don't use ACQUIRE when the capability is assumed to be held previously.)

We're aware that there are loop holes, especially because the Analysis doesn't integrate into the type system. A deliberate decision, because that would affect things like overloading and other core language features. As a consequence, you can override a function with another function that has additional REQUIRES attributes (something we could warn about), or cast away REQUIRES from a function type before calling it. (Which we can't warn about, because the attributes aren't allowed on types. They are always attached to a variable or function declaration.)

vasild added a subscriber: vasild.Sep 15 2020, 1:45 AM
vasild added inline comments.
clang/docs/ThreadSafetyAnalysis.rst
452–454

Maybe don't mention "throwing", given the example above and s/at the call/after the call/.
It is similar to:

void f(char x, unsigned char y)
{
    assert(x >= 0);
    assert(y <= 127);
    if (x > y) { // it is ok to omit a warning about comparing signed and unsigned because AFTER the asserts we know we are good.
        ...
    }
}

Great feedback! I need to absorb it all then I'll fix the changeset. The mistakes about exceptions came from me taking "(no return)" in the previous documentation too literally thinking it was referring to https://en.cppreference.com/w/cpp/language/attributes/noreturn.

Re: "I got the impression that you want both runtime checks and compile-time checks in parallel". Personally I only want the compile checks, and would like to drop all the runtime checks except in handful of places where compile time checks don't work. ASSERT_CAPABILITY does a great job in these places so I don't want people to be scared off from using it unduly because of documentation!

The mistakes about exceptions came from me taking "(no return)" in the previous documentation too literally thinking it was referring to https://en.cppreference.com/w/cpp/language/attributes/noreturn.

The key here is the word "assumed". We treat the function as if it looks like this:

[[noreturn]] void error();

void assertHeld(Mutex &mu) ASSERT_CAPABILITY(mu) {
  if (!mu.isLocked())
    error();
}

So we assume that the function doesn't return if the mutex isn't held. (For the analysis, not for generating code.)

But the idea—as usual with assertions—is that they just verify assumptions that should hold anyway, so they are not integral part of the code and can be dropped, like the standard assert when NDEBUG is defined. And that's what we might clarify here. The right model is that this function converts dynamically held capabilities into statically held capabilities. It doesn't have to actually do anything, instead it's a marker for a promise: the programmer can't prove that the capability is held but promises it. And in some build profiles we might check that the promise is satisfied, but we don't guarantee that we do.

Which obviously means that an ASSERT_CAPABILITY function should only be used as a last resort, and it makes some sense to me that some bitcoin contributors want to have the standard assertion annotated with EXCLUSIVE_LOCKS_REQUIRED so that people don't accidentally defeat the Analysis by adding assertions. I can't really speak as to whether that's a valid concern, but there is certainly no misunderstanding.

Re: "I got the impression that you want both runtime checks and compile-time checks in parallel". Personally I only want the compile checks, and would like to drop all the runtime checks except in handful of places where compile time checks don't work.

Indeed, now that I've re-read the discussion I see that the argument about guarding against compiler bugs wasn't yours. I guess you have to discuss in the bitcoin community how much to trust the compiler, although you might just end up with Ken Thompson's “Reflections on Trusting Trust”. In my view you're probably better off testing with ThreadSanitizer than adding runtime checks where the Analysis says the lock is held, but I do see the point of nudging people to add annotations when they have an assertion. No misunderstanding there as well.

ASSERT_CAPABILITY does a great job in these places so I don't want people to be scared off from using it unduly because of documentation!

If you're referring to @vasild's comment I hope that we cleared this up. The attribute is meant to mirror the behavior of the standard assert macro.

(The existing LockAssertion class that the change removed looks like it should use ASSERT_CAPABILITY instead—please don't use ACQUIRE when the capability is assumed to be held previously.)

Not sure the try { AssertHeld } catch (...) { a = 0; } example reveals very much: it seems like thread safety annotations aren't checked within a catch block at all?

Mutex m;
int i GUARDED_BY(m);

static void thrower() { throw std::exception(); }
void f() { i = 1; }
void g() {
    try {
        thrower();
    } catch (...) {
        i = 5;
    }
    i = 6;
}

gives me warnings for i=1 and i=6 but not i=5 ?

Extending the above with:

void AssertHeld() ASSERT_CAPABILITY(m) { return; }
void h(bool b) {
    try {
        if (b) thrower();
        i = 7;
        AssertHeld();
        i = 8;
   } catch (...) { }
   i = 9;
}

only gives me a warning for i = 7, even though when b is true i = 9 is just as bad as the i = 6 assignment that did generate a warning.

Using a dummy RAII class with SCOPED_CAPABILITY that does ACQUIRE/RELEASE without modifying the actual mutex seems like it behaves more sensibly to me...

class SCOPED_LOCKABLE assumer
{
public:
    assumer(Mutex& m_) EXCLUSIVE_LOCK_FUNCTION(m_) { }
    ~assumer() UNLOCK_FUNCTION() { }
};

void h(bool b) {
    try {
        if (b) thrower();
        i = 7;
        assumer a(m);
        i = 8;
    } catch (...) { }
    i = 9;
}

gives me warnings for both i=7 and i=9 but allows i=8.

aaronpuchert added a comment.EditedSep 17 2020, 12:38 AM

Not sure the try { AssertHeld } catch (...) { a = 0; } example reveals very much: it seems like thread safety annotations aren't checked within a catch block at all?

Mutex m;
int i GUARDED_BY(m);

static void thrower() { throw std::exception(); }
void f() { i = 1; }
void g() {
    try {
        thrower();
    } catch (...) {
        i = 5;
    }
    i = 6;
}

gives me warnings for i=1 and i=6 but not i=5 ?

It's not that we don't check catch-blocks, but rather that the Clang CFG is incomplete:

 [B5 (ENTRY)]
   Succs (1): B4

 [B1]
   1: 6
   2: i
   3: [B1.2] = [B1.1]
   Preds (2): B3 B4
   Succs (1): B0

 [B2]
   T: try ...
   Succs (1): B3

 [B3]
  catch (...):
   1: catch (...) {
    [B3.4];
}
   2: 5
   3: i
   4: [B3.3] = [B3.2]
   Preds (1): B2
   Succs (1): B1

 [B4]
   1: thrower
   2: [B4.1] (ImplicitCastExpr, FunctionToPointerDecay, void (*)(void))
   3: [B4.2]()
   Preds (1): B5
   Succs (1): B1

 [B0 (EXIT)]
   Preds (1): B1

There is no edge from the thrower call to B2 or B3. We can turn on "exception handling edges" from non-noexcept calls to catch blocks, but that's a bigger change and it would affect other warnings as well. There are also performance problems. (https://bugs.llvm.org/show_bug.cgi?id=43311)

Using a dummy RAII class with SCOPED_CAPABILITY that does ACQUIRE/RELEASE without modifying the actual mutex seems like it behaves more sensibly to me...

Well, you clearly don't acquire or release anything. It's just a hack that seems to solve the problem, but doesn't really (see below).

class SCOPED_LOCKABLE assumer
{
public:
    assumer(Mutex& m_) EXCLUSIVE_LOCK_FUNCTION(m_) { }
    ~assumer() UNLOCK_FUNCTION() { }
};

void h(bool b) {
    try {
        if (b) thrower();
        i = 7;
        assumer a(m);
        i = 8;
    } catch (...) { }
    i = 9;
}

gives me warnings for both i=7 and i=9 but allows i=8.

You're cherry-picking here. If you move i=9 into the catch block, there is still no warning. And if you remove if (b) thrower(); we wouldn't want the warning, because the assertion could be calling std::abort and then we'd have a false positive.

Assertions aren't scoped. The easiest way to see that is that the destructor doesn't do anything.

Removed bad information about exceptions, added sentence to clarify what it means for the analysis to "assume" something, tweaked description to only say asserts affect assumptions after calls instead of at or before calls, dropped "no return" sentence because it's redundant and could still imply annotation shouldn't be used with NDEBUG asserts that do return

ryanofsky marked 4 inline comments as done.Sep 17 2020, 12:09 PM

Agree with everything in D87629#2278073. "Assumed" is the key word, and a wrong assumption doesn't imply UB or generating incorrect code. Also comment is good summary of the compile-time and run-time checking tradeoffs.

clang/docs/ThreadSafetyAnalysis.rst
450

The assertion is just a promise though, so it could only happen in debug builds, for example.

This is true, but the statements seem compatible to me. I'd be happy to update it if I know what change to make.

452

That's an interesting point.

Removed language about throwing now. That was just my misunderstanding.

452–454

Maybe don't mention "throwing", given the example above and s/at the call/after the call/.

Thanks, update has both changes

453

It's implemented a bit differently:

Tweaked to say that assumption applies after the call.

456–457

"The function is assumed to return only if the given capability is held."?

Updated to drop this extra paragraph. I don't think any information is lost and it seems better to avoid implying that annotating a function which does return (NDEBUG assert) will causes the analysis to assume something not true.

Not sure the try { AssertHeld } catch (...) { a = 0; } example reveals very much: it seems like thread safety annotations aren't checked within a catch block at all?
void g() { try { thrower(); } catch (...) { i = 5; } i = 6; }
gives me warnings for ... i=6 but not i=5 ?

It's not that we don't check catch-blocks, but rather that the Clang CFG is incomplete:
There is no edge from the thrower call to B2 or B3. We can turn on "exception handling edges" from non-noexcept calls to catch blocks, but that's a bigger change and it would affect other warnings as well. There are also performance problems. (https://bugs.llvm.org/show_bug.cgi?id=43311)

Sure, it makes perfect sense that it's too hard. But as far as I can see the end result is that catch blocks are treated as if the code therein was annotated with NO_THREAD_SAFETY_ANALYSIS?

You're cherry-picking here. If you move i=9 into the catch block, there is still no warning.

Right -- that was the a=0 and i=5 cases. But there's no way to get a warning for those cases, no matter what I do, as far as I can see?

And if you remove if (b) thrower(); we wouldn't want the warning, because the assertion could be calling std::abort and then we'd have a false positive.

I'd expect: try { if (b) throw; x(); } catch(...) { } y(); and if (!b) { x(); } y();

to generate the same errors/warnings. If x() is marked ASSERT_CAPABILITY(m) and y() is marked REQUIRES(m) they don't.

I mean, I get why they don't -- exceptions are hard. But if I have to choose between getting warnings on bug free code, and not getting warnings on buggy code, I think I'd rather get warnings on the bug free code, since it's easy to make them go away just by moving the assertion claim into the outer scope, or adding a second assertion later given it compiles to a no-op.

I mean, if I'm using ASSERT_CAPABILITY in the first place, it's because I'm already getting false positives.

Assertions aren't scoped. The easiest way to see that is that the destructor doesn't do anything.

Right, and in a perfect world that would be fine; but when other parts of the analysis aren't also perfect, it seems like it leads to false negatives that let bugs through?

It looks to me like:

  • y() is reachable from the catch(...) block, but we don't know what locks might be held there, so anything goes
  • y() is reachable from the try block if nothing is thrown, and in that case the lock is asserted, so it's fine
  • because all ways of reaching y() either have the lock or are too hard to analyse, don't warn

With the dummy SCOPED_CAPABILITY that becomes:

  • y() is reachable from the catch(...) block, but we don't know what locks might be held there, so anything goes
  • y() is reachable from the try block if nothing is thrown, but in that case the capability is released at end of scope
  • because there's a way of reaching y() without holding the capability, issue a warning

I mean, I get that that's "wrong" in that it's generating the warning due to the code path that's safe rather than the code path that's unsafe; but it's "right" in that it's ensuring there's an analysable code path that doesn't hold the capabilities that the other un-analysable code paths might not have. If we're sure every un-analysable code path will have the lock, then we can put the assumption in the outer scope.

ryanofsky marked 4 inline comments as done.Sep 17 2020, 3:41 PM

AJ, maybe this discussion could moved to another issue? I find the details hard to follow, so having another issue would be helpful just to understand what changes you might be requesting. Of course if I am doing something wrong with changes in this changeset definitely let me know.

I mean, if I'm using ASSERT_CAPABILITY in the first place, it's because I'm already getting false positives.

It probably would help to have a precise definition of "false positive" but I don't think the main use of ASSERT_CAPABILITY is to eliminate mistakes in the analysis. I'd say the main use is to provide the analysis new information it doesn't have another way to know about. More specifically the purpose is to add a capability from the dynamic capability set to the static set (D87629#2272676)

Looks pretty good, thanks! I think this clears up the misunderstandings.

Since I'm not a native speaker I'd like to wait for @aaron.ballman's opinion before we merge this.

clang/docs/ThreadSafetyAnalysis.rst
128–135

Wondering if we should refer to this somehow, but maybe I'm overthinking it.

vasild accepted this revision.Sep 18 2020, 2:21 AM

LGTM

This revision is now accepted and ready to land.Sep 18 2020, 2:21 AM

Sure, it makes perfect sense that it's too hard.

It's not really too hard, there is an existing parameter somewhere in the CFG generation that controls these exception handling edges, and we'd probably have to change some other diagnostics plus improve the performance/memory usage.

But as far as I can see the end result is that catch blocks are treated as if the code therein was annotated with NO_THREAD_SAFETY_ANALYSIS?

Not quite, if you have an explicit throw statement that connects to the catch block it will be analyzed just like anything else. Otherwise we currently see the catch block as unreachable (wrongly of course). Some other diagnostics work around that, but I don't think it's worth doing it here as well. We better fix the CFG.

You're cherry-picking here. If you move i=9 into the catch block, there is still no warning.

Right -- that was the a=0 and i=5 cases. But there's no way to get a warning for those cases, no matter what I do, as far as I can see?

With an explicit throw you should be able to reach the catch block. Otherwise no.

And if you remove if (b) thrower(); we wouldn't want the warning, because the assertion could be calling std::abort and then we'd have a false positive.

I'd expect: try { if (b) throw; x(); } catch(...) { } y(); and if (!b) { x(); } y();

to generate the same errors/warnings. If x() is marked ASSERT_CAPABILITY(m) and y() is marked REQUIRES(m) they don't.

Right, if you have a throw, but that's not what I was saying. If there is no potentially throwing call before x() then y() can only be called after x().

But if I have to choose between getting warnings on bug free code, and not getting warnings on buggy code, I think I'd rather get warnings on the bug free code, since it's easy to make them go away just by moving the assertion claim into the outer scope, or adding a second assertion later given it compiles to a no-op.

Completely understand your sentiment here, but for a compiler warning that's treated as error on large code bases we have to be a bit careful. See e.g. D52395#1248608 by the original author of this feature. I've almost gotten those angry rollbacks, see e.g. D49355#1188448 or D52398#1255149.

Assertions aren't scoped. The easiest way to see that is that the destructor doesn't do anything.

Right, and in a perfect world that would be fine; but when other parts of the analysis aren't also perfect, it seems like it leads to false negatives that let bugs through?

That's the situation, there are false negatives. And we're slowly trying to close the gaps. But hacky workarounds using misleading annotations don't improve it.

I mean, I get that that's "wrong" in that it's generating the warning due to the code path that's safe rather than the code path that's unsafe;

That's the key: It accidentally does the right thing in your example, but for the wrong reasons. (It thinks we get the capability, but then release it, whereas in reality we never had it.)

Regarding your analysis: scopes and code paths are orthogonal. Scopes are only interesting for lookup and automatic destructor calls of automatic storage duration variables. So they have nothing to with the validity of an assertion.

aaronpuchert accepted this revision.Sep 26 2020, 1:09 PM

Maybe we'll have to clarify this further in the future, but for now this is an improvement.