This is an archive of the discontinued LLVM Phabricator instance.

[Sema] Prevent UB for uninitialized `IsSurrogate`
AbandonedPublic

Authored by modocache on Mar 3 2020, 10:16 AM.

Details

Summary

A closed-source C++ codebase I help maintain began hitting a Clang ICE
with a stack trace that referenced
clang::OverloadCandidate::getNumParams:
https://gist.github.com/modocache/84eac9c519796644139471dd06ef4628

Specifically, Clang was querying getNumParams, the implementation of
which checks the IsSurrogate member:

unsigned getNumParams() const {
  if (IsSurrogate) {
    QualType STy = Surrogate->getConversionType();
    // ...
  }
  // ...
}

However, the OverloadCandidate instance had not initialized that member
with a value, triggering UB. In the case of the stack trace, IsSurrogate
was evaluated as if it were true, and the access into uninitialized
Surrogate caused the crash.

Every other use of clang::OverloadCandidate initializes IsSurrogate
with a value, so I do so in this patch in order to avoid triggering UB
in the case of my closed-source codebase. I believe the potential for UB
was introduced in this commit from January 9:
https://github.com/llvm/llvm-project/commit/25195541349b1d6dfc03bf7511483110bda69b29

Alternatively, I considered modifying the clang::OverloadCandidate
constructor to initialize IsSurrogate with a false value. I feel doing
so would be safer, but I stuck with what appears to be the convention:
initializing OverloadCandidate members are the site of use.

Unfortunately I don't have a small example I can share of how
getNumParams can be called in an invalid state. The closed-source C++
code that triggers the UB looks something like this and only occurs when
using GCC's libc++:

class MyClass {
  std::pair<std::string, int> myPair;
  // ...
  void myMemberFunctionoFoo(std::pair<std::string, int> p) {}
  void myMemberFunctionBar() {
    std::bind(&MyClass::myMemberFunctionFoo, this, myPair);
  }
};

I tried reverse-engineering a test for this patch by commenting out the
if statements added in the commit that I think introduced the UB,
https://github.com/llvm/llvm-project/commit/25195541349b1d6dfc03bf7511483110bda69b29.
But in fact I found that removing these if statements entirely did not
cause any tests in check-clang to fail, so I'm concerned the code may be
untested at the moment.

Diff Detail

Event Timeline

modocache created this revision.Mar 3 2020, 10:16 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 3 2020, 10:16 AM
modocache marked an inline comment as done.Mar 3 2020, 10:20 AM
modocache added inline comments.
clang/lib/Sema/SemaOverload.cpp
7371

@rsmith I'd definitely appreciate any pointers here -- if initializing IsSurrogate with a value as I do in this patch is the kind of change that doesn't "need" an explicit test, then great. But if a test is in order, I'd be happy to write one, but I'm not certain this if statement (and the one whose body statement I modify above) is being tested. Even commenting it out entirely still has check-clang passing for me. I'd greatly appreciate advice on how to exercise this code path in a Clang regression test.

Friendly ping! OverloadCandidate has uninitialized members and so can cause UB if used incorrectly in another part of the compiler, so I feel this is a fairly straightforward patch: prevent UB by initializing all members. But I'd like some review to confirm. @rsmith? @aaron.ballman?

Sorry for the delay in review, but I believe this was already fixed in the meantime: https://github.com/llvm/llvm-project/commit/a5704f92b835d1810d83a223f70dfe6c92a34c03

modocache abandoned this revision.Mar 13 2020, 7:04 AM

Awesome, thanks!

Alternatively, I considered modifying the clang::OverloadCandidate constructor to initialize IsSurrogate with a false value. I feel doing so would be safer, but I stuck with what appears to be the convention: initializing OverloadCandidate members are the site of use.

Looks like the patch does exactly this, which is nice.