This is an archive of the discontinued LLVM Phabricator instance.

c-index: CXString: fix MSAN read-past-end bug
Needs ReviewPublic

Authored by elsteveogrande on Jan 14 2018, 1:05 PM.

Details

Summary

See discussion here:
http://clang-developers.42468.n3.nabble.com/RFC-proper-string-handling-in-CXString-td4059287.html

Bugzilla URL:
https://bugs.llvm.org/show_bug.cgi?id=35896

Previous impl would read the byte past the end of a string (a llvm::StringRef), possibly exceeding the allocation for that memory and raising an MSAN issue.

This will instead save the character range specified by the StringRef and copy it on-demand to a new C string.

Passes existing tests, and now passes with MSAN as well.

Diff Detail

Event Timeline

elsteveogrande created this revision.Jan 14 2018, 1:05 PM
elsteveogrande edited the summary of this revision. (Show Details)Jan 14 2018, 1:10 PM
elsteveogrande added reviewers: vsk, eugenis.
elsteveogrande edited the summary of this revision. (Show Details)

Change string-copy-on-demand logic; do only if not IsNullTerminated.

Add a needed null-check on input string's data ptr.

vsk added inline comments.
tools/c-index-test/c-index-test.c
3397

This looks like a separate bug fix. Is it possible to separate the main_filename changes from this patch?

tools/libclang/CXString.cpp
59–60

Why shouldn't this be defined as createRef("")?

204

Basic question: If a non-owning CXString is null-terminated, what provides the guarantee that the string is in fact valid when getCString() is called? Is the user of the C API responsible for ensuring the lifetime of the string is valid?

tools/libclang/CXString.h
108 ↗(On Diff #130225)

Generally unrelated whitespace changes should be left out of patches.

Thanks very much for looking at this @vsk ! I actually found an ASAN bug in my new code, mixing and matching malloc/free and operators new/delete.

tools/c-index-test/c-index-test.c
3397

Will do!

tools/libclang/CXString.cpp
59–60

Hmm, good question. createRef below actually calls back to createEmpty and createNull in the nonnull-but-empty and null cases.

I think I'll do this the other way around, and let createRef have the responsibility of dealing with these fields and nulls and whatnot.

204

I believe the API itself is the one building CXString instances, and the user of the C API doesn't really create them, only use them. So the API has to ensure the string stays "good" while there may be references to it.

(Which feels a little fragile. But I think that's the tradeoff being made. You'll get either "fast" strings, or data guaranteed to be sane. I'd opt for safer data but I don't know who's using this C API and am afraid to introduce a serious perf regression. So it'll stay this way and I'll try my best to solve *-SAN issues with these constraints :) )

vsk added a comment.Jan 18 2018, 11:32 AM

Thanks for working on this :).

tools/libclang/CXString.cpp
204

Sgtm, it doesn't look like this is altering the API contract.

elsteveogrande marked 2 inline comments as done.Jan 18 2018, 12:30 PM
elsteveogrande added inline comments.
tools/c-index-test/c-index-test.c
3397
vsk added a comment.Jan 18 2018, 6:29 PM

Mind rebasing this when you have a chance?

elsteveogrande marked an inline comment as done.

Fixes, but first, a question for reviewers:

Looking at the description of clang_disposeString:

/**
 * \brief Free the given string.
 */
CINDEX_LINKAGE void clang_disposeString(CXString string);

Does it seem incorrect to acquire some const char * pointer into this string, dispose the string, and then access the characters?

I've seen this happen a couple of times now. As I make changes to my code I run into this pattern. (Since now this triggers a use-after-free abort.)

I wanted to ask because, though it seemed obvious to me that this is incorrect usage, I'm now wondering if the expectation is that it's ok. Or maybe wasn't technically ok, and we just "got away with it" before. :)

Anyway, assuming it's only correct to use the string before disposing it, then the fixes this time around are:

  • Fix an ASAN use-after-free in c-index-test mentioned above. Get the CXString, pass it around, then dispose when we're done with it.
  • Change createEmpty and createNull to delegate through createRef
  • createRef tolerates nullptr correctly.
  • I previously ran into ASAN aborts due to mixing malloc/free/operator new/delete. I had changed everything to use operator new/delete. However from C-land I can't new char[length], only malloc (or strdup or whatever). Change everything to malloc/free instead.

remove unneeded changes

Fixes, but first, a question for reviewers:

Looking at the description of clang_disposeString:

/**
 * \brief Free the given string.
 */
CINDEX_LINKAGE void clang_disposeString(CXString string);

Does it seem incorrect to acquire some const char * pointer into this string, dispose the string, and then access the characters?

I'm inclined to say that this is incorrect, but skimming through the codebase it seems like we "get away" with this behavior all over the place.

I've seen this happen a couple of times now. As I make changes to my code I run into this pattern. (Since now this triggers a use-after-free abort.)

I wanted to ask because, though it seemed obvious to me that this is incorrect usage, I'm now wondering if the expectation is that it's ok.

I've CC'd some Apple folks who can say more definitively how difficult it'd be to move away from this behavior. IMO it should be doable as long as there's some way to audit the codebase (e.g with ASan).

Or maybe wasn't technically ok, and we just "got away with it" before. :)

Anyway, assuming it's only correct to use the string before disposing it, then the fixes this time around are:

  • Fix an ASAN use-after-free in c-index-test mentioned above. Get the CXString, pass it around, then dispose when we're done with it.
  • Change createEmpty and createNull to delegate through createRef
  • createRef tolerates nullptr correctly.
  • I previously ran into ASAN aborts due to mixing malloc/free/operator new/delete. I had changed everything to use operator new/delete. However from C-land I can't new char[length], only malloc (or strdup or whatever). Change everything to malloc/free instead.

Hi @vsk + @arphaman : I did find such problems with either MSAN (with poison-in-dtor) and ASAN, so we should be able to discover these instances pretty easily :)

elsteveogrande marked 3 inline comments as done.Aug 13 2018, 12:38 PM

Rebase + fix conflicts for very old diff. Works again.

ninja check-clang with MSAN-enabled build:

Before:

Failing Tests (2):
    Clang :: CodeGen/signed_metadata.cpp
    Clang :: Index/comment-to-html-xml-conversion.cpp

  Expected Passes    : 12777
  Expected Failures  : 19
  Unsupported Tests  : 291
  Unexpected Failures: 2
FAILED: tools/clang/test/CMakeFiles/check-clang
cd /data/users/steveo/llvm-ct/build/msan/clang/tools/clang/test && /bin/python2.7 /data/users/steveo/llvm-ct/build/msan/clang/./bin/llvm-lit -sv --param clang_site_config=/data/users/steveo/llvm-ct/build/msan/clang/tools/clang/test/lit.site.cfg --param USE_Z3_SOLVER=0 /data/users/steveo/llvm-ct/build/msan/clang/tools/clang/test
ninja: build stopped: subcommand failed.

After:

Failing Tests (1):
    Clang :: CodeGen/signed_metadata.cpp

  Expected Passes    : 12778
  Expected Failures  : 19
  Unsupported Tests  : 291
  Unexpected Failures: 1
FAILED: tools/clang/test/CMakeFiles/check-clang
cd /data/users/steveo/llvm-ct/build/msan/clang/tools/clang/test && /bin/python2.7 /data/users/steveo/llvm-ct/build/msan/clang/./bin/llvm-lit -sv --param clang_site_config=/data/users/steveo/llvm-ct/build/msan/clang/tools/clang/test/lit.site.cfg --param USE_Z3_SOLVER=0 /data/users/steveo/llvm-ct/build/msan/clang/tools/clang/test
ninja: build stopped: subcommand failed.

hi @vsk + @arphaman , it's been a while -- I brought this old diff up to date, wondering if you could take a look again. Thanks!

+ Jan and Volodymyr. This seemed to be in good shape the last time I looked at it. Not having touched libclang for a while I don't think I can give an official lgtm.

Is anyone working on this ? This is the only persistent msan failure when doing check-clang on my machine.