This is an archive of the discontinued LLVM Phabricator instance.

Insert poisoned paddings between fields in C++ classes so that AddressSanitizer can find intra-object-overflow bugs
ClosedPublic

Authored by kcc on Oct 8 2014, 5:04 PM.

Details

Summary

The general approach is to add extra paddings after every field
in AST/RecordLayoutBuilder.cpp, then add code to CTORs/DTORs that poisons the paddings
(CodeGen/CGClass.cpp).

Everything is done under the flag -fsanitize-address-field-padding.
The blacklist file (-fsanitize-blacklist) allows to avoid the transformation
for given classes or source files.

See also https://code.google.com/p/address-sanitizer/wiki/IntraObjectOverflow

Diff Detail

Event Timeline

kcc updated this revision to Diff 14613.Oct 8 2014, 5:04 PM
kcc retitled this revision from to Insert poisoned paddings between fields in C++ classes..
kcc updated this object.
kcc edited the test plan for this revision. (Show Details)
kcc added reviewers: samsonov, rnk, rsmith.
kcc added a subscriber: Unknown Object (MLST).
rsmith added inline comments.Oct 8 2014, 6:09 PM
lib/AST/RecordLayoutBuilder.cpp
1343–1344

Typo. "Later in CodeGen, extra code will [...]"

1345

This does not depend on the value of FD; perhaps remove FD and just call it once per class?

1347–1348

This might be easier if you insert padding before fields rather than after them.

1350–1351

Maybe factor out adding the padding from adding the field? I don't think they need to be done together, do they?

Also, you shouldn't add padding at the end of the class if it ends in a flexible array member.

1359–1364

What's the reason behind picking these rules? It seems like we could be more aggressive here and still follow the C++ standard. Is this for C compatibility, or people who roll their own memcpy, or something else? I guess the ASan-instrumented memcpy can be taught that it's OK to copy intra-object redzone?

1361–1362

This 'simple destructor' check doesn't seem right. What properties do you intend to check for here?

1366–1376

Please fix =)

I think reusing the same -fsanitize-blacklist flag makes sense.

1378–1384

If you're going to keep this, it should be put behind a remark flag, and should use the normal diagnostics machinery.

lib/CodeGen/CGClass.cpp
705

"ctor" and "dtor" in lowercase please.

Stripping the poison in the dtor doesn't match C++ semantics: you aren't required to run the dtor to reuse the storage as something else, but conversely you can't reuse the storage as something else unless you use placement-new to construct a new object.

As a compromise to be nicer to existing code, maybe we should strip poison both in the dtor and in a new-expression?

713–714

This doesn't match the set of checks you did in MayInsertExtraPaddingAfterField; in particular, do you need to consider unions here?

718–719

Field names should be capitalized.

729

Please don't hard-code 8 here; use ASTContext::toCharUnitsFromBits instead.

747

This is IntPtrTy (a member of CodeGenFunction).

majnemer added inline comments.
lib/AST/RecordLayoutBuilder.cpp
1371–1372

This doesn't look windows friendly.

kcc updated this revision to Diff 14907.Oct 14 2014, 6:20 PM

Addressed most of the comments in the review, added tests,
moved the check for validity of the instrumentation to RecordDecl::MayInsertExtraPadding,
added a remark flag -Rsanitize-address.

PTAL

This CL still has one FIXME pending Alexey's changes for -fsanitize-blacklist flag.

lib/AST/RecordLayoutBuilder.cpp
1343–1344

done

1345

Agree. Moved to RecordDecl::MayInsertExtraPadding
In future I may need a separate function to check a particular field (e.g. if we want paddings only after arrays).
But not in this patch

1347–1348

I've reverted the loop to its previous state because I actually decided to instrument the last field as well.
This handles cases where we have an array of objects and an overflow from the last field of one objects touches the other object.

1350–1351

Maybe factor out adding the padding from adding the field? I don't think they need to be done together, do they?

Err. Not sure here. It looks like it fits here naturally because we change the FieldSize which then affect the computation if the type's size, etc

Also, you shouldn't add padding at the end of the class if it ends in a flexible array member.

Added the check for !FieldSize.isZero()

1359–1364

It's quite hard to teach asan memcpy to avoid intra object redzones w/o extra false negatives and performance penalty.
But I do want to relax the requirement of non-standard layout, just not in this CL

1361–1362

I wanted to limit to classes that have a user defined DTOR, at least in one of the members, subclasses, etc.
As I say in the comment above, we'll try to relax some of these requirements.
This and other limitations are something that I was able to debug so far, I'll be removing/relaxing these requirements one by one later.

1366–1376

Alexey will commit a separate patch to move fsanitize=blacklist to langopts.
Clearly, I will wait for that and update this CL

1378–1384

done (-Rsanitize-address)

lib/CodeGen/CGClass.cpp
705

We may be able to relax the requirement of unpoisoning in dtor, but not in this CL.
I've seen at least few places (on in LLVM itself) where we should not instrument a class w/o dtor.

striping poisin in new-expression is not a solution as it may lead to false negatives (e.g. when a new-expression is called on an illegal memory)

713–714

Agree. I've moved all this logic into RecordDecl::MayInsertExtraPadding and call it twice, once from RecordLayoutBuilder::LayoutFields and once from CodeGenFunction::EmitAsanPrologueOrEpilogue

718–719

done

729

done

747

done

samsonov added inline comments.Oct 15 2014, 1:37 PM
lib/AST/RecordLayoutBuilder.cpp
1366–1376

r219842 moves `-fsanitize-blacklist` is in LangOpts. You've probably thought about it, but I will warn you anyway: recompiling the library with the same compiler and compile flags, but different blacklist file will make it ABI-incompatible. So, we should be really careful in using it.

kcc updated this revision to Diff 14971.Oct 15 2014, 5:08 PM

Use proper blacklist machinery.
PTAL

samsonov added inline comments.Oct 15 2014, 6:08 PM
lib/AST/Decl.cpp
3642

This looks weird: II->getName() is neither mangled nor qualified (what would it return for the record declared inside the namespaces)? Consider testing smth. like printQualifiedName().

rsmith added inline comments.Oct 15 2014, 7:25 PM
include/clang/AST/Decl.h
3266

In new code, please use \brief instead of repeating the function name.

3269

Please use a leading lowercase letter for this function name for consistency with the rest of the class.

include/clang/Basic/DiagnosticFrontendKinds.td
49–51

Remarks are end-user-visible, so should be a bit more friendly than this. Also, this diagnostic should use "%select{list|of|strings}1" rather than putting user-visible strings in the code in Decl.cpp. Something like:

"-fsanitize-address-field-padding ignored for %0 because it %select{is packed|is a union|is trivially copyable|has a trivial destructor|is standard layout|is in a blacklisted file|is blacklisted}1"

... would seem reasonable. (I don't think it's worth remarking once per class if the reason we're not sanitizing is because the current language is C; in that case, we should warn once for the entire compilation.)

lib/AST/Decl.cpp
3632

Please use nullptr rather than 0.

3641

hasSimpleDestructor still does not seem right; it's a notion internal to how we manage implicit declaration and definition for destructors and shouldn't be used for anything else. Does hasTrivialDestructor work for you?

3647

Blacklisting by identifier doesn't seem ideal; there should probably be some way of specifying a namespace in the blacklist file.

kcc added a comment.Oct 16 2014, 11:28 AM

Addressed all comments except for one

include/clang/AST/Decl.h
3266

done

3269

done

include/clang/Basic/DiagnosticFrontendKinds.td
49–51

Done except for "we should warn once for the entire compilation for C"
What is the kosher way to achieve that in clang?
(I assume that a function-scope static boolean is not kosher)

lib/AST/Decl.cpp
3632

Var changed to int.

3641

Yes, this seems to work,

3647

done (using getQualifiedNameAsString())

kcc updated this revision to Diff 15031.Oct 16 2014, 11:35 AM

Addressed all comments (but one), extended the test to check the blacklist functionality and the remarks

PTAL

rsmith accepted this revision.Oct 16 2014, 1:11 PM
rsmith edited edge metadata.

LGTM

include/clang/Basic/DiagnosticFrontendKinds.td
50–56

I don't think these should be BackendInfo, because they're reflecting frontend semantics changes. They should probably be just ShowInSystemHeader.

lib/AST/Decl.cpp
3629–3630

It would seem prudent to also test whether the type is in an extern "C" linkage specification in C++ here:

if (!CXXRD || CXXRD->isExternCContext())
This revision is now accepted and ready to land.Oct 16 2014, 1:11 PM
kcc updated this revision to Diff 15041.Oct 16 2014, 1:24 PM
kcc edited edge metadata.

Addressed last two Richard's comments.
Added a test for an extern "C" class.

kcc retitled this revision from Insert poisoned paddings between fields in C++ classes. to Insert poisoned paddings between fields in C++ classes so that AddressSanitizer can find intra-object-overflow bugs.Oct 16 2014, 1:29 PM
kcc updated this object.
kcc edited the test plan for this revision. (Show Details)
kcc closed this revision.Oct 16 2014, 2:05 PM