This is an archive of the discontinued LLVM Phabricator instance.

Insert poisoned paddings between fields in C++ classes so that AddressSanitizer can find intra-object-overflow bugs
ClosedPublic

Authored by kcc on Oct 8 2014, 5:04 PM.

Details

Summary

The general approach is to add extra paddings after every field
in AST/RecordLayoutBuilder.cpp, then add code to CTORs/DTORs that poisons the paddings
(CodeGen/CGClass.cpp).

Everything is done under the flag -fsanitize-address-field-padding.
The blacklist file (-fsanitize-blacklist) allows to avoid the transformation
for given classes or source files.

See also https://code.google.com/p/address-sanitizer/wiki/IntraObjectOverflow

Diff Detail

Event Timeline

kcc updated this revision to Diff 14613.Oct 8 2014, 5:04 PM
kcc retitled this revision from to Insert poisoned paddings between fields in C++ classes..
kcc updated this object.
kcc edited the test plan for this revision. (Show Details)
kcc added reviewers: samsonov, rnk, rsmith.
kcc added a subscriber: Unknown Object (MLST).
rsmith added inline comments.Oct 8 2014, 6:09 PM
lib/AST/RecordLayoutBuilder.cpp
1346–1347

Typo. "Later in CodeGen, extra code will [...]"

1348

This does not depend on the value of FD; perhaps remove FD and just call it once per class?

1362–1367

What's the reason behind picking these rules? It seems like we could be more aggressive here and still follow the C++ standard. Is this for C compatibility, or people who roll their own memcpy, or something else? I guess the ASan-instrumented memcpy can be taught that it's OK to copy intra-object redzone?

1364–1365

This 'simple destructor' check doesn't seem right. What properties do you intend to check for here?

1369–1379

Please fix =)

I think reusing the same -fsanitize-blacklist flag makes sense.

1381–1387

If you're going to keep this, it should be put behind a remark flag, and should use the normal diagnostics machinery.

1394–1395

This might be easier if you insert padding before fields rather than after them.

1397–1398

Maybe factor out adding the padding from adding the field? I don't think they need to be done together, do they?

Also, you shouldn't add padding at the end of the class if it ends in a flexible array member.

lib/CodeGen/CGClass.cpp
695

"ctor" and "dtor" in lowercase please.

Stripping the poison in the dtor doesn't match C++ semantics: you aren't required to run the dtor to reuse the storage as something else, but conversely you can't reuse the storage as something else unless you use placement-new to construct a new object.

As a compromise to be nicer to existing code, maybe we should strip poison both in the dtor and in a new-expression?

703–704

This doesn't match the set of checks you did in MayInsertExtraPaddingAfterField; in particular, do you need to consider unions here?

708–709

Field names should be capitalized.

719

Please don't hard-code 8 here; use ASTContext::toCharUnitsFromBits instead.

737

This is IntPtrTy (a member of CodeGenFunction).

majnemer added inline comments.
lib/AST/RecordLayoutBuilder.cpp
1374–1375

This doesn't look windows friendly.

kcc updated this revision to Diff 14907.Oct 14 2014, 6:20 PM

Addressed most of the comments in the review, added tests,
moved the check for validity of the instrumentation to RecordDecl::MayInsertExtraPadding,
added a remark flag -Rsanitize-address.

PTAL

This CL still has one FIXME pending Alexey's changes for -fsanitize-blacklist flag.

lib/AST/RecordLayoutBuilder.cpp
1346–1347

done

1348

Agree. Moved to RecordDecl::MayInsertExtraPadding
In future I may need a separate function to check a particular field (e.g. if we want paddings only after arrays).
But not in this patch

1362–1367

It's quite hard to teach asan memcpy to avoid intra object redzones w/o extra false negatives and performance penalty.
But I do want to relax the requirement of non-standard layout, just not in this CL

1364–1365

I wanted to limit to classes that have a user defined DTOR, at least in one of the members, subclasses, etc.
As I say in the comment above, we'll try to relax some of these requirements.
This and other limitations are something that I was able to debug so far, I'll be removing/relaxing these requirements one by one later.

1369–1379

Alexey will commit a separate patch to move fsanitize=blacklist to langopts.
Clearly, I will wait for that and update this CL

1381–1387

done (-Rsanitize-address)

1394–1395

I've reverted the loop to its previous state because I actually decided to instrument the last field as well.
This handles cases where we have an array of objects and an overflow from the last field of one objects touches the other object.

1397–1398

Maybe factor out adding the padding from adding the field? I don't think they need to be done together, do they?

Err. Not sure here. It looks like it fits here naturally because we change the FieldSize which then affect the computation if the type's size, etc

Also, you shouldn't add padding at the end of the class if it ends in a flexible array member.

Added the check for !FieldSize.isZero()

lib/CodeGen/CGClass.cpp
695

We may be able to relax the requirement of unpoisoning in dtor, but not in this CL.
I've seen at least few places (on in LLVM itself) where we should not instrument a class w/o dtor.

striping poisin in new-expression is not a solution as it may lead to false negatives (e.g. when a new-expression is called on an illegal memory)

703–704

Agree. I've moved all this logic into RecordDecl::MayInsertExtraPadding and call it twice, once from RecordLayoutBuilder::LayoutFields and once from CodeGenFunction::EmitAsanPrologueOrEpilogue

708–709

done

719

done

737

done

samsonov added inline comments.Oct 15 2014, 1:37 PM
lib/AST/RecordLayoutBuilder.cpp
1369–1379

r219842 moves `-fsanitize-blacklist` is in LangOpts. You've probably thought about it, but I will warn you anyway: recompiling the library with the same compiler and compile flags, but different blacklist file will make it ABI-incompatible. So, we should be really careful in using it.

kcc updated this revision to Diff 14971.Oct 15 2014, 5:08 PM

Use proper blacklist machinery.
PTAL

samsonov added inline comments.Oct 15 2014, 6:08 PM
lib/AST/Decl.cpp
3642 ↗(On Diff #14971)

This looks weird: II->getName() is neither mangled nor qualified (what would it return for the record declared inside the namespaces)? Consider testing smth. like printQualifiedName().

rsmith added inline comments.Oct 15 2014, 7:25 PM
include/clang/AST/Decl.h
3266 ↗(On Diff #14907)

In new code, please use \brief instead of repeating the function name.

3269 ↗(On Diff #14907)

Please use a leading lowercase letter for this function name for consistency with the rest of the class.

include/clang/Basic/DiagnosticFrontendKinds.td
49–51 ↗(On Diff #14907)

Remarks are end-user-visible, so should be a bit more friendly than this. Also, this diagnostic should use "%select{list|of|strings}1" rather than putting user-visible strings in the code in Decl.cpp. Something like:

"-fsanitize-address-field-padding ignored for %0 because it %select{is packed|is a union|is trivially copyable|has a trivial destructor|is standard layout|is in a blacklisted file|is blacklisted}1"

... would seem reasonable. (I don't think it's worth remarking once per class if the reason we're not sanitizing is because the current language is C; in that case, we should warn once for the entire compilation.)

lib/AST/Decl.cpp
3633 ↗(On Diff #14907)

Please use nullptr rather than 0.

3642 ↗(On Diff #14907)

hasSimpleDestructor still does not seem right; it's a notion internal to how we manage implicit declaration and definition for destructors and shouldn't be used for anything else. Does hasTrivialDestructor work for you?

3648 ↗(On Diff #14907)

Blacklisting by identifier doesn't seem ideal; there should probably be some way of specifying a namespace in the blacklist file.

kcc added a comment.Oct 16 2014, 11:28 AM

Addressed all comments except for one

include/clang/AST/Decl.h
3266 ↗(On Diff #14907)

done

3269 ↗(On Diff #14907)

done

include/clang/Basic/DiagnosticFrontendKinds.td
49–51 ↗(On Diff #14907)

Done except for "we should warn once for the entire compilation for C"
What is the kosher way to achieve that in clang?
(I assume that a function-scope static boolean is not kosher)

lib/AST/Decl.cpp
3633 ↗(On Diff #14907)

Var changed to int.

3642 ↗(On Diff #14907)

Yes, this seems to work,

3648 ↗(On Diff #14907)

done (using getQualifiedNameAsString())

kcc updated this revision to Diff 15031.Oct 16 2014, 11:35 AM

Addressed all comments (but one), extended the test to check the blacklist functionality and the remarks

PTAL

rsmith accepted this revision.Oct 16 2014, 1:11 PM
rsmith edited edge metadata.

LGTM

include/clang/Basic/DiagnosticFrontendKinds.td
50–56 ↗(On Diff #15031)

I don't think these should be BackendInfo, because they're reflecting frontend semantics changes. They should probably be just ShowInSystemHeader.

lib/AST/Decl.cpp
3629–3630 ↗(On Diff #15031)

It would seem prudent to also test whether the type is in an extern "C" linkage specification in C++ here:

if (!CXXRD || CXXRD->isExternCContext())
This revision is now accepted and ready to land.Oct 16 2014, 1:11 PM
kcc updated this revision to Diff 15041.Oct 16 2014, 1:24 PM
kcc edited edge metadata.

Addressed last two Richard's comments.
Added a test for an extern "C" class.

kcc retitled this revision from Insert poisoned paddings between fields in C++ classes. to Insert poisoned paddings between fields in C++ classes so that AddressSanitizer can find intra-object-overflow bugs.Oct 16 2014, 1:29 PM
kcc updated this object.
kcc edited the test plan for this revision. (Show Details)
kcc closed this revision.Oct 16 2014, 2:05 PM