This is an archive of the discontinued LLVM Phabricator instance.

[dfsan] Add Origin ABI Wrappers
ClosedPublic

Authored by stephan.yichao.zhao on Mar 22 2021, 2:15 PM.

Details

Reviewers
gbalats
Summary

Supported strrchr, strrstr, strto*, recvmmsg, recrmsg, nanosleep,
memchr, snprintf, socketpair, sprintf, getocketname, getsocketopt,
gettimeofday, getpeername.

strcpy was added because the test of sprintf need it. It will be
committed by D98966. Please ignore it when reviewing.

This is a part of https://reviews.llvm.org/D95835.

Diff Detail

Event Timeline

stephan.yichao.zhao requested review of this revision.Mar 22 2021, 2:15 PM
stephan.yichao.zhao created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptMar 22 2021, 2:15 PM
Herald added a subscriber: Restricted Project. · View Herald Transcript
gbalats added inline comments.Mar 22 2021, 4:25 PM
compiler-rt/lib/dfsan/dfsan_custom.cpp
1063

I assume the invariant here that differentiates this from strtol is that tmp_endptr is never null, right?
I suggest we add an assertion to make this explicit (here and in the similar functions that follow).

1069

The "xl" is a little confusing. I suggest renaming to dfsan_strtolong_label. Same for dfsan_strtolong_origin.

1096

Why is base_origin preferred over the origin of nptr? Should this be the case if both are tainted?

1100

Should this be ret_origin instead, perhaps?

1136

lable -> label

1674

Why is this not calling __dfsw_strrchr like the above cases do? This way, it's duplicating it's code.

1712

Same here.

compiler-rt/test/dfsan/custom.cpp
1349–1355

The if and else block here are the same.

1382–1388

Same here.

1517–1520

Is this required?

stephan.yichao.zhao edited the summary of this revision. (Show Details)Mar 22 2021, 5:44 PM
stephan.yichao.zhao marked 10 inline comments as done.
stephan.yichao.zhao retitled this revision from [dfsan] Add Origin ABI Wrappers to [dfsan] Add Origin ABI Wrappers.

update

compiler-rt/lib/dfsan/dfsan_custom.cpp
1063

Added assert(tmp_endptr) before each strto* call.

1096

When multiple sources are tainted, we pick one of its origins to propagate.
Because checking base_label does not need additional computation like dfsan_read_origin_of_first_taint, we use it rather than from data pointed by nptr.
Added comments.

1100

Thank you for catching this.
Actually we do not need to assign ret_origin when *ret_label=0 since we only need to track taint labels.
So I removed this branch.

1674

I think that was trying to reduce the number of times we call strlen since calculating origin needs strlen again.
But this happens only when strict_data_dependencies = false.
The flag is true by default, it may not be worth to 'optimize' this case by duplicating code.

Changed.

compiler-rt/test/dfsan/custom.cpp
1349–1355

removed one branch. Thank you.

gbalats accepted this revision.Mar 23 2021, 5:49 PM
This revision is now accepted and ready to land.Mar 23 2021, 5:49 PM