This is an archive of the discontinued LLVM Phabricator instance.

[dfsan] Support origin tracking
Abandoned · Public

Authored by stephan.yichao.zhao on Feb 1 2021, 4:42 PM.

Details

Reviewers
morehouse
Summary
After DFSan reports taint sinks, the next questions are "How did they
get it?", "When did that happen?", "Who has tainted data originally?",
etc. This change addresses these questions by adding origin tracking.

This change will be split into small diffs for incremental review.

////////////
The Design
////////////

Inspired by MSan's origin tracking.

1) The new flag -dfsan-track-origins is added. It works only in 16-bit
mode.

2) Every 4 contiguous user bytes share one 4-byte origin, aligned to 4
bytes: the user byte at addr uses the origin at
(addr & ~3UL) + origin_start_addr.

3) A 4-byte origin is a hash of an origin chain. An origin chain is a
pair of a stack hash id and the hash of its previous origin chain. 0 means
no previous origin chain exists. The length of a chain is limited to 16.
With origin_history_size = 0, the limit is removed.

4) New chains are created only at store and memory transfer operations,
when tainted data is written. This is to reduce chain lengths.

5) At each instruction with more than one operand, only one origin chain
is propagated. This is to reduce chain widths.

6) Each customized function has two wrappers. The
first one is for the normal shadow propagation. The second one is used
when origin tracking is on. It calls the first one, and does additional
origin propagation. Which one to use can be decided at instrumentation
time. This is to ensure minimal additional overhead when origin tracking
is off.

7) Provide an API, dfsan_print_origin_trace, that prints the stack traces
along an origin chain.
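
The address mapping in point 2 can be sketched as below. This is only an illustration: kOriginStartAddr is a hypothetical placeholder for origin_start_addr (the summary does not give its value), and OriginAddr is a name made up for this sketch. The parentheses matter because in C and C++ `+` binds tighter than `&`.

```cpp
#include <cstdint>

// Hypothetical base of the origin region. The real value depends on the
// DFSan memory layout, which the summary does not specify.
constexpr std::uint64_t kOriginStartAddr = 0x200000000000ULL;

// Maps a user address to the address of its 4-byte origin slot:
// clear the low two bits, then add the base of the origin region.
constexpr std::uint64_t OriginAddr(std::uint64_t addr) {
  return (addr & ~std::uint64_t{3}) + kOriginStartAddr;
}
```

Under this mapping the user bytes at 0x1000 through 0x1003 share one origin slot, and the byte at 0x1004 uses the next slot, 4 bytes later.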
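
The chain structure in points 3 and 4 can be sketched as a linked history with a length cap. This is a simplified model, not the code in this diff: it uses sequential ids in place of hashes, and ChainNode, OriginChains, Extend, Depth and Trace are hypothetical names. Only the "0 means no previous chain" convention and the meaning of a history size of 0 come from the summary.

```cpp
#include <cstdint>
#include <vector>

// One link of an origin chain: a stack id plus the id of the previous link.
// A previous id of 0 means no previous chain exists.
struct ChainNode {
  std::uint32_t stack_id;
  std::uint32_t prev_id;
};

class OriginChains {
 public:
  // history_size caps the chain length; 0 removes the cap.
  explicit OriginChains(int history_size) : history_size_(history_size) {}

  // Models a store or memory transfer that writes tainted data. Returns the
  // id of the new link, or prev_id unchanged once the chain is at the cap.
  std::uint32_t Extend(std::uint32_t prev_id, std::uint32_t stack_id) {
    if (history_size_ != 0 && Depth(prev_id) >= history_size_) return prev_id;
    nodes_.push_back({stack_id, prev_id});
    return static_cast<std::uint32_t>(nodes_.size());  // ids start at 1
  }

  // Number of links reachable from id.
  int Depth(std::uint32_t id) const {
    int depth = 0;
    for (; id != 0; id = nodes_[id - 1].prev_id) ++depth;
    return depth;
  }

  // Stack ids from the most recent write back to the original taint source.
  std::vector<std::uint32_t> Trace(std::uint32_t id) const {
    std::vector<std::uint32_t> stacks;
    for (; id != 0; id = nodes_[id - 1].prev_id)
      stacks.push_back(nodes_[id - 1].stack_id);
    return stacks;
  }

 private:
  int history_size_;
  std::vector<ChainNode> nodes_;
};
```

Walking the links as Trace does is the kind of traversal an API like dfsan_print_origin_trace would need in order to report one stack trace per link.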

Event Timeline

stephan.yichao.zhao requested review of this revision. Feb 1 2021, 4:42 PM
Herald added projects: Restricted Project, Restricted Project. Feb 1 2021, 4:42 PM
Herald added subscribers: llvm-commits, Restricted Project.
stephan.yichao.zhao planned changes to this revision. Feb 1 2021, 4:42 PM
This change will be split into small diffs for incremental review.
stephan.yichao.zhao edited the summary of this revision. Feb 1 2021, 4:42 PM

Thanks for a full diff. I'll be referring to it as I review the incremental changes.

I haven't looked at the code much yet, but the overall design SGTM.

Many of the tests added here are failing on the AArch64 buildbots (e.g. http://lab.llvm.org:8011/#/builders/7/builds/1974). Is this expected to work on AArch64, or should these tests be disabled for those bots?

stephan.yichao.zhao added a comment. Edited · Mar 11 2021, 9:48 AM

> Many of the tests added here are failing on the AArch64 buildbots (e.g. http://lab.llvm.org:8011/#/builders/7/builds/1974). Is this expected to work on AArch64, or should these tests be disabled for those bots?

This change supports only x86_64 on Linux. Testing on other architectures was disabled by https://github.com/llvm/llvm-project/commit/37520a0b2b2af025e40b17dbf99013cda9eb66a1

ormris removed a subscriber: ormris. Jun 3 2021, 10:28 AM

All changes in this CL were submitted in split CLs.