This is an archive of the discontinued LLVM Phabricator instance.

Add a new interceptors for cdbr(3) and cdbw(3) API from NetBSD
ClosedPublic

Authored by krytarowski on Dec 1 2018, 12:40 PM.

Details

Summary

cdb - formats of the constant database.

cdbr, cdbr_open, cdbr_open_mem, cdbr_entries, cdbr_get, cdbr_find,
cdbr_close - constant database access methods.

cdbw_open, cdbw_put, cdbw_put_data, cdbw_put_key, cdbw_stable_seeder,
cdbw_output, cdbw_close - creates constant databases.

Add a dedicated test for this API.

Diff Detail

Repository
rL LLVM

Event Timeline

krytarowski created this revision.Dec 1 2018, 12:40 PM
vitalybuka added inline comments.Dec 4 2018, 12:27 AM
lib/sanitizer_common/sanitizer_common_interceptors.inc
7421 ↗(On Diff #176266)

could you please extract this with comment and similar for write into functions?

7442 ↗(On Diff #176266)

for WRITE maybe we can assume that system sets *base correctly.
however for READ, we don't know if values are corrupted, so it would be nice to be check them for shaddow, or just check that they point into the mmap_base

7456 ↗(On Diff #176266)

function?

7515 ↗(On Diff #176266)

Please extract functions here for structs as well.

vitalybuka requested changes to this revision.Dec 7 2018, 3:41 PM

Assume WIP

This revision now requires changes to proceed.Dec 7 2018, 3:41 PM
joerg added a comment.Dec 9 2018, 11:31 AM

I don't see why this interceptor needs to know about the internals of struct cdbr or struct cdbw at all. They are fully opaque. All memory accessed by users is explicitly sized as argument to the functions or returned with the size.

I was thinking about the same, whether there is point to sanitize internals of an opaque structure. I will simplify this.

krytarowski marked 8 inline comments as done.Dec 10 2018, 12:02 PM
krytarowski added inline comments.
lib/sanitizer_common/sanitizer_common_interceptors.inc
7421 ↗(On Diff #176266)

it has been simplified, we no longer sanitize internals of opaque cdbr/cdbw struct.

7442 ↗(On Diff #176266)

it has been simplified, we no longer sanitize internals of opaque cdbr/cdbw struct.

7456 ↗(On Diff #176266)

it has been simplified, we no longer sanitize internals of opaque cdbr/cdbw struct.

7515 ↗(On Diff #176266)

it has been simplified, we no longer sanitize internals of opaque cdbr/cdbw struct.

I don't see why this interceptor needs to know about the internals of struct cdbr or struct cdbw at all. They are fully opaque. All memory accessed by users is explicitly sized as argument to the functions or returned with the size.

Then we give up on TSAN check of fieds, or msan check if caller will try to read particular fields

lib/sanitizer_common/sanitizer_common_interceptors.inc
8520 ↗(On Diff #177581)

Why is this enough?
Are those pointers always point inside of sanitizer_cdbr?
e.g.
sanitizer_cdbr::hash_base

If not than we are going to hide those accesses from TSAN and miss races.
Or if callers will read by those pointers, MSAN may falsely detect uninitialized values.

lib/sanitizer_common/sanitizer_platform_limits_netbsd.h
2259 ↗(On Diff #177581)

if it is not sanitizing the fields, should the struct be replaced with __sanitizer_cdbr_sz?

krytarowski marked 4 inline comments as done.Dec 10 2018, 2:41 PM

I don't see why this interceptor needs to know about the internals of struct cdbr or struct cdbw at all. They are fully opaque. All memory accessed by users is explicitly sized as argument to the functions or returned with the size.

Then we give up on TSAN check of fieds, or msan check if caller will try to read particular fields

These structs contain a pointer to a buffer and length.

Please have a look as the actual implementation.

https://nxr.netbsd.org/xref/src/common/lib/libc/cdb/cdbr.c#89

https://nxr.netbsd.org/xref/src/lib/libc/cdb/cdbw.c#71

There is another difficulty, that hash is a SLIST container (key_hash_head, hash) and I was unsure whether to sanitize it or skip.

Please help to make this clearer.

We certainly don't want to make this interceptor suboptimal.

joerg added a comment.Dec 10 2018, 2:49 PM

I don't see why this interceptor needs to know about the internals of struct cdbr or struct cdbw at all. They are fully opaque. All memory accessed by users is explicitly sized as argument to the functions or returned with the size.

Then we give up on TSAN check of fieds, or msan check if caller will try to read particular fields

As I said, the structures are fully opaque. The only defined interface is through the functions already wrapped. For the reader interface, TSAN doesn't make sense as idempotent beyond creation and destruction. For the writer, the only useful check for TSAN would be to rule out two threads using the same instance. That shouldn't require any access to the internals though.

krytarowski marked an inline comment as done.Dec 11 2018, 1:22 AM
krytarowski added inline comments.
lib/sanitizer_common/sanitizer_platform_limits_netbsd.h
2259 ↗(On Diff #177581)

Actually its size is unknown to userland programs. I need to either hardcode its size or make a shadow variation.

  • add WRITE operations in cdbw(3) interceptors
  • use FD macros in cdbw_output()

OK, I think that this is now fine. cdbr(3) contains only read checks forsanitizer_cdbr . cdbw(3) contains read and write ones for sanitizer_cdbw.

vitalybuka accepted this revision.Dec 13 2018, 2:06 AM

LGTM

lib/sanitizer_common/sanitizer_common_interceptors.inc
8525 ↗(On Diff #177674)

could you please restructure piece after the REAL call similar to the function above?

This revision is now accepted and ready to land.Dec 13 2018, 2:06 AM
krytarowski marked an inline comment as done.Dec 13 2018, 2:21 AM
This revision was automatically updated to reflect the committed changes.