This is an archive of the discontinued LLVM Phabricator instance.

Add user-defined callback on write() calls.
ClosedPublic

Authored by skerner on Apr 2 2014, 1:30 PM.

Details

Summary

Add dfsan_set_write_callback(), which sets a callback to be invoked when a write() call is invoked within DFSan instrumented code.

Diff Detail

Event Timeline

skerner updated this revision to Unknown Object (????).Apr 2 2014, 1:40 PM

Nits.

Peter,

Ready for review.  I could not decide if it was worth using an atomic swap to install the callback pointer in lib/dfsan/dfsan.cc .  I put in a TODO, but I am not sure it is worth trying to support multiple simultaneous calls to that method.  Anyone who does such a thing is almost certainly not using the function in a reasonable way.

Sam

pcc added a subscriber: Unknown Object (MLST).Apr 2 2014, 4:20 PM
pcc added inline comments.
lib/dfsan/dfsan.cc
261 ↗(On Diff #8315)

I don't think this approach is correct. We can't store a pointer to the callback function (which uses the instrumented ABI) and call it from the dfsan runtime library using the uninstrumented ABI, as the labels received by the callback function will be wrong (nor is this guaranteed to work at all).

We have a trampoline mechanism for safely calling instrumented ABI functions from the dfsan runtime library. Please take a look at how the custom functions pthread_create and dl_iterate_phdr are implemented.

When you have done this, you should be able to improve your test to show that the values passed to the write callback have the same labels as those passed to the write function.

lib/dfsan/dfsan_custom.cc
808

The signature of this function is incorrect. It should accept labels for each of the parameters and a pointer to the return label.

810

It would probably be simpler to call the callback unconditionally and let it check buf for labels. This would also allow the client to observe unlabelled writes if it needs to for whatever reason.

816

You could exit early here.

test/dfsan/write_callback.c
2

Maybe we could check that these files have the contents we expect?

skerner updated this revision to Unknown Object (????).Apr 14 2014, 11:21 AM

Comments addressed.

It is not obvious to me what the label on the return value of write() would be. Please explicitly look at what I did (dfsan_custom.c, line 849, in __dfsw_write() ) and let me know if you agree with it.

lib/dfsan/dfsan.cc
261 ↗(On Diff #8315)

Changed the code to use the trampoline mechanism. Improved tests.

lib/dfsan/dfsan_custom.cc
808

Done

810

Good point. Done.

816

Code removed.

test/dfsan/write_callback.c
2

It would be reasonable to assume that calls to write() do not fail in
tests. But write() may only write some of the bytes it is passed.
For example:

char buf[] = "123456789";
int bytes_written = write(fd, buf, 10);

If (for whatever reason) only 5 bytes are written, write may return 5.
I doubt it would do so on a local file system, but I don't want to
assume something that is not in the spec.

I considered adding retries to writes:

char buf[] = "123456789";

char* cur = buf;
int bytes_left = strlen(buf);
while ( bytes_left > 0 ) {

int result = write(fd, cur, bytes_left);

if (result < 0) ERROR HANDLER

bytes_left -= result;

}

... but now the number of callbacks tests must expect changes based on
how many times write() ends up being called.

skerner added inline comments.Apr 14 2014, 11:59 AM
test/dfsan/write_callback.c
2

Please ignore the above comment. I wrote it before realizing that its premise is flawed, and wrote the rest in a way that does check that the file is written.

pcc added inline comments.Apr 16 2014, 4:24 PM
lib/dfsan/dfsan_custom.cc
821

This isn't the correct type for write_callback -- I would use void* here.

832

This isn't necessarily always going to be the function pointer that was originally passed to dfsan_set_write_callback -- in theory, if the callee is known, we could generate the trampoline such that it calls the callee directly and the second argument is null. It would probably be simplest not to return anything from this function.

854

We haven't traditionally labelled the return value of system call functions according to their inputs, even with strict data dependencies turned on. The general justification for this is that system calls involve so much process-external state that we shouldn't even attempt to model their data flow beyond simply zero labeling their outputs.

I'm also not sure if this labelling is correct (for example, it reads the buffer's labels, even though the success or failure of a write is typically not dependent on the data being written).

test/dfsan/write_callback.c
83

I was imagining that the test could use something like FileCheck to check the program's output, rather than having the program check its own output. This would reduce the complexity of the test program and permit external validation of the program's behavior -- the wrappers could conceivably be lying to the program about the file system state.

skerner updated this revision to Unknown Object (????).Apr 21 2014, 11:10 AM

Addressed review comments.

lib/dfsan/dfsan_custom.cc
821

Done.

832

Changed to not return the previous callback.

854

The general justification for this is that system calls involve so much
process-external state that we shouldn't even attempt to model their
data flow beyond simply zero labeling their outputs.

Makes sense. Removed.

test/dfsan/write_callback.c
83

Good points. Changed the code to use FileCheck.

pcc accepted this revision.Apr 24 2014, 10:11 AM
pcc edited edge metadata.

LGTM

This revision is now accepted and ready to land.Apr 24 2014, 10:11 AM
pcc closed this revision.Apr 24 2014, 11:42 AM

r207131