This is an archive of the discontinued LLVM Phabricator instance.

[sancov] __sanitizer_dump_coverage api
ClosedPublic

Authored by aizatsky on Nov 16 2016, 11:42 AM.

Diff Detail

Repository
rL LLVM

Event Timeline

aizatsky updated this revision to Diff 78233.Nov 16 2016, 11:42 AM
aizatsky retitled this revision from to [sancov] __sanitizer_dump_coverage api.
aizatsky updated this object.
kcc added a subscriber: kcc.Nov 17 2016, 8:54 AM
kcc added inline comments.
lib/sanitizer_common/sanitizer_coverage_libcdep_new.cc
67 ↗(On Diff #78238)

I don't think this code belongs here.

As I've mentioned before, we need to implement this using a public interface (which needs to be implemented).

We already have sanitizer_symbolize_pc which can give you the module and the offset,
but in an inefficient way (via converting to and from strings, etc).
Just need to implement a special interface, e.g.
int __sanitizer_get_module_and_offset_for_pc(uptr pc, char module_path[], uptr module_path_len, uptr *offset)

Symbolizer::FindModuleForAddress does what you need (or almost).
If you need more sorting/bsearching it belongs there or in LoadedModule::containsAddress

123 ↗(On Diff #78238)

avoid comment like this, if possible.

kcc added a comment.Nov 17 2016, 9:07 AM

I frankly don't remember your arguments and so I don't remember why (and if) I agreed.
Can you repeat the arguments?

kcc added a comment.Nov 17 2016, 9:20 AM

[please reply in phab, it doesn't properly line up the replies done in e-mail :( ]

I already need this API in libFuzzer.
The API could be simple, as described above (sanitizer_get_module_and_offset_for_pc())

then let's design and implement it in a separate CL.

Yes. And test.

you wanted trace-pc-guard deployed as soon as possible.

Yes, but that not a good reason to duplicate the code, and the patch partially duplicates the existing code in sanitizer_common.

To do what you are asking me to do would require changing the way LoadedModules stores the information.

Probably yes.

kcc added a comment.Nov 17 2016, 9:38 AM

Also, I suggest to implement the API w/o changing the internals, test it and submit.
Then see if it needs performance improvements.

aizatsky updated this revision to Diff 80832.Dec 8 2016, 2:58 PM

next iteration

aizatsky updated this revision to Diff 80833.Dec 8 2016, 3:09 PM

removed binary search.

Kostya,

Please take another look. This is now fully functional with test passing.

I can't use __sanitizer_get_module_and_offset_for_pc because it returns char* module_name, and organizing lookup by name is problematic.

kcc added a comment.Dec 8 2016, 3:24 PM

please add a test with at least one (better, two) DSOs

lib/sanitizer_common/sanitizer_coverage_libcdep_new.cc
9 ↗(On Diff #80832)

Coverage dumper used with -fsanitize-coverage=tarce-pc-guards
(or some such)

55 ↗(On Diff #80832)

I am confused.
I thought you were going to use __sanitizer_get_module_and_offset_for_pc here.

125 ↗(On Diff #80832)

Hm... Why?
The function should be fully executed once for every DSO
(called once per every comp unit)

127 ↗(On Diff #80832)

use CHECK_NE

140 ↗(On Diff #80832)

technically speaking, this does not prevent us from a race of another sort:
we may have two threads running the same code and so they will increment pc_array_index twice for the same PC
and so you will get the same PC twice.

BTW, if something needs to be atomic, that's the *guard

I had another scheme in mind (and in the docs):

  • allocate the array of PCs according to the amount of PCs deducted from __sanitizer_cov_trace_pc_guard_init.
  • set guards to consecutive numbers starting from 1
  • here do this uptr Idx = atomic_exchange(*guard, 0); if (Idx) pc_array[Idx] = pc;
test/sanitizer_common/TestCases/sanitizer_coverage_trace_pc_guard.cc
6 ↗(On Diff #80832)

shouldn't you run this test in a separate scratch dir?
This way it may race with another test that creates sancov files.

10 ↗(On Diff #80832)

Mmm. really? I thought I've fixed it.
I guess you need to fix it before submitting this test.

kcc added a comment.Dec 8 2016, 3:26 PM

Kostya,

Please take another look. This is now fully functional with test passing.

I can't use __sanitizer_get_module_and_offset_for_pc because it returns char* module_name, and organizing lookup by name is problematic.

You should not need to do any lookup by name.... See my other comments.

aizatsky added inline comments.Dec 8 2016, 3:48 PM
lib/sanitizer_common/sanitizer_coverage_libcdep_new.cc
9 ↗(On Diff #80832)

I'll have to rename the file then. I told me this will be the basis for all sanitizer coverage implementations in the future. Is this sanitizer-coverage_libcdep_tracepcguard.cc?

55 ↗(On Diff #80832)

That's why I didn't want to create it in the first place. It is useless for me right now.

I suggest we change it to return module base address, not the 'pc - base'. This way I will be able to identify module by base address.

125 ↗(On Diff #80832)

I have a check below in __sanitizer function. I'd prefer this detail to be handled over there.

140 ↗(On Diff #80832)

this would create sparse pc_array. Not sure if we want this. Besides the code now is dead-simple. I like it this way.

You are right that sometimes we can get same PC twice, but I don't think we really care about this race. What's the problem of having some PC listed twice in .sancov file? Looks harmless.

67 ↗(On Diff #78238)

see above suggestion how we can modify public API to be useful here.

aizatsky updated this revision to Diff 80847.Dec 8 2016, 4:52 PM

fixing test, sorting pcs, using api only.

aizatsky marked 2 inline comments as done.Dec 8 2016, 4:54 PM

PTAL. Sancov test works, using only api functions now.

test/sanitizer_common/TestCases/sanitizer_coverage_trace_pc_guard.cc
10 ↗(On Diff #80832)

Ahh. Missing -1 :)) Works now.

kcc added inline comments.Dec 9 2016, 1:00 PM
lib/sanitizer_common/sanitizer_coverage_libcdep_new.cc
11 ↗(On Diff #80847)

remove comment

64 ↗(On Diff #80847)

allow a special case of a zero pc (just ignore it)

84 ↗(On Diff #80847)

is this one syscall per offset?
This actually may become a perf problem.
We need a single WriteToFile per file.

96 ↗(On Diff #80847)

did you see my previous comment about DSOs?

9 ↗(On Diff #80832)

file name is fine.

test/sanitizer_common/TestCases/sanitizer_coverage_trace_pc_guard.cc
23 ↗(On Diff #80847)

I thought I added a commend asking to extend the test to have DSOs, but I can't find it now.

kcc added inline comments.Dec 9 2016, 1:23 PM
lib/sanitizer_common/sanitizer_coverage_libcdep_new.cc
140 ↗(On Diff #80832)

what's wrong with sparse array?
Its size will be limited by the total number of PCs, i.e. known at the module load time.
And it will make the code even simpler and more correct.

aizatsky updated this revision to Diff 80956.Dec 9 2016, 2:06 PM
aizatsky marked 6 inline comments as done.

multi-dso test; writing in a single call.

Added a dso test, writing in a single call.

kcc added inline comments.Dec 9 2016, 2:28 PM
lib/sanitizer_common/sanitizer_coverage_libcdep_new.cc
110 ↗(On Diff #80956)

We used exactly this approach in the previous implementation.
But I continue to think that instead we need to use an array indexed by the guard (from 1 to NumPCs), see my other unresolved comment.

aizatsky updated this revision to Diff 81134.Dec 12 2016, 1:42 PM

using InternalMmapVector rather than fixed memory set.

Using InternalMmapVector for pc storage. PTAL.

kcc added inline comments.Dec 12 2016, 2:02 PM
lib/sanitizer_common/sanitizer_coverage_libcdep_new.cc
118 ↗(On Diff #81135)

hm... Why i + end - start?

The invariant here is that pc_vector.size() is max_guard_value + 1,
i.e. pc_vector[max_guard_value] is the last element.

122 ↗(On Diff #81135)

Let's do proper atomics here.
I suggest

idx = atomic_exchange(guard, 0, memory_order_acquire);
if (!idx) return;
....
kcc added inline comments.Dec 12 2016, 2:23 PM
lib/sanitizer_common/sanitizer_coverage_libcdep_new.cc
122 ↗(On Diff #81135)

memory_order_relaxed, of course

aizatsky updated this revision to Diff 81143.Dec 12 2016, 2:33 PM
aizatsky marked 3 inline comments as done.

atomic

lib/sanitizer_common/sanitizer_coverage_libcdep_new.cc
118 ↗(On Diff #81135)

obviously. The line was before the loop, I moved it below and forgot to fix it. Done.

kcc added inline comments.Dec 12 2016, 2:57 PM
lib/sanitizer_common/sanitizer_coverage_libcdep_new.cc
118 ↗(On Diff #81135)

Still hmmmm.
if the vector's size is 'i' we can't use pc_vector[i]

and just doing pc_vector.resize(i+1) won't work too, because next time you call it you will do

u32 i = pc_vector.size();

and so new 'i' will be old_i+1, which is not what you want.

I would decouple the counter and the vector size and just have another data member in TracePcGuardController

for (u32* p = start; p < end; p++) *p = ++num_guards;
pc_vector.resize(num_guards + 1);

aizatsky added inline comments.Dec 12 2016, 3:10 PM
lib/sanitizer_common/sanitizer_coverage_libcdep_new.cc
118 ↗(On Diff #81135)

I think this is exactly what I need. Let's say we call it twice. First time len=2, second time len=3.

  1. i = 0, we write 1 & 2 to *p, size = 2 afterwards
  1. i = 2, we write 3, 4, 5 to *p, size = 5
  2. i = 5 we'll write 6 to guards.
kcc added inline comments.Dec 12 2016, 3:14 PM
lib/sanitizer_common/sanitizer_coverage_libcdep_new.cc
118 ↗(On Diff #81135)

i = 0, we write 1 & 2 to *p, size = 2 afterwards

But that's wrong! :)
size of pc_vector to be 3, because few lines later you do

pc_vector[idx] = pc;

where idx is either 1 or 2.

aizatsky updated this revision to Diff 81151.Dec 12 2016, 3:17 PM

using idx - 1

lib/sanitizer_common/sanitizer_coverage_libcdep_new.cc
118 ↗(On Diff #81135)

I see what you mean. It's actually correct to do

pc_vector[idx - 1] = pc, since we start idx from 1. Fixed. I'll try to trigger it in the test.

kcc accepted this revision.Dec 12 2016, 3:29 PM
kcc added a reviewer: kcc.

LGTM

This revision is now accepted and ready to land.Dec 12 2016, 3:29 PM
This revision was automatically updated to reflect the committed changes.