This is an archive of the discontinued LLVM Phabricator instance.

[Sanitizer Coverage] Modify initialization of array bounds for sanitizer coverage.
AbandonedPublic

Authored by mpividori on Jan 6 2017, 8:32 PM.

Details

Summary

There is an important difference in how the linker handles sections on Windows.
As mentioned in https://msdn.microsoft.com/en-us/library/7977wcck.aspx , the linker can include some zero padding between each part of the section when merging. Which means, it can include some zero padding between __start___sancov_guards and the first guard array, between the arrays and between the final array and __stop___sancov_guards.
In practice, I only see this zero padding when linking "incrementally", but I couldn't find official documentation, so this could also happen when linking non incrementally.

So, to deal with this difference, I simply modified the implementation of the instrumentation, to initialize the integers in the guard arrays with 0xffffffff , so we can distinguish between the parts of the section that belongs to a guard array and the part of the sections that represents zero padding.

For example, when I compile a simple test with "-fsanitize-coverage=trace-pc-guard" , link incrementally and execute "dumpbin /section:.CSAN /rawdata my_test.exe" , I get:

RAW DATA #4

051A6000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
051A6010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
......
051A6110: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF  ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
051A6120: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF  ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
051A6130: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF  ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
051A6140: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF  ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
051A6150: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
051A6160: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF  ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
051A6170: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF  ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
051A6180: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF  ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
051A6190: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF  ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
051A61A0: 00 00 00 00 00 00 00 00 00 00 00 00 FF FF FF FF  ............ÿÿÿÿ
051A61B0: FF FF FF FF FF FF FF FF FF FF FF FF 00 00 00 00  ÿÿÿÿÿÿÿÿÿÿÿÿ....
051A61C0: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF  ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
051A61D0: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF  ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
 .......
051A6230: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
 .....

(the parts of the section with "FF" represent the guard arrays)

So, when initializing the arrays guards, we will iterate from &__start___sancov_guards to &__stop___sancov_guards considering only the positions with value 0xffffffff

(In a different diff I update the implementation of sanitizer coverage to work with these changes).

Diff Detail

Event Timeline

mpividori updated this revision to Diff 83497.Jan 6 2017, 8:32 PM
mpividori retitled this revision from to [Sanitizer Coverage] Modify initialization of array bounds for sanitizer coverage..
mpividori updated this object.
mpividori added reviewers: kcc, aizatsky, rnk, zturner.
mpividori set the repository for this revision to rL LLVM.
mpividori added a subscriber: llvm-commits.
rnk edited edge metadata.Jan 9 2017, 1:40 PM

I think we want to ensure that these globals end up in .bss on non-Windows platforms. It saves a lot of object file and binary size to keep them in .bss. In the worst case, we may need to make the initializer platform-dependent. However, before we do that, are you sure the sanitizer runtime can't tolerate extra padding in this array? Is it a problem if some of the guard "pcs" are wasted on padding? It's worth discussing this in person with Kostya to figure this out. It would also be nice if this could live in .bss on Windows as well as Linux.

@rnk , yes. I was worrying about that too, the size of the data in the binary files.....
The problem is that we need to distinguish between the parts of the section that belongs to a guard array and the parts of the sections that represents zero padding.
If we consider the zero padding as part of the arrays, we won't have an accurate information about the coverage, because that "guards" will never be visited.

I see 2 possible solutions if we want to keep that data in .bss:

+ Require non-incremental linking for instrumented code (and assume this doesn't include padding).

+ Modify the instrumentation. Instead of inserting __sanitizer_cov_trace_pc_guard_init(__start___sancov_guards,__stop___sancov_guards) for each TU, we could insert: __sanitizer_cov_trace_pc_guard_init(pointer to beginning of array of guards in that TU, pointer to end of array of guard in that TU). This would require some changes in the Module Pass for Sanitizer Coverage, since we should create an array for all the TU, so we need to know the number of basic block in that TU (currently we create an array when instrumenting each function, with the length of the number of basic blocks in that function). This shouldn't be very difficult to implement.

kcc edited edge metadata.Jan 10 2017, 4:06 PM

Please try to find another solution.
The zero default value is universally better than any other.

@kcc one possible approach, that I discussed with @rnk , is to suggest linking non-incrementally and mention that if incremental linking is used, then the coverage measure won't be accurate.
At least I think is the best we can do now.

mpividori planned changes to this revision.Jan 13 2017, 1:24 PM
mpividori abandoned this revision.Feb 7 2017, 4:17 PM