Without this change, libFuzzer does not account for REDUCED inputs
during -merge, if the larger of two equivalent inputs resides in the output
corpus directory. A user may overcome this limitation by creating an empty
directory for the merge output corpus, as libFuzzer does prefer smaller units
when merging corpora from the other dirs. However, that way libFuzzer will not
provide useful incremental stats (such as `X new files with Y new features
added; Z new coverage edges`) to the user.
This change aims to close that gap and make the `-merge=` process overwrite
existing units in the output corpus directory if a shorter unit from any other
corpus dir gives the same coverage. The high-level idea is the following:
1: Emit a `SIGNATURE` value for every unit in the merge control file. The value is
calculated as a hash of all coverage edges and features for a given unit. By
using a hash we do not significantly increase the size of the merge control
file and also avoid numerous comparisons of arrays of numbers.
2: During the actual merge process, we map all the signatures for the units in
the output corpus directory to the corresponding file names and sizes. Then,
when looping through the rest of the units, we check if any of them has a
signature value matching any of the signatures in the output corpus dir, and,
if yes, we record a pair of file names to be returned to the `FuzzerDriver`.
3: In the current implementation, the `FuzzerDriver` goes through the vector of
file pairs and replaces the files. A potential improvement here might be to
combine both `NewFiles` and `ReplacedFiles` into a single vector containing
pairs of file names. For `NewFiles`, the destination file name will be empty.
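
The three steps above can be sketched roughly as follows. This is an
illustration under stated assumptions, not the code in this change: `UnitInfo`,
`Signature`, and `FindReplacements` are hypothetical names, FNV-1a stands in
for whatever hash the change actually uses, and the feature and edge lists are
assumed to be stored in a canonical (sorted) order so that equal coverage
yields equal signatures.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical per-unit record; the real merge control file layout differs.
struct UnitInfo {
  std::string Name;
  size_t Size;
  std::vector<uint32_t> Features;  // assumed sorted
  std::vector<uint32_t> Cov;       // assumed sorted
};

// Step 1: collapse all features and coverage edges of a unit into one hash.
// FNV-1a is used here only as a stand-in hash function.
uint64_t Signature(const UnitInfo &U) {
  uint64_t H = 14695981039346656037ULL;
  auto Mix = [&H](uint32_t V) {
    for (int I = 0; I < 4; ++I) {
      H ^= (V >> (8 * I)) & 0xFF;
      H *= 1099511628211ULL;
    }
  };
  for (uint32_t F : U.Features) Mix(F);
  Mix(0xFFFFFFFFu);  // separator between the two lists
  for (uint32_t E : U.Cov) Mix(E);
  return H;
}

// Steps 2 and 3: index the output corpus by signature, then look for smaller
// units elsewhere with a matching signature. Returns {source, destination}
// pairs that a driver could use to overwrite files in the output corpus.
std::vector<std::pair<std::string, std::string>>
FindReplacements(const std::vector<UnitInfo> &OutputCorpus,
                 const std::vector<UnitInfo> &OtherUnits) {
  struct Slot {
    std::string OutName;  // file in the output corpus
    size_t BestSize;      // smallest size seen so far for this signature
    std::string BestSrc;  // empty until a smaller candidate is found
  };
  std::unordered_map<uint64_t, Slot> BySig;
  // If two output units share a signature, only the first one is tracked.
  for (const auto &U : OutputCorpus)
    BySig.emplace(Signature(U), Slot{U.Name, U.Size, ""});

  for (const auto &U : OtherUnits) {
    auto It = BySig.find(Signature(U));
    if (It == BySig.end() || U.Size >= It->second.BestSize) continue;
    It->second.BestSize = U.Size;
    It->second.BestSrc = U.Name;
  }

  // Walk the output corpus again so the result order is deterministic.
  std::vector<std::pair<std::string, std::string>> Result;
  for (const auto &U : OutputCorpus) {
    auto It = BySig.find(Signature(U));
    if (It != BySig.end() && It->second.OutName == U.Name &&
        !It->second.BestSrc.empty())
      Result.emplace_back(It->second.BestSrc, U.Name);
  }
  return Result;
}
```

In this sketch, a unit from another dir that has no matching signature is
simply ignored; in the actual merge it would go through the usual "new file"
path instead.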
The test change is not entirely obvious, because it is very easy to trigger a
"new" signal from value profiling when dealing with only a few inputs.