The sample profile loader can run in both LTO prelink and postlink. Currently, the count annotation in postlink doesn't fully overwrite what was done in prelink. I'm adding a switch (`-overwrite-existing-weights=1`) to enable a full overwrite, which includes:
1. Clear old metadata for calls when their parent block has a zero count. This could be caused by prelink code duplication.
2. Clear indirect call metadata if the remaining targets somehow sum to a zero count.
3. Overwrite branch weight for basic blocks.
When counts are accurate, I saw #1 and #2 help reduce code size by preventing post-sample ICP and the CGSCC inliner from working on obsolete metadata.
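
The three overwrite steps can be sketched on a toy model. This is only an illustration, not the code in the patch: `Call`, `Block`, and `annotateBlock` are hypothetical stand-ins rather than LLVM types, and the non-overwrite path (keep existing branch weights, leave call metadata alone) is an assumption about the current behavior.

```cpp
#include <cstdint>
#include <numeric>
#include <optional>
#include <vector>

// Hypothetical stand-ins for IR entities; none of these are LLVM types.
struct Call {
  bool IsIndirect = false;
  std::optional<uint64_t> Weight;      // direct-call count metadata from prelink
  std::vector<uint64_t> TargetCounts;  // remaining indirect-call target counts
};

struct Block {
  uint64_t Count = 0;                  // block count from the postlink profile
  std::vector<Call> Calls;
  std::optional<std::vector<uint32_t>> BranchWeights; // weights set in prelink
};

// Annotate one block with postlink data. When Overwrite is false, existing
// metadata is kept (assumed behavior without the switch).
void annotateBlock(Block &BB, const std::vector<uint32_t> &NewWeights,
                   bool Overwrite) {
  if (Overwrite) {
    for (Call &C : BB.Calls) {
      uint64_t Sum = std::accumulate(C.TargetCounts.begin(),
                                     C.TargetCounts.end(), uint64_t{0});
      if (BB.Count == 0) {
        // Step 1: the parent block has a zero count, so drop stale metadata.
        C.Weight.reset();
        C.TargetCounts.clear();
      } else if (C.IsIndirect && Sum == 0) {
        // Step 2: the remaining indirect-call targets sum to zero.
        C.TargetCounts.clear();
      }
    }
  }
  // Step 3: overwrite branch weights, or set them if none exist yet.
  if (!NewWeights.empty() && (Overwrite || !BB.BranchWeights))
    BB.BranchWeights = NewWeights;
}
```

With the switch off, a block that already carries weights `{1, 9}` keeps them in this sketch; with the switch on, it takes the postlink weights and its calls lose any metadata that the new counts no longer support.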