Currently BPI requires edge weights to be greater than 0, but MBPI has no such requirement. In MBPI, a zero edge weight is treated as missing weight information for that edge, and at runtime it is replaced by the default edge weight of 16. This can distort the ratio between edge weights: if a block originally has out-edge weights 0 and 1, they are later read as 16 and 1. A zero weight is also ambiguous, because it can mean that there is no information, or it can be a real value obtained from BPI or calculated by the user. Finally, when all out-edges of a block have zero weight, computing edge probabilities is awkward, although this case is currently worked around by using default edge weights. If we require edge weights in MBPI to be greater than zero as well, none of these issues arise.
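To make the ratio problem concrete, here is a minimal standalone sketch of the substitution described above. It is not the MBPI code: `DefaultWeight`, `effectiveWeight` and `edgeProbability` are hypothetical names used only for illustration, and the only thing taken from the description is that a stored zero is read back as 16.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: mirrors the default edge weight of 16 mentioned above.
constexpr uint32_t DefaultWeight = 16;

// Hypothetical helper: a stored zero is read back as the default weight.
uint32_t effectiveWeight(uint32_t Stored) {
  return Stored == 0 ? DefaultWeight : Stored;
}

// Probability of taking out-edge Idx, computed from the effective weights.
// Stored must be non-empty; every effective weight is at least 1, so the
// sum is never zero.
double edgeProbability(const std::vector<uint32_t> &Stored, std::size_t Idx) {
  uint64_t Sum = 0;
  for (uint32_t W : Stored)
    Sum += effectiveWeight(W);
  return static_cast<double>(effectiveWeight(Stored[Idx])) /
         static_cast<double>(Sum);
}
```

With stored weights {0, 1}, the edge that was meant to be never (or almost never) taken ends up with probability 16/17, roughly 94%, and the other edge with 1/17.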
In addition, although BPI requires edge weights to be greater than zero, it does not guarantee this when weights are read from metadata. So we need to normalize edge weights whenever a zero weight is read from metadata. For example, if a block has exactly two out-edges with weights 0 and 1, we can normalize them to 1 and UINT32_MAX - 1.
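One possible way to do this normalization is sketched below. This is only an illustration under stated assumptions, not the code in the patch: `normalizeWeights` is a hypothetical name, zero weights are bumped to 1, non-zero weights are scaled up into the remaining `UINT32_MAX` budget, and the all-zero case is mapped to all ones (the description does not say how that case should be handled).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch: return weights that are all greater than zero while
// roughly preserving the ratio between the non-zero inputs.
std::vector<uint32_t> normalizeWeights(const std::vector<uint32_t> &Weights) {
  uint64_t NonZeroSum = 0;
  uint64_t NumZeros = 0;
  for (uint32_t W : Weights) {
    if (W == 0)
      ++NumZeros;
    else
      NonZeroSum += W;
  }
  if (NumZeros == 0)
    return Weights; // Nothing to fix.

  // Every zero weight becomes 1.
  std::vector<uint32_t> Result(Weights.size(), 1);
  if (NonZeroSum == 0)
    return Result; // Assumption: all-zero input becomes all ones.

  // Scale the non-zero weights into what is left of the 32-bit range.
  const uint64_t Budget = UINT32_MAX - NumZeros;
  for (std::size_t I = 0; I < Weights.size(); ++I) {
    if (Weights[I] == 0)
      continue;
    uint64_t Scaled = static_cast<uint64_t>(Weights[I]) * Budget / NonZeroSum;
    // Clamp so that no weight is rounded down to zero. With many very large
    // weights this clamp could push the total slightly past UINT32_MAX; a
    // real implementation would have to account for that.
    Result[I] = Scaled == 0 ? 1 : static_cast<uint32_t>(Scaled);
  }
  return Result;
}
```

For the input {0, 1} this yields {1, UINT32_MAX - 1}, matching the example above. Weights that are already all non-zero are returned unchanged.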
This patch contains the following changes:
- Normalize edge weights in BPI to guarantee that they are greater than 0.
- In MBPI, don't turn zero edge weights into default weights, as we should not have zero weights anymore.
- (To discuss) Previously the weight list could be empty when it was not used at all. I found it difficult to keep the weight list and the successor list synchronized at all times, so this patch always updates the weight list whether it is used or not. Is this acceptable?
- Adjust uses and test cases accordingly.
Is 16 a good default value to indicate missing information?
The current underlying assumption in MBPI (without this patch) is that a 0 weight represents an 'unknown' weight if the weight list is empty; otherwise it is treated as a real zero-valued weight. This assumption is of course very weak and can break down at any time.
All of this needs to be considered in the new design when weight is eliminated from the profiling-related interfaces, i.e., reserve a special value for an unknown 'probability'.
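As a rough illustration of what reserving a special value could look like, here is a hypothetical sketch. `EdgeProbability`, its fixed denominator and the choice of `UINT32_MAX` as the sentinel are all assumptions made for this example and are not an existing interface.

```cpp
#include <cstdint>

// Hypothetical fixed-point probability with a reserved 'unknown' value.
class EdgeProbability {
  // Assumption: numerators are stored relative to a fixed denominator.
  static constexpr uint32_t Denominator = 1u << 31;
  // Assumption: UINT32_MAX can never be a valid numerator, so it is free to
  // mean "no information".
  static constexpr uint32_t UnknownN = UINT32_MAX;

  uint32_t N;

  explicit constexpr EdgeProbability(uint32_t N) : N(N) {}

public:
  static constexpr EdgeProbability getUnknown() {
    return EdgeProbability(UnknownN);
  }
  static constexpr EdgeProbability getZero() { return EdgeProbability(0); }

  // Requires Total != 0 and Weight <= Total.
  static EdgeProbability fromWeights(uint32_t Weight, uint64_t Total) {
    return EdgeProbability(
        static_cast<uint32_t>(uint64_t(Weight) * Denominator / Total));
  }

  constexpr bool isUnknown() const { return N == UnknownN; }
  constexpr bool isZero() const { return N == 0; }
};
```

With a representation like this, a real zero probability and a missing probability are distinct values, so callers no longer have to infer the meaning of 0 from whether the weight list is empty.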