Different variables and functions might have the same name in different compilation units (CUs). To calculate the 'Availability' metric more accurately (i.e. to avoid getting availability above 100%), we need some additional logic to distinguish between them. The patch introduces a DIE identifier that consists of a function/variable name and its declaration information: a filename and a line number. This allows us to distinguish different functions/variables with the same name (where "different" means declared in different files or on different lines), while duplicates are still counted as duplicates.

This patch is the first in a set of 6 that aims to improve the 'Availability' statistics. For testing purposes, I used a non-optimized build of clang-3.4. Here are the results for the 10 'heaviest' binaries before and after all 6 patches:
| Binary name  | Source variables | Baseline (master) "Availability" | Baseline (master) time (s) | Patched "Availability" | Patched time (s) |
|--------------|------------------|----------------------------------|----------------------------|------------------------|------------------|
| opt          | 637923           | 214%                             | 1.644                      | 100%                   | 2.47             |
| lli          | 362669           | 177%                             | 0.898                      | 99%                    | 1.346            |
| diagtool     | 459671           | 184%                             | 1.399                      | 100%                   | 2.103            |
| arcmt-test   | 572726           | 180%                             | 1.665                      | 100%                   | 2.53             |
| llc          | 527603           | 200%                             | 1.431                      | 100%                   | 2.282            |
| clang-3.4    | 1445548          | 208%                             | 4.33                       | 100%                   | 6.413            |
| llvm-lto     | 629415           | 208%                             | 1.652                      | 100%                   | 2.666            |
| clang-check  | 621696           | 195%                             | 2.108                      | 100%                   | 3.302            |
| clang-format | 470768           | 184%                             | 1.429                      | 99%                    | 2.154            |
| llvm-c-test  | 478700           | 200%                             | 1.302                      | 100%                   | 2.033            |
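
The identifier idea from the summary can be illustrated with a small standalone sketch. This is not the code from the patch: `Decl`, `DieId`, `makeDieId`, `countByName`, `countByDieId` and `sampleDecls` are hypothetical names, and the sketch assumes the name, declaration file and declaration line have already been read from each DIE. It only shows that keying on the name alone merges distinct entities, while keying on (name, file, line) keeps them apart and still collapses true duplicates.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical stand-in for the declaration data read from a DIE.
struct Decl {
  std::string Name;
  std::string DeclFile;
  std::uint32_t DeclLine;
};

// Assumed identifier layout: name plus declaration file and line.
using DieId = std::tuple<std::string, std::string, std::uint32_t>;

DieId makeDieId(const Decl &D) {
  return std::make_tuple(D.Name, D.DeclFile, D.DeclLine);
}

// Counts distinct entities when only the name is used as the key.
std::size_t countByName(const std::vector<Decl> &Decls) {
  std::set<std::string> Seen;
  for (const Decl &D : Decls)
    Seen.insert(D.Name);
  return Seen.size();
}

// Counts distinct entities when the full identifier is used as the key.
std::size_t countByDieId(const std::vector<Decl> &Decls) {
  std::set<DieId> Seen;
  for (const Decl &D : Decls)
    Seen.insert(makeDieId(D));
  return Seen.size();
}

// Illustrative input: two different `helper` declarations plus one
// repeat of the first, as it might appear again in another CU.
std::vector<Decl> sampleDecls() {
  return {
      {"helper", "a.cpp", 10},
      {"helper", "b.cpp", 42},
      {"helper", "a.cpp", 10},
  };
}
```

With this input, the name-only key sees a single `helper`, whereas the full identifier sees two distinct declarations and treats the third entry as a duplicate of the first.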
Could you please add an explanation here of why '(probably)'?