by replacing DenseMap with IndexedMap for LLTs within MRI, as
benchmarked by cross-compiling the sqlite3 amalgamation for AArch64
on an x86 machine.
Per-pass diffs follow:
IRTranslator: +0.1%
Legalizer: -4.5%
RegBankSelect: -4.4%
Localizer: -2.5%
InstructionSelect: -8.8%
Total GlobalISel: -5.5%
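For illustration only, the container swap can be sketched in standalone C++. This is not the patch itself: `std::unordered_map` and `std::vector` stand in for `DenseMap` and `IndexedMap`, and `ToyLLT`, `HashedTypeMap`, `IndexedTypeMap` and `checkAgree` are hypothetical names invented for this sketch. It assumes that virtual register indices are small, consecutive integers, so that a vector slot per register is affordable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a low-level type; the real LLT carries more.
struct ToyLLT {
  std::uint32_t SizeInBits;
  explicit ToyLLT(std::uint32_t Bits = 0) : SizeInBits(Bits) {}
  bool isValid() const { return SizeInBits != 0; } // 0 means "no type"
};

// Hash-based lookup (stand-in for the DenseMap approach):
// every query hashes the key and searches a bucket.
class HashedTypeMap {
  std::unordered_map<unsigned, ToyLLT> Map;

public:
  void set(unsigned VRegIdx, ToyLLT Ty) { Map[VRegIdx] = Ty; }
  ToyLLT get(unsigned VRegIdx) const {
    auto It = Map.find(VRegIdx);
    return It == Map.end() ? ToyLLT{} : It->second;
  }
};

// Index-based lookup (stand-in for the IndexedMap approach):
// a query is a bounds check plus one array access, with no hashing.
class IndexedTypeMap {
  std::vector<ToyLLT> Storage;

public:
  // Ensure slot VRegIdx exists; new slots hold the invalid default type.
  void grow(unsigned VRegIdx) {
    if (VRegIdx >= Storage.size())
      Storage.resize(static_cast<std::size_t>(VRegIdx) + 1);
  }
  void set(unsigned VRegIdx, ToyLLT Ty) {
    grow(VRegIdx);
    Storage[VRegIdx] = Ty;
  }
  ToyLLT get(unsigned VRegIdx) const {
    // Registers beyond the current storage simply have no type yet.
    return VRegIdx < Storage.size() ? Storage[VRegIdx] : ToyLLT{};
  }
};

// Sanity helper: both containers must agree for every index below N.
inline void checkAgree(const HashedTypeMap &H, const IndexedTypeMap &I,
                       unsigned N) {
  for (unsigned Idx = 0; Idx < N; ++Idx)
    assert(H.get(Idx).SizeInBits == I.get(Idx).SizeInBits);
}
```

The trade-off in this sketch is memory for speed: the vector holds one slot per register index even when no type was assigned, which is cheap when most indices are in use.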
More data below:
llc -O0 -mtriple arm64-ios-apple sqlite3.bc -time-passes -time-compilations=20
| Pass / statistic | Before | After | Diff |
|---|---|---|---|
| GlobalISel 5 Passes | 7.7847 | 7.3563 | |
| | 7.7927 | 7.3586 | |
| | 7.7896 | 7.3576 | |
| | 7.7876 | 7.3538 | |
| | 7.7703 | 7.3615 | |
| Min | 7.770 | 7.354 | -5.4% |
| Avg | 7.785 | 7.358 | -5.5% |
| Err | 0.3% | 0.1% | |
| IRTranslator | 1.7660 | 1.7682 | |
| | 1.7737 | 1.7678 | |
| | 1.7682 | 1.7674 | |
| | 1.7626 | 1.7702 | |
| | 1.7595 | 1.7667 | |
| Min | 1.760 | 1.767 | 0.4% |
| Avg | 1.766 | 1.768 | 0.1% |
| Err | 0.8% | 0.2% | |
| Legalizer | 0.8089 | 0.7729 | |
| | 0.8073 | 0.7729 | |
| | 0.8088 | 0.7732 | |
| | 0.8099 | 0.7705 | |
| | 0.8074 | 0.7720 | |
| Min | 0.807 | 0.771 | -4.6% |
| Avg | 0.808 | 0.772 | -4.5% |
| Err | 0.3% | 0.4% | |
| RegBankSelect | 0.8034 | 0.7638 | |
| | 0.7988 | 0.7669 | |
| | 0.8011 | 0.7635 | |
| | 0.8004 | 0.7640 | |
| | 0.7985 | 0.7670 | |
| Min | 0.799 | 0.764 | -4.4% |
| Avg | 0.800 | 0.765 | -4.4% |
| Err | 0.6% | 0.5% | |
| Localizer | 0.4800 | 0.4693 | |
| | 0.4800 | 0.4684 | |
| | 0.4804 | 0.4680 | |
| | 0.4803 | 0.4676 | |
| | 0.4797 | 0.4679 | |
| Min | 0.480 | 0.468 | -2.5% |
| Avg | 0.480 | 0.468 | -2.5% |
| Err | 0.1% | 0.4% | |
| InstructionSelect | 3.9264 | 3.5821 | |
| | 3.9329 | 3.5826 | |
| | 3.9311 | 3.5855 | |
| | 3.9344 | 3.5815 | |
| | 3.9252 | 3.5879 | |
| Min | 3.925 | 3.582 | -8.8% |
| Avg | 3.930 | 3.584 | -8.8% |
| Err | 0.2% | 0.2% | |

Notes:
Err = (Max - Min) / Min
Diff = (After - Before) / Before
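The two formulas in the notes can be checked with a small standalone C++ sketch. `Summary`, `summarize` and `relativeDiff` are hypothetical helper names, not part of LLVM or of the patch. Feeding in the five "GlobalISel 5 Passes" samples from the table gives an average diff that rounds to the reported -5.5%.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Per-column statistics as used in the table.
struct Summary {
  double Min;
  double Max;
  double Avg;
  double Err; // (Max - Min) / Min, a rough measure of run-to-run noise
};

// Compute Min, Max, Avg and Err over one column of timing samples.
Summary summarize(const std::vector<double> &Samples) {
  assert(!Samples.empty() && "need at least one timing sample");
  auto MinMax = std::minmax_element(Samples.begin(), Samples.end());
  double Min = *MinMax.first;
  double Max = *MinMax.second;
  double Avg = std::accumulate(Samples.begin(), Samples.end(), 0.0) /
               static_cast<double>(Samples.size());
  return {Min, Max, Avg, (Max - Min) / Min};
}

// Diff = (After - Before) / Before; negative values mean a speedup.
double relativeDiff(double Before, double After) {
  return (After - Before) / Before;
}
```

Multiply `Err` and the result of `relativeDiff` by 100 to get the percentages shown in the table.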
Maybe something like "VReg's low-level type and register class have different sizes" would be a bit more concise?