Currently this pass is useless since the cfg-normalization pass contains
this functionality.
Also this patch refactors peephole options enum by placing it Peepholes class.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
Differential D118364
[BOLT] Remove peephole useless-branches pass yota9 on Jan 27 2022, 6:43 AM. Authored by
Details
Diff Detail
Event TimelineComment Actions Hi Vladislav, I've added the clang-icp.test to binary tests repo (https://github.com/rafaelauler/bolt-tests) and this test breaks with this diff. Please let me know if you can reproduce it. Comment Actions Please use bolt/utils/llvm-bolt-wrapper.py script (rGc2a961e414e0) to run binary matching.
/path/to/llvm-bolt-wrapper.py /path/to/llvm-bolt.old /path/to/llvm-bolt.new > bin/llvm-bolt-wrapper.ini
ln -sf /path/to/llvm-bolt-wrapper.py /path/to/llvm-bolt
bin/llvm-lit -a tools/bolttests/clang-icp.test The mismatch in text section should be reported. Comment Actions Call fixBranches in fixDoubleJumps. Comment Actions No, it's still not matching.
Comment Actions @Amir Hello! Yes, you're right, there are 256 bytes mismatch in .text.cold between binaries (the new binary becomes smaller). But now we not only remove the useless condition branches, we are also removing the useless unconditional branches. The example you can see in small function "_ZN4llvm11InstVisitorINS_26PGOIndirectCallSiteVisitorEvE16delegateCallInstERNS_8CallInstE" Previously after-bolt we had: 38bd9e5: eb f3 jmp 38bd9da <_ZN4llvm11InstVisitorINS_26PGOIndirectCallSiteVisitorEvE16delegateCallInstERNS_8CallInstE+0x16> 38bd9e7: eb f1 jmp 38bd9da <_ZN4llvm11InstVisitorINS_26PGOIndirectCallSiteVisitorEvE16delegateCallInstERNS_8CallInstE+0x16> And now: 38bd9e5: eb f3 jmp 38bd9da <_ZN4llvm11InstVisitorINS_26PGOIndirectCallSiteVisitorEvE16delegateCallInstERNS_8CallInstE+0x16> 38bd9e7: 90 nop The second jump was emitted since previously we didn't run eraseInvalidBBs() after peephole pass, although in fact this instruction (which is in separate BB) becomes unreachable. Also fixBranches() makes some work. E.g. in "_ZN4llvm13ScheduleDAGMID1Ev" Previously we had jump on the next instruction: 47b1c51: e8 da 35 cb fe callq 3465230 <_ZN12_GLOBAL__N_113CopyConstrainD0Ev> 47b1c56: eb 07 jmp 47b1c5f <_ZN4llvm13ScheduleDAGMID1Ev+0x51> 47b1c58: eb 05 jmp 47b1c5f <_ZN4llvm13ScheduleDAGMID1Ev+0x51> 47b1c5a: ff 50 10 callq *0x10(%rax) 47b1c5d: eb 00 jmp 47b1c5f <_ZN4llvm13ScheduleDAGMID1Ev+0x51> 47b1c5f: 48 8d 53 08 lea 0x8(%rbx),%rdx And now no extra jump on 47b1c5d: 47b1c11: e8 1a 36 cb fe callq 3465230 <_ZN12_GLOBAL__N_113CopyConstrainD0Ev> 47b1c16: eb 03 jmp 47b1c1b <_ZN4llvm13ScheduleDAGMID1Ev+0x4d> 47b1c18: ff 50 10 callq *0x10(%rax) 47b1c1b: 48 8d 53 08 lea 0x8(%rbx),%rdx This example also contains one unreachable jump removed by eraseInvalidBBs right in one function :) Comment Actions Thanks for investigating! I'm checking it with more tests and will return to you shortly Comment Actions The SCTC pass breaks CFG and we are unable to call fixBranches at that point anymore, closing review for now, thank you @Amir for your time :) |