This is an archive of the discontinued LLVM Phabricator instance.

[BOLT] CMOVConversion pass
ClosedPublic

Authored by Amir on Feb 18 2022, 11:27 PM.

Details

Summary

Convert simple hammocks into cmov based on misprediction rate.

Test Plan:

  • Assembly test: cmov-conversion.s
  • Testing on a binary:
    1. Bootstrap clang with -x86-cmov-converter-force-all and -Wl,--emit-relocs (Release build)
    2. Collect perf.data:
      • clang++ <opts> bolt/lib/Core/BinaryFunction.cpp -E > bf.cpp
      • perf record -e cycles:u -j any,u -- clang-15 bf.cpp -O2 -std=c++14 -c -o bf.o
    3. Optimize clang-15 with and w/o -cmov-conversion:
      • llvm-bolt clang-15 -p perf.data -o clang-15.bolt
      • llvm-bolt clang-15 -p perf.data -cmov-conversion -o clang-15.bolt.cmovconv
    4. Run perf experiment:
      • test: clang-15.bolt.cmovconv,
      • control: clang-15.bolt,
      • workload (clang options): bf.cpp -O2 -std=c++14 -c -o bf.o

Results:

task-clock [delta: -360.21 ± 356.75, delta(%): -1.7760 ± 1.7589, p-value: 0.047951, balance: -6]
instructions  [delta: 44061118 ± 13246382, delta(%): 0.0690 ± 0.0207, p-value: 0.000001, balance: 50]
icache-misses [delta: -5534468 ± 2779620, delta(%): -0.4331 ± 0.2175, p-value: 0.028014, balance: -28]
branch-misses [delta: -1624270 ± 1113244, delta(%): -0.3456 ± 0.2368, p-value: 0.030300, balance: -22]

Diff Detail

Event Timeline

Amir created this revision.Feb 18 2022, 11:27 PM
Amir requested review of this revision.Feb 18 2022, 11:27 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 18 2022, 11:27 PM

Really cool! :)

lkail added a subscriber: lkail.Feb 19 2022, 1:12 AM

Thanks! Nicely written pass! There are a couple of lint issues and I have some comments, but this is mostly ready to go in my opinion.

bolt/include/bolt/Passes/CMOVConversion.h
38

I guess this comment was copied from another pass?

41

Can we also record instruction frequency and come up with a dynamic metric such as "out of 10000 executed instructions that could be converted to cmov, we converted 5000 (50%)"?

bolt/lib/Passes/CMOVConversion.cpp
67

Isn't it necessary to also check that LHS has a common predecessor with RHS? Edit: I saw how the main function uses this and this assumption is implied. Perhaps add a comment "caller guarantees that LHS and RHS share the same predecessor".

102

I know this is returning so it doesn't matter, but it makes me uneasy to think that someone might copy this code and remove the return. I would expect something like:

auto II = BB.begin();
while (II != BB.end()) {
  if (BC.MIB->isPseudo(*II)) {
    ++II;
    continue;
  }
  if (BC.MIB->isUnconditionalBranch(*II)) {
    II = BB.eraseInstruction(II);
    continue;
  }
  ...
  ++II;
}

But that might be an overkill, so we can add a comment for the incautious reader "invalidates the iterator, but that's OK since we return now"

112

Can we move this functions right after isIfThenSubgraph, closer to the helper that it uses, or vice-versa (move isIfThenSubgraph above)?

156

Is this necessary? Can't we just traverse in BF->BB vector order?

167
bolt/lib/Target/X86/X86MCPlusBuilder.cpp
2143

If we are only converting loads and you are already filtering stores below (I guess this is done in the switch-case of line 2163), can't we just drop this if and run the checks in its body regardless of it being load/store?

ayermolo added inline comments.Feb 25 2022, 12:51 PM
bolt/lib/Passes/CMOVConversion.cpp
76

can bb only have pseudo instructions?
Maybe better to check and exit?

Amir updated this revision to Diff 413156.Mar 4 2022, 3:25 PM

Change stats collection and reporting

Herald added a project: Restricted Project. · View Herald TranscriptMar 4 2022, 3:25 PM
Amir marked 2 inline comments as done.Mar 4 2022, 3:27 PM
Amir added inline comments.
bolt/include/bolt/Passes/CMOVConversion.h
41

Changed reporting into the following:

BOLT-INFO: CMOVConversion: test_memoperand_loop, converted static 0/1 (0.00%) hammock(s) into CMOV sequences, with dynamic execution count 0/2 (0.00%), saving 0/2 (0.00%) mispredictions
BOLT-INFO: CMOVConversion total: converted static 0/6 (0.00%) hammock(s) into CMOV sequences, with dynamic execution count 0/12 (0.00%), saving 0/12 (0.00%) mispredictions
Amir updated this revision to Diff 413158.Mar 4 2022, 3:40 PM
Amir marked an inline comment as done.

Address feedback

Amir updated this revision to Diff 413165.Mar 4 2022, 4:15 PM
Amir marked 4 inline comments as done.

Post-order traversal + back-to-back hammocks test

Amir marked an inline comment as done.Mar 4 2022, 4:16 PM
Amir added inline comments.
bolt/lib/Passes/CMOVConversion.cpp
67

Added both

156

Yes, post-order traversal is necessary to guarantee that we can collapse the hammock into a straight-line code (merge Predecessor, RHS and LHS into a single BB). Actually, I've made a mistake here as we need to traverse bottom-up (post-order, not RPO). Added a back-to-back hammocks test for that

Amir marked 2 inline comments as done.Mar 4 2022, 4:26 PM
Amir updated this revision to Diff 413202.Mar 4 2022, 11:48 PM

Extra knob for filtering RBP-based stack loads

Amir marked an inline comment as done.Mar 4 2022, 11:51 PM
Amir added inline comments.
bolt/lib/Target/X86/X86MCPlusBuilder.cpp
2143

No, unfortunately we can't drop it. We allow reg-reg moves, but apply filtering to loads based on AllowStackMemOp and AllowBasePtrStackMemOp. Stores are filtered by opcode.
E.g. if AllowStackMemOp is false, we should still allow reg-reg moves. Not sure how to drop isLoad check and run checks in body regardless and still allow that.

Amir marked an inline comment as done.Mar 5 2022, 10:37 AM
Amir updated this revision to Diff 413634.Mar 7 2022, 2:31 PM

Reuse BB exec count for consistent stats reporting

rafauler accepted this revision.Mar 7 2022, 5:21 PM

Thanks Amir for addressing all feedback! Looks good to me.

This revision is now accepted and ready to land.Mar 7 2022, 5:21 PM
Amir edited the summary of this revision. (Show Details)Mar 8 2022, 10:36 AM
Amir edited the summary of this revision. (Show Details)
Amir edited the summary of this revision. (Show Details)Mar 8 2022, 10:42 AM
This revision was landed with ongoing or failed builds.Mar 8 2022, 10:44 AM
This revision was automatically updated to reflect the committed changes.