This is an archive of the discontinued LLVM Phabricator instance.

[NOT FOR REVIEW][NOT FOR COMMIT] X86 machine instruction counting
Needs Review · Public

Authored by lebedev.ri on Sep 5 2019, 2:35 PM.
This revision needs review, but there are no reviewers specified.

Details

Reviewers
None
Summary

As discussed in D65148 (https://reviews.llvm.org/D65148#1658556)

$ cat /tmp/test.c 
int a();
int b();
int c(int d) {
    return (d < 0) ? a() : b();
}
$ ./bin/clang -o - -S -g0 -O3 /tmp/test.c -mllvm -print-after-all -mllvm -debug -mllvm -stats 2>&1 | tail -n 76
# End machine code for function c.

********** COUNT MACHINE INSTRUCTIONS: c **********
Found conditional branch instruction: JCC_1 %bb.1, 8, implicit $eflags

Found unconditional branch instruction: TAILJMPd64 @b, <regmask $bh $bl $bp $bph $bpl $bx $ebp $ebx $hbp $hbx $rbp $rbx $r12 $r13 $r14 $r15 $r12b $r13b $r14b $r15b $r12bh $r13bh $r14bh $r15bh $r12d $r13d $r14d $r15d $r12w $r13w $r14w $r15w $r12wh and 3 more...>, implicit $rsp, implicit $ssp, implicit $rsp, implicit $ssp, implicit killed $al

Found unconditional branch instruction: TAILJMPd64 @a, <regmask $bh $bl $bp $bph $bpl $bx $ebp $ebx $hbp $hbx $rbp $rbx $r12 $r13 $r14 $r15 $r12b $r13b $r14b $r15b $r12bh $r13bh $r14bh $r15bh $r12d $r13d $r14d $r15d $r12w $r13w $r14w $r15w $r12wh and 3 more...>, implicit $rsp, implicit $ssp, implicit $rsp, implicit $ssp, implicit killed $al

        discovered a new reachable node %bb.0
        discovered a new reachable node %bb.1
        discovered a new reachable node %bb.2
        .text
        .file   "test.c"
        .globl  c                       # -- Begin function c
        .p2align        4, 0x90
        .type   c,@function
c:                                      # @c
        .cfi_startproc
# %bb.0:                                # %entry
        xorl    %eax, %eax
        testl   %edi, %edi
        js      .LBB0_1
# %bb.2:                                # %cond.false
        jmp     b                       # TAILCALL
.LBB0_1:                                # %cond.true
        jmp     a                       # TAILCALL
.Lfunc_end0:
        .size   c, .Lfunc_end0-c
        .cfi_endproc
                                        # -- End function

        .ident  "clang version 10.0.0 (git@github.com:LebedevRI/llvm-project.git 1cf013a6519b38fdc585e35bf241c069431be4ca)"
        .section        ".note.GNU-stack","",@progbits
        .addrsig
===-------------------------------------------------------------------------===
                          ... Statistics Collected ...
===-------------------------------------------------------------------------===

 5 asm-printer       - Number of machine instrs printed
 1 branch-folder     - Number of times common instructions are hoisted
 1 cgscc-passmgr     - Maximum CGSCCPassMgr iterations on one SCC
 2 codegenprepare    - Number of return instructions duplicated
 4 dagcombine        - Number of dag nodes combined
 3 globalopt         - Number of globals marked unnamed_addr
 2 instcombine       - Number of insts combined
 3 isel              - Number of blocks selected using DAG
14 isel              - Number of times dag isel has to try another path
 1 isel              - Number of entry blocks encountered
 1 machine-scheduler - Number of instr pairs fused
 1 mem2reg           - Number of alloca's promoted with a single store
 1 prologepilog      - Number of functions seen in PEI
 1 regalloc          - Number of registers assigned
 2 regalloc          - Number of instructions deleted by DCE
 1 regalloc          - Number of identity moves eliminated after rewriting
 2 regalloc          - Number of instructions re-materialized
 2 regalloc          - Number of shrinkToUses called
 2 regalloc          - Number of cross class joins performed
 2 regalloc          - Number of interval joins performed
 1 shrink-wrap       - Number of functions
 1 sroa              - Maximum number of partitions per alloca
 2 sroa              - Maximum number of uses of a partition
 2 sroa              - Number of alloca partition uses rewritten
 1 sroa              - Number of alloca partitions formed
 1 sroa              - Number of allocas analyzed for replacement
 2 sroa              - Number of instructions deleted
 1 sroa              - Number of allocas promoted to SSA values
 1 stackmaps         - Number of functions skipped
 1 stackmaps         - Number of functions visited
 2 x86-isel          - Number of tail calls
 1 x86-mi-counting   - Number of conditional branch instructions
 3 x86-mi-counting   - Number of machine basic blocks
 1 x86-mi-counting   - Number of machine functions
 5 x86-mi-counting   - Number of machine instructions
 2 x86-mi-counting   - Number of unconditional branch instructions

Event Timeline

lebedev.ri created this revision. Sep 5 2019, 2:35 PM
Herald added a project: Restricted Project. Sep 5 2019, 2:35 PM
lebedev.ri edited the summary of this revision. Sep 5 2019, 2:36 PM
craig.topper added inline comments.
llvm/lib/Target/X86/X86CountMIAnalysis.cpp
203

Why VPCLMULQDQ?

204

VPCMOV is a very different instruction from CMOV_*: it is "c ? a : b" evaluated independently for each bit. In IR it would be (c & a) | (~c & b).
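
To make the distinction in the comment above concrete, here is a minimal sketch contrasting a whole-value conditional select with the per-bit select described for VPCMOV. The function names scalar_select and bitwise_select are hypothetical and do not come from the patch, and the 32-bit width is chosen only for illustration; the real instruction operates on vector registers.

```cpp
#include <cstdint>

// Hypothetical helper: CMOV-style select. A single condition picks one
// entire value or the other.
constexpr std::uint32_t scalar_select(bool c, std::uint32_t a,
                                      std::uint32_t b) {
  return c ? a : b;
}

// Hypothetical helper: VPCMOV-style select. Each bit of the mask c
// independently picks the matching bit of a (mask bit 1) or b (mask bit 0).
constexpr std::uint32_t bitwise_select(std::uint32_t c, std::uint32_t a,
                                       std::uint32_t b) {
  return (c & a) | (~c & b);
}

// An all-ones or all-zeros mask behaves like the scalar select.
static_assert(bitwise_select(0xFFFFFFFFu, 0x12345678u, 0x9ABCDEF0u) ==
                  scalar_select(true, 0x12345678u, 0x9ABCDEF0u),
              "all-ones mask picks a");
static_assert(bitwise_select(0x00000000u, 0x12345678u, 0x9ABCDEF0u) ==
                  scalar_select(false, 0x12345678u, 0x9ABCDEF0u),
              "all-zeros mask picks b");

// A mixed mask yields a value that is neither a nor b, which no scalar
// conditional move can produce.
static_assert(bitwise_select(0xFFFF0000u, 0x12345678u, 0x9ABCDEF0u) ==
                  0x1234DEF0u,
              "mixed mask blends bits from a and b");
```

The mixed-mask case is the reason the two should not be counted in the same bucket: the result depends on every bit of the mask rather than on one flag.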

212

Why is there a CMP in here?

lebedev.ri updated this revision to Diff 218991. Sep 5 2019, 3:02 PM
lebedev.ri marked 3 inline comments as done.

Count vector blend separately, drop unrelated instructions.