This is not meant as an actual code review to get this patch submitted, but as a basis for further discussion.
It is meant to experiment around a solution for
With this patch, LLVM generates pretty nice code for the example from the bug report (see below). Obviously, it is far from complete or correct.
        .section __TEXT,__text,regular,pure_instructions
        .macosx_version_min 10, 10
        .globl __Z1fPhP1A
        .align 4, 0x90
__Z1fPhP1A: ## @_Z1fPhP1A
## BB#0: ## %entry
        pushq %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset %rbp, -16
        movq %rsp, %rbp
        .cfi_def_cfa_register %rbp
        pushq %r15
        pushq %r14
        pushq %rbx
        pushq %rax
        .cfi_offset %rbx, -40
        .cfi_offset %r14, -32
        .cfi_offset %r15, -24
        movq %rsi, %r14
        movq %rdi, %rbx
        incq %rbx
        leaq LJTI0_0(%rip), %r15
        jmp LBB0_1
        .align 4, 0x90
LBB0_6: ## %for.cond.backedge
        ## in Loop: Header=BB0_1 Depth=1
        incq %rbx
LBB0_1: ## %for.cond
        ## =>This Inner Loop Header: Depth=1
        movzbl (%rbx), %eax
        cmpq $3, %rax
        ja LBB0_6
## BB#2: ## %for.cond
        ## in Loop: Header=BB0_1 Depth=1
        movslq (%r15,%rax,4), %rax
        addq %r15, %rax
        jmpq *%rax
LBB0_3: ## %if.then
        ## in Loop: Header=BB0_1 Depth=1
        movq %r14, %rdi
        jmp LBB0_5
LBB0_4: ## %if.then4
        ## in Loop: Header=BB0_1 Depth=1
        leaq 4(%r14), %rdi
        jmp LBB0_5
LBB0_7: ## %if.then8
        ## in Loop: Header=BB0_1 Depth=1
        leaq 8(%r14), %rdi
        jmp LBB0_5
LBB0_8: ## %if.then12
        ## in Loop: Header=BB0_1 Depth=1
        leaq 12(%r14), %rdi
LBB0_5: ## %for.cond.backedge
        ## in Loop: Header=BB0_1 Depth=1
        callq __Z6assignPj
        jmp LBB0_6
        .align 2, 0x90
L0_0_set_3 = LBB0_3-LJTI0_0
L0_0_set_4 = LBB0_4-LJTI0_0
L0_0_set_7 = LBB0_7-LJTI0_0
L0_0_set_8 = LBB0_8-LJTI0_0
        .long L0_0_set_3
        .long L0_0_set_4
        .long L0_0_set_7
        .long L0_0_set_8
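
For readers without the bug report at hand, here is a rough guess at the kind of source that could produce the listing above. It is a sketch reconstructed from the mangled names (`f(unsigned char*, A*)` and `assign(unsigned int*)`) and from the offsets 0, 4, 8 and 12 off `%r14`, not the actual test case. The field names of `A`, the body of `assign`, and the `end` parameter are assumptions; the loop in the assembly never exits, so the bound was added only to make the sketch runnable.

```cpp
// Hypothetical layout: four consecutive 32-bit fields, matching the
// offsets 0, 4, 8 and 12 taken relative to %r14 in the assembly.
struct A {
  unsigned a, b, c, d;
};

// Stand-in for the external callee (__Z6assignPj in the listing).
// The real body is unknown; this one just bumps the target field.
void assign(unsigned *dst) { *dst += 1; }

// Assumed shape of the loop. The original signature has no `end`
// parameter; it is added here so that the loop terminates.
void f(unsigned char *p, unsigned char *end, A *a) {
  // Like the assembly (incq %rbx before the loop), start at p[1].
  for (++p; p != end; ++p) {
    unsigned char op = *p;
    // The four field addresses do not change across iterations, so
    // they are loop-invariant address computations.
    if (op == 0)
      assign(&a->a);
    else if (op == 1)
      assign(&a->b);
    else if (op == 2)
      assign(&a->c);
    else if (op == 3)
      assign(&a->d);
    // Any other value falls through to the next byte (ja LBB0_6).
  }
}
```

In the listing, each `leaq` that forms one of these addresses sits in the block that uses it, and only the base pointer in `%r14` is kept live across the loop.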
A comment here would be welcome.
I suspect you are only interested in the fact that the instruction must not use or define physical registers or have a memory operand (which is part of what this function checks).
Without a comment, at first glance, this is confusing because instructions in the preheader should be loop invariant!
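
To make the request concrete, this is roughly the kind of comment I have in mind, shown on a toy model. The `Operand` and `Instr` types and the name `isSinkCandidate` are made up for illustration and are not LLVM's classes or the function in this patch; the check itself is only my guess at the intent described above.

```cpp
#include <vector>

// Toy stand-ins for illustration only, not LLVM's MachineOperand
// and MachineInstr.
struct Operand {
  bool isPhysReg;  // operand uses or defines a physical register
};

struct Instr {
  std::vector<Operand> operands;
  bool accessesMemory;  // instruction has a memory operand
};

// Instructions in the preheader are expected to be loop-invariant
// already, so this is not an invariance test. It only rejects
// instructions that use or define a physical register or that have
// a memory operand.
bool isSinkCandidate(const Instr &mi) {
  if (mi.accessesMemory)
    return false;
  for (const Operand &op : mi.operands)
    if (op.isPhysReg)
      return false;
  return true;
}
```

With a comment along those lines, the call on a preheader instruction no longer looks like a redundant invariance check.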