This is the first phase of a set of changes that keep common immediates which occur more than once within a single basic block from being folded into their users, in order to avoid unnecessarily large instruction encodings.
A quick run on cpu2k shows just over 1% .text size savings summed over all the objects (around 8% on 252.eon alone). Performance (O2 and Os) is flat, though this is currently enabled only under Os, with the hope of enabling it at O2 later.
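To make the selection criterion concrete, here is a minimal sketch of "immediates that occur more than once within a single basic block". It is not the patch's code: the function name repeated_immediates and the flat array of operands are hypothetical, and the real change operates on LLVM's own representation rather than raw integers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only.
 * imms[0..n) holds the immediate operands seen in ONE basic block.
 * Each value that appears more than once is written to out exactly once,
 * in order of first appearance. Returns the number of values written.
 * out must have room for n / 2 entries. */
size_t repeated_immediates(const uint32_t *imms, size_t n, uint32_t *out) {
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        int seen_before = 0;
        int seen_after = 0;
        for (size_t j = 0; j < i; j++)
            if (imms[j] == imms[i]) seen_before = 1;
        for (size_t j = i + 1; j < n; j++)
            if (imms[j] == imms[i]) seen_after = 1;
        /* Report a value at its first occurrence, and only if it recurs. */
        if (!seen_before && seen_after)
            out[count++] = imms[i];
    }
    return count;
}
```

For a block whose operands are {456, 789, 555, 555}, this reports only 555; a block with a single immediate reports nothing.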
Quick example of how this helps:
  int a, b, c, d, e, f, g, h, i, j;
  int x;

  int foo(void) {
    a = 0x0fffffff;
    b = 0x0fffffff;
    if (e == 0x0fffffff) {
      x = 1;
    }
    f = 456;
    g = 789;
    h = 555;
    i += 555;
    return 0;
  }
BEFORE:
   0: c7 05 00 00 00 00 ff    movl   $0xfffffff,0x0
   7: ff ff 0f
   a: c7 05 00 00 00 00 ff    movl   $0xfffffff,0x0
  11: ff ff 0f
  14: 81 3d 00 00 00 00 ff    cmpl   $0xfffffff,0x0
  1b: ff ff 0f
  1e: 75 0a                   jne    2a <foo+0x2a>
  20: c7 05 00 00 00 00 01    movl   $0x1,0x0
  27: 00 00 00
  2a: c7 05 00 00 00 00 c8    movl   $0x1c8,0x0
  31: 01 00 00
  34: c7 05 00 00 00 00 15    movl   $0x315,0x0
  3b: 03 00 00
  3e: c7 05 00 00 00 00 2b    movl   $0x22b,0x0
  45: 02 00 00
  48: 81 05 00 00 00 00 2b    addl   $0x22b,0x0
  4f: 02 00 00
  52: 31 c0                   xor    %eax,%eax
  54: c3                      ret
AFTER:
   0: b8 ff ff ff 0f          mov    $0xfffffff,%eax
   5: a3 00 00 00 00          mov    %eax,0x0
   a: a3 00 00 00 00          mov    %eax,0x0
   f: 39 05 00 00 00 00       cmp    %eax,0x0
  15: 75 0a                   jne    21 <foo+0x21>
  17: c7 05 00 00 00 00 01    movl   $0x1,0x0
  1e: 00 00 00
  21: c7 05 00 00 00 00 c8    movl   $0x1c8,0x0
  28: 01 00 00
  2b: c7 05 00 00 00 00 15    movl   $0x315,0x0
  32: 03 00 00
  35: b8 2b 02 00 00          mov    $0x22b,%eax
  3a: a3 00 00 00 00          mov    %eax,0x0
  3f: 01 05 00 00 00 00       add    %eax,0x0
  45: 31 c0                   xor    %eax,%eax
  47: c3                      ret
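The savings in the two listings above can be tallied with a little arithmetic. The sketch below uses the encoded sizes read directly off the byte columns (32-bit x86, absolute addresses, value held in %eax); the names bytes_saved and the SIZE_* constants are mine, and the sizes are specific to this example rather than a general cost model.

```c
/* Encoded sizes in bytes, taken from the BEFORE/AFTER listings. */
enum {
    SIZE_MEM_IMM32   = 10, /* movl/cmpl/addl $imm32, mem (c7 05 / 81 3d / 81 05) */
    SIZE_MOV_IMM_EAX = 5,  /* mov $imm32, %eax           (b8)                    */
    SIZE_STORE_EAX   = 5,  /* mov %eax, mem              (a3)                    */
    SIZE_ALU_EAX_MEM = 6   /* cmp/add %eax, mem          (39 05 / 01 05)         */
};

/* Bytes saved by materializing one immediate in %eax once, given how many
 * of its users in the block are plain stores and how many are ALU ops
 * (cmp/add) on memory. Hypothetical helper, for illustration only. */
int bytes_saved(int stores, int alu_ops) {
    int before = (stores + alu_ops) * SIZE_MEM_IMM32;
    int after = SIZE_MOV_IMM_EAX
              + stores * SIZE_STORE_EAX
              + alu_ops * SIZE_ALU_EAX_MEM;
    return before - after;
}
```

The entry block has two stores and one compare of 0x0fffffff, so bytes_saved(2, 1) gives 9. The final block has one store and one add of 555, so bytes_saved(1, 1) gives 4. Together that is 13 bytes, which matches the function shrinking from 0x55 to 0x48 bytes.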
This is only the first phase. Later follow-up phases will:
- Extend the kinds of user instructions handled beyond stores and binary ops.
- Enable at O2.
- Move beyond single BB to operate globally in a function.
These other phases will require a little additional work to make them safe.
Included is a request to delete CodeGen/X86/remat-invalid-liveness.ll.
Justification: This is a large test that failed at some point due to a register being incorrectly clobbered. It was trimmed down a little to allow for a regression test. The CHECK conditions loosely resembled the original failing condition, and now look like this:
  ; CHECK-LABEL: __XXX1:
  ; CHECK: movl $3, %ecx
  ; CHECK-NOT: subb %{{[a-z]+}}, %ch
Note that (1) there are a few instances of "movl $3, %ecx", and (2) there's a lot of code between the different movs and the subb.
Based on those two conditions, it's extremely hard (without some form of data flow analysis) to determine that the "subb" is, indeed, bad. For example, with my changes, I get:
  movl $3, %ecx
  movb $64, %ch
  …
  testb %cl, %al
  …
  subb %al, %ch
...which is reasonable code, yet it fails the CHECK-NOT.
I can't think of any reasonable way to fix this test to catch the intended issue.
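To illustrate why the pattern is too blunt, here is a toy model of those two checks. It is a deliberate simplification and not how FileCheck is implemented: it assumes the input lines are just the body of the labelled function, matches whole lines in order, and hard-codes the two patterns. The names toy_check and matches_forbidden_subb are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if line contains "subb %<lowercase letters>, %ch",
 * i.e. the CHECK-NOT pattern with {{[a-z]+}} expanded by hand. */
static int matches_forbidden_subb(const char *line) {
    const char *p = line;
    while ((p = strstr(p, "subb %")) != NULL) {
        const char *start = p + strlen("subb %");
        const char *q = start;
        while (islower((unsigned char)*q))
            q++;
        if (q > start && strncmp(q, ", %ch", 5) == 0)
            return 1;
        p++;
    }
    return 0;
}

/* Toy model: 1 means the test passes, 0 means it fails.
 * The CHECK line anchors at the FIRST "movl $3, %ecx"; the CHECK-NOT
 * then rejects ANY matching subb on a later line, with no notion of
 * whether %ch was redefined in between. */
int toy_check(const char *const *lines, size_t n) {
    size_t i = 0;
    while (i < n && strstr(lines[i], "movl $3, %ecx") == NULL)
        i++;
    if (i == n)
        return 0; /* the CHECK line never matched */
    for (size_t j = i + 1; j < n; j++)
        if (matches_forbidden_subb(lines[j]))
            return 0;
    return 1;
}
```

Fed the four instructions from my output above, toy_check returns 0 even though %ch is rewritten by the movb before the subb reads it. The pattern has no way to express that distinction.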
Thanks,
Zia.