Fix PR23384.
The patch does the following:
- Add the number of instructions generated by a solution to the LSR cost
- Move the LSR cost comparison into the target-specific part
- Add new cross-use generation for ICmpZero that ends with zero
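To illustrate the first two points, here is a minimal sketch of how a target-side comparison could rank solutions once an instruction count is part of the cost. The struct, its field names, and both comparison functions are hypothetical simplifications for illustration, not the code in the patch; the register-first ordering is shown only as an assumed point of contrast.

```cpp
#include <tuple>

// Hypothetical, simplified cost record (names are illustrative only).
struct SketchCost {
  unsigned Insns;     // instructions the solution would generate
  unsigned NumRegs;   // registers the solution would need
  unsigned SetupCost; // one-time cost outside the loop
};

// Assumed register-first ordering: fewer registers always wins,
// instruction count only breaks ties.
inline bool lessRegsFirst(const SketchCost &A, const SketchCost &B) {
  return std::tie(A.NumRegs, A.Insns, A.SetupCost) <
         std::tie(B.NumRegs, B.Insns, B.SetupCost);
}

// Assumed instruction-first ordering, as a target might choose:
// fewer instructions wins, register count only breaks ties.
inline bool lessInsnsFirst(const SketchCost &A, const SketchCost &B) {
  return std::tie(A.Insns, A.NumRegs, A.SetupCost) <
         std::tie(B.Insns, B.NumRegs, B.SetupCost);
}
```

With a solution of 3 instructions and 5 registers against one of 4 instructions and 4 registers, the two orderings pick different winners, which is why the comparison is worth leaving to the target.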
One LIT test fails; it should be fixed once D26367 is committed.
Performance improvement on x86 (spec2000):
- 177.mesa on -O2: +3%
- 256.bzip2 on -Ofast -flto: +1.5%
Do you have data to support that heuristic?
Like Wei said, I suspect this may lead to pretty bad side effects where we increase the register pressure by a lot to save a few instructions.
So before we switch the default, I want supportive evidence that this is general goodness.
For the record, I discussed this cost model issue with Wei, and I still have a better register pressure estimation for the loop on my to-do list. E.g., we can ignore NumRegs as long as it is below the register pressure limit.
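A rough sketch of what I mean by ignoring NumRegs below the limit. Everything here is hypothetical: the struct, the function names, and the RegLimit parameter are placeholders for whatever the real register pressure estimate would provide, and this is not a proposal for the exact comparison.

```cpp
#include <tuple>

// Hypothetical cost record (names are illustrative only).
struct PressureAwareCost {
  unsigned Insns;   // instructions the solution would generate
  unsigned NumRegs; // registers the solution would need
};

// Registers only count once they exceed the assumed limit.
inline unsigned excessRegs(unsigned NumRegs, unsigned RegLimit) {
  return NumRegs > RegLimit ? NumRegs - RegLimit : 0;
}

// Compare excess registers first, then instruction count, and use the
// raw register count only as a final tie-breaker.
inline bool lessWithPressureLimit(const PressureAwareCost &A,
                                  const PressureAwareCost &B,
                                  unsigned RegLimit) {
  unsigned ExcessA = excessRegs(A.NumRegs, RegLimit);
  unsigned ExcessB = excessRegs(B.NumRegs, RegLimit);
  return std::tie(ExcessA, A.Insns, A.NumRegs) <
         std::tie(ExcessB, B.Insns, B.NumRegs);
}
```

Under this scheme a solution that uses more registers but fewer instructions wins while both fit under the limit, and loses as soon as it is the only one that would exceed it.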