This is a follow-up from the discussion in D12965. The block-at-a-time limitation of SelectionDAG also came up in D13297.
Without the InstCombine change from D12965, I don't expect this patch to make any difference in the real world because InstCombine will already be widening cases like this in visitSwitchInst(). But we need to have this CGP safety harness in place before proceeding with any shrinkage in D12965, so we won't generate extra extends for compares.
There are regression tests for CGP in both test/Transforms/CodeGenPrepare and test/CodeGen/*. I opted for IR regression tests in the patch because that seems like a clearer way to test the transform, but the PowerPC codegen for the i16 widening test is shown below. x86 will need more work; see https://llvm.org/bugs/show_bug.cgi?id=22473
Before:
  # BB#0:
          mr 4, 3
          extsh. 3, 4
          ble 0, .LBB0_5
  # BB#1:
          cmpwi 3, 99
          bgt 0, .LBB0_9
  # BB#2:
          rlwinm 4, 4, 0, 16, 31      <--- 32-bit mask/extend
          li 3, 0
          cmplwi 4, 1
          beqlr 0
  # BB#3:
          cmplwi 4, 10
          bne 0, .LBB0_12
  # BB#4:
          li 3, 1
          blr
  .LBB0_5:
          rlwinm 3, 4, 0, 16, 31      <--- 32-bit mask/extend
          cmplwi 3, 65436
          beq 0, .LBB0_13
  # BB#6:
          cmplwi 3, 65526
          beq 0, .LBB0_15
  # BB#7:
          cmplwi 3, 65535
          bne 0, .LBB0_12
  # BB#8:
          li 3, 4
          blr
  .LBB0_9:
          rlwinm 3, 4, 0, 16, 31      <--- 32-bit mask/extend
          cmplwi 3, 100
          beq 0, .LBB0_14
          ...
After:
  # BB#0:
          rlwinm 4, 3, 0, 16, 31      <--- mask/extend to 32-bit and then use that for comparisons
          cmpwi 4, 999
          ble 0, .LBB0_5
  # BB#1:
          lis 3, 0
          ori 3, 3, 65525
          cmpw 4, 3
          bgt 0, .LBB0_9
  # BB#2:
          cmplwi 4, 1000
          beq 0, .LBB0_14
  # BB#3:
          cmplwi 4, 65436
          bne 0, .LBB0_13
  # BB#4:
          li 3, 6
          blr
  .LBB0_5:
          li 3, 0
          cmplwi 4, 1
          beqlr 0
  # BB#6:
          cmplwi 4, 10
          beq 0, .LBB0_12
  # BB#7:
          cmplwi 4, 100
          bne 0, .LBB0_13
  # BB#8:
          li 3, 2
          blr
  .LBB0_9:
          cmplwi 4, 65526
          beq 0, .LBB0_15
  # BB#10:
          cmplwi 4, 65535
          bne 0, .LBB0_13
          ...
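For readers who prefer a source-level picture, here is a rough C++ analogy of what the widening does. It is only a sketch, not the patch itself: the function names and return values are hypothetical, the case constants are simply the ones visible in the PowerPC listing, and the choice of zero-extension is an assumption based on the rlwinm mask above. C++ integer promotion already widens a switch operand, so the real transform only makes sense at the IR level; the point here is just that extending once up front and comparing in the wider type selects the same case for every 16-bit input.

```cpp
#include <cstdint>

// Hypothetical stand-in for the i16 test. Case values are taken from the
// PowerPC listing; the return values are illustrative only.
int narrow_switch(std::uint16_t x) {
  switch (x) {
  case 1:     return 0;
  case 10:    return 1;
  case 100:   return 2;
  case 1000:  return 3;
  case 65436: return 4;
  case 65526: return 5;
  case 65535: return 6;
  default:    return -1;
  }
}

// Same switch after "widening": the 16-bit value is zero-extended once,
// and every case comparison then uses the 32-bit value.
int widened_switch(std::uint16_t x) {
  const std::uint32_t wide = x; // single extend, reused by all compares
  switch (wide) {
  case 1u:     return 0;
  case 10u:    return 1;
  case 100u:   return 2;
  case 1000u:  return 3;
  case 65436u: return 4;
  case 65526u: return 5;
  case 65535u: return 6;
  default:     return -1;
  }
}

// Exhaustively checks that both forms agree on all 65536 inputs.
bool agree_on_all_inputs() {
  for (std::uint32_t v = 0; v <= 0xFFFFu; ++v) {
    const auto x = static_cast<std::uint16_t>(v);
    if (narrow_switch(x) != widened_switch(x))
      return false;
  }
  return true;
}
```

Calling agree_on_all_inputs() from a test driver is enough to confirm the two forms are interchangeable for this case set.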