This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Custom lower SHL_PARTS, SRA_PARTS, SRL_PARTS
ClosedPublic

Authored by luismarques on Mar 17 2019, 1:11 PM.

Details

Summary

When not optimizing for minimum size (-Oz) we custom lower wide shifts
(SHL_PARTS, SRA_PARTS, SRL_PARTS) instead of expanding to a libcall. Although
conceptually independent from it, this patch builds upon the D59355 optimization
of SELECT sequences to obtain the optimized code exhibited in the updated tests.

Diff Detail

Repository
rL LLVM

Event Timeline

luismarques created this revision.Mar 17 2019, 1:11 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 17 2019, 1:11 PM
asb requested changes to this revision.Mar 17 2019, 10:37 PM

Thanks for this Luis. This needs coverage for the optForMinSize checks. I'd recommend extending shifts.ll with minsize versions of the tests, e.g.

define i64 @lshr64_minsize(i64 %a, i64 %b) minsize nounwind {                             
; RV32I-LABEL: lshr64_minsize:                                                            
; RV32I:       # %bb.0:                                                                   
; RV32I-NEXT:    addi sp, sp, -16                                                         
; RV32I-NEXT:    sw ra, 12(sp)                                                            
; RV32I-NEXT:    call __lshrdi3                                                           
; RV32I-NEXT:    lw ra, 12(sp)                                                            
; RV32I-NEXT:    addi sp, sp, 16                                                          
; RV32I-NEXT:    ret                                                                      
  %1 = lshr i64 %a, %b                                                                    
  ret i64 %1                                                                              
}
This revision now requires changes to proceed.Mar 17 2019, 10:37 PM

Updates the patch to:

  • Add a test covering the optForMinSize option
  • Add a test that shifts a 128-bit value
  • Add checks for RV64 (relevant for the 128-bit test)
  • Fix the materialization of the -XLEN constant in RV64
asb added a comment.Apr 1 2019, 4:57 AM

I think it would be worth rounding out the tests a bit, especially as it should just require a trivial amount of copy and paste and editing to do so:

  • ashr64_minsize
  • shl64_minsize
  • lshr128
  • ashr128
lewis-revill added inline comments.
lib/Target/RISCV/RISCVISelLowering.cpp
557 ↗(On Diff #191114)

I'm sure there's an off-by-one error in my reasoning here, but aren't we already safe to assume that XLEN-Shamt can fit in the bits available for srl? Since we checked for if Shamt-XLEN < 0, the only case where XLEN-Shamt couldn't fit would be where XLEN-Shamt == XLEN IE when Shamt-XLEN == 0, which is handled by the basic case below.

I'm assuming this was the reasoning for the XLEN-1 stuff and the (Lo >>u 1) here, if not can someone explain it to me?

560 ↗(On Diff #191114)

Nitpick: shouldn't this read Hi = Lo << (Shamt-XLEN)

lewis-revill added inline comments.Apr 3 2019, 8:14 AM
lib/Target/RISCV/RISCVISelLowering.cpp
557 ↗(On Diff #191114)

I see at least part of my error. It should read 'IE when Shamt == 0`. Either way I'm not seeing how it should come up?

lewis-revill added inline comments.Apr 3 2019, 8:24 AM
lib/Target/RISCV/RISCVISelLowering.cpp
557 ↗(On Diff #191114)

Okay sorry for the noise, I see why this is an issue now. Shamt == 0 is indeed handled by this case. Could we have some comment to that effect in this function? And likewise for lowerShiftRightParts?

luismarques added inline comments.Apr 5 2019, 6:12 AM
lib/Target/RISCV/RISCVISelLowering.cpp
557 ↗(On Diff #191114)

I think the comment "// Shamt < XLEN" documents reasonably well that this branch also covers Shamt == 0, since "0 < XLEN". If I misunderstood your comment please clarify.

560 ↗(On Diff #191114)

The (Shamt-XLEN) should have been in the "else" branch, not the "then" branch. These kind of mistakes were easy to make because the assembly has the "basic blocks" in the reverse order (I can't fix that in the pseudo-code without inverting the condition, which would be even more confusing). I had caught most of them, but that one slipped through. Good catch! I'll fix that together with the remaining issues.

  • Fixed a bug in the pseudo-code for lowerShiftLeftParts;
  • Added more tests, which now cover all the 64-bit shift variants, with and without the minsize attribute, plus all of the 128-bit shift variants.
luismarques marked 6 inline comments as done.Apr 5 2019, 7:37 AM
asb accepted this revision.Apr 11 2019, 3:53 AM

This looks good to me. The GCC torture suite is happy with this code too, which serves as an extra (but non-exhaustive) sanity check. It's not immediately obvious to me that checking for OptForMinSize is better than OptForSize, but the precise criteria for when something should be done at Oz vs Os don't seem to be well specified and checking OptForMinSize is consistent with other backends, so the right choice for now.

This revision is now accepted and ready to land.Apr 11 2019, 3:53 AM
This revision was automatically updated to reflect the committed changes.