Page MenuHomePhabricator

[RISCV] Introduce codegen patterns for instructions introduced in RV64I

Authored by asb on Oct 8 2018, 3:10 AM.



As discussed in the RFC, 64-bit RISC-V has i64 as the only legal integer type. This patch introduces patterns to support codegen of the new instructions introduced in RV64I: addiw, addiw, subw, sllw, slliw, srlw, srliw, sraw, sraiw, ld, sd.

Custom selection code is needed for srliw as SimplifyDemandedBits will remove lower bits from the mask, meaning the obvious pattern won't work:

def : Pat<(sext_inreg (srl (and GPR:$rs1, 0xffffffff), uimm5:$shamt), i32),
          (SRLIW GPR:$rs1, uimm5:$shamt)>;

This is sufficient to compile and execute all of the GCC torture suite for RV64I other than those files using frameaddr or returnaddr intrinsics (LegalizeDAG doesn't know how to promote the operands - a future patch addresses this).

When promoting i32 sltu/sltiu operands, it would be more efficient to use sign-extension rather than zero-extension for RV64. A future patch adds a hook to allow this.

Diff Detail


Event Timeline

asb created this revision.Oct 8 2018, 3:10 AM
asb added a comment.EditedOct 8 2018, 7:56 AM

There's one aspect I'd appreciate some input on. Full coverage of sext/zext/aext inputs and sext/aext outputs would require a lot of repetitive tests. But these tests would have value. e.g. to generate srlw for the following two functions:

define i32 @srlw(i32 signext %a, i32 zeroext %b) {                                                             
  %1 = lshr i32 %a, %b                                              
  ret i32 %1                                                        
define signext i32 @srlw_sext(i32 signext %a, i32 zeroext %b) {                                                   
  %1 = lshr i32 %a, %b                                              
  ret i32 %1                                                        

It's necessary to define two patterns:

def : Pat<(srl (zexti32 GPR:$rs1), GPR:$rs2),
          (SRLW GPR:$rs1, GPR:$rs2)>;
def : Pat<(sext_inreg (srl (zexti32 GPR:$rs1), GPR:$rs2), i32),
          (SRLW GPR:$rs1, GPR:$rs2)>;

zexti32 is defined to help two possible forms of inputs:

def zexti32 : PatFrags<(ops node:$src),
                       [(and node:$src, 0xffffffff), (assertzext node:$src)],
  if (N->getOpcode() == ISD::AssertZext) {
    return cast<VTSDNode>(N->getOperand(1))->getVT() == MVT::i32;
  return N->getOpcode() == ISD::AND;

Does anyone have any views on the testing approach for the patterns for *W instructions? It would obviously be easy to programmatically generate all {aext,sext} return + {aext,sext,zext},{aext,sext,zext} operand variants, but this seems excessive. On the other hand, my experience has been that it is easy to unexpectedly find that the *W instruction you wanted isn't being selected.

psnobl added a subscriber: psnobl.Oct 9 2018, 1:31 AM
sabuasal edited reviewers, added: sabuasal; removed: sameer.abuasal.Oct 10 2018, 2:50 PM
sabuasal added a comment.EditedOct 10 2018, 3:54 PM

Do you need to update test/CodeGen/RISCV/compress.ll to add an 64 bit run line?

It'd be great if you can chain dependencies between your in-flight patches.

asb updated this revision to Diff 169514.Oct 12 2018, 4:28 PM

Update to add zexti32 PatFrag and to eliminate unnecessary masking of shift amount values.

@sabuasal: The majority of the tests in test/CodeGen/RISCV could do with gaining RV64 CHECK lines. I was intending to have this patch just update tests directly related to the new patterns, and have a future patch update other tests (select, branch, etc). But I can incorporate all the test updates in this patch if preferred.

simoncook added a comment.EditedOct 15 2018, 4:04 AM

I tried building this on top of trunk but my build failed with this assertion:

llvm-tblgen: /home/simon/work/rvllvm/llvm/llvm/utils/TableGen/CodeGenDAGPatterns.h:836: const llvm::TreePatternNodePtr &llvm::TreePattern::getOnlyTree() const: Assertion `Trees.size() == 1 && "Doesn't have exactly one pattern!"' failed.

Trying to narrow this down, it looks like the use of PatFrags for zexti32 is causing the issue, if I try with PatFrag and just the first pattern (just picking one at random), my build seems to succeed. I'm not familiar enough with PatFrags to suggest a fix. Edit: Other than I guess manually expanding it and the rules that depend on it to use two separate explicit PatFrags

asb planned changes to this revision.Oct 22 2018, 1:20 AM

I have a number of changes implemented while I was travelling, and will update this patch series.

asb updated this revision to Diff 171075.Oct 25 2018, 6:02 AM

Updated to fix a PatFrags-related assertion and to add exhaustive testing for the *W instruction patterns. Although verbose, these tests have shown their worth by 1) identifying a missing pattern which I've now added and 2) flagging a few cases where codegen could be improved in future.

asb updated this revision to Diff 174171.Nov 15 2018, 2:25 AM

Refreshing patch so it applies without fuzz. No functional changes.

Thanks Alex, LGTM. I did not detect missing predicate around the 64 bit instructions and patterns.

This revision was not accepted when it landed; it landed in state Needs Review.Nov 30 2018, 1:42 AM
This revision was automatically updated to reflect the committed changes.
efriedma added inline comments.

These shift-right patterns are sort of suspicious. In particular, what happens if the shift amount is >= 32? For example, consider the following function:

long long x(long long y, int s) {
  return ((long long)(int)y)>>s;

x(1, 32) should return 0.

jrtc27 added inline comments.Nov 30 2018, 2:51 PM

Agreed, this currently produces a single sraw a0, a0, a1 for the body of the function, but SRAW only looks at rs2[4:0] ie the shift is modulo 32. The older versions of this patch set actually got this right, producing the correct sext.w a0, a0 ; sra a0, a0, a1 (which is what GCC also does).

asb marked an inline comment as done.Nov 30 2018, 9:19 PM
asb added inline comments.

Thank you Eli. The pattern incorrectly assumes that (sext_inreg foo, i32) is only produced when the original IR type was i32. There are similar issues with the SLLW and SRLW. I've committed rL348067 which removes these for now and adds new test cases which will pick up these issues.

I think this is a case where we suffer for having i64 as the only legal type. I want to be able to exploit the fact that ashr i32 %a, %b is UB where b >= 32, selecting srlw for this case. But that information seems to have been lost by the time we get to the SelectionDAG. I'll investigate how to handle this properly.

asb marked an inline comment as done.Nov 30 2018, 9:22 PM
asb added inline comments.

Typo: I mean selecting sraw for that example.