This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Enable signed truncation check transforms for i8
Closed · Public

Authored by dtcxzyw on May 9 2023, 12:15 AM.

Details

Summary

This patch enables the signed truncation check transforms for i8 on rv32 when the source value type (XVT) is i64 and Zbb is enabled.

It is a small improvement on D149977.
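For readers unfamiliar with the term, a signed truncation check asks whether a wider value survives truncation to a narrower signed type. The sketch below models that check for i8 in Python. The helper names `sext_b` and `fits_in_i8` are made up for illustration, and the code only shows the semantics of the check; it does not claim to reproduce the instruction sequence this patch generates.

```python
def sext_b(x: int) -> int:
    """Sign-extend the low 8 bits of x (the operation Zbb's sext.b performs)."""
    low = x & 0xFF
    return low - 0x100 if low & 0x80 else low


def fits_in_i8(x: int) -> bool:
    """Signed truncation check: truncating x to i8 and sign-extending gives x back."""
    return sext_b(x) == x


# The check is true exactly for values in the signed 8-bit range.
for x in range(-1000, 1000):
    assert fits_in_i8(x) == (-128 <= x <= 127)
```

Here `x` is treated as a signed integer of any width, so the same model covers the i64 source type discussed in the summary.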

Event Timeline

dtcxzyw created this revision. May 9 2023, 12:15 AM
dtcxzyw requested review of this revision. May 9 2023, 12:15 AM
reames added inline comments. May 9 2023, 7:49 AM
llvm/test/CodeGen/RISCV/lack-of-signed-truncation-check.ll
401

These two instructions are an odd variant of the add-carry from below. This looks like something got turned into a select whereas the add-with-carry + compare-two-halves should have worked here.

llvm/test/CodeGen/RISCV/signed-truncation-check.ll
651

This code sequence actually looks pretty good. It's the straightforward "add-carry-and-compare" scheme.

Have you looked at why the inverted form of this check doesn't get canonicalized towards this form? If it did, we might get better codegen overall.
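As a rough illustration of what an add-carry-and-compare scheme looks like when an i64 value is held in two 32-bit registers, the sketch below evaluates `(x + 128) u< 256` on the halves. This is an assumption about the general shape of the sequence, not a transcription of the test output at the referenced line; the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def fits_i8_on_halves(lo: int, hi: int) -> bool:
    """Evaluate (x + 128) u< 256 for a 64-bit x given as 32-bit lo/hi words."""
    low_sum = (lo + 128) & MASK32       # add on the low word
    carry = 1 if low_sum < lo else 0    # unsigned wrap of the low add (an sltu)
    high_sum = (hi + carry) & MASK32    # propagate the carry into the high word
    # The 64-bit sum is below 256 only if the high word is zero
    # and the low word is below 256.
    return high_sum == 0 and low_sum < 256


def fits_i8_reference(x: int) -> bool:
    """The same check done directly on a 64-bit bit pattern."""
    return ((x + 128) & MASK64) < 256


for x in (0, 127, 128, MASK64, MASK64 - 127, MASK64 - 128, 1 << 32):
    assert fits_i8_on_halves(x & MASK32, x >> 32) == fits_i8_reference(x)
```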

craig.topper added inline comments. May 9 2023, 11:43 AM
llvm/test/CodeGen/RISCV/signed-truncation-check.ll
468

This IR is non-canonical. We shouldn't get a uge with constant from InstCombine. I'll add more tests.

craig.topper added inline comments. May 9 2023, 3:30 PM
llvm/test/CodeGen/RISCV/signed-truncation-check.ll
468

Never mind. I think the add_ultcmp_i64_i8 later in the file is the canonical form for this case.
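The two forms being compared can be checked against each other with a short model. The constants below are an assumption: the uge form is taken to be `(x + -128) u>= -256` and the ult form `(x + 128) u< 256`, both on 64-bit values. The function names only echo the test names mentioned in this thread and are not copied from the test file.

```python
MASK64 = (1 << 64) - 1


def add_ugecmp(x: int) -> bool:
    """Assumed non-canonical form: (x + -128) u>= -256 on 64 bits."""
    return ((x - 128) & MASK64) >= ((-256) & MASK64)


def add_ultcmp(x: int) -> bool:
    """Assumed canonical form: (x + 128) u< 256 on 64 bits."""
    return ((x + 128) & MASK64) < 256


# Both forms accept exactly the signed i8 range.
for x in range(-1000, 1000):
    assert add_ugecmp(x) == add_ultcmp(x) == (-128 <= x <= 127)
```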

craig.topper accepted this revision. May 9 2023, 7:55 PM

The Zbb i8 part of this looks good to me. As @reames noted, there are opportunities to improve the code without Zbb, but that shouldn't block this patch.

This revision is now accepted and ready to land. May 9 2023, 7:55 PM
dtcxzyw added inline comments. May 10 2023, 7:20 AM
llvm/test/CodeGen/RISCV/lack-of-signed-truncation-check.ll
401

After legalization, a RISCVISD::SELECT_CC node was created. It was then lowered to Select_GPR_Using_CC_GPR at the end of ISel.

Legalized selection DAG: %bb.0
SelectionDAG has 21 nodes:	
  t0: ch,glue = EntryToken	
  t2: i32,ch = CopyFromReg t0, Register:i32 %0	
  t20: i32 = add t2, Constant:i32<-128>	
      t4: i32,ch = CopyFromReg t0, Register:i32 %1	
      t22: i32 = setcc t20, t2, setult:ch	
    t33: i32 = add t4, t22	
  t34: i32 = add t33, Constant:i32<-1>	
      t27: i32 = setcc t20, Constant:i32<-256>, setult:ch	
      t29: i32 = setcc t34, Constant:i32<-1>, setne:ch	
    t35: i32 = RISCVISD::SELECT_CC t34, Constant:i32<-1>, seteq:ch, t27, t29	
  t13: ch,glue = CopyToReg t0, Register:i32 $x10, t35	
  t14: ch = RISCVISD::RET_GLUE t13, Register:i32 $x10, t13:1
===== Instruction selection ends:	
Selected selection DAG: %bb.0
SelectionDAG has 21 nodes:	
  t0: ch,glue = EntryToken	
  t2: i32,ch = CopyFromReg t0, Register:i32 %0	
  t20: i32 = ADDI t2, TargetConstant:i32<-128>	
      t4: i32,ch = CopyFromReg t0, Register:i32 %1	
      t22: i32 = SLTU t20, t2	
    t33: i32 = ADD t4, t22	
  t34: i32 = ADDI t33, TargetConstant:i32<-1>	
      t41: i32 = ADDI Register:i32 $x0, TargetConstant:i32<-1>	
      t27: i32 = SLTIU t20, TargetConstant:i32<-256>	
      t29: i32 = SLTIU t34, TargetConstant:i32<-1>	
    t35: i32 = Select_GPR_Using_CC_GPR t34, t41, TargetConstant:i32<0>, t27, t29	
  t13: ch,glue = CopyToReg t0, Register:i32 $x10, t35	
  t14: ch = PseudoRET Register:i32 $x10, t13, t13:1

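In this case, we can fold (riscvisd::select_cc lhs, rhs, cc, truev, (setcc lhs, rhs, inv cc)) into (riscvisd::select_cc lhs, rhs, cc, truev, 1), because the false operand is only selected when cc does not hold, where the inverted setcc is known to be 1. That can eventually be folded into (or (setcc lhs, rhs, inv cc), truev).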
I will try to improve the code without Zbb in follow-up patches.
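The proposed fold can be sanity-checked with a small model of the legalized DAG above. The sketch below recomputes t20, t22, t34, t27 and t29 on 32-bit values and compares the SELECT_CC result with the or form. The Python function names are hypothetical, and the model relies on t27 being a 0/1 value, as it is in this DAG.

```python
MASK32 = 0xFFFFFFFF


def dag_values(lo: int, hi: int):
    """Recompute t27, t29 and t34 from the legalized DAG for 32-bit halves."""
    t20 = (lo - 128) & MASK32            # add t2, -128
    t22 = 1 if t20 < lo else 0           # setcc t20, t2, setult
    t34 = (hi + t22 - 1) & MASK32        # add (add t4, t22), -1
    t27 = 1 if t20 < 0xFFFFFF00 else 0   # setcc t20, -256, setult
    t29 = 1 if t34 != MASK32 else 0      # setcc t34, -1, setne
    return t27, t29, t34


def with_select(lo: int, hi: int) -> int:
    """RISCVISD::SELECT_CC t34, -1, seteq, t27, t29."""
    t27, t29, t34 = dag_values(lo, hi)
    return t27 if t34 == MASK32 else t29


def with_or(lo: int, hi: int) -> int:
    """Folded form: or (setcc t34, -1, setne), t27."""
    t27, t29, _ = dag_values(lo, hi)
    return t29 | t27


for hi in (0, 1, 0x80000000, 0xFFFFFFFF):
    for lo in (0, 127, 128, 255, 256, 0xFFFFFF80, 0xFFFFFFFF):
        assert with_select(lo, hi) == with_or(lo, hi)
```

When t34 equals -1, t29 is 0 and the or yields t27; otherwise t29 is 1 and the or yields 1, which matches what the select returns in each case.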

This revision was landed with ongoing or failed builds. May 10 2023, 7:54 AM
This revision was automatically updated to reflect the committed changes.
dtcxzyw marked 2 inline comments as done. May 10 2023, 11:36 AM
dtcxzyw added inline comments.
llvm/test/CodeGen/RISCV/lack-of-signed-truncation-check.ll
401

Posted as D150286.