This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Combine and remove redundant ADD/SUB instructions
Needs ReviewPublic

Authored by eklepilkina on Jul 29 2022, 6:03 AM.

Details

Summary

Tail duplication pass can cause appearing of redundant intructions (see attached .ll test). Support the simpliest cases in separate MachineIR pass.

The number of removed instruction from statistics output on LLVM test-suite

Tests: 2025
Metric: riscv-addsub-comb.NumRemovedInstructions

Program                                       riscv-addsub-comb.NumRemovedInstructions
                                              results                                 
ultiSource...nchmarks/tramp3d-v4/tramp3d-v4    16.00                                  
ultiSource/Applications/sqlite3/sqlite3        15.00                                  
ultiSource/Applications/lemon/lemon             7.00                                  
ultiSource...ch/office-ispell/office-ispell     7.00                                  
ultiSource.../DOE-ProxyApps-C++/CLAMR/CLAMR     6.00                                  
ultiSource...chmarks/Prolangs-C/agrep/agrep     6.00                                  
ultiSource/Applications/ClamAV/clamscan         5.00                                  
ultiSource...gs-C/TimberWolfMC/timberwolfmc     2.00                                  
ultiSource...Prolangs-C/assembler/assembler     2.00                                  
ultiSource/Applications/lua/lua                 1.00                                  
ultiSource...Benchmarks/7zip/7zip-benchmark     1.00                                  
ultiSource...omotive-susan/automotive-susan     1.00

Diff Detail

Event Timeline

eklepilkina created this revision.Jul 29 2022, 6:03 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 29 2022, 6:03 AM
eklepilkina requested review of this revision.Jul 29 2022, 6:03 AM
eklepilkina edited the summary of this revision. (Show Details)Jul 29 2022, 6:09 AM
eklepilkina edited the summary of this revision. (Show Details)Jul 29 2022, 6:12 AM
craig.topper added inline comments.Jul 29 2022, 7:27 AM
llvm/test/CodeGen/RISCV/jumptable.ll
237 ↗(On Diff #448604)

This doesn’t look correct. (sext(x) -1) is not the same as sext(x-1).

Consider x==0x000000008000000. Sext(x)-1 is 0xffffffff7fffffff. Sext(x-1) is 0x000000007fffffff.

craig.topper added inline comments.Jul 29 2022, 11:30 AM
llvm/lib/Target/RISCV/CMakeLists.txt
27

This list is supposed to be alphabetical. Looks like MakeCompressible broke that, but we shouldn't make it worse.

llvm/lib/Target/RISCV/RISCVAddSubCombiner.cpp
116

Move this above the earlier if and use BaseRegister in the if

172

Cann't -> Can't

177

Reuse the iterator returned from find above

193

Register is wrapper around unsigned. Pass it by value not reference

196

llvm::find_if

237

It doesn't look like Modified is assigned anywhere else in this function

257

This should be a reference.

264

Drop curly braces

271

Do we have to assume there is only 1 def? Can we check isDef on the operand instead of checking 0?

294

Can this be llvm::is_contained?

306

You don't need to manually construct an empty vector here. Calling operator[] on line 310 will default construct the vector if it doesn't already exist. Then the push_back will be called.

324

RISCVSubtarget::getInstrINfo returns a RISCVInstrInfo *. No need for static_cast

llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
244

Use != like the other places in this file

eklepilkina marked 14 inline comments as done.
  • Upstream review fixes
craig.topper added inline comments.Aug 8 2022, 11:36 AM
llvm/lib/Target/RISCV/RISCVAddSubCombiner.cpp
43

"divided by" -> "keyed by"?

148

Why is the cast to MachineInstr* needed? Isn't there an operator-> on MIB?

169

Remove extra space before RegisterUsageIter

188

Can this use llvm::erase_if?

197

llvm::erase_value?

260

llvm::is_contained?

262

Why do you need to check Operand.isDef() and modifiedRegister? Isn't isDef enough?

llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
245

Should this be done before branch folding since it removes instructions which could reduce jump distance?

eklepilkina marked 7 inline comments as done.Aug 9 2022, 4:39 AM
eklepilkina added inline comments.
llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
245

This impossible, because BranchFolder is set in pipeline before TailDuplicationPass that combines these redundant instructions in one BB.

  • Upstream review fixes(2)
craig.topper added inline comments.Aug 9 2022, 8:46 AM
llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
245

Sorry. I meant to write BranchRelaxation.

  • Replace pass before BranchRelaxation
eklepilkina marked an inline comment as done.Aug 9 2022, 10:41 AM

Oh, yep, I got it.

Need to mention that the number of optimized cases became much less after of rebasing on the new main and escaping combining ADDI and ADDIW on 64-bit platform

Tests: 2025
Metric: riscv-addsub-comb.NumRemovedInstructions

Program                                       riscv-addsub-comb.NumRemovedInstructions
                                              results                                 
ultiSource...nchmarks/tramp3d-v4/tramp3d-v4    16.00                                  
ultiSource/Applications/sqlite3/sqlite3        15.00                                  
ultiSource/Applications/lemon/lemon             7.00                                  
ultiSource...ch/office-ispell/office-ispell     7.00                                  
ultiSource.../DOE-ProxyApps-C++/CLAMR/CLAMR     6.00                                  
ultiSource...chmarks/Prolangs-C/agrep/agrep     6.00                                  
ultiSource/Applications/ClamAV/clamscan         5.00                                  
ultiSource...gs-C/TimberWolfMC/timberwolfmc     2.00                                  
ultiSource...Prolangs-C/assembler/assembler     2.00                                  
ultiSource/Applications/lua/lua                 1.00                                  
ultiSource...Benchmarks/7zip/7zip-benchmark     1.00                                  
ultiSource...omotive-susan/automotive-susan     1.00
eklepilkina edited the summary of this revision. (Show Details)Aug 10 2022, 6:45 AM
craig.topper added inline comments.Aug 10 2022, 4:34 PM
llvm/lib/Target/RISCV/RISCVAddSubCombiner.cpp
189

Something looks weird here. [&Reg](std::pair<Register, bool> Element) { is repeated

  • Cherry-pick fix
eklepilkina marked an inline comment as done.Aug 11 2022, 1:44 AM
eklepilkina added inline comments.
llvm/lib/Target/RISCV/RISCVAddSubCombiner.cpp
189

Sorry, cherry-picked and merged badly.

craig.topper added inline comments.Aug 11 2022, 5:15 PM
llvm/lib/Target/RISCV/RISCVAddSubCombiner.cpp
119

Drop curly braces

130

Can we update the immediate for MBBI instead of creating a new instruction?

169

Extra space here

Can add/sub combine use instcombine? Adding a new pass seems to be expensive

eklepilkina marked an inline comment as done.Aug 11 2022, 11:43 PM

Can add/sub combine use instcombine? Adding a new pass seems to be expensive

I'm afraid that this is impossible because we can't cover all cases on LLVM IR. If you look at llvm/test/CodeGen/RISCV/adds-combinations.ll added in this review, you'll see that these add instructions appear on the late passes. On LLVM IR we can have different cases that can cause generation of such combination. And it seems to me it's hard to predict this on the step of InstCombine. If you have an idea what patterns should be matched in InstCombine, please, share .

eklepilkina marked an inline comment as done.
  • Upstream review fixes
  • Upstream review fixes(2)
  • Replace pass before BranchRelaxation
  • Some more review feedback
reames added a subscriber: reames.Aug 17 2022, 8:31 AM

I'm not convinced as to the motivation of this patch. Running the provided IR file through instcombine greatly simplifies it, and seems to remove the motivating pattern. As such, either this is an overly reduced test (which is fine, but more motivation is needed in review description), or this is fixing a pass ordering problem in the middle end in the wrong place.

Can you explain why this is the right way to fix this problem? At the moment, I'm unconvinced.

I'm not convinced as to the motivation of this patch. Running the provided IR file through instcombine greatly simplifies it, and seems to remove the motivating pattern. As such, either this is an overly reduced test (which is fine, but more motivation is needed in review description), or this is fixing a pass ordering problem in the middle end in the wrong place.

Can you explain why this is the right way to fix this problem? At the moment, I'm unconvinced.

I agree the tests are overly reduced. Based on the description, I think this patch is trying to fix cases where tail duplication duplicates a block that originally contained a phi. This allows duplicated code to end up in the basic block that produced the original incoming value. I've seen this happen where the incoming values are constants. After duplication we end up still putting the constant in a register instead of folding it. I don't think this patch addresses that case though.

I agree the tests are overly reduced. Based on the description, I think this patch is trying to fix cases where tail duplication duplicates a block that originally contained a phi. This allows duplicated code to end up in the basic block that produced the original incoming value. I've seen this happen where the incoming values are constants. After duplication we end up still putting the constant in a register instead of folding it. I don't think this patch addresses that case though.

That sounds plausible, but the patch itself needs to include this justification (including in reasonable tests).

If this is the use case, it really feels like something tail folding itself should be doing. Having tail folding run a forward simplify pass over the duplicated tail would seem reasonable. Not sure we have the right API for that today, but if not, maybe we should.

  • Upstream review fixes
  • Upstream review fixes(2)
  • Replace pass before BranchRelaxation
  • Fix test
  • Cherry-pick fix
  • Some more review feedback

Sorry for delay. I checked using InstCombine, but unfortuantely it doesn't help in all cases. I updated adds-combinations.ll test (please, have a look), even if I call opt with instcombine pass, it doesn't change IR in this case. Moreover, I made experiment where I added InstCombinePass after LoopStrengthReduce and measure performance, I got several regressions on benchmarks.

dewen added a subscriber: dewen.Mar 29 2023, 12:25 AM