This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Macro Fusion for RISC-V
Needs RevisionPublic

Authored by kamleshbhalui on Jan 29 2020, 9:54 AM.

Details

Reviewers
asb
evandro
lenary
Summary

Inspired from these two docs
https://arxiv.org/pdf/1607.02318.pdf
https://riscv.org/wp-content/uploads/2016/07/Tue1130celio-fusion-finalV2.pdf

this patch contains known pairs that can be fused from above two docs.

Diff Detail

Event Timeline

kamleshbhalui created this revision.Jan 29 2020, 9:54 AM
kamleshbhalui retitled this revision from Macro Fusion for RISC-V to [WIP] Macro Fusion for RISC-V.Jan 29 2020, 9:55 AM

Wow, this is really cool! I'm amazed it took so little (target-specific) code.

It would be good to understand the trade-offs of doing this rescheduling when the core doesn't necessarily support fusion. With the current instructions, I don't see many scheduling problems coming from blindly fusing all the time, but maybe this should be a subtarget feature that can be switched on and off in case it creates scheduling issues.

llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
125

Is this the default kind of machine scheduler? Does it necessarily make sense to schedule by liveness when RISC-V has so many registers?

As @lenary said, this must be enabled only for those sub targets that actually fuse these instrs. Also, when using the macro fusion pass, you should also enable the post RA pass. Otherwise, it's less effective.

evandro added inline comments.Jan 29 2020, 11:38 AM
llvm/lib/Target/RISCV/RISCVMacroFusion.cpp
21

Please, expand this comment.

26

Please, expand the acronym.

45

Please, add a comment.

llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
125

Yes and yes. By scheduling by liveness, the result is usually more friendly to both in and out of order µarchitectures.

khchen added a subscriber: khchen.Jan 29 2020, 8:55 PM
kamleshbhalui edited the summary of this revision. (Show Details)

@lenary and @evandro Thanks for reviewing.
Addressed reviewer's concerns and added few fusable pairs.

kamleshbhalui retitled this revision from [WIP] Macro Fusion for RISC-V to Macro Fusion for RISC-V.Jan 31 2020, 8:30 AM

Please, prefix the title with [RISCV].

llvm/lib/Target/RISCV/RISCV.td
71 ↗(On Diff #241740)

Different implementations may implement just some of the possible pairs. So this feature should perhaps be broken down into each category.

72 ↗(On Diff #241740)

"Fuse select instruction pairs"

119 ↗(On Diff #241740)

Does Rocket fuse all these pairs?

llvm/lib/Target/RISCV/RISCVMacroFusion.cpp
103

This is not correct as it increases compile time even when SecondMI is not a candidate for a fuseable pair. MacroFusion::scheduleAdjacentImpl() will call this function to query if SecondMI before traversing its predecessors. So you should go into the functions below and they should handle the case for FirstMI being nullptr for this test.

llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
131

Please, insert a blank line here.

kamleshbhalui retitled this revision from Macro Fusion for RISC-V to [RISCV] Macro Fusion for RISC-V.Jan 31 2020, 7:40 PM

Indexed load and store fusion should not be exclusive to LD and SD. It applies to any (F)Lx/(F)Sx. The literature just talks about LD/ST (note that this is not SD) in the sense of an arbitrary load/store instruction. Similarly for the various other fusion pairs involving memory accesses.

llvm/lib/Target/RISCV/RISCV.td
71 ↗(On Diff #241740)

This is a multi-word feature, so should be hyphenated as macro-fusion in my opinion (although I would prefer macro-op-fusion over that...).

llvm/lib/Target/RISCV/RISCVMacroFusion.cpp
38

This is clearly not a jump.

62

This is not the right name. This is for load immediate idioms.

80

We don't want to be fusing arbitrary loads/stores together, only those we know to be to adjacent memory locations with the same width (and in that order). Moreover it can only be done in decode if it's trivial to determine these conditions hold, ie they use the same opcode and base register, differing only in the immediate by the memory access width.

lenary resigned from this revision.Jan 14 2021, 10:10 AM
rkruppe removed a subscriber: rkruppe.Jan 14 2021, 10:20 AM
evandro requested changes to this revision.Jan 15 2021, 6:34 PM
This revision now requires changes to proceed.Jan 15 2021, 6:34 PM

Is Chipyard already supporting these micro-op-fusion? and how can i use this code to make a riscv file (test.riscv) from a c file?

Hi, I'd like to know if you are still interested in this work? There is a microarchitecture called XiangShan that actually supports instruction fusion.

We are working on support of this processor in LLVM, including macro fusion, and we hope to make our patches upstream. The fusion pairs are listed here (translations needed) and I suppose it's a good opportunity to introduce instruction fusion in the RISCV ecosystem of LLVM.

Herald added a project: Restricted Project. · View Herald TranscriptMar 25 2022, 1:15 AM