This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Optimize xor/or with immediate in the zbs extension
ClosedPublic

Authored by benshi001 on May 20 2021, 6:14 PM.

Diff Detail

Event Timeline

benshi001 created this revision.May 20 2021, 6:14 PM
benshi001 requested review of this revision.May 20 2021, 6:14 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 20 2021, 6:14 PM

Given how messy this is becoming and the amount of duplicated code for what are very similar patterns, does this not just want to be custom lowering? TableGen is starting to feel like the wrong tool for this.

Also, just a side thought: how does all this interact with using the integers for things like load or store offsets? Do we end up not being able to fold an immediate into the load/store as a result of some of these optimisations and thus lose out overall on large-offset addressing when B is enabled?

Given how messy this is becoming and the amount of duplicated code for what are very similar patterns, does this not just want to be custom lowering? TableGen is starting to feel like the wrong tool for this.

Also, just a side thought: how does all this interact with using the integers for things like load or store offsets? Do we end up not being able to fold an immediate into the load/store as a result of some of these optimisations and thus lose out overall on large-offset addressing when B is enabled?

I’m not sure custom lowering is the right tool either. Making OR/XOR Custom will likely have other effects. Maybe custom isel in RISCVISelDAGToDAG?

These are immediates being used only by an OR/XOR so I don’t think it affects loads/stores folding. Am I missing something?

Given how messy this is becoming and the amount of duplicated code for what are very similar patterns, does this not just want to be custom lowering? TableGen is starting to feel like the wrong tool for this.

Also, just a side thought: how does all this interact with using the integers for things like load or store offsets? Do we end up not being able to fold an immediate into the load/store as a result of some of these optimisations and thus lose out overall on large-offset addressing when B is enabled?

I’m not sure custom lowering is the right tool either. Making OR/XOR Custom will likely have other effects. Maybe custom isel in RISCVISelDAGToDAG?

These are immediates being used only by an OR/XOR so I don’t think it affects loads/stores folding. Am I missing something?

How about make the duplicate parts of the two PatLeaf to a function, and then call it in the two PatLeaf? This way would make it more readable and clear.

I do not think ISelDagToDag is good, it even introduces more lines than current form.

benshi001 updated this revision to Diff 346921.May 20 2021, 8:43 PM

I have uploaded a new version, in which the duplicate part is dropped to a function and called in the two ParLeaf.

My comment on the previous patch was about constants like 2051 that can be done with

ORI a0, 3
BSETI a0, 11

I don't think this patch handles that.

llvm/test/CodeGen/RISCV/rv64zbs.ll
1389

This isn't really an improvement.

benshi001 added inline comments.May 20 2021, 9:56 PM
llvm/test/CodeGen/RISCV/rv64zbs.ll
1389

It saves an extra register.

craig.topper added inline comments.May 20 2021, 10:14 PM
llvm/test/CodeGen/RISCV/rv64zbs.ll
1389

With 32 registers, is that an important optimization? Are we also going to implement (BSETI (BSETI (BSETI))) for cases with 3 set bits where none are in the lower 11 bits so we can't use ORI/XORI? It starts getting out of hand.

benshi001 marked 2 inline comments as done.
benshi001 added inline comments.
llvm/test/CodeGen/RISCV/rv64zbs.ll
1389

Thanks. I will just handle the simplest case currently.

benshi001 marked an inline comment as done.May 21 2021, 12:30 AM

I have changed my code to cover the simlest case, with the sign bit of simm12 handled in the same way as upper bits.

This revision is now accepted and ready to land.May 24 2021, 8:28 PM