This is an archive of the discontinued LLVM Phabricator instance.

[LoongArch] Implement OR combination to generate bstrins.w/d
ClosedPublic

Authored by SixWeining on Jul 8 2022, 3:09 AM.

Diff Detail

Event Timeline

SixWeining created this revision.Jul 8 2022, 3:09 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 8 2022, 3:09 AM
SixWeining requested review of this revision.Jul 8 2022, 3:09 AM
xen0n added a comment.Jul 8 2022, 6:12 AM

What about the arguably more common case of a = b | ((c & mask) << shamt) (when c is not known to be small enough to make the masking unnecessary) or a = b | (c & mask), where all positions in a to be overwritten by the incoming bits are known to be zero?

One applicable real-world scenario is RISC code generation, where a lot of ORing between a "clean" insn template and shifted/masked operands takes place. One doesn't mask the bits in the template, hence the optimization chance is missed.

The other cases LGTM, though.

Sorry to reply so late. Good suggestion! And a = b | ((c << shamt) & shifted_mask) also seems to apply. Let me try to update the revision. Thanks!

xen0n added a comment.Jul 11 2022, 8:58 AM

Yeah of course. Thanks for considering the suggestion!

xen0n added a comment.Jul 11 2022, 8:59 AM

And of course I meant "all positions in b to be overwritten ...", instead of a, which is the destination. "Known bits" is the key though.

Yes, I have noticed that. :)

xen0n added inline comments.Jul 12 2022, 3:22 AM
llvm/test/CodeGen/LoongArch/bstrins_d.ll
137

typo: because.

You can include the fix in the next revision, together with the other changes you plan to make. No need to push a new revision just for this, for your convenience (and less email clutter for everyone).

llvm/test/CodeGen/LoongArch/bstrins_w.ll
149

Ditto.

  1. Cover more common cases proposed by @xen0n.
  2. Typo fix: 'because'.

The logic looks good, thanks.

I have several more real-world test cases (LoongArch and IA64 code generation) but I plan to submit them after this lands.

llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
666

nit: "in order to match" for consistency with the wording in the 6th pattern above.

699

nit: "we put it here in order to match"

llvm/test/CodeGen/LoongArch/bstrins_d.ll
176

typo: "pattern 7"

llvm/test/CodeGen/LoongArch/bstrins_w.ll
184

ditto

xen0n accepted this revision.Jul 13 2022, 1:35 AM
This revision is now accepted and ready to land.Jul 13 2022, 1:35 AM
This revision was landed with ongoing or failed builds.Jul 14 2022, 2:21 AM
This revision was automatically updated to reflect the committed changes.