This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Add isel patterns for masked RISCVISD::FMA_VL with RISCVISD::FNEG_VL.
ClosedPublic

Authored by craig.topper on Feb 21 2022, 3:40 PM.

Details

Summary

This helps us form vfnmsub, vfnmadd, and vfmusb from masked VP
intrinsics.

I've used "srcvalue" for the mask parameter in the fneg nodes. We
can't match "V0" because that doesn't sure the mask the is the same.
Instead it matches two different nodes and generates two copies to
V0 of those separate values.

If we don't think srcvalue is ok, we'll need to change to true_mask
or use C++ code to match.

Diff Detail

Event Timeline

craig.topper created this revision.Feb 21 2022, 3:40 PM
craig.topper requested review of this revision.Feb 21 2022, 3:40 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 21 2022, 3:40 PM
liaolucy added inline comments.
llvm/test/CodeGen/RISCV/rvv/vfma-vp.ll
1195

I see a lot of VP intrinsic added, but I'm a little confused about how to use these intrinsics.

I would hazard a guess that the next step is to construct the IR-level VP intrinsic? Is there an initial patch or discussion?

craig.topper added inline comments.Feb 23 2022, 5:59 PM
llvm/test/CodeGen/RISCV/rvv/vfma-vp.ll
1195

I see a lot of VP intrinsic added, but I'm a little confused about how to use these intrinsics.

I would hazard a guess that the next step is to construct the IR-level VP intrinsic? Is there an initial patch or discussion?

https://lists.llvm.org/pipermail/llvm-dev/2019-January/129791.html

https://reviews.llvm.org/D104608 <- first patch for loop vectorizer should be other patches as children

I understand srcval here is

/// SRCVALUE - This is a node type that holds a Value* that is used to
/// make reference to a value in the LLVM IR.                         
SRCVALUE,

It seems a bit hacky at first and I guess it works because those patterns are kind of skipped during matching, right?

If we don't think srcvalue is ok, we'll need to change to true_mask or use C++ code to match.

Does it make sense to have riscv_fmsub_vl, riscv_fnmsub_vl, ... nodes that we combine earlier using riscv_fma_vl and others? I understand this would make the patterns straightforward but maybe it is not feasible.

It seems a bit hacky at first and I guess it works because those patterns are kind of skipped during matching, right?

If we don't think srcvalue is ok, we'll need to change to true_mask or use C++ code to match.

Does it make sense to have riscv_fmsub_vl, riscv_fnmsub_vl, ... nodes that we combine earlier using riscv_fma_vl and others? I understand this would make the patterns straightforward but maybe it is not feasible.

I was thinking we'd (have to) go down the srcvalue route, personally. It should be correct and I can't think of a reason we'd want more control over this kind of thing while pattern matching.

I fear that adding more custom combined nodes would get unwieldy as they don't scale particularly well: we'd probably want corresponding ones for integer madd/macc, for narrowing operations, etc. They may inhibit theoretical optimizations we can perform on generic VP nodes. But the key word is "theoretical" and so that's just a gut reaction. We do have custom nodes for all of the widening operations, so it's feasible.

It seems a bit hacky at first and I guess it works because those patterns are kind of skipped during matching, right?

If we don't think srcvalue is ok, we'll need to change to true_mask or use C++ code to match.

Does it make sense to have riscv_fmsub_vl, riscv_fnmsub_vl, ... nodes that we combine earlier using riscv_fma_vl and others? I understand this would make the patterns straightforward but maybe it is not feasible.

I was thinking we'd (have to) go down the srcvalue route, personally. It should be correct and I can't think of a reason we'd want more control over this kind of thing while pattern matching.

I fear that adding more custom combined nodes would get unwieldy as they don't scale particularly well: we'd probably want corresponding ones for integer madd/macc, for narrowing operations, etc. They may inhibit theoretical optimizations we can perform on generic VP nodes. But the key word is "theoretical" and so that's just a gut reaction. We do have custom nodes for all of the widening operations, so it's feasible.

We are already using srcvalue to skip VL for splats nodes.

The other option I can think of is to create a ComplexPattern that takes the Root nodes as an input. The ComplexPattern matching code could find the mask on the root node and make sure it matches the operand the ComplexPattern is used for.

rogfer01 accepted this revision.Mar 9 2022, 10:43 PM

I fear that adding more custom combined nodes would get unwieldy as they don't scale particularly well: we'd probably want corresponding ones for integer madd/macc, for narrowing operations, etc. They may inhibit theoretical optimizations we can perform on generic VP nodes. But the key word is "theoretical" and so that's just a gut reaction. We do have custom nodes for all of the widening operations, so it's feasible.

Fair point.

We are already using srcvalue to skip VL for splats nodes.

Ah ok. I missed that.

The other option I can think of is to create a ComplexPattern that takes the Root nodes as an input. The ComplexPattern matching code could find the mask on the root node and make sure it matches the operand the ComplexPattern is used for.

Well unless we have a strong reason to really ensure the two masks match, the patterns are more descriptive.

No objections on my side.

This revision is now accepted and ready to land.Mar 9 2022, 10:43 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 9 2022, 10:43 PM
This revision was landed with ongoing or failed builds.Mar 10 2022, 10:06 AM
This revision was automatically updated to reflect the committed changes.