Follow the same custom legalisation strategy as used in D57085 for variable-length shifts (see that patch summary for more discussion). Although we may lose out on some late-stage DAG combines, I think this approach is ultimately easier to reason about.
There are some codegen changes in rv64m-exhaustive-w-insts.ll, but they are all neutral in terms of instruction count.
Since you have a custom DAG node, you might as well implement ComputeNumSignBitsForTargetNode instead of using a pattern like this.
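To illustrate the suggestion, here is a rough standalone sketch of the idea behind such a hook. It does not use LLVM's real types or signatures: `ToyOpcode`, `toyNumSignBits`, `sext32` and `countSignBits` are hypothetical names invented for this example, and the opcode list is an assumption about which custom nodes the patch introduces. The only fact it relies on is that a 64-bit value sign-extended from 32 bits has at least 33 identical top bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for target-specific DAG opcodes (not LLVM's enum).
enum class ToyOpcode { Add64, DivW, DivuW, RemuW };

// Toy analogue of a "number of sign bits" hook: returns how many of the
// top bits of the 64-bit result are known to equal the sign bit.
unsigned toyNumSignBits(ToyOpcode Op) {
  switch (Op) {
  case ToyOpcode::DivW:
  case ToyOpcode::DivuW:
  case ToyOpcode::RemuW:
    // Assumed: these nodes produce a 32-bit result sign-extended to 64 bits,
    // so bits 63..31 are all copies of bit 31.
    return 33;
  default:
    // Nothing is known beyond the sign bit itself.
    return 1;
  }
}

// Models the result of a *W operation: keep the low 32 bits of Raw and
// sign-extend them to 64 bits.
int64_t sext32(uint64_t Raw) {
  return static_cast<int64_t>(static_cast<int32_t>(static_cast<uint32_t>(Raw)));
}

// Reference check: counts the leading bits of V that equal its sign bit.
unsigned countSignBits(int64_t V) {
  uint64_t U = static_cast<uint64_t>(V);
  uint64_t Sign = U >> 63;
  unsigned N = 0;
  for (int Bit = 63; Bit >= 0 && ((U >> Bit) & 1) == Sign; --Bit)
    ++N;
  return N;
}
```

With that information available from the node itself, a generic combine can see that a following sign-extension from 32 bits is redundant, so a separate pattern to match it is not needed.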