At this point our uadd.sat and usub.sat support should be better than expanded IR, so start canonicalizing to these intrinsics. I'm porting the already existing patterns, though we'll also want to convert the currently canonical patterns in addition to that.
I've left them in place to avoid redundant calculations if the intrinsic needs to be expanded in the backend. If that's not a concern I can drop them (apart from the case that inserts an extra sub).
There are 2 concerns here:
- Is the IR-level analysis for the intrinsics as good/better than the raw IR? The description says yes, so I assume we've filled all the known analysis, value tracking, vectorization, and combining gaps.
- Is codegen as good/better than the raw IR? Ie, for all known targets/types, are we DAG combining/lowering the intrinsics optimally?
I want to make sure about #2 because there was some potential fallout from aggressively forming the similar usub overflow intrinsic in CGP in D57789, so we ended up adding a target hook (cc @dmgreen).
Sounds good. I think this is the right transform in IR, but if there's anything we can do in advance to ensure that codegen doesn't regress, that's always best.
I don't think it will be affected with this patch yet, but I'm looking at 1 more x86 or general vector codegen improvement for min/max in PR14613:
...and then I think we can also convert min/max patterns to uadd.sat in IR.
Hello. This one looks OK from the tests I just ran. A few things moved, but only by a small amount, and usually in the right direction.
I think the "with.overflow" intrinsics were causing more problems for us. These saturations, at least in the tests I've ran, seem to be handled OK.
Do these comments need updating?
I converted umin-based patterns for uaddsat here:
That should give us an x86 improvement in PR14613.
I'm not sure if we are holding this patch up for any other changes, but if not and if the above change doesn't cause trouble, we should push ahead with this patch.
Remove/update this comment.