Differential D57348
[CodeGen][X86] Don't scalarize vector saturating add/sub
Closed, Public. Authored by nikic on Jan 28 2019, 11:48 AM.
Details
Summary
This matches what the cost model already does. This relies on vector SADDO/SSUBO though. I think this works fine because those are expanded on all archs (for vectors), but it is possibly not entirely reliable.
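For readers unfamiliar with the dependency on SADDO mentioned in the summary, here is a minimal scalar sketch of how a signed saturating add can be built from an add-with-overflow step. It is only an illustration, not the code in TargetLowering.cpp: the names `AddOverflow`, `sadd_overflow` and `sadd_sat` are hypothetical, and it assumes 32-bit lanes and two's-complement wraparound when converting back to a signed type.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper type: the (value, overflow flag) pair that an
// SADDO-style operation produces.
struct AddOverflow {
  int32_t sum;    // wrapped two's-complement sum
  bool overflow;  // true if the exact sum does not fit in int32_t
};

constexpr AddOverflow sadd_overflow(int32_t a, int32_t b) {
  // Add in unsigned arithmetic so that wraparound is well defined.
  const uint32_t ua = static_cast<uint32_t>(a);
  const uint32_t ub = static_cast<uint32_t>(b);
  const uint32_t us = ua + ub;
  // Signed overflow occurred iff both operands differ in sign from the sum.
  const bool ovf = (((ua ^ us) & (ub ^ us)) >> 31) != 0;
  // Assumes the usual modular conversion back to a signed value.
  return {static_cast<int32_t>(us), ovf};
}

constexpr int32_t sadd_sat(int32_t a, int32_t b) {
  const AddOverflow r = sadd_overflow(a, b);
  if (!r.overflow)
    return r.sum;
  // On overflow the wrapped sum has the wrong sign: a negative wrapped sum
  // means the exact result was too large, a non-negative one too small.
  return r.sum < 0 ? std::numeric_limits<int32_t>::max()
                   : std::numeric_limits<int32_t>::min();
}
```

A vector version would apply the same steps lane by lane, which is why the approach only works if the overflow operation itself can be handled for vector types.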
Event Timeline

Comment:
This might be a little premature as we haven't got PR40442 yet.

Comment:
The relevant part of PR40442 has been resolved now.

Comment:
Restore assertions (and move them towards the top). Regenerate tests (changes in register allocation).

This revision is now accepted and ready to land. (Feb 10 2019, 10:49 AM)

Closed by commit rL353651: [CodeGen][X86] Don't scalarize vector saturating add/sub (authored by nikic). (Feb 10 2019, 11:06 AM)

This revision was automatically updated to reflect the committed changes.

Comment:
Hi Nikita,

This commit seems to cause a crash running llc -O0 on the following IR:

    target triple = "thumbv7k-apple-darwin"

    define hidden void @foo(<2 x i64> *%ptr) {
    entry:
      %0 = load <2 x i64>, <2 x i64>* %ptr, align 8
      %1 = call <2 x i64> @llvm.usub.sat.v2i64(<2 x i64> zeroinitializer, <2 x i64> %0)
      %2 = bitcast i64* undef to <2 x i64>*
      store <2 x i64> %1, <2 x i64>* %2, align 8
      ret void
    }

    ; Function Attrs: nounwind readnone speculatable
    declare <2 x i64> @llvm.usub.sat.v2i64(<2 x i64>, <2 x i64>)

Could you take a look?

Comment:
@aemerson Thanks for the report! From a quick look, it looks like the usubsat expands to usubo, which is then expanded during op legalization, while it needs to be expanded during vector op legalization. We're doing that for umulo and smulo, but apparently not for uaddo/saddo and usubo/ssubo. I'll prepare a patch.
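To make the usubsat-to-usubo step in the last comment concrete, here is a small sketch of the semantics involved, written lane by lane for a two-element 64-bit vector like the one in the reported IR. It models only what the operations compute, not how or when the legalizer expands them; `V2`, `SubOverflow`, `usub_overflow`, `usub_sat` and `usub_sat_v2` are hypothetical names chosen for this example.

```cpp
#include <array>
#include <cstdint>

// Hypothetical stand-in for a <2 x i64> value.
using V2 = std::array<uint64_t, 2>;

// Hypothetical helper type: the (value, overflow flag) pair that a
// USUBO-style operation produces.
struct SubOverflow {
  uint64_t diff;  // wrapped difference
  bool overflow;  // true if a borrow occurred, i.e. a < b
};

constexpr SubOverflow usub_overflow(uint64_t a, uint64_t b) {
  return {a - b, a < b};
}

// Unsigned saturating subtract: clamp to zero when the subtract borrows.
constexpr uint64_t usub_sat(uint64_t a, uint64_t b) {
  const SubOverflow r = usub_overflow(a, b);
  return r.overflow ? 0 : r.diff;
}

// Apply the scalar operation to each lane independently.
constexpr V2 usub_sat_v2(const V2 &a, const V2 &b) {
  return {usub_sat(a[0], b[0]), usub_sat(a[1], b[1])};
}
```

With a zero first operand, as in the reported IR, every lane of the result is zero under these semantics, whatever the loaded value is.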
Revision Contents
Diff 186156
llvm/trunk/lib/CodeGen/SelectionDAG/TargetLowering.cpp
llvm/trunk/test/CodeGen/X86/sadd_sat.ll
llvm/trunk/test/CodeGen/X86/sadd_sat_vec.ll
llvm/trunk/test/CodeGen/X86/ssub_sat.ll
llvm/trunk/test/CodeGen/X86/ssub_sat_vec.ll