This is an archive of the discontinued LLVM Phabricator instance.

[SelectionDAG][ARM][X86] Teach PromoteIntRes_SETCC to do a better job picking the result type for the setcc.
ClosedPublic

Authored by craig.topper on Feb 11 2018, 11:45 PM.

Details

Summary

Previously if getSetccResultType returned an illegal type we just fell back to using the default promoted type. This appears to have been to handle the case where for vectors getSetccResultType returns the input type, but the input type itself isn't legal and will need to be promoted. Without the legality check we would never reach a legal type.

But just picking the promoted type to be the setcc type can create strange setccs where the result type is 128 bits and the operand type is 256 bits. If for example the result type was promoted to v8i16 from v8i1, but the input type was promoted from v8i23 to v8i32. We currently handle this with custom lowering code in X86.

This legality check also caused us reject the getSetccResultType when the input type needed to be widened or split. Even though that result wouldn't have caused legalization to get stuck.

This patch tries to fix this situation by detecting that the input type needs to be promoted and using the promoted input type to make the setcc call. This should prevent getting stuck always picking the same illegal type.

I tried legalizing the operands of the setcc at the same time as the result for this case, but that caused some test changes solely due to the next DAG combine running in a different order.

The x86 test changes look to be regressions. Previously we were rejecting this because the result type being returned was v8i64 but we were rejecting it in favor of v8i16. Then we split the operands to v4i64, created a setcc with v4i1 result type. Promoted that v4i1 to v4i32. Then split the setcc operands again to v2i64, created setcc with v2i1 result type. Then finally promoted that v2i1 result to v2i64. Each time we promote the result a truncate node gets introduced. Somehow those intermediate truncates helped us legalize this well. Now we keep v8i64 and always have vXi1 or vXi64 results for the setccs. We are able to split these setccs without introducing any new truncates.

Diff Detail

Event Timeline

craig.topper created this revision.Feb 11 2018, 11:45 PM
efriedma added inline comments.Mar 9 2018, 2:43 PM
lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
593

Please explicitly note this is a heuristic.

test/CodeGen/X86/bitcast-and-setcc-512.ll
49

I guess the problem here is specifically that the new version doesn't use packssdw?

Made the code less of a heuristic.

craig.topper added inline comments.Mar 9 2018, 3:22 PM
test/CodeGen/X86/bitcast-and-setcc-512.ll
49

Yeah. I think we are only able to do it previously because the legalization process legalized the setcc several times and kept introducing truncates each time. With this patch we do less legalizations and introduce less intermediate truncates.

efriedma added inline comments.Mar 15 2018, 2:08 PM
lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
604

Are you intentionally changing the behavior in the case where SVT is an illegal type which won't be promoted?

craig.topper added inline comments.Mar 15 2018, 2:55 PM
lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
604

Yes. I think the getSetccResultType is a good answer if it doesn't need to be promoted.

This is what happened in the X86 test case that changed:

getSetccResultType for v8i64 returned v8i64, but the original code detected it as an illegal type and instead changed it to v8i16 creating a weird looking (v8i16 (setcc (v8i64))). Now type legalization splits the input type. And creates (v8i16 (sext (concat_vectors (v4i1 (setcc (v4i64))), (v4i1 (setcc (v4i64)))). The v4i1 setccs try to legalize their result types. We call getSetCCResultType and get v4i64 reject it as illegal and use v4i32. Creating another pair of weird setccs that have to be split and promoted.

If we had keep the original v8i64 answer from getSetccResultType we would have create a normal looking (v8i64 (setcc (v8i64))). Then split that to (v4i64 (setcc (v4i64))) and then (v2i64 (setcc (v2i64))) without any intermediate promotions of the setccs.

We are still doing something weird with v2f32 setccs on x86. We call getSetccResultType and get v2i32, which needs to be promoted, but v2f32 doesn't need to be promoted. So we fall back to using NVT(v2i64) and create (v2i64 (setcc v2f32)). Which we widen and create (v2i64 (sext (v2i32 (extract_subvector (v4i32 (setcc v432)))).

This revision is now accepted and ready to land.Mar 15 2018, 3:20 PM
craig.topper closed this revision.Mar 15 2018, 4:11 PM

Committed in 327683, but forgot the Differential Revision line.