This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][GlobalISel] Fold away lowered vector sign-extend of vector compares
ClosedPublic

Authored by aemerson on Oct 3 2022, 7:26 AM.

Details

Summary

This fixes a long standing cause of awful code generation when legalization creates
G_SEXT(G_FCMP(...)), for example due to promoting the condition of a vector G_SELECT.

Since on AArch64 vector compares sign-extend the condition value, there's no need
for this extra G_SEXT. Unfortunately by the time we get to post-legalization these
G_SEXTs have already been lowered into shifts, so this combine is a bit more
involved than I'd ideally like. Oh well.

Diff Detail

Event Timeline

aemerson created this revision.Oct 3 2022, 7:26 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 3 2022, 7:26 AM
aemerson requested review of this revision.Oct 3 2022, 7:26 AM
arsenm added a subscriber: arsenm.Oct 3 2022, 8:43 AM

This should have been caught and combined into g_sext_inreg, which should have folded out based on computeNumSignBits

This should have been caught and combined into g_sext_inreg, which should have folded out based on computeNumSignBits

The only place we have an opportunity to catch the extend before it gets legalized into shifts is in the artifact combiner, and this is just a code quality optimization so it shouldn't go there.

Why can't this be caught pre-legalize?

llvm/lib/Target/AArch64/GISel/AArch64PostLegalizerCombiner.cpp
368

replaceSingleDefInstWithReg?

Why can't this be caught pre-legalize?

Because the sign-extend and the subsequent shifts only appear during legalization. Before that there’s nothing to combine.

paquette accepted this revision.Oct 3 2022, 1:12 PM

LGTM

This revision is now accepted and ready to land.Oct 3 2022, 1:12 PM
arsenm added a comment.Oct 3 2022, 2:29 PM

This should have been caught and combined into g_sext_inreg, which should have folded out based on computeNumSignBits

The only place we have an opportunity to catch the extend before it gets legalized into shifts is in the artifact combiner, and this is just a code quality optimization so it shouldn't go there.

The shifts themselves combine into g_sext_inreg

arsenm added a comment.Oct 3 2022, 2:30 PM

This should have been caught and combined into g_sext_inreg, which should have folded out based on computeNumSignBits

The only place we have an opportunity to catch the extend before it gets legalized into shifts is in the artifact combiner, and this is just a code quality optimization so it shouldn't go there.

The shifts themselves combine into g_sext_inreg

Plus there's nothing AArch64 specific about this. You're not getting the sext_inreg with the base set of combines because the aarch64 rule is to always lower sext_inreg

This should have been caught and combined into g_sext_inreg, which should have folded out based on computeNumSignBits

The only place we have an opportunity to catch the extend before it gets legalized into shifts is in the artifact combiner, and this is just a code quality optimization so it shouldn't go there.

The shifts themselves combine into g_sext_inreg

Plus there's nothing AArch64 specific about this. You're not getting the sext_inreg with the base set of combines because the aarch64 rule is to always lower sext_inreg

Ah I see. I’ll look into that tomorrow.