This is an archive of the discontinued LLVM Phabricator instance.

[SVE] Implement fixed-width ZEXT lowering
Abandoned Public

Authored by cameron.mcinally on Aug 7 2020, 2:24 PM.

Details

Summary

Looks like this patch is already outdated by D85546, but thought I'd offer it anyway in case the framework helps...

This patch is an attempt to lower fixed-width ZEXT by bitcasting the input to the result type, then AND'ing with an immediate that matches the input type.

Event Timeline

cameron.mcinally requested review of this revision. Aug 7 2020, 2:24 PM
cameron.mcinally edited the summary of this revision. Aug 7 2020, 3:27 PM

It does seem like we're at risk of treading on each other's toes :) Perhaps it's worth syncing up so we don't duplicate effort. I had been trying to kill off D71767, but instead I'll update it (probably tomorrow) to show work I've already got downstream where I just need to write tests. I doubt I'll get a chance to cover additional nodes over the next few weeks, so everything else should be fair game.

llvm/test/CodeGen/AArch64/sve-fixed-length-zext.ll
78–85

The test's output shows this is not really a zero extend.

The load will read 32 bytes into consecutive byte lanes. The AND will zero the odd-numbered bytes (i.e. zero-extend the even lanes). The store will write those 16 zero-extended even-numbered bytes along with 16 more zeros that result from the load zeroing byte 32 onward.
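To make the point above concrete, here is a rough Python model of the load / AND / store sequence. It is only a sketch under stated assumptions: a little-endian target, a 64-byte (512-bit) register, a 32 x i8 input, and a predicated load that zeroes the inactive lanes. The helper names `bitcast_and_lowering` and `true_zext` are made up for illustration and do not come from the patch.

```python
import struct

def bitcast_and_lowering(src, vl_bytes=64):
    """Model the bitcast + AND sequence (hypothetical helper, not patch code)."""
    # Predicated load: the source bytes land in consecutive byte lanes and
    # the remaining lanes are zeroed (assumed 64-byte register).
    reg = bytes(src) + bytes(vl_bytes - len(src))
    # Reinterpret the register as little-endian 16-bit lanes.
    lanes = struct.unpack(f"<{vl_bytes // 2}H", reg)
    # AND each 16-bit lane with 0x00FF, clearing the odd-numbered bytes.
    masked = [lane & 0x00FF for lane in lanes]
    # The store writes one 16-bit lane per source element.
    return masked[:len(src)]

def true_zext(src):
    """What zext <32 x i8> to <32 x i16> should produce."""
    return list(src)

src = list(range(1, 33))            # 32 distinct byte values: 1..32
got = bitcast_and_lowering(src)

# Only the even-numbered source bytes survive, followed by 16 zeros.
assert got[:16] == src[0::2]
assert got[16:] == [0] * 16
assert got != true_zext(src)
```

With a packed input, the masked result keeps only every other source byte, which is why the sequence is not a zero extend.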

I'm afraid to say that the extends are currently not going to be a cheap operation, because they are effectively the reverse of the truncate operation. Truncates use a uzp1 sequence, so the extends require a uunpklo sequence.
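The truncate/extend relationship can be sketched on plain Python lists. This is an illustrative model only, assuming little-endian lane order and a tiny 8-byte register; `uzp1` and `uunpklo` here are simplified stand-ins for the SVE "unzip even elements" and "unsigned unpack low half" instructions, and `to_byte_lanes` is a hypothetical helper.

```python
def uzp1(a, b):
    """Concatenate the even-indexed elements of a and b."""
    return a[0::2] + b[0::2]

def uunpklo(v, src_bits):
    """Zero-extend the low half of v's elements to double width."""
    mask = (1 << src_bits) - 1
    return [x & mask for x in v[: len(v) // 2]]

def to_byte_lanes(lanes, lane_bytes):
    """View wider little-endian lanes as a flat list of byte lanes."""
    out = []
    for lane in lanes:
        out.extend(lane.to_bytes(lane_bytes, "little"))
    return out

# Four packed i8 elements in the low half of an 8-byte register.
narrow = [0x11, 0x22, 0x33, 0x44, 0, 0, 0, 0]

# Extend: unpack the low half into four 16-bit lanes.
wide = uunpklo(narrow, 8)
assert wide == [0x11, 0x22, 0x33, 0x44]

# Truncate: view the wide lanes as bytes and keep the even-numbered bytes.
wide_bytes = to_byte_lanes(wide, 2)
truncated = uzp1(wide_bytes, wide_bytes)[:4]
assert truncated == narrow[:4]
```

In this model the unpack spreads each packed element into its own wider lane, and the unzip gathers the low bytes back together, so each undoes the other.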

I've got a patch already to do this that I'll add you to.

cameron.mcinally abandoned this revision. Aug 7 2020, 4:34 PM
cameron.mcinally added inline comments.
llvm/test/CodeGen/AArch64/sve-fixed-length-zext.ll
78–85

Ah, they're packed. I got mixed up. Was thinking of it as:

<n x 2 x i64> res = zext(<n x 2 x i32> x)

But, yeah, I see the problem with that now. Thanks.
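For contrast, here is a small sketch of the layout the comment above had in mind, where masking would be enough. It assumes (my assumption, for illustration) that each i32 element already sits in the low half of its own 64-bit container and that the upper half may hold arbitrary bits; `zext_unpacked` is a hypothetical name.

```python
MASK32 = 0xFFFF_FFFF

def zext_unpacked(containers):
    """Zero-extend i32 values that already occupy 64-bit containers."""
    # Each element is already in its final lane, so clearing the upper
    # 32 bits is all that is needed.
    return [c & MASK32 for c in containers]

# Upper halves hold junk; lower halves hold the i32 payloads.
containers = [0xDEADBEEF_00000001, 0xFFFFFFFF_80000000]
result = zext_unpacked(containers)
assert result == [0x00000001, 0x80000000]
```

The AND works here only because no element has to move to a different lane, which is not true of the packed fixed-width input in the test.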