[AArch64] Use 64-bit movi for zeroing halfs/floats
ClosedPublic

Authored by SjoerdMeijer on Apr 1 2021, 2:30 AM.

Details

Summary

Zeroing previously used the .2d variant of movi, but the "movi d0" variant, which zeroes 64 bits, is faster on some cores.

This is a prep step for D99586, which always uses movi for zeroing floats.
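As a rough illustration of the change described above, the following standalone sketch contrasts the two zeroing forms. It is not the patch itself and does not use LLVM's classes: `FPReg`, `toDReg`, `zeroWith2D`, and `zeroWithD` are hypothetical names invented for this example, and the assembly strings assume the immediate is printed as `#0000000000000000`.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for an FP register: a class letter ('h', 's' or 'd')
// plus an index in 0..31. This is not an LLVM type.
struct FPReg {
  char cls;
  unsigned index;
};

// Map an H or S register to the D register that overlaps it. Zeroing the
// 64-bit D register also zeroes the narrower half or single value.
FPReg toDReg(FPReg reg) {
  assert(reg.cls == 'h' || reg.cls == 's' || reg.cls == 'd');
  assert(reg.index < 32);
  return FPReg{'d', reg.index};
}

// Old form from the summary: the 128-bit .2d vector variant.
std::string zeroWith2D(FPReg reg) {
  return "movi v" + std::to_string(toDReg(reg).index) +
         ".2d, #0000000000000000";
}

// New form from the summary: the 64-bit scalar "movi d0" variant.
std::string zeroWithD(FPReg reg) {
  return "movi d" + std::to_string(toDReg(reg).index) +
         ", #0000000000000000";
}
```

For example, under these assumptions `zeroWithD({'h', 0})` yields the "movi d0" form mentioned in the summary, while `zeroWith2D({'h', 0})` yields the older `v0.2d` form.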

Diff Detail

Event Timeline

SjoerdMeijer created this revision.Apr 1 2021, 2:30 AM
SjoerdMeijer requested review of this revision.Apr 1 2021, 2:30 AM
Herald added a project: Restricted Project.Apr 1 2021, 2:30 AM
david-arm added inline comments.
llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
1105

Should this be moved to the else case? It looks like we're adding the same immediate twice for the H and S cases.

SjoerdMeijer added inline comments.Apr 1 2021, 3:27 AM
llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
1105

Thanks for taking a look. This movi variant takes an optional shift immediate value, so for the H and S cases we do indeed need to add 0 immediates. That probably deserves a comment, so I will add one.

I think the exact suggestion was to use MOVID instead. I'm not sure how much it matters, but it may be a simpler instruction for some cores. This would then match what GCC emits.

Well, that's not what GCC does at the moment, not yet at least, but it is indeed another way of doing it. Let's go for the movi d0.

SjoerdMeijer edited the summary of this revision.

Now using "movi d0".

dmgreen accepted this revision.Apr 1 2021, 6:12 AM

Thanks. This LGTM, so long as the Apple folks here are happy with changing the instruction issued.

llvm/test/CodeGen/AArch64/arm64-zero-cycle-zeroing.ll
33

We may want to do the same with the NEON movi instructions, as they set all bits to 0 in either case.

This revision is now accepted and ready to land.Apr 1 2021, 6:12 AM

Thanks, I will wait a few days before committing.

This revision was automatically updated to reflect the committed changes.