This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][SME] Use `fmov` instead of NEON `movi` for FP value.
ClosedPublic

Authored by sdesmalen on Jul 17 2023, 3:11 AM.

Details

Summary

NEON movi is not valid in Streaming SVE mode, so use an fmov
instruction instead for zero-initializing an FP value.

Event Timeline

sdesmalen created this revision. Jul 17 2023, 3:11 AM
sdesmalen requested review of this revision. Jul 17 2023, 3:11 AM
Herald added a project: Restricted Project. Jul 17 2023, 3:11 AM
Matt added a subscriber: Matt. Jul 17 2023, 4:27 PM
hassnaa-arm added inline comments. Jul 18 2023, 5:50 AM
llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
1333

Conceptually, what is the difference between hasNEON and isNeonAvailable()?
I saw the implementation of isNeonAvailable(), which is the opposite of hasNEON or the opposite of streaming mode.
But I don't understand how the implementation of isNeonAvailable() matches its name.

1333

What is the difference between this class, 'AArch64AsmPrinter', and the tblgen files?
Another thing: I see that the tblgen files contain some conditions for generating code compatible with specific features, so what is the difference between handling the compatible generated code in the tblgen files and handling it here?

llvm/test/CodeGen/AArch64/sve-streaming-mode-test-register-mov.ll
36

Since the initialised vector is of type double, why is this a mov instead of an fmov?

sdesmalen added inline comments. Jul 18 2023, 6:12 AM
llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
1333

Conceptually, what is the difference between hasNEON and isNeonAvailable()?

isNeonAvailable() used to be !forceStreamingCompatibleSVE(), see ec6af93d0249d03a5babd547e072e4de3a2b5e48.
Basically, if the function is a streaming function, or a streaming-compatible function, then Neon is not available.

I could have named it isNeonAvailableAtRuntime, but isNeonAvailable was shorter.

hasNEON is more to do with what target features we compile for. So while we target a core that has NEON instructions, in the given _runtime_ mode we may not be able to use them.
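
The distinction described above can be sketched as a small stand-alone model. This is an illustrative simplification, not the real AArch64Subtarget class: the struct name SubtargetModel and its three fields are hypothetical, and the exact condition used in LLVM may differ.

```cpp
// Hypothetical, simplified model of the two queries discussed in this
// thread. It is NOT the real AArch64Subtarget implementation.
struct SubtargetModel {
  bool NeonFeature;         // the target we compile for has NEON instructions
  bool Streaming;           // the function is a streaming function
  bool StreamingCompatible; // the function is streaming-compatible

  // Target-feature question: does the core we target have NEON at all?
  constexpr bool hasNEON() const { return NeonFeature; }

  // Runtime-mode question: may this particular function use NEON?
  // Assumption: NEON is unusable in streaming and streaming-compatible
  // functions, as described in the comment above.
  constexpr bool isNeonAvailable() const {
    return hasNEON() && !Streaming && !StreamingCompatible;
  }
};
```

In this model a streaming function on a NEON-capable core has hasNEON() == true but isNeonAvailable() == false, which is the case this patch has to handle.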

What is the difference between this class, 'AArch64AsmPrinter', and the tblgen files?

AsmPrinter emits the *actual* instruction for the given MachineInstr, which is an intermediate representation of either a pseudo or a real instruction. In this case, FMOVS0 is a pseudo node that gets expanded here to the appropriate instruction.
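
As a rough illustration of what "expanding a pseudo node" means here, the choice can be modelled as a function from the pseudo opcode and the runtime NEON availability to a concrete instruction. All names below (ZeroPseudo, ZeroInstr, expandZeroPseudo) are hypothetical, the use of the zero register as the fmov source is an assumption, and the sketch ignores any tuning conditions the real AsmPrinter code may also check.

```cpp
// Hypothetical sketch of pseudo expansion; not the actual AsmPrinter code.
enum class ZeroPseudo { FMOVS0, FMOVD0 };

enum class ZeroInstr {
  NeonMovi,    // NEON movi writing zero; not valid in streaming SVE mode
  FmovFromWZR, // scalar fmov of the 32-bit zero register (assumed form)
  FmovFromXZR  // scalar fmov of the 64-bit zero register (assumed form)
};

// Pick the concrete instruction for a zero-initialising pseudo.
// When NEON is not available at runtime, fall back to a scalar fmov.
constexpr ZeroInstr expandZeroPseudo(ZeroPseudo Pseudo, bool NeonAvailable) {
  return NeonAvailable ? ZeroInstr::NeonMovi
         : Pseudo == ZeroPseudo::FMOVD0 ? ZeroInstr::FmovFromXZR
                                        : ZeroInstr::FmovFromWZR;
}
```

The point of the sketch is only that the decision is made at emission time, based on the mode the function runs in, rather than in a TableGen pattern.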

Another thing: I see that the tblgen files contain some conditions for generating code compatible with specific features, so what is the difference between handling the compatible generated code in the tblgen files and handling it here?

In TableGen we define the instructions (with their encodings) and have patterns to map to those instructions, or to a pseudo node. I'm not entirely sure why the approach taken here was to map to a pseudo node, rather than to map directly to the appropriate instruction using a pattern. Perhaps the simpler pseudo node has some benefits, but I don't see much code that uses its simpler representation. There is code in InstrInfo to say that an FMOV[HSD]0 is cheap, and also a change in the scheduler specific to FMOV[HSD]0.

The purpose of this patch is to fix the issue, so I didn't touch the previous design choices on how this is represented.

hassnaa-arm added inline comments. Jul 18 2023, 7:02 AM
llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
1333

Thanks for the explanation :)

sdesmalen added inline comments. Jul 20 2023, 1:50 AM
llvm/test/CodeGen/AArch64/sve-streaming-mode-test-register-mov.ll
36

FMOV (vector, immediate) is not available in streaming SVE mode, so we can't use it. We use the SVE mov instruction instead.

hassnaa-arm accepted this revision. Jul 20 2023, 2:33 AM

LGTM. Thanks for the clarification.

This revision is now accepted and ready to land. Jul 20 2023, 2:33 AM