This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Modify SVE Pseudo appends
ClosedPublic

Authored by harviniriawan on Jun 29 2023, 6:59 AM.

Details

Summary
  • Update cortex-a510 and neoverse-v2 SVE scheduling so that pseudos have the same instruction latency as original instruction.

    Differential Revision: https://reviews.llvm.org/D154084

Diff Detail

Event Timeline

harviniriawan created this revision.Jun 29 2023, 6:59 AM
harviniriawan requested review of this revision.Jun 29 2023, 6:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 29 2023, 6:59 AM

Sounds like a good idea. And I like the test, that looks very useful in keeping the data insync.

llvm/unittests/Target/AArch64/AArch64SvePseudoTest.cpp
35

false -> true

39

Can you add CortexA510 to this test name. It would be good to make use of the test in other cpu's like N2 and V2 (in a future patch). Perhaps pass the CPU as a parameter to createTargetMachine too?

And maybe make it clear that it is testing scheduling info. AArch64SVESchedPseudoTest or something similar?

52
60

Descorig -> DescOrig, same below.

paulwalker-arm added a comment.EditedJun 29 2023, 7:35 AM

Can the UNDEF_D -> D_UNDEF renaming be pulled into a separate NFC patch?

rjj added a subscriber: rjj.Jun 29 2023, 7:40 AM

This looks like a nice patch. I left a few comments for the Neoverse V2 below.

Also, though not particularly important for now, maybe we could also normalise SME instructions (like ADDHA_MPPZ_D_PSEUDO_D)?

llvm/lib/Target/AArch64/AArch64SchedNeoverseV2.td
2081–2082

This line is actually not necessary anymore

2095–2096

Same as above.

2156–2157

Unnecessary now.

2370–2371

These should still use the explicit forms, or be changed to instregex.

2524–2525

Should still use the explicit form.

2528–2529

Ditto.

2533

Ditto.

2560

Should use the explicit form.

2563

Ditto.

2566

Ditto.

llvm/unittests/Target/AArch64/AArch64SvePseudoTest.cpp
50

nit: typo

Can the UNDEF_D -> D_UNDEF renaming be pulled into a separate NFC patch?

I don't think it's possible. I can add tests on Neoverse-N2 CPU as well in here, then

Matt added a subscriber: Matt.Jun 29 2023, 5:07 PM

Can the UNDEF_D -> D_UNDEF renaming be pulled into a separate NFC patch?

I don't think it's possible. I can add tests on Neoverse-N2 CPU as well in here, then

Are you sure? To be clear I'm talking about the changes to llvm/lib/Target/AArch64/SVEInstrFormats.td which looks like a mechanical name change to me that should not affect any behaviour. If it does then that's likely a bug.

Can the UNDEF_D -> D_UNDEF renaming be pulled into a separate NFC patch?

I don't think it's possible. I can add tests on Neoverse-N2 CPU as well in here, then

Are you sure? To be clear I'm talking about the changes to llvm/lib/Target/AArch64/SVEInstrFormats.td which looks like a mechanical name change to me that should not affect any behaviour. If it does then that's likely a bug.

the UNDEF is actually used in pseudo lowering (if that's the right term). In AArch64GenInstrInfo.inc : So if during pre RA MI Sched the UNDEF variant of the SVE instruction is being generated, it will not pick up the right latency
// getSVEPseudoMap
LLVM_READONLY
int getSVEPseudoMap(uint16_t Opcode) {
static const uint16_t getSVEPseudoMapTable[][2] = {

{ AArch64::ABS_ZPmZ_B_UNDEF, AArch64::ABS_ZPmZ_B },
{ AArch64::ABS_ZPmZ_D_UNDEF, AArch64::ABS_ZPmZ_D },
harviniriawan marked 15 inline comments as done.
dmgreen added inline comments.Jul 3 2023, 1:06 AM
llvm/unittests/Target/AArch64/AArch64SvePseudoTest.cpp
83

Can you make this two tests, one for each CPU. It can make the tests easier to work with if they fail, where it is better to have each test more independant.

The neoverse-v2 part may want to be moved to D154232 to keep that patch as an NFC. This patch could then be to improve A510.

paulwalker-arm added inline comments.Jul 3 2023, 2:41 AM
llvm/unittests/Target/AArch64/AArch64SvePseudoTest.cpp
83

@dmgreen - The AArch64SchedNeoverseV2.td changes are functional are they not? As in this patch is about adding scheduling information for SVE pseudo instructions, which coveres multiple scheduling models. Or have I misunderstood your comment?

dmgreen added inline comments.Jul 3 2023, 2:50 AM
llvm/unittests/Target/AArch64/AArch64SvePseudoTest.cpp
83

The V2 scheduling model was already matching (ABS|CNOT|NEG)_ZPmZ_UNDEF_[BHSD]$. It needs to be changed to match (ABS|CNOT|NEG)_ZPmZ_[BHSD]_UNDEF$ (or (ABS|CNOT|NEG)_ZPmZ_[BHSD] as is done here) in order to keep it matching the same instructions.

I'm not sure if there were other missing UNDEF instructions? I might just include them in D154232 if they were, but it may be better to pull it out into a different patch.

paulwalker-arm added inline comments.Jul 3 2023, 2:55 AM
llvm/unittests/Target/AArch64/AArch64SvePseudoTest.cpp
83

I see. Thanks. So I was thinking D154232 would just be a literal change to move the UNDEF matching to where it needs to be. Then this patch can unify the patterns as it's already doing, which would then catch any missing ones.

harviniriawan edited the summary of this revision. (Show Details)
dmgreen accepted this revision.Jul 4 2023, 1:17 AM

This looks OK to me, as far as I can see, and I like the test case. LGTM, thanks.

llvm/unittests/Target/AArch64/AArch64SVESchedPseudoTest.cpp
82 ↗(On Diff #536769)

Perhaps use TEST(AArch64SVESchedPseudoTest, CortexA510) {

This revision is now accepted and ready to land.Jul 4 2023, 1:17 AM
harviniriawan closed this revision.Jul 4 2023, 2:44 AM