[AArch64] Predicate SSHLL;SCVTF patterns behind UseAlternateSExtLoadCVTF32
Closed · Public

Authored by dmgreen on May 12 2022, 8:00 AM.

Details

Summary

There have been some patterns in the AArch64 backend to optimize code of the form:

ldrsh w8, [x0]
scvtf s0, w8

to:

ldr h0, [x0]      
sshll v0.4s, v0.4h, #0
scvtf s0, s0

The idea is to remove the GPR->FPR move, but in practice this makes the code larger and slower (or the same) on all the CPUs I tried.

This patch puts these patterns behind the UseAlternateSExtLoadCVTF32 predicate, similar to a nearby related pattern.
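
For context, a source-level function along the following lines is the kind of input that would typically lower to the first sequence above (a sign-extending halfword load followed by scvtf) on AArch64. This is only an illustrative sketch: the function name is hypothetical and is not taken from the patch or its tests, and the exact instructions emitted depend on the compiler version, target CPU and optimization level.

```c
#include <stdint.h>

/* Hypothetical example: load a signed 16-bit value from memory and
 * convert it to single-precision float. The load needs a sign
 * extension to 32 bits before the integer-to-float conversion. */
float load_i16_to_float(const int16_t *p) {
    return (float)*p;
}
```

Compiling such a function for different AArch64 CPUs and comparing the assembly is one way to see which of the two sequences is selected.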

Event Timeline

dmgreen created this revision. May 12 2022, 8:00 AM
Herald added a project: Restricted Project. May 12 2022, 8:00 AM
dmgreen requested review of this revision. May 12 2022, 8:00 AM

Looks like a good fix. One quick question: I see that some CPUs have FeatureAlternateSExtLoadCVTF32Pattern set. Is that something we want too?

I don't believe so, no, not for that option. Not for any of the CPUs I tried, at least.

SjoerdMeijer accepted this revision. May 13 2022, 1:43 AM

Ok, cheers, LGTM

This revision is now accepted and ready to land. May 13 2022, 1:43 AM
This revision was landed with ongoing or failed builds. May 16 2022, 10:00 AM
This revision was automatically updated to reflect the committed changes.