Page MenuHomePhabricator

[AArch64][SVE] Add intrinsics for first-faulting gather loads

Authored by andwar on Feb 25 2020, 9:31 AM.



The following intrinsics are added:

  • @llvm.aarch64.sve.ldff1.gather
  • @llvm.aarch64.sve.ldff1.gather.index
  • @llvm.aarch64.sve.ldff1.gather_sxtw
  • @llvm.aarch64.sve.ldff1.gather.uxtw
  • @llvm.aarch64.sve.ldff1.gather_sxtw.index
  • @llvm.aarch64.sve.ldff1.gather.uxtw.index
  • @llvm.aarch64.sve.ldff1.gather.scalar.offset

Although this patch is quite substantial, the vast majority of the
implementation is just a 'copy & paste' of the implementation of regular
gather loads, including tests. There's only a handful of new

  • AArch64ISD nodes defined in AArch64ISelLowering.h (e.g. GLDFF1)
  • Seleciton DAG Types in (e.g. AArch64ldff1_gather)
  • intrinsics in (e.g. aarch64_sve_ldff1_gather)

Diff Detail

Event Timeline

andwar created this revision.Feb 25 2020, 9:31 AM
Herald added a reviewer: efriedma. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript

This is almost literally copy & paste of the implementation of the regular gather loads, hence submitting as one patch (there are only few meaningful changes). I generated the tests by copying the corresponding tests for regular gather loads (I did update the opcodes though).

IIUC, this should be sufficient. I couldn't find any differences between first-faulting and non-faulting gathers that would affect this patch.

These tests will currently fail when run with expensive checks (or alternatively, when running these tests with --verify-machineinstrs) because the FFR will be undefined.
For the contiguous variants of ldff1, we've worked around this by emitting a Pseudo with let hasSideEffects = 1 to model the possible changes to the FFR. The Pseudo is eventually expanded to the instruction. See D73025 for details.

andwar updated this revision to Diff 246662.Feb 26 2020, 3:31 AM

@sdesmalen Thank you for reviewing! Yes, the tests were failing with expensive checks enabled. Great catch, thank you!

I've added the workaround (similar to what was implemented in to all the affected multiclasses. It sill
feels like a fairly mechanical change, so I've kept it in a single patch. I
am happy to split this into multiple patches if you believe that that would
make the change clearer.

sdesmalen accepted this revision.Feb 26 2020, 5:30 AM

LGTM! Thanks @andwar

This revision is now accepted and ready to land.Feb 26 2020, 5:30 AM
This revision was automatically updated to reflect the committed changes.