This is an archive of the discontinued LLVM Phabricator instance.

Implement aarch64 neon instruction class AdvSIMD (by element) - LLVM
ClosedPublic

Authored by Jiangning on Sep 25 2013, 2:03 AM.

Details

Reviewers
t.p.northover

Diff Detail

Event Timeline

Hi Jiangning,

Quite a lot of comments, I'm afraid, but mostly just me moaning about names (which probably makes it worse). I think the important ones are:

  1. Why are NEON_DUPIMM and NEON_VDUP separate? Actually, why is NEON_VDUP even in this patch since you don't use it?
  2. I don't think we should be forming fma without fp-contract=fast but you seem to be checking it in your tests. That's worrying.

Cheers.

Tim.

lib/Target/AArch64/AArch64ISelLowering.h
144–145

I think this has semantics identical to the existing NEON_DUPIMM doesn't it? A large part of the reason I pushed for that solution was that it could be shared in these instructions.

NEON_DUP is probably the best name here. "NEON" already implies that a duplicate will involve vectors and AArch64 instructions don't have a 'V' prefix.

lib/Target/AArch64/AArch64InstrFormats.td
1017–1018

Braces normally on the same line.

lib/Target/AArch64/AArch64InstrNEON.td
1569–1572

The antonym of "top" is "bottom". I'm not overly struck on either, but we should probably try to be consistent.

3275

The documentation still uses "Rm" for this register, doesn't it? Inventing "Re" seems confusing.

3286

(Not a comment on this patch, I don't have a good solution. But we really need think about what to do with these instruction format names. Perhaps pass them through MD5 to make them more readable or something.)

3443–3444

'q' means nothing on AArch64 NEON. It's like writing "pattern for lane with ymm".

3662

Perhaps "-ffp-contract=fast generates fma". "enforces to generate" doesn't work in English.

3986

What's this got to do with an fma? The "f" stands for either "float" or "fused", neither of which applies here.

4000

Likewise

test/CodeGen/AArch64/neon-2velem.ll
355–357

None of these should be forming an fmla without fp-contract=fast, should they?

Jiangning updated this revision to Unknown Object (????).Sep 27 2013, 2:49 AM

Hi Tim,

Overall, I accept almost all of your comments. Particularly, I removed the patterns for generating fma when -ffp-contract=fast is unavailable. I also removed Neon_DUP from my patch.

For register definition 'Re', I didn't make change. Re is different from Rm and the Rm in documentation for this instruction class is different from the Rm defined for A64Inst. Rm is being defined in llvm as Inst{20-16}, while Re is not, because Inst{20} is being shared for both Rm and bit 'M'. This is also why the index is restricted to V0-V15 for vector with H element, I think.

Thanks,
-Jiangning

lib/Target/AArch64/AArch64ISelLowering.h
144–145

OK. I will remove NEON_VDUP, and Kevin will add it back in his new patch.

lib/Target/AArch64/AArch64InstrFormats.td
1017–1018

OK. I will reformat this one as well as similar ones.

lib/Target/AArch64/AArch64InstrNEON.td
1569–1572

OK. I will not use bottom, because it's longer. I will use High instead.

3275

Re is different from Rm and the Rm in documentation for this instruction class is different from the Rm defined for A64Inst. Rm is being defined in llvm as Inst{20-16}, while Re is not, because Inst{20} is being shared for both Rm and bit 'M'. This is also why the index is restricted to V0-V15 for vector with H element, I think.

3286

I don't have solution either. I'm trying to make differences for different patterns by describing 1) argument types 2) encoding variants. The encoding variants are purely caused by encoding limitation, so it's really hard to give an even meaningful name.

3443–3444

OK. I will change "without q' to 'in 64-bit vector', and change 'with q' to '128-bit vector'.

3662

OK. I thought I wrote 'force' rather than 'enforce'. :-) Anyway, I will change to the wording you gave.

3986

OK. I will remove letter 'f'.

4000

Likewise

test/CodeGen/AArch64/neon-2velem.ll
355–357

Yes. I will remove the test without -ffp-contract=fast.

Hi Jiangning,

Sorry I didn't get a chance to reply on Friday, and thanks for
reworking the patch.

For register definition 'Re', I didn't make change.

Fair enough.

I think it looks pretty much OK now, though I'd be happier if there
was a test that we *weren't* generating fmla instructions all the time
(perhaps with a comment about it being intentional). It's a very
tempting "optimisation" to make.

I think you should commit, with or without that though. I'll take a
look over the revision after-wards.

Cheers.

Tim.

Jiangning updated this revision to Unknown Object (????).Oct 1 2013, 6:46 AM

Tim,

I added some new pattern matches to support the fms_lane intrinsics.

Thanks,
-Jiangning

lib/Target/AArch64/AArch64InstrNEON.td
3687

Some new patterns are added to capture a + b * (-c).

t.p.northover accepted this revision.Apr 3 2014, 4:49 AM
Eugene.Zelenko closed this revision.Oct 4 2016, 6:47 PM
Eugene.Zelenko added a subscriber: Eugene.Zelenko.

Committed in rL191944.