This is an archive of the discontinued LLVM Phabricator instance.

[ARM] ACLE Chapter 9 intrinsics
ClosedPublic

Authored by samparker on Apr 20 2017, 2:37 AM.

Details

Summary

Added the integer data processing intrinsics from Chapter 9 of the ACLE v2.1 and modified existing ones to include a chain if they write to either the Q or GE flags of the APSR.
The document can be found at: http://infocenter.arm.com/help/topic/com.arm.doc.ihi0053d/IHI0053D_acle_2_1.pdf

Diff Detail

Repository
rL LLVM

Event Timeline

samparker created this revision.Apr 20 2017, 2:37 AM
samparker edited the summary of this revision. (Show Details)

Hi Sam,

This is a big patch, with lots of smaller changes within. Normally, I'd ask you to split into smaller patches, but realistically, all of the changes have the exact same effect, are reasonably obvious and common. Your new tests also show that they work as expected, without breaking old assumptions (no old tests changed). From that perspective, it looks good to me, but I'll let it simmer for a bit, if someone else has some more comments.

Note: I'm assuming the "IntrNoMem" declaration in some intrinsics have to do with the "HasSideEffects" on the same bundle, but I haven't checked every single instruction to make sure they are strictly correlated.

cheers,
--renato

test/CodeGen/ARM/smul.ll
294 ↗(On Diff #95909)

Is this really necessary?

Cheers Renato. Sjoerd will give it a second pair of eyes.

samparker updated this revision to Diff 95917.Apr 20 2017, 3:07 AM

removed newline from test.

I'm not convinced you're handling the Q and GE flags correctly for all of the intrinsics which set/use them. If we allow users to access those flags, every instruction which might set them needs to be mapped to a SelectionDAG node with a chain, every instruction which sets or uses them needs to be marked hasSideEffects, and we can never turn arithmetic into an operation which sets Q or GE. I'm not convinced you've managed that correctly everywhere. We currently have an ARMISD::SSAT node which doesn't have a chain. We have a pattern which generates SMLABB from an ISD::MUL. And it's difficult to review the rest because you're adding them all at once.

include/llvm/IR/IntrinsicsARM.td
87 ↗(On Diff #95917)

IntrReadMem?

lib/Target/ARM/ARMInstrInfo.td
877 ↗(On Diff #95917)

ImmLeaf?

Hi Eli,

As far as I'm aware, these instructions do not read the Q flag, and I discussed why I haven't added the intrinsic that would enable the user to read that flag in D32282. This obviously doesn't stop the user from doing so in inline assembly though. As for the GE flags, I believe the only instruction that reads that flag is sel which is only selected via the intrinsic. So if I've got these descriptions correct, I think the chaining should be good. I will double check my descriptions though.

cheers,
sam

samparker updated this revision to Diff 96104.Apr 21 2017, 12:23 AM

I had missed a few instructions that write to the GE flag and have now also set the sel intrinsic to IntrReadMem.

Either we're trying to support reading the Q flag, or we're not trying to support it. If we are trying to support it, we should do it correctly; if we aren't trying to support it, we might as well mark the intrinsics which write to it as IntrNoMem. There isn't really a middle ground here.

Also, I'd like to see comments on each of the intrinsics which isn't IntrNoMem describing which bits it reads/writes.

Ok, sounds fair enough.

samparker updated this revision to Diff 96369.Apr 24 2017, 3:24 AM

Added IntrNoMem to Q bit altering intrinsics so now only GE intrinsics are chained.

efriedma added inline comments.Apr 24 2017, 12:37 PM
lib/Target/ARM/ARMISelLowering.cpp
3382 ↗(On Diff #96369)

You might want to add a comment noting that there's a bit of magic to handle this in LegalizeTypes; it looks weird to replace a node which returns i64 with a node that returns (i32, i32).

lib/Target/ARM/ARMInstrInfo.td
5736 ↗(On Diff #96369)

ARMV5TEPat? Same in other places; using plain "Pat" is a bad idea because we can end up with the wrong instruction in Thumb mode (and your tests wouldn't catch this because you don't check the encoding).

877 ↗(On Diff #95917)

ImmLeaf?

samparker updated this revision to Diff 96687.Apr 26 2017, 2:08 AM

Fixed up the ARM patterns to use predicates, which also showed that I had missed several patterns for the Thumb versions! These have now been added.

lib/Target/ARM/ARMInstrInfo.td
5736 ↗(On Diff #96369)

ah yes, thanks!

efriedma added inline comments.Apr 26 2017, 11:05 AM
lib/Target/ARM/ARMInstrThumb2.td
2851 ↗(On Diff #96687)

t2SMLALD? (Same in other places.)

samparker updated this revision to Diff 96913.Apr 27 2017, 6:43 AM

Thanks Eli, I've now corrected the multiplication intrinsic patterns to select the Thumb version of the instruction and also added thumbv7em as a target to the test.

It would be nice to have an ARMV5TE test to make sure that the intrinsics that are supposed to work for that target actually do.

lib/Target/ARM/ARMISelLowering.cpp
7795 ↗(On Diff #96913)

Assuming that LowerINTRINSIC_WO_CHAIN returns a value with exactly two results seems a little dubious; it might work here, but it's a landmine for anyone dealing with this code in the future. Better to just make it a separate method which pushes onto the Results vector directly.

samparker updated this revision to Diff 97583.May 3 2017, 3:42 AM

Hi Eli,

I've created a standalone function to lower the i64 intrinsics because, yes it was pretty horrible before. The V5TE compatible intrinsics have now been separated out into another test file.

cheers,

efriedma accepted this revision.May 3 2017, 11:31 AM

LGTM.

This revision is now accepted and ready to land.May 3 2017, 11:31 AM
This revision was automatically updated to reflect the committed changes.
llvm/trunk/test/CodeGen/ARM/acle-intrinsics.ll