This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Add BIT/BIF support.
ClosedPublic

Authored by ilinpv on Feb 6 2020, 11:00 AM.

Details

Summary

This patch added generation of SIMD bitwise insert BIT/BIF instructions.
In the absence of GCC-like functionality for optimal constraints satisfaction during register allocation
the bitwise insert and select patterns are matched by pseudo bitwise select BSP instruction with not tied def.
It is expanded later after register allocation with def tied to BSL/BIT/BIF depending on operands registers.
This allows to get rid of redundant moves.

Diff Detail

Event Timeline

ilinpv created this revision.Feb 6 2020, 11:00 AM

Would you mind re-uploading with some more context in the diff please?

ilinpv updated this revision to Diff 243118.Feb 7 2020, 2:50 AM

More diff context added

dmgreen edited reviewers, added: dmgreen; removed: greened.Feb 7 2020, 9:56 AM
dmgreen added subscribers: greened, dmgreen.

Can you update with all the context? -U999999. It helps to be able to see the entire file here in phabricator, which would not otherwise be available.

For these tests that have changed/been added, can you run the update_llc_test_checks scripts on them all and commit that as a separate initial (NFC) patch? That way the differences here are more obvious. It looks like a nice improvement, but it's hard to tell when the test just says "CHECK: bsl"

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
1346

Is this ever generated from anything?

AArch64ISD::BSL and AArch64ISD::BIT seem to be created from ISelLowering. Would it make sense to convert them to use BSLP as well? We could then presumably remove a lot of the explicit patterns for BSL, BIT and BIF.

llvm/lib/Target/AArch64/AArch64InstrFormats.td
5213

We should make sure that the schedules have info on BSLP, if they previously had explicit info for BSP/BIT/BIF.

ilinpv marked 2 inline comments as done.Feb 12 2020, 4:18 AM
ilinpv added inline comments.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
1346

It is not generated, just a manual target node name. BSL and BIF can be converted to use BSLP. However using BSLP for BIT in ISelLowering can lead to additional move generation on expansion, see vector-fcopysign as example.

llvm/lib/Target/AArch64/AArch64InstrFormats.td
5213

Schedules using explicit regex "^(BIF|BIT|BSL)". As I understand it schedules passes happen after pseudo expansion, so at schedule stage we have all BSLP expanded to BIF/BIT/BSL.

ilinpv updated this revision to Diff 244251.Feb 12 2020, 1:00 PM
ilinpv edited the summary of this revision. (Show Details)

Diff amended in accordance with comments.

dmgreen added inline comments.Feb 13 2020, 1:45 AM
llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp
464–469

Do you know if there are any test cases that test this "extra move" path?

llvm/lib/Target/AArch64/AArch64InstrFormats.td
5213

I think we run scheduling both pre-ra and post-ra. But that might depend on the target. So the pre-ra scheduling will still see the BSLP's. (I think you are right about post-ra though, those will see the "real" instruction.)

llvm/lib/Target/AArch64/AArch64InstrInfo.td
3967

AArch64bslp can be null_frag here and in BSL.

3987–4039

Most of these patterns can be removed I think. We need patterns for
TriOpFrag<(or (and ... -> BLSP (which are handled in the SIMDLogicalThreeVectorPseudo definnition)
AArch64bslp -> BSLP (for which we need these extra patterns).
AArch64bit -> BIT, if we are keeping that one around (again handled inside SIMDLogicalThreeVectorTied)

The other shouldn't be needed unless I'm missing something.

ilinpv marked an inline comment as done.Feb 13 2020, 2:52 AM
ilinpv added inline comments.
llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp
464–469

Yes, CodeGen/AArch64/arm64-promote-const.ll for example

ilinpv updated this revision to Diff 244598.Feb 14 2020, 2:30 AM
ilinpv edited the summary of this revision. (Show Details)

Obsolete patterns removed, schedulers patched.

dmgreen accepted this revision.Feb 14 2020, 2:46 AM

Thanks for the scheduling changes. LGTM, with a couple of minor nitpicks.

llvm/lib/Target/AArch64/AArch64InstrInfo.td
3965

This line it quite long. Maybe try and split it up a little like the BIC above?

3988–4004

I believe that these patterns will already be included inside the SIMDLogicalThreeVectorTied BIT class, so aren't needed again here?

This revision is now accepted and ready to land.Feb 14 2020, 2:46 AM
ilinpv marked 2 inline comments as done.Feb 14 2020, 4:54 AM
ilinpv added inline comments.
llvm/lib/Target/AArch64/AArch64InstrInfo.td
3965

Sure

3988–4004

Yes, accidentally remained here.

This revision was automatically updated to reflect the committed changes.

Hello @ilinpv,

these changes break the tests on the expensive check builder
http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-ubuntu/builds/2884

*** Bad machine code: Explicit definition marked as use ***
- function:    test5
- basic block: %bb.0 entry (0x558983ebb6b8)
- instruction: ORRv16i8 $q0, killed renamable $q1, killed renamable $q1
- operand 0:   $q0

*** Bad machine code: Using an undefined physical register ***
- function:    test5
- basic block: %bb.0 entry (0x558983ebb6b8)
- instruction: ORRv16i8 $q0, killed renamable $q1, killed renamable $q1
- operand 0:   $q0

*** Bad machine code: Using an undefined physical register ***
- function:    test5
- basic block: %bb.0 entry (0x558983ebb6b8)
- instruction: renamable $q0 = BSLv16i8 $q0(tied-def 0), killed renamable $q2, killed renamable $q3
- operand 1:   $q0(tied-def 0)
LLVM ERROR: Found 3 machine code errors.

Ah. Kill Flags. Should hopefully be fixed up in da147ef0a5c6d31c21d31a52b97235a629830c15

Thanks for the report.

Fixed in dc0b8159890134a59fdf34b20e8b2052d9456441. All AArch64 expensive check tests passed.