This is an archive of the discontinued LLVM Phabricator instance.

[Mips] Add support to match more patterns for DEXT and CINS
ClosedPublic

Authored by violetav on Feb 28 2017, 8:55 AM.

Details

Summary

This patch adds support for recognizing more patterns to match to DEXT and CINS instructions.
It finds cases where multiple instructions could be replaced with a single DEXT or CINS instruction. For example, for the following:

define i64 @dext_and32(i64 zeroext %a) {
entry:

%and = and i64 %a, 4294967295
ret i64 %and

}

instead of generating:

0000000000000088 <dext_and32>:

88:   64010001        daddiu  at,zero,1
8c:   0001083c        dsll32  at,at,0x0
90:   6421ffff        daddiu  at,at,-1
94:   03e00008        jr      ra
98:   00811024        and     v0,a0,at
9c:   00000000        nop

the following gets generated:

0000000000000068 <dext_and32>:

68:   03e00008        jr      ra
6c:   7c82f803        dext    v0,a0,0x0,0x20

Cases that are covered:
DEXT:

  1. and $src, mask where mask > 0xffff
  2. zext $src zero extend from i32 to i64

CINS:

  1. and (shl $src, pos), mask
  2. shl (and $src, mask), pos
  3. zext (shl $src, pos) zero extend from i32 to i64

Diff Detail

Repository
rL LLVM

Event Timeline

violetav created this revision.Feb 28 2017, 8:55 AM
violetav updated this revision to Diff 90129.Mar 1 2017, 1:46 AM

Uploaded the patch with more context.

sdardis requested changes to this revision.Mar 6 2017, 6:30 AM

A few issues, mostly formatting and the correct predicates on the instructions. From my reading of the cnMIPS instruction set, the cins and cins32 instruction require that we're compiling for MIPS64 as they require access to 64 bit operations, so these instructions and patterns need to have ASE_MIPS64_CNMIPS. Additionally, they--along with the dext pattern--needs to be marked as NotInMicroMips through the AdditionalPredicates mechanism.

microMIPS64R6 does have the dext family of instructions but lacks the necessary instruction mappings, so if the instruction was selected by ISel because the pattern matched, the resulting code would be only partially translated to microMIPS64R6. This isn't a problem if the result is an assembly which is then feed back into clang or an external assembler, as the correct instruction would be parsed. It is a problem if direct object emission is used as the resulting object would have instructions from two different ISAs without the correct mode switch.

The new test cases can be cut down by getting rid of the metadata and annotations that clang produces when compiling to llvm-ir.

lib/Target/Mips/Mips64InstrInfo.td
334–336

Formatting. Align the (ins) with the out above, when you have to break the line for the (ins), align the parameters of the (ins) under one another. Align the assembly text with (outs) and the rest of the parameters to InstSE with the outs when a line break is needed. The EXT_FM<3> or ISA_MIPS64R2 is aligned with the InstSE as required.

This requires AdditionalPredicates = [NotInMicroMips].

437–438

Align the EXTS_FM with ExtsCins and ASE_CNMIPS should be ASE_MIPS64_CNMIPS.

437–460

All these defs need to be covered with AdditionalPredicates = [NotInMicroMips].

438–460

Align the EXTS_FM with ExtsCins and ASE_CNMIPS should be ASE_MIPS64_CNMIPS.

444

Align the EXTS_FM with ExtsCins. This definition also requires ASE_MIPS64_CNMIPS.

446–448

Align these lines to under the (outs). This definition requires ASE_MIPS64_CNMIPS.

675–677

Use ISA_MIPS64R2 rather than AdditionalPredicates. E.g. this should be:

def : MipsPat<(i64 (zext GPR32:$src)), (DEXT64_32 GPR32:$src, 0, 32)>, ISA_MIPS64R2;

The AdditionalPredicates should be NotInMicroMips.

678–681

This needs ASE_MIPS64_CNMIPS. The AdditionalPredicates should have NotInMicroMips. The two pattern definitions can be covered with the one "let AdditionalPredicates = [NotInMicroMips]" scope.

lib/Target/Mips/MipsISelLowering.cpp
947–949

You can fold Opc = MipsISD::CIns into it's usage here.

test/CodeGen/Mips/cins.ll
2

See my comments about dext.ll.

test/CodeGen/Mips/dext.ll
2

Change this runline to match the ones found under test/CodeGen/Mips/, that is the architecture, cpu and ABI are specified on the run line.

4–7

Drop all this after you've made the above change as it'll be irrelevant.

9

After you've re-written @dext_index1, you can drop this.

11–12

Can you move the CHECKs into the function near the IR that they are testing the codegen of? It makes it clearer to see what output is expected for a given input.

14

These types of lines can be removed as they're unnecessary for the purposes of this test.

15

local_unnamed_addr #0 can be removed as well.

15–30

Rewrite this so that it takes a i32 signext argument and returns a i64 zext result, the the body can just be

%res = zext i32 %n to i64
ret i64 %res
35–43

This test is redundant as the patterns it's likely to produce are covered by the case above.

138–139

Add whatever relevant attributes are required onto the runline instead of relying on function attributes and remove these attribute declarations.

141–147

All of this can be removed as it's irrelevant.

141–147

This is irrelevant and can be removed.

test/CodeGen/Mips/load-store-left-right.ll
12

This is inconsistent with the little endian test above. Also, there are no check lines for MIPS64R2-EB.

186–188

Add the MIPS64R2-EB variant.

487–489

MIPS64R2-EB produces a dext here, so test for that.

test/MC/Mips/sext_64_32.ll
14–15

Add a CHECK-LABEL: foo_2: before this.

This revision now requires changes to proceed.Mar 6 2017, 6:30 AM
violetav updated this revision to Diff 91179.Mar 9 2017, 9:02 AM
violetav edited edge metadata.

Addressed comments. Thank you for the review.

violetav added inline comments.Mar 9 2017, 9:03 AM
test/CodeGen/Mips/dext.ll
15–30

There is already a test (dext_zext) with the same body that you are proposing, so I removed the dext_index test that produces the same pattern. As for the dext_index1 test, it produces a different pattern (I simplified the body and renamed the test 'dext_add_zext'). The difference is that in the first case, the 'zero_extend' node gets transformed to 'and' node which is then further combined to dext, and in the other case, the zero_extend node doesn't get transformed to 'and' and later DEXT64_32 gets selected.

sdardis accepted this revision.Mar 13 2017, 9:28 AM

Thanks, LGTM with inline nits addressed.

lib/Target/Mips/Mips64InstrInfo.td
438–459

Indent the contents of the "let AdditionalPredicates = [NotInMicroMips} in" by 2 spaces.

lib/Target/Mips/MipsInstrInfo.td
1176

Overly long line, reformat this to match the formatting of the line above.

This revision is now accepted and ready to land.Mar 13 2017, 9:28 AM
violetav updated this revision to Diff 91585.Mar 13 2017, 10:30 AM

Addressed comments.

violetav marked 2 inline comments as done.Mar 13 2017, 11:04 AM
violetav updated this revision to Diff 91751.Mar 14 2017, 11:05 AM

Rebased patch.

violetav updated this revision to Diff 91862.Mar 15 2017, 5:12 AM

Added missing CHECKs in load-store-left-right.ll test.

This revision was automatically updated to reflect the committed changes.