This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][GlobalISel] Add G_EXT and select ext using it
ClosedPublic

Authored by paquette on Jun 8 2020, 3:35 PM.

Details

Summary

Add selection support for ext via a new opcode, G_EXT, and a post-legalizer combine which matches it.

Add an applyEXT function because the AArch64ext patterns require the immediate to be in a register. We therefore have to create a G_CONSTANT to use these patterns without writing new ones or modifying the existing ones.
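
For readers unfamiliar with ext, the following is a minimal standalone sketch of what the instruction computes on 64-bit vectors, assuming the usual AArch64 semantics (the immediate is a byte offset into the concatenation of the two sources, with Rn as the low half). The names modelExt8 and elementIndexToByteImm are hypothetical and are not part of this patch; they only illustrate why a shuffle's start element has to be scaled by the element size before it can be used as the immediate.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model of a 64-bit EXT (not LLVM code): Rn supplies the low
// 8 bytes of the concatenation and Rm the high 8 bytes. The result is the
// 8-byte window that starts at byte offset Imm.
std::array<uint8_t, 8> modelExt8(const std::array<uint8_t, 8> &Rn,
                                 const std::array<uint8_t, 8> &Rm,
                                 unsigned Imm) {
  assert(Imm < 8 && "byte offset must lie inside the first source");
  std::array<uint8_t, 8> Result{};
  for (std::size_t I = 0; I < 8; ++I) {
    std::size_t Src = Imm + I;
    // Bytes past the end of Rn come from the start of Rm.
    Result[I] = Src < 8 ? Rn[Src] : Rm[Src - 8];
  }
  return Result;
}

// Hypothetical helper: convert a start position expressed in vector
// elements into the byte offset used as the immediate.
unsigned elementIndexToByteImm(unsigned StartElt, unsigned EltSizeInBits) {
  return StartElt * (EltSizeInBits / 8);
}
```

Under these assumptions, a shuffle that starts at element 1 of a v2i32 vector corresponds to a byte immediate of 4, and that value is what would be materialized as the G_CONSTANT.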

Tests are the same as arm64-ext.ll.

For reference, here are the patterns we get with G_EXT:

(AArch64ext:{ *:[v1i64] } V64:{ *:[v1i64] }:$Rn, V64:{ *:[v1i64] }:$Rm, (imm:{ *:[i32] }):$imm)  =>  (EXTv8i8:{ *:[v1i64] } V64:{ *:[v1i64] }:$Rn, V64:{ *:[v1i64] }:$Rm, (imm:{ *:[i32] }):$imm)
(AArch64ext:{ *:[v1f64] } V64:{ *:[v1f64] }:$Rn, V64:{ *:[v1f64] }:$Rm, (imm:{ *:[i32] }):$imm)  =>  (EXTv8i8:{ *:[v1f64] } V64:{ *:[v1f64] }:$Rn, V64:{ *:[v1f64] }:$Rm, (imm:{ *:[i32] }):$imm)
(AArch64ext:{ *:[v2i32] } V64:{ *:[v2i32] }:$Rn, V64:{ *:[v2i32] }:$Rm, (imm:{ *:[i32] }):$imm)  =>  (EXTv8i8:{ *:[v2i32] } V64:{ *:[v2i32] }:$Rn, V64:{ *:[v2i32] }:$Rm, (imm:{ *:[i32] }):$imm)
(AArch64ext:{ *:[v2f32] } V64:{ *:[v2f32] }:$Rn, V64:{ *:[v2f32] }:$Rm, (imm:{ *:[i32] }):$imm)  =>  (EXTv8i8:{ *:[v2f32] } V64:{ *:[v2f32] }:$Rn, V64:{ *:[v2f32] }:$Rm, (imm:{ *:[i32] }):$imm)
(AArch64ext:{ *:[v2i64] } V128:{ *:[v2i64] }:$Rn, V128:{ *:[v2i64] }:$Rm, (imm:{ *:[i32] }):$imm)  =>  (EXTv16i8:{ *:[v2i64] } V128:{ *:[v2i64] }:$Rn, V128:{ *:[v2i64] }:$Rm, (imm:{ *:[i32] }):$imm)
(AArch64ext:{ *:[v2f64] } V128:{ *:[v2f64] }:$Rn, V128:{ *:[v2f64] }:$Rm, (imm:{ *:[i32] }):$imm)  =>  (EXTv16i8:{ *:[v2f64] } V128:{ *:[v2f64] }:$Rn, V128:{ *:[v2f64] }:$Rm, (imm:{ *:[i32] }):$imm)
(AArch64ext:{ *:[v4i16] } V64:{ *:[v4i16] }:$Rn, V64:{ *:[v4i16] }:$Rm, (imm:{ *:[i32] }):$imm)  =>  (EXTv8i8:{ *:[v4i16] } V64:{ *:[v4i16] }:$Rn, V64:{ *:[v4i16] }:$Rm, (imm:{ *:[i32] }):$imm)
(AArch64ext:{ *:[v4f16] } V64:{ *:[v4f16] }:$Rn, V64:{ *:[v4f16] }:$Rm, (imm:{ *:[i32] }):$imm)  =>  (EXTv8i8:{ *:[v4f16] } V64:{ *:[v4f16] }:$Rn, V64:{ *:[v4f16] }:$Rm, (imm:{ *:[i32] }):$imm)
(AArch64ext:{ *:[v4bf16] } V64:{ *:[v4bf16] }:$Rn, V64:{ *:[v4bf16] }:$Rm, (imm:{ *:[i32] }):$imm)  =>  (EXTv8i8:{ *:[v4bf16] } V64:{ *:[v4bf16] }:$Rn, V64:{ *:[v4bf16] }:$Rm, (imm:{ *:[i32] }):$imm)
(AArch64ext:{ *:[v4i32] } V128:{ *:[v4i32] }:$Rn, V128:{ *:[v4i32] }:$Rm, (imm:{ *:[i32] }):$imm)  =>  (EXTv16i8:{ *:[v4i32] } V128:{ *:[v4i32] }:$Rn, V128:{ *:[v4i32] }:$Rm, (imm:{ *:[i32] }):$imm)
(AArch64ext:{ *:[v4f32] } V128:{ *:[v4f32] }:$Rn, V128:{ *:[v4f32] }:$Rm, (imm:{ *:[i32] }):$imm)  =>  (EXTv16i8:{ *:[v4f32] } V128:{ *:[v4f32] }:$Rn, V128:{ *:[v4f32] }:$Rm, (imm:{ *:[i32] }):$imm)
(AArch64ext:{ *:[v8i8] } V64:{ *:[v8i8] }:$Rn, V64:{ *:[v8i8] }:$Rm, (imm:{ *:[i32] }):$imm)  =>  (EXTv8i8:{ *:[v8i8] } V64:{ *:[v8i8] }:$Rn, V64:{ *:[v8i8] }:$Rm, (imm:{ *:[i32] }):$imm)
(AArch64ext:{ *:[v8i8] } V64:{ *:[v8i8] }:$Rn, V64:{ *:[v8i8] }:$Rm, (imm:{ *:[i32] }):$imm)  =>  (EXTv8i8:{ *:[v8i8] } V64:{ *:[v8i8] }:$Rn, V64:{ *:[v8i8] }:$Rm, (imm:{ *:[i32] }):$imm)
(AArch64ext:{ *:[v8i16] } V128:{ *:[v8i16] }:$Rn, V128:{ *:[v8i16] }:$Rm, (imm:{ *:[i32] }):$imm)  =>  (EXTv16i8:{ *:[v8i16] } V128:{ *:[v8i16] }:$Rn, V128:{ *:[v8i16] }:$Rm, (imm:{ *:[i32] }):$imm)
(AArch64ext:{ *:[v8f16] } V128:{ *:[v8f16] }:$Rn, V128:{ *:[v8f16] }:$Rm, (imm:{ *:[i32] }):$imm)  =>  (EXTv16i8:{ *:[v8f16] } V128:{ *:[v8f16] }:$Rn, V128:{ *:[v8f16] }:$Rm, (imm:{ *:[i32] }):$imm)
(AArch64ext:{ *:[v8bf16] } V128:{ *:[v8bf16] }:$Rn, V128:{ *:[v8bf16] }:$Rm, (imm:{ *:[i32] }):$imm)  =>  (EXTv16i8:{ *:[v8bf16] } V128:{ *:[v8bf16] }:$Rn, V128:{ *:[v8bf16] }:$Rm, (imm:{ *:[i32] }):$imm)
(AArch64ext:{ *:[v16i8] } V128:{ *:[v16i8] }:$Rn, V128:{ *:[v16i8] }:$Rm, (imm:{ *:[i32] }):$imm)  =>  (EXTv16i8:{ *:[v16i8] } V128:{ *:[v16i8] }:$Rn, V128:{ *:[v16i8] }:$Rm, (imm:{ *:[i32] }):$imm)
(AArch64ext:{ *:[v16i8] } V128:{ *:[v16i8] }:$Rn, V128:{ *:[v16i8] }:$Rm, (imm:{ *:[i32] }):$imm)  =>  (EXTv16i8:{ *:[v16i8] } V128:{ *:[v16i8] }:$Rn, V128:{ *:[v16i8] }:$Rm, (imm:{ *:[i32] }):$imm)

Event Timeline

paquette created this revision. Jun 8 2020, 3:35 PM
aemerson added inline comments. Jun 11 2020, 6:38 PM
llvm/lib/Target/AArch64/GISel/AArch64PostLegalizerCombiner.cpp
355

You can just do:
auto Cst = IRBuilder.buildConstant(LLT::scalar(32), MatchInfo.SrcOps[2].getImm());

aemerson added inline comments. Jun 12 2020, 11:23 AM
llvm/lib/Target/AArch64/GISel/AArch64PostLegalizerCombiner.cpp
320

Tiny nit: if you're returning an optional instead of a bool, unlike the DAG version, I think it's better to name this getExtMask rather than isExtMask.
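
To illustrate the naming point, here is a standalone sketch of an optional-returning mask matcher. It is not the code from this patch: getExtMaskSketch and ExtMatch are hypothetical names, the mask is modelled as a plain vector of ints with negative values meaning undef, and the matching logic is a simplified take on the idea of recognizing a contiguous window over the two concatenated shuffle inputs.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical result type: where the window starts (in elements) and
// whether the two shuffle inputs must be swapped to express it as an ext.
struct ExtMatch {
  bool SwapOperands;
  unsigned StartElt;
};

// Returns the match details if Mask selects consecutive elements (modulo
// wrap-around) from the concatenation of two equally sized inputs, and
// std::nullopt otherwise. Negative mask entries are treated as undef.
std::optional<ExtMatch> getExtMaskSketch(const std::vector<int> &Mask) {
  const int NumElts = static_cast<int>(Mask.size());
  const int NumSrcElts = 2 * NumElts;

  // Find the first defined element; an all-undef mask gives no information.
  int First = 0;
  while (First < NumElts && Mask[First] < 0)
    ++First;
  if (First == NumElts || Mask[First] >= NumSrcElts)
    return std::nullopt;

  // Infer where the window would start if it began at position 0.
  const int Start =
      ((Mask[First] - First) % NumSrcElts + NumSrcElts) % NumSrcElts;
  assert(Start >= 0 && Start < NumSrcElts);

  // Every remaining defined element must continue the same window.
  for (int I = First + 1; I < NumElts; ++I) {
    if (Mask[I] < 0)
      continue;
    if (Mask[I] != (Start + I) % NumSrcElts)
      return std::nullopt;
  }

  // A window starting in the second input is the same as a window over
  // the swapped inputs starting NumElts earlier.
  if (Start < NumElts)
    return ExtMatch{false, static_cast<unsigned>(Start)};
  return ExtMatch{true, static_cast<unsigned>(Start - NumElts)};
}
```

Because the caller needs the start position and the swap flag, not just a yes/no answer, a get-style name describes the function better than an is-style one. Note that this sketch also accepts the identity mask (start element 0); a real combine would likely want to leave that case to other patterns.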

paquette updated this revision to Diff 270518. Jun 12 2020, 2:33 PM

Rename function and address nit.

Also update a couple of testcases I apparently forgot to run.

For the zip test, disable the ext combine. The ext combine has a higher priority than the zip combine, and G_EXT was being produced for a couple of the negative zip tests. Since those tests are supposed to check that we will not produce a G_ZIP in those situations, running the ext combine (and thus not exercising the zip code) is undesirable.

The splat test was broken because its input was already regbankselected, and the ext combine introduces new G_CONSTANTs.

This revision is now accepted and ready to land. Jun 15 2020, 11:48 AM
This revision was automatically updated to reflect the committed changes.