This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][GlobalISel] Legalize bswap <2 x i16>
ClosedPublic

Authored by jroelofs on Jul 13 2021, 1:10 PM.

Diff Detail

Event Timeline

jroelofs created this revision.Jul 13 2021, 1:10 PM
jroelofs requested review of this revision.Jul 13 2021, 1:10 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 13 2021, 1:10 PM

marked as "WIP" because there are a few extraneous copies:

$ cat bswap.ll
; RUN: llc -mtriple=arm64-apple-ios < %s | FileCheck %s

define void @test1(<2 x i16>* %p) {
  %in = load <2 x i16>, <2 x i16>* %p
  %out = call <2 x i16> @llvm.bswap.v2i16(<2 x i16> %in)
  store <2 x i16> %out, <2 x i16>* %p
  ret void
}

declare <2 x i16> @llvm.bswap.v2i16(<2 x i16>) nounwind readnone

GISel:

$ ./bin/llc -global-isel=1 -global-isel-abort=1 bswap.ll -mtriple=arm64-apple-ios -o -
	.section	__TEXT,__text,regular,pure_instructions
	.globl	_test1                          ; -- Begin function test1
	.p2align	2
_test1:                                 ; @test1
	.cfi_startproc
; %bb.0:
	ldr	h0, [x0]
	ldr	h1, [x0, #2]
	mov.h	v0[1], v1[0]
	fmov	w8, s0
	mov.s	v0[0], w8
	rev32.8b	v0, v0
	ushr.2s	v0, v0, #16
	fmov	s0, s0
	mov	h1, v0[1]
	str	h0, [x0]
	str	h1, [x0, #2]
	ret
	.cfi_endproc
                                        ; -- End function
.subsections_via_symbols

SDAG:

$ ./bin/llc -global-isel=0 -global-isel-abort=1 bswap.ll -mtriple=arm64-apple-ios -o -
	.section	__TEXT,__text,regular,pure_instructions
	.globl	_test1                          ; -- Begin function test1
	.p2align	2
_test1:                                 ; @test1
	.cfi_startproc
; %bb.0:
	ld1.h	{ v0 }[0], [x0]
	add	x8, x0, #2                      ; =2
	ld1.h	{ v0 }[2], [x8]
	rev32.8b	v0, v0
	ushr.2s	v0, v0, #16
	mov.s	w8, v0[1]
	fmov	w9, s0
	strh	w9, [x0]
	strh	w8, [x0, #2]
	ret
	.cfi_endproc
                                        ; -- End function
.subsections_via_symbols
paquette added inline comments.Jul 13 2021, 1:23 PM
llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
1024

Can you use G_REV32 here instead? (see: AArch64InstrGISel.td)

I think it's preferable to keep the results of legalization as generic as possible.

1029

Is it possible to use a generic instruction here?

jroelofs updated this revision to Diff 358485.Jul 13 2021, 6:39 PM

Closer.

Use more generic opcodes during legalization.

jroelofs updated this revision to Diff 358487.Jul 13 2021, 6:41 PM
jroelofs marked 2 inline comments as done.

fixup test checks

paquette added inline comments.Jul 13 2021, 9:07 PM
llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
3999 ↗(On Diff #358487)

This looks similar to getSubRegForClass? Maybe it's possible to share some code there?

paquette added inline comments.Jul 14 2021, 11:37 AM
llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
3999 ↗(On Diff #358487)

Also, it'd be nice to have a specific test for G_UNMERGE_VALUES selection. I think that this part + that test could be moved into an independent patch.

jroelofs marked an inline comment as done.Jul 14 2021, 12:47 PM
jroelofs added inline comments.
llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
3999 ↗(On Diff #358487)

good idea

3999 ↗(On Diff #358487)
jroelofs updated this revision to Diff 358712.Jul 14 2021, 1:11 PM
jroelofs marked an inline comment as done.

Getting closer:

SDAG:

$ ./bin/llc -global-isel=0 -global-isel-abort=1 bswap.ll -mtriple=arm64-apple-ios -o -
	.section	__TEXT,__text,regular,pure_instructions
	.globl	_test1                          ; -- Begin function test1
	.p2align	2
_test1:                                 ; @test1
	.cfi_startproc
; %bb.0:
	ld1.h	{ v0 }[0], [x0]
	add	x8, x0, #2                      ; =2
	ld1.h	{ v0 }[2], [x8]
	rev32.8b	v0, v0
	ushr.2s	v0, v0, #16
	mov.s	w8, v0[1]
	fmov	w9, s0
	strh	w9, [x0]
	strh	w8, [x0, #2]
	ret
	.cfi_endproc
                                        ; -- End function
.subsections_via_symbols

GISel:

$ ./bin/llc -global-isel=1 -global-isel-abort=1 bswap.ll -mtriple=arm64-apple-ios -o -
	.section	__TEXT,__text,regular,pure_instructions
	.globl	_test1                          ; -- Begin function test1
	.p2align	2
_test1:                                 ; @test1
	.cfi_startproc
; %bb.0:
	ldr	h0, [x0]
	ldr	h1, [x0, #2]
	mov.h	v0[1], v1[0]
	mov.s	v0[0], v0[0]
	rev32.8b	v0, v0
	ushr.2s	v0, v0, #16
	mov	h1, v0[1]
	str	h0, [x0]
	str	h1, [x0, #2]
	ret
	.cfi_endproc
                                        ; -- End function
.subsections_via_symbols
jroelofs updated this revision to Diff 358783.Jul 14 2021, 4:57 PM
jroelofs retitled this revision from WIP: [AArch64][GlobalISel] Legalize bswap <2 x i16> to [AArch64][GlobalISel] Legalize bswap <2 x i16>.

SDAG:

$ ./bin/llc -global-isel=0 -global-isel-abort=1 bswap.ll -mtriple=arm64-apple-ios -o -
	.section	__TEXT,__text,regular,pure_instructions
	.globl	_test1                          ; -- Begin function test1
	.p2align	2
_test1:                                 ; @test1
	.cfi_startproc
; %bb.0:
	ld1.h	{ v0 }[0], [x0]
	add	x8, x0, #2                      ; =2
	ld1.h	{ v0 }[2], [x8]
	rev32.8b	v0, v0
	ushr.2s	v0, v0, #16
	mov.s	w8, v0[1]
	fmov	w9, s0
	strh	w9, [x0]
	strh	w8, [x0, #2]
	ret
	.cfi_endproc
                                        ; -- End function
.subsections_via_symbols

GISel:

$ ./bin/llc -global-isel=1 -global-isel-abort=1 bswap.ll -mtriple=arm64-apple-ios -o -
	.section	__TEXT,__text,regular,pure_instructions
	.globl	_test1                          ; -- Begin function test1
	.p2align	2
_test1:                                 ; @test1
	.cfi_startproc
; %bb.0:
	ldr	h0, [x0]
	ldr	h1, [x0, #2]
	mov.h	v0[1], v1[0]
	rev32.8b	v0, v0
	ushr.2s	v0, v0, #16
	mov	h1, v0[1]
	str	h0, [x0]
	str	h1, [x0, #2]
	ret
	.cfi_endproc
                                        ; -- End function
.subsections_via_symbols
paquette added inline comments.Jul 15 2021, 10:02 AM
llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
1016

Do you think you could add an explanation somewhere for why the v2s16 case is special?

paquette accepted this revision.Jul 15 2021, 1:58 PM

Aside from the comment, I think this looks pretty good at this point? The codegen differences between SDAG and GISel seem unrelated to the bswap at this point.

This revision is now accepted and ready to land.Jul 15 2021, 1:58 PM
jroelofs updated this revision to Diff 359504.Jul 16 2021, 5:34 PM

Explain why <2 x half> is weird, and why we're not directly selecting the instructions we want during legalization.

jroelofs marked an inline comment as done.Jul 16 2021, 5:34 PM
paquette accepted this revision.Jul 16 2021, 8:26 PM

Thanks!

This revision was landed with ongoing or failed builds.Jul 17 2021, 4:46 PM
This revision was automatically updated to reflect the committed changes.