This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Try to fold uaddlv and uaddlp
ClosedPublic

Authored by jaykang10 on Jun 20 2023, 2:27 AM.

Details

Summary

GCC generates fewer instructions than LLVM for the intrinsic example below.

#include <arm_neon.h>

unsigned foo(uint16x8_t b) {
    return vaddlvq_u32(vpadalq_u16(vdupq_n_u32(0), b));
}

gcc output

foo:
	uaddlv	s31, v0.8h
	fmov	x0, d31
	ret

llvm output

foo:
	uaddlp	v0.4s, v0.8h
	uaddlv	d0, v0.4s
	fmov	x0, d0

We could do uaddlv(uaddlp(x)) ==> uaddlv(x).
After adding a TableGen pattern for this fold, the LLVM output is as follows.

foo:
	uaddlv	s0, v0.8h
	fmov	x0, d0
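
As a sanity check on the fold, here is a minimal scalar sketch in portable C. It is not part of the patch. The function names are hypothetical models of the instruction semantics, not NEON intrinsics or LLVM APIs. It assumes UADDLP adds adjacent unsigned 16-bit lanes into 32-bit lanes and UADDLV sums all lanes into a wider scalar.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of UADDLP Vd.4S, Vn.8H:
   add adjacent u16 pairs into four u32 lanes. */
void model_uaddlp_8h(const uint16_t in[8], uint32_t out[4]) {
    for (int i = 0; i < 4; ++i)
        out[i] = (uint32_t)in[2 * i] + (uint32_t)in[2 * i + 1];
}

/* Hypothetical scalar model of UADDLV Dd, Vn.4S:
   sum four u32 lanes into a u64. */
uint64_t model_uaddlv_4s(const uint32_t in[4]) {
    uint64_t sum = 0;
    for (int i = 0; i < 4; ++i)
        sum += in[i];
    return sum;
}

/* Hypothetical scalar model of UADDLV Sd, Vn.8H:
   sum eight u16 lanes into a u32. The largest possible sum is
   8 * 65535 = 524280, so it always fits in 32 bits. */
uint32_t model_uaddlv_8h(const uint16_t in[8]) {
    uint32_t sum = 0;
    for (int i = 0; i < 8; ++i)
        sum += in[i];
    return sum;
}

/* Asserts uaddlv(uaddlp(x)) == uaddlv(x) for one input vector. */
void check_fold(const uint16_t in[8]) {
    uint32_t pairs[4];
    model_uaddlp_8h(in, pairs);
    assert(model_uaddlv_4s(pairs) == (uint64_t)model_uaddlv_8h(in));
}
```

Because the eight-lane sum never exceeds 32 bits, the 64-bit result of the two-instruction sequence equals the zero-extended 32-bit result of the single uaddlv.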

Event Timeline

jaykang10 created this revision. Jun 20 2023, 2:27 AM
Herald added a project: Restricted Project. Jun 20 2023, 2:27 AM
jaykang10 requested review of this revision. Jun 20 2023, 2:27 AM
Herald added a project: Restricted Project. Jun 20 2023, 2:27 AM
dmgreen added inline comments. Jun 20 2023, 3:27 AM
llvm/lib/Target/AArch64/AArch64InstrInfo.td
6331

It is probably quite a minor point, but can you change this to a (v4i32 (SUBREG_TO_REG (i64 0), (UADDLVv8i16v V128:$op), ssub)). The EXTRACT_SUBREG is using the fact that the higher lanes will be implicitly zeroed.

jaykang10 added inline comments. Jun 20 2023, 4:11 AM
llvm/lib/Target/AArch64/AArch64InstrInfo.td
6331

Let me update the pattern.

jaykang10 updated this revision to Diff 532864. Jun 20 2023, 4:16 AM

Following @dmgreen's comment, updated the pattern.

dmgreen added inline comments. Jun 20 2023, 6:23 AM
llvm/lib/Target/AArch64/AArch64InstrInfo.td
6336

This one too, as it returns an h reg.

jaykang10 added inline comments. Jun 20 2023, 6:38 AM
llvm/lib/Target/AArch64/AArch64InstrInfo.td
6336

Sorry. Let me update the pattern.

jaykang10 updated this revision to Diff 532895. Jun 20 2023, 6:42 AM

Following @dmgreen's comment, updated the pattern.

dmgreen accepted this revision. Jun 20 2023, 6:42 AM

Thanks. LGTM

This revision is now accepted and ready to land. Jun 20 2023, 6:42 AM
This revision was landed with ongoing or failed builds. Jun 20 2023, 7:15 AM
This revision was automatically updated to reflect the committed changes.