This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Remove vector shift intrinsic with shift amount zero
ClosedPublic

Authored by jaykang10 on Jun 27 2023, 2:13 AM.

Details

Summary

It looks like gcc folds away a vector shift intrinsic with a zero shift amount, as in the example below.

#include <arm_neon.h>

inline void foo(int64x2_t a, int64x2_t b, int64_t *dst, int df) {
    int64x2_t df_s64 = vdupq_n_s64(df);
    a = vpaddq_s64(a, b); 
    a = vshlq_s64(a, df_s64);
    vst1q_s64(dst, a); 
} 

void bar(int64x2_t a, int64x2_t b, int64_t *dst) {
    foo(a, b, dst, 0); 
}

gcc output
bar:
	addp	v0.2d, v0.2d, v1.2d
	str	q0, [x0]
	ret

llvm output
bar:
	addp	v0.2d, v0.2d, v1.2d
	shl	v0.2d, v0.2d, #0
	str	q0, [x0]
	ret

It looks like llvm's AArch64 target lowers the intrinsic to a target-specific custom node in SelectionDAG and does not fold that node when the shift amount is zero.
With this patch, the llvm output is as follows.

bar:
	addp	v0.2d, v0.2d, v1.2d
	str	q0, [x0]
	ret
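
To see why dropping the shift is safe in this example, the computation can be modelled lane by lane in plain C. This is only a scalar sketch, not the patch's code: `vec2_s64`, `model_vpaddq_s64`, `model_sshl_lane` and `model_foo` are hypothetical names, and the shift model assumes shift amounts strictly between -64 and 64.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar stand-in for int64x2_t. */
typedef struct { int64_t lane[2]; } vec2_s64;

/* Model of vpaddq_s64: the low lane is the sum of a's lanes,
   the high lane is the sum of b's lanes (wrapping addition). */
static vec2_s64 model_vpaddq_s64(vec2_s64 a, vec2_s64 b) {
    vec2_s64 r;
    r.lane[0] = (int64_t)((uint64_t)a.lane[0] + (uint64_t)a.lane[1]);
    r.lane[1] = (int64_t)((uint64_t)b.lane[0] + (uint64_t)b.lane[1]);
    return r;
}

/* Model of one lane of vshlq_s64: a positive amount shifts left,
   a negative amount shifts right, and zero leaves the lane unchanged.
   Assumption: -64 < shift < 64. */
static int64_t model_sshl_lane(int64_t x, int shift) {
    assert(shift > -64 && shift < 64);
    if (shift == 0)
        return x;                           /* the case the patch folds away */
    if (shift > 0)
        return (int64_t)((uint64_t)x << shift);
    return x >> -shift;                     /* arithmetic shift on common compilers */
}

/* Model of foo() from the summary. */
static void model_foo(vec2_s64 a, vec2_s64 b, int64_t *dst, int df) {
    vec2_s64 r = model_vpaddq_s64(a, b);
    dst[0] = model_sshl_lane(r.lane[0], df);
    dst[1] = model_sshl_lane(r.lane[1], df);
}
```

With `df == 0`, `model_foo` stores exactly what `model_vpaddq_s64` produces, which matches the `addp` + `str` sequence above.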

Event Timeline

jaykang10 created this revision. Jun 27 2023, 2:13 AM
Herald added a project: Restricted Project. Jun 27 2023, 2:13 AM
jaykang10 requested review of this revision. Jun 27 2023, 2:13 AM

Thanks - looks mostly good to me. As I said before, there might be some advantage to doing this in instcombine so that the transform happens earlier, but this will be useful in the DAG too.

llvm/test/CodeGen/AArch64/arm64-vshift.ll
3491

Apparently this one is not correct, as the sqshlu will round the input even with a zero shift. The others look OK.
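
A scalar sketch of why a zero shift is not a no-op for sqshlu: as far as I understand the Arm description of the instruction, it takes signed input lanes and saturates the result to the unsigned range, so a negative lane becomes 0 even when the shift amount is 0. The function name `model_sqshlu_lane` is hypothetical and the model covers only 64-bit lanes with shift amounts below 64.

```c
#include <assert.h>
#include <stdint.h>

/* Model of one 64-bit lane of sqshlu (signed input, unsigned saturating result).
   Assumption: shift < 64. */
static uint64_t model_sqshlu_lane(int64_t x, unsigned shift) {
    assert(shift < 64);
    if (x <= 0)
        return 0;                       /* negative inputs saturate to 0 */
    uint64_t ux = (uint64_t)x;
    /* If any set bit would be shifted out, saturate to the maximum. */
    if (shift != 0 && (ux >> (64 - shift)) != 0)
        return UINT64_MAX;
    return ux << shift;
}
```

Under this model, `model_sqshlu_lane(-1, 0)` yields 0 rather than the input bit pattern, so removing the instruction would change the result for negative lanes.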

jaykang10 added inline comments. Jun 27 2023, 8:10 AM
llvm/test/CodeGen/AArch64/arm64-vshift.ll
3491

Ah... I did not know that.
If possible, could you please let me know where it is documented that sqshlu will round the input even with a zero shift?

jaykang10 updated this revision to Diff 534998. Jun 27 2023, 8:21 AM

Updated the code following @dmgreen's comment.

dmgreen accepted this revision. Jun 28 2023, 4:30 AM

Thanks. LGTM

This revision is now accepted and ready to land. Jun 28 2023, 4:30 AM