This is an archive of the discontinued LLVM Phabricator instance.

[PowerPC][altivec] Optimize codegen of vec_promote
ClosedPublic

Authored by lkail on Aug 21 2023, 11:13 PM.

Details

Summary

According to https://www.ibm.com/docs/en/xl-c-and-cpp-linux/16.1.1?topic=functions-vec-promote, elements not specified by the input index argument are undefined. So that we don't need to set these elements to be zeros.

Diff Detail

Event Timeline

lkail created this revision.Aug 21 2023, 11:13 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 21 2023, 11:13 PM
Herald added a subscriber: kbarton. · View Herald Transcript
lkail requested review of this revision.Aug 21 2023, 11:13 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptAug 21 2023, 11:13 PM
lkail updated this revision to Diff 552226.Aug 21 2023, 11:14 PM
lkail updated this revision to Diff 552251.Aug 22 2023, 1:02 AM
qiucf added a subscriber: qiucf.Aug 22 2023, 1:33 AM
qiucf added inline comments.
clang/lib/Headers/altivec.h
14661

Could we just define it without initialization? This can also make undefined vector.

llvm/test/CodeGen/PowerPC/vec-promote.ll
2

We don't need this file because (1) no backend codegen changed; (2) further changes to altivec.h will not change this file.

lkail added inline comments.Aug 22 2023, 1:50 AM
clang/lib/Headers/altivec.h
14661

Using __builtin_shufflevector generates poison values, which is stronger than undef, exposing more optimizations in my view. See https://llvm.org/docs/LangRef.html#id1781.

llvm/test/CodeGen/PowerPC/vec-promote.ll
2

I prefer keeping this file

  1. Backend lacks vec_promote equivalent tests.
  2. to show different codegen of FE exposes different optimization opportunities to backend which is the core value of this patch. Codegen quality should be finally reflected by assembly.
lkail added inline comments.Aug 22 2023, 1:54 AM
clang/lib/Headers/altivec.h
14661

Also using __builtin_shufflevector is explicit, which has a formal specification https://clang.llvm.org/docs/LanguageExtensions.html#langext-builtin-shufflevector.

nemanjai accepted this revision.Aug 23 2023, 2:04 PM

LGTM. This is a good idea and we should go ahead with this for anyone that uses vec_promote, but it might be a good idea to improve codegen for the insert which might be more common.

llvm/test/CodeGen/PowerPC/vec-promote.ll
44

This code is absolutely terrible. Not only is the lfs super slow compared to lfiwzx/lxsiwzx that we actually want, but the two conversions and three permutes are super slow.

I think the change to altivec.h to produce better code for something like that is a good thing, but I wonder if something like this might come up in other contexts.

At least on Power9 and up, we can do much better than this. We don't do particularly well regardless of whether we're using a zero vector input or an arbitrary vector: https://godbolt.org/z/79fx8nsdP

This revision is now accepted and ready to land.Aug 23 2023, 2:04 PM
lkail added a comment.EditedAug 23 2023, 6:50 PM

but it might be a good idea to improve codegen for the insert which might be more common.

Yes. Usually insertelement is transformed to BUILD_VECTOR in SDAG, currently we don't have much optimization for BUILD_VECTOR in special patterns.

This revision was landed with ongoing or failed builds.Aug 23 2023, 7:10 PM
This revision was automatically updated to reflect the committed changes.