Page MenuHomePhabricator

[X86][FP16] Fix vector_shuffle and lowering without f16c feature problems
ClosedPublic

Authored by pengfei on Jul 29 2022, 9:18 PM.

Details

Summary

The problem Alexander reported on D127982 was caused by an optimization
for AVX512-FP16 instruction. We must limit it to the feature enabled only.

During the investigation, I found we didn't expand for fp_round/fp_extend
without F16C. This may result runtime crash, so change them too.

Diff Detail

Event Timeline

pengfei created this revision.Jul 29 2022, 9:18 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 29 2022, 9:18 PM
pengfei requested review of this revision.Jul 29 2022, 9:18 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 29 2022, 9:18 PM
RKSimon added inline comments.Jul 30 2022, 1:37 AM
llvm/test/CodeGen/X86/vector-half-conversions.ll
8

please don't use COMMON when it isn't - tbh I'd prefer you just kept F16C + AVX512 seperate

pengfei updated this revision to Diff 448785.Jul 30 2022, 1:43 AM
pengfei marked an inline comment as done.

Split F16C and AVX512.

LuoYuanke added inline comments.Jul 30 2022, 2:18 AM
llvm/lib/Target/X86/X86ISelLowering.cpp
16102–16112

It seems the condition can be refined to hasAVX(), or we can add hasAVX() check in lowerShuffleAsBroadcast.
https://www.felixcloutier.com/x86/vbroadcast.html

RKSimon added inline comments.Jul 30 2022, 3:10 AM
llvm/test/CodeGen/X86/vector-half-conversions.ll
29

I'm confused as to under what cases we can't use vcvtph2ps for half -> float given all half values are exactly representable by float?

pengfei added inline comments.Jul 30 2022, 4:06 AM
llvm/lib/Target/X86/X86ISelLowering.cpp
16102–16112

No, we can't. We don't have a 16-bit broadcast until AVX512BW. We don't need extra work to leverage it since we have turn the shuffle to v8i16.

llvm/test/CodeGen/X86/vector-half-conversions.ll
29

Not sure if I understand you question correctly. vcvtph2ps requires F16C, which is independent of AVX and AVX2.

RKSimon added inline comments.Jul 30 2022, 5:51 AM
llvm/test/CodeGen/X86/vector-half-conversions.ll
29

Sorry - I missed the corresponding change in the attributes for the check prefixes.

It might make sense to pre-commit the changes to the RUN/check-prefixes so you can rebase the patch and show the codegen differences more clearly.

pengfei added inline comments.Jul 30 2022, 6:51 AM
llvm/test/CodeGen/X86/vector-half-conversions.ll
29

Yeah, I tried, but the test failed without the change.

RKSimon accepted this revision.Aug 2 2022, 5:13 AM

LGTM - cheers

This revision is now accepted and ready to land.Aug 2 2022, 5:13 AM
This revision was landed with ongoing or failed builds.Aug 2 2022, 7:55 AM
This revision was automatically updated to reflect the committed changes.

@tstellar This commit needs to be cherrypicked to the 15.x branch, since it fixes a problem introduced before the llvm 16 cut (https://reviews.llvm.org/D127982).

@tstellar This commit needs to be cherrypicked to the 15.x branch, since it fixes a problem introduced before the llvm 16 cut (https://reviews.llvm.org/D127982).

Filed https://github.com/llvm/llvm-project/issues/57238