This is an archive of the discontinued LLVM Phabricator instance.

[X86] Allow inlining callees missing VLX feature
Needs ReviewPublic

Authored by kalcutter on Aug 13 2023, 2:13 PM.

Details

Summary

This patch attempts to fix a regression caused by https://github.com/llvm/llvm-project/commit/d6f994acb3d545b80161e24ab742c9c69d4bbf33. In particular, always_inline should work on callees without VLX from VLX functions. I found this testing clang-17-rc2. If accepted, please also apply this to the LLVM 17.0.0 branch.

Diff Detail

Event Timeline

kalcutter created this revision.Aug 13 2023, 2:13 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 13 2023, 2:13 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
kalcutter requested review of this revision.Aug 13 2023, 2:13 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 13 2023, 2:13 PM
nikic added a subscriber: nikic.Aug 13 2023, 2:42 PM
nikic added inline comments.
llvm/lib/Target/X86/X86TargetTransformInfo.cpp
6082

I assume that this is the actually failing check? In that case, should the adjustment be in that function?

kalcutter added inline comments.Aug 13 2023, 3:00 PM
llvm/lib/Target/X86/X86TargetTransformInfo.cpp
6082

I am not sure. That function is checking that the used types are ABI compatible, but what about available target instructions? Is the idea to first only check the types, then later during CodeGen the target does a more exhaustive check?

Fixing areTypesABICompatible() would be more general I guess. Do you have an idea how much work is involved properly fixing that function? I am not familiar with this code base at all. Do you have anything against applying this patch as an incremental improvement?

kalcutter added inline comments.Aug 13 2023, 3:23 PM
llvm/lib/Target/X86/X86TargetTransformInfo.cpp
6082

I guess the feature subset check covers all cases of available target instructions. So just making sure all the types are compatible, like you suggested, would be the proper fix.

There is also another related issue: When areInlineCompatible() returns false here, a callee with attribute always_inline is silently not inlined. It seems like clang should emit and error or at least a warning in this case. I only noticed this because my code was running 10x slower compiling with clang 17.

kalcutter marked an inline comment as not done.Aug 13 2023, 3:38 PM
pengfei added inline comments.Aug 13 2023, 4:20 PM
llvm/lib/Target/X86/X86TargetTransformInfo.cpp
6082

The function is designed to check if both caller and callee allow ZMM codegen or not when passing a 512-bit vector type. The logic to check ZMM codegen is:

  1. has AVX512F but not AVX512VL
  2. has AVX512F and prefer-vector-width >= 512
  3. has AVX512F and min-legal-vector-width > 256

So has AVX512VL or not does affect the ABI when prefer-vector-width and min-legal-vector-width are set 256 so less, though it shouldn't happen since we have guarantee they are set properly in the FE.

kalcutter added inline comments.Aug 14 2023, 12:52 AM
llvm/lib/Target/X86/X86TargetTransformInfo.cpp
6082

I am a bit confused. A function with both AVX512F and AVX512VL features obviously allows ZMM codegen in the function body. Why should enabling AVX512VL change how 512-bit vector types are passed? Even if that were the case for an opaque function, why should that disabling inlining? Also I don't see how (1)-(3) should ever affect`always_inline`.

pengfei added inline comments.Aug 14 2023, 6:34 AM
llvm/lib/Target/X86/X86TargetTransformInfo.cpp
6082

We have an option -mprefer-vector-width=128/256, which prefers to 128/256 vector instructions under AVX512VL. Under such situation, 512-bit vector may be split into 256-bit when passing. No sure if it is the reason we disable inlining.

kalcutter added inline comments.Aug 14 2023, 9:05 AM
llvm/lib/Target/X86/X86TargetTransformInfo.cpp
6058

Thinking about this more. Why shouldn't this function just end at this line? ISTM that only features should control if inlining is _possible_ or not. I don't see how checking for ABI compatibility between types is helpful. Once the callee gets inlined, it can take on the ABI of the caller anyway.

pengfei added inline comments.Aug 14 2023, 5:57 PM
llvm/lib/Target/X86/X86TargetTransformInfo.cpp
6058

It's good for caller has FeatureVLX but callee doesn't, but maybe we need to check ABI compatibility for the both direction. The reason we need to check type is only vector type affected by feature difference.

Matt added a subscriber: Matt.Aug 16 2023, 1:16 PM