This is an archive of the discontinued LLVM Phabricator instance.

[X86][regcall] Support passing / returning structures
ClosedPublic

Authored by pengfei on Mar 20 2022, 9:44 AM.

Details

Summary

Currently, the regcall calling conversion in Clang doesn't match with
ICC when passing / returning structures. https://godbolt.org/z/axxKMKrW7

This patch tries to fix the problem to match with ICC.

Diff Detail

Event Timeline

pengfei created this revision.Mar 20 2022, 9:44 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 20 2022, 9:44 AM
pengfei requested review of this revision.Mar 20 2022, 9:44 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 20 2022, 9:44 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
LuoYuanke added inline comments.Mar 20 2022, 6:50 PM
clang/include/clang/CodeGen/CGFunctionInfo.h
744

Use "Log2_32()"?

pengfei updated this revision to Diff 416945.Mar 21 2022, 7:59 AM

Address Yuanke's comment.

craig.topper added inline comments.Mar 21 2022, 10:12 AM
clang/include/clang/CodeGen/CGFunctionInfo.h
744

Are you assuming Width is a power of 2? Should we assert that?

clang/lib/CodeGen/CGCall.cpp
4672

static?

clang/lib/CodeGen/TargetInfo.cpp
3747–3748

clang-format

pengfei updated this revision to Diff 417179.Mar 21 2022, 9:36 PM
pengfei marked 2 inline comments as done.

Address review comments. Thanks Craig!

pengfei added inline comments.Mar 21 2022, 9:36 PM
clang/include/clang/CodeGen/CGFunctionInfo.h
744

Good question! I assumed it, but I found it's not true for Clang, although ICC and GCC error for it.
Maybe we should diagnose it too? Anyway, I added an assert for it.

pengfei updated this revision to Diff 417502.Mar 22 2022, 10:47 PM

clang-formatted.

Ping? We have internal request for this.

LuoYuanke added inline comments.Mar 26 2022, 8:43 PM
clang/include/clang/CodeGen/CGFunctionInfo.h
590

I notice some code would indicate it is log 2 size with Log2 suffix in the variable name. Do you think it is more readable to add Log2 suffix?

clang/lib/CodeGen/CGCall.cpp
5239

Does this also affect other calling convention besides fastcall?

clang/lib/CodeGen/TargetInfo.cpp
2311

Update the comments for the new parameter?

3039–3041

Would you add a test for non regcall? Pass 1024 bit vector parameter and check if it is well handled both with regcall and without regcall.
Would you add comments to depict why regcall accept the size which is more than 512?

clang/test/CodeGen/aarch64-neon-tbl.c
45

I'm curious why aarch64 test cases are affected by the patch.

clang/test/CodeGen/regcall2.c
3

Add test case for target that has no avx512 feature?

10

May add a test case to show what's the max register we can pass with regcall.

pengfei updated this revision to Diff 418448.Mar 27 2022, 7:26 AM
pengfei marked 2 inline comments as done.

Address Yuanke's comments.

clang/include/clang/CodeGen/CGFunctionInfo.h
590

I think it doesn't matter. We have getter and setter for the variable. People won't access it directly.

clang/lib/CodeGen/CGCall.cpp
5239

I don't think so. The change here adds for the missing cases like [2 x <4 x double>] or { <2 x double>, <4 x double> } which should also set min-legal-width-width to the maximum of the vector length.
There're several reasons why other calling convention won't be affected.

  1. If a target has ability to pass arguments like [2 x <4 x double>], it must have the ability for <4 x double> and have set min-legal-width-width to 256 when passing it. So it makes more sense to set min-legal-width-width to 256 for [2 x <4 x double>] rather than keeping it as 0;
  2. AFAIK, targets other than X86 simply ignore min-legal-width-width. So the change won't affect them;
  3. On x86, calling conventions other than regcall don't allow arguments size larger than 512, see if (!IsRegCall && Size > 512). They will be turned into pointers, so they won't be affected by this change;
  4. For arguments size no larger than 512 and only contain single vector element, we have already turned them into pure vectors. So they have already set min-legal-width-width to the correct value;
  5. For arguments have more then one vector elements. Clang has bug which doesn't match with GCC and ICC. I have filed a bug here https://github.com/llvm/llvm-project/issues/54582
  6. Thus, only regcall can generate arguments type like [2 x <4 x double>] on X86. So only it will be affected by this.
clang/lib/CodeGen/TargetInfo.cpp
3039–3041

Added one to non regcall. regcall doesn't specify how to handle 1024 bit vector. I'd take it as UB, so we don't need such a test.
https://www.intel.com/content/www/us/en/develop/documentation/cpp-compiler-developer-guide-and-reference/top/compiler-reference/c-c-calling-conventions.html

clang/test/CodeGen/aarch64-neon-tbl.c
45

As I explained above, [2 x <16 x i8>] should have the same value of min-legal-vector-width as <16 x i8>. The difference between #0 and #1 is the value of min-legal-vector-width.

clang/test/CodeGen/regcall2.c
3

I'd take it as UB for Clang and GCC using 512 bit vector without avx512 feature. ICC always promotes to avx512 when it finds 512 bit vector. So we don't need such tests.

10

We have tests for it in clang/test/CodeGen/regcall.c. This patch doesn't affect the capability of regcall.

LuoYuanke accepted this revision.Mar 27 2022, 5:43 PM

LGTM, thanks.

This revision is now accepted and ready to land.Mar 27 2022, 5:43 PM
This revision was automatically updated to reflect the committed changes.
pengfei added inline comments.Mar 29 2022, 3:03 AM
clang/lib/CodeGen/CGCall.cpp
5239

Update for bullet 5. The handling for multi vector elements is correct. It's also passed / returned by memory. So it's still not affected by this change.

erichkeane added inline comments.Mar 31 2022, 8:58 AM
clang/lib/CodeGen/TargetInfo.cpp
3992

This line here accesses NeededSSE while uninitialized. Do you need to initalize it to something on 3963?

erichkeane added inline comments.Mar 31 2022, 9:04 AM
clang/lib/CodeGen/TargetInfo.cpp
3992
aaron.ballman added inline comments.
clang/lib/CodeGen/TargetInfo.cpp
3992

I've fixed this in 2267549296dabfed31ce194bb124f7bdb91f74c5 as it broke many tests on Windows (but I don't think one of the community build bots noticed this, which suggests we may want to have a Debug Windows builder).

Thanks @erichkeane @aaron.ballman ! Yeah, I didn't receive buildbots notice about that.