
Details

Reviewers

craig.topper
LuoYuanke
rnk
hjl.tools
rjmccall
wxiao3
jyknight
efriedma
echristo
pengfei
RKSimon

Commits

rG1c4108ab661d: [i386] Modify the alignment of __m128/__m256/__m512 vector type according i386…

Summary

According to i386 System V ABI:

when __m256 are required to be passed on the stack, the stack pointer must be aligned on a 0 mod 32 byte boundary at the time of the call.
when __m512 are required to be passed on the stack, the stack pointer must be aligned on a 0 mod 64 byte boundary at the time of the call.

Clang currently passes a __m512 parameter as follows:

when target supports avx512, passing it with 64 byte alignment;
when target supports avx, passing it with 32 byte alignment;
Otherwise, passing it with 16 byte alignment.

A __m256 parameter is currently passed as follows:

when target supports avx or avx512, passing it with 32 byte alignment;
Otherwise, passing it with 16 byte alignment.

This patch passes __m128/__m256/__m512 following the i386 System V ABI and
applies it to Linux only, since other System V OSes (e.g. Darwin, PS4 and FreeBSD) don't
want to spend any effort dealing with the ramifications of ABI breaks at present.
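The two schemes in the summary can be written down side by side. This is only a sketch of the rules as the summary states them, not clang's code: the function names and the `has_avx` / `has_avx512` flags are made up for illustration, and the 16-byte case for __m128 is taken from what the patch itself returns.

```c
#include <assert.h>

/* Stack alignment (in bytes) used before this patch for a vector argument of
 * the given size, per the summary. The feature flags are hypothetical
 * stand-ins for the target features, not a clang API. */
unsigned old_vector_arg_align(unsigned size_bytes, int has_avx, int has_avx512) {
    if (size_bytes >= 64 && has_avx512)
        return 64;
    if (size_bytes >= 32 && (has_avx || has_avx512))
        return 32;
    return 16;
}

/* Alignment the i386 System V ABI asks for: it depends on the type only,
 * never on which features happen to be enabled. */
unsigned abi_vector_arg_align(unsigned size_bytes) {
    if (size_bytes >= 64)
        return 64;
    if (size_bytes >= 32)
        return 32;
    return 16;
}
```

The mismatch the patch targets is the case where the two functions disagree, for example a 64-byte vector on a target without avx512.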

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

LiuChen3 created this revision.Apr 21 2020, 7:16 AM

Herald added subscribers: krytarowski, arichardson, emaste.Apr 21 2020, 7:16 AM

LiuChen3 mentioned this in D78533: [i386] Fix bug that get __m128/__m256/__m512 with wrong alignment for variadic functions..Apr 21 2020, 7:19 AM

I'm not sure this is right. __m512 is just a typedef to a 64 byte vector_size attribute. This patch changes the behavior of using 64 byte vector_size on an sse/avx target. Before avx512 existed, an avx target would pass a 512 bit vector 32 byte aligned. It doesn't make sense to me to change the alignment on an avx target just because avx512 exists but isn't enabled.

I might be wrong, but I wonder if we should be using the alignment of the type, not the size of the type? All the __m128/__m256/__m512 types have an alignment attribute on them.
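To make the size-versus-alignment distinction concrete, here is a small sketch. The typedefs are local stand-ins with made-up names rather than the real <immintrin.h> types, so it builds without any target feature enabled; they assume the GCC/clang `vector_size` and `aligned` attributes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for __m512 and its unaligned variant: both are
 * 64 bytes in size, but they are declared with different alignments. */
typedef float vec512 __attribute__((vector_size(64), aligned(64)));
typedef float vec512_u __attribute__((vector_size(64), aligned(1)));

size_t vec512_size(void) { return sizeof(vec512); }
size_t vec512_u_size(void) { return sizeof(vec512_u); }
size_t vec512_align(void) { return _Alignof(vec512); }
size_t vec512_u_align(void) { return _Alignof(vec512_u); }
```

A check keyed on size cannot tell these two apart, while a check keyed on alignment can.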

In D78564#1994943, @craig.topper wrote:

I'm not sure this is right. __m512 is just a typedef to a 64 byte vector_size attribute. This patch changes the behavior of using 64 byte vector_size on an sse/avx target. Before avx512 existed, an avx target would pass a 512 bit vector 32 byte aligned. It doesn't make sense to me to change the alignment on an avx target just because avx512 exists but isn't enabled.

If we call a library function that takes __m512 on the stack (such as a variadic function) and was compiled with avx512, but the caller was compiled without avx512, this will fail at run time. But when parameters are passed in registers it will fail at run time too, because caller and callee use different registers. Therefore, it is hard to say whether it is reasonable to align to 64 bytes.

but I wonder if we should be using the alignment of the type, not the size of the type? All the __m128/__m256/__m512 types have an alignment attribute on them.

That's reasonable. Then we can omit getTypeSize().

In D78564#1995840, @LiuChen3 wrote:

In D78564#1994943, @craig.topper wrote:

I'm not sure this is right. __m512 is just a typedef to a 64 byte vector_size attribute. This patch changes the behavior of using 64 byte vector_size on an sse/avx target. Before avx512 existed, an avx target would pass a 512 bit vector 32 byte aligned. It doesn't make sense to me to change the alignment on an avx target just because avx512 exists but isn't enabled.

If we call a library function that takes __m512 on the stack (such as a variadic function) and was compiled with avx512, but the caller was compiled without avx512, this will fail at run time. But when parameters are passed in registers it will fail at run time too, because caller and callee use different registers. Therefore, it is hard to say whether it is reasonable to align to 64 bytes.

See:

https://bugs.llvm.org/show_bug.cgi?id=39501
https://reviews.llvm.org/D53919

GCC warns in this case.

LiuChen3 added reviewers: efriedma, echristo.Apr 21 2020, 7:38 PM

In D78564#1995859, @hjl.tools wrote:

In D78564#1995840, @LiuChen3 wrote:

In D78564#1994943, @craig.topper wrote:

I'm not sure this is right. __m512 is just a typedef to a 64 byte vector_size attribute. This patch changes the behavior of using 64 byte vector_size on an sse/avx target. Before avx512 existed, an avx target would pass a 512 bit vector 32 byte aligned. It doesn't make sense to me to change the alignment on an avx target just because avx512 exists but isn't enabled.

If we call a library function that takes __m512 on the stack (such as a variadic function) and was compiled with avx512, but the caller was compiled without avx512, this will fail at run time. But when parameters are passed in registers it will fail at run time too, because caller and callee use different registers. Therefore, it is hard to say whether it is reasonable to align to 64 bytes.

See:

https://bugs.llvm.org/show_bug.cgi?id=39501
https://reviews.llvm.org/D53919

GCC warns in this case.

I think that is another topic. I will add a warning to clang.

Determine whether the type is __m128/__m256/__m512 by type alignment rather than type size.
Since I am not familiar with the front end, adding diagnostics will take some effort. I think it would be better to focus on this calling convention for now.

ping?

RKSimon resigned from this revision.Jun 18 2020, 1:50 AM

Can you provide a few C testcases comparing gcc and clang, where clang currently misaligns an argument? I briefly tried a few testcases with __m256i vectors, and clang seemed to do the right thing.

clang/lib/CodeGen/TargetInfo.cpp
1603	`Ty->isVectorType()`
1884	We don't want to use getIndirect() here if we can avoid it; byval makes it harder for the compiler to reason about the value.

In D78564#2102121, @efriedma wrote:

Can you provide a few C testcases comparing gcc and clang, where clang currently misaligns an argument? I briefly tried a few testcases with __m256i vectors, and clang seemed to do the right thing.

@efriedma, thanks for your review.

Here is a little case:

#include <stdio.h>
#include <stdlib.h>
#include <stdarg.h>
#include <immintrin.h>

typedef union {
        int d[4];
        __m128 m;
} M128;

void test(int argCount, ...) {
        M128 res;
        int retValue = 0;
        va_list args;
        va_start(args, argCount);
        res.d[0] = res.d[1] = res.d[2] = res.d[3] = 0;
        res.m = va_arg(args, __m128);
        printf("%d %d %d %d\n", res.d[0], res.d[1], res.d[2], res.d[3]);
        va_end(args);
}

int main(void) {
        int retValue = 0;
        M128 a;
        a.d[0] = 0; a.d[1] = 2; a.d[2] = 4; a.d[3] = 6;
        test(1, a.m);
        return 0;
}

The options I use are '-m32 -O0'.
The output of clang is: 134520832 -7561716 134514139 0
The output of gcc is correct.

If I'm following correctly, the change to getTypeStackAlignInBytes() makes the lowering of va_arg correct for __m256/__m512 etc. on targets where they're legal types. (And you could independently verify this by passing a va_list from a gcc-compiled function to a clang-compiled function.)

The change to X86_32ABIInfo::classifyArgumentType, then, is specifically targeted at cases where the vector type in question isn't legal. It fixes the alignment, I guess, so it's always consistent with what va_arg thinks the alignment should be? Can we restrict this change to varargs function calls?

Restrict this change to varargs function calls and address comments.

Hi, @efriedma.

I used gcc and clang cross-compilation for testing, and currently found no problems.
main.c:

#include "abi.h"

int main(void) {
        int retValue = 0;
        M128 a;
        M256 b;
        M512 c;
        a.d[0] = 0; a.d[1] = 2; a.d[2] = 4; a.d[3] = 6;
        b.d[0] = 0; b.d[1] = 2; b.d[2] = 4; b.d[3] = 6;
        c.d[0] = 0; c.d[1] = 2; c.d[2] = 4; c.d[3] = 6;
        c.d[4] = 8; c.d[5] = 10; c.d[6] = 12; c.d[7] = 14;
        testv128(1, a.m);
        testv256(1, b.m);
        testv512(1, c.m);
        return 0;
}

test.c:

#include "abi.h"

void testv128(int argCount, ...) {
        M128 res;
        int retValue = 0;
        va_list args;
        va_start(args, argCount);
        res.d[0] = res.d[1] = res.d[2] = res.d[3] = 0;
        res.m = va_arg(args, __m128);
        printf("%d %d %d %d\n", res.d[0], res.d[1], res.d[2], res.d[3]);
        va_end(args);
}

void testv256(int argCount, ...) {
        M256 res;
        int retValue = 0;
        va_list args;
        va_start(args, argCount);
        res.d[0] = res.d[1] = res.d[2] = res.d[3] = 0;
        res.m = va_arg(args, __m256);
        printf("%lld %lld %lld %lld\n", res.d[0], res.d[1], res.d[2], res.d[3]);
        va_end(args);
}

void testv512(int argCount, ...) {
        M512 res;
        int retValue = 0;
        va_list args;
        va_start(args, argCount);
        res.d[0] = res.d[1] = res.d[2] = res.d[3] = 0;
        res.d[4] = res.d[5] = res.d[6] = res.d[7] = 0;
        res.m = va_arg(args, __m512);
        for(int i = 0; i < 8; ++i)
          printf("%lld ", res.d[i]);
        printf("\n");
        va_end(args);
}

abi.h:

#include <stdio.h>
#include <stdlib.h>
#include <stdarg.h>
#include <immintrin.h>

typedef union {
        int d[4];
        __m128 m;
} M128;

typedef union {
        long long d[4];
        __m256 m;
} M256;

typedef union {
        long long d[8];
        __m512 m;
} M512;


void testv128(int argCount, ...);
void testv256(int argCount, ...);
void testv512(int argCount, ...);

The options I use:

clang -m32 -c main.c
gcc -m32 -c test.c
clang -m32 main.o test.o

clang -m32 -mavx2 -c main.c
gcc -m32 -mavx2 -c test.c
clang -m32 -mavx2 main.o test.o

clang -m32 -c -mavx512f main.c
gcc -m32 -c -mavx512f test.c
clang -m32 -mavx512f main.o test.o

Then use clang to compile test.c and gcc to compile main.c.

Before this patch, the output is incorrect.

efriedma added inline comments.Jun 23 2020, 1:09 PM

clang/lib/CodeGen/TargetInfo.cpp
1136	Please don't use default arguments here; it isn't helping readability.

echristo added inline comments.Jun 23 2020, 1:41 PM

clang/lib/CodeGen/TargetInfo.cpp
1136	Avoid boolean arguments if possible too.

LiuChen3 marked an inline comment as done.Jun 23 2020, 7:18 PM

LiuChen3 added inline comments.

clang/lib/CodeGen/TargetInfo.cpp
1136	Thanks for your review. I think it's not easy to tell whether the current argument is a variadic argument without this boolean parameter.

Remove default arguments.

efriedma added inline comments.Jun 23 2020, 7:36 PM

clang/lib/CodeGen/TargetInfo.cpp
1136	As a general style rule, if a function has two alternative modes, prefer naming the two alternatives with an enum, instead of using a boolean. This makes the call more readable. (See https://google.github.io/styleguide/cppguide.html#Function_Argument_Comments etc.) We should probably describe this somewhere in the LLVM coding standards document.
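The style point can be shown with a toy helper. Everything here is hypothetical: the enum, its enumerators and the function are invented for illustration and are not taken from the patch, and the rule that only variadic arguments keep their type alignment is an assumption based on the intermediate revision that restricted the change to varargs calls.

```c
#include <assert.h>

/* Hypothetical enum naming the two modes a bool would otherwise encode. */
enum ArgKind { ARG_NAMED, ARG_VARIADIC };

/* Toy rule: a variadic argument with a 16-byte or larger type alignment keeps
 * that alignment; everything else gets the 4-byte minimum. */
unsigned stack_align_for(unsigned type_align, enum ArgKind kind) {
    const unsigned min_abi_align = 4;
    if (kind == ARG_VARIADIC && type_align >= 16)
        return type_align;
    return min_abi_align;
}
```

A call such as `stack_align_for(32, ARG_VARIADIC)` says what it means at the call site, where `stack_align_for(32, 1)` would need a comment.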

LiuChen3 marked an inline comment as done.Jun 23 2020, 9:19 PM

LiuChen3 added inline comments.

clang/lib/CodeGen/TargetInfo.cpp
1136	Thanks for the information. I misunderstood echristo's meaning. I followed X86_64ABIInfo, which uses 'isNamedArg' to indicate whether the argument is named. Should we change the style?

Address comments: Replacing the bool argument with an enum argument

Ping?

Does this fix https://github.com/golang/go/issues/42440?

In D78564#2457869, @hjl.tools wrote:

Does this fix https://github.com/golang/go/issues/42440?

No. I think this patch can only fix part of the issue.

In D78564#2459571, @LiuChen3 wrote:

In D78564#2457869, @hjl.tools wrote:

Does this fix https://github.com/golang/go/issues/42440?

No. I think this patch can only fix part of the issue.

Can you fix the go issue?

LiuChen3 added a reviewer: pengfei.Dec 16 2020, 10:41 PM

In D78564#2459608, @hjl.tools wrote:

In D78564#2459571, @LiuChen3 wrote:

In D78564#2457869, @hjl.tools wrote:

Does this fix https://github.com/golang/go/issues/42440?

No. I think this patch can only fix part of the issue.

Can you fix the go issue?

If we can confirm how gcc does it, I think I can fix that.

For now we just assume that the rules gcc follows are:

StackAlignmentForType(T):

1. If T's alignment is < 16 bytes, return 4.
2. If T is a struct/union/array type, then:
    recursively call StackAlignmentForType() on each member's type (note -- this ignores any __attribute__((aligned(N))) directly on the fields of a struct, but not those that appear on typedefs, or the underlying types).
    If all of those calls return alignments < 16, then return 4.
3. Otherwise, return the alignment of T.

We need to confirm that this is the actual behavior of gcc.
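The assumed rules can be sketched over a toy type model. This is not gcc or clang code: `TypeDesc`, its fields and the example objects are all invented for illustration, and the rules themselves are still the unconfirmed guess from above.

```c
#include <assert.h>
#include <stddef.h>

/* Toy type description; every name here is hypothetical. */
enum TypeKind { TK_LEAF, TK_AGGREGATE };

struct TypeDesc {
    enum TypeKind kind;
    unsigned align;                         /* alignment of the type, in bytes */
    const struct TypeDesc *const *members;  /* member types (aggregates only) */
    size_t num_members;
};

unsigned stack_alignment_for_type(const struct TypeDesc *t) {
    /* Rule 1: types aligned to less than 16 bytes get 4. */
    if (t->align < 16)
        return 4;
    /* Rule 2: an aggregate keeps its alignment only if some member type does. */
    if (t->kind == TK_AGGREGATE) {
        int any_member_aligned = 0;
        for (size_t i = 0; i < t->num_members; ++i)
            if (stack_alignment_for_type(t->members[i]) >= 16)
                any_member_aligned = 1;
        if (!any_member_aligned)
            return 4;
    }
    /* Rule 3: otherwise use the type's own alignment. */
    return t->align;
}

/* Example types: int, a __m256-like vector, a struct of ints whose 16-byte
 * alignment comes only from an attribute on a field, and a struct that
 * really contains a vector. */
const struct TypeDesc ty_int = {TK_LEAF, 4, NULL, 0};
const struct TypeDesc ty_m256 = {TK_LEAF, 32, NULL, 0};
const struct TypeDesc *const ints_only[] = {&ty_int, &ty_int};
const struct TypeDesc *const int_and_vec[] = {&ty_int, &ty_m256};
const struct TypeDesc ty_padded = {TK_AGGREGATE, 16, ints_only, 2};
const struct TypeDesc ty_holds_vec = {TK_AGGREGATE, 32, int_and_vec, 2};
```

Under these rules the field-attribute struct falls back to 4, while the struct holding a vector keeps 32.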

Rebase and avoid using 'byval' parameter.

Ping?

Harbormaster completed remote builds in B96846: Diff 334874.Apr 1 2021, 6:22 PM

pengfei added inline comments.Apr 2 2021, 12:33 AM

clang/lib/CodeGen/TargetInfo.cpp
1956	Can we always use the type alignment, regardless of whether the argument is named or unnamed?

Address Pengfei's comments

Patch LGTM. If this is what GCC is doing on Linux, then we should match it.

LGTM.

This revision is now accepted and ready to land.Apr 7 2021, 1:24 AM

This revision was landed with ongoing or failed builds.Apr 14 2021, 1:47 AM

Closed by commit rG1c4108ab661d: [i386] Modify the alignment of __m128/__m256/__m512 vector type according i386… (authored by LiuChen3).

This revision was automatically updated to reflect the committed changes.

LiuChen3 added a commit: rG1c4108ab661d: [i386] Modify the alignment of __m128/__m256/__m512 vector type according i386….

Herald added a project: Restricted Project.Apr 14 2021, 1:47 AM

Herald added a subscriber: cfe-commits.

Thanks for your review. Hope this patch won't cause too many ABI issues in the future.

pengfei mentioned this in D108887: [X86][MS] Fix the aligement mismatch of vector variable arguments on Win32.Aug 30 2021, 7:29 PM

pengfei mentioned this in D109265: [X86][mingw] Modify the alignment of __m128/__m256/__m512 vector type for mingw.Sep 3 2021, 7:19 PM

pengfei mentioned this in rGe6e8d25920c1: [X86][mingw] Modify the alignment of __m128/__m256/__m512 vector type for mingw.Sep 6 2021, 5:28 AM

Diff 337373

clang/lib/CodeGen/TargetInfo.cpp


@@ (1,099 lines not shown) @@ class X86_32ABIInfo : public SwiftABIInfo {

   static const unsigned MinABIStackAlignInBytes = 4;

   bool IsDarwinVectorABI;
   bool IsRetSmallStructInRegABI;
   bool IsWin32StructABI;
   bool IsSoftFloatABI;
   bool IsMCUABI;
+  bool IsLinuxABI;
   unsigned DefaultNumRegisterParameters;

   static bool isRegisterSize(unsigned Size) {
     return (Size == 8 || Size == 16 || Size == 32 || Size == 64);
   }

   bool isHomogeneousAggregateBaseType(QualType Ty) const override {
     // FIXME: Assumes vectorcall is in use.
@@ (11 lines not shown) @@ class X86_32ABIInfo : public SwiftABIInfo {
   /// getIndirectResult - Give a source type \arg Ty, return a suitable result
   /// such that the argument will be passed in memory.
   ABIArgInfo getIndirectResult(QualType Ty, bool ByVal, CCState &State) const;

   ABIArgInfo getIndirectReturnResult(QualType Ty, CCState &State) const;

   /// Return the alignment to use for the given type on the stack.
   unsigned getTypeStackAlignInBytes(QualType Ty, unsigned Align) const;

   Class classify(QualType Ty) const;
   ABIArgInfo classifyReturnType(QualType RetTy, CCState &State) const;
   ABIArgInfo classifyArgumentType(QualType RetTy, CCState &State) const;

   /// Updates the number of available free registers, returns
   /// true if any registers were allocated.
   bool updateFreeRegs(QualType Ty, CCState &State) const;

@@ (18 lines not shown) @@ public:
   Address EmitVAArg(CodeGenFunction &CGF, Address VAListAddr,
                     QualType Ty) const override;

   X86_32ABIInfo(CodeGen::CodeGenTypes &CGT, bool DarwinVectorABI,
                 bool RetSmallStructInRegABI, bool Win32StructABI,
                 unsigned NumRegisterParameters, bool SoftFloatABI)
       : SwiftABIInfo(CGT), IsDarwinVectorABI(DarwinVectorABI),
         IsRetSmallStructInRegABI(RetSmallStructInRegABI),
-        IsWin32StructABI(Win32StructABI),
-        IsSoftFloatABI(SoftFloatABI),
+        IsWin32StructABI(Win32StructABI), IsSoftFloatABI(SoftFloatABI),
         IsMCUABI(CGT.getTarget().getTriple().isOSIAMCU()),
+        IsLinuxABI(CGT.getTarget().getTriple().isOSLinux()),
         DefaultNumRegisterParameters(NumRegisterParameters) {}

   bool shouldPassIndirectlyForSwift(ArrayRef<llvm::Type*> scalars,
                                     bool asReturnValue) const override {
     // LLVM's x86-32 lowering currently only assigns up to three
     // integer registers and three fp registers. Oddly, it'll use up to
     // four vector registers for vectors, but those can overlap with the
     // scalar registers.
@@ (408 lines not shown) @@

 unsigned X86_32ABIInfo::getTypeStackAlignInBytes(QualType Ty,
                                                  unsigned Align) const {
   // Otherwise, if the alignment is less than or equal to the minimum ABI
   // alignment, just use the default; the backend will handle this.
   if (Align <= MinABIStackAlignInBytes)
     return 0; // Use default alignment.

+  if (IsLinuxABI) {
+    // Exclude other System V OS (e.g Darwin, PS4 and FreeBSD) since we don't
+    // want to spend any effort dealing with the ramifications of ABI breaks.
+    //
+    // If the vector type is __m128/__m256/__m512, return the default alignment.
+    if (Ty->isVectorType() && (Align == 16 || Align == 32 || Align == 64))
+      return Align;
+  }
   // On non-Darwin, the stack type alignment is always 4.
   if (!IsDarwinVectorABI) {
     // Set explicit alignment, since we may need to realign the top.
     return MinABIStackAlignInBytes;
   }

   // Otherwise, if the type contains an SSE vector type, the alignment is 16.
   if (Align >= 16 && (isSIMDVectorType(getContext(), Ty) ||
@@ (262 lines not shown) @@ ABIArgInfo X86_32ABIInfo::classifyArgumentType(QualType Ty,
   }


   if (const EnumType *EnumTy = Ty->getAs<EnumType>())
     Ty = EnumTy->getDecl()->getIntegerType();

   bool InReg = shouldPrimitiveUseInReg(Ty, State);

   if (isPromotableIntegerTypeForABI(Ty)) {
     if (InReg)
       return ABIArgInfo::getExtendInReg(Ty);
     return ABIArgInfo::getExtend(Ty);
   }

   if (const auto * EIT = Ty->getAs<ExtIntType>()) {
     if (EIT->getNumBits() <= 64) {
       if (InReg)
@@ (55 lines not shown) @@ void X86_32ABIInfo::computeInfo(CGFunctionInfo &FI) const {
   bool UsedInAlloca = false;
   MutableArrayRef<CGFunctionInfoArgInfo> Args = FI.arguments();
   for (int I = 0, E = Args.size(); I < E; ++I) {
     // Skip arguments that have already been assigned.
     if (State.IsPreassigned.test(I))
       continue;

     Args[I].info = classifyArgumentType(Args[I].type, State);
     UsedInAlloca |= (Args[I].info.getKind() == ABIArgInfo::InAlloca);
   }

   // If we needed to use inalloca for any argument, do a second pass and rewrite
   // all the memory arguments to use inalloca.
   if (UsedInAlloca)
     rewriteWithInAlloca(FI);
 }

@@ (9,280 lines not shown) @@

clang/test/CodeGen/x86_32-align-linux.c

This file was added.

// RUN: %clang_cc1 -w -fblocks -ffreestanding -triple i386-pc-linux-gnu -emit-llvm -o - %s | FileCheck %s
// RUN: %clang_cc1 -w -fblocks -ffreestanding -triple i386-pc-linux-gnu -target-feature +avx -emit-llvm -o - %s | FileCheck %s
// RUN: %clang_cc1 -w -fblocks -ffreestanding -triple i386-pc-linux-gnu -target-feature +avx512f -emit-llvm -o - %s | FileCheck %s

#include <immintrin.h>

// CHECK-LABEL: define dso_local void @testm128
// CHECK-LABEL: %argp.cur = load i8*, i8** %args, align 4
// CHECK-NEXT: %0 = ptrtoint i8* %argp.cur to i32
// CHECK-NEXT: %1 = add i32 %0, 15
// CHECK-NEXT: %2 = and i32 %1, -16
// CHECK-NEXT: %argp.cur.aligned = inttoptr i32 %2 to i8*
void testm128(int argCount, ...) {
  __m128 res;
  __builtin_va_list args;
  __builtin_va_start(args, argCount);
  res = __builtin_va_arg(args, __m128);
  __builtin_va_end(args);
}

// CHECK-LABEL: define dso_local void @testm256
// CHECK-LABEL: %argp.cur = load i8*, i8** %args, align 4
// CHECK-NEXT: %0 = ptrtoint i8* %argp.cur to i32
// CHECK-NEXT: %1 = add i32 %0, 31
// CHECK-NEXT: %2 = and i32 %1, -32
// CHECK-NEXT: %argp.cur.aligned = inttoptr i32 %2 to i8*
void testm256(int argCount, ...) {
  __m256 res;
  __builtin_va_list args;
  __builtin_va_start(args, argCount);
  res = __builtin_va_arg(args, __m256);
  __builtin_va_end(args);
}

// CHECK-LABEL: define dso_local void @testm512
// CHECK-LABEL: %argp.cur = load i8*, i8** %args, align 4
// CHECK-NEXT: %0 = ptrtoint i8* %argp.cur to i32
// CHECK-NEXT: %1 = add i32 %0, 63
// CHECK-NEXT: %2 = and i32 %1, -64
// CHECK-NEXT: %argp.cur.aligned = inttoptr i32 %2 to i8*
void testm512(int argCount, ...) {
  __m512 res;
  __builtin_va_list args;
  __builtin_va_start(args, argCount);
  res = __builtin_va_arg(args, __m512);
  __builtin_va_end(args);
}

// CHECK-LABEL: define dso_local void @testPastArguments
// CHECK: call void (i32, ...) @testm128(i32 1, <4 x float> %0)
// CHECK: call void (i32, ...) @testm256(i32 1, <8 x float> %1)
// CHECK: call void (i32, ...) @testm512(i32 1, <16 x float> %2)
void testPastArguments(void) {
  __m128 a;
  __m256 b;
  __m512 c;
  testm128(1, a);
  testm256(1, b);
  testm512(1, c);
}
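The add/and pairs in the CHECK lines are the usual round-up-to-alignment idiom applied to the va_list pointer. A minimal sketch in C, with a made-up function name, assuming the alignment is a power of two:

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of align (a power of two). This is the
 * same add-then-mask pattern the test expects in the IR: add align - 1, then
 * and with -align (written here as ~(align - 1)). */
uintptr_t align_up(uintptr_t addr, uintptr_t align) {
    return (addr + (align - 1)) & ~(align - 1);
}
```

For __m128 this corresponds to the `add 15` / `and -16` pair, for __m256 to `add 31` / `and -32`, and for __m512 to `add 63` / `and -64`.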

This is an archive of the discontinued LLVM Phabricator instance.

[i386] Modify the alignment of __m128/__m256/__m512 vector type according i386 abi.
Closed, Public
