This is an archive of the discontinued LLVM Phabricator instance.

[MS] Fix passing aligned records by value in some cases
ClosedPublic

Authored by rnk on Jun 12 2023, 1:57 PM.

Details

Summary

It's not exactly clear what the meaning of TypeInfo::AlignRequirement
is, so go directly to the ASTRecordLayout for records and check the
required alignment there. Compare that number with the stack alignment
value of 4.

This fixes cases when the alignment attribute does not appear directly
on the record [1], or when the attribute on the record is underaligned
[2].

[1]: struct Foo { int __declspec(align(16)) x; };
[2]: struct __declspec(align(1)) Bar { int x; };

Diff Detail

Event Timeline

rnk created this revision.Jun 12 2023, 1:57 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 12 2023, 1:57 PM
rnk requested review of this revision.Jun 12 2023, 1:57 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 12 2023, 1:57 PM
nikic accepted this revision.Jun 13 2023, 5:19 AM

LGTM

This revision is now accepted and ready to land.Jun 13 2023, 5:19 AM
This revision was landed with ongoing or failed builds.Jun 13 2023, 12:55 PM
This revision was automatically updated to reflect the committed changes.

Intel is seeing some fallout from this change in our downstream (reverting this commit causes our test case to pass). Our test looks like this:

#include <stdlib.h>
#include <stdio.h>
#include <stdarg.h>
#include <x86intrin.h>

typedef union {
	char c[32];
	short s[16];
	int d[8];
	long long l[4];
	__m128 h[2];
	__m256 m;
	__m256i mi;
	__m256d md;
} M256;

M256 make_m256_from_int(int d1, int d2, int d3, int d4, int d5, int d6, int d7, int d8) {
	M256 ret;
	ret.d[0] = d1;
	ret.d[1] = d2;
	ret.d[2] = d3;
	ret.d[3] = d4;
	ret.d[4] = d5;
	ret.d[5] = d6;
	ret.d[6] = d7;
	ret.d[7] = d8;
	return ret;
}

typedef union {
	char c;
} M8;

M8 make_m8_from_char(char c) {
	M8 ret;
	ret.c = c;
	return ret;
}

int test(char* format, ...) {
	int i, j;
	M256 m256;
	M8 m8;
	int retValue = 0;
	va_list args;
	va_start(args, format);

	for (i = 0; format[i]; i++) {
		switch (format[i]) {
			case '1':
				m8 = va_arg(args, M8);
				retValue += m8.c;
				break;
			case '6':
				m256 = va_arg(args, M256);
				retValue += m256.d[0] + m256.d[1] + m256.d[2] + m256.d[3] + m256.d[4] + m256.d[5] + m256.d[6] + m256.d[7];
				break;
		}
	}

	va_end(args);
	return retValue;
}

int main(void) {
	int retValue = 0;

	if (test("16", make_m8_from_char(-12), make_m256_from_int(1, 2, 1, 2, 1, 2, 1, 2))) {
		fprintf(stderr, "test failed for: M8(%d) M256(%d,%d,%d,%d,%d,%d,%d,%d)\n", 12, 1, 2, 1, 2, 1, 2, 1, 2);
		retValue++;
	}

	if (retValue) {
		printf("FAILED %d tests\n", retValue);
	} else {
		printf("PASSED\n");
	}
	return retValue;
}

What we're seeing is the difference in IR emitted by FE is the byval attribute and alignment for 3rd argument.

Bad

%call2 = call i32 (ptr, ...) @test(ptr noundef @"??_C@_02KMALDIDP@16?$AA@", ptr noundef byval(%union.M8) align 4 %agg.tmp1, ptr noundef %agg.tmp), !dbg !237

Good

%call2 = call i32 (ptr, ...) @test(ptr noundef @"??_C@_02KMALDIDP@16?$AA@", ptr noundef byval(%union.M8) align 4 %agg.tmp1, ptr noundef byval(%union.M256) align 4 %agg.tmp), !dbg !237

(We're seeing it with this clang-cl-esque command line: icx -c -Zi -Od /arch:AVX check_types_reduced.c -Xclang -emit-llvm -Xclang -disable-llvm-passes)

I didn't spot any UB in the test, so I think something may be incorrect with this patch (but codegen is not my area of specialty, so please correct me if I'm wrong).

rnk added a comment.Aug 23 2023, 11:30 AM

Thanks, I think this is a clang bug. As I understand MSVC's behavior, we should not pass highly aligned variadic arguments indirectly: https://gcc.godbolt.org/z/Kr67xWTeE

I'll follow up on that.

Thanks, I think this is a clang bug. As I understand MSVC's behavior, we should not pass highly aligned variadic arguments indirectly: https://gcc.godbolt.org/z/Kr67xWTeE

I'll follow up on that.

Thank you!

Thanks, I think this is a clang bug. As I understand MSVC's behavior, we should not pass highly aligned variadic arguments indirectly: https://gcc.godbolt.org/z/Kr67xWTeE

I'll follow up on that.

Thank you!

Any update on a fix for this?

rnk added a subscriber: mstorsjo.Aug 29 2023, 12:57 PM

I need to get to it, my recollection is that @mstorsjo ran into the same issue here for mingw and made some changes, I wanted to go dig those up as a starting point. I may have completely forgotten things though.

rnk added a comment.Aug 29 2023, 2:34 PM

I put together a fix here: https://github.com/llvm/llvm-project/compare/main...rnk:llvm-project:fix-vararg-align

I don't have arcanist set up like I used to, and I don't write as much code as I used to, so I kind of want to submit this as a pull request as soon as they are available this Thursday, if that's not an issue for you.

I need to get to it, my recollection is that @mstorsjo ran into the same issue here for mingw and made some changes, I wanted to go dig those up as a starting point. I may have completely forgotten things though.

Hmm, I don't remember doing anything in that area - I don't think I've had to touch variadics on i386 (or x86_64 for that matter) so far, or anything relating to aligned variadics. (The main thing that might sound similar to alignment was about setting up the homed registers on aarch64 when receiving variadics; there's some special casing there relating to whether the number of homed registers is even or odd. But all of that is much deeper within lowering in LLVM.)

Following up, the vararg fix is here: https://github.com/llvm/llvm-project/pull/65692

Thank you for the fix!