This is an archive of the discontinued LLVM Phabricator instance.

I don't know which platforms this needs to support, but __builtin_cpu_supports only works when compiled with clang or gcc, and it requires compiler-rt or libgcc. I don't know if that's guaranteed to exist on Windows.

In D41962#973827, @craig.topper wrote:

I don't know which platforms this needs to support, but __builtin_cpu_supports only works when compiled with clang or gcc, and it requires compiler-rt or libgcc. I don't know if that's guaranteed to exist on Windows.

I doubt this test was ever passing on Windows, as our RegisterContextWindows does not even acknowledge the existence of SSE registers. If we wanted to be fancy, we could do some manual cpuid parsing here (the test contains inline assembly anyway), but that's probably not necessary.

packages/Python/lldbsuite/test/functionalities/register/intel_avx/main.c
Line 20: The GCC manual says: "This built-in (__builtin_cpu_init) function needs to be invoked ..., only when used in a function that is executed before any constructors are called." So calling it here should not be necessary. However, I am still unable to get gcc (6.3) to return 1 here. Clang (since at least 3.8) seems to be doing fine, however, so that's probably enough for this test.

fwiw I'm working on upstreaming the zmm (avx512) patches that we have locally (there's one testsuite failure I still need to find time to fix). The TestZMMRegister.py test that ChrisB wrote to test this is written as skip-unless-darwin, and there's a new skipUnlessFeature() method added to decorators.py which runs sysctl to detect hardware features (in this case, hw.optional.avx512f), which, I suspect, is an even more mac-specific way of doing this. While Adrian's approach would be gcc/clang specific, it would definitely be better than depending on a sysctl.

I suppose a possible alternative would be to figure out the avx2 / avx512 features manually based on the cpuid instead of letting the compiler do it for us. e.g. https://stackoverflow.com/questions/1666093/cpuid-implementations-in-c and then checking the bits as e.g. described in https://en.wikipedia.org/wiki/CPUID . Bummer to do it so low level if we can delegate this to the compiler though.

In D41962#974656, @jasonmolenda wrote:

I suppose a possible alternative would be to figure out the avx2 / avx512 features manually based on the cpuid instead of letting the compiler do it for us. e.g. https://stackoverflow.com/questions/1666093/cpuid-implementations-in-c and then checking the bits as e.g. described in https://en.wikipedia.org/wiki/CPUID . Bummer to do it so low level if we can delegate this to the compiler though.

I don't know MSVC well enough, and don't have access to it for testing, but this would also only work if there were a compiler-independent way of writing inline assembly. Is that possible?
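One way this might be done without compiler-specific inline assembly for the cpuid part is to go through the intrinsics headers: <cpuid.h> on gcc/clang and <intrin.h> on MSVC. The following is only a rough sketch and not part of this patch. The helper names (cpuid_query, read_xcr0, cpu_has_avx2) are made up for illustration, and the MSVC branch is untested. It checks CPUID leaf 1 for OSXSAVE and AVX, confirms via XCR0 that the OS saves YMM state, and then reads the AVX2 bit (EBX bit 5) from leaf 7.

```c
#include <stdint.h>

#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#include <immintrin.h> /* _xgetbv */
#include <intrin.h>    /* __cpuidex */
#define HAVE_X86_CPUID 1
/* Hypothetical helper: run CPUID and store EAX, EBX, ECX, EDX in regs[0..3]. */
static void cpuid_query(uint32_t leaf, uint32_t subleaf, uint32_t regs[4]) {
  int out[4];
  int i;
  __cpuidex(out, (int)leaf, (int)subleaf);
  for (i = 0; i < 4; ++i)
    regs[i] = (uint32_t)out[i];
}
static uint64_t read_xcr0(void) { return _xgetbv(0); }
#elif (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h> /* __cpuid_count */
#define HAVE_X86_CPUID 1
/* Hypothetical helper: run CPUID and store EAX, EBX, ECX, EDX in regs[0..3]. */
static void cpuid_query(uint32_t leaf, uint32_t subleaf, uint32_t regs[4]) {
  __cpuid_count(leaf, subleaf, regs[0], regs[1], regs[2], regs[3]);
}
/* Only call this after CPUID reports OSXSAVE, otherwise XGETBV faults. */
static uint64_t read_xcr0(void) {
  uint32_t eax, edx;
  __asm__ volatile("xgetbv" : "=a"(eax), "=d"(edx) : "c"(0));
  return ((uint64_t)edx << 32) | eax;
}
#else
#define HAVE_X86_CPUID 0
#endif

/* Returns 1 if both the CPU and the OS support AVX2, 0 otherwise
   (including on non-x86 targets or unknown compilers). */
int cpu_has_avx2(void) {
#if HAVE_X86_CPUID
  uint32_t r[4];

  cpuid_query(0, 0, r);
  if (r[0] < 7)
    return 0; /* Leaf 7 is not implemented on this CPU. */

  cpuid_query(1, 0, r);
  if ((r[2] & (1u << 27)) == 0) /* ECX bit 27: OSXSAVE */
    return 0;
  if ((r[2] & (1u << 28)) == 0) /* ECX bit 28: AVX */
    return 0;

  /* XCR0 bits 1 and 2: the OS saves XMM and YMM state on context switch. */
  if ((read_xcr0() & 0x6) != 0x6)
    return 0;

  cpuid_query(7, 0, r);
  return (int)((r[1] >> 5) & 1u); /* EBX bit 5: AVX2 */
#else
  return 0;
#endif
}
```

In the test program this could presumably stand in for the compiler builtin, e.g. haveAVX2 = cpu_has_avx2();, at the cost of carrying the bit definitions in the test itself.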

Other fun facts: Clang doesn't even define __builtin_cpu_init().

__builtin_cpu_init was added to clang between 5.0 and 6.0
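To make the compiler differences above concrete, here is a hedged sketch of how the builtin-based check could guard the optional __builtin_cpu_init() call. The function name have_avx2_builtin and the HAVE_BUILTIN_CPU_INIT macro are invented for this example; the return value -1 is an arbitrary "unknown" marker for compilers or targets where the builtins are not available.

```c
/* Decide whether __builtin_cpu_init() can be called. clang also defines
   __GNUC__, so it has to be checked first; it provides __has_builtin. */
#if defined(__clang__)
#if __has_builtin(__builtin_cpu_init)
#define HAVE_BUILTIN_CPU_INIT 1
#endif
#elif defined(__GNUC__)
#define HAVE_BUILTIN_CPU_INIT 1
#endif

/* Returns 1 if AVX2 is reported, 0 if not, and -1 if this compiler or
   target offers no __builtin_cpu_supports (e.g. MSVC or non-x86). */
int have_avx2_builtin(void) {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#ifdef HAVE_BUILTIN_CPU_INIT
  /* Per the GCC manual this is only strictly required in code that runs
     before constructors; calling it again is harmless. */
  __builtin_cpu_init();
#endif
  return __builtin_cpu_supports("avx2") ? 1 : 0;
#else
  return -1;
#endif
}
```

As noted earlier in the thread, this still depends on compiler-rt or libgcc being linked in.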

Uploaded a mildly better version that is NFC on MSVC.

Why not just look for the AVX registers by name, which are only available if they are correctly detected by the native lldb-server or debugserver? Then we can avoid all of this. If we don't execute any instructions that crash the program, we can stop before any specialized AVX instructions are executed and kill the program if we don't see a register by name?

I considered doing something like this, but I want to avoid relying on the AVX2 support in LLDB to work in order to detect AVX2. If I use an LLDB mechanism for this then (exaggerating here!) someone could remove AVX support from LLDB and this test would still pass.

there's a new skipUnlessFeature() method added to decorators.py which runs sysctl to detect hardware features (in this case, hw.optional.avx512f)

How does one execute a program like sysctl on the remote? I have seen code in TestLldbGdbServer.py that uses platform get-file /proc/cpuinfo to achieve something similar for Linux, but that works without executing a new process.
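For the /proc/cpuinfo route, the check itself is just a whole-word search on the "flags" line once the file contents have been fetched. A minimal sketch follows, with the function name cpuinfo_text_has_flag made up for illustration and the Linux x86 cpuinfo layout ("flags : word word ...") assumed. It is written in C here only to stay close to main.c; in the test suite the same logic would live in Python.

```c
#include <string.h>

/* Returns 1 if `feature` appears as a whole word on a line starting with
   "flags" in cpuinfo-formatted text, 0 otherwise. */
int cpuinfo_text_has_flag(const char *text, const char *feature) {
  size_t flen = strlen(feature);
  const char *line = text;

  if (flen == 0)
    return 0;

  while (*line) {
    const char *eol = strchr(line, '\n');
    size_t len = eol ? (size_t)(eol - line) : strlen(line);

    if (len >= 5 && strncmp(line, "flags", 5) == 0) {
      const char *colon = memchr(line, ':', len);
      if (colon) {
        const char *p = colon + 1;
        const char *end = line + len;
        while (p < end) {
          const char *tok;
          /* Skip separators, then measure the next word. */
          while (p < end && (*p == ' ' || *p == '\t'))
            ++p;
          tok = p;
          while (p < end && *p != ' ' && *p != '\t')
            ++p;
          /* Compare whole words so "avx" does not match "avx2". */
          if ((size_t)(p - tok) == flen && memcmp(tok, feature, flen) == 0)
            return 1;
        }
      }
    }

    if (!eol)
      break;
    line = eol + 1;
  }
  return 0;
}
```

Matching whole words matters because flag names share prefixes (avx, avx2, avx512f).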

In D41962#975168, @aprantl wrote:

there's a new skipUnlessFeature() method added to decorators.py which runs sysctl to detect hardware features (in this case, hw.optional.avx512f)

How does one execute a program like sysctl on the remote? I have seen code in TestLldbGdbServer.py that uses platform get-file /proc/cpuinfo to achieve something similar for Linux, but that works without executing a new process.

This skipUnlessFeature sysctl check is performed entirely on the system running the testsuite. Checking whether the feature exists from within the program (the approach you're taking) is more correct. We usually do host != target testsuite runs for arm devices, but there's no reason why someone couldn't do a macOS x FreeBSD testsuite run, and the sysctl check would be invalid in that case.

aprantl abandoned this revision. Apr 30 2018, 1:01 PM

Revision Contents

Path: packages/Python/lldbsuite/test/functionalities/register/intel_avx/
  TestYMMRegister.py: 4 lines
  main.c: 9 lines

Diff 129639

packages/Python/lldbsuite/test/functionalities/register/intel_avx/TestYMMRegister.py

  [45 lines not shown]
      def test(self):
  [...]
              matched = output.find(str1) != -1
              with recording(self, False) as sbuf:
                  print("%s sub string: %s" % ('Expecting', str1), file=sbuf)
                  print("Matched" if matched else "Not Matched", file=sbuf)
              if matched:
                  break
          self.assertTrue(matched, STOPPED_DUE_TO_SIGNAL)

+         # Detect AVX2 support and early exit otherwise.
+         if self.frame().FindVariable("haveAVX2").GetValue() == "0":
+             return False
+
          if self.getArchitecture() == 'x86_64':
              register_range = 16
          else:
              register_range = 8
          for i in range(register_range):
              self.runCmd("thread step-inst")

              register_byte = (byte_pattern1 | i)
  [15 lines not shown]

packages/Python/lldbsuite/test/functionalities/register/intel_avx/main.c

  //===-- main.c ------------------------------------------------- C --===//
  //
  // The LLVM Compiler Infrastructure
  //
  // This file is distributed under the University of Illinois Open Source
  // License. See LICENSE.TXT for details.
  //
  //===----------------------------------------------------------------------===//

  void func() {
    unsigned int ymmvalues[16];
    unsigned char val;
    unsigned char i;
    for (i = 0 ; i < 16 ; i++)
    {
      val = (0x80 | i);
      ymmvalues[i] = (val << 24) | (val << 16) | (val << 8) | val;
    }

+   // Detect AVX2.
+   static volatile unsigned haveAVX2;

    Inline comment, lebedev.ri: Note that you need to call `__builtin_cpu_init()` before calling `__builtin_cpu_supports()`. Or maybe it is already called before this?

    Inline comment, labath: The GCC manual says: "This built-in (__builtin_cpu_init) function needs to be invoked ..., only when used in a function that is executed before any constructors are called." So calling it here should not be necessary. However, I am still unable to get gcc (6.3) to return 1 here. Clang (since at least 3.8) seems to be doing fine, however, so that's probably enough for this test.

+ #ifdef _MSC_VER
+   haveAVX2 = 1; // TODO: Implement this for MSVC.
+ #else
+   haveAVX2 = __builtin_cpu_supports("avx2");
+ #endif
+
    unsigned int ymmallones = 0xFFFFFFFF;
    __asm__("int3;"
            "vbroadcastss %1, %%ymm0;"
            "vbroadcastss %0, %%ymm0;"
            "vbroadcastss %2, %%ymm1;"
            "vbroadcastss %0, %%ymm1;"
            "vbroadcastss %3, %%ymm2;"
            "vbroadcastss %0, %%ymm2;"
  [40 lines not shown]

Fix TestYMMRegisters for older machines without AVX2
AbandonedPublic