This is an archive of the discontinued LLVM Phabricator instance.

More importantly, even if it did, pointers are 32-bit values, not 64-bit values. But regardless, the original RV64 code is wrong: it's doing a load not a move, so is setting tp to whatever the first pointer in the static TLS block happens to be, which is clearly nonsense. This should just be:

static void set_thread_ptr(uintptr_t val) {
  LIBC_INLINE_ASM("mv tp, %0\n\t" : : "r"(val));
}

This revision now requires changes to proceed. Aug 29 2023, 2:27 PM

Implement code as suggested by reviewer
Updated commit message to reflect new changes

Harbormaster completed remote builds in B255640: Diff 554514. Aug 29 2023, 4:01 PM

In D159110#4626619, @mikhail.ramalho wrote:

Updated commit message to reflect new changes

You need to do that manually in Phabricator, arc won't sync your local changes to the commit message.

Change looks fine, but message is not, and I'd like to know how you're testing this. Presumably it's currently completely untested given it couldn't possibly have worked before.

For my knowledge, what is wrong with a tp load in the existing code?

In D159110#4626668, @sivachandra wrote:

For my knowledge, what is wrong with a tp load in the existing code?

Well it's the difference between x = p and x = *p. Only one is ever the right thing to do. Like I said, "it's doing a load not a move, so is setting tp to whatever the first pointer in the static TLS block happens to be", i.e. you're reading in the first sizeof(void *) bytes of data from .tdata (or .tbss) and using that as the thread pointer, not the region of memory that just got mapped for the thread pointer.
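The distinction being drawn here, x = p versus x = *p, can be modelled on any host without touching a real register. The following is only a sketch: model_tp, model_move, model_deref, and the two-word tls_block are hypothetical names invented for illustration, not part of the libc source, and the value 0x1234 is an arbitrary placeholder for whatever happens to sit at the start of the TLS block.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the tp register, so the sketch runs on any host.
static uintptr_t model_tp = 0;

// x = p: copy the address of the mapped region into tp.
static void model_move(uintptr_t val) { model_tp = val; }

// x = *p: read the first sizeof(void *) bytes stored at that address.
static void model_deref(uintptr_t val) {
  std::memcpy(&model_tp, reinterpret_cast<const void *>(val),
              sizeof(model_tp));
}

// Returns true when the two models behave as described in the comment above.
static bool move_and_deref_differ() {
  // Hypothetical TLS block whose first word is unrelated pointer-sized data.
  uintptr_t tls_block[2] = {0x1234, 0};
  const uintptr_t addr = reinterpret_cast<uintptr_t>(tls_block);

  model_move(addr);
  assert(model_tp == addr);   // tp points at the block

  model_deref(addr);
  assert(model_tp == 0x1234); // tp holds the block's first word instead

  return true;
}
```

Only the first form leaves the thread pointer referring to the freshly mapped region; the second makes it equal to whatever data the block starts with.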

In D159110#4626672, @jrtc27 wrote:

Well it's the difference between x = p and x = *p. Only one is ever the right thing to do. Like I said, "it's doing a load not a move, so is setting tp to whatever the first pointer in the static TLS block happens to be", i.e. you're reading in the first sizeof(void *) bytes of data from .tdata (or .tbss) and using that as the thread pointer, not the region of memory that just got mapped for the thread pointer.

Shouldn't the ld operation overwrite tp and not read from tp? The assembly looks like this: https://godbolt.org/z/PTqfTcshv

In D159110#4626641, @jrtc27 wrote:

You need to do that manually in Phabricator, arc won't sync your local changes to the commit message.

Ok.

Change looks fine, but message is not, and I'd like to know how you're testing this. Presumably it's currently completely untested given it couldn't possibly have worked before.

For rv64: I test it locally on my VisionFive V2 board with ninja check-libc, which covers all the tests in libc, even more than what's being tested on the buildbots.
For rv32: I test it locally with qemu-system, with ninja libc-unit-tests, as our buildbots do for rv64. I do get some random crashes with ninja check-libc with or without this patch. I'll only investigate these crashes once I get ninja libc-unit-tests passing.

In D159110#4626682, @sivachandra wrote:

Shouldn't the ld operation overwrite tp and not read from tp? The assembly looks like this: https://godbolt.org/z/PTqfTcshv

Yes, it writes to tp (which is currently junk / 0), with the result of *(void **)val, where val points into the allocated TLS region, at the first byte of the static TLS block within it. Therefore it does exactly what I have now twice said it does.

In D159110#4626727, @jrtc27 wrote:

Yes, it writes to tp (which is currently junk / 0), with the result of *(void **)val, where val points into the allocated TLS region, at the first byte of the static TLS block within it. Therefore it does exactly what I have now twice said it does.

That is, currently the code does tp = *(void **)val;, but it should do tp = val;.

In D159110#4626688, @mikhail.ramalho wrote:

For rv64: I test it locally on my VisionFive V2 board with ninja check-libc, which covers all the tests in libc, even more than what's being tested on the buildbots.
For rv32: I test it locally with qemu-system, with ninja libc-unit-tests, as our buildbots do for rv64. I do get some random crashes with ninja check-libc with or without this patch. I'll only investigate these crashes once I get ninja libc-unit-tests passing.

Does ninja check-libc on RV64 pass without this change? If so then there's no test coverage for this. If not then it's concerning that broken code was committed.

In D159110#4626744, @jrtc27 wrote:

Yes, it writes to tp (which is currently junk / 0), with the result of *(void **)val, where val points into the allocated TLS region, at the first byte of the static TLS block within it. Therefore it does exactly what I have now twice said it does.

That is, currently the code does tp = *(void **)val;, but it should do tp = val;.

What I am not getting is why we should even consider a dereference operation in the ld case. The way I understand it, this is what is happening with ld:

1. Store val (which is in a0) on the stack.
2. Load tp with the value stored on the stack in the above step.

So, there is no dereference happening anywhere?

In D159110#4626771, @sivachandra wrote:

What I am not getting is why we should even consider a dereference operation in the ld case. The way I understand it, this is what is happening with ld:

1. Store val (which is in a0) on the stack.
2. Load tp with the value stored on the stack in the above step.

So, there is no dereference happening anywhere?

Oh sorry, right, it's an m constraint, so the operand is implicitly &val. Bleh, I had skimmed over that. Then yes, the existing code was correct; it just added pointless indirection and was XLEN-specific. This change is still the sensible way to do things.

OK with commit message updated.
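The resolution above turns on what an "m" operand denotes: the memory location of the local val itself, so the load reads back val rather than dereferencing it. A host-runnable model of that equivalence might look like the sketch below. The names model_tp_reg, model_load_from_m_operand, and model_move_from_r_operand are hypothetical, and the address 0xabcd000 is an arbitrary placeholder; the real code uses inline asm against the tp register.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the tp register.
static uintptr_t model_tp_reg = 0;

// Models `ld tp, %0` with "m"(val): %0 names the memory slot that holds the
// local val, so the load reads from &val and yields val itself.
static void model_load_from_m_operand(uintptr_t val) {
  const uintptr_t *slot = &val; // the location an "m" operand refers to
  std::memcpy(&model_tp_reg, slot, sizeof(model_tp_reg));
}

// Models `mv tp, %0` with "r"(val): the value is copied register to register.
static void model_move_from_r_operand(uintptr_t val) { model_tp_reg = val; }

// Returns true when both forms leave the same value in the modelled register.
static bool m_and_r_forms_agree() {
  const uintptr_t addr = 0xabcd000; // placeholder thread pointer value

  model_load_from_m_operand(addr);
  const uintptr_t via_load = model_tp_reg;

  model_move_from_r_operand(addr);
  const uintptr_t via_move = model_tp_reg;

  assert(via_load == addr);
  assert(via_move == addr);
  return via_load == via_move;
}
```

Both forms produce the same result, which matches the conclusion that the old code was correct. The move avoids the round trip through memory, and since ld is a 64-bit load while mv is width-agnostic, the move form is the one that carries over to RV32 unchanged.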

This revision was not accepted when it landed; it landed in state Needs Review. Aug 30 2023, 7:31 AM

This revision was landed with ongoing or failed builds.

Closed by commit rGb0272d8ec349: [libc] Fix set_thread_ptr call in rv32 start up code (authored by Mikhail R. Gadelha <mikhail@igalia.com>).

This revision was automatically updated to reflect the committed changes.

Mikhail R. Gadelha <mikhail@igalia.com> added a commit: rGb0272d8ec349: [libc] Fix set_thread_ptr call in rv32 start up code.

Revision Contents

Path: libc/startup/linux/riscv64/start.cpp
Size: 2 lines

Diff 554715

libc/startup/linux/riscv64/start.cpp

 void cleanup_tls(uintptr_t addr, uintptr_t size) {
   if (size == 0)
     return;
   __llvm_libc::syscall_impl<long>(SYS_munmap, addr, size);
 }

 static void set_thread_ptr(uintptr_t val) {
-  LIBC_INLINE_ASM("ld tp, %0\n\t" : : "m"(val));
+  LIBC_INLINE_ASM("mv tp, %0\n\t" : : "r"(val));
 }

 using InitCallback = void(int, char **, char **);
 using FiniCallback = void(void);

 extern "C" {
 // These arrays are present in the .init_array and .fini_array sections.
 // The symbols are inserted by linker when it sees references to them.
