Bdragon28 (Brandon Bergren)
Animal

Projects

User does not belong to any projects.

User Details

User Since
Jan 11 2019, 6:48 AM (88 w, 6 d)

Recent Activity

Thu, Sep 17

Bdragon28 added a comment to D79916: Map -O to -O1 instead of -O2.

But also you really should not get warnings for unused static functions in included headers, only ones defined in the C source file itself. We'd have countless warnings in the kernel across all architectures otherwise.

I agree. But that's what it is doing when using always_inline in combination with -Wunused-function.

There is currently no real usage of always_inline in system headers though, so maybe I'm just the first to complain about it?

We use them in CheriBSD and have no such issues that I've ever noticed. When was the last time you checked (and what compiler)?

Thu, Sep 17, 11:38 AM · Restricted Project
Bdragon28 added a comment to D79916: Map -O to -O1 instead of -O2.

But also you really should not get warnings for unused static functions in included headers, only ones defined in the C source file itself. We'd have countless warnings in the kernel across all architectures otherwise.

Thu, Sep 17, 11:29 AM · Restricted Project
Bdragon28 added a comment to D79916: Map -O to -O1 instead of -O2.

This has significantly regressed FreeBSD's performance with the new version of Clang. It seems Clang does not inline functions at -O1, unlike GCC, and since FreeBSD currently compiles its kernel with -O whenever debug symbols are enabled[1] (which, of course, is almost always true), this results in all its static inline helper functions not being inlined at all, a pattern that is common in the kernel, used for things like get_curthread and the atomics implementations.

[1] This is a dubious decision made in r140400 in 2005 to provide "truer debugger stack traces" (well, before then there was ping-ponging between -O and -O2 based on concerns around correctness vs performance, but amd64 is an exception that has always used -O2 since r127180 it seems). Given that GCC will inline at -O, at least these days, the motivation seems to no longer exist, and compiling a kernel at anything other than -O2 (or maybe -O3) seems like a silly thing to do, but nevertheless it's what is currently done.

Cc: @dim @trasz

This is actually SUCH a bad idea that a kernel built with -O will *not work at all* on 32-bit powerpc platforms (presumably due to allocating stack frames in the middle of assembly fragments in the memory management that are supposed to be inlined at all times). I had to hack kern.pre.mk to request -O2 at all times just to get a functioning kernel.

Well, -O0, -O1, -O2 and -O should all produce working kernels, and any cases where they don't are compiler bugs (or kernel bugs if they rely on UB) that should be fixed, not worked around by tweaking the compiler flags in a fragile way until you get the behaviour relied on. Correctness and performance are very different issues here.

As an example:

static __inline void
mtsrin(vm_offset_t va, register_t value)
{

        __asm __volatile ("mtsrin %0,%1; isync" :: "r"(value), "r"(va));
}

This code is used in the mmu when bootstrapping the cpu like so:

for (i = 0; i < 16; i++)
        mtsrin(i << ADDR_SR_SHFT, kernel_pmap->pm_sr[i]);
powerpc_sync();

sdr = (u_int)moea_pteg_table | (moea_pteg_mask >> 10);
__asm __volatile("mtsdr1 %0" :: "r"(sdr));
isync();

tlbia();

During the loop there, we are in the middle of programming the MMU segment registers in real mode, and are supposed to be doing all work out of registers. (powerpc_sync() and isync() should likewise be expanded to their single assembly instruction, not a function call. The whole point of calling those is that we are in an inconsistent hardware state and need to sync up before continuing execution.)

If there isn't a way to force inlining, we will have to change to using preprocessor macros in cpufunc.h.

There is, it's called __attribute__((always_inline)) and supported by both GCC and Clang. But at -O0 you'll still have register allocation to deal with, so really that code is just fundamentally broken and should not be written in C. There is no way for you to guarantee stack spills are not used, it's way out of scope for C.
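As a minimal sketch of the attribute mentioned above: the helper below is hypothetical (it is not taken from cpufunc.h), and the shift value 28 is only an assumed stand-in for ADDR_SR_SHFT. It shows where the attribute goes on a static inline function. As the comment notes, this only forces inlining; it does not stop the compiler from spilling to the stack.

```c
#include <stdint.h>

/* Assumed stand-in for ADDR_SR_SHFT; the real value lives in the kernel headers. */
#define EXAMPLE_SR_SHIFT 28

/*
 * Hypothetical helper: always_inline asks GCC and Clang to inline every
 * call, even at -O0 or -O1, where plain "static inline" may be left as a
 * real function call.
 */
static inline __attribute__((always_inline)) uint32_t
example_sr_index(uint32_t va)
{
    return (va >> EXAMPLE_SR_SHIFT) & 0xfu;
}

/* Walks the 16 segment indices the same way the bootstrap loop does. */
uint32_t
example_sum_sr_indices(void)
{
    uint32_t i, sum = 0;

    for (i = 0; i < 16; i++)
        sum += example_sr_index(i << EXAMPLE_SR_SHIFT);
    return sum;
}
```

Compiling this with -O0 and inspecting the assembly should show no call to example_sr_index, but the loop variables may still live on the stack.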

Thu, Sep 17, 11:13 AM · Restricted Project
Bdragon28 added a comment to D79916: Map -O to -O1 instead of -O2.

This has significantly regressed FreeBSD's performance with the new version of Clang. It seems Clang does not inline functions at -O1, unlike GCC, and since FreeBSD currently compiles its kernel with -O whenever debug symbols are enabled[1] (which, of course, is almost always true), this results in all its static inline helper functions not being inlined at all, a pattern that is common in the kernel, used for things like get_curthread and the atomics implementations.

[1] This is a dubious decision made in r140400 in 2005 to provide "truer debugger stack traces" (well, before then there was ping-ponging between -O and -O2 based on concerns around correctness vs performance, but amd64 is an exception that has always used -O2 since r127180 it seems). Given that GCC will inline at -O, at least these days, the motivation seems to no longer exist, and compiling a kernel at anything other than -O2 (or maybe -O3) seems like a silly thing to do, but nevertheless it's what is currently done.

Cc: @dim @trasz

This is actually SUCH a bad idea that a kernel built with -O will *not work at all* on 32-bit powerpc platforms (presumably due to allocating stack frames in the middle of assembly fragments in the memory management that are supposed to be inlined at all times). I had to hack kern.pre.mk to request -O2 at all times just to get a functioning kernel.

Well, -O0, -O1, -O2 and -O should all produce working kernels, and any cases where they don't are compiler bugs (or kernel bugs if they rely on UB) that should be fixed, not worked around by tweaking the compiler flags in a fragile way until you get the behaviour relied on. Correctness and performance are very different issues here.

As an example:

static __inline void
mtsrin(vm_offset_t va, register_t value)
{

        __asm __volatile ("mtsrin %0,%1; isync" :: "r"(value), "r"(va));
}

This code is used in the mmu when bootstrapping the cpu like so:

for (i = 0; i < 16; i++)
        mtsrin(i << ADDR_SR_SHFT, kernel_pmap->pm_sr[i]);
powerpc_sync();

sdr = (u_int)moea_pteg_table | (moea_pteg_mask >> 10);
__asm __volatile("mtsdr1 %0" :: "r"(sdr));
isync();

tlbia();

During the loop there, we are in the middle of programming the MMU segment registers in real mode, and are supposed to be doing all work out of registers. (powerpc_sync() and isync() should likewise be expanded to their single assembly instruction, not a function call. The whole point of calling those is that we are in an inconsistent hardware state and need to sync up before continuing execution.)

If there isn't a way to force inlining, we will have to change to using preprocessor macros in cpufunc.h.

Thu, Sep 17, 11:12 AM · Restricted Project
Bdragon28 added a comment to D79916: Map -O to -O1 instead of -O2.

This has significantly regressed FreeBSD's performance with the new version of Clang. It seems Clang does not inline functions at -O1, unlike GCC, and since FreeBSD currently compiles its kernel with -O whenever debug symbols are enabled[1] (which, of course, is almost always true), this results in all its static inline helper functions not being inlined at all, a pattern that is common in the kernel, used for things like get_curthread and the atomics implementations.

[1] This is a dubious decision made in r140400 in 2005 to provide "truer debugger stack traces" (well, before then there was ping-ponging between -O and -O2 based on concerns around correctness vs performance, but amd64 is an exception that has always used -O2 since r127180 it seems). Given that GCC will inline at -O, at least these days, the motivation seems to no longer exist, and compiling a kernel at anything other than -O2 (or maybe -O3) seems like a silly thing to do, but nevertheless it's what is currently done.

Cc: @dim @trasz

This is actually SUCH a bad idea that a kernel built with -O will *not work at all* on 32-bit powerpc platforms (presumably due to allocating stack frames in the middle of assembly fragments in the memory management that are supposed to be inlined at all times). I had to hack kern.pre.mk to request -O2 at all times just to get a functioning kernel.

Well, -O0, -O1, -O2 and -O should all produce working kernels, and any cases where they don't are compiler bugs (or kernel bugs if they rely on UB) that should be fixed, not worked around by tweaking the compiler flags in a fragile way until you get the behaviour relied on. Correctness and performance are very different issues here.

Thu, Sep 17, 11:10 AM · Restricted Project
Bdragon28 added a comment to D79916: Map -O to -O1 instead of -O2.

This has significantly regressed FreeBSD's performance with the new version of Clang. It seems Clang does not inline functions at -O1, unlike GCC, and since FreeBSD currently compiles its kernel with -O whenever debug symbols are enabled[1] (which, of course, is almost always true), this results in all its static inline helper functions not being inlined at all, a pattern that is common in the kernel, used for things like get_curthread and the atomics implementations.

[1] This is a dubious decision made in r140400 in 2005 to provide "truer debugger stack traces" (well, before then there was ping-ponging between -O and -O2 based on concerns around correctness vs performance, but amd64 is an exception that has always used -O2 since r127180 it seems). Given that GCC will inline at -O, at least these days, the motivation seems to no longer exist, and compiling a kernel at anything other than -O2 (or maybe -O3) seems like a silly thing to do, but nevertheless it's what is currently done.

Cc: @dim @trasz

Thu, Sep 17, 10:47 AM · Restricted Project

Sat, Sep 12

Bdragon28 added a comment to D73425: [PPC] Fix platform definitions when compiling FreeBSD powerpc64 as LE.

That's fair. Will just use a patch on the FreeBSD side and revisit after 11.0.0 is released. Thanks.

Sat, Sep 12, 11:27 AM · Restricted Project, Restricted Project

Thu, Sep 10

Bdragon28 added a comment to D73425: [PPC] Fix platform definitions when compiling FreeBSD powerpc64 as LE.

Any chance of a backport to 11?

Thu, Sep 10, 12:11 PM · Restricted Project, Restricted Project

Wed, Aug 26

Bdragon28 added inline comments to D73425: [PPC] Fix platform definitions when compiling FreeBSD powerpc64 as LE.
Wed, Aug 26, 1:49 PM · Restricted Project, Restricted Project
Bdragon28 added inline comments to D73425: [PPC] Fix platform definitions when compiling FreeBSD powerpc64 as LE.
Wed, Aug 26, 1:45 PM · Restricted Project, Restricted Project

Tue, Aug 25

Bdragon28 updated the diff for D73425: [PPC] Fix platform definitions when compiling FreeBSD powerpc64 as LE.

Use correct target for FreeBSD driver test.

Tue, Aug 25, 5:47 PM · Restricted Project, Restricted Project
Bdragon28 added inline comments to D73425: [PPC] Fix platform definitions when compiling FreeBSD powerpc64 as LE.
Tue, Aug 25, 5:24 PM · Restricted Project, Restricted Project
Bdragon28 updated the diff for D73425: [PPC] Fix platform definitions when compiling FreeBSD powerpc64 as LE.

Add some tests for the new target.

Tue, Aug 25, 4:29 PM · Restricted Project, Restricted Project
Bdragon28 added inline comments to D86489: [PowerPC] Add addtional test that retroactively catches PR47259.
Tue, Aug 25, 12:09 PM · Restricted Project

Aug 24 2020

Bdragon28 updated the summary of D86489: [PowerPC] Add addtional test that retroactively catches PR47259.
Aug 24 2020, 2:44 PM · Restricted Project
Bdragon28 requested review of D86489: [PowerPC] Add addtional test that retroactively catches PR47259.
Aug 24 2020, 2:42 PM · Restricted Project

May 20 2020

Bdragon28 added inline comments to D79977: [ELF][PPC64] Synthesize _savegpr[01]_{14..31} and _restgpr[01]_{14..31}.
May 20 2020, 12:37 PM · Restricted Project

Mar 1 2020

Bdragon28 added a comment to D75416: [PowerPC][ELF] Place .toc in the same COMDAT group as the target object.

Here's the LLD_REPRODUCE using the cross toolchain (since I had it already and I hit the same thing on the backport, this one is probably the best one to look at)

Mar 1 2020, 7:53 PM · Restricted Project
Bdragon28 added a comment to D75416: [PowerPC][ELF] Place .toc in the same COMDAT group as the target object.

OK, I reinterpreted 9569a1472ee7fee37f7f991d34634c5d8d1f3559 (as a prereq to reduce the churn) and this patch in the context of llvm10 (mainly re-uppercasing function names so that stuff applies and converting the MCSection::NonUniqueID bit back into ~0) so I can run it in-tree for a FreeBSD buildworld, and I hit the exact same failure, so it looks like there's an actual problem here.

Mar 1 2020, 7:44 PM · Restricted Project
Bdragon28 added a comment to D75416: [PowerPC][ELF] Place .toc in the same COMDAT group as the target object.

I'm pretty sure I'm just tripping over C++ weirdness related to trying to use an external LLVM HEAD compiler as a toolchain for the same platform. I will retry from my cross build machine.

Mar 1 2020, 6:54 PM · Restricted Project
Bdragon28 added a comment to D75416: [PowerPC][ELF] Place .toc in the same COMDAT group as the target object.

Right, which is why it was so surprising to see it again in HEAD. I'm still trying to get a reproducer.

Mar 1 2020, 6:40 PM · Restricted Project
Bdragon28 accepted D75419: [ELF][PPC32] Don't report "relocation refers to a discarded section" for .got2.

This fixes my problem encountered trying to build kyua and atf for lld10-native FreeBSD ppc32.

Mar 1 2020, 5:11 PM · Restricted Project
Bdragon28 added a comment to D75416: [PowerPC][ELF] Place .toc in the same COMDAT group as the target object.

Still working on testing this one.

Mar 1 2020, 5:08 PM · Restricted Project

Feb 28 2020

Bdragon28 accepted D75394: [ELF][PPC32] Fix canonical PLTs when the order does not match the PLT order.

From a FreeBSD standpoint, I am very happy with this.

Feb 28 2020, 10:19 PM · Restricted Project
Bdragon28 added a comment to D75394: [ELF][PPC32] Fix canonical PLTs when the order does not match the PLT order.

AWESOME, this fixed the clang crash I was seeing with lld10 as well! I think it's possible this might be the last crashing bug blocking ppc32 lld migration.

Feb 28 2020, 6:32 PM · Restricted Project
Bdragon28 added a comment to D75394: [ELF][PPC32] Fix canonical PLTs when the order does not match the PLT order.

This patch will be needed on FreeBSD to fix some crashes for an lld10-linked FreeBSD powerpc32 world, such as a crash running /usr/bin/objdump -D.

Feb 28 2020, 6:21 PM · Restricted Project

Jan 28 2020

Bdragon28 accepted D73532: [ELF][PPC32] Support --emit-relocs link of R_PPC_PLTREL24.

Looks correct to me, works great, and fixes the last lld-related problem I am aware of currently on FreeBSD powerpc32.

Jan 28 2020, 10:58 AM · Restricted Project
Bdragon28 added a comment to D73532: [ELF][PPC32] Support --emit-relocs link of R_PPC_PLTREL24.

I am now able to load pf.ko on FreeBSD powerpc32 without it crashing.

Jan 28 2020, 7:54 AM · Restricted Project

Jan 25 2020

Bdragon28 created D73425: [PPC] Fix platform definitions when compiling FreeBSD powerpc64 as LE.
Jan 25 2020, 10:43 PM · Restricted Project, Restricted Project
Bdragon28 accepted D73424: [ELF][PPC32] Support range extension thunk with addends.

This is necessary to run large programs like clang on FreeBSD powerpc32 when built with lld, because of REL24 branches in the crt startup code (done on purpose for object reusability reasons -- the csu objects need to work on both non-pic and pic, so they just use a plain bl and rely on the link editor to do a thunk if necessary.)
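To illustrate why a plain bl can need a thunk in a large binary: a REL24 branch encodes a 24-bit word offset, which gives a signed 26-bit byte displacement (roughly ±32 MiB). The check below is a sketch with a hypothetical function name, not code from lld.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true when a branch displacement (target
 * minus branch address, in bytes) fits a REL24 field. The field holds
 * 24 bits that are shifted left by 2, so the displacement must be
 * word-aligned and lie in [-2^25, 2^25).
 */
bool
example_fits_rel24(int64_t disp)
{
    const int64_t limit = INT64_C(1) << 25;

    return (disp & 3) == 0 && disp >= -limit && disp < limit;
}
```

When the check fails, the link editor has to route the call through a range extension thunk, which is what the patch under review adds for branches with addends.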

Jan 25 2020, 10:21 PM · Restricted Project
Bdragon28 accepted D73399: [ELF][PPC32] Support canonical PLT.

Working great!

Jan 25 2020, 5:56 PM · Restricted Project
Bdragon28 added a comment to D73399: [ELF][PPC32] Support canonical PLT.

This version looks very promising so far. Will do a full FreeBSD buildworld test / boot on G4. I expect this version to have fixed the issue, but will verify.

Jan 25 2020, 3:32 PM · Restricted Project
Bdragon28 added a comment to D73399: [ELF][PPC32] Support canonical PLT.

Excellent! This looks like it fixed the FreeBSD putchar problem for me. I will test this with a full powerpc32 buildworld and report back.

Jan 25 2020, 12:20 PM · Restricted Project

Jan 24 2020

Bdragon28 accepted D73255: [ELF][PowerPC] Support R_PPC_COPY and R_PPC64_COPY.

While I ran into further problems on ppc32, I think it's just because there are further problems on ppc32 (unrelated to this one). At the very least this gets FreeBSD ppc32/lld10 as far as the rescue shell, which is great progress!

Jan 24 2020, 8:31 AM · Restricted Project

Jan 23 2020

Bdragon28 added a comment to D73255: [ELF][PowerPC] Support R_PPC_COPY and R_PPC64_COPY.

For anyone who is curious about this, copy relocations are especially needed on powerpc32, where they are used to implement global data (such as sharing a pointer between a .so and the main program) in situations where the main program's copy lives in .bss.

Jan 23 2020, 9:25 AM · Restricted Project

Jan 7 2020

Bdragon28 added a comment to D72363: [PowerPC] Default ppc64 linux-gnu/freebsd to -fno-PIC.

I am definitely in favor of this change, as the defaulting to PIC has been causing headaches in the FreeBSD kernel.

Jan 7 2020, 2:54 PM · Restricted Project

Nov 29 2019

Bdragon28 added a comment to D70570: [PowerPC] Only use PLT annotations if using PIC relocation model.

On the FreeBSD kernel side, I put some time into working out the missing parts on ppc32 and slightly extending the kernel linker to support secure-plt PIC kernel modules. Working on that over at https://reviews.freebsd.org/D22608.

Nov 29 2019, 1:27 PM · Restricted Project

Nov 25 2019

Bdragon28 added a comment to D70570: [PowerPC] Only use PLT annotations if using PIC relocation model.

This doesn't seem quite sufficient for FreeBSD kernel modules. It's emitting R_PPC_REL24 instead of R_PPC_ADDR16_HA/R_PPC_ADDR16_LO pairs for stuff like memset and memcpy, which are unusable because we're running modules in KVA and the kernel text in the 32 bit DMAP. Still need to figure out why these are being emitted this way when using freestanding.
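For readers unfamiliar with the ADDR16_HA/ADDR16_LO pair: it splits a full 32-bit absolute address into two 16-bit halves that a lis/addi style sequence can rebuild, so the call target is not limited by branch range. The sketch below uses hypothetical function names and only shows the arithmetic, not how the compiler or linker emits it.

```c
#include <stdint.h>

/* Low 16 bits of the address (ADDR16_LO). */
uint16_t
example_addr16_lo(uint32_t addr)
{
    return (uint16_t)(addr & 0xffffu);
}

/*
 * High-adjusted 16 bits (ADDR16_HA). The 0x8000 bias compensates for the
 * low half being sign-extended when it is added back.
 */
uint16_t
example_addr16_ha(uint32_t addr)
{
    return (uint16_t)(((addr + 0x8000u) >> 16) & 0xffffu);
}

/* Rebuilds the address the way "load high, then add signed low" would. */
uint32_t
example_rebuild(uint16_t ha, uint16_t lo)
{
    return ((uint32_t)ha << 16) + (uint32_t)(int32_t)(int16_t)lo;
}
```

For example, 0x12348000 splits into ha = 0x1235 and lo = 0x8000; the low half sign-extends to a negative value, which the bumped high half cancels out.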

Nov 25 2019, 9:01 AM · Restricted Project

Aug 8 2019

Bdragon28 added a comment to D64906: [ELF][PPC] Allow PT_LOAD to have overlapping p_offset ranges.

It turns out I need the proper fix after all in my local tree, because lately I have been working on getting a working Petitboot loader binary, and that means I'm technically cross compiling code for ppc64le Linux. So yeah, it would be very nice to get this in.

Aug 8 2019, 10:58 AM · Restricted Project

May 14 2019

Bdragon28 added inline comments to D61792: [PPC] Fix 32-bit build of libunwind.
May 14 2019, 2:50 PM · Restricted Project, Restricted Project

May 7 2019

Bdragon28 added a comment to D61647: llvm-objdump: when ELF st_other field is set, print its value before symbol name.

So for ELF, you would want to do this logic INSTEAD of the hidden check immediately above.

May 7 2019, 4:46 PM · Restricted Project
Bdragon28 added a comment to D61647: llvm-objdump: when ELF st_other field is set, print its value before symbol name.

So basically, it's either printing nothing (if st_other is 0), .internal (if st_other is 1), .hidden (if st_other is 2), .protected (if st_other is 3), or hex (anything other than these exact st_other values.)
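The mapping described above can be sketched as a small lookup. This is a hypothetical helper written for illustration, not the code from binutils or llvm-objdump, and the exact hex format string is an assumption.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: returns the label to print for a given st_other
 * value. Values 0..3 map to fixed strings; anything else is formatted as
 * hex into the caller-supplied buffer.
 */
const char *
example_st_other_label(unsigned char st_other, char *buf, size_t buflen)
{
    switch (st_other) {
    case 0:
        return "";
    case 1:
        return ".internal";
    case 2:
        return ".hidden";
    case 3:
        return ".protected";
    default:
        /* Assumed format; the real tools may pad or prefix differently. */
        snprintf(buf, buflen, "0x%02x", st_other);
        return buf;
    }
}
```

Note that the switch compares the whole st_other byte, so a value such as 0x62 falls through to the hex case rather than being reported as .hidden.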

May 7 2019, 4:41 PM · Restricted Project
Bdragon28 added a comment to D61647: llvm-objdump: when ELF st_other field is set, print its value before symbol name.

The relevant binutils objdump logic is contained in bfd/elf.c, specifically the bfd_elf_print_symbol() function.

May 7 2019, 4:32 PM · Restricted Project

Jan 11 2019

Bdragon28 added a comment to D56586: [PPC64] Update LocalEntry from assigned symbols.

One particular place this is needed is to handle FreeBSD powerpc64 (ELFv2 experimental) libc's weak symbols / symbol aliasing, especially when using LLD (in my experience, binutils ld will partially compensate for this issue.)

Jan 11 2019, 7:34 AM · Restricted Project