This is an archive of the discontinued LLVM Phabricator instance.

Consider increasing the default ARM32 max page size to 64k.
ClosedPublic

Authored by thieta on Apr 2 2020, 12:18 PM.

Details

Summary

After some discussion on the mailing list: http://lists.llvm.org/pipermail/llvm-dev/2020-April/140549.html I was recommended to open up a Pull Request for changing the default page size on ARM32 in LLD.

The current 4k page size doesn't work if the arm system has a higher page size configured. There are some systems out there that do this and it leads to the binary getting Killed! by the kernel, this is scenario that is hard to debug and a more compatible value might be good as we see adoption of lld go up.

Another thing to consider is that arm32 binaries can be executed on arm64 and there the 64k page size is pretty common. This is what GNU LD considered: https://github.com/bminor/binutils-gdb/commit/7572ca8989ead4c3425a1500bc241eaaeffa2c89

The question to the community is - would the increased compatibility be worth the hassle for the people that would have to tweak it with a option?

Note that the current diff doesn't update the tests to account for the new page size. There are 58 tests failing after this change, but I didn't want to update them all until I know if there is any interest in tweaking this value or not.

Diff Detail

Event Timeline

thieta created this revision.Apr 2 2020, 12:18 PM
MaskRay edited reviewers, added: psmith; removed: peter.smith.Apr 2 2020, 12:47 PM

From the mailing list:

The default page size in GNU ld for arm is 64k so binaries linked with ld
just worked on this device.

Can you help me verify that?

This is what GNU LD considered

But that link is referring to max page size, not the default page size?

srhines added a subscriber: srhines.Apr 2 2020, 5:23 PM

Hello Nick,

Maybe I am using the wrong terminology here and in that case I am sorry. What I mean is the default max page size for the arm32 target in lld. Here is the tests I can boil it down to:

Consider hello.c

#include <stdio.h>

int main(int argc, char** argv)
{
  printf("Hello World!\n");
  return 0;
}

I build this three different ways:

With lld no arguments about max page size:

clang --target=arm-linux-gnueabi -march=armv7-a -mthumb -mfloat-abi=soft -mfpu=vfpv3-d16 --sysroot=$TOOLCHAIN_DIR/arm-linux-gnueabi/libc --gcc-toolchain=$TOOLCHAIN_DIR hello.c -o hello-lld -fuse-ld=lld

With ld.bfd:

clang --target=arm-linux-gnueabi -march=armv7-a -mthumb -mfloat-abi=soft -mfpu=vfpv3-d16 --sysroot=$TOOLCHAIN_DIR/arm-linux-gnueabi/libc --gcc-toolchain=$TOOLCHAIN_DIR hello.c -o hello-bfd -fuse-ld=bfd

And finally with lld and max-page-size argument:

clang --target=arm-linux-gnueabi -march=armv7-a -mthumb -mfloat-abi=soft -mfpu=vfpv3-d16 --sysroot=$TOOLCHAIN_DIR/arm-linux-gnueabi/libc --gcc-toolchain=$TOOLCHAIN_DIR hello.c -o hello -fuse-ld=lld -Wl,-max-page-size=0x10000

I copied them over to the device and ran them:

root@WD-EX2100 root # ./hello-bfd
Hello World!
root@WD-EX2100 root # ./hello-lld
Killed
root@WD-EX2100 root # ./hello-lld-max-page
Hello World!
root@WD-EX2100 root #

As you see it works when we set the max-page-size, we can also verify this quickly with readelf:

➜ readelf -a hello-bfd | grep LOAD
  LOAD           0x000000 0x00010000 0x00010000 0x0043c 0x0043c R E 0x10000
  LOAD           0x00043c 0x0002043c 0x0002043c 0x0011c 0x00120 RW  0x10000

➜ readelf -a hello-lld | grep LOAD
  LOAD           0x000000 0x00010000 0x00010000 0x003e4 0x003e4 R   0x1000
  LOAD           0x0003e4 0x000113e4 0x000113e4 0x001cc 0x001cc R E 0x1000
  LOAD           0x0005b0 0x000125b0 0x000125b0 0x000d8 0x000d8 RW  0x1000
  LOAD           0x000688 0x00013688 0x00013688 0x00024 0x00025 RW  0x1000

➜ readelf -a hello-lld-max-page | grep LOAD
  LOAD           0x000000 0x00010000 0x00010000 0x003e4 0x003e4 R   0x10000
  LOAD           0x0003e4 0x000203e4 0x000203e4 0x001cc 0x001cc R E 0x10000
  LOAD           0x0005b0 0x000305b0 0x000305b0 0x000d8 0x000d8 RW  0x10000
  LOAD           0x000688 0x00040688 0x00040688 0x00024 0x00025 RW  0x10000

And just to confirm the device have the following page size configuration:

root@WD-EX2100 root # getconf PAGESIZE
32768

And we can try it on another device with another page size configuration:

root@Netgear-RN102:~# getconf PAGESIZE
4096
root@Netgear-RN102:~# ./hello-lld
Hello World!
root@Netgear-RN102:~# ./hello-bfd
Hello World!
root@Netgear-RN102:~# ./hello-lld-max-page
Hello World!

Hope this was the information you where looking for.

I also uploaded the binaries here: https://artifacts.plex.tv/testing/page_size_tests.tar.gz

Some references:
The GNU ld change to 64k max page size was made in 2014 https://patches.linaro.org/patch/32565/ [RFC] ld/ARM: Increase maximum page size to 64kB
My understanding from conversations at Arm and Linaro was that this was made primarily for the benefit of people running AArch32 executables on a AArch64 kernel built for 64k page sizes. In practice I don't think there ended up being a widely used distribution that simultaneously used 64K page size and offered a 32-bit user space. Ubuntu's decision to use 4k citing compatibility with old Arm v7 binaries https://wiki.ubuntu.com/ARM64/performance

The change to 64k page size for AArch64 in LLD D25079
Android driver forcing 4k page size in AArch64 D55029
Small number of Arm systems using non 4k page size in context of Gold https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=844467

As mentioned on llvm-dev the tests will need updating, I'm sorry this will be a rather tedious job.

My understanding is that Android and ChromeOs are the largest users of LLD on Arm that are most likely to be hurt by any image size changes. It is true that the change to adding a max-page-size in virtual memory rather than aligning to the next max-page-size will have helped mitigate this a bit, but I think we may still see an increase either side of RELRO. I'd very much like to hear from these folks to see if this hurts them, or if they are willing to force 4k alignment in the clang driver and if so how much time do they need to prepare for it?

We (Android Google folks) have an internal thread going on about what implications this might have, but I can say that I am concerned about making this kind of change for ARM Android targets. Even though we manually set a 4KB max page size in our platform builds for 32-bit, we also have to consider the needs of NDK developers (i.e. those folks making native applications/games/etc.). Some of their build configurations might behave very differently with this change, so my current thinking is that 64KB should not be the default for ARM Android targets.

thieta added a comment.Apr 3 2020, 6:16 AM

@srhines Thanks for your reply! Would it be enough for you guys to expand this block in clang https://github.com/llvm/llvm-project/blob/master/clang/lib/Driver/ToolChains/Linux.cpp#L256 for Android's usage? Or would a change in lld make more sense?

We (Android Google folks) have an internal thread going on about what implications this might have, but I can say that I am concerned about making this kind of change for ARM Android targets. Even though we manually set a 4KB max page size in our platform builds for 32-bit, we also have to consider the needs of NDK developers (i.e. those folks making native applications/games/etc.). Some of their build configurations might behave very differently with this change, so my current thinking is that 64KB should not be the default for ARM Android targets.

Can you say more about your concern? I was actually thinking the exact opposite -- that the special-case in the Driver to use 4K pages on Android AArch64 could very likely be removed, now that using the default 64k max-page-size will no longer cause the binary size to increase.

MaskRay added a comment.EditedApr 3 2020, 9:51 AM

We (Android Google folks) have an internal thread going on about what implications this might have, but I can say that I am concerned about making this kind of change for ARM Android targets. Even though we manually set a 4KB max page size in our platform builds for 32-bit, we also have to consider the needs of NDK developers (i.e. those folks making native applications/games/etc.). Some of their build configurations might behave very differently with this change, so my current thinking is that 64KB should not be the default for ARM Android targets.

Can you say more about your concern? I was actually thinking the exact opposite -- that the special-case in the Driver to use 4K pages on Android AArch64 could very likely be removed, now that using the default 64k max-page-size will no longer cause the binary size to increase.

I am slightly in favor of D77330 because the change will improve consistency with GNU ld, considering that the binary size is not affected by max-page-size at all (since D66749), and I think the patch does not weaken ASLR.
By default there are 4 PT_LOAD segments: R, RX, RW(RELRO) and RW(non-RELRO).
This patch mostly changes p_vaddr/p_paddr of RX, RW(RELRO) and RW(non-RELRO) and thus the distances between two adjacent segments.

My question is now more about whether Android uses -z separate-code and whether Android expects NDK developers (from the context above they seem to be able to enable arbitrary linker scripts and options) to use -z separate-code with -z max-page-size=65536. -z separate-code + -z max-page-size=65536 can waste a lot of bytes (up to 2 * 65536) due to padding. There can be padding before the RX and the RW(RELRO).

Deleting -z max-page-size from the clang driver should have no downside if they don't use -z separate-code.

The only thing that may motivate -z separate-code is probably eXecute-Only Memory (see --execute-only (D49456)) and probably SHF_ARM_PURECODE (rL326207) plus the intention to not share file bytes between R and RX.

So it turns out that the ld shipped in the NDK actually did have the 64KB max-page-size patch, so not taking this patch would change current user behavior. Thanks to @MaskRay for sharing the exact binutils patch we were looking for to verify that it has it. For that reason, this patch should be fine for our 32-bit ARM NDK users. For the platform, we'll see if we should just set the max-page-size back to 4KB explicitly in the build rules, but that shouldn't affect this CL.

We (Android Google folks) have an internal thread going on about what implications this might have, but I can say that I am concerned about making this kind of change for ARM Android targets. Even though we manually set a 4KB max page size in our platform builds for 32-bit, we also have to consider the needs of NDK developers (i.e. those folks making native applications/games/etc.). Some of their build configurations might behave very differently with this change, so my current thinking is that 64KB should not be the default for ARM Android targets.

Can you say more about your concern? I was actually thinking the exact opposite -- that the special-case in the Driver to use 4K pages on Android AArch64 could very likely be removed, now that using the default 64k max-page-size will no longer cause the binary size to increase.

Android is currently using -z separate-code for AArch64. Upping the default max page size will bloat the binaries even more (it costs us ~17MB alone on our platform image for -z separate-code with the 4KB max-page-size - Google-folk can look at http://b/150506341 for more details on the tradeoffs here). Thanks to @rprichard for pointing out recent changes here so that I hopefully can avoid regressing other size/performance behaviors unintentionally. For removing the special case in Clang, I'll go ahead and hard-code the 4KB max-page-size in aosp, so our behavior won't change with a future toolchain update. That will allow me to remove the driver hack safely. Thanks everyone for the good suggestions here, and for helping simplify things while not causing any unintended side effects.

I should have looked more closely at the AOSP sources, because we already hard-code the expected 4KB max-page-size. We do have some crufty flags that can be cleaned up, so I'll do that instead (https://android-review.googlesource.com/c/platform/build/soong/+/1278815). :) I'll also go ahead and create/test a CL to revert the Clang driver change forcing 4KB max-page-size next week.

So it sounds like I could go ahead and start changing the tests to match this? I can get to it during the weekend I hope. I anyone have a huge objection to this getting merged - let me know asap.

So it turns out that the ld shipped in the NDK actually did have the 64KB max-page-size patch, so not taking this patch would change current user behavior. Thanks to @MaskRay for sharing the exact binutils patch we were looking for to verify that it has it. For that reason, this patch should be fine for our 32-bit ARM NDK users. For the platform, we'll see if we should just set the max-page-size back to 4KB explicitly in the build rules, but that shouldn't affect this CL.

It's a bit hairier than this. The NDK ships ld.bfd, ld.gold, and ld.lld. The plain "ld" for arm32 is ld.gold, and "ld" for arm64 is ld.bfd. (AFAIK, that inconsistency has existed at least since NDK r10e. We're hoping to switch "ld" to ld.lld soon-ish for all architectures.)

Before the platform switched to lld, I think it was using ld.gold on all architectures.

  • For arm32, the NDK ld.gold(default), defaults to a max-page-size of 0x1000, but ld.bfd defaults to 0x10000 instead. (Even upstream arm32 gold looks like it uses 0x1000 still.)
  • For arm64, it looks like the NDK ld.bfd(default) has used a max-page-size of 0x10000 since at least r10e, but ld.gold switched from 0x1000 to 0x10000 between r15 and r16.

For arm32, I think a 64KiB max-page-size (on an OS where the pages are actually 4KiB) would create vaddr gaps between each object's segments. It looks like the Bionic loader maps a gap as inaccessible memory, but the kernel loader leaves the gap unmapped, so something else could be mapped into an executable's gap. If another DSO were mapped into that area, perhaps it could break something? (e.g. arm32 C++ exception unwinding can break. That was fixed for Q+, but old platform versions are still unfixed.) Newer devices seem to keep non-fixed mmap VMAs away from the executable's VMAs, though, which avoids the issue.

I think each DSO gap adds another VMA, and I think this change could add up to 3 of them per DSO. (IIRC, VMA usage is a problem because the bookkeeping uses expensive non-pageable kernel memory.) This issue is easy to fix, though, by building the arm32 platform with -z,max-page-size=0x1000. (I checked a 32-bit device's system_server. I saw 318 DSOs, and it looked like a 64KiB max-page-size could add up to 192KiB of vaddr padding per DSO, for a total increase of up to 60MiB of vaddr usage.)

thieta added a comment.Apr 7 2020, 7:47 AM

A small ping here.

If I understand it correctly Android build scripts would have to be adjusted if we change this to make sure the 4k max page size is still happening when building the core system libraries. Is that a bigger task so that we can't merge this or would it better with a clang driver change similar to what we do with android aarch64 here:https://github.com/llvm/llvm-project/blob/master/clang/lib/Driver/ToolChains/Linux.cpp#L256 - I am happy to include that in my work if that's something that would make life easier in android land.

I am still trying to find out if this is a change we want to land so that I can start working on the unit tests in that case.

I am still in favor of this change:

  • Better interoperability with AArch64
  • No size difference (for most cases)
  • Better compatibility GNU ld>=2014
  • We have plenty of approaches to specify -z max-page-size=4096 for Android NDK users
  • Importantly, we have a contributor who is willing to fix the 58 tests

git shortlog --after 2014-01-01 -- gold/arm.cc says there has been very little activity. I don't want us to be locked down by decisions in gold due to lack of maintenance.

Yeah, the change seems OK to me. I think we want Android to continue using a 4K max-page-size, though.

clang driver change similar to what we do with android aarch64 here:https://github.com/llvm/llvm-project/blob/master/clang/lib/Driver/ToolChains/Linux.cpp#L256

That seems fine to me.

I agree with Ryan. Please update the Clang driver for this to pass this same flag for 32-bit ARM as we already pass for AArch64.

I agree with Ryan. Please update the Clang driver for this to pass this same flag for 32-bit ARM as we already pass for AArch64.

I'll do it :/ D77746. The original description was outdated and I want to make some amendment to the test as well.

thieta updated this revision to Diff 256380.Apr 9 2020, 1:18 PM

I have started working on the tests - I have 17 left to do - but since the diff is going to be big I thought I wanted to upload this for now and start collecting feedback to make sure I am doing the right things here.

To be honest - I am not that well versed with the inner workings and the assembler that I am modifying here - I understand the alignments and addresses needs to change but not always why. But I am slowly working through it - please let me know if I am messing anything up here.

thieta marked an inline comment as done.Apr 9 2020, 1:20 PM
thieta added inline comments.
lld/test/ELF/arm-thumb-interwork-abs.s
31 ↗(On Diff #256380)

This looked a bit strange to me as I was working through this test. I am not sure if the -57352 is correct here it seems like a very large negative number which is not something I usually see in the other files. If anyone can double check it it would be great.

I can check this over, but I'll have to do it in my spare time after work so it may take some time to go through it all. I should be able to answer any specific questions without too much delay though so feel free to ask if there is anything that isn't clear? I'll be/am on Easter break right now, will probably be Tuesday before I can check again.

lld/test/ELF/arm-adr.s
85 ↗(On Diff #256380)

Apologies in advance for being a pain. Please could you fix up the comments? PC-relative Arm and Thumb instructions can be a bit strange if you aren't used to them, as the PC-bias of 8 on Arm and 4 on Thumb needs to be accounted for. I wrote a comment hand evaluating the instructions in a lot of these cases to show that the disassembled immediate was correct.

I would hope in the majority of cases, like this one, the fix up is relatively straightforward.

There will be more than just this test affected, at this stage I won't highlight each one. In most of the tests the comments are identified as / although this is a relatively new convention that may not exist older tests. When updating the comment please could you start with / rather than //? The reason behind the convention is so that it is easier to pick up comments from FileCheck lines.

lld/test/ELF/arm-thumb-interwork-abs.s
31 ↗(On Diff #256380)

The offset looks good to me. The symbol "sym" is defined via defsym to be 0x13001 which is higher than the call site with 4k pages, but lower when using 64k pages. Given that 0x21000 = 135168 in decimal a hand evaluation of the B instruction gives me 135168 + 8 - 57352 = 0x13000 which is what we'd expect the destination address to be for a Thumb defsym of 0x13001.

thieta marked 2 inline comments as done.Apr 12 2020, 5:38 AM
thieta added inline comments.
lld/test/ELF/arm-adr.s
85 ↗(On Diff #256380)

No problem - I can update the tests.

I am a bit confused about the comment about / - did you mean that comments should have three slashes first? or just one? Some tests seems to have comments with /// but I don't see anything with / and this also seems to annoy lit. So I am going to guess you mean /// - let me know if that's the wrong assumption.

lld/test/ELF/arm-thumb-interwork-abs.s
31 ↗(On Diff #256380)

Thanks for the clarification.

thieta marked an inline comment as done.Apr 12 2020, 5:55 AM
thieta added inline comments.
lld/test/ELF/arm-branch-rangethunk.s
12 ↗(On Diff #256380)

@psmith - I just remembered that I didn't get this test to work. @MaskRay told me that the Inputs/far-long-arm-abs.s had to be updated as well - but can't say I am sure how exactly.

So if anyone could show me the exact values or how to calculate them in the input file that would be very helpful.

psmith added inline comments.Apr 14 2020, 1:05 AM
lld/test/ELF/arm-adr.s
85 ↗(On Diff #256380)

Yes you've got it right; I meant 3 /// to start a non FileCheck comment.

/// 0x11808 + 0x8 + 0x400 = 0x11c10 = dat3

There are likely to be older tests that don't follow the new-ish convention, if you have to modify any of them it would be great to update them to follow the new convention.

lld/test/ELF/arm-branch-rangethunk.s
12 ↗(On Diff #256380)

Looking at the numbers in far-arm-abs.s and far-long-arm-abs.s the values of the symbols are:
far-arm-abs.s

too_far1 0x2020008
too_far2 0x202000c
too_far3 0x2020010

far-long-arm-abs.s

too_far1 0x2020014
too_far2 0x2020018
too_far3 0x202001c

The address of the first branch in the old disassembly is 0x20000, for far-arm-abs.s the offset is therefore 0x2020008 - 0x20000 = 0x2000008 which including the 8 byte pc-bias is just out of range of the maximum branch displacement. The short thunks can just about reach these addresses if placed after .text. In far-long-arm-abs.s the addresses are 0x2020014 - 0x20000 = 0x2000014 which is just out of range of the short thunk, but is in range of the long-thunk.

I think that if you alter the addresses of too_far* symbols such that the offset from the first branch instruction matches 0x2000008 for far-arm-abs.s and 0x2000014 for far-long-arm-abs.s then this should work.

While not strictly necessary, there is an opportunity for a small cleanup. The far symbol in far-arm-abs.s is not used by this test, only arm-branch.s. The far symbol in far-long-arm-abs.s is not used at all by any test. It may be worth splitting out far-arm-abs.s into two files?
far-arm-abs.s just contains the symbol far and is used by arm-branch.s
far-arm-short-abs.s
contains the too_far* symbols from far-arm-abs.s
far-arm-long-abs.s // contains the too_far* symbols from far-long-arm-abs.s with far removed.

thieta updated this revision to Diff 257724.Apr 15 2020, 8:08 AM

This adjusts the comments as suggested by @psmith and addresses all the tests except the following three:

lld :: ELF/arm-branch-rangethunk.s
lld :: ELF/arm-thunk-largesection.s
lld :: ELF/arm-thunk-multipass-plt.s

I am not able to get them to pass and would like some help to get them to work. Otherwise I hope this is more or less done. It's been kind of a slog getting through the tests but hopefully I did it correctly :)

Oh I see that there are some files I forgot to update the comments to have /// - I'll revisit them later this week.

This adjusts the comments as suggested by @psmith and addresses all the tests except the following three:

lld :: ELF/arm-branch-rangethunk.s
lld :: ELF/arm-thunk-largesection.s
lld :: ELF/arm-thunk-multipass-plt.s

I am not able to get them to pass and would like some help to get them to work. Otherwise I hope this is more or less done. It's been kind of a slog getting through the tests but hopefully I did it correctly :)

I'll take a look although it may take some time to go through these. May report back after I've looked at each one.

I've uploaded a fixed arm-branch-rangethunks.s

I took out the modification to raise the .balign to 0x20000 as I don't think it is necessary. I updated the comment and used --no-show-raw-insn in objdump -d as this makes it easier to read. Apart from that the only difference is that the .text section starts at 0x30000 instead of 0x20000 so I needed to add 0x10000 to the far-arm-abs.s and far-long-arm.s values. Can you check the attachment works and add it to the patch?

I'll try and get to the other 2 failing tests later this week or the weekend.

thieta updated this revision to Diff 257995.Apr 16 2020, 2:17 AM

I have applied @psmith changes to arm-branch-rangethunk.s and reworked arm-branch.s to work with the changes above.

thieta marked an inline comment as done.Apr 16 2020, 2:19 AM

we are now only down to two failing tests:

Failing Tests (2):
  lld :: ELF/arm-thunk-largesection.s
  lld :: ELF/arm-thunk-multipass-plt.s
lld/test/ELF/arm-branch-rangethunk.s
12 ↗(On Diff #256380)

This test passes now 🎉

thieta retitled this revision from Consider increasing the default ARM32 page size to 64k. to Consider increasing the default ARM32 max page size to 64k..Apr 16 2020, 2:21 AM

Attaching a diff that will fix the other 2 tests. The arm-thunk-multipass-plt.s has a .got.plt that is more heavily aligned so the PLT entries needed adjusting. The arm-thunk-largesection.s needed the section start and stop addresses adjusting for a higher start address. Can you check that these work and re-upload. I'll then do a run through all the tests to see if I can spot anything else.

thieta updated this revision to Diff 258415.Apr 17 2020, 1:42 PM

This applies the latest patch from @psmith which means that all the tests now passes! 🎉

@psmith This worked! All patches is now passing with the latest diff. I did a quick rebase as well - no conflicts.

MaskRay accepted this revision.Apr 17 2020, 3:31 PM

Thanks!

This revision is now accepted and ready to land.Apr 17 2020, 3:31 PM
psmith accepted this revision.Apr 18 2020, 2:49 AM

LGTM too, I'd be lying if I said I checked every one of the numbers, but I think the overall test changes are consistent with changing the page size. There are a couple of files arm-plt-reloc.s and arm-thumb-plt-reloc.s where there are some comments that could be made /, if you've got the time to fix prior to committing then great. I don't think that is important enough to hold up the patch though, I appreciate the amount of work that you've put in.

Thanks guys - I don't have commit access. So could anyone of you commit this? Please use my email address tobias@hieta.se if you commit as another author.

This revision was automatically updated to reflect the committed changes.

Hmm...seems to have regressed ARMv5 Linux kernel boots according to a bisection from @nathanchance (https://groups.google.com/g/clang-built-linux/c/aKDK-N6JN4g)

Hmm...seems to have regressed ARMv5 Linux kernel boots according to a bisection from @nathanchance (https://groups.google.com/g/clang-built-linux/c/aKDK-N6JN4g)

I don't think that URL is publicly visible.

In theory increasing the max page size shouldn't have changed anything significant, especially given that ld.bfd defaults to the same page size, it may have flushed out another bug though. Would it be possible to raise a PR with a reproducer?

Hmm...seems to have regressed ARMv5 Linux kernel boots according to a bisection from @nathanchance (https://groups.google.com/g/clang-built-linux/c/aKDK-N6JN4g)

@nathnchance Add -z max-page-size=4096 TO LDFLAGS (-z is also a recognized gcc/clang driver option) to restore the previous behavior. But as Peter said, this change likely flushed out another bug.

(I made a typo in the commit subject 6536->65536, sorry :) )

Hmm...seems to have regressed ARMv5 Linux kernel boots according to a bisection from @nathanchance (https://groups.google.com/g/clang-built-linux/c/aKDK-N6JN4g)

I don't think that URL is publicly visible.

I don't know why that started happening :/

In theory increasing the max page size shouldn't have changed anything significant, especially given that ld.bfd defaults to the same page size, it may have flushed out another bug though. Would it be possible to raise a PR with a reproducer?

Done: https://bugs.llvm.org/show_bug.cgi?id=45632

Feel free to ping me for more information/debugging as needed.

grimar added a comment.EditedApr 22 2020, 1:21 AM

Hmm...seems to have regressed ARMv5 Linux kernel boots according to a bisection from @nathanchance (https://groups.google.com/g/clang-built-linux/c/aKDK-N6JN4g)

I don't think that URL is publicly visible.

I don't know why that started happening :/

I think we saw such issue before. That time I think the URL of a page from incognito mode worked (i.e. a different URL to the same page).

Hmm...seems to have regressed ARMv5 Linux kernel boots according to a bisection from @nathanchance (https://groups.google.com/g/clang-built-linux/c/aKDK-N6JN4g)

I don't think that URL is publicly visible.

I don't know why that started happening :/

I think we saw such issue before. That time I think the URL of a page from incognito mode worked (i.e. a different URL to the same page).

https://groups.google.com/forum/#!topic/clang-built-linux/aKDK-N6JN4g is the publicly visible version of that page. I'm not quite sure why it works this way, but it wasn't too hard for me to find it. Thanks for pointing out that it should be visible even in incognito.

Hmm...seems to have regressed ARMv5 Linux kernel boots according to a bisection from @nathanchance (https://groups.google.com/g/clang-built-linux/c/aKDK-N6JN4g)

I don't think that URL is publicly visible.

I don't know why that started happening :/

In theory increasing the max page size shouldn't have changed anything significant, especially given that ld.bfd defaults to the same page size, it may have flushed out another bug though. Would it be possible to raise a PR with a reproducer?

Done: https://bugs.llvm.org/show_bug.cgi?id=45632

Feel free to ping me for more information/debugging as needed.

Thanks for the PR. The change to the pagesize to 64k, along with the lowest starting address of 0xc0008000 has meant that LLD now thinks it can add allocate the headers, ELF header and PT_PHDR in the first loadable segment. Other than that the only changes are the alignments of the program segments. At this point I don't know what is causing the problem. It is possible that the PT_PHDR and the extra program segment are confusing the boot-loader /compressor/decompressor.

ld.bfd with its 64k page size has a very different strategy for generating program headers. It generates 3 large program headers whereas LLD generates many smaller ones. I've put more details in the PR.

Anyhow I'm pretty sure that we've just flushed out another problem/difference between how LLD and BFD handle linker scripts that has turned out to cause a problem.