- User Since
- Apr 11 2016, 8:56 AM (233 w, 2 d)
Apr 16 2020
That is my understanding too from my comment in https://reviews.llvm.org/D72892#1923502
I have removed the tests from LLD that used the relocations coming from these instructions and have replaced them with .inst and .reloc. Will be worth making sure that ninja check-lld passes with this change applies. I'm happy for this to be applied. Will need to check with the other reviewers to make sure they are happy.
Feb 21 2020
I think this is close, but there are a couple of test failures with ninja check-llvm
Feb 18 2020
Thanks for the update. Apart from the extra / needed in some of the comment strings this looks good to me. Happy for you to add those prior to commit.
Feb 17 2020
Just noticed that PR44929 https://bugs.llvm.org/show_bug.cgi?id=44929 had been traced to this change. I've put a suggestion on how we might be able to fix it in the PR, if you have any thoughts please comment there.
Feb 13 2020
One query aside, this looks good to me. The original idea for glibc was to automatically have the dynamic loader sign the .plt.got entries on the presence of AARCH64_PAC_PLT, it turns out that those patches haven't been accepted in glibc yet. Android does not use lazy binding and marks the .plt.got as RELRO so PAC isn't much use there. I've marked as approved, but please wait a day or so to see if there are any comments from other timezones.
Adding other regular reviewers.
Feb 11 2020
Updated to split the logic for the warning from the output. This permits some simplification. Updated tests.
Feb 9 2020
I've updated the diff to add a warning about no interworking for the cases when the behaviour changes from lld 10.0. I've set the details in a large comment in front of warnOnStateMismatch.
Reopening as the addition of the warning is a non-trivial change.
Feb 6 2020
Feb 5 2020
This is a deliberate choice so that we don't need to know the section where foo and bar are defined beforehand.
Just want to make sure this is well defined with respect to the description in ELF.
SHF_LINK_ORDER This flag adds special ordering requirements for link editors. The requirements apply if the sh_link field of this section's header references another section (the linked-to section). If this section is combined with other sections in the output file, it must appear in the same relative order with respect to those sections, as the linked-to section appears with respect to sections the linked-to section is combined with.
Feb 3 2020
I think this is the right thing to do. According to https://sourceware.org/binutils/docs/ld/Options.html#Options
Add the archive or object file specified by namespec to the list of files to link. This option may be used any number of times. If namespec is of the form :filename, ld will search the library path for a file called filename, otherwise it will search the library path for a file called libnamespec.a.
An argument could be made that LLD could be a bit cleverer and only add the .a when it doesn't exist, although that would fail in the contrived case that someone had explicitly named their library lib name.a.a however it would be good to fix this in clang so that older versions of LLD will work.
If I've understood correctly this would make LLVM more aggressive for PIC relocation models, but perhaps more honest in an inter procedural optimisation context? The code changes look fine to me, I'm wondering if we've discussed this widely enough with the community to work out how to proceed here. For example do we have plan to test -fno-semantic-interposition well enough for it to be relied on. The consensus may already exist, and I don't know enough about it though.
Jan 29 2020
Thanks for the update, I agree that informing the assembler of pic status isn't the right way to go, particularly if the assembler can't check that the code that you have written is actually pic and also the potentially incompatibility with GNU as. I've no objections.
Thanks for the suggestions, I've applied them prior to commit.
Jan 28 2020
To make sure I understand, this is adding a local alias as at assembly time that can be used as a fixup target. This is necessary as at assembly time all we have is a global default visibility symbol which we have to assume can be interposed, even if the code-generator has already assumed it. Are there other uses that I'm missing?
As it stands this sounds reasonable to me, I'm trying to think if there are any other alternatives and what the trade-offs would be. For example is there no way that the assembler could find out the information at assembly time, or is that more complex than adding local aliases?
I think this is step in the right direction. I tend to find that relocation out of range errors tend to be more useful to linker developers (via bug reports) as there is often little that a user can do about it, and it often represents either a fault in the linker or a misuse of a relocation in the compiler, I think we could tend to include information that would help us diagnose problems.
Jan 27 2020
Thanks for the comments, I'll fix up those last ones tomorrow.
Removed redundant ==1, added test for branch to ifunc symbol.
Uploaded revision with explicit STT_NOTYPE symbol. Will wait till tomorrow to commit so I can handle any failures (just about to leave the office).
Jan 24 2020
Thanks for the update, I'm happy with the changes, will be worth waiting for George's ack as well.
Jan 23 2020
Thanks for the update, will be worth seeing what George thinks tomorrow.
Revert on the 10.0 release sounds reasonable to me. That would prevent the kernel from enabling the option and would prevent the build failure.
I should have been clearer, apologies; we're not blocked by the assertion failures, the CI won't halt and should pick up other build failures.
If the patchable functions is intended for clang-10 we'll need to make sure any fix is merged to clang-10.
This commit was made before release/10.x branch. Maybe the easiest thing is to revert the driver change in release/10.x (CC @hans), before we had a better understanding of the problem.
(Eventually I think the Linux kernel should have a better configure time test than a simple whether the compiler accepts -fpatchable-function-entry=2,0?)
Revert on the 10.0 release sounds reasonable to me. That would prevent the kernel from enabling the option and would prevent the build failure.
Looking at where we use relocateOne directly, I can see the PLT code, ARMErrataFix and AArch64ErrataFix , and Thunks that call relocateOne directly. For my selfish purposes this isn't a problem as for correcting ARM/Thumb interworking via BLX I'd only need to know the symbol type for R_ARM_CALL and R_ARM_THM_CALL which always have a symbol. I think that the R_ARM_THM_JUMP19, R_ARM_THM_JUMP24 and R_ARM_JUMP24 can be solved in needsThunk() which already has the information it needs.
-fpatchable-function-entry=2,0 is included in the release/10.x branch. I don't want to push the N,M where M>0 changes to release/10.x.
Although this particular commit will not be at fault, it is the option that enables the feature which is the earliest I can bisect the fault to. There are 3 files in linux that assert fail on the Implement the 'patchable-function attribute'. The files in question are kasan/quarantine.c, mm/slab_common.c and mm/slub.c .
Jan 22 2020
Jan 21 2020
Added a check for lld -v to make sure a detection message isn't printed.
LGTM too. Thanks for the update.
Jan 20 2020
Not got any objections to the refactoring, I'm not sure we need to share the loop traversal between getInputSections and getFirstInputSections (see comment inline) but I'm willing to go with the consensus on that.
Reading the BFD source it looks like stubs (which is what errata patches like this are generated) are not added for relocatable links, I couldn't find any ld tests for them and an empirical test with:
Thanks for the comments. It has been a bit of a nightmare of a day and I've only just got to this. I'll apply the suggestions and commit tomorrow morning so I can catch any problems.
It is also failing on the other 32-bit arm bots, example http://lab.llvm.org:8011/builders/clang-cmake-armv7-quick/builds/13052/steps/ninja%20check%201/logs/FAIL%3A%20Clang%3A%3Avirtual-compare.cpp
Looking at the failure
Jan 17 2020
- Removed support for corner case where there is just a filespec. This permits the INPUT_SECTION_FLAGS logic to be moved to readInputSectionDescription from readOutputSectionDescription.
- Changed comment in test
The implementation looks good to me. Will defer to others that know -fsanitize-coverage=inline-8bit-counters and LTO better. There was a comment in the PR about adding a dummy non-gcable section instead, I think that this would have the same effect as having the linker handle sh-link == 0, but it might be more portable to other linkers. Any thoughts on whether a dummy section would work better?
I'm reluctantly in favour of making the change on the basis that there exists a design using SHF_LINKORDER that lets this happen. My reservation is that in some use cases of SHF_LINKORDER such as .ARM.exidx sections where there is a 1 : 1 relationship between meta-data and "content" (for want of a better word) section then sh_link = 0 would be a sign that something had gone badly wrong. Having said that there isn't any thing in ELF that forbids the use case that this needs. I think that for .ARM.exidx which is the other main user of SHF_LINKORDER we handle it as a special case so this shouldn't make a difference in practice.
I'll remove the support for the corner-case of when there is just a filepattern. Most likely won't get the time today, if I don't then I'll update on Monday. Thanks for the comments.
Personally I'd prefer we either go for a permanent decision to always resolve some fixups without relocations, or just implement the relocations in LLD when they are missing, assuming they are also in ld.bfd. It is likely that ld.gold won't have implemented the relocations either but we may be less concerned about that.
Updated diff. Main changes:
- Use stringswitch with enumerations
- Incorporate simplification suggestions
Thanks for the review. I'll upload another patch that I hope to have addressed all the comments.
As I understand it these relocation types are not often implemented is due to their short range. The Arm and Thumb2 ldr (literal) range (search for A8.8.64 in https://static.docs.arm.com/ddi0406/c/DDI0406C_C_arm_architecture_reference_manual.pdf) is:
Encoding T2 or A1 Any value in the range -4095 to 4095.
Outside of carefully crafted section orders, the chances of the relocation being in range if it went outside of a section boundary is very low. If the relocation is within a section boundary then the assembler can resolve it.
Jan 16 2020
Address code review comments so far. Main changes:
- Used EnumEntry as in ELFDumper.cpp
- Updated parser with better diagnostic message
- Updated tests
Thanks for the comments, I'll upload a new patch shortly.
Updated diff (and description) to simplify part of the check to os->size > target->getThunkSectionSpacing(). This is conservative, but simple. We can in theory do better, but it is difficult to do so without using heuristics. In effect it is a judgement that ThunkSections close to, but not necessarily at the end of the OutputSection have less of a chance of generating sufficient extra thunks and patches to cause non-convergence.
Thanks for the comments. I'll post an update with a simplification as I don't think I can justify the heuristic without being too hand wavy.
I'm happy to go ahead with this. The maximum alignment for instructions on Arm and AArch64 is 4, which is what the compiler uses for code sections.
Jan 15 2020
Thanks for the comments, I'll make an update tomorrow. I'll plan to:
- Try the flagsMask + withMask to see if it looks any better.
- Limit the recognised flags to what BFD accepts, I'd like to continue to recognise integers.
- Look at the error reporting in the parser.
- Update the tests.
Jan 14 2020
Updated the fix, and description after testing on a real linux kernel allyesconfig build. The description contains the full details, which should also be present in the comment. In summary we round up the size of a thunk section to nearest 4KiB when:
- The erratum fix is being used.
- The thunk section may be placed before the end of Output Section so there may be code after it.
- The InputSectionDescription is larger than 4KiB.
Thanks for the comments. I've decided to rework this In light of it not actually fixing the problem in Linux. In the allyesconfig build the overall .text OutputSection is larger than thunkSectionSpacing, but the assert was for in effect an InputSectionDescription within .text so I've had to think again. I'm nearly ready to submit again, just got to test in on the kernel itself first.
Jan 13 2020
Committed as 10c11e4e2d05cf0e8f8251f50d84ce77eb1e9b8d , I've used the new attribution process https://llvm.org/docs/DeveloperPolicy.html so you should show up as the author of the patch in the commit log. I'll comment here if there are any problems with buildbots.
Jan 10 2020
LGTM. ld.bfd, at least on AARCH64 will write the address of PLT into the .got.plt, but this may just be out of convenience as it will be overwritten by the IRELATIVE relocation before it can be used.
LGTM thanks for the update.
I can do it early next week, unfortunately a little too busy today. If anyone else can do it earlier please go ahead. To obtain commit access it is worth following https://llvm.org/docs/DeveloperPolicy.html#obtaining-commit-access
I'm personally strongly enough in favour to approve given no strong objections so far. Will be worth holding off to next week to see if the approval provokes any.
Jan 8 2020
If I could summarise the comments so far:
- There is some concern that this is not the full solution and there maybe other things that can break interposing.
- The response is that fixing the assembler will be a necessary part of that work.
- There maybe some extra via-PLT calls in shared libraries if -ffunction-sections is not used, but these can be mitigated by giving the symbols non-default visibility, and in any case these would exist with -ffunction-sections.
Thanks for the update, looks good to me.
Jan 7 2020
Ok LGTM. May be worth waiting a day to see if there are any final objections.
On looking again at the linux kernel makefile
Apologies for the delay in responding, just come back from vacation. I've checked the implementation against GCC and it looks like it will give the same behaviour. I've got one minor suggestion surrounding the clamping of TLSSize to its maximum value. It looks like it would only need to be done once, is there a convenient place, such as AArch64TargetMachine where it can be done once?
Apologies for the delay in responding, the original came in when I was on vacation.
- Am I correct in thinking that the extra dynamic symbols that will result with -pie and no shared libraries are just wasted space, but won't cause anything to happen at run time as there are no dynamic relocations generated against them?
- The main argument for removing is that this our ld.bfd is ad-hoc and this part doesn't help much, removing it would simplify here and permit a simplification in D71795. Is there a plan beyond D71795 to more closely follow ld.bfd behaviour, or is this just an area where we expect cosmetic inconsistencies?