Hi @MaskRay, I applied your patch on top of 15.0.7 and can confirm that it does fix the immediate issue. I will perform more extensive tests as time permits.
Jun 12 2023
Apr 8 2023
Looks good to me, thank you!
Mar 29 2023
Brilliant, thanks everyone. @vit9696 am I right that you don't currently have LLVM commit access - would you like me to go ahead and commit this for you?
Mar 20 2023
Ok, I think I finally understood what you meant now. I kindly request you to be more explicit next time, also providing a more complete review from the start would help to reduce the amount of iterations and time needed for the change to land.
Mar 19 2023
Mar 3 2023
Addressed the review comments.
Mar 2 2023
Feb 21 2023
Apparently my build was out of date and update_llc_test_checks.py generated incompatible results. Fixed in V3.
Added V2. Should have applied everything. I did not split the common part, because after the changes it became rather small.
Feb 20 2023
Sorry for the delay, got a bit sick here. I mostly acked the requested changes and left comments on the ones I believe should not be applied. Will try to prepare the V2 by the end of the week.
Feb 16 2023
This is actually the triple we use, despite our OS not having any relevance to Linux, mainly since LLVM has no better triple for us. I basically added the triple we personally need to work with.
Feb 11 2023
@MaskRay, we have a family of private in-house RTOSes for safety-critical application with static resource allocation (memory, threads, processes, etc). Our supported architectures include AArch64, ARM, MIPS, PowerPC, RISC-V, X86 and we have TLS in all of them. For easier certification, unified codebase, and compatibility reasons with various kinds of gcc and clang we opted with emulated TLS API. However, instead of using the implementation from builtins we went with our own, which looks roughly like this:
Feb 10 2023
@vit9696 perhaps you could remove the "requires changes to proceed" marker, unless you have specific quality of implementation concerns with this patch? It might be we decide to abandon it in favour of enabling emulated TLS, but even if it's just the case that it takes a little longer to agree on that, this patch is still better than the status quo where it sounds like we're generating incorrect code and might be reasonable stop-gap.
Added a test case.
@jrtc27, I agree and am very sorry for that having been taken personally. To make things more constructive I submitted a new differential (D143708), where I also summarised the current state of the art with the emulated TLS on RISC-V.
@paulkirth, I agree that shipping a working compiler is more important than parity with the other compilers, but in this particular case we simply face a simple to fix bug caused by component owner tyranny :) We should simply fix this bug and move forward.
This change looks inadequate to me. As I explicitly stated in the mentioned issue, we need emutls support for RISC-V. For the reasons stated there I believe we should enable emutls support for RISC-V as GCC does, and this differential should be abandoned.
May 2 2022
Got it. To be honest, I cannot imagine a situation where one would prefer to keep separate behaviour, and with GCC it would be exactly just one. In my view, if we introduce -fprofile-prefix-map, it should rather mirror the entire -fcoverage-prefix-map behaviour, not some parts of it.
Based on the issue it sounds like this should be gated behind a new -fprofile-prefix-map flag? I assume we'd also want -ffile-prefix-map to apply to it as well, similar to the others. And we'll definitely want some tests here!
Jan 10 2022
Tested this on what we had on 13.x branch. Works fine. Thank you very much.
Nov 28 2021
This changeset causes a regression mentioned in https://lists.llvm.org/pipermail/llvm-dev/2021-November/153999.html.
May 27 2021
Nov 29 2019
This change makes good sense to me, as I suggested pretty much the same thing 2 years ago (D38554). In the end for us it was easier to fix LLD and switch to GNU ABI ignoring IFUNC support entirely than to convince the others and support the change. Yet either way believe that it is logical not to emit PLT for static memory model. Personally I can accept this, but I wonder whether there are more comments coming.
Nov 12 2019
Actually thanks for adding C checks too. The new revision also looks good to me.
Oct 28 2019
Oct 26 2019
I noticed a typo in the file, but the rest looks good now.
Oct 23 2019
Yes, that's the point, since 8548 is an alias.
@jhibbits Should not the test in test/Preprocessor/init.c also ensure that __SPE__ and __NO_FPRS__ macros are defined with CPU 8548? The rest looks good to me, thank you.
Sep 5 2019
@jhibbits, thank you for merging. Will we have this in LLVM 9.0?
Jun 29 2019
Not sure whether I understood you, do not you already have the common switch by feature, named spe, in Features["spe"] line for that?
Jun 28 2019
@ruiu, we have some hardware, which bootloader does not support PT_GNU_STACK segments, and would therefore benefit quite a bit from this option. As this does not really depend on NetBSD support, could you please merge this patch in time for 9.0? Thanks!
Could somebody commit it into 9.0? Quite a lot of people depend on this option including us. Since it got accepted I do not see a reason for this to get postponed any further. Thanks!
Right, ok. I have been testing this for quite some time now, including maths, and so far had no issues. Can this get merged into 9.0? I do not believe there are enough obstacles to postpone it any further. Thanks!
Mar 31 2019
Feb 20 2019
@nemanjai, sorry, under merging I meant committing into llvm upstream.
Feb 19 2019
This is a series of patches, which I believe should merged altogether. Currently the following patches are relevant:
Feb 18 2019
Feb 12 2019
Thanks a lot!
I am fine to create bugs, but I am yet to ask for commit access, and somebody needs to commit the patch for the time being. For this reason I thought it would be faster to leave the rest up to you.
Feb 11 2019
Done, I must have been quite tired yesterday, thought I included it.
Done with the test update. As for your other comments, there are some minor differences in section handling for different output types in GNU binutils, but I believe they have little to do with LLD abilities as of now. ppclinux one is chosen when clang is used for linking under Linux target. You can get some more details by checking ld/emulation folder, but I believe it is just fine to have these as synonyms for a long while to let most of the code continue building just fine.
Sure, superseded by D58005.
Sorry, did not realise that we have tests for this functionality in such an undescriptive place. Added tests and merged with D58007 as suggested by Rui.
Feb 9 2019
Jan 29 2019
That's pretty good. Do you think it is possible to land it in 8.0 release? @hfinkel?
Thanks for pointing it out. You could use hasFeature from there during construction:
return SetCGInfo( new PPC32TargetCodeGenInfo(Types, CodeGenOpts.FloatABI == "soft" || getTarget().hasFeature("spe")));
It works for me, but probably a separate argument is best to be used (or at least the current one is to be renamed).
Right, I noticed the same thing yesterday. There is an override calling LowerVAARG for 64-bit integers, yet that is not a thing that does lowering for all the rest. I believe it is derived somewhere from td calling conventions. I will check it out later this evening and mail you if I find anything (vit9696 at avp dot su).
Jan 28 2019
@jhibbits it appears that va_list is not functional with SPE (va_arg returns garbage for double and stuff like printf is not functional). Is this expected or I miss a patch?
Jan 24 2019
Hi @vit9696, it looks like Kei's patch fixes the issue you're seeing as well.
Jan 23 2019
Hmm, I have not yet tried to explore this, but I get a feeling a regression appeared somewhere during the patch iterations. Either this or D54409.
At this point I am consistently getting weird generated instructions for __floatundidf from compiler-rt (compiling with freebsd & -O3), yet I have a correct one in my files.
What makes it strange is that the logs show that the correct and first incorrect examples were generated by the same compiler (binary file) with the same flags, yet I no longer can get the correct one.
Jan 20 2019
Justin, I am currently testing the latest patches, yet so far it looks very good. Thanks.
I rechecked GCC, and it seems that it forces 64-bit long double for SPE regardless of the target.
Could you please force that in LLVM as well? While imperfect, I currently changed the following part of code in clang/lib/Basic/Targets/PPC.cpp, and it seems to work for the needs:
Jan 17 2019
You are right, had to modify it like this to get the crash with FreeBSD triple:
Actually I am not sure about Linux, since this is bare metal, and I just used what fited. However, it does not look like 128-bit or 64-bit long doubles are related.
I retried to compile the test case with powerpc-gnu-freebsd target (and even made a compile-time assertion to ensure that long double is now 64-bit), yet it still crashed in a similar way.
Might it be another LLVM 7 vs LLVM 8 difference? Does it crash for you with Linux target?
@jhibbits sorry for not answering earlier, I was occupied with NY holidays, and then had a lot of stuff on the road.
Dec 21 2018
Ok, I found the fix for the first crash that landed in 8.0 trunk. It works fine for me if backported to 7.0.1:
Dec 7 2018
I have applied this patch to the llvm-toolchain-7 package in Debian and did not see any regressions on x86_64 or 32-Bit PowerPC. Additionally, I have included the patches from https://reviews.llvm.org/D54409 and https://reviews.llvm.org/D54583 saw no regressions on x86_64 and 32-bit PowerPC.
All three patches will be part of the next upload of the llvm-toolchain-7 package in Debian unstable which will be version 1:7.0.1~+rc2-9.
Nov 16 2018
Thanks for the fix. I made a quick check of the mentioned patch, and it looks like it does solve the issue. However, besides the previous crash, which remains unfixed as of 7.0.1rc2, there is another instruction selection failure crash that may be caused by the change. I have not yet had a chance to properly research it, but here is an example if you feel interested: http://llvm.org/svn/llvm-project/compiler-rt/tags/RELEASE_701/rc2/lib/builtins/divdc3.c
Jul 31 2018
Thank you for working this. I tried the change and have a couple of suggestions:
- -mspe option in GCC works like -mspe=yes or -mspe=no. While it does make sense to have it the way you did (-mno-spe and -mspe) it would be nice to have at least have an alias for compiler compatibility.
- One of the known CPU examples with SPE support is MPC8548 (https://www.nxp.com/docs/en/reference-manual/MPC8548ERM.pdf), it will be nice to have it recognised in -mcpu argument as 8548 like in GCC.
Jun 22 2018
The proposed change seems right to me in both senses:
- moving the invariant out of the loop (obvious);
- not making a reciprocal here, as it may add an extra overhead for non-constant divisors.
Apr 4 2018
Now that https://reviews.llvm.org/D44926 was merged upstream, may I request this to be merged as well?
As stated above it is pretty much pointless to have the check doubled, and after more than a week nobody thought of any issue.
Mar 28 2018
Morning. Ok, it looks like I hardly checked the IR and a ton of passes were left unnoticed making no sense of the tests :D. Sorry, fixed.
Hm, I did run tests before uploading the diff and they passed. Perhaps I upped the wrong file. I will recheck tomorrow and resubmit.
Thanks for the review. We are good now, I suppose. Could you merge it for me please?
Mar 27 2018
Feb 15 2018
Thank you too!
Feb 14 2018
Done, could you merge it please?
Comments? But nobody reeeeads them anyway xD. Updated, kind of forgot about them.
Thank you for such an involved explanation! It is pretty clear now I think. I applied all the mentioned changes and updated the diff.
In case there are no other issues may I ask you to commit this?
Feb 13 2018
Thanks for a quick reply. I think your comments were less helpful, because we had different assumptions, and you ended up explaining fairly trivial things leaving the ones I asked about unanswered. Apparently, assuming you got it right, I indeed misunderstood what James suggested. However, that's less important and easy to change.
Well, I found some time to prototype what we discussed. As I got to the code I think I discovered more hidden issues that were not taken into account, and mentioned them as the comments in the code.
Feb 9 2018
Well, ok, I think it might actually work this way. I will try to find some time in the next few days to update the patch.
It is not that simple, because the size of the header part should be aligned to the page size (from what I remember at least). And we cannot know the alignment rules / page sizes of the target system. We could try checking the subsequent segments, but it is not so nice. I think that unless we replicate the area used by the header and the subsequent padding, the resulting ELF may become unloadable.
Feb 8 2018
I think I could have missed the point of the description a little, but I know of at least a total of 3 cases regarding ELF header:
- ELF header is present in PT_LOAD segment
- ELF header is mentioned in PT_PHDR segment
- ELF header is not mentioned or present anywhere, but the segment data starts at a random offset the linker decided to use.
I am personally not sure if it is the correct solution, because I do not understand why a segment with 0 filesize and a 0 offset should corrupt an ELF while copying regardless of its type.
I would personally prefer special-casing two segments to writing a check against both MemSize and FileSize, but it does not feel nice to me.
More importantly, if you have a PT_LOAD "marker" segment with all zero values:
- will gnu binutils objcopy also corrupt such an ELF?
- will a Linux/BSD-based system load such an ELF?
If in both cases the answer is no, then we should probably just special-case indeed, but I still cannot say that silently overwriting the header is any good. Perhaps we should print an error and abort?
Feb 7 2018
@jhenderson, thanks, NOBITS makes a lot more sense to me than your previous example. Yet I am unaware of any software or even any tool that would rely on such on offset, and to my eyes the linkers are simply doing the wrong thing by not setting the offset of 0 for whatever reason. In my opinion being able to copy more ELF files, which will be most likely valid and loaded on a target system, is more important, than trying to support some shady offset to a non-existent memory nobody knows how to calculate.
@jhenderson, sorry for my ignorance, but I miss the point of your .bss example just a second time. Could you elaborate a little more?
Thanks for the comments!
Feb 2 2018
Jan 24 2018
This was committed in r323366, thanks to @rafael!
Updated with suggested test extension by Peter Smith.
Added a test case and restyled the check like Igor suggested.
Thanks for sending the patch. What I'd like to know first is in what circumstances the synthetic section has no parent and is this expected to be possible? If it isn't expected to be possible then it would be better to fix that and not try to account for the no parent case. The implementation has changed quite a bit since I last looked so I'm not sure of the intention.
Do you have a test case that you could add to your patch? For example find a case that currently faults without the patch and check that it links correctly with it. Perhaps checking that the output has no .ARM.exidx sections present.