- User Since
- Jan 24 2014, 10:03 AM (376 w, 1 d)
Feb 16 2021
This looks like a straightforward implementation. The only caveat is that the XAR immediate does not represent a lane, and hence the need for a custom immediate range check. Looks sensible to me.
@labrinea and others at ARM, do have any other comment before this is merged?
Feb 10 2021
Feb 9 2021
Jan 6 2021
Can you confirm when the GNU toolchain will have this flag supported in their assembler? Compatibility between LLVM and GNU toolchains is important.
Dec 17 2020
Thanks Daniel for the explanation. Was the support added for GNU assembler as well? Is it the same flag name?
Thanks for clarifying - so the property is being set for C/C++ files but not for assembly files. I think it should be set automatically for both when one uses clang driver to compile/assemble.
Dec 16 2020
So the intention here is to generate the property when BTI branch protection is enabled, to guarantee the generate .o indeed has the property set, without requiring to pass the flag -mmark-bti-property explicitly. This is convenient for users.
Sep 17 2020
Can you clarify on the errors you are talking about? Are you building LLVM libc++with glibc?
I have been building LLVM libc++ with MUSl and I do not see a build issue.
Jul 23 2020
May 8 2020
Remove wrong setting of Framesetup flags
May 6 2020
Apr 29 2020
Simon can you please rebase, it seems D78776 got merged and now conflicts. Thank you.
Apr 24 2020
LGTM thanks for addressing Alex's comments.
Apr 21 2020
Apr 20 2020
Apr 8 2020
To clarify, we were not missing in compressed instructions before, this change is for the future compressed instructions to be added, confirmed? Otherwise I need to check why we missed it with the fuzzer.
Thanks, Simon for pushing this patch, it does help when debugging code and removes the dependence on binutils.
Mar 19 2020
Thanks Shiva, making it alias of -G makes sense, LGTM.
It looks like it needs more work indeed. Enabling it won't build compiler-rt.
Mar 17 2020
Alex, we have not yet merged this patch. We should merge it and we can leave the default as GCC.
This makes our code size gap with GCC smaller.
In the past I did see degradation in some SiFive hardware (I think it was on HiFive1 board). Maybe some of the SiFive colleagues can look at it at some point.
Shiva, how about making the flag small-data-limit alias of -msmall-data-threshold?
Mar 12 2020
Thanks Shiva, I res-ynced and rebuilt the patch. It is working fine.
Mar 11 2020
Shiva, I am not sure how the SDataLimit is being honored in LTO mode.
Where does getModuleMetadata get called?
If the SelectSectionForGlobal api is called without getModuleMetadata being called first, it will use the default 8 instead of honoring the SDataLimit.
Shiva, I see a warning always being printed:
Mar 9 2020
Shiva, we forgot about this patch. Can you rebase it so we move on with merging.
Feb 27 2020
Feb 19 2020
Feb 4 2020
Jan 31 2020
Thanks for catching this issue, Simon. Alex, will this be cherry-picked to the branch?
Jan 14 2020
Lewis, your latest patch looks good, we just had another run with no new failures. But we know it will have issues with -g. So I think we should not merge it yet. Do you have a version of the patch that creates the labels for the compiler-generated save/restore lib calls, so that this optimization does not depend on D71593? We could merge that version then, and when D71593 is accepted, you just have to rework/remove the label generation part of the patch.
Jan 7 2020
Lewis, same question, is the patch final? It would be good to merge it before the 10.0 release branch creation on Jan 15th
Lewis, is the patch final? It would be good to merge it before the 10.0 release branch creation on Jan 15th
Dec 19 2019
Thanks Lewis for providing the test case. A bug can be opened for the machine outliner with that example. In my opinion you can go ahead and merge this patch.
Dec 17 2019
Can you open a bug for the machine outliner and maybe contact Jessica, may be she can help fix this quickly.
Dec 16 2019
My commit permissions were fixed last week, I will rebase and merge.
Dec 3 2019
Lewis, try rebasing it, not applying cleanly nor https://reviews.llvm.org/D62190
Lewis, try rebase again, not applying cleaning, nor https://reviews.llvm.org/D62686
Lewis, I tested the latest version of this patch (with and without https://reviews.llvm.org/D68290), I don't see any issue. LGTM.
With this change, the regressions observed with the machine outliner are eliminated.
Sam, I don't think any one has reviewed yet.
Dec 2 2019
The change looks good to me. Thanks for adding the test!
Oct 17 2019
Lewis, I am not seeing any code change.
Shiva, I have tried a few workloads like EEMBC, SPEC2000/2006, perennial c++, plumhall.
They don't seem to use variable length arrays nor allocas to test this patch.
Which test suite are you using?
Hi Shiva, I will evaluate your patch today and report back.
Thanks Lewis, I applied this patch with its dependencies, ran -msave-restore -mllvm -enable-shrink-wrap config, and I see no failures.
Thanks Lewis, I applied the patch and I am not detecting any new failure with this feature.
Oct 16 2019
Lewis, this patch is not applying clean on top of https://reviews.llvm.org/D62686. Can you please rebase?
Thanks Lewis, the runs are looking good, no failures, and good code size savings (average 3%)
Oct 15 2019
Added check to verify whether MI is parentless or not because isCompressible call depends of function/module info.
Is it worth trying to disallow tail call optimization completely if this flag is enabled? I'm not sure what GCC does exactly. but this seems to be the behaviour.
Oct 14 2019
Yes Eli thanks for pointing out there are more scenarios that can fail.
It looks like the best solution is to permit both flags on, but then bail out from doing this transformation if there is any type of tail call already in the function.
This way we avoid messing with reverting tail calls back to regular calls.
Here is the bugpoint-reduced test case for the SPEC failure when enabling -msave-restore and allowing tail calls:
Oct 10 2019
Oct 7 2019
Lewis, this patch LGTM. You can go ahead and merge it.
Oct 4 2019
Oct 3 2019
There are some additional perennial test suite failures when applying this patch and enabling -mllvm -enable-shrink-wrap.
When Lewis updates the patch to be standalone, we can verify it again.
Thanks Alex and Lewis.
I will break the patch in 3 commits:
- fix missing ImmLeaf predicate, which I will merge directly (no review).
- Compress Instr table gen back end changes. I will post this patch for review.
- Modify getInstSizeInBytes to invoke isCompressibleInst (thanks for the name fix!). This one I will run a few more correctness tests, and then post for review.
Oct 2 2019
Lewis, with this patch we see less failures. But still some tests in SPEC and perennial test suites are failing.
Oct 1 2019
Here is the change: https://reviews.llvm.org/D68290
With this change I was able to remove the code size degradation I had observed.
Please try it out.
Thanks for the patch update. I will launch some new correctness runs.
Sep 26 2019
I have run a couple of standard workloads like SPEC.
There is no correctness issue when enabling the MO feature (except for spec2000/twolf which fails with/without MO).
The MO code size gains without compression are up to 7%. With compression enabled, most of the code size gain is gone, and I see more code size increase.
It is possible it has to do with the SequenceSize we are estimating.
The reason we enable compression late is to have all paths - coming from codegen (LLVM IR), parsing, assembling .s or inline asm - go through the same mechanism for compression.
This is similar/compatible with GCC behavior, which relies on the external assembler to implement compression.
We can better estimate SequenceSize by checking if an instruction is compressable. We can modify the tablegen backend for compression emitter to generate a function that does the isCompressable check, but instead of using MCInst and MCOperands for the checks, we need to use MachineInstr and MachineOperand types. I will try this solution. Another alternative is to compress LLVM IR code before running the machine outliner.
Aug 12 2019
Thanks Sam, LGTM.
Aug 9 2019
Mar 28 2019
Mar 27 2019
Mar 20 2019
Thanks Alex, LGTM
Mar 8 2019
Thanks Alex, LGTM.
Mar 7 2019
Leaving pseudo instructions that access FP CSRs untouched for now.
OK, I can restrict the patch to access CSR names and value range without F extension, while we wait GCC and the RISC-V specs to be updated.
Mar 6 2019
LGTM, I have verified the patch with some workloads and found no new issue.
We are going with the alternative in https://reviews.llvm.org/D58943
Mar 4 2019
Hi Eli, this current patch as it is fails for the test case you highlighted. Thanks for clarifying and pushing an alternative solution.