IPRA try to optimize caller saved register by propagating register usage information from callee to caller so it is beneficial to have caller saved registers compare to callee saved registers when IPRA is enabled. Please find more detailed explanation here https://groups.google.com/d/msg/llvm-dev/XRzGhJ9wtZg/tjAJqb0eEgAJ.
This change makes local function do not have any callee preserved register when IPRA is enabled. A simple test case is also added to verify this change.
Details
Diff Detail
Event Timeline
lib/CodeGen/TargetFrameLoweringImpl.cpp | ||
---|---|---|
27–28 | This should go into a header. | |
64–69 | I would first test for if (UseIPRA) and only then do the MF.getFunction(), hasLocalLnkage() && !hasAddressTaken(). | |
test/CodeGen/X86/ipra-local-linkage.ll | ||
4–5 | You can probably simplify this further by leaving out the datalayout and using target triple = "x86_64--" | |
7 | The metadata #0 referenced here doesn't exist, I am surprised this even parses... |
test/CodeGen/X86/ipra-local-linkage.ll | ||
---|---|---|
7 | oh sorry that is there due to copy paste from similar earlier test case. |
The bug related to tail calls and recursive function is solved. The test case is updated as per discussion with @qcolombet. A statistic added to count number of function optimized for not having callee saved registers.
include/llvm/CodeGen/TargetPassConfig.h | ||
---|---|---|
23 ↗ | (On Diff #62702) | I'm not sure it is fine to have a cl::opt in such a global header? If we need the option available broadly we may have to make it a member of TargetPassConfig initialized from a private cl::opt. |
include/llvm/Target/TargetFrameLowering.h | ||
350 ↗ | (On Diff #62702) | Why not looking at the users of F? |
lib/CodeGen/RegUsageInfoCollector.cpp | ||
123 ↗ | (On Diff #62702) | You created F but didn't update uses of MF.getFunction() |
129 ↗ | (On Diff #62702) | Add the function name in the debug output. |
lib/CodeGen/TargetFrameLoweringImpl.cpp | ||
86 | if (UseIPRA && isSafeForNoCSROpt(MF.getFunction())) return; | |
test/CodeGen/X86/ipra-local-linkage.ll | ||
9 | Hasn't been fixed apparently. | |
20 | Take %X by argument? | |
23 | saved before and reloaded after? |
test/CodeGen/X86/ipra-local-linkage.ll | ||
---|---|---|
9 | Yes, it has fixed, Sorry that come again as I modified an older version of it while doign git reset --soft ... |
include/llvm/CodeGen/TargetPassConfig.h | ||
---|---|---|
23 ↗ | (On Diff #62702) | UseIPRA is used in TargetPassConfig and TargetFrameLoweringImpl |
test/CodeGen/X86/ipra-local-linkage.ll | ||
---|---|---|
23 | But yes I will edit comment accordingly to explain it clearly. |
include/llvm/Target/TargetFrameLowering.h | ||
---|---|---|
350 ↗ | (On Diff #62702) | Why has this been marked as done? |
lib/CodeGen/RegUsageInfoCollector.cpp | ||
123 ↗ | (On Diff #62702) | Why has this been marked as done? |
129 ↗ | (On Diff #62702) | Why has this been marked as done? |
lib/CodeGen/TargetFrameLoweringImpl.cpp | ||
86 | Why has this been marked as done? | |
test/CodeGen/X86/ipra-local-linkage.ll | ||
20 | I mean as a function parameter, unless I'm missing why an addition is needed here. |
include/llvm/Target/TargetFrameLowering.h | ||
---|---|---|
350 ↗ | (On Diff #62702) | Changes have been done on local machine, just waiting for your decision on if UseIPRA to be in TargetPassConfig.h or not? |
test/CodeGen/X86/ipra-local-linkage.ll | ||
20 | Yes addition is not needed I can change load constant value to variable X. |
include/llvm/Target/TargetFrameLowering.h | ||
---|---|---|
350 ↗ | (On Diff #62702) | Please mark as done when the revision is updated, otherwise it makes it harder to track. |
test/CodeGen/X86/ipra-local-linkage.ll | ||
20 | Why do you need a load? Why don't you take X as a function parameter (you haven't address this question...) |
test/CodeGen/X86/ipra-local-linkage.ll | ||
---|---|---|
20 | Do you mean as a function parameter of bar() ? |
test/CodeGen/X86/ipra-local-linkage.ll | ||
---|---|---|
20 | Yes, that also worked. However I have just never thought to use that. But I guess that simplified the test further. In previous version constant value and add instruction were just irrelevant. |
This should go into a header.