This commit addes pass to change CodeGen order to be bottom up order on Call Graph. The patch for the same has been provided by dear Mehdi Amini. This commit added required analysis passes for IPRA. 1) lib/CodeGen/RegUsageInfoCollector.cpp (MachineFunction Pass) to create RegMask based on actual register usage, scheduled at POST-RA 2) lib/CodeGen/RegisterUsageInfo.cpp (Immutable Pass) to store RegMask details for functions which has been processed in bottom up order on Call Graph, Scheduled before RegAllocation. This commit also adds a method setRegMask() to MachineOperand class. This commit also adds command line option to enable IPRA -enable-ipra and debug type "ip-regalloc". Changes to adher to coding standards, correction for ownership of RegMask vector in related functions.
Details
Diff Detail
Event Timeline
include/llvm/CodeGen/MachineOperand.h | ||
---|---|---|
537 | Take a const ptr. | |
include/llvm/CodeGen/PhysicalRegisterUsageInfo.h | ||
36 ↗ | (On Diff #58887) | You need to do the same in RegUsageInfoPropagationPass |
43 ↗ | (On Diff #58887) | Take RegMasks by value here. |
45 ↗ | (On Diff #58887) | const std::vector<uint32_t> *getRegUsageInfo(StringRef MFName); |
lib/CodeGen/PhysicalRegisterUsageInfo.cpp | ||
32 ↗ | (On Diff #58887) | RegMasks[MFName] = std::move(RegMask); |
lib/CodeGen/RegUsageInfoCollector.cpp | ||
100 | PRUI->storeUpdateRegUsageInfo(MF.getName(), std::move(RegMask)); | |
lib/CodeGen/TargetPassConfig.cpp | ||
587 | remove | |
603 | Remove this, it is useless. | |
lib/Target/X86/X86RegUsageInfoPropagate.cpp | ||
56 ↗ | (On Diff #58887) | Take a const ptr |
92 ↗ | (On Diff #58887) | C++11 for-range: for(auto &MBB : MF) { for(auto &MI : MBB) { |
113 ↗ | (On Diff #58887) | auto updateRegMask = [&](StringRef FuncName) { const auto *RegMask = PRUI->getRegUsageInfo(FuncName); if (RegMask) { // else skip optimization setRegMask(MI, &(*RegMask)[0]); changed = true; } }; MachineOperand &Operand = MI.getOperand(0); if (Operand.isGlobal()) { updateRegMask(Operand.getGlobal()->getName()); } else if (Operand.isSymbol()) { updateRegMask(Operand.getGlobal()->getName()); } |
lib/Target/X86/X86TargetMachine.cpp | ||
316 ↗ | (On Diff #58887) | This is not the right place, it should be added right after createX86ISelDag |
Changes to adher to coding standards, correction for ownership of RegMask vector in related functions.
include/llvm/CodeGen/PhysicalRegisterUsageInfo.h | ||
---|---|---|
28 ↗ | (On Diff #58917) | This is not clang-formatted, can you please format it? |
include/llvm/CodeGen/MachineOperand.h | ||
---|---|---|
147 | It is not good practice to clang-format the whole file as part of your patch. You should only format the patch of the file you changed (git clang-format handles it automatically, some text editor as well, or you can give a range on the command line, or even copy paster on stdin the range you want to format). |
Fix that makes RegUsageInfoCollector Calling Convention aware so that it does not mark CalleeSaved register as clobbered if it is not used.
Hi Vivek,
A couple of minor comments, the main logic looks reasonable.
Cheers,
-Quentin
include/llvm/CodeGen/MachineOperand.h | ||
---|---|---|
537 | I would say in a comment to double check MachineOperand::CreateRegMask for the characteristic of RegMaskPtr. The reason why I think it is important is because that method clearly states the expected life time of the pointer and if we get it wrong we may end with nasty bug. Therefore, we should do our best to have the API clearly documented. | |
include/llvm/CodeGen/Passes.h | ||
361 | Typo: machine | |
include/llvm/CodeGen/PhysicalRegisterUsageInfo.h | ||
1 ↗ | (On Diff #59113) | .cpp => .h |
2 ↗ | (On Diff #59113) | Wrapped line. |
51 ↗ | (On Diff #59113) | Period. |
51 ↗ | (On Diff #59113) | Please use doxygen style comment: /// |
52 ↗ | (On Diff #59113) | Typo: function |
53 ↗ | (On Diff #59113) | Period |
include/llvm/InitializePasses.h | ||
336 | I don’t know why this is not the case for last one, but we usually sort the initializers by name. | |
lib/CodeGen/CMakeLists.txt | ||
103 | Looks like this patch could be split:
| |
lib/CodeGen/PhysicalRegisterUsageInfo.cpp | ||
27 ↗ | (On Diff #59113) | ip-regalloc |
45 ↗ | (On Diff #59113) | Remove the else per the coding convention. |
lib/CodeGen/RegUsageInfoCollector.cpp | ||
17 | Wrapping is strange here. | |
82 | Call getPassName. | |
88 | That is a bit strange to state the comment like that. | |
95 | && instead of nested ifs. | |
96 | Period. | |
98 | Encapsulate that loop into a setRegister thing. | |
107 | Capital letter at the beginning, period at the end of the sentence. | |
lib/Target/X86/X86RegUsageInfoPropagate.cpp | ||
2 ↗ | (On Diff #59113) | Why is that pass X86 specific? |
12 ↗ | (On Diff #59113) | Use doxygen comments: /// |
16 ↗ | (On Diff #59113) | queries |
Somehow I had unsubmitted comments on phab, some are still relevant.
lib/CodeGen/PhysicalRegisterUsageInfo.cpp | ||
---|---|---|
2 ↗ | (On Diff #59113) | This should be on one line (and fits within 80 chars). |
lib/CodeGen/RegUsageInfoCollector.cpp | ||
3 | same | |
38 | Doesn't this need to be in a header and be called somewhere? | |
110 | Why not initializing RegMask with 0xFFFFFFFF in resize() and having the loop setting the bit to zero instead? | |
116 | comment | |
lib/CodeGen/TargetPassConfig.cpp | ||
502 | No capital letters here. | |
582 | s/register/registers/ | |
583 | Comment could be simply: Collect register usage information and produce a register mask of clobbered registers, to be used to optimize call sites. | |
lib/Target/X86/X86.h | ||
86 ↗ | (On Diff #59113) | s/call site/call sites/ |
lib/Target/X86/X86RegUsageInfoPropagate.cpp | ||
2 ↗ | (On Diff #59113) | When we first looked at it, is seemed to me that we would have to switch over the various possible calls. But it was a mistake on my side, and now that it is implemented it does not seem to need any target specific bit indeed. |
The general approach seems fine at a first glance. I haven't reviewed this in-depth for correctness yet though, as I first have a bunch of nit-picks and coding style issues:
include/llvm/CodeGen/MachineOperand.h | ||
---|---|---|
537 | +1 | |
include/llvm/CodeGen/PhysicalRegisterUsageInfo.h | ||
1 ↗ | (On Diff #59113) | Does all of this need to be exposed in a public header anyway? Wouldn't it be enough to have the initialization stuff in Passes.h and keep the actual pass definitions private to the .cpp file in this case? (if we decide no, then I'd rename this to something shorter like RegisterUsageInfo...) |
11–17 ↗ | (On Diff #59113) | Should use doxygen style comment here (see coding convention). It may also be wise not to mention specific filenames here as those could be moved or renamed in the future and people generally do not catch comment like this when doing so. |
54 ↗ | (On Diff #59113) | @Mehdi: I've seen you arguing for a stringmap here before, but I would argue that a reference to a compiler internal object would be more stable here, avoid string comparisons and also work with unnamed functions (not sure if that actually allowed in llvm, but I could see this happening for JITs...). As to what object to use: MachineFunction* would be nice but I am not sure that would well as long as the MachineFunction gets created in the MachineFunctionAnalysisPass. So maybe tie it to the Functions GlobalObject for now and change it to MachineFunction* later when we switch to the new pass manager and hopefully can deal with MachineFunctions as a first-class object rather than an analysis. |
59 ↗ | (On Diff #59113) | Check your editor settings so the newline at the end of file is added. |
lib/CodeGen/PhysicalRegisterUsageInfo.cpp | ||
1–2 ↗ | (On Diff #59113) | strange linebreak. I think in .cpp we also don't need those magic markers for the emacs users (my understanding is that they are just used in the .h files so emacs knows they contain C++ code and not just C code). |
11–17 ↗ | (On Diff #59113) | Use doxygen comments. |
lib/CodeGen/RegUsageInfoCollector.cpp | ||
2–3 | strange linebreak. | |
12–20 | doxygen comments. | |
56 | add "end of anonymous namespace" comment (see coding conventions). | |
61 | It is clear that this is about a pass so no need for the " Pass" suffix in the explanation. | |
77 | This is always false anyway, just use return false; instead of a variable. | |
88 | We could make the code here and in MachineOperand more robust by having a "typedef uint32_t RegMaskType" and then using sizeof(RegMaskType) * CHAR_BIT instead of hardcoding 32... | |
89 | how about uint32_t RegMask[regMaskSize] instead of using a std::vector here so we get a stack allocation instead of an unnecessary heap allocation of the vector? | |
93 | We tend to introduce a new variable (like PRegE = TRI->getNumRegs()) in loops like this and compare with it to avoid getNumRegs() getting called in every iteration of the loop (see coding conventions). | |
112 | No space after the * | |
lib/CodeGen/TargetPassConfig.cpp | ||
117 | I feel like this comment is just stating the very obvious and can be left out. | |
119 | I think we can should add cl::Hidden for now until this is proven and stable. | |
501 |
Just put the pass into CallGraphSCCPass.h as it is not specific to codegen (just happens to be used | |
lib/Target/X86/X86RegUsageInfoPropagate.cpp | ||
93 ↗ | (On Diff #59113) |
Please do not use auto in contexts where it is not obvious what type it represents. It is friendlier for the readers to use MachineBasicBlock &MBB and MachineInstr &MI here. |
101 ↗ | (On Diff #59113) | We start local variables with uppercase letters (coding convention). |
103–104 ↗ | (On Diff #59113) | same comment as above. |
114 ↗ | (On Diff #59113) | This does not seem complicated enough to me to warrant a lambda. What about: const char *FuncName = nullptr; if (Operand.isGlobal()) { Name = Operand.getGlobal()->getName(); } else if (Operand.isSymbol()) { Name = Operand.getGlobal()->getName(); } if (FuncName != nullptr && RegMask) { setRegMask(MI, &(*RegMask)[0]); changed = true; } |
128–129 ↗ | (On Diff #59113) | odd line break (clang-format may be confused because of the macro...) |
131–132 ↗ | (On Diff #59113) | `DEBUG(dbgs() << MI << '\n'); |
@Mehdi: explicit ping so you do not miss my comment about the register mask map modeling among all the nitpicks.
include/llvm/CodeGen/PhysicalRegisterUsageInfo.h | ||
---|---|---|
54 ↗ | (On Diff #59113) | I pointed a StringMap in order to not have the backend depends on the IR, now if you and Quentin thinks it is fine to go through a GlobalVariable *, I don't have strong feeling about it. |
lib/Target/X86/X86RegUsageInfoPropagate.cpp | ||
---|---|---|
93 ↗ | (On Diff #59113) | I feel that it is clearly obvious that MI is a MachineInstr when reading for(auto &MI : MBB) {, I'll be explicit if it was something else. But I won't pick this fight :) |
114 ↗ | (On Diff #59113) | I guess it is a matter of personal style, I'd try to avoid this construct when possible, but I don't really care here. |
I know some of the issues are more a matter of personal preference than there being one true way. I just felt I should point out how I feel about them before people start writing them down as the one true way in the coding conventions :) I will of course not block the patches if we end up with a different style here.
lib/CodeGen/PhysicalRegisterUsageInfo.cpp | ||
---|---|---|
11 ↗ | (On Diff #59113) | No need to duplicate this comment in both the source file and the header. |
lib/CodeGen/RegUsageInfoCollector.cpp | ||
89 | regMaskSize is not a constant, and we can't use VLAs. We could use SmallVector with a reasonable default, however. | |
lib/CodeGen/TargetPassConfig.cpp | ||
581 | Why here? It seems much too early. Backends can use the register scavenger to use otherwise-unused registers until the very end. I think this needs to be after the call below to: addPreEmitPass(); if not at the very end. | |
lib/Target/X86/X86TargetMachine.cpp | ||
283 ↗ | (On Diff #59113) | Does this need to be here so that targets can opt-in? Could we reasonably schedule this early in TargetPassConfig::addMachinePasses? |
include/llvm/CodeGen/PhysicalRegisterUsageInfo.h | ||
---|---|---|
10–11 ↗ | (On Diff #59113) | How about: // This pass is required to take advantage of the interprocedural-register-allocation infrastructure. |
15 ↗ | (On Diff #59113) | If you change from using a StringMap, update this comment. |
54 ↗ | (On Diff #59113) | I'd also prefer a GlobalVariable * here, as that currently establishes our canonical function identify. If we work to further separate the MI level from the IR level, I suspect we'd establish some other function object which would have pointer-based identify, and we can always then update this code to use that instead. |
include/llvm/CodeGen/PhysicalRegisterUsageInfo.h | ||
---|---|---|
54 ↗ | (On Diff #59113) | Since MatzeB commented about the allocation in std::vector in the MachineFunctionPass, it seems equally important to override doInitialization(Module &M) in order to reserve the size of the map with the number of functions defined in the module, and doFinalization to clear the map. |
lib/CodeGen/RegUsageInfoCollector.cpp | ||
89 | What is regMaskSize on ARM and X86?
|
include/llvm/CodeGen/MachineOperand.h | ||
---|---|---|
147 | I will consider it, do we required to revert those changes ? If yes then please suggest easy way, I am using git. | |
include/llvm/CodeGen/PhysicalRegisterUsageInfo.h | ||
29 ↗ | (On Diff #58934) | Actually I run git clang-format , but that was after committing my changes so it did not changed any thing. I have applied clang-format on each file. |
lib/CodeGen/CMakeLists.txt | ||
103 | @qcolombet Could you please explain why is that required? | |
lib/CodeGen/PhysicalRegisterUsageInfo.cpp | ||
1–2 ↗ | (On Diff #59113) | oh yes this is already mentioned in coding standards but I really didn't check for it. |
lib/CodeGen/RegUsageInfoCollector.cpp | ||
38 | Isn't this will be called by method generated due to macroINITIALIZE_PASS_BEGIN ? | |
lib/CodeGen/TargetPassConfig.cpp | ||
501 | Have thought of any thing for this? | |
516 | I believe CallGraphWrapperPass is required, though I have not look at this code closely as it was given by you :D | |
516 | No these dependencies are not required, I am yet to find why not technically but I just removed it and compiled llvm again and things are working. | |
lib/Target/X86/X86RegUsageInfoPropagate.cpp | ||
101 ↗ | (On Diff #59113) | I thought it is not for primitive type. I will change it. |
include/llvm/CodeGen/PhysicalRegisterUsageInfo.h | ||
---|---|---|
30 ↗ | (On Diff #59113) | git clang-format takes a commit range, so you can format patches that are already committed. |
lib/CodeGen/RegUsageInfoCollector.cpp | ||
38 | If there is something in the macro expansion that makes you think it is called, please elaborate. |
lib/CodeGen/RegUsageInfoCollector.cpp | ||
---|---|---|
38 | No, I look carefully at that macro it provides definition for initialize.. method but yes some where we need to call that method. Also I haven't mentioned this pass as dependency for any other pass with INITIALIZE_PASS_DEPENDENCY other wise it can call that function. Now I am also having same question as you. |
Correction in typos , more adhering to coding standards.
PhysicalRegisterUsageInfo.cpp -> RegisterUsageInfo.cpp
DenseMap of GlobelVariable * to RegMask is used instead of StringMap.
DummyCGSCCPass moved to CallGraphSCCPass.h and .cpp
We're getting there, the patch is looking really good now! :)
Still a little bit of work ahead, see inline comments.
include/llvm/Analysis/CallGraphSCCPass.h | ||
---|---|---|
132 | Describe the class with a doxygen comment. | |
include/llvm/CodeGen/MachineOperand.h | ||
537 | Do not repeat the name of the function in the doxygen comment (other places that do it are outdated). | |
include/llvm/CodeGen/Passes.h | ||
366 | s/used/preserved/ | |
include/llvm/CodeGen/RegisterUsageInfo.h | ||
14 | You shouldn't introduce implementation details in high-level description. | |
lib/CodeGen/CMakeLists.txt | ||
103–105 | It is a good practice to decoupled software component. Having separate patches helps to make sure we indeed have correctly separated the components. So I agree with Quentin on the principle, and I think it is also a good exercise for you to split the patch an submit the analysis alone and tested. Have a look at "CostModelAnalysis::print()" in lib/Analysis/CostModel.cpp and see how it is tested in test/Analysis/CostModel/X86/cast.ll | |
lib/CodeGen/RegUsageInfoCollector.cpp | ||
12 | Wrapping | |
lib/CodeGen/TargetPassConfig.cpp | ||
119 | Why "compile time"? I'd just write "Enable inter-procedural register allocation" | |
515 | You forgot to remove this hunk |
lib/CodeGen/CMakeLists.txt | ||
---|---|---|
103–105 | @mehdi_amini I understand what you explain above but here I think RegisterUsageInfo is not tastable alone because it just holds RegMasks, RegisterInfoCollector is a trigger to IP regalloc and both of them can be tested together. Also to test X86RegUsageInfoPropagate it requires both of the above mentioned passes. But we can separate patches for RegisterUsageInfo + InfoCollector and X86RegUsageInfoPropagate ( condition that first patch is required to test second one) . |
lib/CodeGen/CMakeLists.txt | ||
---|---|---|
103–105 | Splitting in two is what I had in mind: the analysis part on one side, the transformation part on another. |
lib/CodeGen/CMakeLists.txt | ||
---|---|---|
103–105 | Just to make sure RegisterUsageInfo.cpp and RegUsageInfoCollector.cpp both are part of analysis so there is not need of separate patch for them but I will separate changes related to X86RegUsageInfoPropagate.cpp in other patch. |
@mehdi_amini I have made all suggested changes , I am writing a simple test for RegUsageInfoCollector so that next review you can also look at that. Thanks for your patience.
patch is splited into analysis and optimization pass. Simple test cases has been added.
include/llvm/CodeGen/RegisterUsageInfo.h | ||
---|---|---|
50 | add a method print() like other analysis, and call it here behind a flag. | |
55 | Doxygen | |
58 | Doxygen | |
test/CodeGen/Generic/reg-usage-info.ll | ||
2 ↗ | (On Diff #59669) | I don't like that we can't pass the test in release mode. I suggested that the dump occurs in doFinalization() for the analysis pass (i.e. PhysicalRegisterUsageInfo). You need to implement a proper cl::opt that controls a dump there, that could be used in release mode. |
4 ↗ | (On Diff #59669) | I'm worried that we don't provide any stability guarantee on the numbers printed here. |
test/CodeGen/Generic/reg-usage-info.ll | ||
---|---|---|
2 ↗ | (On Diff #59669) | Aren't we refering RegUsageInfoCollector as analysis pass, RegisterUsageInfo is there for keeping data around. |
4 ↗ | (On Diff #59669) | If RegUsageInfoCollector is considered as analysis pass then we can add cl::opt into that file and have analysis printed from that. |
test/CodeGen/Generic/reg-usage-info.ll | ||
---|---|---|
2 ↗ | (On Diff #59669) | Technically in LLVM the analysis is the pass that you require and query the result from (i.e. a pass you're using as getAnalysis()). But yeah terminology can be fuzzy. |
test/CodeGen/Generic/reg-usage-info.ll | ||
---|---|---|
2 ↗ | (On Diff #59669) | SO What is your suggestion? Which file should be changed? |
test/CodeGen/Generic/reg-usage-info.ll | ||
---|---|---|
2 ↗ | (On Diff #59669) | I think I already told you: the way analysis are tested at the IR level is using the -analyze flag in opt. Since we don't have -analyze in llc, the closest I can imagine is to implement the analysis (in the LLVM sense) with the "print(...)" method as other analyses, and call it in doFinalization() when a cl::opt flag is set. |
test/CodeGen/Generic/reg-usage-info.ll | ||
---|---|---|
4 ↗ | (On Diff #59669) | Will TRI pointer assigned a value by the MachineFunction pass ? Or Is there any way to get TargetMachine or TargetRegisterInfo from Module object? Also if PReg name is required to be printed then MCRegisterInfo will be required. |
a new command line option dump-regusage has been added so that reg usage information can be printed in release build. The test case is also changed accordingly.
include/llvm/CodeGen/RegisterUsageInfo.h | ||
---|---|---|
52 | Is the TRI dependent on the Target or the SubTarget? | |
57 | Same question. | |
lib/CodeGen/RegisterUsageInfo.cpp | ||
25 | I'm not sure the name of the option is great, but I'll leave the naming to Quentin/Matthias. | |
50 | Can you implement the print in a separate print method with the same signature as the other analyses for consistency? |
lib/CodeGen/RegUsageInfoCollector.cpp | ||
---|---|---|
129 | Mdl is not usual, M alone is more common (or sometimes Mod) |
lib/CodeGen/RegUsageInfoCollector.cpp | ||
---|---|---|
129 | Ok I will take care for naming , but that would be not required if we are going with Function * , because MF.getFunction() would be enough. |
lib/CodeGen/RegisterUsageInfo.cpp | ||
---|---|---|
57 | now I think such assertion wold be for Function *, right ? |
lib/CodeGen/RegUsageInfoCollector.cpp | ||
---|---|---|
131 | I tried out PRUI->storeUpdateRegUsageInfo(Mdl->getGlobalVariable(MF.getFunction()->getName(),true), std::move(RegMask)); and PRUI->storeUpdateRegUsageInfo(Mdl->getGlobalVariable(MF.getFunction()->getName()), std::move(RegMask)); but both returns nullptr for some functions. |
lib/CodeGen/RegUsageInfoCollector.cpp | ||
---|---|---|
131 | My previous comment called out the fact that you were using an API for global *variables* instead of function, and you replaced with a call to getGlobalVariable? There is a getFunction() API on the module that returns only functions, or there is a getNamedValue() that will return any symbol. However you have MF.getFunction(), so you should not call any of these. |
RegMasks uses Function * as key, print method and TargetMachine * have added to RegisterUsageInfo. Command line option -dump-regusage has been renamed to -print-regusage, test case is also updated to reflect the changes.
lib/CodeGen/RegisterUsageInfo.cpp | ||
---|---|---|
66 | pair is not coding-convention friendly. Also you are iterating on a map that is keyed on pointer values, which does not provided any ordering guarantee. Even with names as keys, the map is unordered anyway. | |
test/CodeGen/Generic/reg-usage-info.ll | ||
39 ↗ | (On Diff #59909) | This test is not OK, for the reasons I explained when I provided the other test (stability). |
Test case changed to simple test case which is originally provided by Mehdi Amini, print() now sorts analysis details before printing in RegisterUsageInfo.cpp
lib/CodeGen/RegUsageInfoCollector.cpp | ||
---|---|---|
93 | I think MatzeB mentioned that TRI is a subclass of MCRI, so I'm not sure why you're using MCRI at all while you have TRI. | |
124 | Why aren't you using MachineOperand::clobbersPhysReg here? | |
test/CodeGen/Generic/reg-usage-info.ll | ||
8 ↗ | (On Diff #60003) | What will be printed for bar1 and bar2? |
git clang-formated , test case changes, MCRI is not required , clobberedReg expression removed.
test/CodeGen/Generic/reg-usage-info.ll | ||
---|---|---|
15 ↗ | (On Diff #60158) | trailing whitespace here. |
Hi Vivek,
I was thinking about debug ability and usability of the produced code.
Something to consider in a following patch would be to add in the assembly output comments at the beginning of the function and/or related call site of the function listing what are the register preserved by the function.
The rationale is that now if we play with assembly files produced by different optimization phase, we want a quick way to check if two functions are equivalent.
E.g.,
Let say we have two assembly files produced for the same application:
- bad.s which fails, produced with ipra
- good.s which works, produced without ipra
When bisecting which function causes the failure, it is important to know if it is fine to take a function foo from good.s and replace it into bad.s. This may not be possible if the preserve register are not compatible.
Cheers,
-Quentin
@qcolombet. It will be very help full, I will look into that. Do we need to keep it default or a separate switch to get those comments ?
LGTM, with one inline comment.
include/llvm/CodeGen/RegisterUsageInfo.h | ||
---|---|---|
60 | Strip the "this method is provided to" everywhere. |
inline-asm related test case added Thanks to Mehdi Amini, and previous test case modified so that it do not fail due to transformation pass for IPRA.
Describe the class with a doxygen comment.