During register swapping, the code fragments associated with the
function need to be swapped together (which may be generated during
PGO optimization).
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Thanks for fixing this, @hzq !
Some suggestions below.
Ideally we would have a test case loosely similar to this one (https://reviews.llvm.org/D138245) that shows that the fix is working as intended. Let me know if you need help writing this.
Finally, be aware that this pass is also experimental because it does not update debug info. So it really depends on how serious you are about keeping debug info accurate.
bolt/include/bolt/Core/BinaryFunction.h | ||
---|---|---|
377 | ||
1798–1800 | ||
bolt/lib/Passes/RegReAssign.cpp | ||
326–328 | ||
378–380 |
Ideally we would have a test case loosely similar to this one (https://reviews.llvm.org/D138245) that shows that the fix is working as intended. Let me know if you need help writing this.
Finally, be aware that this pass is also experimental because it does not update debug info. So it really depends on how serious you are about keeping debug info accurate.
Thanks. will modify and try to write test cases.
Hi, @rafauler , I'm sorry for the delayed response. Could you please help me take a look at this test case to see if it is valid?
Thanks! Mostly formatting nits:
bolt/include/bolt/Core/BinaryFunction.h | ||
---|---|---|
377 | Since you added a new getter for "getParentFraments", move this "using" to line 369 and change ParentFragments to use FrafmentsSetTy, just like Fragments. | |
bolt/lib/Passes/RegReAssign.cpp | ||
195–197 | nit: drop braces here: | |
198–202 | ||
229 | clang-format | |
230–232 | ||
342 | clang-format | |
343–345 | ||
bolt/test/X86/reg-reassign-swap-cold.s | ||
12 ↗ | (On Diff #546864) | Because this tests runs the produced output, we need to move this test to the folder test/runtime/X86 |
Thanks! LGTM with some modifications I made to the test file available here: https://pastebin.com/3abP1S5G
I modified the test to fail if the optimization is applied just to main, but not to main.cold, and vice-versa. Please commit this version. I also used basic block labels when generating FDATA instead of hardcoded offsets, so the test is resilient to changes in it. I further modified main.cold to be a global symbol, so the symbol is not stripped when we run llvm-strip. We do need to run llvm-strip because I'm not declaring bb labels as temp symbols anymore (.L1, etc), in order to use their names when generating fdata. That's the way we usually write the tests (function names are globals, bb names are locals but non-temp so we don't need to hardcode fdata).