Currently, a BoundaryAlign fragment may be inserted after the branch
that needs to be aligned to truncate the current fragment, this fragment is
unused at most of time. To avoid that, we can insert a new empty Data
fragment instead. Non-relaxable instruction is usually emitted into Data
fragment, so the inserted empty Data fragment will be reused at a high
possibility.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Inserting a fragment before the current insertion point is problematic now,
avoid doing this
llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp | ||
---|---|---|
479 | I believe you can simply insert a new empty data fragment instead of adding the special state to all data fragments. Also, you only need to do so if the last fragment was a data fragment. |
llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp | ||
---|---|---|
479 | If the next intruction is emitted into a relaxable fragment, the empty data fragment will be unused, is it acceptable to you? |
llvm/lib/MC/MCAssembler.cpp | ||
---|---|---|
1022 | Do me a favour and leave this line in for this patch. It can be removed in a follow up, but the assumptions about offset calculation caching are subtle and it's clearly conservatively correct to invalidate too much. :) | |
llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp | ||
434 | Does this need to be a runtime check? Or can it be an assert? "assert(OS.getCurrentFragment()->getPrevNode() == PendingBoundaryAlign)" to be clear. Also, add braces. Even though there's a single statement the block comment deserves clear demarcation. | |
439 | Add to comment: and we'll tie them together in branchBoundaryEnd | |
454–459 | Drop the else since the previous if returned. | |
472 | Might be good to add an assert here about the expected instructions between pending BA and this one. | |
477 | I believe we only need this bit of code if we tied the aligned instruction into a previous BA. (i.e. we can early return above if there's no pending BA.) | |
479 | Yes. This is an idiom we already use in a few places and we can fix them all at once if it becomes an issue. |
llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp | ||
---|---|---|
434 | yes, we need to check it at runtime. An AlignFragment might be inserted after previous instruction(see the comment below), we need to insert a new BoundaryAlign before the JCC if there is a AlignFragment between the fusible pair. |
llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp | ||
---|---|---|
472 | How to check the expected instructions here? |
llvm/test/MC/X86/align-branch-64-negative.s | ||
---|---|---|
30–37 | Can I remove the second test? |
llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp | ||
---|---|---|
429 | PendingBoundaryAlign has been set to nullptr in alignBranchesEnd()? | |
469 | Only check PendingBoundaryAlign is enough? In alignBranchesBegin(), if needAlignInst(Inst), a PendingBoundaryAlign is added. | |
474 | After setting PendingBoundaryAlign to nullptr, it seems there is no chance to reset LastFragment of PendingBoundaryAlign for macro fusion case. | |
476 | I don't understand why further data isn't added to the current data. Is it because we need get instruction size? Could you comments it in the code? |
llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp | ||
---|---|---|
429 | We need to clear the PendingBoundaryAlign
alignBranchesEnd is responsible for the second case and alignBranchesBegin is responsible for the first case. | |
469 | That' not enough. We need to handle the macro fusion case. When PrevInst is the first instruction in a fusible pair, the PendingBoundaryAlign is not null and we should return directly here. | |
474 | The macro fusion case is corretly handled here. We do not need to reset LastFragment of PendingBoundaryAlign. Only when the Inst needs to be aligned, we can reach here and LastFragment is only set once. | |
476 | Yes, it is because we need get instruction size. I will add it to comments. |
llvm/test/MC/X86/align-branch-64-negative.s | ||
---|---|---|
30–36 | Why is the test deleted? |
llvm/test/MC/X86/align-branch-64-negative.s | ||
---|---|---|
30–36 | This test is a negative test and we expect the label is bounded to the end of the call rather than the start of the jmp. As commented in this test file, the label "label_after" is in the right place now because we insert a new empty DataFragment after the call. |
llvm/test/MC/X86/align-branch-64-negative.s | ||
---|---|---|
30–36 | Please update the check line instead. I can't tell from the diff or not whether the change made is correct or not. p.s. Commenting out a test is almost never the right answer. |
I applied the patch and reviewed the changed code sequence directly. It's more correct than what was there before.
LGTM, but please make sure the test check lines are updated so that the diff is visible before landing.
Do me a favour and leave this line in for this patch. It can be removed in a follow up, but the assumptions about offset calculation caching are subtle and it's clearly conservatively correct to invalidate too much. :)