- Share code optimizeInstFromVEX3ToVEX2 with MCInstLower
- Move the code of optimization for shift/rotate to a separate file
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp | ||
---|---|---|
3814 | The comment was kind of useful. I was about to suggest adding InstAlias. |
llvm/lib/Target/X86/X86InstrAsmAlias.td | ||
---|---|---|
554 | I think this can be done by implementing validateTargetOperandClass for MCK__36_1 ($1). |
llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp | ||
---|---|---|
3814 | Agreed. Just say "We can't write this as an InstAlias." | |
llvm/lib/Target/X86/MCTargetDesc/X86InstrOptimization.cpp | ||
18 | Adda blank line before this. | |
19 | Perhaps you could do something like unsigned OpIdx1, OpIdx2; unsigned NewOpc; switch (Opc) default: return true; // Use macros to set OpIdx1, OpIdx2, NewOpc; } if (X86II::isX86_64ExtendedReg(MI.getOperand(OpIx1).getReg()) || !X86II::isX86_64ExtendedReg(MI.getOperand(OpIdx2).getReg())) return false; MI.setOpcode(New); return true; Then you're not duplicating the calls to isX86_64ExtendedReg and setOpcode in every macro? | |
55 | Why bring iterators into this? Can't we use getNumOperands() - 1? | |
62 | Do this outside the switch instead of duplicating for each case? |
llvm/lib/Target/X86/MCTargetDesc/X86InstrOptimization.cpp | ||
---|---|---|
54 | Not for this patch, but should we use this in X86MCInstLower and remove all the isel patterns that select the 1 immediate version? |
llvm/lib/Target/X86/MCTargetDesc/X86InstrOptimization.cpp | ||
---|---|---|
54 |
Good idea! I can do it after this patch. | |
55 |
| |
llvm/lib/Target/X86/X86InstrAsmAlias.td | ||
554 | Thanks for the info! I added the comments to optimizeShiftRotateWithImmediateOne. |
llvm/lib/Target/X86/X86InstrAsmAlias.td | ||
---|---|---|
554 | I tried to do what I suggested and it doesn't work for all cases. |
llvm/lib/Target/X86/X86InstrAsmAlias.td | ||
---|---|---|
554 |
It doesn't matter. As craig suggested, we may remove all the isel patterns that select the 1 immediate version by optimizeShiftRotateWithImmediateOne, so InstAlias might be not a better direction. But I still think the comment about validateTargetOperandClass is valuable b/c it provides a new perspective. |
LGTM with that comment.
llvm/lib/Target/X86/MCTargetDesc/X86EncodingOptimization.cpp | ||
---|---|---|
67 ↗ | (On Diff #520245) | This can be moved below the switch and then you wouldn't need to check how many operands there are. |
Is it expected that this patch would cause any differences in the final binary? It seems to be just a refactoring, but we bisected a runtime crash to this commit.
It should not cause any difference in binary in theory. Could you show the difference or provide a small reproducer?
I don't have anything yet, since it's an internal test case. I'm going to bisect what the difference is, but it's still reproducing a failure at this commit but not the one prior -- so _something_ must be different, I think.
With -Os and -march=haswell the patch produces differences similar to this one:
vpsrldq $8, %xmm8, %xmm8 # xmm8 = xmm8[8,9,10,11,12,13,14,15],zero,zero,zero,zero,zero,zero,zero,zero vpxor %xmm8, %xmm9, %xmm8 vpsrldq $4, %xmm8, %xmm9 # xmm9 = xmm8[4,5,6,7,8,9,10,11,12,13,14,15],zero,zero,zero,zero - vmovss %xmm8, %xmm4, %xmm8 # xmm8 = xmm8[0],xmm4[1,2,3] + vmovss %xmm4, %xmm8, %xmm8 # xmm8 = xmm4[0],xmm8[1,2,3] vpclmulqdq $0, %xmm8, %xmm5, %xmm8 vpxor %xmm9, %xmm8, %xmm8 - vmovss %xmm8, %xmm4, %xmm9 # xmm9 = xmm8[0],xmm4[1,2,3] + vmovss %xmm4, %xmm8, %xmm9 # xmm9 = xmm4[0],xmm8[1,2,3] vpclmulqdq $1, %xmm9, %xmm6, %xmm9 - vmovss %xmm9, %xmm4, %xmm9 # xmm9 = xmm9[0],xmm4[1,2,3] + vmovss %xmm4, %xmm9, %xmm9 # xmm9 = xmm4[0],xmm9[1,2,3] vpclmulqdq $0, %xmm9, %xmm7, %xmm9 vpxor %xmm9, %xmm8, %xmm8 vpextrd $1, %xmm8, %ecx
Looks wrong to me.
The proposed fix (https://reviews.llvm.org/D150440) is a non-trivial change on its own, which might introduce more issues. I suggest to revert this commit to return the trunk to a good state and then get this patch reviewed together with the fix.
clang-format not found in user’s local PATH; not linting file.