This is an archive of the discontinued LLVM Phabricator instance.

[X86][AsmParser] Refactor code in AsmParser
ClosedPublic

Authored by skan on May 7 2023, 7:39 AM.

Details

Summary
  1. Share the optimizeInstFromVEX3ToVEX2 code with MCInstLower
  2. Move the shift/rotate encoding-optimization code to a separate file

Diff Detail

Event Timeline

skan created this revision.May 7 2023, 7:39 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 7 2023, 7:39 AM
skan requested review of this revision.May 7 2023, 7:39 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 7 2023, 7:39 AM
barannikov88 added inline comments.
llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
3814

The comment was kind of useful. I was about to suggest adding an InstAlias.

barannikov88 added inline comments.May 7 2023, 8:54 AM
llvm/lib/Target/X86/X86InstrAsmAlias.td
554

I think this can be done by implementing validateTargetOperandClass for MCK__36_1 ($1).
See how ARM does this.
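For reference, a minimal sketch (not code from this patch) of what that hook could look like on the X86 side, modeled on ARMAsmParser::validateTargetOperandClass; the exact body is an assumption, and MCK__36_1 only exists once the corresponding InstAlias is written:

// Hypothetical sketch: accept a literal $1 operand for the MCK__36_1 match
// class, mirroring how ARMAsmParser validates fixed-value InstAlias operands.
unsigned X86AsmParser::validateTargetOperandClass(MCParsedAsmOperand &AsmOp,
                                                  unsigned Kind) {
  X86Operand &Op = static_cast<X86Operand &>(AsmOp);
  switch (Kind) {
  default:
    break;
  case MCK__36_1: // The "$1" operand of the InstAlias.
    if (Op.isImm())
      if (auto *CE = dyn_cast<MCConstantExpr>(Op.getImm()))
        if (CE->getValue() == 1)
          return Match_Success;
    break;
  }
  return Match_InvalidOperand;
}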

craig.topper added inline comments.May 7 2023, 9:42 AM
llvm/lib/Target/X86/MCTargetDesc/X86InstrOptimization.cpp
1 ↗(On Diff #520188)

X86EncodingOptimization might be a better name?

70 ↗(On Diff #520188)

I think you could have a single argument and add the i or 1 suffix inside the macro.
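For example, a sketch of what that single-argument macro could look like (the TO_IMM1 name is the one adopted later in this review; the exact body is an assumption):

// Hypothetical sketch: take only the base opcode name and paste the "i"/"1"
// suffixes on inside the macro, so each case line needs a single argument.
#define TO_IMM1(FROM)                                                          \
  case X86::FROM##i:                                                           \
    NewOpc = X86::FROM##1;                                                     \
    break;

// e.g. TO_IMM1(SHR8r) would cover the X86::SHR8ri -> X86::SHR8r1 case.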

craig.topper added inline comments.May 7 2023, 10:56 AM
llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
3814

Agreed. Just say "We can't write this as an InstAlias."

llvm/lib/Target/X86/MCTargetDesc/X86InstrOptimization.cpp
18 ↗(On Diff #520188)

Add a blank line before this.

19 ↗(On Diff #520188)

Perhaps you could do something like

unsigned OpIdx1, OpIdx2;
unsigned NewOpc;
switch (Opc) {
default:
  return true;
// Use macros to set OpIdx1, OpIdx2, NewOpc;
}

if (X86II::isX86_64ExtendedReg(MI.getOperand(OpIdx1).getReg()) ||
    !X86II::isX86_64ExtendedReg(MI.getOperand(OpIdx2).getReg()))
  return false;
MI.setOpcode(NewOpc);
return true;

Then you're not duplicating the calls to isX86_64ExtendedReg and setOpcode in every macro?

55 ↗(On Diff #520188)

Why bring iterators into this? Can't we use getNumOperands() - 1?
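In other words, something like this sketch (assuming the immediate being checked is the trailing operand of the MCInst):

// Hypothetical sketch: index the last operand directly instead of using an
// operand iterator.
const MCOperand &LastOp = MI.getOperand(MI.getNumOperands() - 1);
if (!LastOp.isImm() || LastOp.getImm() != 1)
  return false;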

62 ↗(On Diff #520188)

Do this outside the switch instead of duplicating for each case?

craig.topper added inline comments.May 7 2023, 3:09 PM
llvm/lib/Target/X86/MCTargetDesc/X86InstrOptimization.cpp
54 ↗(On Diff #520188)

Not for this patch, but should we use this in X86MCInstLower and remove all the isel patterns that select the 1 immediate version?

skan updated this revision to Diff 520240.May 7 2023, 8:09 PM
skan marked 8 inline comments as done.

Address review comments

skan marked an inline comment as done.May 7 2023, 8:16 PM
skan added inline comments.
llvm/lib/Target/X86/MCTargetDesc/X86InstrOptimization.cpp
54 ↗(On Diff #520188)

Not for this patch, but should we use this in X86MCInstLower and remove all the isel patterns that select the 1 immediate version?

Good idea! I can do it after this patch.

55 ↗(On Diff #520188)

Why bring iterators into this? Can't we use getNumOperands() - 1?

llvm/lib/Target/X86/X86InstrAsmAlias.td
554

Thanks for the info! I added the comments to optimizeShiftRotateWithImmediateOne.

craig.topper added inline comments.May 7 2023, 8:20 PM
llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
3661–3662

Can we use Inst.clear()?

llvm/lib/Target/X86/MCTargetDesc/X86EncodingOptimization.cpp
70

Why not make this part of the TO_IMM1 macro?

barannikov88 added inline comments.May 7 2023, 8:32 PM
llvm/lib/Target/X86/X86InstrAsmAlias.td
554

I tried to do what I suggested and it doesn't work for all cases.
The "short" variants should be matched first, but this is not always happening.
Currently there doesn't seem to be a way to force ordering between the variants.
(AsmOperandClass has SuperClasses field that guarantees partial ordering, but there is no explicit operand to attach the class to.)

skan updated this revision to Diff 520243.May 7 2023, 8:48 PM
skan marked 2 inline comments as done.

Address review comments: Integrate FROM_TO into TO_IMM1

skan edited the summary of this revision. (Show Details)May 7 2023, 8:50 PM
skan marked an inline comment as done.May 7 2023, 8:55 PM
skan added inline comments.
llvm/lib/Target/X86/X86InstrAsmAlias.td
554

I tried to do what I suggested and it doesn't work for all cases.
The "short" variants should be matched first, but this is not always happening.
Currently there doesn't seem to be a way to force ordering between the variants.
(AsmOperandClass has SuperClasses field that guarantees partial ordering, but there is no explicit operand to attach the class to.)

It doesn't matter. As Craig suggested, we may remove all the isel patterns that select the 1-immediate version by using optimizeShiftRotateWithImmediateOne, so InstAlias might not be the better direction.

But I still think the comment about validateTargetOperandClass is valuable because it provides a new perspective.

skan updated this revision to Diff 520245.May 7 2023, 9:00 PM
skan marked 2 inline comments as done.

Address review comments: Use Inst.clear()

craig.topper accepted this revision.May 7 2023, 9:18 PM

LGTM with that comment.

llvm/lib/Target/X86/MCTargetDesc/X86EncodingOptimization.cpp
68

This can be moved below the switch and then you wouldn't need to check how many operands there are.

This revision is now accepted and ready to land.May 7 2023, 9:18 PM
skan updated this revision to Diff 520254.May 7 2023, 10:25 PM

Address review comments: simplify code

This revision was landed with ongoing or failed builds.May 7 2023, 10:27 PM
This revision was automatically updated to reflect the committed changes.

Is it expected that this patch would cause any differences in the final binary? It seems to be just a refactoring, but we bisected a runtime crash to this commit.

skan added a comment.May 11 2023, 9:40 PM

Is it expected that this patch would cause any differences in the final binary? It seems to be just a refactoring, but we bisected a runtime crash to this commit.

In theory, it should not cause any difference in the binary. Could you show the difference or provide a small reproducer?

Is it expected that this patch would cause any differences in the final binary? It seems to be just a refactoring, but we bisected a runtime crash to this commit.

In theory, it should not cause any difference in the binary. Could you show the difference or provide a small reproducer?

I don't have anything yet, since it's an internal test case. I'm going to bisect what the difference is, but the failure still reproduces at this commit and not at the one prior -- so _something_ must be different, I think.

bgraur added a subscriber: bgraur.May 11 2023, 10:18 PM
alexfh added a subscriber: alexfh.May 12 2023, 2:42 AM

With -Os and -march=haswell the patch produces differences similar to this one:

        vpsrldq $8, %xmm8, %xmm8                # xmm8 = xmm8[8,9,10,11,12,13,14,15],zero,zero,zero,zero,zero,zero,zero,zero
        vpxor   %xmm8, %xmm9, %xmm8
        vpsrldq $4, %xmm8, %xmm9                # xmm9 = xmm8[4,5,6,7,8,9,10,11,12,13,14,15],zero,zero,zero,zero
-       vmovss  %xmm8, %xmm4, %xmm8             # xmm8 = xmm8[0],xmm4[1,2,3]
+       vmovss  %xmm4, %xmm8, %xmm8             # xmm8 = xmm4[0],xmm8[1,2,3]
        vpclmulqdq      $0, %xmm8, %xmm5, %xmm8
        vpxor   %xmm9, %xmm8, %xmm8
-       vmovss  %xmm8, %xmm4, %xmm9             # xmm9 = xmm8[0],xmm4[1,2,3]
+       vmovss  %xmm4, %xmm8, %xmm9             # xmm9 = xmm4[0],xmm8[1,2,3]
        vpclmulqdq      $1, %xmm9, %xmm6, %xmm9
-       vmovss  %xmm9, %xmm4, %xmm9             # xmm9 = xmm9[0],xmm4[1,2,3]
+       vmovss  %xmm4, %xmm9, %xmm9             # xmm9 = xmm4[0],xmm9[1,2,3]
        vpclmulqdq      $0, %xmm9, %xmm7, %xmm9
        vpxor   %xmm9, %xmm8, %xmm8
        vpextrd $1, %xmm8, %ecx

Looks wrong to me.

skan added a comment.May 12 2023, 3:50 AM

With -Os and -march=haswell the patch produces differences similar to this one:

        vpsrldq $8, %xmm8, %xmm8                # xmm8 = xmm8[8,9,10,11,12,13,14,15],zero,zero,zero,zero,zero,zero,zero,zero
        vpxor   %xmm8, %xmm9, %xmm8
        vpsrldq $4, %xmm8, %xmm9                # xmm9 = xmm8[4,5,6,7,8,9,10,11,12,13,14,15],zero,zero,zero,zero
-       vmovss  %xmm8, %xmm4, %xmm8             # xmm8 = xmm8[0],xmm4[1,2,3]
+       vmovss  %xmm4, %xmm8, %xmm8             # xmm8 = xmm4[0],xmm8[1,2,3]
        vpclmulqdq      $0, %xmm8, %xmm5, %xmm8
        vpxor   %xmm9, %xmm8, %xmm8
-       vmovss  %xmm8, %xmm4, %xmm9             # xmm9 = xmm8[0],xmm4[1,2,3]
+       vmovss  %xmm4, %xmm8, %xmm9             # xmm9 = xmm4[0],xmm8[1,2,3]
        vpclmulqdq      $1, %xmm9, %xmm6, %xmm9
-       vmovss  %xmm9, %xmm4, %xmm9             # xmm9 = xmm9[0],xmm4[1,2,3]
+       vmovss  %xmm4, %xmm9, %xmm9             # xmm9 = xmm4[0],xmm9[1,2,3]
        vpclmulqdq      $0, %xmm9, %xmm7, %xmm9
        vpxor   %xmm9, %xmm8, %xmm8
        vpextrd $1, %xmm8, %ecx

Looks wrong to me.

I think I know what the bug is here and will prepare a patch to fix it soon.

With -Os and -march=haswell the patch produces differences similar to this one:

        vpsrldq $8, %xmm8, %xmm8                # xmm8 = xmm8[8,9,10,11,12,13,14,15],zero,zero,zero,zero,zero,zero,zero,zero
        vpxor   %xmm8, %xmm9, %xmm8
        vpsrldq $4, %xmm8, %xmm9                # xmm9 = xmm8[4,5,6,7,8,9,10,11,12,13,14,15],zero,zero,zero,zero
-       vmovss  %xmm8, %xmm4, %xmm8             # xmm8 = xmm8[0],xmm4[1,2,3]
+       vmovss  %xmm4, %xmm8, %xmm8             # xmm8 = xmm4[0],xmm8[1,2,3]
        vpclmulqdq      $0, %xmm8, %xmm5, %xmm8
        vpxor   %xmm9, %xmm8, %xmm8
-       vmovss  %xmm8, %xmm4, %xmm9             # xmm9 = xmm8[0],xmm4[1,2,3]
+       vmovss  %xmm4, %xmm8, %xmm9             # xmm9 = xmm4[0],xmm8[1,2,3]
        vpclmulqdq      $1, %xmm9, %xmm6, %xmm9
-       vmovss  %xmm9, %xmm4, %xmm9             # xmm9 = xmm9[0],xmm4[1,2,3]
+       vmovss  %xmm4, %xmm9, %xmm9             # xmm9 = xmm4[0],xmm9[1,2,3]
        vpclmulqdq      $0, %xmm9, %xmm7, %xmm9
        vpxor   %xmm9, %xmm8, %xmm8
        vpextrd $1, %xmm8, %ecx

Looks wrong to me.

I think I know what the bug is here and will prepare a patch to fix it soon.

The proposed fix (https://reviews.llvm.org/D150440) is a non-trivial change on its own, which might introduce more issues. I suggest reverting this commit to return trunk to a good state, and then getting this patch reviewed together with the fix.