This is an archive of the discontinued LLVM Phabricator instance.

[RFC][X86][MemFold] Upgrade the mechanism of auto-generated Memory Folding Table
ClosedPublic

Authored by yubing on Jan 18 2023, 11:01 PM.

Details

Summary
  1. Align ManualMapSet with X86MemoryFoldTableEntry instead of using UnfoldStrategy
  2. ManualMapSet able to update the existing record in auto-generated MemFold table

Diff Detail

Event Timeline

yubing created this revision.Jan 18 2023, 11:01 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 18 2023, 11:01 PM
Herald added a subscriber: mgrang. · View Herald Transcript
yubing requested review of this revision.Jan 18 2023, 11:01 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 18 2023, 11:01 PM
yubing retitled this revision from [X86][MemFold] Upgrade the mechanism of auto-generated Memory Folding Table 1. Align ManualMapSet with X86MemoryFoldTableEntry instead of using UnfoldStrategy 2. ManualMapSet able to update the existing record in auto-generated MemFold table to [X86][MemFold] Upgrade the mechanism of auto-generated Memory Folding Table.Jan 18 2023, 11:04 PM
yubing edited the summary of this revision. (Show Details)
yubing added a comment.EditedJan 18 2023, 11:10 PM

PLAN:

  1. Move MemoryFoldTable2Addr MemoryFoldTable0~4 into X86InstrFoldTables.def from llvm/lib/Target/X86/X86InstrFoldTables.cpp https://reviews.llvm.org/D142083
  2. Update llvm/lib/Target/X86/X86InstrFoldTables.def with adding records, mainly for avx512fp16 https://reviews.llvm.org/D143149
  3. Upgrade mechanism of auto-generated memfolding table https://reviews.llvm.org/D142084
  4. Update ManualMapSet in X86FoldTablesEmitter.cpp to make x86-auto-memfold.td pass
  5. small modification for remaining different ~30 records
craig.topper added inline comments.
llvm/utils/TableGen/X86FoldTablesEmitter.cpp
33

Where is NO_STRATEGY define? It looks like it was deleted from the left hand of this diff.

skan added inline comments.Jan 19 2023, 12:41 AM
llvm/utils/TableGen/X86FoldTablesEmitter.cpp
61

Don't duplicate the definitions in llvm/lib/Target/X86/X86InstrFoldTables.h, try finding a common place

82

See my comments in https://reviews.llvm.org/D142083, we can change thing to print here

yubing updated this revision to Diff 490392.Jan 19 2023, 12:42 AM

give 0 to default Strategy of ManualMapEntry

yubing added inline comments.Jan 19 2023, 12:49 AM
llvm/utils/TableGen/X86FoldTablesEmitter.cpp
33

changed to 0

skan added inline comments.Jan 19 2023, 2:31 AM
llvm/test/TableGen/x86-auto-memfold.td
3

Drop "-asmwriternum=1" "-write-if-changed"

4

Drop this line by using pipleline in the previous line.

5

We should remove all the \t or space in INC file, don't use "tr" for it.

8

Drop the XFAIL, this patch should be merged after fp16 records are added.

llvm/utils/TableGen/X86FoldTablesEmitter.cpp
45–46

Move this into a .inc as a whitelist

pengfei added inline comments.Jan 19 2023, 3:17 AM
llvm/utils/TableGen/X86FoldTablesEmitter.cpp
30

Remove.

50

1?

122

Should be a space here.

124

Add assert to make sure LHS and RHS not null?

389

How about !!(S & TB_NO_REVERSE)?

Matt added a subscriber: Matt.Jan 25 2023, 9:07 AM
yubing updated this revision to Diff 494163.Feb 1 2023, 9:58 PM

address all the comments

yubing added inline comments.Feb 1 2023, 9:59 PM
llvm/utils/TableGen/X86FoldTablesEmitter.cpp
50

i copy the enum's definition from llvm/lib/Target/X86/X86InstrFoldTables.h

122

clang-format can't do this. i observed

skan added inline comments.Feb 1 2023, 10:32 PM
llvm/test/TableGen/x86-auto-memfold.td
3

Question: what do we skip here?

skan added inline comments.Feb 1 2023, 10:37 PM
llvm/utils/TableGen/X86FoldTablesEmitterManualMapSet.inc
2

This comment does not applies to all the entries in the table, I think we should move it to a suitable place. in this file.
And we need a clarification for each group in this table.

yubing added inline comments.Feb 2 2023, 12:18 AM
llvm/test/TableGen/x86-auto-memfold.td
3

the first 568 bytes are:

/*===- TableGen'erated file -------------------------------------*- C++ -*-===*\
|*                                                                            *|
|* X86 fold tables                                                            *|
|*                                                                            *|
|* Automatically generated file, do not edit!                                 *|
|*                                                                            *|
\*===----------------------------------------------------------------------===*/
skan added a comment.Feb 6 2023, 6:14 AM

Reverse ping @yubing

RKSimon added inline comments.Feb 8 2023, 5:46 AM
llvm/lib/Target/X86/X86MemFoldTables.inc
261

It would be good to get some/all of these diffs committed first to minimize any changes in codegen due to this patch - the patch should only be about the build process.

skan added inline comments.Feb 8 2023, 6:29 AM
llvm/lib/Target/X86/X86MemFoldTables.inc
261

Both methods are okay to me. I suggested the author to merge the table change into this PR b/c we are not confident the update in the table is correct. The new table needs a careful review.

And if the reviewer doubted an entry was incorrect, the author could point out which rule used in the X86FoldTablesEmitter.cpp was "guilty" and get suggestion/feedback from reviewers.

A separate diff would be the best way if we could check the correctness of the table just by looking at the entry themselves.

skan retitled this revision from [X86][MemFold] Upgrade the mechanism of auto-generated Memory Folding Table to [RFC][X86][MemFold] Upgrade the mechanism of auto-generated Memory Folding Table.Feb 8 2023, 6:48 AM
yubing added inline comments.Feb 13 2023, 1:32 AM
llvm/lib/Target/X86/X86MemFoldTables.inc
508–509

should remove one of them otherwise leads to duplicate entry in unfolding table.
i will remove {X86::MMX_MOVQ64rr, X86::MMX_MOVQ64rm, 0}, and write "{X86::MMX_MOVQ64rr, X86::MMX_MOVQ64rm, noforward || noreverse}," in llvm/utils/TableGen/X86FoldTablesEmitterManualMapSet.inc

craig.topper added inline comments.Feb 13 2023, 9:08 PM
llvm/lib/Target/X86/X86MemFoldTables.inc
612

These aren't real load instructions. It doesn't really make sense for them to be in the table. Folding the load would make the load not happen. That would be incorrect for volatile.

4009

Would this allow unfolding a VMOVAPDZ128rmk to VMOVAPDZ128rrk + VMOVAPDZ128rm? That would be a bug.

yubing added inline comments.Feb 15 2023, 6:39 AM
llvm/lib/Target/X86/X86MemFoldTables.inc
4009

@craig.topper
https://discourse.llvm.org/t/auto-generate-the-memory-folding-tables/61100/19
i remember we always fold whole load intrinsics into masked arithmetic. does it happen to masked movereg instruction as well?(only wholeload+ movereg => mask.load)

besides, do we need to add TB_NO_REVERSE for {X86::VMOVAPDZ128rr, X86::VMOVAPDZ128rm, TB_ALIGN_16}?

craig.topper added inline comments.Feb 15 2023, 8:31 AM
llvm/lib/Target/X86/X86MemFoldTables.inc
4009

What I said previously applied to arithmetic that has loads folded into them.

VMOVAPDZ128rmk is created from a masked load with an isel pattern.

VMOVAPDZ128rm will never be unfolded. We unfold to separate a loop invariant load from arithmetic. There’s no arithmetic folded in to it.. the only operands are address operands. They would all need to be loop invariant to hoist so the whole instruction is hoistable.

For VMOVAPDZ128rmk there is a mask operand and address operands. Without TB_NO_REVERSE we would try to unfold it if the address was loop invariant but the mask wasn’t. This would be incorrect. The mask may be masking off elements that will fault.

yubing added inline comments.Feb 15 2023, 8:17 PM
llvm/lib/Target/X86/X86MemFoldTables.inc
4009

thanks for explanation. i also check the code in llvm/lib/CodeGen/MachineLICM.cpp

// First check whether we should hoist this instruction.
if (!IsLoopInvariantInst(*MI) || !IsProfitableToHoist(*MI)) {
  // If not, try unfolding a hoistable load.
  MI = ExtractHoistableLoad(MI);
  if (!MI) return false;
}

since VMOVAPDZ128rmk((the address isloop invariant but the mask is not)) is not IsLoopInvariantInst, we can extract a wholeload from it if it is Without TB_NO_REVERSE. so what you said about VMOVAPDZ128rmk and VMOVAPDZ128rm makes sense to me.

yubing updated this revision to Diff 497875.Feb 15 2023, 8:21 PM

make X86::VMOVAPDZ128rrk, X86::VMOVAPDZ128rmk TB_NO_REVERSE

yubing added inline comments.Feb 15 2023, 8:22 PM
llvm/utils/TableGen/X86FoldTablesEmitter.cpp
429

Is there better way to check no-kz version's isMoveReg?

yubing updated this revision to Diff 497876.Feb 15 2023, 8:29 PM

remove {"MMX_MOVQ64rr", "MMX_MOVQ64mr"}

yubing updated this revision to Diff 497891.Feb 15 2023, 11:07 PM

disable {"MMX_MOVQ64rr", "MMX_MOVQ64rm"} as well

yubing updated this revision to Diff 497918.Feb 16 2023, 12:56 AM

add comments for removing {"MMX_MOVQ64rr", "MMX_MOVQ64rm"}

the patch can let internal testcase pass. no regression was found in benchmark.

yubing updated this revision to Diff 503638.Mar 8 2023, 10:52 PM

small rebase

skan accepted this revision.Mar 14 2023, 11:51 PM

LGTM in general

llvm/utils/TableGen/X86FoldTablesEmitter.cpp
19

What's header file Error.h used for?

87

Drop the comment?

110

Is this operator never used?

127

Why do we need to compare isPseudo?

432

Do we need to assert the result of getDef is not NULL?

439

Why do we need Result.IsStore here?

This revision is now accepted and ready to land.Mar 14 2023, 11:51 PM
skan added inline comments.Mar 14 2023, 11:55 PM
llvm/utils/TableGen/X86FoldTablesEmitterManualMapSet.inc
78

unfoldingtable -> unfolding table

81

ditto

yubing added inline comments.Mar 15 2023, 12:40 AM
llvm/utils/TableGen/X86FoldTablesEmitter.cpp
439

load such as VMOVAPDZ128rm will never be unfolded.
if VMOVAPDZ128rm 's memoperand is invariant, then VMOVAPDZ128rm will be hoisted directly and won't be unfolded into rm+rr
if VMOVAPDZ128rm 's memoperand is not invariant, we stop doing unfolding since compiler find it is a simple load.

// First check whether we should hoist this instruction.
if (!IsLoopInvariantInst(*MI) || !IsProfitableToHoist(*MI)) {
  // If not, try unfolding a hoistable load.
  MI = ExtractHoistableLoad(MI);
  if (!MI) return false;
}
MachineInstr *MachineLICMBase::ExtractHoistableLoad(MachineInstr *MI) {
  // Don't unfold simple loads.
  if (MI->canFoldAsLoad())
    return nullptr;

@craig.topper , does line429~439 make sense to you as well?

yubing added inline comments.Mar 15 2023, 12:58 AM
llvm/utils/TableGen/X86FoldTablesEmitter.cpp
127

I copy the code from "bool operator<(const X86FoldTableEntry &RHS) const {",
@craig.topper , i saw the author of "bool operator<(const X86FoldTableEntry &RHS) const {" is you. do you still remember why we need to compare isPseudo here?

skan added inline comments.Mar 15 2023, 1:06 AM
llvm/utils/TableGen/X86FoldTablesEmitter.cpp
127

I think it's a NFC change from me. If we remove this comparison, will any test fail?

craig.topper added inline comments.Mar 15 2023, 2:00 AM
llvm/utils/TableGen/X86FoldTablesEmitter.cpp
127

It's the sort order for the enum in X86GenInstrInfo.inc. Instructions with isPseudo set are ordered before other instructions.

does unfolding only happen in LICM?

llvm/utils/TableGen/X86FoldTablesEmitter.cpp
439

@craig.topper i think my analysis of VMOVAPDZ128rm is only for LICM. but for other passes in codegen, is it possible for VMOVAPDZ128rm to be unfolded?

craig.topper added inline comments.Mar 15 2023, 11:04 PM
llvm/utils/TableGen/X86FoldTablesEmitter.cpp
439

The other 2 places I know that unfold are TwoAddressInstructionPass and SelectionDAG->MachineIR

TwoAddressInstuctionPass should only happen for instructions with tied source and dest. That doesn't apply to moveReg.

I think SelectionDAG->MachineIR case happens if we need to duplicate an instruction that has an EFLAGS def. If the EFLAGS are used by two instructions and EFLAGS is clobbered in between them. We need to duplicate the flag producing instruction to satisfy the second user. But we can't duplicate the folded load so we have to unfold. That doesn't apply to moveReg.

yubing added inline comments.Mar 16 2023, 12:09 AM
llvm/utils/TableGen/X86FoldTablesEmitter.cpp
439

i saw two kinds of unfoldMemoryOperand, one of which is for MIR and another is for DAG. Our folding table can't affect unfoldMemoryOperand for DAG, right?
and unfoldMemoryOperand for MIR only happens in LICM, TwoAddress, X86CMOVConversion, X86KCFI, and X86SpeculativeLoadHardening.
i checked VMOVAPDZ128rm won't be handled in those Pasess. so we don't worry if we need to set TB_NO_REVERSE for VMOVAPDZ128rm.

yubing updated this revision to Diff 505713.Mar 16 2023, 12:24 AM

address skan's comments

craig.topper added inline comments.Mar 16 2023, 12:25 AM
llvm/utils/TableGen/X86FoldTablesEmitter.cpp
439

I think the DAG unfoldMemoryOperand uses the same table.

craig.topper added inline comments.Mar 16 2023, 12:27 AM
llvm/utils/TableGen/X86FoldTablesEmitter.cpp
439

The DAG unfold occurs after the isel portion of SelectionDAG. So it's MachineOpcodes

yubing added inline comments.Mar 16 2023, 12:34 AM
llvm/utils/TableGen/X86FoldTablesEmitter.cpp
439

oh, you're right. unfoldMemoryOperand is trying to find the MIR according to DAGNode's Machinecode, and then lookupUnfoldTable

bool
X86InstrInfo::unfoldMemoryOperand(SelectionDAG &DAG, SDNode *N,
                                  SmallVectorImpl<SDNode*> &NewNodes) const {
  if (!N->isMachineOpcode())
    return false;

  const X86MemoryFoldTableEntry *I = lookupUnfoldTable(N->getMachineOpcode());

@craig.topper , but where can i find the code of DAG unfold which occurs after the isel portion of SelectionDAG?
i saw unfoldMemoryOperand(for DAG) in llvm/lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp

craig.topper added inline comments.Mar 16 2023, 12:37 AM
llvm/utils/TableGen/X86FoldTablesEmitter.cpp
439

ScheduleDAGRRList.cpp is the code. The DAG needs to be scheduled into linear basic block for MachineIR.

yubing added inline comments.Mar 16 2023, 1:31 AM
llvm/utils/TableGen/X86FoldTablesEmitterManualMapSet.inc
31

for myself, why " { "TAILJMPr", "TAILJMPm", TB_FOLDED_LOAD }," listed here.

This revision was landed with ongoing or failed builds.Mar 16 2023, 3:44 AM
This revision was automatically updated to reflect the committed changes.
steven_wu added inline comments.
llvm/test/TableGen/x86-auto-memfold.td
3

From the manpage of cmp, it should be like:

cmp [OPTIONS] file1 file2

There are some implementation that cannot take --ignore-initial after the files. Can you move it to the front?

smeenai added inline comments.
llvm/test/TableGen/x86-auto-memfold.td
3

This was breaking our tests on macOS as well, so I pushed rG7e271c2a8552 to fix.

This revision is now accepted and ready to land.Mar 16 2023, 11:12 PM

@vitalybuka , do you know how to reproduce the issue by buildbot_fast.sh?
i tried https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild#try-local-changes
but it set HEAD wrongly at

commit 8dfdcc7b7bf66834a761bd8de445840ef68e4d1a
Author: Nikolas Klauser <nikolasklauser@berlin.de>
Date:   Thu Nov 17 21:34:29 2022 +0100

    [libc++] Fix memory leaks when throwing inside std::vector constructors

    Fixes #58392

    Reviewed By: ldionne, #libc

    Spies: alexfh, hans, joanahalili, dblaikie, libcxx-commits

    Differential Revision: https://reviews.llvm.org/D138601
yubing added a subscriber: hctim.Mar 19 2023, 8:48 PM

@hctim do you know how to reproduce the ubsan issue elegantly. after i spend some time modify buildbot_functions.sh, i still can't reproduce it locally. https://lab.llvm.org/buildbot/#/builders/5/builds/32220/steps/16/logs/stdio

craig.topper added inline comments.Mar 19 2023, 9:09 PM
llvm/utils/TableGen/X86FoldTablesEmitter.cpp
394

Is it possible for S & TB_ALIGN_MASK to be 0? That would cause 1<< -1 which would match the ubsan error

yubing updated this revision to Diff 506468.Mar 19 2023, 10:45 PM

address ubsan issue and ignore-initial issue

craig.topper added inline comments.Mar 19 2023, 10:52 PM
llvm/utils/TableGen/X86FoldTablesEmitter.cpp
395

Use a variable instead of repeating the same code twice?

yubing updated this revision to Diff 506470.Mar 19 2023, 11:24 PM

address Craig's comments

This revision was landed with ongoing or failed builds.Mar 19 2023, 11:43 PM
This revision was automatically updated to reflect the committed changes.

This commit may be breaking this bot:
https://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/52590/console

llvm-project/llvm/utils/TableGen/X86FoldTablesEmitterManualMapSet.inc:4:48: error: use of undeclared identifier 'TB_NO_REVERSE'
    { "ADD16ri_DB",         "ADD16mi",         TB_NO_REVERSE  },

Also note that the pre-merge checks failed to clang format these files:

llvm/include/llvm/Support/X86FoldTablesUtils.h
llvm/lib/Target/X86/X86InstrFoldTables.h
llvm/utils/TableGen/X86FoldTablesEmitter.cpp
llvm/utils/TableGen/X86FoldTablesEmitterManualMapSet.inc
fdeazeve added inline comments.Mar 20 2023, 5:34 AM
llvm/include/llvm/Support/X86FoldTablesUtils.h
12

The anon namespace is causing the bot failure, was a there a particular reason for this change? If not, I'll submit a patch

skan added inline comments.Mar 20 2023, 6:50 AM
llvm/include/llvm/Support/X86FoldTablesUtils.h
12

I think it's a typo by the author of this patch.

Hi, this is also causing a failure on the clang-ppc64-aix bot https://lab.llvm.org/buildbot/#/builders/214/builds/6519

+ : 'RUN: at line 1'
+ /scratch/powerllvm/powerllvm_env/aix-ppc64/clang-ppc64-aix/build/bin/llvm-tblgen -gen-x86-fold-tables -asmwriternum=1 /scratch/powerllvm/powerllvm_env/aix-ppc64/clang-ppc64-aix/llvm-project/llvm/test/TableGen/../../lib/Target/X86/X86.td -I /scratch/powerllvm/powerllvm_env/aix-ppc64/clang-ppc64-aix/llvm-project/llvm/test/TableGen/../../include -I /scratch/powerllvm/powerllvm_env/aix-ppc64/clang-ppc64-aix/llvm-project/llvm/test/TableGen/../../lib/Target/X86/ -I /scratch/powerllvm/powerllvm_env/aix-ppc64/clang-ppc64-aix/llvm-project/llvm/test/TableGen/../../include/ -I /scratch/powerllvm/powerllvm_env/aix-ppc64/clang-ppc64-aix/llvm-project/llvm/test/TableGen/../../lib/Target/ --write-if-changed -o /scratch/powerllvm/powerllvm_env/aix-ppc64/clang-ppc64-aix/build/test/TableGen/Output/x86-auto-memfold.td.tmp1
+ : 'RUN: at line 2'
+ cmp --ignore-initial=0:568 /scratch/powerllvm/powerllvm_env/aix-ppc64/clang-ppc64-aix/llvm-project/llvm/test/TableGen/../../lib/Target/X86/X86MemFoldTables.inc /scratch/powerllvm/powerllvm_env/aix-ppc64/clang-ppc64-aix/build/test/TableGen/Output/x86-auto-memfold.td.tmp1
cmp: illegal option -- -
usage: cmp [-l | -s] file1 file2
yubing added inline comments.Mar 20 2023, 10:14 PM
llvm/include/llvm/Support/X86FoldTablesUtils.h
12

thanks for fixing it.

yubing added a comment.EditedMar 20 2023, 10:31 PM

Hi, this is also causing a failure on the clang-ppc64-aix bot https://lab.llvm.org/buildbot/#/builders/214/builds/6519

+ : 'RUN: at line 1'
+ /scratch/powerllvm/powerllvm_env/aix-ppc64/clang-ppc64-aix/build/bin/llvm-tblgen -gen-x86-fold-tables -asmwriternum=1 /scratch/powerllvm/powerllvm_env/aix-ppc64/clang-ppc64-aix/llvm-project/llvm/test/TableGen/../../lib/Target/X86/X86.td -I /scratch/powerllvm/powerllvm_env/aix-ppc64/clang-ppc64-aix/llvm-project/llvm/test/TableGen/../../include -I /scratch/powerllvm/powerllvm_env/aix-ppc64/clang-ppc64-aix/llvm-project/llvm/test/TableGen/../../lib/Target/X86/ -I /scratch/powerllvm/powerllvm_env/aix-ppc64/clang-ppc64-aix/llvm-project/llvm/test/TableGen/../../include/ -I /scratch/powerllvm/powerllvm_env/aix-ppc64/clang-ppc64-aix/llvm-project/llvm/test/TableGen/../../lib/Target/ --write-if-changed -o /scratch/powerllvm/powerllvm_env/aix-ppc64/clang-ppc64-aix/build/test/TableGen/Output/x86-auto-memfold.td.tmp1
+ : 'RUN: at line 2'
+ cmp --ignore-initial=0:568 /scratch/powerllvm/powerllvm_env/aix-ppc64/clang-ppc64-aix/llvm-project/llvm/test/TableGen/../../lib/Target/X86/X86MemFoldTables.inc /scratch/powerllvm/powerllvm_env/aix-ppc64/clang-ppc64-aix/build/test/TableGen/Output/x86-auto-memfold.td.tmp1
cmp: illegal option -- -
usage: cmp [-l | -s] file1 file2

@abhina.sreeskantharajan
seems "--ignore-initial=0:568" doesn't support ppc64? will https://reviews.llvm.org/D146498 works for you?

dblaikie added inline comments.
llvm/test/TableGen/x86-auto-memfold.td
2

Could you update/fix this test to not depend on the real .td file - all the other tblgen tests include a small snippet/test case/example instead? (though the test was renamed in 35aeb321c005ed77ef5e5fb4f8dea69a81c81253)

skan added inline comments.Apr 5 2023, 6:30 PM
llvm/test/TableGen/x86-auto-memfold.td
2

@dblaikie Thank you for the suggestion! The test itself aims to expose the vulnerable rules in X86FoldTablesEmitter.cpp. Due to the complexity of X86ISA, even the folding rules are already complicated, it is hard to predict whether exceptions will occur when new ISAs are introduced. So unfortunately, we need the real .td file as input. I illustrated the usage of the test file in D147527