This is an archive of the discontinued LLVM Phabricator instance.

[llvm-tblgen] CodeGenSchedModels::hasReadOfWrite gets wrong predication result
ClosedPublic

Authored by zixuan-wu on Aug 18 2022, 11:58 PM.

Details

Summary

CodeGenSchedModels::hasReadOfWrite tries to predicate whether the WriteDef is contained in the list of ValidWrites of someone ProcReadAdvance, so that WriteID of WriteDef can be compressed and reusable.

It tries to iterate all ProcReadAdvance entry, but not all ProcReadAdvance defs also inherit from SchedRead. Some ProcReadAdvances are defined by ReadAdvance.

// A processor may define a ReadAdvance associated with a SchedRead
// to reduce latency of a prior write by N cycles. A negative advance
// effectively increases latency, which may be used for cross-domain
// stalls.
//
// A ReadAdvance may be associated with a list of SchedWrites
// to implement pipeline bypass. The Writes list may be empty to
// indicate operands that are always read this number of Cycles later
// than a normal register read, allowing the read's parent instruction
// to issue earlier relative to the writer.
class ReadAdvance<SchedRead read, int cycles, list<SchedWrite> writes = []>
  : ProcReadAdvance<cycles, writes> {
  SchedRead ReadType = read;
}

// Directly associate a new SchedRead type with a delay and optional
// pipeline bypass. For use with InstRW or ItinRW.
class SchedReadAdvance<int cycles, list<SchedWrite> writes = []> : SchedRead,
  ProcReadAdvance<cycles, writes>;

So it's not complete to enumerate all ProcReadAdvances if just iterate all SchedReads.

Diff Detail

Event Timeline

zixuan-wu created this revision.Aug 18 2022, 11:58 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 18 2022, 11:58 PM
Herald added a subscriber: StephenFan. · View Herald Transcript
zixuan-wu requested review of this revision.Aug 18 2022, 11:58 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 18 2022, 11:58 PM
zixuan-wu updated this revision to Diff 453929.Aug 19 2022, 1:50 AM

Add testcase.

Does this fix PR57548?

Does this fix PR57548?

I am not sure. It just influences the number of cycle, which make it more accurate.

Does this fix PR57548?

I am not sure. It just influences the number of cycle, which make it more accurate.

My mistake (the bug + this patch were just coincidental) - @andreadb fixed the bug with rG3262794804ad23ed4a511669ffc97d128512bc37

Do you have real world test cases for this?

The chage looks good. However, the test must be improved.

Your test only checks the content of the write latency table. However, that is not enough to prove that this bug is fixed; you also need to show that the read advance entry for Read_D is associated with Write_B.

I suggest to modify the InstRW for Inst_C as follows:

def : InstRW<[Write_C, Read_D], (instrs Inst_C)>;

That way, Read_D will be associated to the first read operand of Inst_C.

Since Read_D is now explicitly used, a new MCReadAvanceEntry is added to table MyTargetReadAdvanceTable (see entry #1 below).

// {UseIdx, WriteResourceID, Cycles}
extern const llvm::MCReadAdvanceEntry MyTargetReadAdvanceTable[] = {
  {0,  0,  0}, // Invalid
  {0,  2,  1} // #1
}; // MyTargetReadAdvanceTable

In your test, you need to CHECK the presence of that {0, 2, 1} entry. That's the only way to know that there is a (at least one) SchedRead associated to Write_B (WriteResourceID = 2).
You should also check that Inst_C now uses that ReadAdvance. That's how we know that Read_D is effectively associated with Write_B.

In the .inc output, you should be able to check the last line of the table below:

// {Name, NumMicroOps, BeginGroup, EndGroup, RetireOOO, WriteProcResIdx,#, WriteLatencyIdx,#, ReadAdvanceIdx,#}
static const llvm::MCSchedClassDesc SchedModel_ASchedClasses[] = {
  {DBGFIELD("InvalidSchedClass")  8191, false, false, false, 0, 0,  0, 0,  0, 0},
  {DBGFIELD("Inst_A")             1, false, false, false,  0, 0,  1, 1,  0, 0}, // #1
  {DBGFIELD("Inst_B")             1, false, false, false,  0, 0,  1, 1,  0, 0}, // #2
  {DBGFIELD("Inst_C")             1, false, false, false,  0, 0,  1, 1,  1, 1}, // #3
}; // SchedModel_ASchedClasses

Based on that last descriptor, Inst_C declares a single read-advance entry (i.e. entry #1 in MyTargetReadAdvanceTable).
Entry #1 in MyTargetReadAdvanceTable declares a 1cy of read advance when associated with Write_B.

In conclusion, for completeness, you need to check also tables MyTargetReadAdvanceTable and SchedModel_ASchedClasses.

I hope it helps.

The chage looks good. However, the test must be improved.

Your test only checks the content of the write latency table. However, that is not enough to prove that this bug is fixed; you also need to show that the read advance entry for Read_D is associated with Write_B.

I suggest to modify the InstRW for Inst_C as follows:

def : InstRW<[Write_C, Read_D], (instrs Inst_C)>;

That way, Read_D will be associated to the first read operand of Inst_C.

Since Read_D is now explicitly used, a new MCReadAvanceEntry is added to table MyTargetReadAdvanceTable (see entry #1 below).

// {UseIdx, WriteResourceID, Cycles}
extern const llvm::MCReadAdvanceEntry MyTargetReadAdvanceTable[] = {
  {0,  0,  0}, // Invalid
  {0,  2,  1} // #1
}; // MyTargetReadAdvanceTable

In your test, you need to CHECK the presence of that {0, 2, 1} entry. That's the only way to know that there is a (at least one) SchedRead associated to Write_B (WriteResourceID = 2).
You should also check that Inst_C now uses that ReadAdvance. That's how we know that Read_D is effectively associated with Write_B.

In the .inc output, you should be able to check the last line of the table below:

// {Name, NumMicroOps, BeginGroup, EndGroup, RetireOOO, WriteProcResIdx,#, WriteLatencyIdx,#, ReadAdvanceIdx,#}
static const llvm::MCSchedClassDesc SchedModel_ASchedClasses[] = {
  {DBGFIELD("InvalidSchedClass")  8191, false, false, false, 0, 0,  0, 0,  0, 0},
  {DBGFIELD("Inst_A")             1, false, false, false,  0, 0,  1, 1,  0, 0}, // #1
  {DBGFIELD("Inst_B")             1, false, false, false,  0, 0,  1, 1,  0, 0}, // #2
  {DBGFIELD("Inst_C")             1, false, false, false,  0, 0,  1, 1,  1, 1}, // #3
}; // SchedModel_ASchedClasses

Based on that last descriptor, Inst_C declares a single read-advance entry (i.e. entry #1 in MyTargetReadAdvanceTable).
Entry #1 in MyTargetReadAdvanceTable declares a 1cy of read advance when associated with Write_B.

In conclusion, for completeness, you need to check also tables MyTargetReadAdvanceTable and SchedModel_ASchedClasses.

I hope it helps.

Thank you a lot for so detail comment.

andreadb accepted this revision.Sep 14 2022, 2:52 AM

Patch looks good to me modulo a couple of minor nits (see below).

Thanks
-Andrea

llvm/test/TableGen/CompressWriteLatencyEntry.td
1–3 ↗(On Diff #460002)

Please insert an extra newline to better separate the RUN line from the other code code comment.

I also suggest to change that comment to something like this:

Make sure that ReadAdvance entries are correctly processed. Not all ProcReadAdvance definitions
implicitly inherit from SchedRead. Some ProcReadAdvances are subclasses of ReadAdvance.

Here, I have reused part of your original description from the summary.

This revision is now accepted and ready to land.Sep 14 2022, 2:52 AM
zixuan-wu marked an inline comment as done.Sep 18 2022, 11:23 PM
This revision was landed with ongoing or failed builds.Sep 18 2022, 11:23 PM
This revision was automatically updated to reflect the committed changes.