This is an archive of the discontinued LLVM Phabricator instance.

DAG: Don't try to cluster loads with tied inputs
ClosedPublic

Authored by arsenm on Mar 4 2019, 10:18 AM.

Details

Summary

This avoids breaking possible value dependencies when sorting loads by
offset.

AMDGPU has some load instructions that write into the high or low bits
of the destination register, and have a tied input for the other input
bits. These can easily have the same base pointer, but be a swizzle so
the high address load needs to come first. This was inserting glue
forcing the opposite ordering, producing a cycle the InstrEmitter
would assert on. It may be potentially expensive to look for the
dependency between the other loads, so just skip any where this could
happen.

Fixes bug 40936 by reverting r351379, which added a hacky attempt to
fix this by adding chains in this case, which I think was just working
around broken glue before the InstrEmitter. The core of the patch is
re-implementing the fix for that problem.

Diff Detail

Event Timeline

arsenm created this revision.Mar 4 2019, 10:18 AM
rampitec added inline comments.Mar 4 2019, 10:30 AM
lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
277

Isn't that easier and cleaner to compute new BaseOff once yet while sorting?

arsenm marked an inline comment as done.Mar 4 2019, 10:40 AM
arsenm added inline comments.
lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
277

This is dependent on each offset so
I don’t understand what you mean . The alternative is to delete the assert in the SI implementation, which isn’t really necessary

rampitec added inline comments.Mar 4 2019, 11:12 AM
lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
271

There is unlikely but possible case neither is operand of another but depend through a third instruction.
Also what would happen if:

A: load i16 [base]
B: load i16 [base + 2]
C: load i8 [base + 1]

I guess normal sorting will tell A < C < B. With your change: B < A, A < C, C < B. That creates an impossible sort order.

277

I mean you could compute min(Offsets[]) during sort and use it as a new base. Otherwise you may send different base offset to target with each call for the same chain.

arsenm updated this revision to Diff 189790.Mar 7 2019, 3:26 PM
arsenm retitled this revision from DAG: Don't break value dependencies when sorting loads by offset to DAG: Don't try to cluster loads with tied inputs.
arsenm edited the summary of this revision. (Show Details)

Just skip loads with tied inputs

arsenm marked an inline comment as done.Mar 7 2019, 3:26 PM
arsenm added inline comments.
lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
271

It turns out this is already broken even if clustering is disabled. Fixing this requires another patch to rewrite the patterns

rampitec accepted this revision.Mar 7 2019, 3:46 PM

LGTM

This revision is now accepted and ready to land.Mar 7 2019, 3:46 PM
arsenm closed this revision.Mar 8 2019, 12:45 PM

r355728