This is an archive of the discontinued LLVM Phabricator instance.

ScheduleDAG: Don't break the dependence in clustering neighboring loads.
Needs ReviewPublic

Authored by cfang on Jan 3 2019, 2:14 PM.

Details

Reviewers
rampitec
arsenm
Summary

In clusterNeighboringLoads, loads are scheduled base on the order of increasing addresses.
However, this will break the dependence for the case that the load of a lower address depends on
that of a higher address.

This patch resolves this issue by not to cluster these two loads if such dependence did exist.

Diff Detail

Event Timeline

cfang created this revision.Jan 3 2019, 2:14 PM
rampitec added inline comments.Jan 3 2019, 2:49 PM
lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
258

I see two flaws in this logic:

  1. A load can depend on any previous load, not just base.
  2. A dependency can be indirect, not necessarily as a direct operand.

The problem is if you use a loop over the whole chain and also use isPredecessorOf instead of isOperandOf that will make this loop extremely expensive. It needs to be somehow simplified. Maybe do such check only if getIROrder() of successor is less or equal getIROrder() of a predecessor in the assumption if it was located higher in the IR and given all of that happens in the same basic block, there can be no dependency?

cfang marked an inline comment as done.Jan 4 2019, 10:52 AM
cfang added inline comments.
lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
258
  1. A load can depend on any previous load, not just base.

Right. But here we should only care about whether the baseload depends on this load because we are trying to create a dependence of the load on the baseload by glueing.

  1. A dependency can be indirect, not necessarily as a direct operand.

llvm::sort(Offsets);
All the leading loads have been checked already, so I am assuming no indirect dependence.

rampitec added inline comments.Jan 4 2019, 11:12 AM
lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
258

You are only creating dependency on a base load simply because that is the testcase you are using. All loads in chain are glued together. You can make a test: start load chain with an independent load and then have these two. You will run into the same issue.

All the loads were checked, so you should have assumed no direct dependency as well. Although both generic code and target callback missed that dependency anyway. I do not see why it would detect indirect dependency if it does not handle direct.

Thinking about this I am not sure this is a right place to handle it. To cluster loads they should be from the same base and also all linked the the same base chain. This should result in a no dependency. That is the reason why sorting by offset is legal, there is no dependency between subsequent loads.

Now I have a question: why two of our loads in question do not model dependency at the DAG level? This seems to be a real problem to me which needs to be fixed. I assume two subreg loads in question shall be chained together in a correct order to prevent rescheduling.

rampitec added inline comments.Jan 4 2019, 11:34 AM
lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
258

Think about this DAG:

a(ptr)
  b(ptr):a
  c(ptr):a

I am using ":a" to represent chain. Your load cluster will be "a, b, c" or "a, c, b". Base load does not move, but b and c can be rescheduled. That is the logic behind clustering.

The situation you are handling is:

a(ptr, b)
  b(ptr):a
  c(ptr):a

So you have additional regular argument making an extra dependency from a to b. This is already a bug by itself, because it has created a circular dependency between a and b. There is no legal schedule which can obey both regular operand and chain dependency.

Now transform it further, to have c depend on b via regular operand:

a(ptr)
  b(ptr):a
  c(ptr, b):a

This will not be checked by the clustering logic because only chains are tracked. So schedule "a, c, b" is also invalid, but your check will not handle it as well as it only checks dependency on a base load.

A correct DAG for this case should be:

a(ptr)
  b(ptr):a
    c(ptr, b):b

A correct DAG for the second case should be either:

b(ptr):a
  a(ptr, b):b
    c(ptr):a

or better

b(ptr):a
  a(ptr, b):b
  c(ptr):b

I.e. we need to make sure we are producing correct chains when creating load SDNodes.

rampitec added inline comments.Jan 4 2019, 2:19 PM
lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
258

During selection the following dag was created:

t0: ch = EntryToken
    t17: i16,ch = BUFFER_LOAD_USHORT_OFFSET<Mem:(load 2 from `<2 x half> addrspace(5)* null` + 2, addrspace 5)> Register:v4i32 $private_rsrc_reg, Register:i32 $scratch_wave_offset_reg, TargetConstant:i16<2>, TargetConstant:i1<0>, TargetConstant:i1<0>, TargetConstant:i1<0>, t0
  t16: v2f16,ch = BUFFER_LOAD_SHORT_D16_HI_OFFSET<Mem:(load 2 from `<2 x half> addrspace(5)* null`, addrspace 5)> Register:v4i32 $private_rsrc_reg, Register:i32 $scratch_wave_offset_reg, TargetConstant:i16<0>, TargetConstant:i1<0>, TargetConstant:i1<0>, TargetConstant:i1<0>, t17, t0

To fix the problem the chain operand of resulting MachineSDNode t0 needs to be replaced with real required chain t17.

arsenm added a comment.Jan 4 2019, 8:37 PM

I expect this to need to look at the users list

arsenm added inline comments.Jan 7 2019, 10:52 PM
lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
258

I think this is technically an allowed way of representing the dependencies. The value dependency of the second load should be sufficient. This came up before in the context of simplifying chains for load/store merging earlier in the DAG, but I'm not surprised if this will be buggy like here

rampitec added inline comments.Jan 7 2019, 10:57 PM
lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
258

In general having this dependency shall suffice, although patch is far not enough for that. A proper solution will be very expensive. On the other hand chains are still broken and needs fix anyway.

arsenm resigned from this revision.Feb 21 2019, 7:10 PM