Following code(IR test) produce an incorrect DAG with cycles by this command: llc s.ll
source_filename = "any.c"
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"
define i32 @foo(i64* nocapture %perm, i32 %n) {
entry:
br label %body
body:
%vec.ind = phi <2 x i64> [ <i64 0, i64 1>, %entry ], [ <i64 2, i64 3>, %body ] %l13 = extractelement <2 x i64> %vec.ind, i32 %n %l14 = getelementptr inbounds i64, i64* %perm, i64 %l13 %l15 = bitcast i64* %l14 to <2 x i64>* store <2 x i64> %vec.ind, <2 x i64>* %l15, align 8 %niter.ncmp.3 = icmp eq i64 %l13, 0 br i1 %niter.ncmp.3, label %exit, label %body
exit:
ret i32 %n
}
Here is DAG dump just before it was diagnosis with a cycle in checkForCyclesHelper:
(gdb) p DAG->dump()
SelectionDAG has 31 nodes:
t0: ch = EntryToken t8: i64,ch = CopyFromReg t0, Register:i64 %vreg1 t10: i64 = shl t44, Constant:i8<3> t11: i64 = add t8, t10 t2: v2i64,ch = CopyFromReg t0, Register:v2i64 %vreg0 t14: ch = store<ST16[%l15](align=8)> t0, t2, t11, undef:i64 t4: i32,ch = CopyFromReg t0, Register:i32 %vreg2 t49: i64 = any_extend t4 t40: i64 = and t49, Constant:i64<1> t50: i64 = shl t40, Constant:i8<3> t43: i64 = add t50, t11 t44: i64,ch = load<LD8[<unknown>]> t14, t43, undef:i64 t48: i64 = X86ISD::Wrapper TargetConstantPool:i64<<2 x i64> <i64 2, i64 3>> 0 t46: v2i64,ch = load<LD16[ConstantPool]> t0, t48, undef:i64 t20: ch = CopyToReg t0, Register:v2i64 %vreg7, t46 t24: ch = TokenFactor t20, t44:1 t34: i32 = X86ISD::CMP t44, Constant:i64<0> t37: ch = X86ISD::BRCOND t24, BasicBlock:ch<body 0x5fc2078>, Constant:i8<9>, t34 t27: ch = br t37, BasicBlock:ch<exit 0x5fc2188>
$2 = void
Note here t10 is dependent from t44 .
This patch is a simple fix for the problem, and adds an additional dependency test when searching for a store. And it avoids to reuse already existing STORE and creates new STORE on the stack in order to construct DAG without cycles.
This doesn't look like it's checking for the right thing. The problem isn't that the store depends on Idx; if you change the GEP in your testcase to "getelementptr inbounds i64, i64* %perm, i64 %n", there is no cycle. The actual problem is that the store depends on Op (the extractelement itself).