Following code(IR test) produce an incorrect DAG with cycles by this command: llc s.ll
source_filename = "any.c"
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"
define i32 @foo(i64* nocapture %perm, i32 %n) {
entry:
br label %body
body:
%vec.ind = phi <2 x i64> [ <i64 0, i64 1>, %entry ], [ <i64 2, i64 3>, %body ] %l13 = extractelement <2 x i64> %vec.ind, i32 %n %l14 = getelementptr inbounds i64, i64* %perm, i64 %l13 %l15 = bitcast i64* %l14 to <2 x i64>* store <2 x i64> %vec.ind, <2 x i64>* %l15, align 8 %niter.ncmp.3 = icmp eq i64 %l13, 0 br i1 %niter.ncmp.3, label %exit, label %body
exit:
ret i32 %n
}
Here is DAG dump just before it was diagnosis with a cycle in checkForCyclesHelper:
(gdb) p DAG->dump()
SelectionDAG has 31 nodes:
t0: ch = EntryToken
t8: i64,ch = CopyFromReg t0, Register:i64 %vreg1
t10: i64 = shl t44, Constant:i8<3>
t11: i64 = add t8, t10
t2: v2i64,ch = CopyFromReg t0, Register:v2i64 %vreg0
t14: ch = store<ST16[%l15](align=8)> t0, t2, t11, undef:i64
t4: i32,ch = CopyFromReg t0, Register:i32 %vreg2
t49: i64 = any_extend t4
t40: i64 = and t49, Constant:i64<1>
t50: i64 = shl t40, Constant:i8<3>
t43: i64 = add t50, t11
t44: i64,ch = load<LD8[<unknown>]> t14, t43, undef:i64
t48: i64 = X86ISD::Wrapper TargetConstantPool:i64<<2 x i64> <i64 2, i64 3>> 0
t46: v2i64,ch = load<LD16[ConstantPool]> t0, t48, undef:i64
t20: ch = CopyToReg t0, Register:v2i64 %vreg7, t46
t24: ch = TokenFactor t20, t44:1
t34: i32 = X86ISD::CMP t44, Constant:i64<0>
t37: ch = X86ISD::BRCOND t24, BasicBlock:ch<body 0x5fc2078>, Constant:i8<9>, t34
t27: ch = br t37, BasicBlock:ch<exit 0x5fc2188>$2 = void
Note here t10 is dependent from t44 .
This patch is a simple fix for the problem, and adds an additional dependency test when searching for a store. And it avoids to reuse already existing STORE and creates new STORE on the stack in order to construct DAG without cycles.
This doesn't look like it's checking for the right thing. The problem isn't that the store depends on Idx; if you change the GEP in your testcase to "getelementptr inbounds i64, i64* %perm, i64 %n", there is no cycle. The actual problem is that the store depends on Op (the extractelement itself).