This fixes an issue that we were not properly supporting multiple
reduction stmts in a loop, and not generating SMLADs for these cases. The
alias analysis checks were done too early, making it too conservative.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
lib/Target/ARM/ARMParallelDSP.cpp | ||
---|---|---|
584 ↗ | (On Diff #154768) | For this to be safe, don't we have to ensure that the reductions are all rooted at the same phi? |
Thanks for the suggestion.
This update makes explicit that the different MAC chains don't interfere/alias with each other. I.e. MAC-chains are a simple chains that starts with a reducing integer add statement, and then a chain of adds and muls, which only has sext and loads as operands. Thus, we don't write memory, and we set a ReadOnly flag.
I still think you're overcomplicating things here... Keeping the ReadOnly flag, you can then just check what's writing to memory in the rest of the block:
static void AliasCandidates(BasicBlock *Header, Instructions &Reads, Instructions &Writes) { for (auto &I : *Header) { if (I.mayReadFromMemory()) Reads.push_back(&I); if (I.mayWriteToMemory()) Writes.push_back(&I); } } static bool AreAliased(AliasAnalysis *AA, Instructions &Reads, Instructions &Writes, ParallelMACList &MACCandidates) { LLVM_DEBUG(dbgs() << "Alias checks:\n"); for (auto &MAC : MACCandidates) { LLVM_DEBUG(dbgs() << "mul: "; MAC.Mul->dump()); if (!MAC.isReadOnly) return true; for (auto *I : Writes) { LLVM_DEBUG(dbgs() << "- "; I->dump()); assert(MAC.MemLocs.size() >= 2 && "expecting at least 2 memlocs"); for (auto &MemLoc : MAC.MemLocs) { if (isModOrRefSet(intersectModRef(AA->getModRefInfo(I, MemLoc), ModRefInfo::ModRef))) { LLVM_DEBUG(dbgs() << "Yes, aliases found\n"); return true; } } } } return false; }
As before, this will allow each MAC chain to be processed independently (which makes sense because they are independent) and allows us to be continue being assumption free. I'm sorry this removes the for_each loop though :p