This revision adds a new transformation that maps a copy operation to a GPU grid of threads.
It implements a first heuristic that allows trading off coalesced accesses vs predication and occupancy.
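As a rough illustration of the trade-off described above, here is a minimal 1-D sketch. It is not the code from this patch: the names `CopyMapping1D` and `mapCopy1D` and all parameters are hypothetical, and the meaning given to `favorPredication` (keep the widest vector and predicate off the spare threads, versus narrowing the vector to occupy more threads) is an assumption made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical result type for this sketch; not taken from the patch.
struct CopyMapping1D {
  int64_t vectorSize;    // elements moved per thread per access
  int64_t numThreads;    // threads that perform part of the copy
  bool needsPredication; // true if some of the available threads must idle
};

// Illustrative heuristic for copying `numElements` contiguous elements with
// at most `maxThreads` threads. Assumes `maxVectorSize` is a power of two.
inline CopyMapping1D mapCopy1D(int64_t numElements, int64_t maxThreads,
                               int64_t maxVectorSize, bool favorPredication) {
  assert(numElements > 0 && maxThreads > 0 && maxVectorSize > 0);
  assert((maxVectorSize & (maxVectorSize - 1)) == 0 && "power of two expected");

  // Start from the widest vector and narrow it until it divides the copy
  // size, so that every access is a full vector access.
  int64_t vec = maxVectorSize;
  while (vec > 1 && numElements % vec != 0)
    vec /= 2;
  int64_t threads = numElements / vec;

  // Assumed policy: when predication is not favored, trade vector width for
  // occupancy by halving the vector while the thread count still fits.
  if (!favorPredication) {
    while (vec > 1 && threads * 2 <= maxThreads) {
      vec /= 2;
      threads *= 2;
    }
  }

  // If more vectors than threads remain, each thread would handle several
  // vectors; this sketch only reports the clamped thread count.
  int64_t used = std::min(threads, maxThreads);
  return {vec, used, used < maxThreads};
}
```

For a 128-element copy with 64 threads and a maximum vector size of 4, this sketch keeps 4-wide accesses on 32 threads (predicating off the other 32) when `favorPredication` is true, and falls back to 2-wide accesses on all 64 threads when it is false.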
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
mlir/lib/Dialect/Linalg/TransformOps/GPUHeuristics.cpp

| Line | Comment |
|---|---|
| 53 | Nit: I'd predicate this on favorPredicate to avoid computing it twice if favorPredicate == 1. |
typo