Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
No idea. My change is NFC, or at least I thought so. I was debugging this for quite some time and that still doesn’t make any sense to me….
Another thing that’s surprising is that the open source build bot detected the leak for this test case:
Testing: 0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90
FAIL: MLIR :: Dialect/LLVM/transform-e2e.mlir (70727 of 75254)
- TEST 'MLIR :: Dialect/LLVM/transform-e2e.mlir' FAILED ****
But the Google internal build passed even with asan turned on. But another internal test case was reporting the leak. This is ver mysterious…
Google internal disable ASAN leak testing for some obscure reasons, it was supposed to be fixed a while ago so I didn't bother adding it to MLIR TAP ASAN at the time (I think there is an env var or an option to get it to work...)
mlir/lib/IR/Dominance.cpp | ||
---|---|---|
34 | Since it is deleted here, I suspect we don't have a single upstream test which calls "invalidate"? |
mlir/lib/IR/Dominance.cpp | ||
---|---|---|
34 | Good point I’ll try to at one later. And I’ll try to use a unique_ptr instead of C++ new/delete |
I found the problem: As part of my revision, I added this code to CSE:
/// Invalidate dominance info if the IR was changed. if (!opsToErase.empty()) domInfo->invalidate();
In the original code, the dominance info was never invalidated, even though the IR changed (so the invalidate function was never run). That's probably fine in most cases because the only thing that's cached here is a mapping of Region * to DomTree *. However, this could lead to problems in cases such as:
- CSE erases an op with a region. CSE preserves the dominance info analysis.
- Another pass is running. A new op with a region is created and the region happens to be allocated at the same memory location (pointer is reused). This pass also preserves the dominance info analysis. (E.g., that is safe to do if only new ops are added.)
- CSE is run again. The same dominance info is retrieved from the pass, but now the new region is mapped to an incorrect (outdated) DomTree*.
Actually, there is also this comment:
// We currently don't remove region operations, so mark dominance as // preserved.
So invalidation is actually not needed.
But it may be better to have code in place that invalidates all erased regions (even if there are none today). Only the erased regions, not a blanket "invalidate everything", as I did in my change. Then we can remove this comment and we run a lower risk of breaking something, should we decide to CSE ops with regions in the future.
That's tricky. I could call invalidate from a unit test. But it's unclear what I'm actually testing. Basically invalidate just removes an item from a cache. There's not much to test there. (It would just be a regression test for the leak.)
This is actually also not a good idea, because for consistency I would also have to invalidate the regions in PostDominanceInfo (because it is also marked as preserved by the pass). But CSE does not even use that. It would be inefficient to retrieve the PostDominanceInfo analysis (because it will probably be computed on the spot), only to invalidate a few regions.
And I’ll try to use a unique_ptr instead of C++ new/delete
DominanceInfo uses llvm::PointerIntPair to store the DomInfo* pointer and hasSSADominance more efficiently:
mutable DenseMap<Region *, llvm::PointerIntPair<DomTree *, 1, bool>> dominanceInfos;
This prevents us from using unique_ptr here. I could change it, but we would loose some efficiency. Maybe this is overly optimized, not sure what's more important here...
Since it is deleted here, I suspect we don't have a single upstream test which calls "invalidate"?