The line at issue is the following:
nodeVisitNumbers[New] = nodeVisitNumbers[Old];
If we write this as:
unsigned &OldRef = nodeVisitNumbers[Old]; unsigned &NewRef = nodeVisitNumbers[New]; unsigned OldVal = OldRef; NewRef = OldVal;
the issue becomes obvious: The call to nodeVisitNumbers[New] may
invalidate the reference from the nodeVisitNumbers[Old], causing
the subsequent reference to value conversion to read free'ed memory.
The rewritten evaluation order is valid, because there are no sequence
points among the LHS and RHS value expressions, so either evaluation
order is acceptable. However, as described one of them results in a
use-after-free. However, the resulting crash is highly compiler and
runtime dependent (since the use happens immediately after the free,
there's only an issue if the memory gets freed or re-used by a
different thread).
An obvious fix is just to manually sequence the operations. However,
I would be concerned that this issue could be re-introduced later
by somebody assuming that removing the intermediate variable is NFC.
Instead, switch the access to use the iterator interface. This has
two advantages:
- When LLVM is build with debug checks, iterator access is validated, to catch these kind of issues.
- We save ourselves a bucket lookup, because we can re-use the iterator for erasure, increasing performance.
We believe this issue to be the root cause behind https://bugs.llvm.org/show_bug.cgi?id=34480.
Co-authored-by: Valentin Churavy <vchuravy@mit.edu>