MemDep had a funny bug related to invariant.group: because it did not
invalidate its cache, in some very rare cases it was possible for it to
report a memory dependence on an instruction that had already been deleted.
Because another instruction took its place, this resulted in a call to a vtable!
Thanks @amharc for the repro!
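For illustration, here is a minimal standalone sketch of the hazard described above; all types and names are illustrative stand-ins, not MemDep's actual code. A cache keyed by raw Instruction pointers keeps an entry for a deleted instruction, and a new instruction allocated at the same address inherits the stale answer:

```cpp
#include <cstdint>
#include <cstdio>
#include <map>

// Stand-in for llvm::Instruction; the cache below mimics a map entry
// that is never invalidated.
struct Instruction { int Opcode; };

int main() {
  std::map<Instruction *, const char *> DepCache;

  auto *Old = new Instruction{1};
  DepCache[Old] = "dependence computed for Old";
  auto OldAddr = reinterpret_cast<std::uintptr_t>(Old);

  delete Old; // the cache entry now dangles; nothing erased it

  // The allocator is free to reuse the same address for a new instruction.
  auto *New = new Instruction{2};
  if (reinterpret_cast<std::uintptr_t>(New) == OldAddr)
    std::printf("stale answer reported for New: %s\n", DepCache[New]);

  delete New;
}
```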
include/llvm/IR/ValueMap.h:109 (On Diff #141148)
Can this be marked noexcept, like the move operator= below?
I am not sure that ValueMap can be movable: if it were moved, the ValueMapT *Map pointer in ValueMapCallbackVH could become invalid.
Wouldn't std::unique_ptr<ValueMap<Instruction *, NonLocalDepResult>> NonLocalDefsCache be a better solution?
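To make the concern concrete, here is a hedged sketch (simplified types, not the real ValueMap internals) of why moving the map is risky and why the unique_ptr wrapper sidesteps it:

```cpp
#include <memory>
#include <utility>

// Simplified model: each callback handle stores a raw back-pointer to the
// owning map, as ValueMapCallbackVH does. Bodies are illustrative only.
struct ValueMapT;

struct ValueMapCallbackVH {
  ValueMapT *Map = nullptr; // used by RAUW/deletion callbacks
};

struct ValueMapT {
  ValueMapCallbackVH Handle;
  ValueMapT() { Handle.Map = this; }
};

int main() {
  // If the map itself is moved, copied handles still point at the old object:
  ValueMapT A;
  ValueMapT B = std::move(A); // B.Handle.Map == &A, not &B -- soon dangling

  // Behind unique_ptr the map's address never changes, so moving ownership
  // around leaves every back-pointer valid.
  auto M = std::make_unique<ValueMapT>();
  auto N = std::move(M); // N->Handle.Map == N.get(), still correct
}
```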
It's kind of hard for me to know what to do with this.
The cache invalidation is still very very broken with or without this patch.
So this patch is not wrong, but everything it does probably needs to be rewritten anyway.
Would handling NonLocalDefsCache in the same places as the other DenseMaps keyed by Instruction * be sufficient? In particular, RemoveCachedNonLocalPointerDependencies, called by both removeInstruction and invalidateCachedPointerInfo, looks promising ;)
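A hedged sketch of that suggestion follows; the struct and members here are simplified stand-ins for MemoryDependenceResults, not the actual source:

```cpp
#include <map>

// Illustrative skeleton only: real MemDep uses DenseMaps and richer types.
struct Instruction {};
struct NonLocalDepResult {};

struct MemDepSketch {
  std::map<Instruction *, NonLocalDepResult> NonLocalDefsCache;

  // Mirrors RemoveCachedNonLocalPointerDependencies: invoked from both
  // removeInstruction and invalidateCachedPointerInfo, so erasing here
  // covers both invalidation paths.
  void removeCachedNonLocalPointerDependencies(Instruction *I) {
    // ... existing invalidation of the other non-local caches ...
    NonLocalDefsCache.erase(I); // drop the entry before I is freed
  }

  void removeInstruction(Instruction *I) {
    removeCachedNonLocalPointerDependencies(I);
    // ... scrub the remaining per-instruction maps, then delete I ...
  }
};

int main() {
  MemDepSketch M;
  Instruction I;
  M.NonLocalDefsCache[&I] = NonLocalDepResult{};
  M.removeInstruction(&I); // the stale entry is gone before I dies
}
```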
I know, I also wish we would get rid of MemDep ASAP, but as long as it's there, it would be good to avoid weird miscompilations that cause calls to vtables.
Yep, this is probably the right way to go
"I know, I also wish we would get rid of MemDep asap, but as long as it's there, it would be good to avoid weird miscompilations that causes calls to vtable."
But it still will, just not in this particular way.
But it will still fail in very common ways anyway :)
We may actually have to disable preservation entirely for the moment anyway.
I think it would be a little bit cleaner if the relevant checks were added to verifyRemoved too, where all other maps are traversed in debug builds. However, AssertingVH is probably a good enough guarantee ;)
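For instance, a verifyRemoved-style check might look like the following sketch (simplified stand-ins; the real function also walks values, since a removed instruction can appear as a dependence rather than a key, and in LLVM this runs only in debug builds):

```cpp
#include <cassert>
#include <map>

// Illustrative stand-ins for the real types.
struct Instruction {};
struct NonLocalDepResult {};

struct MemDepSketch {
  std::map<Instruction *, NonLocalDepResult> NonLocalDefsCache;

  // Assert that a removed instruction no longer appears in the cache.
  void verifyRemoved(Instruction *D) const {
    assert(!NonLocalDefsCache.count(D) &&
           "deleted instruction still present in NonLocalDefsCache");
    (void)D; // avoid an unused warning in release (NDEBUG) builds
  }
};

int main() {
  MemDepSketch M;
  Instruction I;
  M.verifyRemoved(&I); // nothing cached for I, so the check passes
}
```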
You probably want to wait for dberlin to accept this.
So, please don't let me stand in the way of reasonable fixes. I only get ~20 minutes a day to look at LLVM stuff these days unless I'm on vacation.
Maybe some day it'll be better again, but I suspect it'll be a while ;)
llvm/lib/Analysis/MemoryDependenceAnalysis.cpp:1472
You may want to do compilation time measurements on this on some bigger cases, if you haven't. This is likely to be painfully slow in some cases.
During MemorySSA development, I'm pretty positive we tried some similar types of reverse maps, and while they did relatively okay on the "llvm + clang testsuite", they fell down pretty hard in the real world due to memory usage.
Adding George to see if he remembers any cases, so you have some stuff to test this all with.
He's on vacation, but I believe he gets back next week.
I don't think it would be a problem here: I don't expect this map to hold more than a couple of entries (0 or 1 in most cases). The same goes for the previous approach.
"Adding George to see if he remembers any cases, so you have some stuff to test this all with."
I don't remember anything in particular. I'd recommend just bootstrapping clang/llvm and seeing what happens, assuming clang will emit these intrinsics if you pass an added cflag to it or something.
It's my impression that Halide is great at generating code that hits worst-case behaviors in LLVM. I don't know whether they're users of these invariant intrinsics, however.
I measured the build time of llvm & clang with devirtualization, and either the variance is too big or the compiler without this patch is faster.
As I found out after writing another bugfix for AA, clang does not bootstrap correctly without this patch, so I would love to push it.
If you don't have any objections, I would like to push it. I will still have to do build time measurements with and without devirtualization, but I would like to do it when everything is ready.
(Note that this code will not be executed unless you use devirtualization)