This is an archive of the discontinued LLVM Phabricator instance.

[profile] Don't dump counters when forking and don't reset when calling exec** functions

Authored by calixte on Feb 21 2020, 4:27 AM.



There is no need to write out gcdas when forking because we can just reset the counters in the parent process.
Let say a counter is N before the fork, then fork and this counter is set to 0 in the child process.
In the parent process, the counter is incremented by P and in the child process it's incremented by C.
When dump is ran at exit, parent process will dump N+P for the given counter and the child process will dump 0+C, so when the gcdas are merged the resulting counter will be N+P+C.
About exec** functions, since the current process is replaced by an another one there is no need to reset the counters but just write out the gcdas since the counters are definitely lost.
To avoid to have lists in a bad state, we just lock them during the fork and the flush (if called explicitely) and lock them when an element is added.

Diff Detail

Event Timeline

calixte created this revision.Feb 21 2020, 4:27 AM
Herald added projects: Restricted Project, Restricted Project, Restricted Project. · View Herald TranscriptFeb 21 2020, 4:27 AM
Herald added subscribers: llvm-commits, Restricted Project, cfe-commits, hiraditya. · View Herald Transcript

Also, as we discussed, regarding exec.


Could you expand the comment explaining a situation where this could fail if we didn't lock?


Nit: do we need this check or can we just use the earlier one on pid == 0?


What if we have a thread in another process making modifications to the list?


Since we are now mostly doing different things on forks and execs, we could remove this vector and just do the operations directly instead of adding to the vec.


Isn't this bug fixed by splitting the block?

calixte updated this revision to Diff 245837.Feb 21 2020, 6:23 AM

If exec** fails we need to reset the counters since they've have dumped just before to avoid to count them twice.

calixte updated this revision to Diff 245843.Feb 21 2020, 6:43 AM
calixte marked 4 inline comments as done.

Add more comments to explain why we need to lock around the fork


I added this check to be sure we had a true fork in case of someone wrote its own fork function.


When forking, only the thread in the parent process is copied so in the child process we've only one thread.
When forking, the parent and the child process share the same memory until some change (Copy-on-Write), so even if in the parent process a list is changed then the memory will be copied before the change and then the child process will have an unchanged list.


If we make the insertions of new code in the second for loop, we'll invalidate the iterator used in this loop.


With the example shew here, the split is in foo but not in bar, blah() should be in another block

calixte marked 5 inline comments as done.Feb 21 2020, 6:45 AM
marco-c accepted this revision.Feb 21 2020, 7:04 AM

Could you test the last iteration of the patch on Mozilla's CI (with the workaround for the mismatch in LLVM version used by Rust)?


M->getOrInsertFunction is what would invalidate the iterator?

You could clean things up a bit by having two vectors then, one for forks and one for execs.


Replace with // No need to reset the counters since they'll be lost after the exec**

This revision is now accepted and ready to land.Feb 21 2020, 7:04 AM
calixte marked an inline comment as done.Feb 21 2020, 7:17 AM
calixte added inline comments.

Builder.Create** will invalidate the iterator

calixte updated this revision to Diff 245848.Feb 21 2020, 7:18 AM

Fix a typo