Page MenuHomePhabricator

Remove scheduling dependency from XRay :: Posix/fork_basic_logging.cc
ClosedPublic

Authored by ormris on Jul 18 2018, 11:45 AM.

Details

Summary

We've been seeing intermittent failures on our internal bots and we suspect
this may be due to the OS scheduling the child process to run before the parent
process.

This version ensures that the parent and child can be run in either order.

Diff Detail

Event Timeline

ormris created this revision.Jul 18 2018, 11:45 AM
Herald added subscribers: Restricted Project, delcypher. · View Herald TranscriptJul 18 2018, 11:45 AM
dberris accepted this revision.Jul 18 2018, 4:30 PM

LGTM

I've been wondering the same with local failures but hadn't looked at it closely, thanks @ormris!

The scheduling dependency definitely is not required in this test.

This revision is now accepted and ready to land.Jul 18 2018, 4:30 PM

Thanks for the review! Before I commit this, would it be possible to change the second part of the test (after line 82) to be scheduling independent?

Thanks for the review! Before I commit this, would it be possible to change the second part of the test (after line 82) to be scheduling independent?

The secondary test already should do that, since it's already looking for independent process markers. If it's not doing the right thing, then we (@Maknee or I) should look into how to re-structure this test to be less dependent on scheduling (maybe looking for a different function called from the child and the parent).

OK. I'll commit this and see if it fixes the issues.

ormris edited the summary of this revision. (Show Details)Jul 18 2018, 5:28 PM
ormris edited the summary of this revision. (Show Details)
This revision was automatically updated to reflect the committed changes.

Sorry for responding so late -- I noticed that this was happening too.

Thanks for finding and fixing the issue.

@Maknee We're now seeing intermittent failures for the second part of the test (after line 83). Is that part of the test scheduling independent? I see that the comments say "// The parent will print its pid first".

There is an ordering currently (There shouldn't have been)

On lines 92 and 100, TRACE should be TRACE-DAG.

Refer to https://reviews.llvm.org/D49559