This is an archive of the discontinued LLVM Phabricator instance.

replacing `rm -rf` with RemoveDirectory step in ClangBuilders
Abandoned · Public

Authored by kuhnel on Dec 1 2020, 1:39 AM.

Details

Summary

I'm experiencing issues with deleting files on the Windows builder, e.g. http://lab.llvm.org:8014/#/builders/27/builds/1200
The root cause is most likely a race condition when using the Unix `rm -rf` command (details in the video). BuildBot has an integrated RemoveDirectory step, which calls rmdir and should not have these issues.
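
As a rough sketch of the kind of change described above (not the actual diff), a shell-based clean step could be swapped for Buildbot's built-in `RemoveDirectory` step along these lines. The step name, the `stage1` path, and the failure flags are illustrative assumptions, not values taken from the patch:

```python
# master.cfg-style fragment; needs a Buildbot installation to load.
from buildbot.plugins import steps, util

factory = util.BuildFactory()

# Before (conceptually): a ShellCommand running `rm -rf stage1` on the worker.
# After: let the worker delete the directory itself, with no Unix tools involved.
factory.addStep(
    steps.RemoveDirectory(
        name="clean-stage1",   # hypothetical step name
        dir="stage1",          # hypothetical path; check how your Buildbot version resolves it
        haltOnFailure=False,   # assumption: a failed clean should not stop the build
        flunkOnFailure=False,
        warnOnFailure=True,
    )
)
```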

I also added a .pylint.rc file, so that I could use PyLint to check the modified code. Otherwise it would not parse the zorg.* packages.

Event Timeline

kuhnel created this revision. Dec 1 2020, 1:39 AM
kuhnel requested review of this revision. Dec 1 2020, 1:39 AM
kuhnel updated this revision to Diff 308580. Dec 1 2020, 1:40 AM

added empty last line

kuhnel edited the summary of this revision. Dec 1 2020, 1:46 AM
kuhnel added a reviewer: amccarth.
kuhnel edited the summary of this revision.
amccarth requested changes to this revision. Dec 1 2020, 10:18 AM

LGTM.

I look forward to just switching everything to RemoveDirectory.

This revision now requires changes to proceed. Dec 1 2020, 10:18 AM
amccarth accepted this revision. Dec 1 2020, 10:19 AM

LGTM. (I must have slipped on the Action drop-down.)

This revision is now accepted and ready to land. Dec 1 2020, 10:19 AM

I think this patch is a good idea, but I'm having second thoughts about whether this will solve the problem.

Some details from the example build linked from the patch description:

rm: reading directory `stage1/include/llvm': Permission denied
rm: cannot remove directory `stage1/include': Directory not empty

The "Permission denied" message surprises me. If this were just the usual directory tree race condition, I'd only expect "Directory not empty" messages. Instead, it's as though there is an actual permission problem, which means something doesn't get deleted, and _that_'s why the parent directory is not empty.

I don't know how much to trust those messages. Is there actually a permission problem _reading_ the directory? Could there be some file with the wrong ACL or being held open by another thread?
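
One way to find out which entry actually fails, and with which error, would be to delete the tree bottom-up and record every failure instead of stopping at the first one. The following is only a minimal sketch using the Python standard library; the function name is hypothetical, and it assumes the tree contains no symlinked directories:

```python
import errno
import os
import stat


def remove_tree_verbose(root):
    """Delete `root` bottom-up and return a list of (path, errno name) failures."""
    failures = []

    def record(err):
        # Also used as the os.walk error callback, so a directory that cannot
        # be *read* (the "Permission denied" case) is reported as well.
        name = errno.errorcode.get(err.errno, str(err.errno))
        failures.append((err.filename, name))

    for dirpath, _dirnames, filenames in os.walk(root, topdown=False, onerror=record):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                os.chmod(path, stat.S_IWRITE)  # clear a read-only attribute, if any
                os.remove(path)
            except OSError as err:
                record(err)
        try:
            os.rmdir(dirpath)  # fails with ENOTEMPTY if anything above was left behind
        except OSError as err:
            record(err)
    return failures
```

An empty result means the tree is gone. Otherwise the first entries in the list point at the file or directory that could not be removed, and any later "not empty" errors on its parents are just the consequence.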

I'll keep my fingers crossed that RemoveDirectory will resolve this, but I won't be surprised if the problem remains.

I can't really reproduce the issue outside of buildbot. Deleting the files locally always works.

There is nothing else running on the machine that could keep a file handle open, except perhaps some leftovers from a previous build/test.

kuhnel updated this revision to Diff 309895. Dec 7 2020, 6:16 AM

removed backwards compatibility, directly replacing all clean actions

kuhnel edited the summary of this revision. Dec 7 2020, 6:17 AM
amccarth accepted this revision. Dec 7 2020, 9:33 AM

I'm glad @gkistanova asked for this simplification. The more cautious approach added a lot of probably unnecessary noise.

kuhnel abandoned this revision. Dec 16 2020, 1:52 AM

Not pursuing work on the Windows buildbot any more.