This is an archive of the discontinued LLVM Phabricator instance.

lit: support long paths on Windows
ClosedPublic

Authored by compnerd on Jan 31 2019, 11:59 AM.

Details

Summary

Use ctypes to call into SHFileOperationW with the extended NT path to allow us to remove paths which exceed 261 characters on Windows. This functionality is exercised by swift's test suite.

Diff Detail

Repository
rL LLVM

Event Timeline

compnerd created this revision.Jan 31 2019, 11:59 AM
compnerd added a subscriber: ruiu.

Please upload with -U99999 and add a comment that says that ctypes is for long paths. And probably error handling.

(But I'd try to not have paths that long if I were you; there are many headaches associated with that. Windows can handle it, but not well.)

smeenai added inline comments.Jan 31 2019, 12:07 PM
utils/lit/lit/TestRunner.py
611

Add a comment explaining this monstrosity?

Wait, what happens if you call shutil.rmtree with the \\?\ normalized path, instead of dropping down to ctypes?

That doesn't work with python 2.7 as that uses RemoveDirectoryA, which does not support it.

compnerd updated this revision to Diff 184566.Jan 31 2019, 12:19 PM
compnerd marked an inline comment as done.

comments, error handling, context

smeenai added inline comments.Jan 31 2019, 12:21 PM
utils/lit/lit/TestRunner.py
619

Shouldn't the error handling logic match the other branch (on_rm_error if force else None)?

ruiu added a comment.Jan 31 2019, 12:26 PM

A very long pathname has various issues such as this and I think should generally be avoided for portability. Can't you just use shorter file names for test files?

compnerd added a comment.EditedJan 31 2019, 12:32 PM

@ruiu, no unfortunately, not all the paths can be shortened in the swift test suite since it is such a heavy user of the clang modules, modules cache paths and module naming structure in clang is a problem. Perhaps if we could make %t in lit be short and clang module names to be short, it might be possible to get the paths to a size that works?

ruiu added a comment.Jan 31 2019, 12:46 PM

@ruiu, no unfortunately, not all the paths can be shortened in the swift test suite since it is such a heavy user of the clang modules, modules cache paths and module naming structure in clang is a problem. Perhaps if we could make %t in lit be short and clang module names to be short, it might be possible to get the paths to a size that works?

If you can do that, that's much more preferable solution than fixing the problem only at this location. I'd think if your test file get a very long pathname, this is not the only place that could break. For example, if you use rm on a test file and the test file gets a long pathname, that could fail.

@ruiu, no unfortunately, not all the paths can be shortened in the swift test suite since it is such a heavy user of the clang modules, modules cache paths and module naming structure in clang is a problem. Perhaps if we could make %t in lit be short and clang module names to be short, it might be possible to get the paths to a size that works?

If you can do that, that's much more preferable solution than fixing the problem only at this location. I'd think if your test file get a very long pathname, this is not the only place that could break. For example, if you use rm on a test file and the test file gets a long pathname, that could fail.

The solution I proposed would also fix that. Basically, move this logic to a separate utility function, then call the utility function and update everywhere else in lit to use the function. Changing the naming scheme in both lit and clang seems like a heavy impact change.

rnk added a comment.Jan 31 2019, 5:58 PM

I don't think RemoveDirectoryW is recursive, so this is not functionally equivalent to rmtree:
https://docs.microsoft.com/en-us/windows/desktop/api/fileapi/nf-fileapi-removedirectoryw

Herald added a project: Restricted Project. · View Herald TranscriptJan 31 2019, 5:58 PM
compnerd marked an inline comment as done.Jan 31 2019, 6:01 PM

Ugh, right ... I think that I had tested it against an empty directory! It was run early in the test. I'll flesh this out with @zturner's idea of extract it into a helper.

utils/lit/lit/TestRunner.py
619

Hmm, yeah, RemoveDirectoryW could loop back around...

Okay, seems like the way to handle this is to switch to SHFileOperationW using ctypes. I believe that the win32py support that is needed for this should be part of the default python installation on windows (at least it is part of the Visual Studio python distribution).

compnerd updated this revision to Diff 184906.Feb 2 2019, 10:28 AM
compnerd edited the summary of this revision. (Show Details)

Switch to SHFileOperationW for the recursive removal. Refactor lit.util.mkdir_p to expose lit.utl.mkdir and update that to support long file paths. This makes lit more robust on Windows against creating long path temporary directories.

rnk added inline comments.Feb 4 2019, 3:29 PM
utils/lit/lit/TestRunner.py
620

This seems like a lot of ctypes code that would be better off written as a little llvm-rm utility in native C++ code, with a substitution from \brm\b to llvm-rm. Then, we'll have finally integrated a "safe rm" into LLVM, and we'll have a place we can add retry loops to try to satisfy hostile virus scanners.... it'll be great. =P WDYT?

utils/lit/lit/util.py
139 ↗(On Diff #184906)

This seems reasonable to leave as python.

thakis added a comment.Feb 5 2019, 6:27 AM

@ruiu, no unfortunately, not all the paths can be shortened in the swift test suite since it is such a heavy user of the clang modules, modules cache paths and module naming structure in clang is a problem. Perhaps if we could make %t in lit be short and clang module names to be short, it might be possible to get the paths to a size that works?

If you can do that, that's much more preferable solution than fixing the problem only at this location. I'd think if your test file get a very long pathname, this is not the only place that could break. For example, if you use rm on a test file and the test file gets a long pathname, that could fail.

(iirc we had to put a long-pathname-compatible rm on the chromium bots because that's already

utils/lit/lit/TestRunner.py
620

Or we could stop having rm be a builtin again. You need unxutils or similar to run tests on Win anyhow, and there are rm implementations for Windows that can do this. That seems a lot simpler.

compnerd marked an inline comment as done.Feb 5 2019, 9:13 AM
compnerd added inline comments.
utils/lit/lit/TestRunner.py
620

@thakis, having GNUWin32 seems insufficient, as that doesn't really support long paths :-( (or at least, I've run into issues with that too). @rnk, I can't tell if you are serious. I mean, its trivial to convert this to a C++ program, and I can do that, I just don't know if it is worth it.

rnk added inline comments.Feb 5 2019, 2:02 PM
utils/lit/lit/TestRunner.py
620

@compnerd Well, not that serious. I kind of just wanted to throw out the suggestion. We can do this shell32 call. I just think we're going to end up doing more bug fixes on it later.
@thakis, that's kind of the issue, though, isn't it? The GNU tools distributions that LLVM itself recommends don't handle these long paths well, and for Chromium we had to bundle our own "safe rm". I forget where it came from to begin with, MSys? git bash msys? I'd be OK going back to native rm if we could point with confidence to some tools that actually work.

thakis added a comment.Feb 5 2019, 3:57 PM

https://bugs.chromium.org/p/chromium/issues/detail?id=603364 is our old bug on that. Getting the right mix of win gnu tools is tricky either way (…though people can just pull the latest gnuwin zip from http://commondatastorage.googleapis.com/chromium-browser-clang/index.html?path=tools/ I suppose). Saying "we require an rm that can handle long path names" doesn't strike me as unreasonable.

The progression here was:

  1. Require rm to be present
  2. Switch to a built-in rm that doesn't work
  3. Reimplement a long-path rm in the built-in

That seems a bit like we're working hard to solve problems we made ourselves – imho we should just backtrack on step 2.

ruiu added a comment.Feb 5 2019, 4:00 PM

I'd like to know where in clang creates such long path names to see if we can fix it. Does anyone know?

thakis added a comment.Feb 5 2019, 4:06 PM

The module hash is computed in CompilerInvocation::getModuleHash() in http://llvm-cs.pcc.me.uk/tools/clang/lib/Frontend/CompilerInvocation.cpp#3402

ruiu added a comment.Feb 5 2019, 4:14 PM

The module hash is computed in CompilerInvocation::getModuleHash() in http://llvm-cs.pcc.me.uk/tools/clang/lib/Frontend/CompilerInvocation.cpp#3402

Does that function returns a long string? The last line of the function is this

return llvm::APInt(64, code).toString(36, /*Signed=*/false);

, and IIUC it returns a 64-bit value in string of base 32, which is 13 characters long.

rnk added a comment.Feb 5 2019, 4:16 PM

I think if we can fix this issue in the builtin rm, we can put this problem behind us, and save the next person setting up a new buildbot the effort of finding a special long path version of rm. I do, however, worry that we're painting ourselves into a corner where rm and mkdir work with long paths, but find doesn't. IIRC that's been a problem as well in the past.

rnk added a comment.Feb 5 2019, 4:18 PM

@ruiu, I don't think it's sustainable to ban deep directory hierarchies in lit test suites. The reality is that non-Windows users will add them back someday in the future, and the burden to remove them will eventually fall back onto Windows porters.

ruiu added a comment.Feb 5 2019, 4:23 PM

@rnk I don't think we should ban it, but I expect that the situation will be better in the future. More tools will be able to handle long paths, as MAX_PATH limitation is removed from Windows 10 according to https://docs.microsoft.com/en-us/windows/desktop/fileio/naming-a-file. If we can avoid the problem at the moment at a lower cost than handling it in a "correct" way, we probably should do that to buy time.

@ruiu it is only "removed" if you opt into it by modifying the registry on a per program basis. I don't think that is really a reasonable approach.

ruiu added a comment.Feb 5 2019, 5:39 PM

I didn't say it's easy at the moment but I don't think it is not that wrong to expect that the situation will be better over time. So as always it's a trade-off. Where is the location we currently generate long pathnames?

@ruiu - to your point of "it being better in the future" I think is false hope. The backwards API compatibility will prevent Microsoft from changing the behaviour of the APIs without an explicit opt-in. Otherwise, applications need to be rewritten, which has long been the case, you can use the NT style paths all the way back to Windows XP I believe, but applications aren't being rebuilt and redistributed.

As to the cases I am hitting? It is primarily the use of %t from lit and the module cache path in the swift test suite. Personally, I think changing those to return short paths is the wrong approach. But, if that seems to shrink the path enough to permit parallel builds, I can probably live with that. I think that of the 251 characters, we should reserve at least \Users\????????\swift-source\build\Ninja-RelWithDebInfoAssert\swift-windows-x86_64 as a prefix because that is going to be pretty common ... which doesn't leave you with much space to play with.

rnk added a comment.Feb 6 2019, 12:41 PM

I think that of the 251 characters, we should reserve at least \Users\????????\swift-source\build\Ninja-RelWithDebInfoAssert\swift-windows-x86_64 as a prefix because that is going to be pretty common ... which doesn't leave you with much space to play with.

I learned the pattern of using something like C:\b\${botname} as the root of the buildbot directory from Chrome infra, probably for this reason. =D

While I agree with @ruiu that this problem will probably eventually solve itself (as we have seen in other areas like Windows 10 console improvements), I think it's worth adding this in as a stopgap measure. I just think the lead time on a proper fix in Windows is going to be pretty long.

thakis added a comment.Feb 6 2019, 6:51 PM

Why isn't "if you want to put your build at c:\very_long_path, use a rm that can handle that" good enough as a stopgap measure?

rnk added a comment.Feb 7 2019, 10:11 AM

Why isn't "if you want to put your build at c:\very_long_path, use a rm that can handle that" good enough as a stopgap measure?

Because that's a landmine we leave behind for every person who wants to set up a Windows bot to run lit tests, no matter how much documentation we write about it. Hand rolling rm ourselves also has maintenance costs, but I believe in the long run the costs will be lower than telling everyone to use a special version of rm. Of course, that is just my estimate of the relative cost of the two proposals, and I'm prepared to be wrong.

I think we've already covered this, and it seems we're not in agreement yet. Should we reach out to other Windows bot maintainers and get more opinions?

I prefer if things "just work", so I'm not sure I understand where the pushback on this one is coming from. This isn't just an issue for buildbots, it's also an issue for developers who might have their build trees nested very deeply. Even on my local machine, I have some directories whose names are quite long because of the way I structure my build folder.

It's already a hassle for people on Windows to make sure they have all the right tools necessary to get things working properly, and then telling them on top of that that they may have to find some nonstandard version of a tool or build something from source just seems like much more of a headache then writing a couple of lines of code ourselves that solve the problem once and for all.

@zturner, to your point, I'd like to add the fact that it took a bit of effort to even figure out that the problem was so nicely tucked away in the corner of lit because the failure scenario is pretty absurd (-ENOENT). IMO, this makes this even more insidious because it will cause a lot of people to get confused or possibly give up. But, perhaps I just am holding on to tightly to my idea that things should just work and the testing infrastructure should encourage people to add tests rather than demotivate them.

thakis added a comment.Feb 8 2019, 5:07 AM

I think getting the rest of unxutils is also a huge roadblock. So we could either add llvm-grep and all the other unix utils we rely on, or we could have a script checked in somewhere that downloads a known-good version of unxutils for use on Windows. That would fix this issue here and would make running llvm tests on Windows much easier in general.

I think getting the rest of unxutils is also a huge roadblock. So we could either add llvm-grep and all the other unix utils we rely on, or we could have a script checked in somewhere that downloads a known-good version of unxutils for use on Windows. That would fix this issue here and would make running llvm tests on Windows much easier in general.

Instead of adding llvm-grep, llvm-rm, llvm-sed, etc, I would rather just implement grep, rm, and sed directly in Python as builtins in the internal lit shell, as that will help us eventually migrate all users exclusively to the LIT shell, and we have more control over fixing bugs and ensuring consistency across platforms.

@zturner I think that moving to the integrated lit shell is a great idea, as it really does lower the barrier to entry and makes testing stuff on Windows much easier.

efriedma added inline comments.
utils/lit/lit/TestRunner.py
614

Please add a comment noting which version of python, exactly, are known to have a broken shutil.rmtree, so we can potentially remove this hack in the future.

compnerd marked an inline comment as done.Feb 10 2019, 11:29 AM
compnerd added inline comments.
utils/lit/lit/TestRunner.py
614

@efriedma my understanding is that they all do. I know that they did switch to the Wide version of the interfaces in 3.5/3.7, so we might be able to pass an explicit unicode string, but, really, this is unlikely to be something that we can easily remove.

efriedma added inline comments.Feb 11 2019, 1:19 PM
utils/lit/lit/TestRunner.py
614

If all current versions are broken, fine; please still note the versions that were tested.

@rnk @zturner @ruiu - can we come to a decision here?

rnk added a comment.Feb 15 2019, 2:25 PM

I'm in favor of doing this. How about this, if we continue encountering problems in this new code over the next month, let's remove the rm builtin and go back to requiring a long-path aware native rm.exe. WDYT @thakis?

rnk added a comment.Feb 21 2019, 11:58 AM

@compnerd: I spoke offline and @thakis convinced me we should basically partially revert rL319528. Supporting long paths here just continues down the path of reimplementing coreutils buggily in Python. The 'rm' distributed as part of git for windows supports long paths, and that's what 99% of Windows developers use, and Windows buildbots probably do as well. As we move towards the git monorepo, I think those should become the documented preferred tools for testing LLVM.

@rnk - okay, that seems reasonable - can you please update the documentation on how to setup the Visual Studio Developer command prompt such that you can use the tools (assuming that Git is setup by the Visual Studio Installer) then?

@rnk, @zturner, @thakis - why don't we merge this for the time being, and once we have setup lit such that it auto-locates git for windows and sets up the path (removing the dependency on GnuWin32), we can remove this entirely?

rnk added a comment.Mar 21 2019, 3:48 PM

I'd be in favor of that. I updated the clang getting started docs to suggest using the git for windows msys tools (http://clang.llvm.org/get_started.html), but I honestly don't have the time to go around and do all these experiments to smooth over the setup. Whatever the shortest path to "not broken" is sounds good until someone decides they want to improve things.

zturner accepted this revision.Apr 5 2019, 9:03 AM
This revision is now accepted and ready to land.Apr 5 2019, 9:03 AM
compnerd closed this revision.Apr 5 2019, 9:46 AM

SVN r357778