This is an archive of the discontinued LLVM Phabricator instance.

[Support] On Windows, ensure that UniqueID is really stable
Needs ReviewPublic

Authored by aganea on Mar 20 2023, 7:17 PM.

Details

Summary

Before this patch, UniqueID was thought to store stable file IDs. However a recently reported issue shows that the file IDs are not always stable, at least on Windows. The documentation for the underlying Windows API states: (note the emphasis)

To determine whether two open handles represent the same file, combine the identifier and the volume serial number for each file and compare them.

Currently, UniqueID does not keep the file handles open. The fact that file IDs were stable until now is only due to a NTFS implementation detail. In some cases, such as drives mounted on network locations, the underlying Windows drivers might or might not generate stable file IDs.

To fix the issue, we keep the file handles open during the lifetime of their corresponding UniqueID instances. Since handles will live longer now, this requires particular attention when performing some file actions, such as file deletions.

Should fix #61401 and #22079.

Diff Detail

Event Timeline

aganea created this revision.Mar 20 2023, 7:17 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 20 2023, 7:17 PM
aganea requested review of this revision.Mar 20 2023, 7:17 PM
Herald added projects: Restricted Project, Restricted Project, Restricted Project. · View Herald TranscriptMar 20 2023, 7:17 PM
aganea updated this revision to Diff 506820.Mar 20 2023, 7:32 PM
kbobyrev added a subscriber: sammccall.
kadircet added a comment.EditedMar 21 2023, 2:32 AM

To fix the issue, we keep the file handles open during the lifetime of their corresponding UniqueID instances. Since handles will live longer now, this requires particular attention when performing some file actions, such as file deletions.

I am a little worried about this in general (what happens if some part of LLVM is keeping a fs::file_status object around when something deep down is trying to delete a file?) but specifically from clang-tooling perspective, especially clangd (i'd expect lldb to have similar concerns).
Today clangd is a long-lived process and it keeps uniqueid's around for the headers included inside a translation unit pretty much indefinitely. What happens if someone outside tries to delete/modify a file while we're keeping handles for it around? (it's quite common for people to switch branches in their VCS while clangd is working on these files)

also filemanager is one of the central pieces depended by many projects, that'll keep uniqueids around for any file that's accessed.

hans added a comment.Mar 21 2023, 5:38 AM

This makes me nervous as well. I think it also doesn't fit well with developers' expectations of UniqueID, changing it from a simple piece of data to something which interacts with the OS in a pretty deep way. I don't have specific examples, but I'm sure it will bite us somehow.

FWIW, I'm also nervous about this change. Keeping the file handle open changes the ways in which you can access the file from other processes (or within the same process), so I'd expect this to cause some heartache, possibly for downstreams as well as Clang users. I mostly worry about things like leaving a file open interacting with things like deleting the file from the file system, including via a user interaction.

aganea added a comment.EditedMar 21 2023, 6:33 AM

Fair enough. There are several choices forward: either we mark the issue as "Will Not Fix" or I can try only scoping this patch to only keep the handle open for network drives/paths. Any other suggestions?

Fair enough. There are several choices forward: either we mark the issue as "Will Not Fix" or I can try only scoping this patch to only keep the handle open for network drives/paths. Any other suggestions?

I am afraid that a solution that keeps file handles alive indefinitely won't work even for that narrow set of users (unless they're operating on a read-only FS). so another alternative could be to have a different definition for uniqueids on these specific cases, e.g. use filepaths (canonicalized for capitalization & symlinks & hardlinks etc.). it might be slower than how things are working today, but i guess if that's needed for correctness in certain platforms one has to eat that regression?

Fair enough. There are several choices forward: either we mark the issue as "Will Not Fix" or I can try only scoping this patch to only keep the handle for network drives/paths. Any other suggestions?

I also kinda share the sentiment that keeping handles open might be problematic - the cure is worse than the disease. I wouldn't quite start throwing in the towel quite yet though...

While the docs say that the ID only is stable as long as the handles are open (except for NTFS where it's an implementation detail), it's also stable enough for FAT (as long as nobody is defragmenting the drive while the process is open); at least stable enough for compiler/linker use, but possibly not for long running processes like clangd.

It might not even be strictly necessary for all network drives; out of 4 tested cases, it seemed to behave fine (stable enough at least) with a Linux Samba mount and a drive shared from VMWare Fusion, while Remote Desktop folder sharing (and allegedly VirtualBox although I haven't tried that) exhibit the issue. But I don't know of any immediate way of identifying the problematic cases either...

Another potential band-aid solution is to simply hash the pathnames for such drives. We won't get the desired effect of correctly identifying the same individual file via different pathnames (hardlinks/symlinks), but we would at least have a stable representation of a file.

Hello devs,

I would like to follow up on this issue whether or not there has been an update on it.

Thanks.

I would like to follow up on this issue whether or not there has been an update on it.

There's no update currently; I think there's some amount of consensus that this approach, while fixing the issue, would cause too much other collateral issues.

I'm considering making a patch that switches to hashing the canonicalized path, for the cases where we know the path isn't a local path; it's not an entirely correct solution, but a rough approximation for when the file IDs are stable or not. I just haven't had time to try doing that yet.