This is an archive of the discontinued LLVM Phabricator instance.

[Frontend] Only compile modules if not already finalized
ClosedPublic

Authored by bnbarham on Jul 1 2021, 6:46 PM.

Details

Summary

It was possible to re-add a module to a shared in-memory module cache
when search paths are changed. This can eventually cause a crash if the
original module is referenced after this occurs.

  1. Module A depends on B
  2. B exists in two paths C and D
  3. First run only has C on the search path, finds A and B and loads them
  4. Second run adds D to the front of the search path. A is loaded and contains a reference to the already compiled module from C. But searching finds the module from D instead, causing a mismatch
  5. B and the modules that depend on it are considered out of date and thus rebuilt
  6. The recompiled module A is added to the in-memory cache, freeing the previously inserted one

This can never occur from a regular clang process, but is very easy to
do through the API - whether through the use of a shared case or just
running multiple compilations from a single CompilerInstance. Update
the compilation to return early if a module is already finalized so that
the pre-condition in the in-memory module cache holds.

Resolves rdar://78180255

Diff Detail

Event Timeline

bnbarham created this revision.Jul 1 2021, 6:46 PM
bnbarham requested review of this revision.Jul 1 2021, 6:46 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 1 2021, 6:46 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
bnbarham edited the summary of this revision. (Show Details)Jul 1 2021, 6:47 PM

This problem shouldn't happen based on module-level name hashing but for some reason it is happening. If I'm not mistaken, InMemoryModuleCache uses .pcm file name as the key. But that file name should contain a hash part that is based on the modulemap path. But frameworks M in "frameworks" and "frameworks2" should have different paths for modulemaps and should have different .pcm names, therefore no conflicts in InMemoryModuleCache.

I see the value in not crashing but looks like there might be a problem somewhere else.

Also it is more of my personal pet peeve. InMemoryModuleCache should simplify memory management for modules but in practice it simplifies creating dangling references. I'm not asking to address that, so if you feel like my requests get too close to that direction, it's good to push back.

bnbarham added a comment.EditedJul 6 2021, 8:57 PM

This problem shouldn't happen based on module-level name hashing but for some reason it is happening. If I'm not mistaken, InMemoryModuleCache uses .pcm file name as the key. But that file name should contain a hash part that is based on the modulemap path. But frameworks M in "frameworks" and "frameworks2" should have different paths for modulemaps and should have different .pcm names, therefore no conflicts in InMemoryModuleCache.

Ah sorry, looks like I had my modules slightly confused. The underlying problem is that B is considered out of date due to the check on line 2920, ie. the one that creates the err_imported_module_relocated diagnostic. That causes A to be re-compiled as a module it depends on is out of date. The location of A hasn't changed, so it keeps the same .pcm path.

B is also re-compiled, but as you point out that's perfectly fine since it'll get a different path in the cache.

EDIT: Replace A with Top and B with M for the corresponding modules in the test. Not sure why I used different names in the summary and test...

I'll update the comments in the test + the review summary. Sorry for the confusion.

I see the value in not crashing but looks like there might be a problem somewhere else.

For what it's worth, even if there was another cause I'd still like to get this change in - writing over a module in the cache causes all sorts of different crashes that are hard to diagnose. I'd be *very* surprised if there weren't more issues like this, so at least we'll get an actual error rather than random crashes.

Also it is more of my personal pet peeve. InMemoryModuleCache should simplify memory management for modules but in practice it simplifies creating dangling references. I'm not asking to address that, so if you feel like my requests get too close to that direction, it's good to push back.

Heh, I find this a nice summary of how I feel about it after trying to find the root cause of these crashes!

bnbarham updated this revision to Diff 356861.Jul 6 2021, 9:04 PM
bnbarham edited the summary of this revision. (Show Details)

Updated the test comment with the actual cause (ie. Top being recompiled and inserted, not M).

vsapsai added inline comments.Jul 7 2021, 7:43 PM
clang/unittests/Serialization/ModuleCacheTest.cpp
126–127

Haven't rechecked the code more carefully but had an idea is that if we want to allow InMemoryModuleCache reuse between multiple CompilerInstance, safer API would be to transfer ModuleCache ownership to the new CompilerInstance and maybe make all records in the cache purgeable. But that's applicable only if ModuleCache reuse is important.

bnbarham added inline comments.Jul 7 2021, 8:52 PM
clang/unittests/Serialization/ModuleCacheTest.cpp
126–127

Every module compilation runs in a separate thread with a separate CompilerInstance (but shared cache). So ownership would have to be transferred back to the original instance once complete. I might be misunderstanding what you mean here though.

Another point to note is that eg. Swift doesn't actually create another CompilerInstance, it's just re-used across module loading and CompilerInstance::loadModule is called directly. I went with this way in the test as otherwise we'd need to inline most of ExecuteAction and have a custom FrontendAction.

bnbarham edited the summary of this revision. (Show Details)Jul 9 2021, 8:23 PM
vsapsai added inline comments.Jul 13 2021, 9:13 AM
clang/lib/Frontend/CompilerInstance.cpp
1063

Can we get in infinite loop with AllowPCMWithCompilerErrors = true? Specifically, I'm thinking about the scenario

  1. compileModuleAndReadAST obtains a file lock and calls compileModule
  2. compileModule calls compileModuleImpl
  3. Module is finalized but AllowPCMWithCompilerErrors is true, so compileModuleImpl returns true
  4. compileModule returns true
  5. compileModuleAndReadAST tries to read AST because compilation was successful
  6. AST is out of date, so compileModuleAndReadAST decides to try again, goto 1

Haven't tried to reproduce it locally but even if this scenario is impossible, a corresponding test case can be useful.

clang/lib/Serialization/ASTReader.cpp
2926

I'm thinking if in case of finalized modules diagnostic messages are good enough. One concern is it won't be clear why a module wasn't rebuilt. It can be already confusing for precompiled headers and I'm afraid we won't be able to detect isPCMFinal code path without a debugger. Though I don't know how bad that would be in practice.

Another concern is that retrying a compilation should succeed as for a new process we have a new InMemoryModuleCache and isPCMFinal should return false. So we might have non-deterministic behavior and some of the valid error messages can seem to be non-deterministic and not reliable. I was thinking about adding a note in case we are dealing with isPCMFinal to distinguish these cases but not sure it is a good idea.

5932

Based on the rest of the code in clang, the expectation for diagnose... methods is to emit diagnostic in some cases. Personally, I'm often confused what true/false means for these methods, so I'm thinking about renaming the method to something like isRecoverableOutOfDateModule, canRecoverOutOfDateModule or some such. Feel free to pick a name you believe is appropriate, mine are just examples.

clang/unittests/Serialization/ModuleCacheTest.cpp
126–127

Thanks for the explanation. I haven't considered how the API is used right now, so please discard my earlier comment.

bnbarham added inline comments.Jul 13 2021, 4:31 PM
clang/lib/Frontend/CompilerInstance.cpp
1063

Nice catch, that does seem likely - I'll see if I can add a test for this.

clang/lib/Serialization/ASTReader.cpp
2926

I'm thinking if in case of finalized modules diagnostic messages are good enough. One concern is it won't be clear why a module wasn't rebuilt. It can be already confusing for precompiled headers and I'm afraid we won't be able to detect isPCMFinal code path without a debugger. Though I don't know how bad that would be in practice.

The error messages will mention a module in the module cache, which would be the main way to tell. We could add a note here as you suggest below, but I'm not quite sure what it would say... something about there being two modules with the same name?

Another concern is that retrying a compilation should succeed as for a new process we have a new InMemoryModuleCache and isPCMFinal should return false. So we might have non-deterministic behavior and some of the valid error messages can seem to be non-deterministic and not reliable. I was thinking about adding a note in case we are dealing with isPCMFinal to distinguish these cases but not sure it is a good idea.

The errors should be deterministic I believe. If one process has the issue then a new one will have the issue as well. For what it's worth, I don't think these crashes are possible from the clang frontend. They require messing around with search paths such that between two compilations in the same process, different modules are found.

5932

Fair enough, canRecoverOutOfDateModule sounds reasonable to me. Or maybe canRecoverFromOutOfDate?

bnbarham marked 5 inline comments as done.Jul 13 2021, 5:35 PM
bnbarham added inline comments.
clang/lib/Frontend/CompilerInstance.cpp
1063

It doesn't end up causing an infinite recursion as false will end up being returned from compileModuleAndReadAST, but in that case there's no point returning true from compileModuleImpl in the first place so I've changed it anyway. Have also added that as a test case, just to make sure.

clang/lib/Serialization/ASTReader.cpp
2926

I've added an extra note "this is generally caused by modules with the same name found in multiple paths". So the diagnostics would now be:

/.../test.m:2:10: fatal error: module 'M' was built in directory '/.../frameworks/M.framework' but now resides in directory 'frameworks2/M.framework'
        @import Top;
         ^
/.../test.m:2:10: note: imported by module 'Top' in '/path/to/module/cache/LGSY9KSYAKN1/Top-1MZE9QJ1AHENQ.pcm
/.../test.m:2:10: note: this is generally caused by modules with the same name found in multiple paths

I was thinking of mentioning the finalized module, but that's really just a side effect of the error that should have already been output (ie. module 'M' was built in... in this case). So the existence of this note should help point users to the problem + make which path caused the error obvious when looking at the code.

5932

I went with the latter.

bnbarham updated this revision to Diff 358465.Jul 13 2021, 5:36 PM
bnbarham edited the summary of this revision. (Show Details)
vsapsai added inline comments.Jul 14 2021, 7:23 AM
clang/lib/Frontend/CompilerInstance.cpp
1063

Thanks for investigating it and adding a test case.

And appreciate that the error message mentions the module name. It is so hard to work with errors like "cannot rebuild this module".

clang/lib/Serialization/ASTReader.cpp
2926

I like the note. Think it gives a good hint at what's wrong and we don't need to improve it further now.

The errors should be deterministic I believe. If one process has the issue then a new one will have the issue as well. For what it's worth, I don't think these crashes are possible from the clang frontend. They require messing around with search paths such that between two compilations in the same process, different modules are found.

Thanks for the explanation. I was thinking about triggering the error not through API but through clang frontend. But I see that through API the error should be deterministic.

5932

I think with the current naming we need to flip the conditions to opposite. Currently,

return !(ClientLoadCapabilities & ARR_OutOfDate) ||
       getModuleManager().getModuleCache().isPCMFinal(ModuleFileName);

corresponds to cannotRecoverFromOutOfDate. But to avoid double negations it is better to have canRecoverFromOutOfDate (with (ClientLoadCapabilities & ARR_OutOfDate) && !isPCMFinal(ModuleFileName)) and call it as !canRecoverFromOutOfDate when necessary.

bnbarham updated this revision to Diff 358798.Jul 14 2021, 6:05 PM
bnbarham marked 2 inline comments as done.Jul 14 2021, 6:09 PM
bnbarham added inline comments.
clang/lib/Frontend/CompilerInstance.cpp
1063

Note that this error is really just a fallback - it will be silenced by the previous fatal errors in ASTReader.

clang/lib/Serialization/ASTReader.cpp
2859–2860

The failing test was because this note was being output when if OutOfDate wasn't in Capabilities. It should only get output when OutOfDate can be handled (ie. it'd be recompiled) *and* the module is already finalized.

5932

Argh, sorry - just did a find + replace without thinking. Have flipped that function and its calls.

vsapsai accepted this revision.Jul 14 2021, 7:47 PM

Have 1 punctuation nit (that I'm not sure about), the rest looks good.

clang/lib/Serialization/ASTReader.cpp
2854

I don't remember recommended LLVM style or if clang-tidy would complain about it but ... & ... && ... can be unclear regarding the priority of operations. Personally, I would do (... & ...) && ... but I don't know what is the rule, so not insisting.

This revision is now accepted and ready to land.Jul 14 2021, 7:47 PM
bnbarham updated this revision to Diff 358826.Jul 14 2021, 7:54 PM
bnbarham marked 2 inline comments as done.
bnbarham marked an inline comment as done.
bnbarham added inline comments.
clang/lib/Serialization/ASTReader.cpp
2854

I don't feel particularly strongly about it either way, so have just updated to add the parentheses :).

Also renamed recompileFinalized to recompilingFinalized as I thought that made more sense after coming back to it.

bnbarham marked an inline comment as done.Jul 15 2021, 3:20 PM

I don't have commit access. @vsapsai or @akyrtzi would you mind committing this? The test failures seem unrelated to me.

This revision was landed with ongoing or failed builds.Jul 15 2021, 6:28 PM
This revision was automatically updated to reflect the committed changes.