This is a “fix something while breaking other stuff” kind of change. So it probably should not be merged. I’m bringing this up as a reminder that CommandLine needs a major rework.
On to the rationale:
The comment below that change seems to indicate that CommandLine inconsistencies are strictly unrecoverable. However, this is not the case if two libraries link dynamically to the same libLLVM version; with this patch applied, those cases run without errors. The problem most commonly occurs in Mesa: if two libraries link to libLLVM, e.g. the Clover and ROCm OpenCL backends or the radeonsi VAAPI driver, LLVM aborts the application using them.
Consequently, having both Clover and ROCm installed completely breaks OpenCL device enumeration, as outlined in the Gentoo Wiki [1]. There is also the existing use case of mpv with VapourSynth support, using a plugin like SVP [2] to post-process hardware-decoded video. Other use cases in that context come to mind, such as video editing software.
So we have valid, non-breaking use cases for which a fatal error is currently reported. Changing this into a warning will definitely break other things, so it can merely serve as a temporary solution until better and more specific error handling has been fleshed out.
Since this can only be reproduced when the library is linked twice or more, I struggle to create a regression test. Python is an interpreted language, so I cannot influence the linkage, and I doubt that importing LLVM twice would trigger it. I’d appreciate any hints on that. On the other hand, I do not expect this to be merged as is anyway.
Finally, I should add that I was unable to convince the Gentoo ebuilds to link against LLVM 16.x git, so I could only test this with 15.x git. However, the patch applies to both branches, so this should not matter too much.
This should probably be discussed more broadly in an RFC, but I’m hesitant to create one because I can’t really think of solutions, mostly because I’m familiar with neither C++ nor LLVM. As an end user, I can only suggest two things for the rework:
- (Somehow) detect and clearly report when an LLVM version mismatch occurs.
- Print the names of the libraries linking against libLLVM.
This would greatly simplify handling these issues for distributors (and it would have spared me the three days it took to find the root cause), but it most likely does not solve the underlying problem.
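To illustrate the kind of diagnostic meant by the first suggestion, here is a minimal sketch that works from outside LLVM. It is not part of the patch. It assumes Linux, where /proc/<pid>/maps lists the files mapped into a process, and it only reports when more than one distinct libLLVM shared object is loaded. The function names and the file name pattern are my own assumptions, and it does not identify which libraries pulled libLLVM in (that would require reading their dynamic sections).

```python
import re

# Assumed naming pattern for libLLVM shared objects, e.g. libLLVM-15.so
# or libLLVM.so.15; adjust it if a distribution names them differently.
LIBLLVM_RE = re.compile(r"libLLVM[^/]*\.so[^/]*$")


def find_libllvm_copies(maps_text):
    """Return the sorted, distinct libLLVM paths found in maps-style text."""
    paths = set()
    for line in maps_text.splitlines():
        # Format: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no backing file
        path = fields[5].strip()
        if LIBLLVM_RE.search(path):
            paths.add(path)
    return sorted(paths)


def describe(paths):
    """Turn the list of libLLVM paths into a one-line report."""
    if len(paths) > 1:
        return "multiple libLLVM copies loaded: " + ", ".join(paths)
    if paths:
        return "single libLLVM loaded: " + paths[0]
    return "no libLLVM loaded"


def report_for_pid(pid="self"):
    """Read /proc/<pid>/maps (Linux only) and describe the libLLVM copies."""
    with open(f"/proc/{pid}/maps") as maps_file:
        return describe(find_libllvm_copies(maps_file.read()))
```

Calling `report_for_pid(1234)` for a running process with the (hypothetical) PID 1234, such as an mpv or clinfo instance, would then show whether two different libLLVM files ended up in the same address space. Reading another user's process requires the matching permissions.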
[1]: https://wiki.gentoo.org/wiki/OpenCL#AMD
[2]: https://www.svp-team.com/wiki/SVP:Linux
Related bugs:
[3]: https://github.com/llvm/llvm-project/issues/23326
[4]: https://github.com/llvm/llvm-project/issues/29935