I have previously shared most of the content of this proposal on the llvm-dev list: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133861.html
I was asked to send the charter as a formal code review. I feel it can fit under "Proposals", but I am open to moving it to another more appropriate place.
There are some things in the libc++ docs that I think would be good to follow here: https://libcxx.llvm.org/docs/
llvm/docs/Proposals/LLVMLibC.rst | ||
---|---|---|
25 | I'd drop this part: "llvm-libc is effectively C11 and upwards conformant." I do wonder: will it require passing -std=c17 or later? That's mostly how libc++ works: the C++17 library features require C++17 language. | |
28 | I think this is still pretty tentative, based on mailing list comments. Maybe say so here? | |
37 | Will it be ABI-stable? Maybe it's worth expanding on how the ABI will evolve, and what will be stable. | |
71 | Capitalize "llvm" |
I have actually used the libc++ documentation as a template but wrote it like a proposal. I feel that some of the items like status, platform support etc would be meaningful within the llvm-libc documentation after we make some progress with the implementation.
Do you have any specific items in mind that should be included?
llvm/docs/Proposals/LLVMLibC.rst | ||
---|---|---|
25 | I would think the public headers will use certain C11/C17 features requiring user code to be compiled with -std=c17. | |
28 | Reworded it now to say that we will provide this ability only if possible and desired for a platform. | |
37 | I am not sure how exactly to word it here. Do you have any suggestions on what the ABI promise should be and what exactly to say in a proposal like this? |
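To make the -std question on line 25 a little more concrete, here is a minimal sketch of how a public header could gate C11-only declarations on `__STDC_VERSION__`. The macro `DEMO_LIBC_HAS_C11` and the function `demo_quick_exit` are hypothetical names chosen for illustration; nothing here is taken from the proposal itself.

```c
/* Hypothetical public-header fragment: expose C11-only declarations
   only when the translation unit is compiled as C11 or later. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
#define DEMO_LIBC_HAS_C11 1
#else
#define DEMO_LIBC_HAS_C11 0
#endif

#if DEMO_LIBC_HAS_C11
/* _Noreturn is a C11 keyword, so this declaration is hidden from
   pre-C11 user code instead of failing to parse there. */
_Noreturn void demo_quick_exit(int status);
#endif

/* Lets a program observe which mode the header was compiled in. */
int demo_libc_has_c11(void) { return DEMO_LIBC_HAS_C11; }
```

Under this kind of gating, user code built with an older -std would still compile, but would simply not see the newer declarations. That is roughly the libc++ model described above, applied to C.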
llvm/docs/Proposals/LLVMLibC.rst | ||
---|---|---|
37 | Depends on what people want to do with it. I'm just saying it should be given thought. If you want to interop with other libc then you need to match their ABI, which can be a burden. IIUC musl matched glibc almost accidentally, and is moving away from doing so. Then you might consider whether your libc is ABI stable over time, and how you'd manage that. The answer might change between static and dynamic versions. |
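As a rough illustration of what "matching their ABI" means at the type level, the sketch below checks at compile time that a libc's own struct definition has the same layout as the system's. The type `demo_timespec` and the helper `demo_to_system_timespec` are hypothetical, and the field order assumes a typical x86-64 Linux `struct timespec`; other platforms may differ, which is exactly the burden being described.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical: the new libc's own definition of a type that the
   system libc also defines. */
struct demo_timespec {
  time_t tv_sec; /* seconds */
  long tv_nsec;  /* nanoseconds */
};

/* Compile-time layout checks. If any of these fail, objects of the two
   types cannot be passed across the boundary between the two libcs. */
_Static_assert(sizeof(struct demo_timespec) == sizeof(struct timespec),
               "size differs from the system timespec");
_Static_assert(offsetof(struct demo_timespec, tv_sec) ==
                   offsetof(struct timespec, tv_sec),
               "tv_sec offset differs from the system timespec");
_Static_assert(offsetof(struct demo_timespec, tv_nsec) ==
                   offsetof(struct timespec, tv_nsec),
               "tv_nsec offset differs from the system timespec");

/* A byte-wise copy is only meaningful because the checks above hold. */
struct timespec demo_to_system_timespec(const struct demo_timespec *in) {
  struct timespec out;
  memcpy(&out, in, sizeof out);
  return out;
}
```

Every such type shared with another libc would need checks of this kind, on every supported platform, for as long as interop is promised.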
(Wall of text warning... sorry!)
llvm/docs/Proposals/LLVMLibC.rst | ||
---|---|---|
37 | (Sorry for chiming in so late... I chatted with Siva, and I volunteered to add my take.) I don't think we would ever try to maintain ABI-level interop with other libc implementations on purpose. There is some subtlety here, though, so let me step back for a moment. (The next few paragraphs are largely for the benefit of folks who may be following along the review thread, so I'm surely restating more than necessary. This is a bit long, so you can skip to the end if you want... I won't tell anyone. ;-) )
Fundamentally, our aim is to define a tightly-bounded interface (i.e., one that provides what it must; no more, no less), and retain as much flexibility as we can for everything behind the interface. The "implementation stuff" behind the interface then has to be structured in a rational way, so that bits and pieces can be replaced. (I'm going to come back to that point in a bit.)
The "interface" in this case is effectively just the headers: struct and enum definitions, function declarations, maybe a couple of extern variable declarations, and... really, that's about it.
The "implementation flexibility" goal is where things like DSOs, interceptors, and delegation come into play. The point in the design space we need (for Google production workloads) is, arguably, one of(*) the simplest: a fully statically linked (*)PIE binary, wherein we are willing to build everything from sources using the same compiler.
The question of things like ABI stability seems to me, quite frankly, out of the scope that we would even want to define. There is a simple reason why I would like to stay somewhat purposefully blind to these questions: I do not believe we can predict why folks might need guarantees from the library, instead of relying on the guarantees that come from the combination of language and compiler implementation. (Such reasons surely exist, but I expect many of them to be novel, surprising, or both.) 
In other words: if you can build the .a or .lib, and you have headers that match it, and that archive works like any other library archive, why would one need still more guarantees? FWIW, this question is only maybe 75% rhetorical... I find myself routinely ... let's say, "impressed" by new and interesting ways to ... let's say, "meaningfully change program behavior" using just the linker's command line options. (Obviously, I'm toning down my opinions here... but my strong bias is that these uses need to be brought to light before any fundamental design accommodations are made.) A logical extension of the above is that the implementation will look as much as possible like any other library. We don't want to insert an "upward contract" (or would it be "downward?") that might get in the way of whatever else a packager might want to do. (Providing a DSO is, IMO, more akin to release or distribution packaging than simply building a library. I'm purposefully using the term "packaging" instead of anchoring on, say, the specifics of ELF DSOs.) I think there is an interesting analog here to the LLVM project in general: similar to how the broader LLVM project's "compiler as a library" approach yields a toolbox of things which can be used to build a compiler (but also other things), the "libc as a library" aphorism(/pun) is meant to point out that the task of "building a libc" isn't monolithic, either (and maybe there are other things people want in this space, and we can make our work useful to them). This comes back to the question about using or replacing parts of the overall libc: there's really no reason that the whole thing needs to be monolithic. There are, of course, some internally-cohesive subsystems that would be hard to break apart; but at the macro level, a "libc" isn't terribly cohesive. The monolithic coupling is largely artificial. Alright, so that's the background. There is, quite admittedly, some duplicity in the goals -- at least, as I've stated them. 
One good example: we actually would like to make reasonable "narrowing" guarantees: if there is some design or implementation space we could leave empty, and doing so would drastically simplify a packaging use case, then that's something we should try to accommodate. (Especially if it doesn't add substantial cost to the implementation.) However, a guarantee like "our ABI will match <other libc>" is not what I would call narrowing: it would require us to cover more of the design/implementation/ABI space than absolutely necessary, so it doesn't really seem like a guarantee we would want to make. On a more technical note, I should also point out that there is an important detail that is easy to miss, and I think it's mostly buried somewhere within this bullet point:
So, for example, the entire "implementation" ought to be available without squatting on symbol names used by libc. Our expectation is to use link-time aliasing to satisfy most libc symbols, but I think the important takeaway is that the libc symbols will notionally only exist in an independent layer on top of the "actual" implementation (i.e., separate symbol logic from program logic). One seemingly-equivocal goal in all of this is delegation to a separate libc, and this seems like it might raise questions about ABI interop. Personally, I would not characterize this as "interoperability;" rather, I would characterize it by saying that a "syscall" might be implemented by, say, a trap instruction; but it doesn't have to be. For instance, it could delegate -- probably through clever linking tricks -- to a vendor-supplied libc. More generally, I would say that this could apply to any libcall (not just syscalls). My expectation is that, since we'll need something like this as a bootstrapping mechanism anyhow, it would be unwise to try to make it an anti-goal. Inhibiting that use case just seems like more work to cover ground that we don't particularly care about. (Implementing such delegation is hard, but actually preventing it would be even harder... and to what gain? Bragging rights?) I do strongly suspect that there are other users who might want to use this libc, but still have to use a vendor library to actually talk to their kernel. Or maybe folks would want to delegate almost everything to their existing libc, except for one or two routines. Or maybe folks want an easier way to rebuild existing programs to run inside their shiny, new sandbox. Or they want easier kernel-bypass networking, or they want to use an in-process virtual filesystem, or any other thing which, today, commonly requires using new, incompatible APIs. And just as a quick sanity check: I don't actually expect that this libc would be adopted primarily as a system libc... 
at least, not in the near term, by any existing platform. It should be complete and of high enough quality to serve that purpose, though. It might make sense to use it for a new platform that hasn't already cemented its ABI (cf. Alpine Linux and musl), or for a vendor to adopt it for an epoch release (I do think an ABI layer could be fashioned to make that feasible). But there is plenty of value in not relying on a system libc, and in being able to replace parts (work around bugs, provide a different implementation, etc.). |
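The "separate symbol logic from program logic" idea in the comment above can be pictured with a small sketch. This is not the mechanism the authors have in mind (no design has been published); it only assumes the GCC/Clang `alias` attribute on an ELF target. The names `llvmlibc_impl_strlen` and `demo_strlen` are hypothetical, with `demo_strlen` standing in for the public libc symbol.

```c
#include <stddef.h>

/* Program logic: the actual implementation lives under a
   project-private name and does not squat on any libc symbol. */
size_t llvmlibc_impl_strlen(const char *s) {
  const char *p = s;
  while (*p != '\0')
    ++p;
  return (size_t)(p - s);
}

/* Symbol logic: a thin layer that binds the public name to the
   implementation with an alias. No wrapper code is generated; both
   names refer to the same function body. */
size_t demo_strlen(const char *s)
    __attribute__((alias("llvmlibc_impl_strlen")));
```

With this split, a packager could drop or replace the alias layer (for example, to let another libc provide the public symbol) without touching the implementation.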
llvm/docs/Proposals/LLVMLibC.rst | ||
---|---|---|
37 | If I understand David correctly, he is essentially trying to say something which I have failed to convey so far: We will keep the ABI compatibility and stability questions open for now and let someone who cares for these issues come and provide/fill in these details in future. Does that make sense? Considering we (the team at Google I represent) are not particularly interested in these questions, we do not want to say/guess/promise about them. That said, we are also not preventing anyone from formalizing answers to these questions in future. |
llvm/docs/Proposals/LLVMLibC.rst | ||
---|---|---|
37 | I made two points:
I'd like them answered separately. I think it's critical to the success of this project to have other non-Google-production participants, and I expect that some will care about ABI stability. |
llvm/docs/Proposals/LLVMLibC.rst | ||
---|---|---|
37 | It seems like this line of discussion is quickly starting to go in circles... but maybe I'm misunderstanding your questions.
Can you be more specific? "ABI" is a pretty broad topic... For example, we have no control over whether a user of the library chooses an alternate calling convention, then attempts to link against a prebuilt archive. I strongly doubt this is the intent of your question, but it's an example of why I'm asking for narrowing... we need to be careful not to over-promise. "ABI" is a vague enough notion that simply saying "we will have a stable ABI" would be vastly overreaching.
(We don't believe this is entirely the case, at least for ELF.)
This is the intent, and why the term "layer" is used in this bullet point:
The specific mechanism, however, is something that ought to be addressed in a standalone design doc. We do have such a plan for ELF, but it is fairly intricate. In any event, the discussion of exactly how (and why) it would be implemented is something that is ... "nuanced," to put it lightly. It seems unnecessary to try to include such a deep technical specification in this particular, high-level document. In fact, I could even go further: trying to define "layering" will require us to anchor to the mechanism for a specific platform, or at least family of platforms. That seems antithetical to the current goals of this document (and, frankly, feels more like a wedge than an actual question... so perhaps this would be better deferred until Siva sends the specific design, after this doc is committed). |
llvm/docs/Proposals/LLVMLibC.rst | ||
---|---|---|
37 |
libc++ has ABI guarantees, I'd expect you to figure out similar guarantees for a libc. Yes it's broad, and yes I'm asking that you figure out exactly what that means. Yes this includes LLVM libc version X -> Y as well as with other libc implementations (since interop with other libc is part of this proposal). |
llvm/docs/Proposals/LLVMLibC.rst | ||
---|---|---|
37 | As you have said, there are two kinds of ABI compatibility that one could discuss here:
[You brought up the libc++ ABI guarantees. One cannot have a namespace based ABI management scheme for a libc as we cannot use namespaces in libc public headers.]
|
llvm/docs/Proposals/LLVMLibC.rst | ||
---|---|---|
37 | I think there is some miscommunication happening here... I think what Siva and David are trying to say is that we want people who have specific ABI compatibility needs to drive the ABI compatibility design. It's not any form of opposition to having such a proposal; it just should probably come from the people working in that space.
I'll try to explain this from the perspective of JF's question: libc++ has ABI guarantees. But originally, libc++ *only* had a stable ABI. There was no "unstable" ABI because the libc++ authors didn't need one. Later on, folks (us) were trying to start using libc++ and happened to want an "unstable" ABI that could easily track updates and fixes that weren't ABI compatible. At that point we worked w/ Marshall and others to come up with an approach that would work for our use case but also wouldn't get in the way of the stable libc++ ABI.
I think it was a good thing that libc++ didn't try to design this system up-front. It would have been a waste of time given that there weren't any users involved with libc++ at the time to even consume it. But I also think it was really important to figure out a way to support that once there was a concrete use case in mind.
I think we're basically seeing the reverse position here. Our use case is for an unstable ABI. We're totally open to having a stable ABI, but as we don't have a concrete use case in mind, it seems like it would be better to wait for folks to have specific requirements for a stable ABI and then design a solution that works for them.
Maybe "we are not interested in this question" is easily misinterpreted as "we are not interested in answering this question at all". Sorry if so, I think that's just a slight communication issue. I think another way to put it is "we aren't the right people to figure out the core use cases that will drive any answer to this question, but we're happy for folks to propose or work toward that direction". 
Hopefully this helps address some of the confusion. |
llvm/docs/Proposals/LLVMLibC.rst | ||
---|---|---|
37 |
That's fine with me. What I'm asking is simple:
I am asking a bunch of questions, but you don't need to have perfect answers or to sign up for the work. However, I do think you want to put some thought into it so either interested folks can jump in now, or when interested folks come along everything is at least set up with ABI in mind. Simple examples:
Some of these can stay up in the air for a bit! But some would require moving your code around and setting stuff in stone to enforce into an ABI. What have other libc implementations done? If you think they did something silly, why? |
llvm/docs/Proposals/LLVMLibC.rst | ||
---|---|---|
37 | To be clear, I'm not asking that you dig into all the bullet points! Just the 3 numbered things. Leaving a placeholder is easy, sending a ping to the RFC is as well. Clarifying the interop thing seems easy too? I guess code might make what you have in mind obvious here... I think it's just been confusing to me and others. |
llvm/docs/Proposals/LLVMLibC.rst | ||
---|---|---|
37 | I understand that everything you are saying and asking is very relevant. One thing that is not clear to me: should this proposal block on getting answers to those questions even if the answers do not end up in this proposal? As David mentioned, we will have to prepare design docs for some of the questions you want to see answers for. That is not to say we do not want to write up those design docs. On the contrary, we want to write and share those docs and get feedback from the community. But should this proposal block on having an agreement on those docs? There are other kinds of questions in your list that we think should be answered by people who work in those areas and have the relevant expertise. I agree that such folks are probably not following this code review. But again, should this proposal block on having answers to, and agreement on, those questions? If I have understood the process correctly, I can make requests for administrative aspects like mailing lists, SVN/Git repos etc. only after this proposal lands. So, I have been of the opinion that we will first land this proposal and then have discussions about the designs etc. via code reviews in the new lists/repos.
The very first time you brought this up, I asked you how to word it so that it conveys that ABI questions are still open. As it is written now, all it says is that we want an ABI-independent implementation as far as possible. That ABI refers to things like calling convention, stack layout etc. The ABI aspects you are asking about refer to a different kind of ABI, about which the proposal is currently silent. |
llvm/docs/Proposals/LLVMLibC.rst | ||
---|---|---|
37 | Siva, I think JF's follow-up maybe clarified this somewhat, but having a placeholder around the fact that there remain open ABI issues seems pretty reasonable. The same goes for at least pinging the RFC thread and trying to write down some of the interop issues. Regarding your last point -- I think just adding a section that says there are open questions about the best way to build stable ABI versions of the libc (and the use cases associated with it) would be a decent start? |
llvm/docs/Proposals/LLVMLibC.rst | ||
---|---|---|
37 | Ah, sorry! I missed JF's follow up which came in as I was typing my response in phabricator. |
llvm/docs/Proposals/LLVMLibC.rst | ||
---|---|---|
25 |
This is not clear. See Szabolcs's concern in https://lists.llvm.org/pipermail/llvm-dev/2019-July/133894.html It is a very good reply but many questions in that post have not been answered. | |
30 | Can you clarify a bit what features you really want from C++? People have concerns that freestanding C++ language semantics are underspecified. My impression is that it may be useful in a few places, but in most places C will just be a better choice. | |
37 | What "ABI" means here is very unclear (call ABI? ELF processor-supplements ABI? ABI specified by the host libc?). Does "ABI independent" mean inventing your own ABI? | |
56 | I don't know why you call them monolithic. Many libc implementations, if they care about static linking, just can't be monolithic. In static linking, you link in just enough components of libc; otherwise your program will be monstrous. uclibc is menuconfig configurable. | |
60 |
"break" is not accurate. They just lack builtin sanitizer interceptors. |
llvm/docs/Proposals/LLVMLibC.rst | ||
---|---|---|
25 | I don't know what it is you find not clear. I understand that many disagree with how LLVM's C++ libraries manage their API and ABI stability (or lack thereof), but I don't think that is a terribly relevant complaint towards this effort and so I don't know what specific technical concerns you think are still important to address here. Unfortunately, the post you mentioned conflates technical critique with somewhat inflammatory comments. I think that rather than cite the post without elaboration, it would be useful to specifically articulate the concerns you still have so that they can be addressed. | |
30 | It's not clear that trying to split this hair *even more* finely is really useful in the absence of code. I think if people can effectively leverage C++ language features when implementing runtime libraries, they should. And I don't think we need to try to preclude that before even seeing the usage. LLVM's libunwind, libc++abi, the sanitizer runtimes, all have benefited from some limited use of C++ language features. Not sure what more evidence is really needed here... | |
60 | The folks developing the sanitizers seem to disagree somewhat? Notably, hand written assembly and other extensions can easily prevent sanitizers from working at all. |
llvm/docs/Proposals/LLVMLibC.rst | ||
---|---|---|
30 | I think "compatible types" in C can be very useful for layering it on the system libc. When C++ fits, sure, using it will be nice. I am just worried that C is precluded from the core implementation. Many functions do not share state, and they benefit little from using C++. I hope these are not disallowed from using C. (Does core implementation mean things like: network, pthread, stdio, regex, locale?) |
Add a placeholder section for ABI stability.
Also add a section to elaborate layering of llvm-libc over system-libc.
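For readers trying to picture the layering this update refers to, here is a minimal sketch in which one entry point can either talk to the kernel or delegate to the system libc. It swaps in a runtime function pointer for clarity, whereas the review discussion expects delegation to happen through linking; it also assumes Linux with glibc, and uses `syscall(SYS_write, ...)` as a stand-in for a raw trap instruction. All `demo_` names are hypothetical.

```c
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

typedef ssize_t (*demo_write_fn)(int fd, const void *buf, size_t n);

/* Backend 1: go to the kernel (stand-in for a trap instruction). */
ssize_t demo_write_via_kernel(int fd, const void *buf, size_t n) {
  return (ssize_t)syscall(SYS_write, fd, buf, n);
}

/* Backend 2: delegate to the vendor-supplied (system) libc. */
ssize_t demo_write_via_system_libc(int fd, const void *buf, size_t n) {
  return write(fd, buf, n);
}

/* The layer above does not care which backend is in use. */
static demo_write_fn demo_write_backend = demo_write_via_system_libc;

void demo_set_write_backend(demo_write_fn fn) { demo_write_backend = fn; }

ssize_t demo_write(int fd, const void *buf, size_t n) {
  return demo_write_backend(fd, buf, n);
}
```

The point is only that callers of `demo_write` are insulated from how the operation reaches the kernel; the same shape would apply to any libcall, not just syscalls.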
llvm/docs/Proposals/LLVMLibC.rst | ||
---|---|---|
37 | I've been lurking -- this is an interesting discussion. If I read JF's comments correctly, I think what he's asking is that you add a section for open questions, one of which would be:
You shouldn't try to answer that question here and now; however, this should be part of the proposal and it should be clear that the answer is to be determined by the community. Now, my personal stance is that a C Standard Library that has no ABI stability guarantees will miss out on many potential users (and hence contributors). I think it should be simple to provide ABI stability for those that need it without impeding those that don't require it (such as Google). We do this in libc++, and a C++ library is immensely harder to keep ABI stable than a C library. However, we should probably move this discussion to the list so that others can chime in. |
llvm/docs/Proposals/LLVMLibC.rst | ||
---|---|---|
37 | I have added a section below called "ABI stability" which says that the stability question is currently open and will be answered some time in future. So, I ask the same question I did previously: does this proposal have to block on getting answers to these questions? I do not want this question of mine to be misinterpreted as a dismissal of the open questions. I totally understand why the questions are important and relevant. But, at the same time, I want to know the expectation here. |
llvm/docs/Proposals/LLVMLibC.rst | ||
---|---|---|
37 | Sorry -- I think I must have written my comment before you added the section, and then posted it without refreshing the page and seeing your update.
I think so, yes. ABI stability, if provided, has to be a core goal of the library. For example, you'll want to control what symbols you export from a shared library (when one is built), and you'll also need to decide what linkage implementation-detail functions have (which impacts whether you provide ABI stability for static archives or not). I don't think it is wise to design these mechanisms and the tests associated with them after the fact. It's much easier (and actually not too hard since this is a greenfield project) to do it from the start. All this, IMO. |
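A minimal sketch of the export-control point, assuming GCC or Clang on an ELF platform: public entry points are marked with default visibility, while implementation details are hidden so they never become part of a shared library's ABI. The macros and function names are hypothetical and are not part of the proposal.

```c
/* Hypothetical annotation macros for a libc built as a shared library. */
#define DEMO_EXPORT __attribute__((visibility("default")))
#define DEMO_HIDDEN __attribute__((visibility("hidden")))

/* Implementation detail: callable from other objects inside the
   library, but not exported from the DSO, so it can change freely. */
DEMO_HIDDEN int demo_abs_impl(int x) { return x < 0 ? -x : x; }

/* Public entry point: part of the exported symbol set, and therefore
   something an ABI stability promise would have to cover. */
DEMO_EXPORT int demo_abs(int x) { return demo_abs_impl(x); }
```

Deciding which functions carry which annotation is much easier while the code is being written than after users have started linking against whatever happened to be exported.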
llvm/docs/Proposals/LLVMLibC.rst | ||
---|---|---|
90 | I think something like this might be better:
I'm being pretty hand wavy and ABI stability requires more than just this. Maybe those who are interested will offer to build tooling. The tone I'm trying to strike is one where potential users and contributors know exactly what to expect, and where they could jump in. |
Update the tone of the "ABI stability" section as suggested.
I based my edits on the suggestion and did not take it exactly as suggested.
Overall I think this hits the important points that were raised in the RFC. I'm not sure everything is doable, but it's healthy to have some "disregard for the impossible" early in a project :-)
I'd love to hear from others who want to contribute to this project, and see them sign off on this high-level plan.
llvm/docs/Proposals/LLVMLibC.rst | ||
---|---|---|
87 | Typo "offser offer" |
I still have big doubts about this. I fear this may have detrimental effect on the ecosystem at large.
(and i'm quite sure this is not just my personal opinion.)
At the very least, there should be some wording that will explicitly disallow lock-in into this library.
In particular, but not limited to, i'm worried about sanitizers, it may be really tempting to drop all the
interceptors and just say "just build against llvm-libc, just like you already do with libc++".
These opinions were clearly expressed in the RFC. I'm not sure what can be done about them. I'm not saying the concerns are invalid, just that I don't know what can be done to alleviate them.
At the very least, there should be some wording that will explicitly disallow lock-in into this library.
In particular, but not limited to, i'm worried about sanitizers, it may be really tempting to drop all the
interceptors and just say "just build against llvm-libc, just like you already do with libc++".
That seems odd for libc docs to do: the sanitizer folks have shown extreme willingness to be open, even collaborating with GCC folks. Some platforms like the BSDs have the libc as part of the system libraries. I simply can't imagine sanitizer people changing their approach based on what's been done before, including sanitizer people working on BSD-like platforms. @kubamracek WDYT?
Really, libc docs are a bad place to legislate how the sanitizers (a different sub-project) do things.
libc++ has no such wording either.
It is hard to address a blanket statement like this. If you have specific concerns, I will be glad to address them. I have tried my best to address most concerns raised on the llvm-dev thread. Not just me; other experienced LLVM contributors (who aren't related to this proposal or project) have pitched in to showcase LLVM's track record in working with other communities like GCC.
If you have new concerns, or find that I have missed addressing something on the llvm-dev thread, feel free to bump it up. I will be glad to answer.
llvm/docs/Proposals/LLVMLibC.rst | ||
---|---|---|
87 | "llvm-libc" is used elsewhere in this document, this is the only instance of "LLVM-libc", I'd use the uniform lowercase spelling. |
Did you build these docs locally? (Honestly asking -- I'm wondering if there's something about my local setup that's overly strict, I see doc errors too frequently). I'm getting errors running ninja docs-llvm-html:
~/src/llvm-project/llvm/docs/Proposals/LLVMLibC.rst:document isn't included in any toctree
This should be added somewhere to llvm/docs/index.rst, or ignored with :orphan: at the top of this file.
I have only been running rst2html on this new file. A sphinx bot has also pointed me to this error. Will fix soon.
Hello all,
Since we already have plain libc++, libc++abi and libunwind, why add an llvm- prefix in front of libc? Just be consistent with our already available names. I think there is no need to name the project "llvm-libc".
I'd drop this part: "llvm-libc is effectively C11 and upwards conformant."
I think it's mostly true, but not worth promising.
I do wonder: will it require passing -std=c17 or later? That's mostly how libc++ works: the C++17 library features require C++17 language.