This is an archive of the discontinued LLVM Phabricator instance.

Move Config.h from Host to Utility
Needs ReviewPublic

Authored by zturner on Feb 25 2019, 4:32 PM.

Details

Summary

Host/Config.h is where we have platform specific preprocessor defines that are configured at CMake time. Then, we can include this file lldb/Host/Config.h and read the value of defines. This is basically identical to what llvm does with llvm/Support/llvm-config.h, the only fundamental differences being: a) LLDB configures it into lldb/Host where as the spiritual equivalent to llvm/Support is lldb/Utility, and b) LLDB calls it Config.h and LLVM calls it llvm-config.h.

This patch brings LLDB in line with LLVM here by configuring into lldb/Utility/lldb-config.h, in part for consistency and in part for more practical reasons.

The practical reasons are that I want to use Socket.h from a new tool / library without having to link all of LLDB such as Clang, Python, etc, and that's not currently possible with the current layering, as linking to Host will link to everything.

So this is a necessary first step. llvm/Support's model for handling platform specific differences is quite convenient both from a usability as well as a maintenance perspective, and I'd like to gradually move towards that whenever an opportunity / need arises. So the immediate plan is to move config.h to Utility, and then start by moving pieces -- as necessary -- from Host to Utility until I can get Socket just by linking to lldbUtility.

Diff Detail

Event Timeline

zturner created this revision.Feb 25 2019, 4:32 PM

Makes sense, no objections from me.

Utility is supposed to be a bunch of stand-alone utility files & headers that we gather together for convenience's sake.

Host is where we put all the code that is specific to one or another host, and any support files that requires. For instance, that's why PlatformLinux and friends are there, and all the independent platform directories. So it is odd to move platform specific defines from Host to Utility.

I don't care much about the name, but having this in Utility when it is setting Host defines and we have a place clearly set out for host support files doesn't seem right.

Utility is supposed to be a bunch of stand-alone utility files & headers that we gather together for convenience's sake.

Host is where we put all the code that is specific to one or another host, and any support files that requires. For instance, that's why PlatformLinux and friends are there, and all the independent platform directories. So it is odd to move platform specific defines from Host to Utility.

I don't care much about the name, but having this in Utility when it is setting Host defines and we have a place clearly set out for host support files doesn't seem right.

That is how Host started out, but I think that that distinction both a) hasn't played out the way we hoped for in practice, and b) isn't necessarily the right line to draw anyway. We have quite a few ifdefs scattered around the codebase for different platforms that are not in Host in places where we couldn't easily (or cleanly) make the code fit into that separation model. On the flip side, lots of stuff has crept into Host that isn't really host-specific, which is why the dependencies are so tangled up.

I think following the LLVM/Support model makes sense. It's "Utility", plus what we would normally think of as "Host". This is a lot cleaner than it sounds, and all of the ifdefs and platform specific defines end up only in private implementation files, with none exposed through headers.

To be honest, I don't mind if we call it something other than Host (for example it could be called System or something), but the practical issue with using Host today is that it still has a ways to go before it can be linked against without pulling in the rest of LLDB, and the changes needed to get there are not trivial, so it will take some more continued effort.

Regardless, in my mind Utility is exactly what you said, but that definition doesn't actually preclude Host-specific things. In fact, putting Host-specific abstractions there solves a lot of problems. Imagine, for example, someone wanted to create a debugging related tool that was not a full-fledged debugger. They might want to use lots of functionality that is in Utility and Host, but not bring in everything else. That's what Utility is to me. It's a set of abstractions that are useful for creating debugging-related tooling that someone can use for that purpose.

I don't necessarily care if it's in Utility or some other newly created library we call something else, but we need some immediate way of using these abstractions without linking against the rest of LLDB so that we're not blocked on finishing the layering split.

One option would be to just keep growing Utility with these kinds of classes and abstractions until we decide that it's too big, at which point a separation would probably fall out naturally (FWIW, this is how LLVM/Support and LLVM/ADT evolved into what they are)

There aren't many more platforms (OpenVMS?) that are likely to show up and need support for lldb, so maybe this is making too much of the matter. But having there be a well defined set of places where you need to look when porting lldb to a new host platform seems a useful design to me. The fact that we weren't strict enough and allowed Host platform dependencies to creep in where they don't belong doesn't argue we should just abandon trying to keep the places where a port sockets into lldb regular.

It's fine to have a staging area ("WillBeHost"?) to help pull out the classes that it is simple to pull out of host. But I'd rather not glom them all into Utility, which is already becoming a little incoherent.

There aren't many more platforms (OpenVMS?) that are likely to show up and need support for lldb, so maybe this is making too much of the matter. But having there be a well defined set of places where you need to look when porting lldb to a new host platform seems a useful design to me. The fact that we weren't strict enough and allowed Host platform dependencies to creep in where they don't belong doesn't argue we should just abandon trying to keep the places where a port sockets into lldb regular.

It's fine to have a staging area ("WillBeHost"?) to help pull out the classes that it is simple to pull out of host. But I'd rather not glom them all into Utility, which is already becoming a little incoherent.

How about "System" then? FTR, I'm also fine eventually renaming it back to Host once all the layering issues are addressed. The important thing is that if it's really just a set of platform-specific abstractions, it shouldn't really depend on anything else (except perhaps Utility).

I guess one advantage to having them be together though is that often even the Utility code itself needs to make use of platform-specific abstractions. You can already see this anywhere that lldb/Utility calls into llvm/Support to do things like read from the file system or start a thread (TBH, I'm actually not even sure it does any of those things, but it seems reasonable that it might want to). So if we make a library called System, then it's possible that System ends up depending on Utility and Utility ends up depending on System. Despite all of mine (and others') push for proper layering, this is one area where I'm actually ok with it. LLVM has the same issue here, where ADT and Support depend on each other and just linked together into one big .lib, despite being in separate folders.

But it's something to keep in mind.

Anyway, does that seem reasonable? Put it in a new folder called System?

I also think it would be great to group all host-specific functionality into a common library (other than Utility, since that already has a lot of other stuff in it).

I understand the desire to start with a clean slate for a new tool, so a new would-be-host folder might be a good middle ground between using host as-is and shoving everything into Utility. It will create some "git blame" churn, but probably won't be that bad overall. I probably wouldn't do it that way, because I think we're not far from "fixing" Host (you can help by reviewing D58167, I am waiting for your responses on the comments there), and I don't think it would be so bad to "accidentally" depend on everything in the interim. However, this seems fine too.

So, bottom line, if everyone is ok with the "staging" System folder, then I am too.

zturner updated this revision to Diff 188393.Feb 26 2019, 9:12 AM

Moved to System instead of Utility

I'm not a fan of introducing another library for this. Without a clear timeline of "fixing" Host it's basically as temporary as anything else in software development. I also think it's confusing (e,g, new files should go where?). If you really want a clean slate we should move everything in System that doesn't depend on anything other than Utility until Host is empty, but I'm sure everyone agrees that unrealistic and not worth the churn.

For the record I also very much support a Host library instead of having everything in Utility. That being said, I could see the config file being special enough (in that everything depends on it) to be the exception.

It seems like we could drop the dependencies on Core and Symbol by moving Symbols somewhere else. How much work would there be left after Pavel's patch?

I'm not a fan of introducing another library for this. Without a clear timeline of "fixing" Host it's basically as temporary as anything else in software development. I also think it's confusing (e,g, new files should go where?). If you really want a clean slate we should move everything in System that doesn't depend on anything other than Utility until Host is empty, but I'm sure everyone agrees that unrealistic and not worth the churn.

For the record I also very much support a Host library instead of having everything in Utility. That being said, I could see the config file being special enough (in that everything depends on it) to be the exception.

It seems like we could drop the dependencies on Core and Symbol by moving Symbols somewhere else. How much work would there be left after Pavel's patch?

Moving Symbols somewhere else isn't exactly easy though. There's a lot of interdependencies that are slightly different on every Host platform, so it creates a lot of breakage doing things this way.

FWIW, I actually wouldn't mind moving everything out of Host into System until Host is basically empty. My next step after this patch was going to be to move MainLoop followed by Socket into System (those are my actual goals anyway). In some ways, approaching the dependency problem from both directions this way is more effective than approaching it from only one side, as this way eventually Host becomes so small that you can just take whatever is left over and have it be subsumed by some other target (probably Core or Symbols).

All of that said, I'm not as sold that having a separate library just for Host is even terribly useful. I'll go along with it if that's what everyone wants, but at the end of the day, this library (whether it be called Host, or System, or something else) will almost certainly depend on Utility, and Utility will almost certainly depend on it (for the same reason that Utility currently depends on llvm/Support). This is why in LLVM/Support there are both non-Host specific things, such as StringExtras.h, and host specific things, such as FileSystem.h, Thread.h, etc. I don't think that's actually so bad in practice.

So my vote is honestly still to just put this in Utility for consistency with LLVM, but I can agree with System or something as part of a transition if it gets us closer to being able to use these utility classes without a dependency on the whole debugger.

BTW, what's the reason that you have to have this split now? I understand its attractiveness from a Zen perspective, but it seems to me that at the end of the day, it doesn't have that much impact on the new code you're about to write. It seems to me you could still enforce a clean layering on the new code by just saying "only ever include stuff from Host or Utility". The fact that Host transitively pulls in the kitchen sink is unfortunate, but I don't think it should impact you much. And when we finally clean up Host (which is something that we'll do eventually, and I hope not too far from now), then the new code will magically stop depending on the whole world without any extra effort.

I'm not a fan of introducing another library for this. Without a clear timeline of "fixing" Host it's basically as temporary as anything else in software development. I also think it's confusing (e,g, new files should go where?). If you really want a clean slate we should move everything in System that doesn't depend on anything other than Utility until Host is empty, but I'm sure everyone agrees that unrealistic and not worth the churn.

For the record I also very much support a Host library instead of having everything in Utility. That being said, I could see the config file being special enough (in that everything depends on it) to be the exception.

I would actually say that almost nothing should depend on the config file :). I mean this in the sense that the dependencies should be encapsulated similarly to how llvm encapsulates all of the external dependencies and the rest of the code should just use those abstractions.

However, I too have come to the conclusion that the config file could be an exception. My reason for that is that for instance our XML abstraction seems to be misplaced in the Host library. It's kind of true that whether libxml is present is a feature of the host system, but that's a fairly odd way of looking at things. Right not I am incubating this idea of creating a "Formats" library which would house the minidump parser, gdb-remote protocol support code and similar, XML support seems like it could fit nicely there. Having the LIBXML_AVAILABLE macro in an easily accessible place would help make that happen.

It seems like we could drop the dependencies on Core and Symbol by moving Symbols somewhere else. How much work would there be left after Pavel's patch?

Yea, symbols is going to be the trickiest part. I've been preparing myself (both code-wise and mentally) to tackle that by first removing all the other easy obstacles first. I have a sort of an idea on how to handle that, but I don't want to go into that here.

I suppose one way to make progress here would be to move Symbols into some weird temporary package. Then cleaning up the rest of Host should be easy, and the temporary package could disappear as soon as Symbols has been dealt with. It might make my job of cleaning it up easier, as I could just move the modularized functionality back into Host piece by piece. It shouldn't even create additional history churn, as that code will need to be rewritten anyway.

JDevlieghere added a comment.EditedFeb 26 2019, 4:17 PM

I'm not a fan of introducing another library for this. Without a clear timeline of "fixing" Host it's basically as temporary as anything else in software development. I also think it's confusing (e,g, new files should go where?). If you really want a clean slate we should move everything in System that doesn't depend on anything other than Utility until Host is empty, but I'm sure everyone agrees that unrealistic and not worth the churn.

For the record I also very much support a Host library instead of having everything in Utility. That being said, I could see the config file being special enough (in that everything depends on it) to be the exception.

It seems like we could drop the dependencies on Core and Symbol by moving Symbols somewhere else. How much work would there be left after Pavel's patch?

Moving Symbols somewhere else isn't exactly easy though. There's a lot of interdependencies that are slightly different on every Host platform, so it creates a lot of breakage doing things this way.

FWIW, I actually wouldn't mind moving everything out of Host into System until Host is basically empty. My next step after this patch was going to be to move MainLoop followed by Socket into System (those are my actual goals anyway). In some ways, approaching the dependency problem from both directions this way is more effective than approaching it from only one side, as this way eventually Host becomes so small that you can just take whatever is left over and have it be subsumed by some other target (probably Core or Symbols).

All of that said, I'm not as sold that having a separate library just for Host is even terribly useful. I'll go along with it if that's what everyone wants, but at the end of the day, this library (whether it be called Host, or System, or something else) will almost certainly depend on Utility, and Utility will almost certainly depend on it (for the same reason that Utility currently depends on llvm/Support). This is why in LLVM/Support there are both non-Host specific things, such as StringExtras.h, and host specific things, such as FileSystem.h, Thread.h, etc. I don't think that's actually so bad in practice.

So my vote is honestly still to just put this in Utility for consistency with LLVM, but I can agree with System or something as part of a transition if it gets us closer to being able to use these utility classes without a dependency on the whole debugger.

BTW, what's the reason that you have to have this split now? I understand its attractiveness from a Zen perspective, but it seems to me that at the end of the day, it doesn't have that much impact on the new code you're about to write. It seems to me you could still enforce a clean layering on the new code by just saying "only ever include stuff from Host or Utility". The fact that Host transitively pulls in the kitchen sink is unfortunate, but I don't think it should impact you much. And when we finally clean up Host (which is something that we'll do eventually, and I hope not too far from now), then the new code will magically stop depending on the whole world without any extra effort.

I'm not a fan of introducing another library for this. Without a clear timeline of "fixing" Host it's basically as temporary as anything else in software development. I also think it's confusing (e,g, new files should go where?). If you really want a clean slate we should move everything in System that doesn't depend on anything other than Utility until Host is empty, but I'm sure everyone agrees that unrealistic and not worth the churn.

For the record I also very much support a Host library instead of having everything in Utility. That being said, I could see the config file being special enough (in that everything depends on it) to be the exception.

I would actually say that almost nothing should depend on the config file :). I mean this in the sense that the dependencies should be encapsulated similarly to how llvm encapsulates all of the external dependencies and the rest of the code should just use those abstractions.

However, I too have come to the conclusion that the config file could be an exception. My reason for that is that for instance our XML abstraction seems to be misplaced in the Host library. It's kind of true that whether libxml is present is a feature of the host system, but that's a fairly odd way of looking at things. Right not I am incubating this idea of creating a "Formats" library which would house the minidump parser, gdb-remote protocol support code and similar, XML support seems like it could fit nicely there. Having the LIBXML_AVAILABLE macro in an easily accessible place would help make that happen.

It seems like we could drop the dependencies on Core and Symbol by moving Symbols somewhere else. How much work would there be left after Pavel's patch?

Yea, symbols is going to be the trickiest part. I've been preparing myself (both code-wise and mentally) to tackle that by first removing all the other easy obstacles first. I have a sort of an idea on how to handle that, but I don't want to go into that here.

I suppose one way to make progress here would be to move Symbols into some weird temporary package. Then cleaning up the rest of Host should be easy, and the temporary package could disappear as soon as Symbols has been dealt with. It might make my job of cleaning it up easier, as I could just move the modularized functionality back into Host piece by piece. It shouldn't even create additional history churn, as that code will need to be rewritten anyway.

I think this inverted approach makes more sense than pulling everything out of host that doesn't depend on anything outside of Utility. Seems like that would create less churn, leave host as the canonical library, and have the same effect. Zachary, how do you feel about that?

BTW, what's the reason that you have to have this split now? I understand its attractiveness from a Zen perspective, but it seems to me that at the end of the day, it doesn't have that much impact on the new code you're about to write. It seems to me you could still enforce a clean layering on the new code by just saying "only ever include stuff from Host or Utility". The fact that Host transitively pulls in the kitchen sink is unfortunate, but I don't think it should impact you much. And when we finally clean up Host (which is something that we'll do eventually, and I hope not too far from now), then the new code will magically stop depending on the whole world without any extra effort.

I suppose that actually is fine.

JDevlieghere resigned from this revision.Feb 28 2019, 1:35 PM

I believe this isn't relevant anymore after moving Symbols?

jgorbe resigned from this revision.Mar 26 2019, 5:13 PM