This is an archive of the discontinued LLVM Phabricator instance.

Move RegisterValue,Scalar,State from Core to Utility
ClosedPublic

Authored by labath on Jul 24 2018, 9:30 AM.

Details

Summary

These three classes have no external dependencies, but they are used
from various low-level APIs. Moving them down to Utility improves
overall code layering (although it still does not break any particular
dependency completely).

Diff Detail

Repository
rL LLVM

Event Timeline

labath created this revision.Jul 24 2018, 9:30 AM

Did you test that with lldb's C++ modules? If not I can test it locally.

Did you test that with lldb's C++ modules? If not I can test it locally.

I didn't, but I don't expect there to be any issues, as I've made sure that the moved files don't include anything outside of lldb/Utility. You're welcome to try it out, of course.

Sounds reasonable. It seems both arcanist and patch can't handle the patch file, so let's just see how it goes post-commit.

labath added a comment.Aug 3 2018, 3:24 AM

Does anyone have thoughts on this? (I have one more move like this lined up (for the Listener/Broadcaster classes), and then I should be done with these kinds of changes for a while.)

zturner added a subscriber: labath.Aug 3 2018, 8:36 AM

No objections here

I worry a little bit about Utility getting kind of incoherent. Some of it is clearly utility, some is more there because that's where we are putting the code we're trying to reduce dependencies on so we can share then with lldb-server. RegisterValue doesn't for instance seem like a "Utility" class to me. But maybe this is something to revisit once you've gotten at least what you need for lldb-server done?

I worry a little bit about Utility getting kind of incoherent. Some of it is clearly utility, some is more there because that's where we are putting the code we're trying to reduce dependencies on so we can share then with lldb-server. RegisterValue doesn't for instance seem like a "Utility" class to me. But maybe this is something to revisit once you've gotten at least what you need for lldb-server done?

I think being a mixed bag is kind of an inherent property of the lowest layers. LLVM's Support library is also a collection of utilities, which have very little in common, but can be combined in various interesting ways to do cool stuff.

I am not opposed to opening the discussion of splitting and/or putting these three classes into a difference place right now. If you have an idea on how to organize it, I'd like to hear it.

However, I have to say, that RegisterValue actually *does* seem like a good fit for Utility to me -- it's the kind of a thing (representing a "value" of a "register") that every conceivable debugger will need. So it seems good to have it in a place where everybody can use (Utilize :)) it. It's also not host-dependent, as we want to represent registers on remote machines too, so it does not seem like it belongs to Host (although putting it there would have been enough for my present needs).

The question that I am not too clear on is what then becomes of "Core", which I think had a similar design goal in mind. Once I am finished here, it will become kind of empty. I guess my best current definition for it would be that it is the "core" of this single specific debugger called "liblldb", which ties all of the interesting components from other modules together, and which probably don't make sense to reuse in a different debugger (e.g. for Zachary's tracing debugger, he could conceivably find a use for RegisterValue or other stuff in Utility, but it's unlikely he would ever want to use the Debugger class). However, I am not really clear on which of our current classes would fit this description.

Looking over the current contents of the Utility module, I see only a couple of things that I would not choose to put there:

  • CompletionRequest - this sounds like it could go next to the command interpreter
  • SafeMachO - ObjectFileMachO ?
  • SelectHelper - Host (probably somehow merged with the MainLoop class)

However, I'm not too bothered by them being there either, and I doubt this would help with the issues that are troubling you.

Several months ago when this came up, i hypothesized that Utility would
eventually contain a mix of random stuff some of which may not make sense
in the long term. But that in the meantime, it’s useful to have some sort
of layering abstraction for “has no other dependencies”. Eventually when
this reaches critical mass, a natural separation should present itself,
because all the stuff that conceptually belongs together will be in
Utility. So then we can break Utility up into a more logical separation

First off, I'm fine with Zachary's suggestion that we let the dust settle a bit before we try to organize things better. We'll probably do a better job then.

But just to use these cases, for instance, Scalar is the base of how we realize values in the debugger (ValueObject lives on top of this). It would be good if its landing place was such that if I was looking at how we represent values it was logically placed for that. State is part of how we present the Process current state, so it seems like it should be grouped with other process related entities. And RegisterValue belongs with the other parts of lldb that take apart the machine level state of a process.

It will probably go away, but FastDemangle really belongs around the language handling code. I agree that Utility is an odd place for CompletionRequest...

SafeMachO is weird. It gets used in Host - both common & macosx, and we're trying not to include out of Plugins so the MachO object file plugin directory doesn't seem right. Maybe Host is a better place, it's reason for being is so you can include both llvm/Support/MachO.h and mach/machine.h, so even though it's not directly a host functionality it look a bit like that. Not sure.

There's also tension between "things that belong together functionally" and "things that need to be separated because we want to layer one strictly on top of the other, since we seem to be treating directories as a representation of dependencies. Do we want to have a Values directory with ValueObject at the top level, and Scalar in a no-dependency subdirectory?

TTTT, if I need to find a file these days, I either Cmd-Click on a type to go to its definition, or type Cmd-Shift-O and start typing some bits of the file name and Xcode finds it for me. I'm pretty familiar with the code, so I don't need a higher level skeleton to help me figure out what all is in this project. I think at this point many of us are in this state...

But once you get past the build issues, the real audience for this level of organization is folks coming new to the project. It would be good to hear from some of the newer folks what they found confusing or what would have helped them navigate around the project as they are starting out.

For the Register stuff, for example, I think it could make sense for it to be in a project such as HAL (Hardware Abstraction Layer) or something similarly named. Everything that describes properties of specific CPUs could go there, perhaps even including ArchSpec. For the Scalar stuff, perhaps a target called Visualization (or even just Value). Once more and more stuff starts ending up in Utility, I think it makes sense to try something like this.

CompletionRequest - this sounds like it could go next to the command interpreter

Yeah, makes sense. Even though Utility classes then can either not offer completion methods (currently only ArchSpec is doing that) or work around that with a forward decl. But I think that's not a big issue.

First off, I'm fine with Zachary's suggestion that we let the dust settle a bit before we try to organize things better. We'll probably do a better job then.

Thanks. I agree this can wait, but I also do have some ideas below on what could done differently right now for the Scalar and State classes. Let me know if you think I should try that instead. (Otherwise, I'll just go ahead with this.)

But just to use these cases, for instance, Scalar is the base of how we realize values in the debugger (ValueObject lives on top of this). It would be good if its landing place was such that if I was looking at how we represent values it was logically placed for that.

I think putting Scalar next to the ValueObject stuff would make sense, if the ValueObjects were the only user of this class. However, that isn't the case right now. The reason I am touching this class in the first place is that it is being used from within the RegisterValue class. This means it is more than just a ValueObject helper, but more like a self-contained "utility" which can be reused in different contexts.

Now, I am not actually sure whether using Scalar within RegisterValue is a good idea (seems a bit overkill for that), but it seemed to work, so I didn't want to disturb that (and I do believe that Scalar could be useful outside of the ValueObject hierarchy). However, I can take another look at what it would take to stop using Scalar in the context of RegisterValues, which would free us up to move RegisterValue without touching the Scalar class.

State is part of how we present the Process current state, so it seems like it should be grouped with other process related entities.

State is a bit funny. It is currently used from both liblldb and lldb-server, but these use different hierarchies for "process-related entities", so that's why I'm moving it to a common place. However, I can certainly see a case for LLGS and liblldb having different State enums. Sharing it doesn't buy us much (ATM it's just the StateAsCString function), and using a different enum in LLGS would have another benefit. Right now it only uses a subset of all State values, so using an different enum would allow us to capture the possible states more precisely. Doing that instead of moving State.h around should be easy enough.

And RegisterValue belongs with the other parts of lldb that take apart the machine level state of a process.

I am not sure which parts are *you* thinking about here, but I think it would be nice to have this class together with all the definitions of RegisterInfo structs and associated constants. These are now mostly in Plugin/Process/Utility, but so is a lot of other stuff. At one point I will get around to worrying about those too, so maybe then we could move all of these things to a separate module.

It will probably go away, but FastDemangle really belongs around the language handling code. I agree that Utility is an odd place for CompletionRequest...

SafeMachO is weird. It gets used in Host - both common & macosx, and we're trying not to include out of Plugins so the MachO object file plugin directory doesn't seem right. Maybe Host is a better place, it's reason for being is so you can include both llvm/Support/MachO.h and mach/machine.h, so even though it's not directly a host functionality it look a bit like that. Not sure.

I think Host makes sense. None of the other code (except the NativeProcessXXX classes, I guess) should really be including OS-specific stuff, so there shouldn't be a need (in an ideal world, I don't know how far we are from it right now) for this header anywhere except Host code.

There's also tension between "things that belong together functionally" and "things that need to be separated because we want to layer one strictly on top of the other, since we seem to be treating directories as a representation of dependencies. Do we want to have a Values directory with ValueObject at the top level, and Scalar in a no-dependency subdirectory?

TTTT, if I need to find a file these days, I either Cmd-Click on a type to go to its definition, or type Cmd-Shift-O and start typing some bits of the file name and Xcode finds it for me. I'm pretty familiar with the code, so I don't need a higher level skeleton to help me figure out what all is in this project. I think at this point many of us are in this state...

But once you get past the build issues, the real audience for this level of organization is folks coming new to the project. It would be good to hear from some of the newer folks what they found confusing or what would have helped them navigate around the project as they are starting out.

Yes, that's certainly a good point. I'd like to hear that too.

CompletionRequest - this sounds like it could go next to the command interpreter

Yeah, makes sense. Even though Utility classes then can either not offer completion methods (currently only ArchSpec is doing that) or work around that with a forward decl. But I think that's not a big issue.

I am not too bothered by that either, but I think that for the thing that the ArchSpec function is doing, passing it the entire CompletionRequest is overkill. Ideally, I'd set things up such that ArchSpec and friends don't have any knowledge of "completion" -- they would just offer an interface for accessing all possible values(*), and then completion could be built on top of that. This way the data source can be independent of the actual algorithm computing the completions.

(*) The interface could be as simple as an iterator over all ArchSpec string values. However, this may not be good enough in terms of performance (it wouldn't matter for ArchSpec, but it might for some other classes), in which case we might want to do something slightly more fancy (GiveMeAllValuesStartingWith("foo")).

First off, I'm fine with Zachary's suggestion that we let the dust settle a bit before we try to organize things better. We'll probably do a better job then.

Thanks. I agree this can wait, but I also do have some ideas below on what could done differently right now for the Scalar and State classes. Let me know if you think I should try that instead. (Otherwise, I'll just go ahead with this.)

It doesn't seem like you're doing anything that is going to be hard to undo, so I'm fine with just doing this as you have it for now. Maybe see if SafeMachO can go in Host, but that's the only one that seems easy and not requiring deeper thought.

But just to use these cases, for instance, Scalar is the base of how we realize values in the debugger (ValueObject lives on top of this). It would be good if its landing place was such that if I was looking at how we represent values it was logically placed for that.

I think putting Scalar next to the ValueObject stuff would make sense, if the ValueObjects were the only user of this class. However, that isn't the case right now. The reason I am touching this class in the first place is that it is being used from within the RegisterValue class. This means it is more than just a ValueObject helper, but more like a self-contained "utility" which can be reused in different contexts.

Now, I am not actually sure whether using Scalar within RegisterValue is a good idea (seems a bit overkill for that), but it seemed to work, so I didn't want to disturb that (and I do believe that Scalar could be useful outside of the ValueObject hierarchy). However, I can take another look at what it would take to stop using Scalar in the context of RegisterValues, which would free us up to move RegisterValue without touching the Scalar class.

I'm not sure how much of Scalar RegisterValue needs when it isn't being used to back a ValueObjectRegister. Scalar is a pretty ambitious class, since it supports everything you would need to back DWARF expression manipulations. RegisterValue does get used in a bunch of places stand alone, though a lot of that just seems to be setting or dumping.

But I guess I'm looking for an organization based on "groups by functionality" rather than "who uses what". So even though other actors might use Scalar, it's reason for being is to present values. For the purposes of understanding the code, grouping all the files that work together for a task seems to me like it would help. But again, this is more in the service of people newer to the code.

State is part of how we present the Process current state, so it seems like it should be grouped with other process related entities.

State is a bit funny. It is currently used from both liblldb and lldb-server, but these use different hierarchies for "process-related entities", so that's why I'm moving it to a common place. However, I can certainly see a case for LLGS and liblldb having different State enums. Sharing it doesn't buy us much (ATM it's just the StateAsCString function), and using a different enum in LLGS would have another benefit. Right now it only uses a subset of all State values, so using an different enum would allow us to capture the possible states more precisely. Doing that instead of moving State.h around should be easy enough.

And RegisterValue belongs with the other parts of lldb that take apart the machine level state of a process.

I am not sure which parts are *you* thinking about here, but I think it would be nice to have this class together with all the definitions of RegisterInfo structs and associated constants. These are now mostly in Plugin/Process/Utility, but so is a lot of other stuff. At one point I will get around to worrying about those too, so maybe then we could move all of these things to a separate module.

It will probably go away, but FastDemangle really belongs around the language handling code. I agree that Utility is an odd place for CompletionRequest...

SafeMachO is weird. It gets used in Host - both common & macosx, and we're trying not to include out of Plugins so the MachO object file plugin directory doesn't seem right. Maybe Host is a better place, it's reason for being is so you can include both llvm/Support/MachO.h and mach/machine.h, so even though it's not directly a host functionality it look a bit like that. Not sure.

I think Host makes sense. None of the other code (except the NativeProcessXXX classes, I guess) should really be including OS-specific stuff, so there shouldn't be a need (in an ideal world, I don't know how far we are from it right now) for this header anywhere except Host code.

Yes, that makes sense to me.

There's also tension between "things that belong together functionally" and "things that need to be separated because we want to layer one strictly on top of the other, since we seem to be treating directories as a representation of dependencies. Do we want to have a Values directory with ValueObject at the top level, and Scalar in a no-dependency subdirectory?

TTTT, if I need to find a file these days, I either Cmd-Click on a type to go to its definition, or type Cmd-Shift-O and start typing some bits of the file name and Xcode finds it for me. I'm pretty familiar with the code, so I don't need a higher level skeleton to help me figure out what all is in this project. I think at this point many of us are in this state...

But once you get past the build issues, the real audience for this level of organization is folks coming new to the project. It would be good to hear from some of the newer folks what they found confusing or what would have helped them navigate around the project as they are starting out.

Yes, that's certainly a good point. I'd like to hear that too.

labath added a comment.Aug 7 2018, 3:56 AM

Thanks. I'll commit this and look at moving SafeMachO next.

But I guess I'm looking for an organization based on "groups by functionality" rather than "who uses what". So even though other actors might use Scalar, it's reason for being is to present values. For the purposes of understanding the code, grouping all the files that work together for a task seems to me like it would help.

I don't this these are that much different objectives. If Scalar is used by RegisterValue then, whether we like it or not, it's functionality is also to provide services to that class. That certainly wasn't what the class was written for originally, but it is what we have now.

When working on ValueObjects, it would be logical to have Scalar next to them. However, I don't think having Scalar somewhere else would be too surprising either. After all, displaying debugger values is a complex functionality, so it's not surprising it uses some generic number manipulation class (which is what Scalar would be in this world view). On the other hand, I would be *very* surprised if I was working on code reading a register from process memory and found out it references a class which purports to do dwarf expression evaluation.

This revision was not accepted when it landed; it landed in state Needs Review.Aug 7 2018, 4:08 AM
This revision was automatically updated to reflect the committed changes.
lldb/trunk/unittests/Core/CMakeLists.txt