The only real consumer of this is the Process class, so it makes sense (both conceptually and from a layering standpoint) to hide this in the Process plugin.
This is intended to be NFC.
Differential D31172
Move stop info override callback code from ArchSpec into Process labath on Mar 20 2017, 11:51 PM. Authored by
Details The only real consumer of this is the Process class, so it makes sense (both conceptually and from a layering standpoint) to hide this in the Process plugin. This is intended to be NFC.
Diff Detail Event TimelineComment Actions It looks like you are bundling two changes into one patch, one to move the StopInfoOverrideCallback from ArchSpec to Process, and one to change how the help for the list of architectures is computed. Are these linked in some way I'm missing? I don't have a strong opinion about whether this callback belongs in ArchSpec, but I don't think it belongs in Process. We've been trying hard to keep the algorithms in Process architecture independent, and that seems a worthy goal since otherwise it's going to be too easy to let little architecture specific hacks creep in and make the logic harder to follow. If you have some strong reason to take this out of ArchSpec then maybe we need another container for architecture specific algorithms. But I don't think Process is the right place for this. Comment Actions The architecture help change snuck in, I already removed it locally but didn't re-upload. As for why I'm trying to get this out of ArchSpec, it turns out ArchSpec is one of the biggest causes of dependency cycles. It's used all over the place, which triggers a dependency from Everywhere -> Core, and then it depends on Target, a couple of Plugins, and Host. There's only a very small amount of code in ArchSpec that accounts for all of these dependencies though. The first is this code that I'm attempting to move, the second is the code to set the triple given a Platform. So I wanted to tackle these one at a time. One option would be to make lldb/Process/StopInfoOverride.h and cpp, and put the code there. Thoughts? Comment Actions None of this isn't modeling the situation particularly clearly, since we have "architecture specific modifications to general behaviors" that aren't coming in through plugins so that it would be easy to modify the behavior for new architectures. Instead, we're requiring new architectures to go in and edit supposedly generic code. We put off that abstraction because it was unclear what we would need it to do, and because of that for the most part we just put these architecture specific behaviors inline in various places. Then this one callback had nowhere to go but ArchSpec which is the only logical place for architecture specific code to accumulate. It doesn't seem like we should gate cleaning up ArchSpec on that larger issue, however. Rather than introducing another file, it seems simpler to make this part of StopInfo, since it is logically modifying the StopInfo for a thread. It isn't right there, but it's any worse, and StopInfo is closer to the machine (though so far mostly closer to the Platform) than Process should be. And StopInfo's already have to know about pretty much all the details of Process/Thread/etc... so you wouldn't be adding knowledge to them that isn't appropriate. I'd also suggest removing the "callback" name from it since it really isn't a general purpose callback, it is absolutely determined by the ArchSpec. For instance if you were to use anything but the ARM one for ARM debugging you would break debugging. It's not optional... Instead I'd call it something like InvokeStopInfoArchOverride. That will also make it clear that this is a target for gathering up into some "ArchSpec specific behaviors" if/when we get around to that. Comment Actions It almost seems like we need some sort of an architecture plug-in here. Maybe something like Architecture plugins. The Architecture::FindPlugin() would take an ArchSpec and return a lldb_private::Architecture class instance that can be cached in the target or process. Currently the class would look like: class Architecture : public PluginInterface { Error StopInfoOverride(...); }; Then using the target's arch, it can lookup the right one and cache it in the target or process? Comment Actions It looks like this has ground to a halt, but it seems the consensus was to try the Architecture plugin idea. I'm going to hijack this revision to maintain context, and upload a new version implementing that. Comment Actions A few inline fixes, but this looks great. Very close
Comment Actions This is a good addition. However, it seems to me that since you only need an ArchSpec to make one of these Architecture plugins, which plugin you get seems fully determined by the Target, not the Process. I understand that the only need for it at present is to do a Process-related task. But that task actually takes a Thread as a parameter, so the Architecture plugin doesn't need to know it's containing process to do its job. And it seems like it diminishes the plugin's future utility to have it more limited in scope than it needs to be. What do you think? Comment Actions I'm not sure I understand what you are proposing here. Is it to have the architecture plugin store a Target as a member variable? If that's the case, then I'd say let's wait until a need for that arises. I am not fundamentally against the idea, but I don't see a reason to add it while we don't have a need for it. Comment Actions I think is saying we should just store the Architecture plug-in in the target instead of in the process. It will also need to be cleared when the target arch is changed. Comment Actions Sorry I wasn't clear. I meant that since the Target knows everything it needs to know to vend the correct Architecture plugin, you should get it from the Target, not the Process. In general, I think that the highest class in the stack that can vend a plugin is the one that should. Comment Actions Ah, got it. That sounds like a good idea, I'll switch this over to Target.
Comment Actions Moving the plugin to Target and other minor fixups. I also wrote a test for the functionality covered by the override callback. (I'm Comment Actions Added a suggestion to make Target::GetArchitecturePlugin lazy and to ask if the arch is supported. Most of the time we will set this once and never need to change it, even when we attach, change arch, etc. The new suggestion will allow us to use the same arch plug-in without re-fetching it. Let me know what you think as the changes are thrown out there to see what people's opinions are.
Comment Actions Well... I don't feel strongly about it, but it's not my preferred solution (the IsSupported() call feels hacky). But since we're already throwing ideas out there, here are two of mine:
Comment Actions That would be fine too.
It would need to be the other way around. Store the Architecture plug-in inside of the ArchSpec since we only have the "arm" Architecture plug-in and architectures don't need to make one of these unless the arch has special things it needs to do. I am fine with either approach and fine not doing that approach I suggested. I just don't want this to ever get out of sync. So maybe the easiest way it to ask the ArchSpec object for it's architecture plug-in? If there is no state maintained in these plug-ins, we might just have a single instance of each Architecture plug-in that everyone uses (across targets)? Then everyone doesn't need to hold onto a shared/unique_ptr and can just fetch it from the Plugin manager each time? Comment Actions I've tried that, and eventually decided against it, as it would create an uncomfortable mutual recursion between SetArchitecture and SetExecutableModule.
We can't have ArchSpec vend the plugin, as that would cause ArchSpec to depend on the whole world again (which is what we are trying to remove by this). I don't see why the plugin could not hold the ArchSpec -- for architectures which don't need special processing, we could just have a default dumb "plugin" that does nothing but vend the ArchSpec. The alternative would be to have a third object which holds both ArchSpec and the plugin, and whose only job is to make sure these two are in sync -- it would basically be a glorified std::pair with a custom assignment operator. I think this would actually be a fairly clean solution -- the only problem I have is that I don't know how to name the new class. I'll upload it so you can see how it looks like. Comment Actions I'm back now, and I'd like to try to push this patch to completion. After re-reading the discussion, I got the impression we have mostly reached a consensus here. A small issue remained about how to guarantee that the Architecture plugin and the ArchSpec object are in sync. Several versions were thrown around, with no clear conclusion. Does anyone hav objections to how this part is implemented in the last version of the patch? If not, I'd like to put this in... Comment Actions I know you're doing things the way it's always been done, but I want to start questioning some long-standing practices :) So I'm not picking on you specifically, but anywhere we can migrate towards something better incrementally, I think we should do so.
Comment Actions
I haven't seen the paper, but my guess is that this is on linux and lldb wasn't using any accelerator tables for the dwarf of their inferior test program. gdb was designed before there were any accelerator tables, so in their absence, it will create "partial symbol tables" with the address ranges and globally visible names of each CU. When details are needed for a specific CU, the psymtab is expanded into a symbol table, symtab. DWARF is scanned on initial load to create the psymtab names, and then when the symtab is created, it is ingested a CU at a time. nb: my knowledge of lldb is a decade old at this point, I have not kept up with the project. lldb was designed with the assumption that we could get these global symbol names from accelerator tables - we would map them into our address space and search for names in them. When we are interested in a specific type or function or variable, we will read only the DWARF related to that type/function/variable. In the absence of accelerator tables, I am willing to believe that lldb exhibits poor memory and startup performance -- that was not something we (at apple) concentrated on the performance of. It's a completely valid area for future work in lldb -- adopting to using accelerator tables other than apple's vendor specific ones. I suspect, in an absence of concrete information, that the measurement of lldb's memory use being primarily strings, and the perf issues mentioned in this paper, are all due to this non-performant code path on Linux. If we see these memory issues when there are accelerator tables, then that's a big problem and I'll be shocked if we can actually be out-performed by gdb. They were the base line that we designed ourselves to do better than. We can expend a lot of energy on making string tables smaller, and of course I wouldn't object to people working in that direction. But I think the real solution is to get lldb on non-darwin platforms to be using the accelerator tables that exist and allowing the full design of lldb to work as is intended. My two cents. Comment Actions I will ping them for some numbers and more details of their test setup. Regardless, I didn't mean to derail the code review. But, I really really want to reach a point where we can stop falling back on the "we need to be safe even in the presence of stuff that is clearly not user input" argument. I understand the concerns, but I don't think this is a reasonable path forward for the project. If it's not user input, if we own it, then we can make whatever assumption we want that leads to the best performance and memory usage characteristics. As I said this is a pretty uninteresting case, and probably makes no difference since we're talking about 30-40 strings in the whole program, so I'll leave it at that. But I will probably push on this harder in the future if it's in some other area of code that has a larger impact. |