Page MenuHomePhabricator

Remap the target SDK directory to the host SDK directory
ClosedPublic

Authored by aprantl on Mar 19 2020, 6:51 PM.

Details

Summary

This is mostly useful for Swift support; it allows LLDB to substitute a matching SDK it shipped with instead of the sysroot path that was used at compile time.

rdar://problem/60640017

Diff Detail

Event Timeline

aprantl updated this revision to Diff 251526.Mar 19 2020, 6:51 PM
aprantl created this revision.
labath requested changes to this revision.Mar 20 2020, 1:56 AM
labath added a subscriber: labath.

I think this needs a lot more discussion. First, there's some weird layering going on here, where the class SDK is declared in lldb/Utility, but it's implemented in PlatformDarwin. But even before we sort that out, we probably need to figure out what exactly is the scope of the new class (e.g. should it cover only Mac sdks, or more).

Then there's the question of the usage of the host platform in the Module class, which is not very ideal. Elsewhere, one would ask the target for it's current platform and use that, but since Modules are idependent of a target, that can't happen here. Still, I don't think that jumping to the host platform is the right solution to that.

It also needs tests. The SDK class looks to be perfectly unit-testable, and if we include the "module sdk" (or maybe the resultant path mappings) in its "dump" output, then we could use lldb-test to test it's interaction with the module and dwarf classes too.

lldb/include/lldb/Utility/SDK.h
18–19 ↗(On Diff #251526)

If this is going to be specific to apple sdks, then it should have a less generic name (other platforms have sdks too).

20 ↗(On Diff #251526)

Are all of these ConstStrings really needed? The code uses StringRefs for string manipulation anyway, and it's very doubtful that any of this is going to be a performance bottleneck. OTOH, each ConstString adds some junk to the global string pool which never gets deleted.

lldb/source/Core/Module.cpp
1603

Routing this via host platform seems like a bad design choice. Theoretically, if I am cross-debugging a linux binary, I should be able to ask some entity which knows about linux sysroots even if I am on a mac.

And indeed, we don't have that many callers of Platform::GetHostPlatform, and none of them are in the Module hierarchy.

lldb/source/Plugins/Platform/MacOSX/PlatformDarwin.cpp
1285–1358

Implementing this inside PlatformDarwin.cpp is super weird.

lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.cpp
670–681 ↗(On Diff #251526)

Unless we're planning to add these methods to llvm::DWARFUnit, they should go someplace else. There's no reason why this functionality needs to be implemented from within DWARFUnits.

This revision now requires changes to proceed.Mar 20 2020, 1:56 AM
shafik added a subscriber: shafik.Mar 20 2020, 9:34 AM
shafik added inline comments.
lldb/source/Plugins/Platform/MacOSX/PlatformDarwin.cpp
1318

We should document what we expect the format to be. I mean I can deduce it from the code but I would prefer to have it spelled out and then check the code parses the expected format correctly.

1336

We should have tests for this and the rest of the methods here.

Thanks, Pavel, these are all very valid concerns and in retrospect I should have at least slapped an RFC label on this before dumping into phabricator last night. The goal of this (and I should have mentioned that in the description) is to make the Xcode SDK something that behaves more like the compiler's resource directory, as in that it ships with LLDB rather than with the debugged program. This important primarily for importing Swift and Clang modules in the expression evaluator, and getting at the APINotes from the SDK in Swift.

For a cross-debugging scenario, this means you have to have an SDK for your target installed alongside LLDB. In Xcode this will always be the case. As you correctly identified my first patch does not work if you are cross debugging, e.g., a macOS target from a Linux machine because we are querying the HostPlatform for the SDK path and only PlatformDarwin implements this function. I'm open to any suggestions (and will also think about this myself) for how to make the layering less awkward, the only hard requirement I have is that I need the SDK path remapping information inside the per-module Swift typesystem, so I can't depend on a target for any of this.

What do you think about the general idea of a HostPlatform having several cross-SDKs installed and knowing where to find them? What would be a better place for this otherwise?

Thanks for the explanation. I have some ideas on this below, though I am not sure if I know enough about the problem to be able to tell which ones are feasible.

Thanks, Pavel, these are all very valid concerns and in retrospect I should have at least slapped an RFC label on this before dumping into phabricator last night. The goal of this (and I should have mentioned that in the description) is to make the Xcode SDK something that behaves more like the compiler's resource directory, as in that it ships with LLDB rather than with the debugged program. This important primarily for importing Swift and Clang modules in the expression evaluator, and getting at the APINotes from the SDK in Swift.

For a cross-debugging scenario, this means you have to have an SDK for your target installed alongside LLDB. In Xcode this will always be the case. As you correctly identified my first patch does not work if you are cross debugging, e.g., a macOS target from a Linux machine because we are querying the HostPlatform for the SDK path and only PlatformDarwin implements this function. I'm open to any suggestions (and will also think about this myself) for how to make the layering less awkward, the only hard requirement I have is that I need the SDK path remapping information inside the per-module Swift typesystem, so I can't depend on a target for any of this.

What do you think about the general idea of a HostPlatform having several cross-SDKs installed and knowing where to find them? What would be a better place for this otherwise?

I don't think putting this knowledge inside the host platform instance is a good idea. It already seems very odd to be enumerating all darwin platforms inside PlatformDarwin, when there is already PlatformXXX class for each of these platforms. It would be even weirder to include non-darwin platforms there.

Technically, platform objects are not tied to any particular target, and all you need to get one is an ArchSpec (static Platform::GetPlatformForArchitecture). Putting this knowledge into the "right" platform instances and then finding them based on the module's ArchSpec seems like it should be possible. I'm not convinced its a good idea to reach for a Platform object from a Module (it seems to open possibilities for inconsistencies between the platform chosen in this way, and the platform used for eventual launches), but it definitely seems better than defaulting to the host platform object (which also doesn't do anything to address the inconsistency).

Alternatively, since the list of "sdk" that the lldb happens to come with is a property of how it is built/deployed, it may also make sense to put this logic into the Host module somehow. A combination of the two approaches might be interesting too (the interactions between host and platform are always a bit weird). E.g., we could still query the Platform object for the sdk, but it would defer to some Host functionality to produce the right value. That way we could use xcrun (or whatever) on a mac, but theoretically other host systems could implement a different mechanism to locate the mac SDKs.

Finally, it may be possible to just have the list of sdks as some hardcoded list (populated by the build system?), in which case the code for that could live just about anywhere...

aprantl updated this revision to Diff 253257.Mar 27 2020, 5:19 PM

I've reworked this a little based on your feedback.

First, I've renamed SDK to XcodeSDK. An Xcode SDK is a fairly specific concept and I'm not going to pretend that it makes sense to generalize it for something else, so I thought the name should reflect this. I've kept it in Utility though, since the functionality mirrors ArchSpec and one platform typically hosts many SDKs at the same time. I entertained the idea of creating a base class and each platform would implement its own SDK, which sounds neat, but with all the merging functionality needed, this doesn't really work out.

I did incorporate the suggestion of requesting a platform (because I do think it should be Platform or Host that is calling xcrun) via the Module's ArchSpec.

Finally I added unit tests for all of the SDK functionality.

I have no great idea for how to test the remapping functionality other than completely end-to-end, because its primary effect is showing when debugging the OS itself (you need to resolve files *inside* the SDK).

Let me know what you think!

aprantl marked 2 inline comments as done.Mar 27 2020, 5:21 PM

I've reworked this a little based on your feedback.

First, I've renamed SDK to XcodeSDK. An Xcode SDK is a fairly specific concept and I'm not going to pretend that it makes sense to generalize it for something else, so I thought the name should reflect this. I've kept it in Utility though, since the functionality mirrors ArchSpec and one platform typically hosts many SDKs at the same time. I entertained the idea of creating a base class and each platform would implement its own SDK, which sounds neat, but with all the merging functionality needed, this doesn't really work out.

Thanks.

That sounds fine. I don't have a problem with a class like this residing in Utility. I also don't think it's time to try to generalize this just yet, as its very likely that the generalized concept would not fit what some other platform wants to do. I just think some of the apis should reflect the specificity of this more. Some inline comments reflect that.

I did incorporate the suggestion of requesting a platform (because I do think it should be Platform or Host that is calling xcrun) via the Module's ArchSpec.

That sounds fine to me, though I'd like to hear @jingham's take on Platform in this way. The part that I'm not super-clear on is the type argument to the GetSDKPath function. I mean, if I look at the XcodeSDK::Type enum, then the list pretty much mirrors the list if Platform classes we have. So asking a PlatformRemoteAppleWatch for an "AppleTV" sdk sounds nonsensical, and one gets the impression that this information should be already encoded in the selection of the platform object. It probably already is, via the ArchSpec argument, no?

In case of Apple platforms, this won't make a difference in practice, since the support for that is implemented in PlatformDarwin (which all of these inherit from), but it sounds like this will be a problem for the "linux" sdk (assuming this is what I think it is), as there the selected platform will be PlatformLinux, which has no clue about these sdk thingies.

So, it sounds to me like the sdk type should somehow play a role in the selection of the platform instance, or this functionality should be moved down into some Host function.

Finally I added unit tests for all of the SDK functionality.

I have no great idea for how to test the remapping functionality other than completely end-to-end, because its primary effect is showing when debugging the OS itself (you need to resolve files *inside* the SDK).

I guess this is down to our current inability to "mock" xcrun responses (?) It might be possible to do something about that in a unit test with a custom platform plugin, but it seems that the amount of untested functionality here is not very big (Module::RegisterSDK, basically), so we could let that slide.

lldb/include/lldb/Core/Module.h
517

RegisterXCodeSDK?

983

m_xcode_sdk

lldb/include/lldb/Target/Platform.h
437

GetXCodeSDKPath? (and maybe the argument can be the proper enum type now.

lldb/include/lldb/Utility/XcodeSDK.h
25

the fancy && is useless here without std::move (and it's not even very useful with it).

In case of Apple platforms, this won't make a difference in practice, since the support for that is implemented in PlatformDarwin (which all of these inherit from), but it sounds like this will be a problem for the "linux" sdk (assuming this is what I think it is), as there the selected platform will be PlatformLinux, which has no clue about these sdk thingies.

The "Linux" SDK is a placeholder of sorts when debugging a Swift-on-Linux server application (e.g., something like https://www.kitura.io/) from *within* Xcode that points to a directory that contains the Swift resources/stdlib cross-compiled for Linux. So this is still an Xcode SDK that should be managed by PlatformDarwin despite the confusing name.

So, it sounds to me like the sdk type should somehow play a role in the selection of the platform instance, or this functionality should be moved down into some Host function.

Since there is only going to be one place that does the SDK detection via xcrun, I would either move it to PlatformDarwin or HostMacOSX. Personally, I think I prefer the Host, since this is something that is really about the machine LLDB is running on and not about the remote platform.

Having this in the host layer sounds good to me.

aprantl updated this revision to Diff 254071.Mar 31 2020, 6:35 PM

I just had an off-list conversation with @jingham were we discussed what should be Platform versus Host in detail. The resulting solution looks even more convoluted at first, but it actually solves a bunch of problems that I had with either of the previous proposals. Jim argued that an interface that retrieves an SDK for the target should live in PlatformDarwin, since we are querying something about a Darwin platform, and in principle, this interface should also be available if you are (hypothetically) debugging a macOS process from a Linux host, even if it isn't implemented at the time. However, the one concrete implementation that we have, the one that calls xcrun must live in HostInfoMacOSX, because xcrun doesn't anywhere else.

With this in mind I arrived at this version of the patch where the implementation lives in HostInfoMacOSX, but the interface lives in PlatformDarwin, which forwards to the host if an implementation exists. Let me (Pavel & Jim) know what you think about this variant!

I think the current separation makes a lot of sense, but I don't like that XcodeSDK class is part of Utility. I understand the need from a layering perspective and I don't really have a better idea, but maybe someone else does?

labath added a comment.Apr 1 2020, 1:07 AM

I don't think that spreading this out over host and platform is convoluted. In fact, I was going to propose something like that myself. However, it does bring us back to the question of the "linux" sdk.

A couple of comments ago you said:

The "Linux" SDK is a placeholder of sorts when debugging a Swift-on-Linux server application (e.g., something like https://www.kitura.io/) from *within* Xcode that points to a directory that contains the Swift resources/stdlib cross-compiled for Linux.

When you say "cross-compiled", I am imagining that the resulting binary will be a regular linux elf file, and it's triple will be detected as ***-linux. So, Platform::GetPlatformForArchitecture(GetArchitecture(), nullptr) will return PlatformLinux, and the subsequent GetSDKPath will return blank, whereas I am expecting that you would want it to return HostInfoMacOSX::GetXcodeSDK(linux) (if running on macos).

So, if all of this is true, and we want this interface to reside on the Platform class, then I think the implementation of GetSDKPath needs to be spread out over the individual platform classes, which would delegate to the appropriate host api. E.g. PlatformLinux::GetSDKPath (with no arguments) would be implemented via HostInfo::GetXcodeSDK(linux), PlatformRemoteAppleWatch::GetSDKPath as HostInfo::GetXcodeSDK(watchos), etc.

This would give us maximum flexibility. For instance one could implement HostInfoLinux::GetXcodeSDK to locate the xcode sdk through some other mechanism, and then cross-debugging mac binaries would work too. Or, one could implement PlatformXXX::GetSDKPath to download the sdk from the remote system somehow if it is not found on the host, and then we would be able to debug on any host. However, I found all of these scenarios fairly unlikely, which is why I said that going straight for the host seems fine. The interface is still sensible -- we just say that the host system has to provide the sdks for any of the listed platforms. And if the host doesn't provide that, tough luck. That setup can always be generalized if we have a need for it.

I think the current separation makes a lot of sense, but I don't like that XcodeSDK class is part of Utility. I understand the need from a layering perspective and I don't really have a better idea, but maybe someone else does?

We could move it up to Host, and have it be next to the HostInfo::GetXcodeSDK function which returns it. Would that look any better?

aprantl updated this revision to Diff 254326.Apr 1 2020, 3:42 PM

I don't think that spreading this out over host and platform is convoluted. In fact, I was going to propose something like that myself. However, it does bring us back to the question of the "linux" sdk.

A couple of comments ago you said:

The "Linux" SDK is a placeholder of sorts when debugging a Swift-on-Linux server application (e.g., something like https://www.kitura.io/) from *within* Xcode that points to a directory that contains the Swift resources/stdlib cross-compiled for Linux.

When you say "cross-compiled", I am imagining that the resulting binary will be a regular linux elf file, and it's triple will be detected as ***-linux. So, Platform::GetPlatformForArchitecture(GetArchitecture(), nullptr) will return PlatformLinux, and the subsequent GetSDKPath will return blank, whereas I am expecting that you would want it to return HostInfoMacOSX::GetXcodeSDK(linux) (if running on macos).

So, if all of this is true, and we want this interface to reside on the Platform class, then I think the implementation of GetSDKPath needs to be spread out over the individual platform classes, which would delegate to the appropriate host api. E.g. PlatformLinux::GetSDKPath (with no arguments) would be implemented via HostInfo::GetXcodeSDK(linux), PlatformRemoteAppleWatch::GetSDKPath as HostInfo::GetXcodeSDK(watchos), etc.

This would give us maximum flexibility. For instance one could implement HostInfoLinux::GetXcodeSDK to locate the xcode sdk through some other mechanism, and then cross-debugging mac binaries would work too. Or, one could implement PlatformXXX::GetSDKPath to download the sdk from the remote system somehow if it is not found on the host, and then we would be able to debug on any host. However, I found all of these scenarios fairly unlikely, which is why I said that going straight for the host seems fine. The interface is still sensible -- we just say that the host system has to provide the sdks for any of the listed platforms. And if the host doesn't provide that, tough luck. That setup can always be generalized if we have a need for it.

Great! I think the current version of the patch reflects all of that, with only one difference — something that I actually overlooked in the previous version. We now have Platform that provides the interface to provide an SDK, and will forward the question to Host. Each Host can define its own mechanisms for doing so. So far I have only provided an implementation for macOS, which runs xcrun. What I can't do is remove the SDK parameter from the interface in platform, because Xcode may contain differently versioned SDKs (e.g., a MacOSX10.14.sdk and a MacOSX10.15.sdk) plus some wrinkles that are only relevant when you are developing the OS itself. That said, even without removing the SDK parameter from GetSDKPath(), we can still implement HostInfoLinux::GetXcodeSDK() for hypothetical Linux -> macOS cross-debugger you mentioned above, so I don't think the extra parameter tarnishes the design.

In D76471#1954045, @JDevlieghere wrote:

I think the current separation makes a lot of sense, but I don't like that XcodeSDK class is part of Utility. I understand the need from a layering perspective and I don't really have a better idea, but maybe someone else does?

We could move it up to Host, and have it be next to the HostInfo::GetXcodeSDK function which returns it. Would that look any better?

In a way XcodeSDK is like Triple or ArchSpec, so I've started to feel less bad about putting it into Utility, where ArchSpec lives.

labath added a comment.Apr 2 2020, 2:28 AM

Ok, I think we have converged onto something here. I just think it would be good to rename some functions/variables to make it clear they are working with an xcode sdk, and not some generic entity.

Great! I think the current version of the patch reflects all of that, with only one difference — something that I actually overlooked in the previous version. We now have Platform that provides the interface to provide an SDK, and will forward the question to Host. Each Host can define its own mechanisms for doing so. So far I have only provided an implementation for macOS, which runs xcrun. What I can't do is remove the SDK parameter from the interface in platform, because Xcode may contain differently versioned SDKs (e.g., a MacOSX10.14.sdk and a MacOSX10.15.sdk) plus some wrinkles that are only relevant when you are developing the OS itself. That said, even without removing the SDK parameter from GetSDKPath(), we can still implement HostInfoLinux::GetXcodeSDK() for hypothetical Linux -> macOS cross-debugger you mentioned above, so I don't think the extra parameter tarnishes the design.

Yeah, it seems acceptable. The "sdk type" argument still seems somewhat redundant. Theoretically we could pass in just the triple component out the xcodesdk object, but that would just mean the individual platforms would have to reconstruct the XcodeSDK object, which is already present in the caller, so it's not ideal either...

lldb/include/lldb/Core/Module.h
517

ping

983

ping?

aprantl updated this revision to Diff 254696.Apr 2 2020, 9:05 PM
aprantl marked 7 inline comments as done.

Renamed SDK -> XcodeSDK where applicable. Sorry I missed this earlier.

Yeah, it seems acceptable. The "sdk type" argument still seems somewhat redundant. Theoretically we could pass in just the triple component out the xcodesdk object, but that would just mean the individual platforms would have to reconstruct the XcodeSDK object, which is already present in the caller, so it's not ideal either...

It's an entire string, not just the type, because it also contains an (optional) version number.

lldb/include/lldb/Target/Platform.h
437

That would look clean, but unfortunately it has to be a full XcodeSDK string, because in addition to the type, it optionally also contains a version number and some other bits that are applicable only when developing the OS itself.

labath accepted this revision.Apr 3 2020, 12:34 AM

Ok, I think we've done what we can here.

Yeah, it seems acceptable. The "sdk type" argument still seems somewhat redundant. Theoretically we could pass in just the triple component out the xcodesdk object, but that would just mean the individual platforms would have to reconstruct the XcodeSDK object, which is already present in the caller, so it's not ideal either...

It's an entire string, not just the type, because it also contains an (optional) version number.

err.. I meant "just the _version_ component", but it sounds like you are saying you need more than the version.

This revision is now accepted and ready to land.Apr 3 2020, 12:34 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptApr 6 2020, 4:22 PM
sylvestre.ledru added inline comments.
lldb/source/Utility/XcodeSDK.cpp
72

With older version of the libcstdc++ (some Ubuntu LTS), it fails with:

 error: chosen constructor is explicit in copy-initialization
  return {sdk, version};
         ^~~~~~~~~~~~~~
/usr/lib/gcc/i686-linux-gnu/5.3.1/../../../../include/c++/5.3.1/tuple:612:19: note: explicit constructor declared here
        constexpr tuple(_U1&& __a1, _U2&& __a2)

@aprantl Could you please have a look ?

aprantl marked an inline comment as done.Apr 10 2020, 10:35 AM
aprantl added inline comments.
lldb/source/Utility/XcodeSDK.cpp
72

I have made an attempt in f5be71b4450, but if that doesn't work, I'll need some help since I don't really know what works and doesn't work on that particular platform.