This is an archive of the discontinued LLVM Phabricator instance.

Refactor user/group name resolving code
ClosedPublic

Authored by labath on Feb 13 2019, 1:32 AM.

Details

Summary

This creates an abstract base class called "UserIDResolver", which can
be implemented to provide user/group ID resolution capabilities for
various objects. Posix host implement a PosixUserIDResolver, which does
that using posix apis (getpwuid and friends). PlatformGDBRemote
forwards queries over the gdb-remote link, etc. ProcessInstanceInfo
class is refactored to make use of this interface instead of taking a
platform pointer as an argument. The base resolver class already
implements caching and thread-safety, so implementations don't have to
worry about that.

The main motivating factor for this was to remove external dependencies
from the ProcessInstanceInfo class (so it can be put next to
ProcessLaunchInfo and friends), but it has other benefits too:

  • ability to test the user name caching code
  • ability to test ProcessInstanceInfo dumping code
  • consistent interface for user/group resolution between Platform and Host classes.

Diff Detail

Repository
rLLDB LLDB

Event Timeline

labath created this revision.Feb 13 2019, 1:32 AM

You are using a mix of llvm & lldb naming conventions for local variables and arguments and the ivars of UserIDResolver (lots of "Uid", etc...) Probably better to stick with the lldb convention.

Other than that it looks good to me.

source/Plugins/Process/gdb-remote/GDBRemoteCommunicationServerCommon.cpp
439–440

Do we need auto here? Since we have a bunch of API's returning StringRef's now when I see strings returned I get paranoid about their lifespan. auto hides the fact that I don't need to worry...

clayborg added inline comments.Feb 13 2019, 11:36 AM
include/lldb/Host/UserIDResolver.h
24 ↗(On Diff #186601)

make this 64 bit for future proofing? And if so, just use lldb::user_id_t?

27 ↗(On Diff #186601)

If we are storing this as an optional std::string, why hand it out as a StringRef?

49–51 ↗(On Diff #186601)

prefix with "m_", make lower case and separate by "_"

source/Commands/CommandObjectPlatform.cpp
1154

Why are we not passing the platform in still? What if we need it for something else?

1156

Ditto

source/Host/CMakeLists.txt
23 ↗(On Diff #186601)

remove whitespace change unless it is needed?

37 ↗(On Diff #186601)

remove whitespace change unless it is needed?

45 ↗(On Diff #186601)

remove whitespace change unless it is needed?

source/Target/Process.cpp
281

So we are currently only using the platform for the user ID resolving, but it might be nice to keep the platform as the argument in case we need it for something else in the future?

352

Ditto

I didn't get around to address the style issues yet, but I wanted to respond to one high-level comment today:

So we are currently only using the platform for the user ID resolving, but it might be nice to keep the platform as the argument in case we need it for something else in the future?

Doing this would defeat the purpose of this patch, as the ProcessInstanceInfo class would no longer be standalone. I don't think it's very likely that we'll need other platform capabilities to decode this class, but in case we do, I see two options:

  • hide the details of this new thing behind an interface. This could either be a new interface, or an extension of the user id resolver interface, depending on what the thing is.
  • Move the Dumping code out of the ProcessInstanceInfo class. If interpreting the ProcessInstanceInfo data is so complicated that we need access to full platform class, then I don't think the ProcessInstanceInfo is the right place for that code. In this case, we could move the code to the platform class (platform->Dump(stream, info)), or a completely separate class altogether (like we did with the DumpDataExtractor classes).
labath updated this revision to Diff 186982.Feb 15 2019, 2:26 AM
labath marked 13 inline comments as done.

Rename things to follow lldb style. All the remaining comments should be either
marked as done, or have a comment explaining why I don't think doing that is a
good idea.

include/lldb/Host/UserIDResolver.h
24 ↗(On Diff #186601)

I think the best way to future-proof things is to use a separate type for logically distinct entities. While user_id_t sounds like it would be the right type, it is actually used for a completely different purpose.

And while I don't fully understand how windows user id's work, they seem to be represented as strings of the form "S-1-5-21-7375663-6890924511-1272660413-2944159". If that's the thing we're going to use on windows, then even 64 bits will not be enough, and we may have to use something different altogether.

I used 32-bits for now, because that's what it was used until now, and it seemed to be enough. There are also some latent uses of UINT32_MAX for invalid uids, which would need to be tracked down to change that.

27 ↗(On Diff #186601)

I am storing it as std::string, as I need it to provide storage, but generally, I think the public APIs should use StringRefs for non-owning string references. I could hand this out as const llvm::Optional<std::string> &, but that would be even longer to type, and would increase the chance of someone doing an unintentional string copy.

source/Host/CMakeLists.txt
23 ↗(On Diff #186601)

I've committed the sorting as a separate patch.

source/Plugins/Process/gdb-remote/GDBRemoteCommunicationServerCommon.cpp
439–440

I've removed the auto, though I am not sure if that alleviates your fears, as the returned type is StringRef. There is still nothing to worry about though, as the backing storage is held by the resolver object.

jingham added inline comments.Feb 15 2019, 11:15 AM
source/Plugins/Process/gdb-remote/GDBRemoteCommunicationServerCommon.cpp
439–440

So how do I reason about the lifespan of this StringRef, then? Now I have to know that GetUserIDResolver doesn't make a temporary UserIDResolver, but a reference to one that persists - for how long again?

labath marked an inline comment as done.Feb 18 2019, 2:23 AM
labath added inline comments.
source/Plugins/Process/gdb-remote/GDBRemoteCommunicationServerCommon.cpp
439–440

Yes, that would be line of reasoning I envisioned. The returned resolver (and, transitively, the storage backing the StringRef) ought to exist at least while the "HostInfo" class is valid, so from HostInfo::Initialize() and until HostInfo::Finalize(). In practice it will be even longer because the resolver is constructed lazily on first use, and will be destroyed only by llvm_shutdown (which we don't call ever), but that's not what I would promise. For the "platform" version of the GetUserIDResolver function, the resolver ought to exist for as long as the platform instance you got it from is alive.

I can put this into the method comments, but it seems pretty straigh-forward to me (e.g. I would expect this comment to apply to all HostInfo functions. I definitely know that HostInfo::GetArchitecture blows up if called before Initialize()).

zturner added inline comments.Feb 26 2019, 9:18 AM
include/lldb/Host/UserIDResolver.h
9 ↗(On Diff #186982)

I wonder if this class should actually be in Host. While some specific implementation of it might be host-dependent, the interface itself is not. I kind of envision at some point in the future having a target that contains all of our core interfaces that someone can include and re-implement small pieces of the debugger without having to bring in the entire thing. This is also nice from a mocking / unittesting perspective.

So I think this would be better if it were in Utility (or some other top-level library such as Interfaces)

labath marked 2 inline comments as done.Feb 26 2019, 1:02 PM
labath added inline comments.
include/lldb/Host/UserIDResolver.h
9 ↗(On Diff #186982)

Yes, I've wondered about that too. I went with Host because that was enough to make things work, but I certainly see the case for this being Utility too. I'll move it there.

labath updated this revision to Diff 188448.Feb 26 2019, 1:03 PM
labath marked an inline comment as done.

Move the resolver interface into Utility

zturner accepted this revision.Mar 1 2019, 1:35 PM
This revision is now accepted and ready to land.Mar 1 2019, 1:35 PM

This is fine by me, though Greg I think still had an outstanding question. If he's satisfied, I am.

Greg, I believe I've responded to all your comments. If there is anything else you want me to do here, please let me know.

JDevlieghere accepted this revision.Mar 4 2019, 9:52 AM
ormris removed a subscriber: ormris.Mar 4 2019, 9:58 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptMar 4 2019, 10:50 AM