This is an archive of the discontinued LLVM Phabricator instance.

Add Utility/Environment class for handling... environments
ClosedPublic

Authored by labath on Dec 18 2017, 9:15 AM.

Details

Summary

There was some confusion in the code about how to represent process
environment. Most of the code (ab)used the Args class for this purpose,
but some of it used a more basic StringList class instead. In either
case, the fact that the underlying abstraction did not provide primitive
operations for the typical environment operations meant that even a
simple operation like checking for an environment variable value was
several lines of code.

This patch adds a separate Environment class, which is essentialy a
llvm::StringMap<std::string> in disguise. To standard StringMap
functionality, it adds a couple of new functions, which are specific to
the environment use case:

  • (most important) envp conversion for passing into execve() and likes. Instead of trying to maintain a constantly up-to-date envp view, it provides a function which creates a envp view on demand, with the expectation that this will be called as the very last thing before handing the value to the system function.
  • insert(StringRef KeyEqValue) - splits KeyEqValue into (key, value) pair and inserts it into the environment map.
  • compose(value_type KeyValue) - takes a map entry and converts in back into "KEY=VALUE" representation.

With this interface most of the environment-manipulating code becomes
one-liners. The only tricky part was maintaining compatibility in
SBLaunchInfo, which expects that the environment entries are accessible
by index and that the returned const char* is backed by the launch info
object (random access into maps is hard and the map stores the entry in
a deconstructed form, so we cannot just return a .c_str() value). To
solve this, I have the SBLaunchInfo convert the environment into the
"envp" form, and use it to answer the environment queries. Extra code is
added to make sure the envp version is always in sync.

(This also improves the layering situation as Args was in the Interpreter module
whereas Environment is in Utility.)

Diff Detail

Repository
rL LLVM

Event Timeline

labath created this revision.Dec 18 2017, 9:15 AM
zturner added inline comments.Dec 18 2017, 9:26 AM
include/lldb/Utility/Environment.h
70–72 ↗(On Diff #127378)

Why'd you decide to go with inserting an entire pre-formatted key value pair in the form {key}={value}, instead of having the function be insert(StringRef Key, StringRef Value)?

source/Host/macosx/Host.mm
763 ↗(On Diff #127378)

Indentation is messed up here.

labath marked an inline comment as done.Dec 18 2017, 9:36 AM
labath added inline comments.
include/lldb/Utility/Environment.h
70–72 ↗(On Diff #127378)

The insert(key, value) form is still available (and used where appropriate). This is a convenience function because in a lot of the cases we only have the "KEY=VALUE" form handy (constructing from char**envp, reading the env over gdb-remote, ...), and I wanted to avoid having each user having to split this manually. I wouldn't be opposed to calling this something other than insert if you think it's confusing, but I think the functionality per-se is quite useful.

labath updated this revision to Diff 127384.Dec 18 2017, 9:39 AM

re-clang-format the patch

clayborg accepted this revision.Dec 18 2017, 10:09 AM

ok as long as we don't want to return const reference when returning Environment values in getters and setters.

include/lldb/Target/Platform.h
636 ↗(On Diff #127384)

Do we want to return by value? Not a const reference?

include/lldb/Target/Target.h
118–119 ↗(On Diff #127384)

set/get using reference instead of by value?

source/API/SBLaunchInfo.cpp
34 ↗(On Diff #127384)

Seems like a lot of work to get indexed environment variable access. Should we not make the Environment class use a std::vector instead of a map? Not sure if environment variable order is ever important? Is so then the StringMap isn't what we want to use. I know we have it in our public API so we can't change it now and thus why we need this work around. Just thinking out loud

This revision is now accepted and ready to land.Dec 18 2017, 10:09 AM
labath added inline comments.Dec 18 2017, 12:22 PM
include/lldb/Target/Platform.h
636 ↗(On Diff #127384)

This object can come from the Host::GetEnvironment, which constructs it on-demand. Returning a reference would mean we need to cache a value in the host layer, but that's tricky as the environment can change during runtime, and you want to make sure you always return an up-to-date value.

I don't think this is an issue, as all the returns will be moves, so the situation will be the same as before this patch (where we also created a copy of the environment in the by-ref argument).

include/lldb/Target/Target.h
118–119 ↗(On Diff #127384)

There is no persistent object object to back the reference for the getter (it's constructed from an OptionValueDictionary). Using a value for the setter is actually better, as that allows me to move-assign the environment object in the LaunchInfo. If the user calls this with a moved object then we get no copies.

source/API/SBLaunchInfo.cpp
34 ↗(On Diff #127384)

I was thinking about the importance of variable order as well. I can certainly construct an application that will behave differently depending on the order of environment variables it is given. However, something like that seems unlikely to happen for real.

I actually started working on an order-preserving implementation, but then dropped it when I saw that we are already shoving the variables through a std::map in the OptionValueDictionary in target.env-vars setting. If we manage to drop that (OptionValueEnvironment?), then I think it would make sense to swap out this implementation for an order-preserving one. I would still keep the present map-like interface, as we do key-based lookup in quite a lot of places.

Thanks for the explanation. Good to go.

This revision was automatically updated to reflect the committed changes.