This is an archive of the discontinued LLVM Phabricator instance.

Add support to read aux vector values
ClosedPublic

Authored by aadsm on May 27 2019, 3:19 PM.

Details

Summary

This is the second patch to improve module loading in a series that started here (where I explain the motivation and solution): https://reviews.llvm.org/D62499

I need to read the aux vector to know where the r_debug map with the loaded libraries are.
The AuxVector class was made generic so it could be reused between the POSIX-DYLD plugin and NativeProcess*. The class itself ended up in the ProcessUtility plugin.

Event Timeline

aadsm created this revision.May 27 2019, 3:19 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 27 2019, 3:19 PM
labath requested changes to this revision.May 28 2019, 1:35 AM
labath added a reviewer: JDevlieghere.
labath added a subscriber: JDevlieghere.

Definitely go for the option of refactoring the DYLD AuxVector class to make it usable from lldb-server.

It doesn't look like it should be that complicated even. Instead of passing it a Process instance, we can just pass it the data we have obtained from the right process (liblldb process, or lldb-server process), along with any other data it needs to interpret it (looks like it's just the pointer size (?)).

The question of plugins depending on each other is just a matter of finding the right place to put this new class. In my mind, Plugins/Process/Utility` is not a real plugin, and I don't see much of a problem in the Posix dynamic loader depending on that (it makes sense that it would need *some* data about the process in order to do its job. Or, given that the new class will depend on just the data extractor and nothing else, we could put it all the way up to Utility.

A third option would be to create a completely new library for this. In the past we've talked about a new library for "classes describing various properties of a process", where we'd have MemoryRegionInfo, ProcessInfo, etc, but so far it hasn't materialized. It seems like this could fit nicely into this description, so we could start with the new library with this class. +@JDevlieghere for any thoughts on this.

This revision now requires changes to proceed.May 28 2019, 1:35 AM
clayborg requested changes to this revision.May 28 2019, 7:37 AM
clayborg added inline comments.
lldb/source/Plugins/Process/Linux/NativeProcessLinux.cpp
2091–2093

We need to get a copy of the data here right? I believe the "buffer_or_error" local variable will go out of scope and the data pointed to will be freed. You can use:

lldb::offset_t DataExtractor::CopyData(lldb::offset_t offset, lldb::offset_t length, void *dst) const;

which will copy of the data and internally own it in a DataBufferSP. Or you can create a DataBufferSP yourself using DataBufferHeap and placing it into a DataBufferSP and then passing that to the DataExtractor constructor or SetData method.

lldb/source/Plugins/Process/Utility/ELFAuxVector.cpp
18
data.GetAddress(offset_ptr);
lldb/source/Plugins/Process/Utility/ELFAuxVector.h
67

Is there only ever one entry in the AUXV map for each EntryType?

I like this being in a Utility class like it is coded, but we should fix the POSIX DYLD plug-in to use this common code as part of this patch. Comments?

aadsm marked 2 inline comments as done.May 28 2019, 7:49 AM
aadsm added inline comments.
lldb/source/Plugins/Process/Utility/ELFAuxVector.cpp
18

I know you pointed this out but I felt weird calling GetAddress since that's not the semantic value of the underlying data. That's why I still ended up with GetMaxU64, what do you think?

lldb/source/Plugins/Process/Utility/ELFAuxVector.h
67

Yes, this was something I was concerned with (since the AuxVector returns an iterator). I checked the "official" API and that's what they assume: http://man7.org/linux/man-pages/man3/getauxval.3.html

clayborg added inline comments.May 28 2019, 9:18 AM
lldb/source/Plugins/Process/Utility/ELFAuxVector.cpp
18

I would use it since that is exactly what you are doing with "data.GetMaxU64(offset_ptr, data.GetAddressByteSize());".

Definitely go for the option of refactoring the DYLD AuxVector class to make it usable from lldb-server.

+1

It doesn't look like it should be that complicated even. Instead of passing it a Process instance, we can just pass it the data we have obtained from the right process (liblldb process, or lldb-server process), along with any other data it needs to interpret it (looks like it's just the pointer size (?)).

The question of plugins depending on each other is just a matter of finding the right place to put this new class. In my mind, Plugins/Process/Utility` is not a real plugin, and I don't see much of a problem in the Posix dynamic loader depending on that (it makes sense that it would need *some* data about the process in order to do its job. Or, given that the new class will depend on just the data extractor and nothing else, we could put it all the way up to Utility.

I don't think this is general enough to fit in Utility. Looking at other classes in Plugins/Process/Utility, it seems like a better fit.

A third option would be to create a completely new library for this. In the past we've talked about a new library for "classes describing various properties of a process", where we'd have MemoryRegionInfo, ProcessInfo, etc, but so far it hasn't materialized. It seems like this could fit nicely into this description, so we could start with the new library with this class. +@JDevlieghere for any thoughts on this.

Do you mean having AuxVector in this library, or having it take a ProcessInfo instead of the Process itself?

A third option would be to create a completely new library for this. In the past we've talked about a new library for "classes describing various properties of a process", where we'd have MemoryRegionInfo, ProcessInfo, etc, but so far it hasn't materialized. It seems like this could fit nicely into this description, so we could start with the new library with this class. +@JDevlieghere for any thoughts on this.

Do you mean having AuxVector in this library, or having it take a ProcessInfo instead of the Process itself?

The first option. The idea would be that "ProcessInfo" describes uids/pids/... of a process, MemoryRegionInfo describes it's memory regions, and AuxVector describes it's, well.. aux vector. UnixSignals, and probably some others could go there too.

OTOH, Process/Utility sounds like a good name for this kind of thing, and some of these classes are already there, so we may want to approach this from the other end, and remove anything that should not be there (RegisterContextLLDB is my first candidate).

aadsm marked an inline comment as done.May 29 2019, 10:56 PM
aadsm added inline comments.
lldb/source/Plugins/Process/Linux/NativeProcessLinux.cpp
2091–2093

Forgot to answer this, this is actually not needed because the ELFAuxVector copies what's given to it, it doesn't keep a a reference to the passed data.

aadsm updated this revision to Diff 202324.May 30 2019, 3:39 PM

Rnamed ELFAuxVector to AuxVector and got rid of the one that existed in the POSIX-DYLD plugin.

labath added inline comments.May 31 2019, 2:52 AM
lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.cpp
91–94 ↗(On Diff #202324)

It looks like we're not guarding the call to LoadModules on line 101, so I think we can assume that m_process is not null here (and I don't see why it should be -- it does not make sense to call DidAttach if there is no process around).

BTW, your guarding attempt isn't even correct as we'd always call m_process->GetByteOrder() and crash if the process was null.

182–185 ↗(On Diff #202324)

same here.

549–550 ↗(On Diff #202324)

Whitespace-only change. Did you maybe run clang-format on the whole file instead of just the diff?

It doesn't matter that much here as there isn't many of them, but I just want to make sure you're aware of this in the future.

lldb/source/Plugins/Process/Utility/AuxVector.cpp
38 ↗(On Diff #202324)

It would be better to return an Optional<uint64_t> (or addr_t maybe ?) instead of using a magic value to mean "not found". It looks like at least some of these entries can validly be zero.

50 ↗(On Diff #202324)

range-based for ?

aadsm marked 3 inline comments as done.May 31 2019, 7:13 AM
aadsm added inline comments.
lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.cpp
91–94 ↗(On Diff #202324)

oh no, this is embarrassing!

549–550 ↗(On Diff #202324)

odd, pretty sure I just run clang-format-diff (as I don't have auto formatting going on), will double check this.

lldb/source/Plugins/Process/Utility/AuxVector.cpp
38 ↗(On Diff #202324)

Can do, I was trying to follow this api though: http://man7.org/linux/man-pages/man3/getauxval.3.html.
I think I'll go with the uint64_t though, I find it odd to return an addr_t because it's not an address, it's a number that happens to be the same size of an address.

labath added inline comments.May 31 2019, 7:21 AM
lldb/source/Plugins/Process/Utility/AuxVector.cpp
38 ↗(On Diff #202324)

Ah, interesting. It's a bit odd of an api as it does not allow you to distinguish AT_EUID not being present from somebody actually being root. I guess you're expected to assume that the AT_EUID field will always be present...

In any case, I think it would be better to use Optional, and not be tied to what is expressible in some C api.

I'm fine with uint64_t.

JDevlieghere added inline comments.May 31 2019, 9:38 AM
lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.cpp
94 ↗(On Diff #202324)

llvm::make_unique< AuxVector>(auxv_data);

631 ↗(On Diff #202324)

This shouldn't be auto anymore, as it's not obvious what the type is. Same below.

lldb/source/Plugins/Process/Linux/NativeProcessLinux.h
105

Newline before the function to keep things consistent.

aadsm marked an inline comment as done.May 31 2019, 6:47 PM
aadsm added inline comments.
lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.cpp
549–550 ↗(On Diff #202324)

I have my editor to remove trailing spaces on save and this line had one, that's why it ended up being parsed by clang-format-diff.

aadsm updated this revision to Diff 202642.Jun 2 2019, 9:44 PM

Address commentds

aadsm updated this revision to Diff 202649.Jun 2 2019, 10:28 PM

Fix patch snafu

aadsm updated this revision to Diff 202654.Jun 2 2019, 10:47 PM

Missing clang-format-diff

labath accepted this revision.Jun 3 2019, 2:54 AM
labath added inline comments.
lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.cpp
629–631 ↗(On Diff #202654)
if (llvm::Optional<uint64_t> vdso_base =
      m_auxv->GetAuxValue(AuxVector::AUXV_AT_SYSINFO_EHDR))
  m_vdso_base = *vdso_base;

(and similar elsewhere)

clayborg added inline comments.Jun 3 2019, 10:00 AM
lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.cpp
91–92 ↗(On Diff #202654)

We should change Process::GetAuxvData() to return a DataExtractor to avoid this kind of code.

94 ↗(On Diff #202654)

Process can't be NULL ever in a DynamicLoader. It should be a reference, but there may have been complications making that happen for some reason since if you have a member variables "Process &m_process;" it has to be initialized in the contructor? Can't remember. Since process owns the dynamic loader plug-in you can assume "m_process" is always valid and not NULL.

181–182 ↗(On Diff #202654)

We should change Process::GetAuxvData() to return a DataExtractor to avoid this kind of code.

183 ↗(On Diff #202654)

Use std::move?

m_auxv = llvm::make_unique<AuxVector>(std::move(auxv_data));
lldb/source/Plugins/Process/Utility/AuxVector.cpp
11 ↗(On Diff #202654)

Not a big fan of constructors doing parsing work. No way to return an error unless you build that into the AuxVector class (AuxVector::GetError()).

13–20 ↗(On Diff #202654)

Remove this function and just call "data.GetAddress(offset_ptr)"?

24–28 ↗(On Diff #202654)
const size_t value_type_size = data.GetAddressByteSize() * 2;
while (data.ValidOffsetForDataOfSize(value_type_size)) {
  const uint64_t type = data.GetAddress(&offset);
  const uint64_t value = data.GetAddress(&offset);
  if (type == AUXV_AT_NULL)
    ...
aadsm marked 2 inline comments as done.Jun 3 2019, 5:08 PM
aadsm added inline comments.
lldb/source/Plugins/Process/Utility/AuxVector.cpp
11 ↗(On Diff #202654)

That is true but ParseAuxv doesn't generate errors. It just reads as much data as it exists in the data extractor until there's no more.

aadsm marked an inline comment as done.Jun 3 2019, 5:41 PM
aadsm added inline comments.
lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.cpp
183 ↗(On Diff #202654)

The constructor receives a lldb_private::DataExtractor & so there's no copy going on.

aadsm updated this revision to Diff 202853.Jun 3 2019, 10:20 PM
aadsm marked an inline comment as done.

Address comments

aadsm added inline comments.Jun 3 2019, 10:22 PM
lldb/source/Plugins/Process/Utility/AuxVector.cpp
24–28 ↗(On Diff #202654)

nice, this is a much better loop.

labath accepted this revision.Jun 3 2019, 11:13 PM
clayborg accepted this revision.Jun 4 2019, 7:17 AM

Nice!

This revision is now accepted and ready to land.Jun 4 2019, 7:17 AM
JDevlieghere accepted this revision.Jun 10 2019, 9:24 AM
aadsm edited the summary of this revision. (Show Details)Jun 11 2019, 11:20 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJun 11 2019, 1:13 PM