This is an archive of the discontinued LLVM Phabricator instance.

[trace][intelpt] Add thin wrapper for perf_event_open API
ClosedPublic

Authored by jj10306 on Mar 15 2022, 1:26 PM.

Details

Summary
  • Add PerfEvent class to handle creating ring buffers and handle the resources associated with the perf_event. This API can be used by the IntelPT collection code and by the future work in this series of diffs related to enabling TSC to timestamp conversion.
  • Refactor IntelPT collection code to use this new API
  • Add TSC to timestamp conversion logic with unittest

Depends on https://reviews.llvm.org/D121711
Second of the series of smaller diffs that split https://reviews.llvm.org/D120595 up

Diff Detail

Event Timeline

jj10306 created this revision.Mar 15 2022, 1:26 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 15 2022, 1:26 PM
jj10306 requested review of this revision.Mar 15 2022, 1:26 PM
jj10306 added a comment.EditedMar 15 2022, 1:36 PM

Not sure if having this new, separate Perf.h file makes sense as the custom destructors for mmap and file descriptors can be used generally and don't directly relate to the perf API. I added this for the time being based on discussions on https://reviews.llvm.org/D120595, so feedback/thoughts are appreciated.

Also, this is my first time trying to create a "stack" of diffs so I just created the patch with the base being another diff's commit that's currently being reviewed instead of the base being some commit on the mainline - please let me know if there is a better/preferred to do this.

lldb/source/Plugins/Process/Linux/IntelPTCollector.cpp
166

Should this free function be defined here or elsewhere? It's declared in the new Perf.h file

wallace requested changes to this revision.Mar 16 2022, 11:46 AM

Move FetchPerfTscConversionParameters and StartTrace to the new Perf.h and Perf.cpp files. FetchPerfTscConversionParameters can easily be moved there. StartTrace should be refactored, so that a new method StartPerfEventWithAuxBuffer in Perf.h receives a configuration struct prepared by StartTrace in IntelPTCollector. This new StartPerfEventWithAuxBuffer could return a unique_ptr to a new PerfEvent object with pointers to m_mmap_meta and m_mmap_aux. The destructor of this object could free the resources accordingly. If you are able to do this, you could even share some code between FetchPerfTscConversionParameters and StartPerfEventWithAuxBuffer

This revision now requires changes to proceed.Mar 16 2022, 11:46 AM
jj10306 updated this revision to Diff 416337.Mar 17 2022, 3:19 PM
jj10306 retitled this revision from [WIP][trace][intelpt] Refactor perf_event resource handling and add TSC conversion param fetching logic to [WIP][trace][intelpt] Add thin wrapper for perf_event_open API.
jj10306 edited the summary of this revision. (Show Details)

addressed major comment on previous version - see updated title/summary

wallace retitled this revision from [WIP][trace][intelpt] Add thin wrapper for perf_event_open API to [trace][intelpt] Add thin wrapper for perf_event_open API.Mar 17 2022, 5:05 PM
wallace requested changes to this revision.Mar 17 2022, 10:43 PM

pretty nice. Most of the comments are stylistic and about the code organization, not the underlying logic, which is overall correct.

lldb/source/Plugins/Process/Linux/IntelPTCollector.cpp
8–27

Btw, there's a code style guideline regarding includes, which can be found here https://llvm.org/docs/CodingStandards.html#include-style. Let's try to follow it from now it. Being honest I wasn't really following it

9–10

do you need these?

198–199

tbh, this method doesn't need to be part of IntelPTThreadTrace. Just leave it as a private static function of this file.
Besides that, we can improve it a bit with the use of errorOrToExpected

198–199

we don't need to change this argument. We might later the behavior of this function to be "allocate the largest buffer possible that is <= the buffer size". Then we don't need to thing of pages anymore from a user point of view.
On the other hand, the PerfEvent API you are building should receive the number of pages as parameter because that's what the documentation says.

202–223

you can move this to a new static method CreateIntelPTPerfEventConfiguration

224–225

use auto only if the actual type is too complex, which is not the case here, so don't use auto. More here https://llvm.org/docs/CodingStandards.html#use-auto-type-deduction-to-make-code-more-readable

225–226

just use the -> operator instead of creating a new variable

227

that std::move is not needed. It should be done automatically by the compiler

230

i'd asume this std::move is also not needed, but all good if it's here

lldb/source/Plugins/Process/Linux/IntelPTCollector.h
33–55

please move them to be bottom of this class and make them explicitly private

54–55

move the constructor to the top of the class and add documentation. Mention that the perf_event must be the result of a perf_event request for intel pt

54–59

move this method to the top of the class, to make it more readable. Also mention that this is the way to create an instance of this class

111

good one

lldb/source/Plugins/Process/Linux/Perf.cpp
34
35

the std::move should be able to go away

37

remove this, as it should be part of the documentation of GetMetadataPage

47
87–96

you can move the code that mmaps and checks for errors to a new function

First make resource_handle::MmapUP use always uint8_t instead of allowing a typename. In the end you mmap bytes and the getter methods in PerfEvent can do the casting.

private: 
Expected<resource_handle::MmapUP> PerfEvent::Mmap(size_t size, long int offset, StringRef buffer_name /*for errors*/) {
  errno = 0;
  auto mmap_result = ::mmap(nullptr, size, PROT_READ, MAP_SHARED, GetFd(), offset);

  if (mmap_result == MAP_FAILED) {
    std::string err_msg = llvm::formatv("perf event mmap allocation failed for {0}: {1}",
                                      buffer_name, std::strerror(errno));
    return createStringError(inconvertibleErrorCode(), err_msg);
  }
  return resource_handle::MmapUP(reinterpret_cast<uint8_t *>(mmap_result), size);
}

then you can use it like

Error LoadAuxBuffer(size_t pages) {
  size_t aux_mmap_size = pages * pagesize();
  if (Expected<resource_handle::MmapUP> mmap_aux = Mmap(aux_mmap_size, GetMetadata().aux_offset, "aux buffer"))
    m_aux_buffer = std::move(mmap_aux.get());
  else
    return mmap_aux.takeError();
}

Error LoadMetadataAndDataBuffer(size_t data_pages) {
  size_t size = (data_pages + 1) * pagesize();
  if (Expected<resource_handle::MmapUP> mmap_data = Mmap(size, 0, "metadata and data buffer"))
    m_metadata_and_data_buffer = std::move(mmap_data.get());
  else
    return m_metadata_and_data_buffer.takeError();
}

Error LoadMetadataAndBuffers(size_t data_pages, size_t aux_pages) {
  if (Error err = LoadMetadataAndDataBuffer(data_pages))
    return err;
  if (Error err = LoadAuxBuffer(aux_pages))
    return err;
  return Error::success();
}

perf_event_mmap_page &PerfEvent::GetMetadataPage() const {
 return *reinterpret_cast<perf_event_mmap_page*>(m_metadata_and_data_buffer.get());
}
102–104
lldb/source/Plugins/Process/Linux/Perf.h
10

i know not all the code is consistent, but this is the guideline we all should follow
https://llvm.org/docs/CodingStandards.html#header-guard, so you should change this to

LLDB_SOURCE_PLUGINS_PROCESS_LINUX_PERF_H

an example is lldb/source/Plugins/ObjectFile/JIT/ObjectFileJIT.h

13–14
31

rename length to bytes to be more specific

Also add a default value with a length of 0, i.e. MmapDeleter(size_t length = 0). That'll be used for avoiding specifying 0 in the constructor of PerfEvent

31–32
32–34

move the implementation to a cpp file

45

remove this, already done

48

you don't need this

49–55

move the implementation to a cpp file. That will also reduce the number of headers in this file

59–62

Let's remove the template, so that we only deal with bytes. The low-level mmap function works in terms of bytes, so it's better to keep it as it is. This also plays well with the MmapDeleter that deletes a number of bytes.

68–69
94–98

let's keep this short

99–103

As this will be invoked as PerfEvent::InitPerfEvent(, better just rename these methods to Init, so that the user sees PerfEvent::Init(

102–103

Add documentation mentioning the parameters and when to invoke this method over the other

105

move this to the cpp file. Just keep implementation code in a header file if it makes sense to copy it everywhere the header is included, e.g. inline functions

108
111–113

Let's not mention the metadata page here because the fact that the data buffer comes right after the metadata page is an implementation detail.

116
119

we cannot talk about the mmap pointers because that's a minor implementation detail

121

Let's better use a descriptive name: MmapMetadataPageAndBuffers(size_t data_pages, size_t aux_pages)

121–125

blank lines between methods and documentation

123

Return a reference instead of a pointer to enforce the validity of the data, because it can't be nullptr if you are using this method correctly, and it should fail if otherwise

128–134

it's possible that you can do m_mmap_meta() instead of m_mmap_meta(nullptr, resource_handle::MmapDeleter(0)) if MmapDeleter has a parameter-less constructor. All good if you can't

129

rename to m_mmap_metadata_and_data_buffer or have two different mmaps for each chunk of data. If you keep only one, mention in a comment that we can do this because the data buffer comes right after the metadata page

154

can you use the current process' pid with getpid() instead of having the caller pass a pid? That'll make your code simpler

159

remove #ifndef

This revision now requires changes to proceed.Mar 17 2022, 10:43 PM
wallace edited reviewers, added: zrthxn; removed: clayborg.Mar 17 2022, 10:43 PM
jj10306 marked 44 inline comments as done.Mar 21 2022, 9:06 AM
jj10306 added inline comments.
lldb/source/Plugins/Process/Linux/IntelPTCollector.cpp
227

it actually is required here

230

it actually is required here

lldb/source/Plugins/Process/Linux/Perf.cpp
35

it's actually required here

lldb/source/Plugins/Process/Linux/Perf.h
68–69

I don't think you finished your comment here

111–113

isn't "A value of 0 effectively is a no-op and no data is mmaped for this buffer." a bit misleading since the metadata page will always be mmaped even if 0 is passed?
I agree that it's a no-op for the aux buffer but I think it's worth noting here that the metadata page will always be allocated

jj10306 updated this revision to Diff 416979.Mar 21 2022, 9:11 AM
jj10306 marked 5 inline comments as done.
  • remove template from MmapUP and use void * instead since this is what mmap returns. Leave casting of the pointer to the getter functions
  • refactor internal mmap functions and provide a thin mmap wrapper function that simply forwards parameters to real mmap calls and allows for custom error message. Different callers might need to pass different prot, flag, etc so I decided to include all the mmap parameters, except fd since this is stored on the PerfEvent object itself. This could potentially be changed to accept a fd and be moved out of the PerfEvent class to be used globally since it simply wraps an mmap call and adds custom error messages and resource handling through the unique pointer.
  • clean up and add documentation
wallace requested changes to this revision.Mar 21 2022, 10:23 AM

only some minor documentation change and good to go

lldb/source/Plugins/Process/Linux/IntelPTCollector.h
17

this shouldn't be there.

I'm even more amazed by the fact that

#include "llvm/ADT/DenseMap.h" 
#include "llvm/ADT/DenseSet.h"

are there in the first place. Maybe the intel guy who did this first added them

118
120–124

Too many implementation details. Better not mention them. The reason for avoiding giving too many details is that in the future the implementation might change and this comment won't make sense anymore.

lldb/source/Plugins/Process/Linux/Perf.h
112–117

Let's reduce the number of implementation details that might easily change in the future.

139–140

nice clarification

This revision now requires changes to proceed.Mar 21 2022, 10:23 AM
jj10306 updated this revision to Diff 417051.Mar 21 2022, 12:36 PM
jj10306 marked 5 inline comments as done.
jj10306 edited the summary of this revision. (Show Details)

Address minor comments and add unittests

wallace requested changes to this revision.Mar 21 2022, 12:55 PM

some minor final details

lldb/source/Plugins/Process/Linux/Perf.cpp
60–61

??

lldb/source/Plugins/Process/Linux/Perf.h
116
254

??

lldb/unittests/Process/Linux/PerfTests.cpp
24
40–41

use lower_case format. The xray uses a different format

49

add a comment before this line explaining how the test will work

55

nice!!

also explain here that it's all fine to skip here

71
This revision now requires changes to proceed.Mar 21 2022, 12:55 PM
jj10306 updated this revision to Diff 417072.Mar 21 2022, 1:26 PM
jj10306 marked 8 inline comments as done.

final cleanup

wallace accepted this revision.Mar 21 2022, 1:29 PM

great job!

This revision is now accepted and ready to land.Mar 21 2022, 1:29 PM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptMar 21 2022, 1:39 PM