Changeset View
Standalone View
llvm/tools/llvm-profgen/llvm-profgen.cpp
- This file was added.
//===- llvm-profgen.cpp - LLVM SPGO profile generation tool ---------------===// | |||||
// | |||||
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. | |||||
// See https://llvm.org/LICENSE.txt for license information. | |||||
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | |||||
// | |||||
//===----------------------------------------------------------------------===// | |||||
// | |||||
// llvm-profgen generates SPGO profiles from perf script ouput. | |||||
// | |||||
//===----------------------------------------------------------------------===// | |||||
#include "ErrorHandling.h" | |||||
#include "PerfReader.h" | |||||
#include "ProfiledBinary.h" | |||||
#include "llvm/Support/CommandLine.h" | |||||
#include "llvm/Support/InitLLVM.h" | |||||
static cl::list<std::string> PerfTraceFilenames( | |||||
"perfscript", cl::value_desc("perfscript"), cl::OneOrMore, | |||||
llvm::cl::MiscFlags::CommaSeparated, | |||||
cl::desc("Path of perf-script trace created by Linux perf tool with " | |||||
"`script` command(the raw perf.data should be profiled with -b)")); | |||||
static cl::list<std::string> | |||||
BinaryFilenames("binary", cl::value_desc("binary"), cl::ZeroOrMore, | |||||
llvm::cl::MiscFlags::CommaSeparated, | |||||
cl::desc("Path of profiled binary files")); | |||||
mtrofin: (flyby comment) is it a "perfscript" or a "perftrace" - or maybe "perfdata" rather.
Also… | |||||
@mtrofin Thanks for your helpful list of suggestions!
here we intentionally use "perfscript" but not "perfdata" because it's not the raw perfdata created by the perf record but the output by the perf script -i perf.data, this can avoid to call the cmd inside the code.
Here change the description of it wlei: @mtrofin Thanks for your helpful list of suggestions!
>(flyby comment) is it a "perfscript" or… | |||||
static cl::opt<std::string> OutputFilename("output", cl::value_desc("output"), | |||||
cl::Required, | |||||
cl::desc("Output profile file")); | |||||
using namespace llvm; | |||||
using namespace sampleprof; | |||||
Path to profiled binary? mtrofin: Path to profiled binary? | |||||
you mean to change the desc here? "Path" is added wlei: you mean to change the desc here? "Path" is added | |||||
int main(int argc, const char *argv[]) { | |||||
InitLLVM X(argc, argv); | |||||
cl::ParseCommandLineOptions(argc, argv, "llvm SPGO profile generator\n"); | |||||
// Load binaries and parse perf events and samples | |||||
PerfReader Reader(BinaryFilenames); | |||||
Reader.parsePerfTraces(PerfTraceFilenames); | |||||
return EXIT_SUCCESS; | |||||
} | |||||
Maybe use BinaryTable.insert instead of BinaryTable.find above and use the return value to check whether the binary has been loaded before, then you don't have to search the table another time here. wmi: Maybe use BinaryTable.insert instead of BinaryTable.find above and use the return value to… | |||||
Good to know this, fixed! wlei: Good to know this, fixed! | |||||
Here it is may or may not? wmi: Here it is may or may not? | |||||
Good catch, fixed wlei: Good catch, fixed | |||||
It is a map from address to ProfiledBinary, how about using AddressBinaryMap? wmi: It is a map from address to ProfiledBinary, how about using AddressBinaryMap? | |||||
Good suggestion wlei: Good suggestion | |||||
Here BinaryAddrMap needs to erase an entry before inserting a new one. I guess it is to support the case that a load image is unloaded and then reloaded at a different place. If that is correct, please add some comment to make it clear. wmi: Here BinaryAddrMap needs to erase an entry before inserting a new one. I guess it is to support… | |||||
Naive question: that assumes no interesting events were collected when the image was loaded at the old address? mtrofin: Naive question: that assumes no interesting events were collected when the image was loaded at… | |||||
Not sure whether it will remain the same address, but here I think it will cover this case. or maybe you want me to check if the base address doesn't change then we can just drop this event? wlei: Not sure whether it will remain the same address, but here I think it will cover this case. or… | |||||
Sorry, what I meant was: as the events list gets traversed, say that for a while events corresponding to the current base address are captured; then the base address changes. What happens to the captured events? mtrofin: Sorry, what I meant was: as the events list gets traversed, say that for a while events… | |||||
The events come in order of time. The mmap events serve as a time stamp for all LBR and call stack events. Once a mmap event is processed, the binary load loaded address recorded will be updated and all subsequent LBR events will be processed against the updated base address. hoy: The events come in order of time. The mmap events serve as a time stamp for all LBR and call… | |||||
I see - thanks! mtrofin: I see - thanks! | |||||
yes , comments added wlei: yes , comments added | |||||
Nit: to help maintainability down the road, perhaps ensure primitive fields are initialized here - I realize they are init-ed in the ctor, but it's even easier to maintain things when def and init on the same line. Less typing for the ctor, too. mtrofin: Nit: to help maintainability down the road, perhaps ensure primitive fields are initialized… | |||||
Fixed! wlei: Fixed! | |||||
best to initialize struct fields to avoid undefined values. mtrofin: best to initialize struct fields to avoid undefined values. | |||||
fixed! wlei: fixed! | |||||
should this rather be Buffer->getBufferSize() > static_cast<size_t>(std::numeric_limits<uint32_t>::max())? mtrofin: should this rather be Buffer->getBufferSize() > static_cast<size_t>(std… | |||||
Good catch wlei: Good catch | |||||
could P+1 be out of bounds? mtrofin: could P+1 be out of bounds? | |||||
Yeah, it seems it's not used here, delete this! wlei: Yeah, it seems it's not used here, delete this! | |||||
The underlying MemoryBuffer ensures there is always trailing \0 at the end. hoy: The underlying `MemoryBuffer` ensures there is always trailing `\0` at the end. | |||||
should this set IsLoaded to false? also, why not assert(!IsLoaded); or rename to "ensureLoaded"; or - is there value in a ProfiledBinary that's not loadable? If not, how about:
that way, a reader knows they don't need to bother worrying about ProfiledBinaries that don't work - simpler overall state. wdyt? mtrofin: should this set IsLoaded to false?
also, why not assert(!IsLoaded); or rename to… | |||||
Asserting the function is only run once per binary is reasonable since all provided binaries are loaded as soon as the tool starts. The initial design allowed for a load on demand, i.e, when binaries are not provided via the command line option, the profiled binaries record by mmap events will be automatically loaded while processing the events. IsLoaded was introduced as a signal for that. However that complicated the tool and was not continued. A load failure would result the tool to error and exit instead of returning a null object. Please see subsequent patch D89712. hoy: Asserting the function is only run once per binary is reasonable since all provided binaries… | |||||
So then could the loading happen in the ctor, since failure to load => fast fail? Then the ProfiledBinary is known to be always loaded, and the only mutable field is the base address - to be clear, my suggestion is for simpler maintainability. mtrofin: So then could the loading happen in the ctor, since failure to load => fast fail? Then the… | |||||
I see your point now. Thanks for the suggestion. There could be a case that user-specified binary never participated in profiling, which means they are not included in the mmap events, and they should not be loaded and if they are, their load failure should not block the processing. However, that sounds a user responsibility. Moving the load into ctor time is reasonable to me. It is a cleaner design. hoy: I see your point now. Thanks for the suggestion. There could be a case that user-specified… | |||||
consider using an enum like: enum EventIndex { WholeLine = 0, PID = 1, BaseAddress = 2... (etc) } so then dereferencing Fields is more readable. mtrofin: consider using an enum like:
enum EventIndex {
WholeLine = 0,
PID = 1,
BaseAddress = 2... | |||||
fixed, thanks for the suggestion wlei: fixed, thanks for the suggestion | |||||
I guess this can happen because the Event could reference other binaries than the interesting ones, correct? Could you add a comment here about this - i.e. that the event is intentionally dropped here. mtrofin: I guess this can happen because the Event could reference other binaries than the interesting… | |||||
Yes, comments added wlei: Yes, comments added | |||||
why call load - why not just fetch the ProfiledBinary entry? Unless I'm missing something, the expected behavior is that first the user-provided binaries are loaded; and if that goes well, then the events for them are loaded. So at this point the ProfiledBinary should be there? mtrofin: why call load - why not just fetch the ProfiledBinary entry? Unless I'm missing something, the… | |||||
perhaps instead of 'run' something more specific, like "loadBinariesAndEvents". mtrofin: perhaps instead of 'run' something more specific, like "loadBinariesAndEvents". | |||||
Agree it's better to have a specific name. My concern is we do not only load binaries and mmp events here, we must also do parsing the sample line simultaneously to extract info for the LBR unwinder(see its children changes), also do check the perfscript type, so I just use a general name as an entry of those misc. Any thoughts about this? wlei: Agree it's better to have a specific name. My concern is we do not only load binaries and mmp… | |||||
probably can be done later just fine, but should the main be in its own file, so then this becomes a reusable library? mtrofin: probably can be done later just fine, but should the main be in its own file, so then this… | |||||
Yeah, we separate PerfReader to other files in the following commit, please see its children changes. wlei: Yeah, we separate PerfReader to other files in the following commit, please see its children… | |||||
One thought on using MemoryBuffer::getFileOrSTDIN (or other similar MemroyBuffer based file reading):
What do you think? shenhan: One thought on using MemoryBuffer::getFileOrSTDIN (or other similar MemroyBuffer based file… | |||||
This sounds a good solution to me. The perf file easies goes very large for large application and long profiling runs where reducing memory footprint will be very helpful. hoy: This sounds a good solution to me. The perf file easies goes very large for large application… | |||||
@shenhan Thanks for your suggestion, having a constant memory consumption is great, let me change the code. wlei: @shenhan Thanks for your suggestion, having a constant memory consumption is great, let me… | |||||
const line_iterator& ? wenlei: `const line_iterator&` ? | |||||
What about we pass in BinaryFilenames and PerfTraceFilenames as parameters to this function (or to ctor), instead of letting PerfReader coupled with command-line options directly. Then perhaps name it readFromInput(BinaryFilenames, PerfTraceFilenames). wenlei: What about we pass in `BinaryFilenames` and `PerfTraceFilenames` as parameters to this function… |
(flyby comment) is it a "perfscript" or a "perftrace" - or maybe "perfdata" rather.
Also, probably "perf tool" not "perf script" in the description.