This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
test/tools/llvm-xray/X86/
-
tools/
-
llvm-xray/
-
X86/
-
Inputs/
2/2
simple-instrmap.yaml
1/6
extract-instrmap.ll
-
lit.local.cfg
-
tools/llvm-xray/
-
llvm-xray/
1/1
CMakeLists.txt
3/4
llvm-xray.cc
5/6
xray-extract.h
39/50
xray-extract.cc
2
xray-sleds.h

Differential D21987

[XRay] Implement `llvm-xray extract`, start of the llvm-xray tool
ClosedPublic

Authored by dberris on Jul 5 2016, 3:56 AM.

Download Raw Diff

Details

Reviewers

chandlerc
dblaikie
echristo
sanjoy

Commits

rL285165: [XRay] Implement `llvm-xray extract`, start of the llvm-xray tool

Summary

Usage:

llvm-xray extract <object file> [-o <filename or '-'>]

The tool gets the XRay instrumentation map from an object file and turns
it into YAML. We first support ELF64 sleds on x86_64 binaries, with
provision for supporting other supported platforms and formats later.

This is the first of a many-part change to fully implement the
llvm-xray tool.

We also define a subcommand registration and dispatch mechanism to be
used by other further subcommand implementations for llvm-xray.

Depends on D21982 for the in-memory logging in compiler-rt.
Depends on D21983 for the changes to flags in clang.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

dberris added inline comments.Aug 7 2016, 11:07 PM

tools/xray/xray-fc-account.cc
167–170 ↗	(On Diff #66907)	If these are just standard layout structs (which they seem to be), you could allocate an array of them with 'new XRaySledEntry[N]' as usual. Though I think you could get aligned memory by new'ing a std::aligned_storage object... maybe? Unfortunately, neither of those are guaranteed to be honoured by the standard allocator (or even the default implementation of 'new') in terms of returning an aligned memory location. :( What's happening here really is that, since the data is a global, defined by the compiler (when we emit the XRay sleds) we have an opportunity to align the data then. I suspect that has to be fixed first for this to work predictably (i.e. make each entry in the sled be always aligned to 32-byte boundaries). Let me send a change to make that happen, so that this would "just work". But this'll still only work to run the tool on the same architecture, layout, etc, as the original - is that an intentional limitation? Or should we be making a more fixed/deliberate choice about the file format and having writers format it appropriately, and readers deserialize it appropriately? This is currently relying on ELF sections for the data. I suspect that's fine for now, but later on a more flexible representation of the sleds would be required (for example, if the instrumentation map was encoded as json or some other format that's easier to consume with other tools).
167–170 ↗	(On Diff #66907)	new'ing a std::aligned_storage object doesn't seem to work. :( C++17 fixes this by being able to provide an alignment target to new, but we can't use that yet (also, alignment-aware allocators aren't a thing yet AFAICT either). What will work is asking for 2 x sizeof(T) then manually aligning the address and then inplace new'ing. That might be a useful little utility, but I don't think we need to do that here (yet). Note that this is loaded from the ELF file section using the reader. I don't know yet whether the alignment enforced by the compiler/linker will translate to aligned memory pointers -- though if they're being mmapped then they _should_ be. For now we're assuming that we're reading the ELF section and that there's an analog to this in other formats (COFF, MachO, etc.) that LLVM supports. Does this help?
265 ↗	(On Diff #66907)	Yeah, this turns out to be much simpler if the body of the loop was empty. Thanks. :)

massad added a subscriber: massad.Aug 8 2016, 7:27 PM

I've added something which should explain a little what this tool already does at the moment. For the moment I'll hold off on adding more tests (I haven't figured out where the tests for the tools or parts of the tools go yet).

Tests generally go in with the functionality in the same commit - and tests for the <foo> tool live in the test/tools/<foo> directory. If some of this functionality ends up sufficiently factored out into utility libraries, unit tests may be written - though unit tests are pretty rare in LLVM (just culturally - I'm actually more of a fan of them) outside of the general data structures in ADT and some of Support. Mostly it's just end-to-end testing of specific tools, optimization passes, etc.

tools/xray/xray-fc-account.cc
168–171 ↗	(On Diff #67124)	Helps a bit - but generally LLVM tools are platform agnostic. It sounds like this currently is not - that the tool would only work on the platform that generated the file (& that when we port xray to other platforms, again, the tool would only work on the desired platform) Seems pretty likely someone might take a report from some buildbot that they don't have local hardware for, then want to analyze that on their x86 developer machine. So I think it's probably a good idea to ensure the tools are not platform specific like this. But perhaps I'm overengineering this?
237–267 ↗	(On Diff #67124)	I'm not sure whether a generic CSV-style output library would be that important/useful, but maybe I'm not quite picturing what you have in mind. As for an output system - I'd probably skip the array of variants and use a variadic function template to take all the fields. I was thinking something more high level/closer to the format you have in mind here, but perhpas I'm having trouble picturing exactly what the output formats are and how generic an API to them would look. Perhaps there is a reasonable high level API for this that could abstract away the differences between CSV and pretty printed output that wouldn't be specific to the contents. Test cases might make it easier to see what the final output formats look like & thus what some common API to produce that information might look like.
453 ↗	(On Diff #67124)	std::min_element?
453–471 ↗	(On Diff #67124)	Reckon it'd be easier to just sort this then pick out the elements you want? Or are all these partial orderings better?
467 ↗	(On Diff #67124)	Timings.begin() + Timings.size() == Timings.end() ? And then it's probably just std::max_element rather than std::nth_element?
470–476 ↗	(On Diff #67124)	Maybe just move the declaration of Row up and initialize all these members directly so you don't need the extra/duplicate variables?
509–516 ↗	(On Diff #67124)	Inconsistent use of {} on single line statements (I don't think there's a convention to put {} on if/else but not on if - some people put {} on all parts of an if/else chain if any one part needs them, but that's not the case here)
521 ↗	(On Diff #67124)	I'd consider breaking each of these blocks (there are several top level blocks in main, each with a comment at the start) of code out into a separate function for readability? Comment -> function name, ideally

Tests generally go in with the functionality in the same commit - and tests for the <foo> tool live in the test/tools/<foo> directory.

Thanks -- yes, I'll add them to the change, but thought of holding off until I got a bit more feedback. :)

If some of this functionality ends up sufficiently factored out into utility libraries, unit tests may be written - though unit tests are pretty rare in LLVM (just culturally - I'm actually more of a fan of them) outside of the general data structures in ADT and some of Support. Mostly it's just end-to-end testing of specific tools, optimization passes, etc.

Since this is a new tool, I'd like to be able to test the units in isolation, so I'll add both end-to-end tests and unit tests. This makes reasoning about the requirements and implementation a lot easier.

I'll ping when I have them lined up, it shouldn't take too long as long as I don't go too deep into the refactoring rabbit hole. :)

tools/xray/xray-fc-account.cc
168–171 ↗	(On Diff #67124)	Good point -- no your suggestion isn't over-engineered. I think a neutral format is much better for the tooling. I suspect we can provide a platform-dependent way of extracting the instrumentation map and exporting it in a neutral format. I'm thinking json is the most portable and easier format to consume/transport. So instead of doing this ELF-specific parsing/loading, we can make this step something that can be done in isolation. I'll break this out into a specific tool that takes an XRay-instrumented binary, and extracts the instrumentation map and turns it into JSON. That way, we can implement this extraction mechanism for every platform supported and have one canonical format for the instrumentation map. This then becomes an alternative input (note that the instrumentation map isn't exactly required, since the XRay trace contains most/all the information that's useful, albeit in the form of function id's). If all we ever had was the function id's, then we can still provide useful information -- the instrumentation map helps only for providing human-readable metadata, like which function is called, and if we have debugging symbols where the function was defined.
237–267 ↗	(On Diff #67124)	I hadn't thought of just a variadic template -- yes, this would work, with dispatch for handling special kinds of values (i.e. if it's a string, wrap it in quotes and escape any quotes already in the string, etc.). I'll add test for this specific API since I think it's important to get this right.

Add initial implementation of llvm-xray extract

Implement llvm-xray account, subset of xray-fc-account.

I'm going to be implementing the error handling logic in xray-account.cc as well as some cosmetic improvements to the text output.

Also, as mentioned in the RFC I sent out (http://lists.llvm.org/pipermail/llvm-dev/2016-August/103984.html) this change might be getting a little too big to be manageable as a single change. I'll probably break this out into smaller pieces, but will do so once we have a better idea of the extent of the changes involved in this effort.

dberris mentioned this in rL279805: [compiler-rt][XRay] Initial per-thread inmemory logging implementation.Aug 25 2016, 11:47 PM

Fairly substantial update that shows the overall structure (or direction) of how this tooling should develop/evolve as we add more subcommands.

There's a lot here to be looked at, and I think it's a good idea to start picking this apart into smaller self-contained pieces.

Reviewers, please specifically have a look at:

Overall structure/design. I suspect there are better ways to subdivide and remove repetitions in some places. Happy to get feedback on those and start evolving this to a cleaner state even before submit.
Stylistic issues. Specifically for things like documentation, file organisation, etc. (still fairly new to the LLVM coding guidelines).

The state of this patch now is as follows:

We have a tool for extracting the instrumentation map from an xray-instrumented binary (currently just ELF 64-bit for now) and turns it into YAML or JSON.
We have a tool for converting an XRay trace from binary to YAML. We follow the naive logging format that's already in compiler-rt.
We have a tool for doing some basic function call accounting.

These three tools are implemented as subcommands.

dberris mentioned this in D23988: [compiler-rt][XRay] Support tail call sleds.Aug 29 2016, 3:41 AM

Apologies for piling on a bit on this, just updated the llvm-xray account ...
subcommand to handle tail/sibling call deduction. We also now support loading
YAML instrumentation maps and YAML traces for all subcommands.

It's sort of a bit hard to see the forest for the trees reviewing this - I know it's a pain to do smaller reviews that might end up with large refactorings that'll have knock-on effects to later patches, but I think that might work better. Stacking up lots of work unreviewed gets a bit tricky.

Still some outstanding design issues to discuss re: file formats, conversions, portability, etc. But once that's done.

I'm still concerned about reading/writing the binary format using memory mapping/splatting - I think the tools should be portable (in the same way that LLVM/Clang are - we don't build cross compilers, we just build tools that can read/write from/to any format on any platform), but even that's an implementation detail to a degree (unless making the tools that generic removes the need for the separate portable format/conversion tools).

& not sure about YAML+JSON. If you've a strong/unavoidable need for JSON, it's probably best to just implement that. (maybe it's best to just do JSON anyway if you just have a vague preference - as I said in the design review thread, I'm not sure how strong the binding to YAML is in the project)

Higher level design/coding style stuff:

Might be worth pulling out each of the subcommands into their own utility function/file along with the command line arguments. Though I can see an argument (har har) to be made for keeping all the arguments in the same place - but maybe subcommand arguments make sense to be grouped together along with the functionality, separate from the main file.

Basic stylistic stuff: Commenting and header file layout looks fine. Oh, we don't seem to put the trailing _ in our header file names, and other headers in tools use "LLVM_TOOLS_FOO_H" rather than just "FOO_H", so that's probably a good idea to avoid naming conflicts.

tools/llvm-xray/CMakeLists.txt
4–7	I think we usually keep these lists (including the source file lists below) sorted... though I see I've not done that consistently in the past either.
tools/llvm-xray/xray-account.cc
27 ↗	(On Diff #69796)	Calling forward on the same object more than once seems problematic - if this was ever passed as an xvalue the pair would construct from the xvalue, then again from the xvalue but now with that value in a moved-from state. (also does'nt seem like it needs to be a template - that might be overly generic at this point/given the current code)
32 ↗	(On Diff #69796)	Does this need to be a template? Should the parameters be const ref? (might be easier to judge whether const ref is missing if it wasn't a template so we could see what sort of types it was for)
94–169 ↗	(On Diff #69796)	This seems like more than is necessary - I know we were talking in the early versions about how to abstract out the different formats, but what I meant was whether we could abstract out the whole file format, rather than this one part of it. (eg: by having a "HumanWriter" and a "CSVWriter" that just get handed the data, etc). Probably doesn't need a whole record type of its own, and could just pass the relevant parameters to the function as arguments (though a struct doesn't hurt, I suppoes... *shrug)
tools/llvm-xray/xray-log-reader.h
65–73 ↗	(On Diff #69796)	Wouldn't bother about implementing every variant of (c)(begin\|end)(const) - just what you need. It's not like you're implementing the whole Container requirements anyway.
tools/xray/xray-extract.cc
32–33 ↗	(On Diff #67999)	Should this be llvm::Error return instead?
37–38 ↗	(On Diff #67999)	branch-to-unreachable should be an assert instead

In D21987#534750, @dblaikie wrote:

It's sort of a bit hard to see the forest for the trees reviewing this - I know it's a pain to do smaller reviews that might end up with large refactorings that'll have knock-on effects to later patches, but I think that might work better. Stacking up lots of work unreviewed gets a bit tricky.

I agree. I'll break this up into multiple smaller changes, and update accordingly.

Still some outstanding design issues to discuss re: file formats, conversions, portability, etc. But once that's done.

I'm still concerned about reading/writing the binary format using memory mapping/splatting - I think the tools should be portable (in the same way that LLVM/Clang are - we don't build cross compilers, we just build tools that can read/write from/to any format on any platform), but even that's an implementation detail to a degree (unless making the tools that generic removes the need for the separate portable format/conversion tools).

That's a tricky one. The problem with _not_ doing this memory-dump format is the cost of doing it in the XRay runtime. We try to be as cheap as possible there (in compiler-rt) so that we don't spend too much time just formatting the log when the process is writing it down to disk. The tradeoff we've made here consciously is that any conversion should happen externally -- i.e. the tool should be able to read this format, and then turn it into something more manageable offline. That's a very conscious trade-off here because we can't afford to use too much time or too much memory on the compiler-rt implementation.

& not sure about YAML+JSON. If you've a strong/unavoidable need for JSON, it's probably best to just implement that. (maybe it's best to just do JSON anyway if you just have a vague preference - as I said in the design review thread, I'm not sure how strong the binding to YAML is in the project)

Not strong, and if anything, I might just drop JSON. :)

Higher level design/coding style stuff:

Might be worth pulling out each of the subcommands into their own utility function/file along with the command line arguments. Though I can see an argument (har har) to be made for keeping all the arguments in the same place - but maybe subcommand arguments make sense to be grouped together along with the functionality, separate from the main file.

I thought about that, then it started to feel over-engineered -- since I'd have to figure out how to dispatch anyway from the main file, I still need to know which command to call based on which subcommand, this seemed like the more prudent thing to do.

Basic stylistic stuff: Commenting and header file layout looks fine. Oh, we don't seem to put the trailing _ in our header file names, and other headers in tools use "LLVM_TOOLS_FOO_H" rather than just "FOO_H", so that's probably a good idea to avoid naming conflicts.

Ah, right. Let me fix those.

tools/llvm-xray/xray-account.cc
94–169 ↗	(On Diff #69796)	Right -- I'm ambivalent about coming up with some generic writer interface here. One the one hand, the stats generated by this one sub-command follow a very specific format, defined for this particular set of data. I don't see how we can abstract out a writer for CSV and for pretty-printed data, since we'd have different information anyway. For example, the text output currently just has auto-formatted values -- I am thinking of potentially using a formatting ostream here to fix column widths, etc. (very hard to do with a generic implementation). On the other hand, if we could figure out a way of generating pretty-printed console I/O for arbitrary data, then that sounds like a bigger library than something that just does the simple thing. :)
tools/llvm-xray/xray-log-reader.h
65–73 ↗	(On Diff #69796)	Believe it or not, it seemed like I needed all of these. :/

[XRay] Implement llvm-xray extract, start of the llvm-xray tool
Fix header guard spelling

Herald added a subscriber: beanz. · View Herald TranscriptSep 8 2016, 10:46 PM

dberris retitled this revision from [XRay] Initial XRay Function Call Accounting Tool to [XRay] Implement `llvm-xray extract`, start of the llvm-xray tool.Sep 8 2016, 10:48 PM

dberris updated this object.

dberris added a child revision: D24376: [XRay] Implement `llvm-xray convert` -- trace file conversion.Sep 8 2016, 10:59 PM

Still some outstanding design issues to discuss re: file formats, conversions, portability, etc. But once that's done.

I'm still concerned about reading/writing the binary format using memory mapping/splatting - I think the tools should be portable (in the same way that LLVM/Clang are - we don't build cross compilers, we just build tools that can read/write from/to any format on any platform), but even that's an implementation detail to a degree (unless making the tools that generic removes the need for the separate portable format/conversion tools).

That's a tricky one. The problem with _not_ doing this memory-dump format is the cost of doing it in the XRay runtime.

Ah, sorry - I should clarify/be more specific.

I don't have so much of a problem with the writing side - I appreciate there are performance concerns there. (but wouldn't mind understanding them better - how important is the performance once xray features are turned on? (obviously having xray be as close to zero cost when not in use is important) What sort of cost do you see if it did write a consistent record format? (copying integers into known-width types, changing to a fixed endianness))

But what I'm thinking about is whether the tool could read the input no matter which platform it was running on - each architecture would have a different, but known, format (I mean, it already does - it's just defined more implicitly in terms of "what I can read from memory with this struct overlay"). So the reader can do all the "read this many bytes of integer, switch the endianness from whatever was written to whatever I need in this process, etc". That way the output file becomes portable, without changing the bits/incurring extra cost during the writing step.

Having the tools only work on the architecture where the file was generated seems especially tricky/not-great - and I don't think we have anything like that in LLVM today. llvm-objdump works on COFF files for ARM even when it's run on Linux X86, etc (or should do, if it supports COFF, etc - I haven't looked closely... ).

We try to be as cheap as possible there (in compiler-rt) so that we don't spend too much time just formatting the log when the process is writing it down to disk. The tradeoff we've made here consciously is that any conversion should happen externally -- i.e. the tool should be able to read this format, and then turn it into something more manageable offline. That's a very conscious trade-off here because we can't afford to use too much time or too much memory on the compiler-rt implementation.

Higher level design/coding style stuff:

Might be worth pulling out each of the subcommands into their own utility function/file along with the command line arguments. Though I can see an argument (har har) to be made for keeping all the arguments in the same place - but maybe subcommand arguments make sense to be grouped together along with the functionality, separate from the main file.

I thought about that, then it started to feel over-engineered -- since I'd have to figure out how to dispatch anyway from the main file, I still need to know which command to call based on which subcommand, this seemed like the more prudent thing to do.

Not sure I quite follow - I was thinking something fairly simple, just taking the contents of the "if (Extract)" block in main, and pulling it into a function (with whatever parameters it needs - probably not many, since they're all subcommand arguments anyway, mostly - which could be moved over to that other file (the implementation of extract) or left here and all passed in (but that might be a lot of argument passing)).

In D21987#538323, @dblaikie wrote:

Still some outstanding design issues to discuss re: file formats, conversions, portability, etc. But once that's done.

I'm still concerned about reading/writing the binary format using memory mapping/splatting - I think the tools should be portable (in the same way that LLVM/Clang are - we don't build cross compilers, we just build tools that can read/write from/to any format on any platform), but even that's an implementation detail to a degree (unless making the tools that generic removes the need for the separate portable format/conversion tools).

That's a tricky one. The problem with _not_ doing this memory-dump format is the cost of doing it in the XRay runtime.

Ah, sorry - I should clarify/be more specific.

I don't have so much of a problem with the writing side - I appreciate there are performance concerns there. (but wouldn't mind understanding them better - how important is the performance once xray features are turned on? (obviously having xray be as close to zero cost when not in use is important) What sort of cost do you see if it did write a consistent record format? (copying integers into known-width types, changing to a fixed endianness))

That's a good question.

When it's off, we'd like the cost to be as close to 0 as possible. When it's recording, we'd like the recording cost to be as little as possible so as not to interfere too much with the latency profiles of the functions being instrumented. So this means, if we can get away with introducing a fixed overhead when on (say 0.05 microseconds or 50 nanoseconds on every record) then that's how we'd structure it. So far we've been measuring far less than that on the fixed costs of the trampolines, but we've very wary of the cost of the actual recording as well. So while we can tolerate 0.05 microseconds maximum when we're recording, we'd get up there in case we need to.

Now with future work for various modes and trade-offs on the runtime side, there's a lot of opportunities for reducing other costs -- like memory (doing selective recording based on duration), durable storage (how many bytes are on disk for the log/trace), as well as the cost of the actual recording operation. But in the meantime, we're exploring the bare minimum we can get, and the implementation we currently have does exactly this. :)

But what I'm thinking about is whether the tool could read the input no matter which platform it was running on - each architecture would have a different, but known, format (I mean, it already does - it's just defined more implicitly in terms of "what I can read from memory with this struct overlay"). So the reader can do all the "read this many bytes of integer, switch the endianness from whatever was written to whatever I need in this process, etc". That way the output file becomes portable, without changing the bits/incurring extra cost during the writing step.

Having the tools only work on the architecture where the file was generated seems especially tricky/not-great - and I don't think we have anything like that in LLVM today. llvm-objdump works on COFF files for ARM even when it's run on Linux X86, etc (or should do, if it supports COFF, etc - I haven't looked closely... ).

I agree. There are two changes that could facilitate this support:

Indicate in the log what system the log was generated from. This could very well indicate endiannes, sizes of certain fields, etc. We have enough padding in the file headers to support encoding most of this information.
Handle the data accordingly from the tools based on the header information.

In the meantime, since we only just run on x86_64 and we have very specific fields, it seems premature to engineer something that will support all the platforms that come along later. And when we end up porting XRay to these other platforms, then we cross that bridge of extending the tool to support data generated from other systems.

Are these concerns a blocker to getting something that works in LLVM first, then evolving it as we support more platforms?

We try to be as cheap as possible there (in compiler-rt) so that we don't spend too much time just formatting the log when the process is writing it down to disk. The tradeoff we've made here consciously is that any conversion should happen externally -- i.e. the tool should be able to read this format, and then turn it into something more manageable offline. That's a very conscious trade-off here because we can't afford to use too much time or too much memory on the compiler-rt implementation.

Higher level design/coding style stuff:

Might be worth pulling out each of the subcommands into their own utility function/file along with the command line arguments. Though I can see an argument (har har) to be made for keeping all the arguments in the same place - but maybe subcommand arguments make sense to be grouped together along with the functionality, separate from the main file.

I thought about that, then it started to feel over-engineered -- since I'd have to figure out how to dispatch anyway from the main file, I still need to know which command to call based on which subcommand, this seemed like the more prudent thing to do.

Not sure I quite follow - I was thinking something fairly simple, just taking the contents of the "if (Extract)" block in main, and pulling it into a function (with whatever parameters it needs - probably not many, since they're all subcommand arguments anyway, mostly - which could be moved over to that other file (the implementation of extract) or left here and all passed in (but that might be a lot of argument passing)).

I thought about this, except I'm not quite as familiar with the commandline flags library -- is there a way to declare the subcommand in the main, then have that defined elsewhere? I ask because of the way adding flags to the subcommands involves getting a reference to a Subcommand. Then in main(...) we still end up checking the Subcommand, and then calling a function.

So if the suggestion is, move the Subcommand and flag definitions in headers, and define a single entry point function, and leave the dispatch to main, that might be doable -- but not sure how much cleaner that would be...

In the meantime I'll do the function argument passing, and see how it goes for the other more complex examples later.

Move invocation logic to function in subcommand definition

Herald added a subscriber: mgorny. · View Herald TranscriptSep 12 2016, 12:10 AM

majnemer added a subscriber: majnemer.Sep 12 2016, 5:54 PM

majnemer added inline comments.

tools/llvm-xray/xray-extract.cc
93	Do you need the llvm:: qualifier here?
93	SectionRef is cheap, I'd pass it by value.
tools/llvm-xray/xray-sleds.h
21–27	This seems problematic, the fields should probably be ulittle64_t, etc.. This way they have the correct endianness and alignment.

Address review comments

tools/llvm-xray/xray-sleds.h
21–27	Interesting -- can we use `ulittle64_t` in compiler-rt as well?

Rebase

Herald added a subscriber: modocache. · View Herald TranscriptOct 5 2016, 1:00 AM

Add a command registry interface

dblaikie added inline comments.Oct 5 2016, 2:12 PM

test/tools/llvm-xray/X86/Inputs/simple-instrmap.yaml
2	What's this file used for? I couldn't see its name mentioned anywhere else and it has no RUN lines.
test/tools/llvm-xray/X86/extract-instrmap.ll
6–7	Do we need both yaml and json? If this is only for testing purposes, seems one would suffice? (is this for other purposes?)
tools/llvm-xray/llvm-xray.cc
38–39	Roll the variable declaration into the condition
39–41	Drop the {} on single-line blocks (possibly even on the "if (*SC)" block too - though that's more questionable, some people take teh LLVM convention as "no braces on single line blocks" others as "no braces on single statement blocks, even if the statement is spread over multiple lines")
tools/llvm-xray/xray-extract.cc
115	pair<StringRef, error_code> looks a bit like it should be llvm::Error, no?
119–120	Branch-to-unreachable should be an assert instead, but chances are this should be a handled error case, right? Or is filename non-empty a precondition to this function?
303–304	Might be worth making the registerCommand into a "CommandRegistration" (since they'll all want to do it in a global init anyway, so it's not like you'll have other callers/uses of registerCommand except in idioms that all look like this global ctor): static CommandRegistration Unused(&Extract, [] { ... });
tools/llvm-xray/xray-extract.h
47	No need for explicit on multi-arg ctors - we haven't bothered to try to do that across the LLVM codebase.
62–64	Documentation comment, perhaps? It's not clear, looking at the signature, what this might do (how could the output be provided in a StringRef? What would the ownership semantics be? Should this be called "convert" rather than "extract" if it maps from one format to another? Should this have an intermediate representation so that different input and output formats can be generically composed?)

dberris added inline comments.Oct 5 2016, 10:36 PM

tools/llvm-xray/xray-extract.cc
115	Ooh, nice -- I wasn't aware of `llvm::Error` before. I'll use that instead.
119–120	Actually this should be an error, I'll change it.
303–304	That's a really good point, I definitely like this style better too. :)
tools/llvm-xray/xray-extract.h
62–64	Actually, this is vestigial (we don't define this function anymore).

dberris added inline comments.Oct 5 2016, 11:03 PM

test/tools/llvm-xray/X86/Inputs/simple-instrmap.yaml
2	Ah, yes, this one is required in later patches. I'll remove this from this change.

Address review comments

Thanks @dblaikie -- PTAL

test/tools/llvm-xray/X86/extract-instrmap.ll
6–7	JSON is convenient but not strictly required. Removed.

dtzWill added a subscriber: dtzWill.Oct 6 2016, 10:49 AM

Rebase
Remove unnecessary end marker for enums

Ping?

dblaikie added inline comments.Oct 12 2016, 9:43 AM

tools/llvm-xray/xray-extract.cc
52–58	Could we detect which kind of file we're reading with file magic instead of passing an argument to specify the input format? Is it useful to specify the input format as YAML if the only task to be performed is to generate YAML? Perhaps the only input format should be ELF/object file.
63–68	LLVM isn't too fussy about using declarations/directives - you're welcome to "using llvm::yaml" if that seems suitable, I think? (maybe the yaml namespace has particularly vague names that are problematic?) (also we don't tend to bother with the '::' global scope qualifier)
105	If you can get away without needing to be convertible to error_code, that's probably a good thing (it's just there for backwards compatibility in APIs where we haven't migrated the whole stack to llvm::Error yet) Also you might be able to get away without needing a custom Error type entirely - maybe just use StringError?
131–133	Test case for this codepath? (goes for all the error paths, ideally)
155	If you're already using a random access iterator, I'd probably suggest writing this using op- instead of std::distance. std::distance is handy when you need to write generic code which might be applied to non-random access iterators, but otherwise seems a bit obfuscatory.
193–204	Same feedback, might be fine to just use StringError
264–266	It'd probably be good to keep in the llvm::Error space rather than dropping to std::error_code - you'll probably be losing extra diagnostic information by collapsing into an error code like this. (same a few lines up with BinaryInstrELFError)
280	What's the purpose of this conditional operator ("AlwaysIntrument ? true : false") - one would assume AlwaysInstrument is already a boolean and this conditional operator is redundant. Ah, because it matches the binary format on disk, it's a char. Could we just get away from memcpying the struct off disk - only being 4 fields, llvm::DataExtractor might suffice to quickly pull 4 fields of known sizes out of a stream? & then the struct's members can be the right types, etc. (I wonder, but assume it's not appropriate, if we could just use the YAML struct for everything? But I guess we want various APIs consuming these data structures/records that shouldn't be aware of any YAML binding, etc)
286–287	a static variable inside a namespace - is the namespace there to have the same effect as "using"? Would using "using" be more obvious/straightforward? Oh, they're already 'using' at the top. So do these namespace decls provide anything?
289	Might even be worth having the command registration functor return an llvm::Error, then the error printing could be handled up in the caller (if everyone's just going to print an error and return 1, at least - that could happen up there).

dberris planned changes to this revision.Oct 12 2016, 9:14 PM

Thanks @dblaikie -- I'll make some changes and ping when ready. :)

tools/llvm-xray/xray-extract.cc
52–58	Could we detect which kind of file we're reading with file magic instead of passing an argument to specify the input format? Not with how we've defined it in compiler-rt, no. :( Is it useful to specify the input format as YAML if the only task to be performed is to generate YAML? Perhaps the only input format should be ELF/object file. Good point. For extraction it doesn't make sense. But the class that does extraction can be used (and actually is used in later patches) to load an instrumentation map in YAML. I'll remove the flag and assume that extraction will only ever make sense on ELF/object files.
63–68	LLVM isn't too fussy about using declarations/directives - you're welcome to "using llvm::yaml" if that seems suitable, I think? (maybe the yaml namespace has particularly vague names that are problematic?) Right -- I was merely following the examples. I'll try with just `using namespace llvm::yaml` here. (also we don't tend to bother with the '::' global scope qualifier) Will fix this -- force of habit. :)
105	If you can get away without needing to be convertible to error_code, that's probably a good thing (it's just there for backwards compatibility in APIs where we haven't migrated the whole stack to llvm::Error yet) Unfortunately that seems to be a pure virtual function in ErrorInfo<...>. :( I like `StringError` though, I'll change this to use StringError instead (less types, better).
280	Yeah, we're constrained by the structure that's in disk. I thought about only ever using the YAML struct for everything, and it boils down to allowing it to be ported/used by other classes/functions that need to load the instrumentation map and don't need any of the YAML functionality.
286–287	This is vestigial -- it used to be that there was a function defined in the namespace, and it's changed to this global registration approach. I'll remove the namespaces. :)
289	Good idea, yeah I'll change this to make the function return an llvm::Error and consolidate the logging and exit in main.

Address some review comments

I haven't added the tests, will do so when the style questions have been resolved.

tools/llvm-xray/xray-extract.cc
131–133	I'll add more test cases in the next revision.

dberris mentioned this in rL284178: [compiler-rt][XRay] Support tail call sleds.Oct 13 2016, 5:06 PM

dblaikie added inline comments.Oct 14 2016, 10:26 AM

tools/llvm-xray/xray-extract.cc
238–240	Previous code only printed this "Cannot extract" message, new code will print that as well as whatever text is in Err, right? Is that a desired change? Is there some nice/easy way to append/prepend the "Cannot extract" text to the exsiting Error to pass up to main to print there? (splitting the diagnostic printing between the two places seems a bit awkward)
246–247	Ideally pass up the string and error code here, rather than printing then passing?
280	Still, think it might be nice to move away from it being splatted back and forth to memory like that - and having the expected C++ types (like enums and bools).

Use errors exclusively

Please have another look @dblaikie -- before I add more tests?

tools/llvm-xray/xray-extract.cc
280	I agree -- on the ELF section (and the stuff embedded in the binaries), I think we're a bit constrained by what we can effectively write on the compiler-side (and read on the tooling side). On the trace files though I'm a bit more open to working with a unified encoding format, that was a bit more clever than it currently is.

dblaikie added inline comments.Oct 17 2016, 10:31 AM

tools/llvm-xray/xray-extract.cc
280	I agree -- on the ELF section (and the stuff embedded in the binaries), I think we're a bit constrained by what we can effectively write on the compiler-side (and read on the tooling side). On the trace files though I'm a bit more open to working with a unified encoding format, that was a bit more clever than it currently is. Sorry I think I'm still confused/we're talking past each other. All I mean is that, rather than this tool splatting into a carefully laid out struct, it would use something like DataExtractor to extract the carefully laid out bytes into a more generic struct. The file formats and writer would remain the same - just the reader would be more robust and more a usable type for the rest of the APIs. (I suppose put another way: Using the YAML struct with its YAML types isn't really ideal for general tool code that's trying to process these records. Equally, using a carefully laid out struct with surprising types (char instead of bool/enum, explicit padding, etc) has the same problem. The main interchange structure should probably be the generic semantic thing, not YAML or binary file related - and once that happens, we could skip the packed struct entirely and just read in the few records with DataExtractor or a similar API)

Use a common sled representation, hide ELF64-specific implementation detail

Please have another look @dblaikie.

tools/llvm-xray/xray-extract.cc
280	Ah, yes, this makes sense. I've hidden the ELF64-specific sled layout and defined a higher level `SledEntry` type that could be exposed later.

dblaikie added inline comments.Oct 18 2016, 10:14 AM

tools/llvm-xray/xray-extract.cc
142–145	I'm still a bit concerned about doing this via memory mapping (what if we are on a different platform that happens to add some extra padding between fields, etc?) So I'd suggest using DataExtractor or similar (then you don't have to worry about alignment either) techniques/devices/tools, probably? Maybe it just seems like this function is harder to read than it is in the code review, but might consider breaking it up - perhaps taking all the error handling at the beginning and turning it into one utility function, so this function (LoadBinaryInstrELF) can focus on the parsing, etc (or break out this parsing into another function, etc - or even both).
263–267	Would it make sense to sink this code into InstrumentationMapExtractor's ctor? (it has all the information - it knows it's trying to extract an instrumentation map, and the name of the input - so it seems it should be responsible for creating this message, maybe?)

Use DataExtractor to load data from the memory mapped object file.

PTAL @dblaikie

tools/llvm-xray/xray-extract.cc
142–145	Ah, I get it now. I'm using llvm::DataExtractor now. Refactoring this further seems pre-mature, given that this is already an implementation detail. When supporting other formats, it would be something to think about, I agree.
263–267	Actually, nope -- because users of this class could choose to ignore the errors (i.e. treat it as if there was no available instrumentation map). It's just that for this sub-command, it wouldn't work if extraction actually failed. :)

Generally looking pretty good - a few little nits & more test coverage (error cases, for example) is probably a good idea.

test/tools/llvm-xray/X86/extract-instrmap.ll
6–7	Convenient how? I suppose what i'm trying to understand is: why /both/? (which is to say, what does each offer that the other doesn't)
13	Is it worth checking the specific addresses? If this test is simple enough to provide fairly stable values here - to make sure the algorithms, etc, are working correctly & not just printing garbage values?
tools/llvm-xray/xray-extract.cc
86	I'd sink this down into the function/near the use (the general principle of keeping variables in the smallest scope needed)
164	Drop the "? false : true" here, and use !=, perhaps: "Entry.AlwaysInstrument = AlwaysInstrument != 0" (potentially even drop the "!= 0", but I can see how that helps readability.
182	Move assignment rather than swap? (but equally I wouldn't mind the old code that left the OutputSleds in an unspecified state on failure)
234–238	This code (& LoadYAMLInstrFile) is dead/untested - perhaps should be moved into another/separate change.
263–267	I'm not sure I understand this - could you rephrase? What I mean is: The joining of this "Cannot extract instrumentation map from" seems like it could go inside the InstrumentationMapExtractor ctor (since it has all the context to make that message and the message seems appropriate at that level). Then at this level (in the CommandRegistration) we just propagate the error up ("if (Err) return Err; or whatever)
tools/llvm-xray/xray-extract.h
39	Doesn't look like this needs to be a member? (it's only used during construction, if I'm reading it correctly)
51	Unused/dead code
tools/llvm-xray/xray-registry.h
31–32 ↗	(On Diff #75109)	Could just make this a struct - since it's only member's public anyway.
36 ↗	(On Diff #75109)	It doesn't seem like it actually requires that SC is not null (& pedantically the terminology would be "not null" (or "not a null pointer" - 'nullptr' is just some specific null pointer literal)) - but, sure - seems OK to say that's not acceptable even if it'd be fine for the current implementation. Is it worth just defining this function to only work if the SubCommand is registered? (assert when not found, instead of returning the empty std::function)

Add failure tests
Address more review comments

dberris marked an inline comment as done.Oct 20 2016, 11:53 PM

dberris added inline comments.

test/tools/llvm-xray/X86/extract-instrmap.ll
6–7	JSON was convenient for making it easily parse-able and view-able by anything that will show it on a browser. :)
13	Yep, added a few more functions to be a bit more complete.
tools/llvm-xray/xray-extract.cc
263–267	Ah, right -- I've moved the concatenation of the Twine into the error generation parts.
tools/llvm-xray/xray-extract.h
51	Unused/dead code Yes, but this is actually used in a later (stacked) change...

This is ready for another look @dblaikie -- added some failure tests that were easy to catch (not found, no instrumentation map) and provided some sample binaries to use instead of creating one on the fly (as it seemed harder to generate a final linked executable from a .ll).

If you have ideas about how to make the sequence:

llc ...; <link the .o into a final binary> ;  llvm-xray extract <final binary>

work in a .ll test, I'd be very interested to try.

In D21987#576231, @dberris wrote:
This is ready for another look @dblaikie -- added some failure tests that were easy to catch (not found, no instrumentation map) and provided some sample binaries to use instead of creating one on the fly (as it seemed harder to generate a final linked executable from a .ll).

If you have ideas about how to make the sequence:
llc ...; <link the .o into a final binary> ;  llvm-xray extract <final binary>
work in a .ll test, I'd be very interested to try.

Could you explain why you need a linked binary rather than just an object file? What changes? Relocations, I guess? All the addresses look like zero?

Yeah, doubt there's much to do about that - no way to produce linked binaries with the LLVM tools available to the test suite, and no point implementing support in the tool to handle unapplied relocations (unlike dwarfdump which does handle them, and there's more reason to run dwarfdump on unlinked objects, so it's worth having that support for more than just testing).

tools/llvm-xray/xray-extract.cc
90–93	Untested (do we need this test? Presumably we just won't be able to open the file)
98–99	I'm guessing this is the path we take for "no such file or directory"? (ie: covered by the test for that)
101–104	Untested
119–122	Untested (though, granted - not sure quite how to test this, but could look further into libObject to see how getContents can fail)
128–132	Untested
207–209	Guess this should just be an assert for now? (represents a programmer error/isn't reachable/testable/etc?)
tools/llvm-xray/xray-registry.cc
34–35 ↗	(On Diff #75393)	Replace branch-to-unreachable with assert.

In D21987#576626, @dblaikie wrote:
In D21987#576231, @dberris wrote:
This is ready for another look @dblaikie -- added some failure tests that were easy to catch (not found, no instrumentation map) and provided some sample binaries to use instead of creating one on the fly (as it seemed harder to generate a final linked executable from a .ll).

If you have ideas about how to make the sequence:
llc ...; <link the .o into a final binary> ;  llvm-xray extract <final binary>
work in a .ll test, I'd be very interested to try.
Could you explain why you need a linked binary rather than just an object file? What changes? Relocations, I guess? All the addresses look like zero?

The addresses are basically zero on a .o :/

Yeah, doubt there's much to do about that - no way to produce linked binaries with the LLVM tools available to the test suite, and no point implementing support in the tool to handle unapplied relocations (unlike dwarfdump which does handle them, and there's more reason to run dwarfdump on unlinked objects, so it's worth having that support for more than just testing).

Yep, I think the binary should be fine here.

However, I'm having some trouble generating malformed binaries without resorting to hex-editing some existing files. Any recommendations for faking "bad" inputs to test the error conditions?

tools/llvm-xray/xray-extract.cc
90–93	Good point, removed.
98–99	Yes. that's correct.

Add test for empty, convert some if-unreachable to assert

Add elf32 sample and test for unsupportedness
Add test for badly sized instrumentation map entries.

Added a couple more tests, ptal -- I haven't quite resorted to hex-editing files, but close enough with deliberately generating a bad XRay instrumentation map with a modified clang+llvm local build.

tools/llvm-xray/xray-extract.cc
119–122	I looked, and it had something to do with the ELF encoding of a file (say, if the supposed size of the section defined in the header is different in reality (through some checks)). Not sure how to properly test this yet.

dblaikie accepted this revision.Oct 25 2016, 8:52 AM

dblaikie edited edge metadata.

dblaikie added inline comments.

tools/llvm-xray/llvm-xray.cc
36	Perhaps it'd be better to catch this case earlier - disallow registering with a null/empty function (just assert) rather than quietly not executing anything?

This revision is now accepted and ready to land.Oct 25 2016, 8:52 AM

Squash local commits and rebase before landing.

Thanks for the review @dblaikie!

tools/llvm-xray/llvm-xray.cc
36	Good idea. Added an assert on the registration path to make sure the function is not "empty".

Update description.

Landed in rL285155.

Rolled back in rL285155, broke the tests.

This revision is now accepted and ready to land.Oct 25 2016, 9:22 PM

Landed as rL285165.

Revision Contents

Path

Size

test/

tools/

llvm-xray/

X86/

Inputs/

simple-instrmap.yaml

10 lines

extract-instrmap.ll

34 lines

lit.local.cfg

1 line

tools/

llvm-xray/

8 lines

82 lines

68 lines

289 lines

31 lines

Diff 71297

test/tools/llvm-xray/X86/Inputs/simple-instrmap.yaml

This file was added.

				# This is a simple instrumentation map with bogus addresses and offsets, but
				# follow the recommended format.
				dblaikieUnsubmitted Done Reply Inline Actions What's this file used for? I couldn't see its name mentioned anywhere else and it has no RUN lines. dblaikie: What's this file used for? I couldn't see its name mentioned anywhere else and it has no RUN…
				dberrisAuthorUnsubmitted Done Reply Inline Actions Ah, yes, this one is required in later patches. I'll remove this from this change. dberris: Ah, yes, this one is required in later patches. I'll remove this from this change.
				---
				- { id: 1, address: 0x1, function: 0x1, kind: function-enter, always-instrument: true}
				- { id: 1, address: 0x2, function: 0x1, kind: function-exit, always-instrument: true}
				- { id: 2, address: 0x2, function: 0x2, kind: function-enter, always-instrument: true}
				- { id: 2, address: 0x3, function: 0x2, kind: function-exit, always-instrument: true}
				- { id: 3, address: 0x3, function: 0x3, kind: function-enter, always-instrument: true}
				- { id: 3, address: 0x4, function: 0x3, kind: function-exit, always-instrument: true}
				...

test/tools/llvm-xray/X86/extract-instrmap.ll

This file was added.

				; This test makes sure we can extract the instrumentation map from an
				; XRay-instrumented object file.
				;
				; RUN: llc -filetype=obj -o %t -mtriple=x86_64-unknown-linux-gnu < %s
				; RUN: llvm-xray extract -format=json %t \| FileCheck %s --check-prefix=CHECK-JSON
				; RUN: llvm-xray extract -format=yaml %t \| FileCheck %s --check-prefix=CHECK-YAML

				dblaikieUnsubmitted Done Reply Inline Actions Do we need both yaml and json? If this is only for testing purposes, seems one would suffice? (is this for other purposes?) dblaikie: Do we need both yaml and json? If this is only for testing purposes, seems one would suffice?
				dberrisAuthorUnsubmitted Not Done Reply Inline Actions JSON is convenient but not strictly required. Removed. dberris: JSON is convenient but not strictly required. Removed.
				dblaikieUnsubmitted Not Done Reply Inline Actions Convenient how? I suppose what i'm trying to understand is: why /both/? (which is to say, what does each offer that the other doesn't) dblaikie: Convenient how? I suppose what i'm trying to understand is: why /both/? (which is to say, what…
				dberrisAuthorUnsubmitted Not Done Reply Inline Actions JSON was convenient for making it easily parse-able and view-able by anything that will show it on a browser. :) dberris: JSON was convenient for making it easily parse-able and view-able by anything that will show it…
				define i32 @foo() nounwind noinline uwtable "function-instrument"="xray-always" {
				ret i32 0
				}

				; CHECK-JSON: [
				; CHECK-JSON-NEXT: {
				dblaikieUnsubmitted Not Done Reply Inline Actions Is it worth checking the specific addresses? If this test is simple enough to provide fairly stable values here - to make sure the algorithms, etc, are working correctly & not just printing garbage values? dblaikie: Is it worth checking the specific addresses? If this test is simple enough to provide fairly…
				dberrisAuthorUnsubmitted Not Done Reply Inline Actions Yep, added a few more functions to be a bit more complete. dberris: Yep, added a few more functions to be a bit more complete.
				; CHECK-JSON-NEXT: "funcId": 1,
				; CHECK-JSON-NEXT: "funcAddr": {{.*}},
				; CHECK-JSON-NEXT: "sledAddr": {{.*}},
				; CHECK-JSON-NEXT: "sledKind": "FunctionEnter",
				; CHECK-JSON-NEXT: "alwaysInstrument": true
				; CHECK-JSON-NEXT: },
				; CHECK-JSON-NEXT: {
				; CHECK-JSON-NEXT: "funcId": 1,
				; CHECK-JSON-NEXT: "funcAddr": {{.*}},
				; CHECK-JSON-NEXT: "sledAddr": {{.*}},
				; CHECK-JSON-NEXT: "sledKind": "FunctionExit",
				; CHECK-JSON-NEXT: "alwaysInstrument": true
				; CHECK-JSON-NEXT: }
				; CHECK-JSON-NEXT: ]

				; CHECK-YAML: ---
				; CHECK-YAML-NEXT: - { id: 1, address: 0x{{[0-9A-F]+}}, function: 0x{{[0-9A-F]+}}, kind: function-enter,
				; CHECK-YAML-NEXT: always-instrument: true }
				; CHECK-YAML-NEXT: - { id: 1, address: 0x{{[0-9A-F]+}}, function: 0x{{[0-9A-F]+}}, kind: function-exit,
				; CHECK-YAML-NEXT: always-instrument: true }
				; CHECK-YAML-NEXT: ...

test/tools/llvm-xray/X86/lit.local.cfg

This file was added.

config.suffixes = ['.yaml', '.ll']

tools/llvm-xray/CMakeLists.txt

This file was added.

				set(LLVM_LINK_COMPONENTS
				${LLVM_TARGETS_TO_BUILD}
				Support
				Object)

				set(LLVM_XRAY_TOOLS
				xray-extract.cc)
				dblaikieUnsubmitted Done Reply Inline Actions I think we usually keep these lists (including the source file lists below) sorted... though I see I've not done that consistently in the past either. dblaikie: I think we usually keep these lists (including the source file lists below) sorted... though I…
				add_llvm_tool(llvm-xray llvm-xray.cc ${LLVM_XRAY_TOOLS})

tools/llvm-xray/llvm-xray.cc

This file was added.

				//===- llvm-xray.cc - XRay Tool Main Program ------------------------------===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// This file implements the main entry point for the suite of XRay tools. All
				// additional functionality are implemented as subcommands.
				//
				//===----------------------------------------------------------------------===//
				//
				// Basic usage:
				//
				// llvm-xray [options] <subcommand> [subcommand-specific options]
				//

				#include "xray-extract.h"
				#include "llvm/Support/CommandLine.h"
				#include "llvm/Support/FileSystem.h"
				#include "llvm/Support/raw_ostream.h"
				#include <unistd.h>

				using namespace llvm;
				using namespace llvm::xray;

				// Define the subcommands we support.

				// llvm-xray extract
				// ----------------------------------------------------------------------------
				static cl::SubCommand Extract("extract", "Extract instrumentation maps");
				static cl::opt<std::string> ExtractInput(cl::Positional,
				cl::desc("<input file>"), cl::Required,
				cl::sub(Extract));
				dblaikieUnsubmitted Done Reply Inline Actions Perhaps it'd be better to catch this case earlier - disallow registering with a null/empty function (just assert) rather than quietly not executing anything? dblaikie: Perhaps it'd be better to catch this case earlier - disallow registering with a null/empty…
				dberrisAuthorUnsubmitted Not Done Reply Inline Actions Good idea. Added an assert on the registration path to make sure the function is not "empty". dberris: Good idea. Added an assert on the registration path to make sure the function is not "empty".
				static cl::opt<std::string>
				ExtractOutput("output", cl::value_desc("output file"), cl::init("-"),
				cl::desc("output file; use '-' for stdout"),
				dblaikieUnsubmitted Done Reply Inline Actions Roll the variable declaration into the condition dblaikie: Roll the variable declaration into the condition
				cl::sub(Extract));
				static cl::alias ExtractOutput2("o", cl::aliasopt(ExtractOutput),
				dblaikieUnsubmitted Done Reply Inline Actions Drop the {} on single-line blocks (possibly even on the "if (SC)" block too - though that's more questionable, some people take teh LLVM convention as "no braces on single line blocks" others as "no braces on single statement blocks, even if the statement is spread over multiple lines") dblaikie:* Drop the {} on single-line blocks (possibly even on the "if (*SC)" block too - though that's…
				cl::desc("Alias for -output"),
				cl::sub(Extract));

				static cl::opt<ExtractOutputFormats>
				ExtractFormat("format", cl::desc("output format"),
				cl::values(clEnumValN(ExtractOutputFormats::JSON, "json",
				"instrumentation map in json"),
				clEnumValN(ExtractOutputFormats::YAML, "yaml",
				"output instrumentation map in csv"),
				clEnumValEnd),
				cl::init(ExtractOutputFormats::YAML), cl::sub(Extract));
				static cl::alias ExtractFormat2("f", cl::desc("Alias for -format"),
				cl::aliasopt(ExtractFormat), cl::sub(Extract));

				// Options common to all subcommands
				// ---------------------------------
				static cl::opt<InstrumentationMapExtractor::InputFormats> InstrMapFormat(
				"instr-map-format", cl::desc("format of instrumentation map"),
				cl::values(clEnumValN(InstrumentationMapExtractor::InputFormats::ELF, "elf",
				"instrumentation map in an ELF header"),
				clEnumValN(InstrumentationMapExtractor::InputFormats::YAML,
				"yaml", "instrumentation map in YAML"),
				clEnumValEnd),
				cl::sub(*cl::AllSubCommands),
				cl::init(InstrumentationMapExtractor::InputFormats::ELF));
				static cl::alias InstrMapFormat2("t", cl::aliasopt(InstrMapFormat),
				cl::desc("Alias for -instr-map-format"),
				cl::sub(*cl::AllSubCommands));

				int main(int argc, char *argv[]) {
				cl::ParseCommandLineOptions(argc, argv,
				"XRay Tools\n\n"
				" This program consolidates multiple XRay trace "
				"processing tools for convenient access.\n");
				if (Extract)
				return xray::extract(ExtractInput, InstrMapFormat, ExtractOutput,
				ExtractFormat);

				// We need a subcommand to do anything useful. At this point show usage.
				cl::PrintHelpMessage(false, true);
				}

tools/llvm-xray/xray-extract.h

This file was added.

				//===- xray-extract.h - XRay Instrumentation Map Extraction ---------------===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// Defines the interface for extracting the instrumentation map from an
				// XRay-instrumented binary.
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_TOOLS_XRAY_EXTRACT_H
				#define LLVM_TOOLS_XRAY_EXTRACT_H

				#include <deque>
				#include <map>
				#include <string>
				#include <unordered_map>

				#include "xray-sleds.h"
				#include "llvm/Support/raw_ostream.h"

				namespace llvm {
				namespace xray {

				class InstrumentationMapExtractor {
				public:
				typedef std::unordered_map<int32_t, uint64_t> FunctionAddressMap;
				typedef std::unordered_map<uint64_t, int32_t> FunctionAddressReverseMap;

				enum class InputFormats { ELF, YAML };

				private:
				std::string Filename;
				std::deque<XRaySledEntry> Sleds;
				FunctionAddressMap FunctionAddresses;
				dblaikieUnsubmitted Done Reply Inline Actions Doesn't look like this needs to be a member? (it's only used during construction, if I'm reading it correctly) dblaikie: Doesn't look like this needs to be a member? (it's only used during construction, if I'm…
				FunctionAddressReverseMap FunctionIds;

				public:
				/// Loads the instrumentation map from \|Filename\|. Updates \|EC\| in case there
				/// were errors encountered opening the file. \|Format\| defines what the input
				/// instrumentation map is in.
				explicit InstrumentationMapExtractor(std::string Filename,
				InputFormats Format,
				dblaikieUnsubmitted Done Reply Inline Actions No need for explicit on multi-arg ctors - we haven't bothered to try to do that across the LLVM codebase. dblaikie: No need for explicit on multi-arg ctors - we haven't bothered to try to do that across the LLVM…
				std::error_code &EC);

				const FunctionAddressMap &getFunctionAddresses() { return FunctionAddresses; }

				dblaikieUnsubmitted Done Reply Inline Actions Unused/dead code dblaikie: Unused/dead code
				dberrisAuthorUnsubmitted Not Done Reply Inline Actions Unused/dead code Yes, but this is actually used in a later (stacked) change... dberris: > Unused/dead code Yes, but this is actually used in a later (stacked) change...
				/// Exports the loaded function address map as JSON through \|OS\|.
				void exportAsJSON(raw_ostream &OS);

				/// Exports the loaded function address map as YAML through \|OS\|.
				void exportAsYAML(raw_ostream &OS);
				};

				enum class ExtractOutputFormats { JSON, YAML };

				int extract(StringRef Input,
				InstrumentationMapExtractor::InputFormats InputFormat,
				StringRef Output, ExtractOutputFormats OutputFormat);

				dblaikieUnsubmitted Done Reply Inline Actions Documentation comment, perhaps? It's not clear, looking at the signature, what this might do (how could the output be provided in a StringRef? What would the ownership semantics be? Should this be called "convert" rather than "extract" if it maps from one format to another? Should this have an intermediate representation so that different input and output formats can be generically composed?) dblaikie: Documentation comment, perhaps? It's not clear, looking at the signature, what this might do…
				dberrisAuthorUnsubmitted Done Reply Inline Actions Actually, this is vestigial (we don't define this function anymore). dberris: Actually, this is vestigial (we don't define this function anymore).
				} // namespace xray
				} // namespace llvm

				#endif // LLVM_TOOLS_XRAY_EXTRACT_H

tools/llvm-xray/xray-extract.cc

This file was added.

				//===- xray-extract.cc - XRay Instrumentation Map Extraction --------------===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// Implementation of the xray-extract.h interface.
				//
				// FIXME: Support other XRay-instrumented binary formats other than ELF.
				//
				//===----------------------------------------------------------------------===//

				#include <type_traits>
				#include <unistd.h>
				#include <utility>

				#include "xray-extract.h"

				#include "xray-sleds.h"
				#include "llvm/Object/ELF.h"
				#include "llvm/Object/ObjectFile.h"
				#include "llvm/Support/CommandLine.h"
				#include "llvm/Support/ELF.h"
				#include "llvm/Support/Format.h"
				#include "llvm/Support/YAMLTraits.h"
				#include "llvm/Support/raw_ostream.h"

				using namespace llvm;
				using namespace llvm::xray;

				using ::llvm::yaml::MappingTraits;
				using ::llvm::yaml::ScalarEnumerationTraits;
				using ::llvm::yaml::IO;
				using ::llvm::yaml::Hex64;
				using ::llvm::yaml::Output;
				using ::llvm::yaml::Input;

				enum class FunctionKinds { ENTRY, EXIT };

				struct YAMLXRaySledEntry {
				int32_t FuncId;
				Hex64 Address;
				Hex64 Function;
				FunctionKinds Kind;
				bool AlwaysWinstrument;
				};

				template <> struct ScalarEnumerationTraits<FunctionKinds> {
				static void enumeration(IO &IO, FunctionKinds &Kind) {
				IO.enumCase(Kind, "function-enter", FunctionKinds::ENTRY);
				IO.enumCase(Kind, "function-exit", FunctionKinds::EXIT);
				}
				};

				template <> struct MappingTraits<YAMLXRaySledEntry> {
				dblaikieUnsubmitted Done Reply Inline Actions Could we detect which kind of file we're reading with file magic instead of passing an argument to specify the input format? Is it useful to specify the input format as YAML if the only task to be performed is to generate YAML? Perhaps the only input format should be ELF/object file. dblaikie: Could we detect which kind of file we're reading with file magic instead of passing an argument…
				dberrisAuthorUnsubmitted Done Reply Inline Actions Could we detect which kind of file we're reading with file magic instead of passing an argument to specify the input format? Not with how we've defined it in compiler-rt, no. :( Is it useful to specify the input format as YAML if the only task to be performed is to generate YAML? Perhaps the only input format should be ELF/object file. Good point. For extraction it doesn't make sense. But the class that does extraction can be used (and actually is used in later patches) to load an instrumentation map in YAML. I'll remove the flag and assume that extraction will only ever make sense on ELF/object files. dberris: > Could we detect which kind of file we're reading with file magic instead of passing an…
				static void mapping(IO &IO, YAMLXRaySledEntry &Entry) {
				IO.mapRequired("id", Entry.FuncId);
				IO.mapRequired("address", Entry.Address);
				IO.mapRequired("function", Entry.Function);
				IO.mapRequired("kind", Entry.Kind);
				IO.mapRequired("always-instrument", Entry.AlwaysWinstrument);
				}

				static constexpr bool flow = true;
				};
				dblaikieUnsubmitted Done Reply Inline Actions LLVM isn't too fussy about using declarations/directives - you're welcome to "using llvm::yaml" if that seems suitable, I think? (maybe the yaml namespace has particularly vague names that are problematic?) (also we don't tend to bother with the '::' global scope qualifier) dblaikie: LLVM isn't too fussy about using declarations/directives - you're welcome to "using llvm::yaml"…
				dberrisAuthorUnsubmitted Done Reply Inline Actions LLVM isn't too fussy about using declarations/directives - you're welcome to "using llvm::yaml" if that seems suitable, I think? (maybe the yaml namespace has particularly vague names that are problematic?) Right -- I was merely following the examples. I'll try with just `using namespace llvm::yaml` here. (also we don't tend to bother with the '::' global scope qualifier) Will fix this -- force of habit. :) dberris: > LLVM isn't too fussy about using declarations/directives - you're welcome to "using llvm…

				LLVM_YAML_IS_SEQUENCE_VECTOR(YAMLXRaySledEntry);

				namespace {

				// Returns {"message", false} on errors; {"", true} otherwise.
				std::pair<StringRef, std::error_code> LoadBinaryInstrELF(
				StringRef Filename, std::deque<XRaySledEntry> &Sleds,
				InstrumentationMapExtractor::FunctionAddressMap &InstrMap,
				InstrumentationMapExtractor::FunctionAddressReverseMap &FunctionIds) {
				if (Filename.empty())
				llvm_unreachable("Provided Filename is empty.");

				auto ObjectFile = object::ObjectFile::createObjectFile(Filename);

				// FIXME: Maybe support other ELF formats. For now, 64-bit Little Endian only.
				if (!ObjectFile \|\| !ObjectFile->getBinary()->isELF())
				return std::make_pair("File format not supported (only does ELF).",
				dblaikieUnsubmitted Done Reply Inline Actions I'd sink this down into the function/near the use (the general principle of keeping variables in the smallest scope needed) dblaikie: I'd sink this down into the function/near the use (the general principle of keeping variables…
				std::make_error_code(std::errc::not_supported));

				// Find the section named "xray_instr_map".
				StringRef Contents = "";
				const auto &Sections = ObjectFile->getBinary()->sections();
				auto I = find_if(Sections, [&](object::SectionRef Section) {
				StringRef Name = "";
				majnemerUnsubmitted Done Reply Inline Actions Do you need the llvm:: qualifier here? majnemer: Do you need the llvm:: qualifier here?
				majnemerUnsubmitted Done Reply Inline Actions SectionRef is cheap, I'd pass it by value. majnemer: SectionRef is cheap, I'd pass it by value.
				dblaikieUnsubmitted Done Reply Inline Actions Untested (do we need this test? Presumably we just won't be able to open the file) dblaikie: Untested (do we need this test? Presumably we just won't be able to open the file)
				dberrisAuthorUnsubmitted Not Done Reply Inline Actions Good point, removed. dberris: Good point, removed.
				if (Section.getName(Name))
				return false;
				return Name == "xray_instr_map";
				});
				if (I == Sections.end())
				return std::make_pair("Failed to find XRay instrumentation map.",
				dblaikieUnsubmitted Done Reply Inline Actions I'm guessing this is the path we take for "no such file or directory"? (ie: covered by the test for that) dblaikie: I'm guessing this is the path we take for "no such file or directory"? (ie: covered by the test…
				dberrisAuthorUnsubmitted Not Done Reply Inline Actions Yes. that's correct. dberris: Yes. that's correct.
				std::make_error_code(std::errc::not_supported));
				if (I->getContents(Contents))
				return std::make_pair(
				"Failed to get contents of 'xray_instr_map' section.",
				std::make_error_code(std::errc::executable_format_error));
				dblaikieUnsubmitted Not Done Reply Inline Actions Untested dblaikie: Untested

				dblaikieUnsubmitted Done Reply Inline Actions If you can get away without needing to be convertible to error_code, that's probably a good thing (it's just there for backwards compatibility in APIs where we haven't migrated the whole stack to llvm::Error yet) Also you might be able to get away without needing a custom Error type entirely - maybe just use StringError? dblaikie: If you can get away without needing to be convertible to error_code, that's probably a good…
				dberrisAuthorUnsubmitted Done Reply Inline Actions If you can get away without needing to be convertible to error_code, that's probably a good thing (it's just there for backwards compatibility in APIs where we haven't migrated the whole stack to llvm::Error yet) Unfortunately that seems to be a pure virtual function in ErrorInfo<...>. :( I like `StringError` though, I'll change this to use StringError instead (less types, better). dberris: > If you can get away without needing to be convertible to error_code, that's probably a good…
				// Copy the instrumentation map data into the Sleds data structure.
				auto C = Contents.bytes_begin();
				if (std::distance(C, Contents.bytes_end()) % sizeof(XRaySledEntry) != 0)
				return std::make_pair(
				"Instrumentation map entries not evenly divisible by "
				"size of XRaySledEntry",
				std::make_error_code(std::errc::executable_format_error));
				while (C != Contents.bytes_end()) {
				// Because we have a 'const char*' we need to assume that it is not aligned.
				// This means we're going to have to memcpy it into properly-aligned memory
				dblaikieUnsubmitted Done Reply Inline Actions pair<StringRef, error_code> looks a bit like it should be llvm::Error, no? dblaikie: pair<StringRef, error_code> looks a bit like it should be llvm::Error, no?
				dberrisAuthorUnsubmitted Done Reply Inline Actions Ooh, nice -- I wasn't aware of `llvm::Error` before. I'll use that instead. dberris: Ooh, nice -- I wasn't aware of `llvm::Error` before. I'll use that instead.
				// first, then copy that into the Sleds data structure.
				std::aligned_storage<sizeof(XRaySledEntry)>::type AlignedSled;
				memcpy(&AlignedSled, C, sizeof(XRaySledEntry));
				Sleds.push_back(reinterpret_cast<XRaySledEntry >(&AlignedSled));
				C += sizeof(XRaySledEntry);
				dblaikieUnsubmitted Done Reply Inline Actions Branch-to-unreachable should be an assert instead, but chances are this should be a handled error case, right? Or is filename non-empty a precondition to this function? dblaikie: Branch-to-unreachable should be an assert instead, but chances are this should be a handled…
				dberrisAuthorUnsubmitted Done Reply Inline Actions Actually this should be an error, I'll change it. dberris: Actually this should be an error, I'll change it.
				}

				dblaikieUnsubmitted Not Done Reply Inline Actions Untested (though, granted - not sure quite how to test this, but could look further into libObject to see how getContents can fail) dblaikie: Untested (though, granted - not sure quite how to test this, but could look further into…
				dberrisAuthorUnsubmitted Not Done Reply Inline Actions I looked, and it had something to do with the ELF encoding of a file (say, if the supposed size of the section defined in the header is different in reality (through some checks)). Not sure how to properly test this yet. dberris: I looked, and it had something to do with the ELF encoding of a file (say, if the supposed size…
				// We replicate the function id generation scheme implemented in the runtime
				// here. Ideally we should be able to break it out, or output this map from
				// the runtime, but that's a design point we can discuss later on. For now, we
				// replicate the logic and move on.
				int32_t FuncId = 1;
				uint64_t CurFn = 0;
				for (const auto &Sled : Sleds) {
				auto F = Sled.Function;
				if (CurFn == 0) {
				CurFn = F;
				dblaikieUnsubmitted Done Reply Inline Actions Untested dblaikie: Untested
				InstrMap[FuncId] = F;
				dblaikieUnsubmitted Done Reply Inline Actions Test case for this codepath? (goes for all the error paths, ideally) dblaikie: Test case for this codepath? (goes for all the error paths, ideally)
				dberrisAuthorUnsubmitted Done Reply Inline Actions I'll add more test cases in the next revision. dberris: I'll add more test cases in the next revision.
				FunctionIds[F] = FuncId;
				}
				if (F != CurFn) {
				++FuncId;
				CurFn = F;
				InstrMap[FuncId] = F;
				FunctionIds[F] = FuncId;
				}
				}
				return std::make_pair("", std::error_code());
				}

				dblaikieUnsubmitted Not Done Reply Inline Actions I'm still a bit concerned about doing this via memory mapping (what if we are on a different platform that happens to add some extra padding between fields, etc?) So I'd suggest using DataExtractor or similar (then you don't have to worry about alignment either) techniques/devices/tools, probably? Maybe it just seems like this function is harder to read than it is in the code review, but might consider breaking it up - perhaps taking all the error handling at the beginning and turning it into one utility function, so this function (LoadBinaryInstrELF) can focus on the parsing, etc (or break out this parsing into another function, etc - or even both). dblaikie: I'm still a bit concerned about doing this via memory mapping (what if we are on a different…
				dberrisAuthorUnsubmitted Not Done Reply Inline Actions Ah, I get it now. I'm using llvm::DataExtractor now. Refactoring this further seems pre-mature, given that this is already an implementation detail. When supporting other formats, it would be something to think about, I agree. dberris: Ah, I get it now. I'm using llvm::DataExtractor now. Refactoring this further seems pre-mature…
				std::pair<StringRef, std::error_code> LoadYAMLInstrFile(
				StringRef Filename, std::deque<XRaySledEntry> &Sleds,
				InstrumentationMapExtractor::FunctionAddressMap &InstrMap,
				InstrumentationMapExtractor::FunctionAddressReverseMap &FunctionIds) {
				int Fd;
				auto EC = sys::fs::openFileForRead(Filename, Fd);
				if (EC)
				return {"Cannot open file for reading", EC};
				off_t FileSize;
				while ((FileSize = lseek(Fd, 0, SEEK_END)) == -1) {
				dblaikieUnsubmitted Done Reply Inline Actions If you're already using a random access iterator, I'd probably suggest writing this using op- instead of std::distance. std::distance is handy when you need to write generic code which might be applied to non-random access iterators, but otherwise seems a bit obfuscatory. dblaikie: If you're already using a random access iterator, I'd probably suggest writing this using op…
				if (errno == EINTR)
				continue;
				}
				assert(FileSize != -1 && "Error seeking to the end of file");
				sys::fs::mapped_file_region MappedFile(
				Fd, sys::fs::mapped_file_region::mapmode::readonly, FileSize, 0, EC);
				if (EC)
				return {"Cannot mmap file for read", EC};

				dblaikieUnsubmitted Done Reply Inline Actions Drop the "? false : true" here, and use !=, perhaps: "Entry.AlwaysInstrument = AlwaysInstrument != 0" (potentially even drop the "!= 0", but I can see how that helps readability. dblaikie: Drop the "? false : true" here, and use !=, perhaps: "Entry.AlwaysInstrument =…
				std::vector<YAMLXRaySledEntry> YAMLSleds;
				Input In(StringRef(MappedFile.data(), FileSize));
				In >> YAMLSleds;
				if (In.error())
				return {"Error parsing YAML file", In.error()};

				// Convert the sleds into proper XRaySledEntry objects.
				for (const auto &Y : YAMLSleds) {
				InstrMap[Y.FuncId] = Y.Function;
				FunctionIds[Y.Function] = Y.FuncId;
				Sleds.emplace_back(
				XRaySledEntry{Y.Address, Y.Function,
				Y.Kind == FunctionKinds::ENTRY ? uint8_t{0} : uint8_t{1},
				Y.AlwaysWinstrument, ""});
				}
				return {"", std::error_code()};
				}

				dblaikieUnsubmitted Done Reply Inline Actions Move assignment rather than swap? (but equally I wouldn't mind the old code that left the OutputSleds in an unspecified state on failure) dblaikie: Move assignment rather than swap? (but equally I wouldn't mind the old code that left the…
				} // namespace

				InstrumentationMapExtractor::InstrumentationMapExtractor(std::string Filename,
				InputFormats Format,
				std::error_code &EC)
				: Filename(std::move(Filename)) {
				switch (Format) {
				case InputFormats::ELF:
				std::tie(std::ignore, EC) = LoadBinaryInstrELF(
				this->Filename, Sleds, FunctionAddresses, FunctionIds);
				break;
				case InputFormats::YAML:
				std::tie(std::ignore, EC) = LoadYAMLInstrFile(
				this->Filename, Sleds, FunctionAddresses, FunctionIds);
				break;
				}
				}

				void InstrumentationMapExtractor::exportAsJSON(raw_ostream &OS) {
				// JSON Output for the instrumentation map should look like the following:
				//
				// [
				dblaikieUnsubmitted Done Reply Inline Actions Same feedback, might be fine to just use StringError dblaikie: Same feedback, might be fine to just use StringError
				// <sled entry>,
				// <sled entry>,
				// ...
				// <sled entry>
				// ]
				dblaikieUnsubmitted Done Reply Inline Actions Guess this should just be an assert for now? (represents a programmer error/isn't reachable/testable/etc?) dblaikie: Guess this should just be an assert for now? (represents a programmer error/isn't…
				//
				// Each of the <sled entry> elements turn into:
				//
				// {
				// "funcId": <function id as a number>,
				// "funcAdd": <function address as a string (in hex)>,
				// "sledKind": <the sled kind, as a string>,
				// "alwaysInstrument": <a boolean>
				// }
				OS << "[\n";
				auto Countdown = Sleds.size();
				for (const auto &Sled : Sleds) {
				OS << " {\n \"funcId\": " << FunctionIds[Sled.Function]
				<< ",\n \"funcAddr\": \"" << format_hex(Sled.Function, 16)
				<< "\",\n \"sledAddr\": \"" << format_hex(Sled.Address, 16)
				<< "\",\n \"sledKind\": \"";
				switch (Sled.Kind) {
				case '\0':
				OS << "FunctionEnter";
				break;
				case '\1':
				OS << "FunctionExit";
				break;
				default:
				OS << "Unknown";
				break;
				}
				OS << "\",\n \"alwaysInstrument\": "
				<< (Sled.AlwaysInstrument ? "true" : "false") << "\n }";
				dblaikieUnsubmitted Done Reply Inline Actions This code (& LoadYAMLInstrFile) is dead/untested - perhaps should be moved into another/separate change. dblaikie: This code (& LoadYAMLInstrFile) is dead/untested - perhaps should be moved into…
				if (--Countdown != 0)
				OS << ",";
				dblaikieUnsubmitted Done Reply Inline Actions Previous code only printed this "Cannot extract" message, new code will print that as well as whatever text is in Err, right? Is that a desired change? Is there some nice/easy way to append/prepend the "Cannot extract" text to the exsiting Error to pass up to main to print there? (splitting the diagnostic printing between the two places seems a bit awkward) dblaikie: Previous code only printed this "Cannot extract" message, new code will print that as well as…
				OS << "\n";
				}
				OS << "]\n";
				}

				void InstrumentationMapExtractor::exportAsYAML(raw_ostream &OS) {
				// First we translate the sleds into the YAMLXRaySledEntry objects in a deque.
				dblaikieUnsubmitted Done Reply Inline Actions Ideally pass up the string and error code here, rather than printing then passing? dblaikie: Ideally pass up the string and error code here, rather than printing then passing?
				std::vector<YAMLXRaySledEntry> YAMLSleds;
				YAMLSleds.reserve(Sleds.size());
				for (const auto &Sled : Sleds) {
				YAMLSleds.push_back(
				{FunctionIds[Sled.Function], Sled.Address, Sled.Function,
				Sled.Kind == '\0' ? FunctionKinds::ENTRY : FunctionKinds::EXIT,
				Sled.AlwaysInstrument ? true : false});
				}
				Output Out(OS);
				Out << YAMLSleds;
				}

				namespace llvm {
				namespace xray {

				int extract(StringRef Input,
				InstrumentationMapExtractor::InputFormats InputFormat,
				StringRef Output, ExtractOutputFormats OutputFormat) {
				std::error_code EC;
				dblaikieUnsubmitted Done Reply Inline Actions It'd probably be good to keep in the llvm::Error space rather than dropping to std::error_code - you'll probably be losing extra diagnostic information by collapsing into an error code like this. (same a few lines up with BinaryInstrELFError) dblaikie: It'd probably be good to keep in the llvm::Error space rather than dropping to std::error_code…
				xray::InstrumentationMapExtractor Extractor(Input, InputFormat, EC);
				dblaikieUnsubmitted Done Reply Inline Actions Would it make sense to sink this code into InstrumentationMapExtractor's ctor? (it has all the information - it knows it's trying to extract an instrumentation map, and the name of the input - so it seems it should be responsible for creating this message, maybe?) dblaikie: Would it make sense to sink this code into InstrumentationMapExtractor's ctor? (it has all the…
				dberrisAuthorUnsubmitted Done Reply Inline Actions Actually, nope -- because users of this class could choose to ignore the errors (i.e. treat it as if there was no available instrumentation map). It's just that for this sub-command, it wouldn't work if extraction actually failed. :) dberris: Actually, nope -- because users of this class could choose to ignore the errors (i.e. treat it…
				dblaikieUnsubmitted Done Reply Inline Actions I'm not sure I understand this - could you rephrase? What I mean is: The joining of this "Cannot extract instrumentation map from" seems like it could go inside the InstrumentationMapExtractor ctor (since it has all the context to make that message and the message seems appropriate at that level). Then at this level (in the CommandRegistration) we just propagate the error up ("if (Err) return Err; or whatever) dblaikie: I'm not sure I understand this - could you rephrase? What I mean is: The joining of this…
				dberrisAuthorUnsubmitted Not Done Reply Inline Actions Ah, right -- I've moved the concatenation of the Twine into the error generation parts. dberris: Ah, right -- I've moved the concatenation of the Twine into the error generation parts.
				if (EC) {
				errs() << "Cannot extract instrumentation map from '" << Input << "'\n";
				return 1;
				}
				raw_fd_ostream OS(Output, EC, sys::fs::OpenFlags::F_Text);
				if (EC) {
				errs() << "Cannot open file '" << Output << "' for writing.\n";
				return 1;
				}
				switch (OutputFormat) {
				case ExtractOutputFormats::JSON:
				Extractor.exportAsJSON(OS);
				break;
				dblaikieUnsubmitted Done Reply Inline Actions What's the purpose of this conditional operator ("AlwaysIntrument ? true : false") - one would assume AlwaysInstrument is already a boolean and this conditional operator is redundant. Ah, because it matches the binary format on disk, it's a char. Could we just get away from memcpying the struct off disk - only being 4 fields, llvm::DataExtractor might suffice to quickly pull 4 fields of known sizes out of a stream? & then the struct's members can be the right types, etc. (I wonder, but assume it's not appropriate, if we could just use the YAML struct for everything? But I guess we want various APIs consuming these data structures/records that shouldn't be aware of any YAML binding, etc) dblaikie: What's the purpose of this conditional operator ("AlwaysIntrument ? true : false") - one would…
				dberrisAuthorUnsubmitted Done Reply Inline Actions Yeah, we're constrained by the structure that's in disk. I thought about only ever using the YAML struct for everything, and it boils down to allowing it to be ported/used by other classes/functions that need to load the instrumentation map and don't need any of the YAML functionality. dberris: Yeah, we're constrained by the structure that's in disk. I thought about only ever using the…
				dblaikieUnsubmitted Not Done Reply Inline Actions Still, think it might be nice to move away from it being splatted back and forth to memory like that - and having the expected C++ types (like enums and bools). dblaikie: Still, think it might be nice to move away from it being splatted back and forth to memory like…
				dberrisAuthorUnsubmitted Not Done Reply Inline Actions I agree -- on the ELF section (and the stuff embedded in the binaries), I think we're a bit constrained by what we can effectively write on the compiler-side (and read on the tooling side). On the trace files though I'm a bit more open to working with a unified encoding format, that was a bit more clever than it currently is. dberris: I agree -- on the ELF section (and the stuff embedded in the binaries), I think we're a bit…
				dblaikieUnsubmitted Done Reply Inline Actions I agree -- on the ELF section (and the stuff embedded in the binaries), I think we're a bit constrained by what we can effectively write on the compiler-side (and read on the tooling side). On the trace files though I'm a bit more open to working with a unified encoding format, that was a bit more clever than it currently is. Sorry I think I'm still confused/we're talking past each other. All I mean is that, rather than this tool splatting into a carefully laid out struct, it would use something like DataExtractor to extract the carefully laid out bytes into a more generic struct. The file formats and writer would remain the same - just the reader would be more robust and more a usable type for the rest of the APIs. (I suppose put another way: Using the YAML struct with its YAML types isn't really ideal for general tool code that's trying to process these records. Equally, using a carefully laid out struct with surprising types (char instead of bool/enum, explicit padding, etc) has the same problem. The main interchange structure should probably be the generic semantic thing, not YAML or binary file related - and once that happens, we could skip the packed struct entirely and just read in the few records with DataExtractor or a similar API) dblaikie: > I agree -- on the ELF section (and the stuff embedded in the binaries), I think we're a bit…
				dberrisAuthorUnsubmitted Not Done Reply Inline Actions Ah, yes, this makes sense. I've hidden the ELF64-specific sled layout and defined a higher level `SledEntry` type that could be exposed later. dberris: Ah, yes, this makes sense. I've hidden the ELF64-specific sled layout and defined a higher…
				case ExtractOutputFormats::YAML:
				Extractor.exportAsYAML(OS);
				break;
				}
				return 0;
				}

				dblaikieUnsubmitted Done Reply Inline Actions a static variable inside a namespace - is the namespace there to have the same effect as "using"? Would using "using" be more obvious/straightforward? Oh, they're already 'using' at the top. So do these namespace decls provide anything? dblaikie: a static variable inside a namespace - is the namespace there to have the same effect as…
				dberrisAuthorUnsubmitted Done Reply Inline Actions This is vestigial -- it used to be that there was a function defined in the namespace, and it's changed to this global registration approach. I'll remove the namespaces. :) dberris: This is vestigial -- it used to be that there was a function defined in the namespace, and it's…
				} // namespace xray
				} // namespace llvm
				dblaikieUnsubmitted Done Reply Inline Actions Might be worth making the registerCommand into a "CommandRegistration" (since they'll all want to do it in a global init anyway, so it's not like you'll have other callers/uses of registerCommand except in idioms that all look like this global ctor): static CommandRegistration Unused(&Extract, [] { ... }); dblaikie: Might be worth making the registerCommand into a "CommandRegistration" (since they'll all want…
				dberrisAuthorUnsubmitted Done Reply Inline Actions That's a really good point, I definitely like this style better too. :) dberris: That's a really good point, I definitely like this style better too. :)
				dblaikieUnsubmitted Done Reply Inline Actions Might even be worth having the command registration functor return an llvm::Error, then the error printing could be handled up in the caller (if everyone's just going to print an error and return 1, at least - that could happen up there). dblaikie: Might even be worth having the command registration functor return an llvm::Error, then the…
				dberrisAuthorUnsubmitted Done Reply Inline Actions Good idea, yeah I'll change this to make the function return an llvm::Error and consolidate the logging and exit in main. dberris: Good idea, yeah I'll change this to make the function return an llvm::Error and consolidate the…

tools/llvm-xray/xray-sleds.h

This file was added.

				//===- xray-sleds.h - XRay Sleds Data Structure ---------------------------===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// Defines the structure used to represent XRay instrumentation map entries.
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_TOOLS_LLVM_XRAY_XRAY_SLEDS_H
				#define LLVM_TOOLS_LLVM_XRAY_XRAY_SLEDS_H

				namespace llvm {
				namespace xray {

				struct XRaySledEntry {
				uint64_t Address;
				uint64_t Function;
				unsigned char Kind;
				unsigned char AlwaysInstrument;
				unsigned char Padding[14];
				};

				majnemerUnsubmitted Not Done Reply Inline Actions This seems problematic, the fields should probably be ulittle64_t, etc.. This way they have the correct endianness and alignment. majnemer: This seems problematic, the fields should probably be ulittle64_t, etc.. This way they have…
				dberrisAuthorUnsubmitted Not Done Reply Inline Actions Interesting -- can we use `ulittle64_t` in compiler-rt as well? dberris: Interesting -- can we use `ulittle64_t` in compiler-rt as well?
				} // namespace xray
				} // namespace llvm

				#endif // LLVM_TOOLS_LLVM_XRAY_XRAY_SLEDS_H