This is an archive of the discontinued LLVM Phabricator instance.

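Paths

llvm/test/tools/llvm-objdump/Offloading/Inputs/binary.yaml
llvm/test/tools/llvm-objdump/Offloading/Inputs/malformed.yaml
llvm/test/tools/llvm-objdump/Offloading/binary.test
llvm/test/tools/llvm-objdump/Offloading/content-failure.test
llvm/test/tools/llvm-objdump/Offloading/failure.test
llvm/test/tools/llvm-objdump/Offloading/warning.test
llvm/tools/llvm-objdump/CMakeLists.txt
llvm/tools/llvm-objdump/ObjdumpOpts.td
llvm/tools/llvm-objdump/OffloadDump.h
llvm/tools/llvm-objdump/OffloadDump.cpp
llvm/tools/llvm-objdump/llvm-objdump.cpp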

Differential D126904

[llvm-objdump] Add support for dumping embedded offloading data
ClosedPublic

Authored by jhuber6 on Jun 2 2022, 11:39 AM.

Details

Reviewers

jdoerfert
tianshilei1992
MaskRay
yaxunl
tra
saiislam
alexander-shaposhnikov
jhenderson
JonChesterfield

Commits

rGd2d8b0aa4f80: [llvm-objdump] Add support for dumping embedded offloading data

Summary

In Clang/LLVM we are moving towards a new binary format to store many
embedded object files to create a fatbinary. This patch adds support for
dumping these embedded images in the llvm-objdump tool. This will
allow users to query information about what is stored inside the binary.
This is very similar in functionality to the cuobjdump tool, for those familiar
with the Nvidia utilities. The proposed use is as follows:

$ clang input.c -fopenmp --offload-arch=sm_70 --offload-arch=sm_52 -c
$ llvm-objdump -O input.o

input.o:        file format elf64-x86-64

OFFLOADING IMAGE [0]:
kind            cubin
arch            sm_52
triple          nvptx64-nvidia-cuda
producer        openmp

OFFLOADING IMAGE [1]:
kind            cubin
arch            sm_70
triple          nvptx64-nvidia-cuda
producer        openmp

This will be expanded further once we start embedding more information
into these offloading images. Right now we are planning on adding
flags and entries for debug level, optimization, LTO usage, target
features, among others.

This patch only supports printing these sections. Later we will want to
support dumping files the user may be interested in via another flag. I
am unsure if this should go here in llvm-objdump or llvm-objcopy.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

There are a very large number of changes, so older changes are hidden.

tra added inline comments.Jun 3 2022, 11:02 AM

llvm/tools/llvm-objdump/OffloadDump.cpp
79	I don't think the 'single' part of this assertion is true. AFAICT, `extractAllBinaries` will happily print all subsequent binaries if it finds them in the buffer. I think this should call `printBinary` instead.

jhuber6 added inline comments.Jun 3 2022, 11:46 AM

llvm/tools/llvm-objdump/OffloadDump.cpp
39	How about this? `printOffloadBinary()`, `printOffloadBinaries()`, `dumpOffloadSections()`, `dumpOffloadBinaries()`.
79	Yeah, I meant it more like to print on the single file that was already extracted or something. But it can definitely contain multiple. The reason I chose this method is because I wanted something that worked even if these sections were concatenated through a relocatable link or something. So whenever we parse one of these we just check the sizes to make sure there's not another one concatenated to it. I can make the comment less confusing.

Changing some names and comments.

Harbormaster completed remote builds in B167765: Diff 434096.Jun 3 2022, 12:34 PM

tra added inline comments.Jun 3 2022, 12:36 PM

llvm/tools/llvm-objdump/OffloadDump.cpp
39	What's the distinction between dump and print here? It would do for the time being, I guess. If we were to implement iterators over sections and offload binaries within, we would not need distinct names for them and then this whole code would look like this: printOffloadBinaries(const ObjectFile *O) { llvm::for_each(O->sections(), [](auto Section) { llvm::for_each(Section->binaries(), printBinary); }) } That would be expressive enough without having to split multiple levels of iteration into different functions, along with the associated hassle of having to come up with adequate names for them. :-) Error handling will likely throw a monkey wrench into this neat ideal scenario, so I'm not sure if it's worth it. I think general purpose iterators over sections and offload binaries would come handy when we get to extend the functionality. It does not need to be done in this patch.
79	I think the root of the problem here is that we're treating `OffloadBinary` as both the pointer to the binary itself and as a pointer to a collection of such binaries. I think it's not a good API -- extractAllBinaries gets to look under the hood of the implementation -- check if the containing buffer has extra space beyond the OffloadBinary it's been passed. What if the user places something else in the memory buffer right behind the OffloadBinary object the user passed to printOffloadBinary? They would be within their rights to do so, as the function would be expected to care about the content of the `OB` only. I think we should be a bit more pedantic about such things. If we expect to operate on a collection, the API should reflect that. E.g. use SmallVector<OffloadBinary>. I think implementing `ObjectFile::offload_sections()` and `OffloadSection::offload_binaries()` would help both here and above. Or, possibly, just `ObjectFile::offload_binaries()` if we don't need to care about how binaries are stored in the object file and just want the offload binaries themselves.
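
The nested-iteration idea in the comments above can be sketched without any LLVM dependency. This is only an illustration: `Object`, `Section`, `Binary` and `describeOffloadBinaries` are hypothetical stand-ins, not the real `ObjectFile` or `OffloadBinary` classes, and error handling is left out entirely, which is the part the comment itself expects to complicate the picture.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; they do not mirror LLVM's classes.
struct Binary {
  std::string Arch;
};

struct Section {
  std::vector<Binary> Bins;
  const std::vector<Binary> &binaries() const { return Bins; }
};

struct Object {
  std::vector<Section> Secs;
  const std::vector<Section> &sections() const { return Secs; }
};

// Visit every binary of every section with two nested for_each calls,
// collecting one descriptive line per binary.
std::vector<std::string> describeOffloadBinaries(const Object &O) {
  std::vector<std::string> Lines;
  std::for_each(O.sections().begin(), O.sections().end(),
                [&](const Section &S) {
                  std::for_each(S.binaries().begin(), S.binaries().end(),
                                [&](const Binary &B) {
                                  Lines.push_back("arch " + B.Arch);
                                });
                });
  return Lines;
}
```

With range accessors like these, the caller never needs separate "print one" and "print all" entry points; the two levels of iteration are visible in one place.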

jhuber6 added inline comments.Jun 3 2022, 1:04 PM

llvm/tools/llvm-objdump/OffloadDump.cpp
39	I just used dump since it's more in line with the vocabulary of the rest of the functions in `llvm-objdump`. Also in the future if we want to print out these sections we'd just have a flag somewhere that if enabled dumps the contents of the image rather than just the metadata. So that's the difference between just dump and print in my mind. As you mentioned we can get errors here so we'd need to have some kind of iterator over `Expected` values, but I'm not sure if we would extend the section class for this since this binary format is just some blob that just so happens to be contained in a section I'd say.
79	So the problem is we don't know how many of these are in here until we parse it. This requires getting the `size` field within the `OffloadBinary`. So even if we abstracted it to this iterator, it would still need some parsing like this behind the scenes. I could have made the binary format contain many within a single binary image, but like I said I wanted this to be stable under arbitrary concatenation by the linker. I'm not sure if we could have a different API considering the parsing requirements. This can definitely be problematic, depending on usage. I'm assuming if a user initialized an object on a memory buffer containing a bunch of junk it would probably be fine and just stop once the file is fully parsed. We could probably just ignore a parsing error, basically just stop trying to read things if we don't catch the magic bytes or there's not enough space left over, but that's probably not ideal. It's definitely a little obtuse, but I'm not sure if there's a good way to make it work better considering how we parse them.
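
The parsing scheme described above, where each binary records its own size and the reader keeps going while bytes remain, can be sketched as follows. The header layout here (a 4-byte magic followed by a 4-byte total size in host byte order) is an assumption made up for this example, and `Record`, `extractAll` and `makeRecord` are hypothetical names; the real `OffloadBinary` header carries more fields.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical record layout, for illustration only: 4 magic bytes followed
// by a 4-byte total size (header included) in host byte order.
constexpr char Magic[4] = {'D', 'E', 'M', 'O'};
constexpr std::size_t HeaderSize = 8;

struct Record {
  std::size_t Offset; // where the record starts in the buffer
  std::uint32_t Size; // total size taken from the record's own header
};

// Walk the buffer, using each record's size field to locate the next one.
// Stops at the first position that does not hold a well-formed header.
std::vector<Record> extractAll(const std::vector<unsigned char> &Buf) {
  std::vector<Record> Out;
  std::size_t Pos = 0;
  while (Buf.size() - Pos >= HeaderSize) {
    if (std::memcmp(Buf.data() + Pos, Magic, sizeof(Magic)) != 0)
      break; // no magic bytes here, so this is not another record
    std::uint32_t Size;
    std::memcpy(&Size, Buf.data() + Pos + sizeof(Magic), sizeof(Size));
    if (Size < HeaderSize || Size > Buf.size() - Pos)
      break; // size field is inconsistent with the space that is left
    Out.push_back({Pos, Size});
    Pos += Size;
  }
  return Out;
}

// Build a single record around the given payload.
std::vector<unsigned char> makeRecord(const std::string &Payload) {
  std::uint32_t Size = static_cast<std::uint32_t>(HeaderSize + Payload.size());
  std::vector<unsigned char> R(Magic, Magic + sizeof(Magic));
  R.resize(HeaderSize);
  std::memcpy(R.data() + sizeof(Magic), &Size, sizeof(Size));
  R.insert(R.end(), Payload.begin(), Payload.end());
  return R;
}
```

Appending the output of two `makeRecord` calls models what a relocatable link does to two input sections: `extractAll` finds both records without any outer container, and trailing bytes that do not form a header simply end the walk.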

Note: you probably need to consider jhenderson as a blocking reviewer for this change, even if folks who usually review offload changes give an approval.

llvm/tools/llvm-objdump/ObjdumpOpts.td
86	Single-letter options can easily conflict with GNU objdump. Please drop it. See https://github.com/llvm/llvm-project/issues/55297, where we spent a lot of effort coordinating with GNU.

In D126904#3559278, @MaskRay wrote:

Note: you probably need to consider jhenderson as a blocking reviewer for this change, even if folks who usually review offload changes give an approval.

I'll try to take a look at this later this week - I was off work for most of last week and am catching up on my core responsibilities before taking time for LLVM reviews.

Removing '-O' flag.

Harbormaster completed remote builds in B168132: Diff 434572.Jun 6 2022, 1:49 PM

jhuber6 mentioned this in D127304: [LinkerWrapper] Embed OffloadBinaries for OpenMP offloading images.Jun 8 2022, 6:51 AM

Fix test and make failing to visit an offload binary not an Error. We should require parsing at least one (The one passed in) and the ones after that should be allowed to be invalid.

Harbormaster completed remote builds in B168578: Diff 435166.Jun 8 2022, 8:51 AM

tra accepted this revision.Jun 8 2022, 1:58 PM

This revision is now accepted and ready to land.Jun 8 2022, 1:58 PM

@jhenderson Is this good to land?

Actually handle the errors correctly.

Harbormaster completed remote builds in B169064: Diff 435904.Jun 10 2022, 7:45 AM

Sorry for the delay - I ended up off sick for a big chunk of last week.

Please make sure new options are documented in the llvm-objdump command guide documentation (see llvm/docs/CommandGuide/llvm-objdump.rst).

llvm/test/tools/llvm-objdump/Offload/offload.test
1–2 ↗	(On Diff #435904)	As this is a new output format, we probably want to use the FileCheck options `--match-full-lines --strict-whitespace --implicit-check-not={{.}}` to ensure the entire output (including indentation etc) is checked-for and there is nothing additional being printed that shouldn't be. As there's only one check pattern used in the file, drop the `--check-prefix` argument and just use `CHECK:`, `CHECK-NEXT:` etc. Using pre-canned binaries is less than ideal. Can you use yaml2obj to generate the test inputs at runtime? This will make them less opaque and easier to maintain long-term. Depending on the section's complexity, it probably makes sense to add support for it to yaml2obj explicitly, so that it's easy to specify kind/arch/triple/producer etc in dedicated fields.
9–10 ↗	(On Diff #435904)	Should there be a blank line in the output between each offloading image? If so, use `CHECK-EMPTY` to check for it. If not, use `CHECK-NEXT` at the start of the next block.
llvm/tools/llvm-objdump/OffloadDump.cpp
2	Missing license header.
46–47	Test case for the error path?
61–63	It's extremely unfortunate that this relies on section names rather than section types. In my opinion, it would be far more appropriate to have a SHT_LLVM_OFFLOAD section type (or similar), so that section names don't need looking up and comparing to find the relevant section. ELF is designed really to use types for section comparison, not really names.
67	Error path test?
72	Error path test?
76	This seems to be just throwing away all errors. Are you sure that's what you meant to do, and not to report them with `reportError`? If so, did you mean to use `consumeError` (or possibly even `cantFail`)?
79	Re: "I said I wanted this to be stable under arbitrary concatenation by the linker": have you looked at how DWARF debug sections like .debug_line or .debug_aranges are structured? Typically, these sections have a header which contains information like total size of that section (or number of entries in the section) and version information. These sections are still concatenated, with the length simply representing the contribution from a single CU.
84	Ditto: should this be `consumeError`?
llvm/tools/llvm-objdump/llvm-objdump.cpp
194	Nit: it looks like there's a half-hearted attempt to have these fields in some form of alphabetical order - it's certainly not 100%, but I feel like this option probably belongs close to the `RawClangAST` option below.

Thanks for the review, I'll try to address your comments soon.

llvm/test/tools/llvm-objdump/Offload/offload.test
1–2 ↗	(On Diff #435904)	I'll adjust the tests. I could try to make a yaml2obj implementation for this. I already have a tool that creates these but it lives in Clang. Would implementing the yaml2obj go in a separate patch?
llvm/tools/llvm-objdump/OffloadDump.cpp
61–63	I wanted to keep it somewhat generic if we ever want to get this to work on COFF / MACH-O and names were the easiest solution at the time. I can try to make a change for that in the future, it would definitely be a better solution than checking a magic string.
67	Since I needed to use prepackaged binaries it was a little hard to make the test cases contain errors. This path specifically is just for an ELF that's broken somehow and can't get the section contents so I don't think it's relevant to the feature.
72	This basically only happens if the section doesn't have the necessary magic bytes or is too small, I'll try to add a test for that.
76	Those are probably better options, thanks. Yes, throwing away the errors is the intended solution here. These binaries are individual files stored in a big blob, when we parse one we check the rest of the buffer to see if there are more. I did it this way so when sections get merged via `ld -r foo.o bar.o` we can still find the files. It leads to a little weirdness with parsing however. Basically I only check errors for the first file in the section, if we fail to find any after that one we shouldn't treat it as an error.

Addressing some comments for now. Let me know if this is sufficient for now, or if I should move to address the other main comments, the section identification scheme and using binary blobs. Improving the tests further would require an implementation of yaml2obj for these binaries. It would be nice to have but I'd need to figure out how to write it, if you have any materials on that it would help. Similarly, it would help if you could point me to where we determine a section's type when we do code generation in LLVM.

jhuber6 added inline comments.Jun 13 2022, 7:13 AM

llvm/tools/llvm-objdump/OffloadDump.cpp
79	Right now I have a binary that knows its own size, and if the size of the buffer is greater than the size of that binary we look for another one. Forgive me if I'm misunderstanding here, but the linker will only concatenate sections right? Do these sections simply work as some kind of buffer whose size indicated how many sections were concatenated? That is, for every `.llvm.offloading` section I'd have some other reference section that just contains a single byte whose size I can check? Otherwise I'm not sure how we could figure out how many of these sections have been concatenated without parsing them first.

Harbormaster completed remote builds in B169435: Diff 436379.Jun 13 2022, 7:54 AM

In D126904#3577758, @jhuber6 wrote:

Addressing some comments for now. Let me know if this is sufficient for now, or if I should move to address the other main comments, the section identification scheme and using binary blobs.

I'd be concerned if a dumping implementation were written and then needed regular modification due to the section format not being stable yet. Usually, we'd try to get the details of the section format sorted before writing the dumping implementation, to reduce churn.

Improving the tests further would require an implementation of yaml2obj for these binaries. It would be nice to have but I'd need to figure out how to write it, if you have any materials on that it would help.

The only material I can offer is the yaml2obj source code - there are several other section kinds that have different dumping behaviour, that should be fairly straightforward to use as a basis for the implementation for this section. Take a look at how, for example, SHT_RELA sections are handled in yaml2obj.

Similarly, it would help if you could point me to where we determine a section's type when we do code generation in LLVM.

When you say "determine" do you mean, where we write the type out in assembly, or something else? I'm not all that familiar with the code generation layer of LLVM (I mostly focus on the tools and file format sections).

llvm/test/tools/llvm-objdump/Offload/offload-failure.test
1 ↗	(On Diff #436379)	The option is `offloading` so the test (and directory) names should be updated to match. There's not much point in naming a test "offloading-..." or similar, if it's already in an equivalent folder name. It's useful to have a brief comment at the top of a test to explain what exactly it is testing. In this case, what causes the failure? (In newer tests in llvm-objdump we use `##` for comments, to distinguish them from RUN and CHECK lines).
9 ↗	(On Diff #436379)	`Machine` type is usually optional, so you can omit it to reduce test noise.
11–17 ↗	(On Diff #436379)	Do you need the .text section for this test case?
20–24 ↗	(On Diff #436379)	This can be simplified: The flags aren't related to the offloading section printing, so get rid of them. Is the address needed? I suspect not from my understanding of what the program is doing. I suspect the AddressAlign field is also unnecessary for printing purposes, and can be omitted. You don't need both Content and Size, unless the section size needs to be bigger than the specified content. You can omit the Content field and the Size parameter will set the section size still, with the content set to null bytes to fill it up.
25 ↗	(On Diff #436379)	As you don't have any Symbols, you can omit this line.
llvm/test/tools/llvm-objdump/Offload/offload.test
11 ↗	(On Diff #436379)	I think you may be able to omit the Machine line.
12–20 ↗	(On Diff #436379)	As in the other test, you don't need the .text section (presumably) or the empty Symbols array.
28 ↗	(On Diff #436379)	Use `CHECK-NEXT` here and in the following `OFFLOADING IMAGE` lines.
1–2 ↗	(On Diff #435904)	Normally, yaml2obj support would be a separate, prerequisite patch, but it depends on how straightforward it is to test the implementation without introducing a circular dependency (as the section is fairly simple, you might be able to test by just inspecting the raw bytes). If it's not straightforward to devise necessary test cases, I'd manually test it using some pregenerated output, and then include the actual implementation in this patch (so that the automated testing can use llvm-objdump). It looks like you've not added the extra FileCheck switches mentioned in my first paragraph?
llvm/tools/llvm-objdump/OffloadDump.cpp
36–37	I think it would be sensible to test the default case (it's a common behaviour pattern for the default case to result in an error, so it seems sensible to demonstrate that it isn't in this case).
61–63	I'm not familiar with Mach-O, although I know that COFF does rely on strings, but it's quite normal to vary this somewhat between file formats, due to the different features available in those formats. Changing to a specific type for ELF may be a prerequisite for a proper yaml2obj implementation anyway, so that yaml2obj knows how to parse the YAML for such sections.
67	The fact that you've had to add code to handle the error shows that it is relevant, otherwise if the error handling is broken, you won't know. It should be fairly easy to test this using yaml2obj, which has the ability to overwrite the sh_offset field of the section header (using a "SHOffset" field name, if I remember rightly - look for examples).
72	Usually it's a good idea to add more context to error messages that come out of the low-level libraries (see https://llvm.org/docs/CodingStandards.html#error-and-warning-messages). There are a number of examples of how this is done elsewhere in places like llvm-objdump and especially the llvm-readobj code. For example, the existing test case indicates simply that there's a problem with some encoding somewhere, but it's not clear which section that applies to (imagine if you were dumping multiple different sections at the same time).
76	The usual pattern in more up-to-date dumping tools is to report warnings rather than errors, and then to continue parsing as best as possible (or bailing out of the section if it's impossible to do so). This allows us to get the maximum information out possible
79	To be clear, I know very little about the new section type, how it is used and so on, so what I'm suggesting may not make much sense. Linkers concatenate sections blindly (in general). As such, if you had 1.o and 2.o each with a .llvm.offloading section, and you combine them into out.elf, you'd end up with a single output section containing the concatenation of the two. Presumably this means you'll end up with something that looks a bit like this?
.llvm.offloading
  .llvm.offloading(1.o) - size field
  .llvm.offloading(1.o) - rest of section
  .llvm.offloading(2.o) - size field
  .llvm.offloading(2.o) - rest of section
Is that correct? If so, I don't think there's anything to do here, assuming the section size is not guaranteed to be the same for all input sections.
88	Generally, we add a comment where we're deliberately throwing away errors, to explain why this is a good idea.
96	Ditto.
llvm/tools/llvm-objdump/llvm-objdump.cpp
205	Put it on the line before RawClangAST, since Offloading appears before RawClangAST lexicographically.

Thanks for the comments again, I'll address them soon.

In D126904#3580868, @jhenderson wrote:

I'd be concerned if a dumping implementation were written and then needed regular modification due to the section format not being stable yet. Usually, we'd try to get the details of the section format sorted before writing the dumping implementation, to reduce churn.

The format itself I consider mostly stable as I don't have plans to change the structure. However, if we change the section to a type we may want to do that. I was hesitant to add a new type to Elf considering we would need support for it everywhere else, but it's definitely the more correct option. Also I plan on adding some more data to the binary section that will be dumped, but I figured it's not a big deal to make follow-up patches that just update the printing. Let me know if you think these are deal-breakers.

The only material I can offer is the yaml2obj source code - there are several other section kinds that have different dumping behaviour, that should be fairly straightforward to use as a basis for the implementation for this section. Take a look at how, for example, SHT_RELA sections are handled in yaml2obj.

I will probably just write the binary contents by hand for now, it's pretty much just a big struct of offsets into a string table.

llvm/test/tools/llvm-objdump/Offload/offload-failure.test
20–24 ↗	(On Diff #436379)	I mostly just copied these from another test, I'll clean them up.
llvm/tools/llvm-objdump/OffloadDump.cpp
79	Yeah, that's more or less how it's set up. I'm assuming there's not much we can do to make parsing this easier.

Addressing some comments. The binary string is a little long, mostly because this test contains four binaries concatenated. I could make multiple files for this if we want to get it shorter. I'm not sure how useful defining a yaml2obj interface for this would be, but I can do it if needed.

Harbormaster completed remote builds in B169709: Diff 436771.Jun 14 2022, 7:55 AM

jhuber6 added a parent revision: D127776: [ObjectYAML] Add offloading binary implementations for obj2yaml and yaml2obj.Jun 14 2022, 12:41 PM

Updating to use yaml2obj implementation for tests.

Harbormaster completed remote builds in B169806: Diff 436897.Jun 14 2022, 2:11 PM

jhenderson added inline comments.Jun 15 2022, 12:58 AM

llvm/test/tools/llvm-objdump/Offloading/binary.test
3	You need `--match-full-lines` too to ensure the whitespace indentation is strictly enforced at the start and end of the lines too. This will require you to reformat your check patterns slightly, because the space after the ":" is then a part of the pattern:
#      CHECK:OFFLOADING IMAGE [0]:
# CHECK-NEXT:kind            llvm ir
etc. (I've added extra whitespace in the `# CHECK:` directive, to make the colons line up nicely, enhancing the readability and emphasising the indentation.)
llvm/test/tools/llvm-objdump/Offloading/elf.test
1 ↗	(On Diff #436897)	Now that I'm looking at this and the other test case, it seems like you'd be better off having them in the same file, because the CHECK-* patterns are identical. That would presumably still apply after adding COFF etc support.
4 ↗	(On Diff #436897)	Same comment as above.
llvm/tools/llvm-objdump/OffloadDump.cpp
36–37	This comment was marked as done, but I don't see a test case for it?
87–88	The question this comment needs to answer is WHY we should give up, rather than reporting the error (as a warning) and ending printing? Same goes below.

Thanks for the comments, I'll update the ObjectYAML implementation and propagate it to here.

llvm/tools/llvm-objdump/OffloadDump.cpp
36–37	Because I added some validation to `obj2yaml` it became impossible to test it by creating one without a valid value, but I'm going to get rid of that to simplify this and add it back in.
87–88	I'll try to clarify it.

Addressing more comments.

Harbormaster completed remote builds in B169994: Diff 437168.Jun 15 2022, 9:17 AM

jhenderson added inline comments.Jun 17 2022, 1:10 AM

llvm/test/tools/llvm-objdump/Offloading/binary.test
7	Nit: It's more common in newer tests to use --check-prefixes instead of multiple --check-prefix options.
llvm/tools/llvm-objdump/OffloadDump.cpp
36–37	As a general rule, yaml2obj should be as lax as possible in what it allows, as it allows testing these corner cases.
87–88	The clarification is clear enough, but it doesn't explain why you can't at least print a warning. This wouldn't prevent the code continuing.

Updating to an enum. If we fail to parse a subsequent binary after the first one when there is memory left over, we now emit a warning instead of just ignoring it.

Harbormaster completed remote builds in B170511: Diff 437900.Jun 17 2022, 8:48 AM

@jhenderson Is this patch good to land as well?

Sorry for the delay - I've had a lot on my plate.

It seems like most (all?) of my last round of inline comments haven't been addressed?

llvm/test/tools/llvm-objdump/Offloading/Inputs/binary.yaml
32	Nit: Looks like one too many blank lines at EOF.
llvm/test/tools/llvm-objdump/Offloading/binary.test
2	In line 3, you run yaml2obj to create the .bin file again. I suggest you simply rename this %t to %t.bin and then delete the later yaml2obj invocation. It'd probably be useful to have a brief comment at the start of this individual test case, e.g. "Show can dump offloading binaries directly." and an equivalent one at the start of the wrapped-in-ELF case, e.g. "Show can dump offloading binaries embedded in ELF.".
5	Nit: Perhaps worth renaming %t to %t.elf for this test case to help make it clearer what this code is doing in contrast to the previous case. Also, I'd add a single blank line (followed by the suggested comment above) between the lines to do with the raw binary and ELF cases.
17	Nit: by adding this suggested spacing, your output then all lines line up, which I think improves test readability.
llvm/test/tools/llvm-objdump/Offloading/failure.test
18	This doesn't tell us whether the message is a warning or error, so please include the "warning:" or "error:" prefix in the check. Additionally, the error message as checked doesn't give sufficient context as to what has gone wrong. For example, it doesn't mention the input file or that it is the offloading code that has failed. See https://llvm.org/docs/CodingStandards.html#error-and-warning-messages. A couple of options include modifying the underlying code to give better error messages, or to "catch" the error and rewrap it with some additional context in the message. llvm-readobj has a number of examples of where this is done that I know of, for example (llvm-objdump may do too). Finally, in dumping tools, especially in newer code, we try to avoid having hard errors. The reason for this is because it is often useful to be able to see the remainder of the output from other files and/or options, whereas reporting an error is usually a hard end to the program (NB: I haven't double-checked the llvm-objdump behaviour to confirm that the error reporting ends the program, so apologies if this is a bit of misdirection). I made the second and third of these points in previous inline comments, but they don't seem to have been addressed.
llvm/tools/llvm-objdump/OffloadDump.cpp
21	Why is this a macro rather than just a `static const char *`?
36–37	With the changes made to yaml2obj, can we now test this default case?
61–63	Could I clarify when you said making a change in the future for the name -> type bit that you mean in a future patch?
72	Nit: `SectionRef` is designed to be lightweight and copyable (like `llvm::StringRef`) so there's no particular need to use `const &` here.
79	You should exercise this error path by adding a `SHOffset` key to your ELF YAML with an invalid value.
87	Not addressed yet.

Addressing comments, adding test for warnings.

llvm/test/tools/llvm-objdump/Offloading/failure.test
18	`ReportError` does indeed exit the program. There are a lot of other examples of `llvm-objdump` exiting on malformed input. I think it's reasonable to just exit if the user requested `--offloading` and the input is malformed. I will specify the checks and try to improve the message, however.
llvm/tools/llvm-objdump/OffloadDump.cpp
36–37	It's already being tested now. The `binary.yaml` file has the fourth entry with the `None` type.

Harbormaster completed remote builds in B172766: Diff 441019.Jun 29 2022, 9:11 AM

jhenderson added inline comments.Jun 30 2022, 1:00 AM

llvm/test/tools/llvm-objdump/Offloading/Inputs/malformed.yaml
2	Is this file just meant to be the YAML? It's in an Inputs directory, so the RUN and CHECK lines aren't going to do anything...
llvm/test/tools/llvm-objdump/Offloading/binary.test
2	It'd probably be useful to have a brief comment at the start of this individual test case, e.g. "Show can dump offloading binaries directly." and an equivalent one at the start of the wrapped-in-ELF case, e.g. "Show can dump offloading binaries embedded in ELF.". Looks like this bit hasn't been addressed?
5	Again, the second half of this comment hasn't been addressed.
llvm/test/tools/llvm-objdump/Offloading/failure.test
18	A lot of those exits are from older code, but the general preference is to move away from the exit-immediately-on-malformed. Imagine the case where you have 4 different offloading binaries you wish to dump (e.g. `llvm-objdump --offloading 1.bin 2.bin 3.bin 4.bin`), and all 4 of these are malformed for different reasons. You'd end up getting an error on 1.bin and not knowing that the other 3 need fixing too, meaning you'd have to keep rerunning your code until you'd eventually flushed out all of the errors. Similarly, imagine the binary was wrapped in an ELF, and you wanted to dump other parts of the ELF too. You'd end up only getting as far as the offloading dump before erroring, and not getting any other information you wanted.
18	Unrelated to my other comments, but why is this string in single quotes?
llvm/test/tools/llvm-objdump/Offloading/warning.test
7	Is it worth checking that the good binary was dumped successfully?
15	You can use FileCheck's -D option to check the exact filename: # RUN: ... | FileCheck -DFILE=%t.elf ... # CHECK: warning: '[[FILE]]': ...
16	Nit: too many blank lines at EOF.
llvm/tools/llvm-objdump/OffloadDump.cpp
79	Marked as done but I don't see it?
84	I believe you'll end up with an assertion under the ABI breaking checks config (I think that's the name of it anyway), as you don't actually use the error within `BinaryOrErr`. I think it would still be good to include the message as reported by the underlying code, but wrapped in the additional context you've added. Also, would be good to include the input file name. Rough idea (uncertain on the exact invocation needed to get the string, as I'm too lazy to look it up right now!): reportError("while extracting offloading files from \"" + O->getFileName() + "\": " + toString(BinaryOrErr));
91–93	Same comment as above.
103–107	Ditto.
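
The suggestion in the comments above, to keep the low-level message but wrap it with the file name and the operation that failed, can be sketched in plain C++17. `Result`, `parseHeader` and `describe` are hypothetical names for this illustration; `Result` only imitates the value-or-error shape of `llvm::Expected` and is not the real API.

```cpp
#include <string>
#include <variant>

// Hypothetical stand-in for a value-or-error type: holds either the parsed
// value or an error message.
template <typename T> using Result = std::variant<T, std::string>;

// Hypothetical low-level parser. It knows nothing about which file or
// section the bytes came from, so its message has no context.
Result<int> parseHeader(const std::string &Bytes) {
  if (Bytes.size() < 4)
    return std::string("invalid size");
  return static_cast<int>(Bytes.size());
}

// The caller consumes the error and rewraps it with the file name and a
// description of what it was doing when the failure happened.
std::string describe(const std::string &FileName, const std::string &Bytes) {
  Result<int> R = parseHeader(Bytes);
  if (const std::string *Msg = std::get_if<std::string>(&R))
    return "warning: '" + FileName +
           "': while extracting offloading files: " + *Msg;
  return "ok: " + std::to_string(std::get<int>(R)) + " bytes";
}
```

The point of the pattern is that the underlying message is always read and forwarded, never silently dropped, while only the file name is quoted in the final text.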

14 revisions seems excessive here, we're into the region of polishing something that's already fine. Let's leave the remaining nits to be fixed in passing in later patches. I want to use this to debug something asap and would like to avoid spinning a local branch with this and my own wip stuff.

At least one of my comments highlights a thing that will cause tests to fail on bots under some configurations, so this isn't ready to land as-is, even if we were to defer the nits.

This revision now requires changes to proceed. Jun 30 2022, 4:57 AM

Addressing comments. I think what we need is some new function like reportErrorNoExit, otherwise all the file paths handling the printing themselves would be ugly.

Harbormaster completed remote builds in B173017: Diff 441365. Jun 30 2022, 6:47 AM

In D126904#3622100, @jhuber6 wrote:

Addressing comments. I think what we need is some new function like reportErrorNoExit, otherwise all the file paths handling the printing themselves would be ugly.

This is definitely something I think the tool could benefit from, with a final check at the end of the program to ensure the right exit code is produced, if this function has ever been called. LLD has a similar process for non-fatal errors (it sets a variable that is checked periodically to ensure it is safe to continue). That being said, as this impacts llvm-objdump more widely, I would be happy for it to be deferred to a later patch, so that other code paths could be updated at the same time.
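The deferred-exit idea above could be sketched roughly as follows. This is a self-contained illustration only, not llvm-objdump code: the signature of `reportErrorNoExit`, the `ErrorCount` counter, the `dumpOne`/`dumpAll` helpers, and the toy rule that a file is malformed when its name contains "bad" are all assumptions made for the example.

```cpp
#include <iostream>
#include <string>
#include <vector>

// Hypothetical counter: number of non-fatal errors reported so far.
static int ErrorCount = 0;

// Hypothetical helper named after the suggestion in the review: print the
// error with file-name context, record that it happened, and keep going.
void reportErrorNoExit(const std::string &File, const std::string &Message) {
  std::cerr << "error: '" << File << "': " << Message << "\n";
  ++ErrorCount;
}

// Stand-in for dumping one input. The "contains bad" rule is a toy
// substitute for real parsing.
bool dumpOne(const std::string &File) {
  if (File.find("bad") != std::string::npos) {
    reportErrorNoExit(File, "while parsing offloading files: invalid data");
    return false;
  }
  std::cout << File << ": dumped\n";
  return true;
}

// Process every input, then derive the exit code from the error count in a
// single final check instead of exiting at the first failure.
int dumpAll(const std::vector<std::string> &Files) {
  for (const std::string &File : Files)
    dumpOne(File);
  return ErrorCount == 0 ? 0 : 1;
}
```

With this shape, a run over several malformed inputs reports every one of them and still returns a non-zero exit code at the end.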

There are a high number of inline comments that I haven't seen be addressed. Most of them are fairly minor, but the volume of them combined with the fact that I've had no response to them makes me concerned that letting the patch land without them means they'll never be addressed. Please could you go through them and either address them or explain why it should either a) be deferred or b) not done at all.

llvm/test/tools/llvm-objdump/Offloading/failure.test
18	To be clear, I don't think the explanation needs to be in quotes of any variety - only the file name.

In D126904#3624373, @jhenderson wrote:

In D126904#3622100, @jhuber6 wrote:

Addressing comments. I think what we need is some new function like reportErrorNoExit, otherwise all the file paths handling the printing themselves would be ugly.

This is definitely something I think the tool could benefit from, with a final check at the end of the program to ensure the right exit code is produced, if this function has ever been called. LLD has a similar process for non-fatal errors (it sets a variable that is checked periodically to ensure it is safe to continue). That being said, as this impacts llvm-objdump more widely, I would be happy for it to be deferred to a later patch, so that other code paths could be updated at the same time.

There are a high number of inline comments that I haven't seen be addressed. Most of them are fairly minor, but the volume of them combined with the fact that I've had no response to them makes me concerned that letting the patch land without them means they'll never be addressed. Please could you go through them and either address them or explain why it should either a) be deferred or b) not done at all.

The only comments I'm aware of that I haven't addressed is early exiting on failure and a check on failing to get the section. I think if we truly want to avoid early exits we should put it in a separate patch, I'm not doing anything that the rest of llvm-objdump doesn't already do. I do not think it's necessary to add an error check if we fail to get a section, this is tested already in many places and I see no point to add a completely redundant test to the LLVM test-suite. I can remove the quotation marks prior to landing if you want. If you have any further objections let me know, otherwise I would greatly appreciate being able to move forward with this.

I'm absolutely OK with the high volume of minor comments going unaddressed, especially since they're being added incrementally over a prolonged period of time. I'm also OK with James doing some post commit cleanup to make things better conform to his ideal.

In D126904#3624691, @jhuber6 wrote:

The only comments I'm aware of that I haven't addressed is early exiting on failure and a check on failing to get the section. I think if we truly want to avoid early exits we should put it in a separate patch, I'm not doing anything that the rest of llvm-objdump doesn't already do. I do not think it's necessary to add an error check if we fail to get a section, this is tested already in many places and I see no point to add a completely redundant test to the LLVM test-suite. I can remove the quotation marks prior to landing if you want. If you have any further objections let me know, otherwise I would greatly appreciate being able to move forward with this.

I've gone through and highlighted all the inline comments that still need addressing/answering. I can live with the early exiting in this patch, but still think you need the test I've requested (see inline for explanation).

I'm still not really comfortable with relying on section name to identify the offloading binaries, as it could potentially require tens or even hundreds of thousands of string comparisons under some cases, plus it's not really in the spirit of the ELF file format*. It's also worth noting that for other platforms, the naming schemes are different - for example in Mach-O files, section names are usually __some_section_name instead of .some_section_name, so you'll still have to have at least some platform-specific code in order to retrieve the section. Most (all?) new sections that have any dumping support are distinguished by type for ELF, although curiously, dumping support has only been added for them in llvm-readobj rather than llvm-objdump, so it's not a clear-cut point.

(* The ELF gABI states about the sh_type field: "This member categorizes the section's contents and semantics." and then defines the SHT_PROGBITS type (which I assume the current offloading binaries are) as "The section holds information defined by the program, whose format and meaning are determined solely by the program." Given that the offloading format is not specific to the program, it seems incorrect to say that it is SHT_PROGBITS.)
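To make the name-versus-type point concrete, here is a small sketch under stated assumptions. `Section` is a simplified stand-in rather than a real ELF structure, `SHT_EXAMPLE_OFFLOADING` is a made-up placeholder value and not an assigned section type, and both lookup helpers are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Simplified stand-in for a section header; not the real ELF structures.
struct Section {
  std::string Name;
  uint32_t Type;
};

// SHT_PROGBITS is 1 in the ELF gABI.
constexpr uint32_t SHT_PROGBITS = 1;
// Hypothetical placeholder for a dedicated offloading section type.
constexpr uint32_t SHT_EXAMPLE_OFFLOADING = 0x6fff0001;

// Lookup by name: one string comparison per section, and the expected
// spelling differs between object file formats.
std::vector<size_t> findByName(const std::vector<Section> &Sections,
                               const std::string &Name) {
  std::vector<size_t> Matches;
  for (size_t I = 0; I < Sections.size(); ++I)
    if (Sections[I].Name == Name)
      Matches.push_back(I);
  return Matches;
}

// Lookup by type: one integer comparison per section, independent of how
// the section happens to be named.
std::vector<size_t> findByType(const std::vector<Section> &Sections,
                               uint32_t Type) {
  std::vector<size_t> Matches;
  for (size_t I = 0; I < Sections.size(); ++I)
    if (Sections[I].Type == Type)
      Matches.push_back(I);
  return Matches;
}
```

In this toy model, a section spelled `__llvm_offloading` would be missed by a lookup for `.llvm.offloading` but still found by type.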

llvm/test/tools/llvm-objdump/Offloading/binary.test
2	Not addressed.
5	Not addressed.
llvm/test/tools/llvm-objdump/Offloading/warning.test
7	Not addressed.
llvm/tools/llvm-objdump/OffloadDump.cpp
61–63	Not answered.
79	The reason this needs addressing, is because this specific code path in llvm-objdump is otherwise untested. The test is needed to show that llvm-objdump under these specific circumstances properly handles errors if `Contents` is in an error state. This is actually important because if there is no such test, a change to the `reportError` whereby `Contents` isn't used in the message, could result in unchecked errors, but without a test case, these would only manifest under real usage, rather than under testing like they should be.
87	Not addressed.

Do you consider these blocking issues beyond fixing the typos and adding comments? I'm getting very tired of needing to constantly update this patch that others have already signed off on and I'm beginning to feel like this is a fruitless endeavor. If you have no intention of ever letting this through let me know and I'll abandon it so we don't waste any more of our time.

llvm/test/tools/llvm-objdump/Offloading/binary.test
2	Sure, I'll add a comment.
5	Sure I'll add a comment.
llvm/test/tools/llvm-objdump/Offloading/warning.test
7	No, this is the same test checks as the other file and it bloats the test. We already know that it will print the good one.
llvm/tools/llvm-objdump/OffloadDump.cpp
61–63	Yes, I was saying we could do this in a future patch as I didn't think it was a blocking issue for the functionality of this patch. The main reason I did this is just because it was the easiest common solution between extracting these from LLVM-IR and an ELF, that is the section's string will be the same. I was just planning on adding it to the list of existing ones in `ELF.h`, but as this worked overall I thought it was sufficient to land this patch.
79	There's already usage of `getContents()` like this in `llvm-objdump`, I don't see why this is a special case. If someone changed the `reportError` function to not report errors it should show up somewhere. The point of this patch is not about checking if the ELF works, and we know from similar usage in `llvm-objdump` that this pattern reports errors if it's malformed. I can add a test if you really want me to, but I fail to see the point even with your hypothetical situation.
87	I'll fix the typo.

In D126904#3624769, @jhuber6 wrote:

Do you consider these blocking issues beyond fixing the typos and adding comments?

The missing test case is the issue that I care about most.

I'm getting very tired of needing to constantly update this patch that others have already signed off on and I'm beginning to feel like this is a fruitless endeavor. If you have no intention of ever letting this through let me know and I'll abandon it so we don't waste any more of our time.

With all due respect to the others who have signed off on this patch already, they aren't routine contributors to this area of the LLVM code-base, so don't know the norms, expected code quality etc of it. Please understand that I am just trying to keep the quality as high as possible. This often means that things have to go through several iterations until they are right. Please also be aware that this patch is not the only patch I am reviewing and I have to balance my regular responsibilities alongside these reviews too, so I don't have time to do a back-and-forth on the patch multiple times a day. I'm certainly not blocking this for the sake of doing so, especially given the time I've spent reviewing the patch.

In D126904#3624693, @JonChesterfield wrote:

I'm absolutely OK with the high volume of minor comments going unaddressed, especially since they're being added incrementally over a prolonged period of time.

Most of these comments are as a result of changes in each iteration, where the later iteration either doesn't fully resolve the point I had raised, or introduced other issues. It takes time to get things right. I'd also point out that some inline comments were left without responses for several iterations of the patch, which is part of the reason for this taking this long.

I'm also OK with James doing some post commit cleanup to make things better conform to his ideal.

This isn't how reviews work. It's not my responsibility to address issues introduced with a patch that someone else has just landed. If you think it is, please raise this on the LLVM forums and see what others have to say.

llvm/test/tools/llvm-objdump/Offloading/warning.test
7	My thinking was that it would be useful, to show that good binaries are dumped despite the warning. The question is really, do you consider it a guarantee that all good binaries will be dumped even if a later one is bad, and if not, will users be concerned if the behaviour ever changed (by accident or otherwise) to check all the binaries are good before dumping any of them?
llvm/tools/llvm-objdump/OffloadDump.cpp
61–63	Okay.
79	Imagine if the code were changed to the following by somebody: if (!Contents) reportError("failed to get offloading section contents", O->getFileName()); There would be no test failure under any situation, because there is no test. You would get a crash though if someone were to try to use llvm-objdump with the enhanced error checks enabled, and ran into a malformed binary. This isn't really a contrived example either: in an earlier revision of this patch, there was a similar situation, where the error was thrown away without getting its message, so I don't think it's unreasonable to assume that it could occur in a later revision of this code. I'm not asking for testing that errors are reported when getting malformed contents, I'm asking for testing that the error returned by the lower-level function is handled by this higher-level one.
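The failure mode described above can be modelled with a toy "must-check" error type. This is a sketch under stated assumptions, not LLVM's `Error` or `Expected` implementation: `CheckedError`, `UncheckedErrors`, and both report helpers are invented for illustration, and where a checked build would abort, this model only increments a counter so the effect can be observed.

```cpp
#include <string>
#include <utility>

// Counts failures that were destroyed without their message being consumed.
// A checked build would abort here; the toy model only counts.
static int UncheckedErrors = 0;

// Toy model of an error that must be consumed before destruction.
class CheckedError {
  std::string Message;
  bool Failed;
  bool Consumed = false;

  CheckedError(std::string Msg, bool IsFailure)
      : Message(std::move(Msg)), Failed(IsFailure) {}

public:
  static CheckedError success() { return CheckedError("", false); }
  static CheckedError failure(std::string Msg) {
    return CheckedError(std::move(Msg), true);
  }

  // Moving transfers responsibility for the error to the new object.
  CheckedError(CheckedError &&Other)
      : Message(std::move(Other.Message)), Failed(Other.Failed),
        Consumed(Other.Consumed) {
    Other.Consumed = true;
  }
  CheckedError(const CheckedError &) = delete;

  ~CheckedError() {
    if (Failed && !Consumed)
      ++UncheckedErrors;
  }

  // True when this object holds a failure.
  explicit operator bool() const { return Failed; }

  // Extracting the message counts as handling the error.
  std::string takeMessage() {
    Consumed = true;
    return Message;
  }
};

// Reports a fixed string and never consumes the underlying error.
std::string reportWithoutMessage(CheckedError Err) {
  if (Err)
    return "failed to get offloading section contents";
  return "";
}

// Wraps the underlying message in extra context, consuming the error.
std::string reportWithMessage(CheckedError Err) {
  if (Err)
    return "while extracting offloading files: " + Err.takeMessage();
  return "";
}
```

Both helpers produce a plausible-looking message, so only a test that actually drives the failing path would reveal that the first one leaves the error unhandled.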

Addressing comments, adding test.

Looks good now, with a couple of nits to be addressed prior to being committed, and a slight change to a test.

llvm/test/tools/llvm-objdump/Offloading/binary.test
2	Nit: in newer tests in the binary tools at least, we use `##` for test comments to help them stand out from the RUN and CHECK directives. Same throughout the new comments.
6	Ditto. Also typo: "insode" -> "inside".
llvm/test/tools/llvm-objdump/Offloading/warning.test
8–9	These should be a single invocation and just check that the warning appears in the right place in the output with respect to the regular output. Sorry if that wasn't clear from my earlier comments. If you want, you can also abbreviate your other checks to e.g. just the `OFFLOADING IMAGE` line - as you rightly point out, we test that the dumping works properly elsewhere.

This revision is now accepted and ready to land. Jul 1 2022, 6:52 AM

Thanks, addressing final nits.

Harbormaster completed remote builds in B173235: Diff 441683. Jul 1 2022, 8:09 AM

MaskRay added inline comments. Jul 1 2022, 10:08 AM

llvm/tools/llvm-objdump/OffloadDump.h
19	Use const reference if non-null

MaskRay accepted this revision. Jul 1 2022, 10:12 AM

MaskRay added inline comments.

llvm/test/tools/llvm-objdump/Offloading/content-failure.test
2
llvm/tools/llvm-objdump/OffloadDump.cpp
21	The variable is writable. `static` is unneeded for const variables (internal linkage by default).
24	Use const reference if non-null
43	Append `[` to `OFFLOAD IMAGE`
62	delete unneeded blank line
71	Use const reference if non-null
99	Use const reference if non-null

jhuber6 added inline comments. Jul 1 2022, 10:47 AM

llvm/tools/llvm-objdump/OffloadDump.cpp
43	Could you elaborate on this? do you want it formatted like `OFFLOADING IMAGE[0]`?

MaskRay added inline comments. Jul 1 2022, 10:51 AM

llvm/tools/llvm-objdump/OffloadDump.cpp
43	`outs() << "\nOFFLOADING IMAGE [" << Index << "]:\n"`

MaskRay added inline comments. Jul 1 2022, 10:52 AM

llvm/test/tools/llvm-objdump/Offloading/binary.test
2	`to see` conveys no information and should be deleted. ditto elsewhere
llvm/test/tools/llvm-objdump/Offloading/warning.test
2

Addressing nits.

In D126904#3624693, @JonChesterfield wrote:

I'm absolutely OK with the high volume of minor comments going unaddressed, especially since they're being added incrementally over a prolonged period of time. I'm also OK with James doing some post commit cleanup to make things better conform to his ideal.

FWIW I don't think this is fine. If a minor comment is due to existing code, it's fine; otherwise it's not. I have only read a few comments and I think they are all relevant.
(Thanks to @jhenderson who has done a great job ensuring the tests are clean, readable, and maintainable.)
I do not think this new feature has rights to deviate from the usual standard.

There are many comments which have been addressed but haven't been marked "done" (tip: you can click "done" before arc diff and these comments will automatically be marked done).
They can be distracting to reviewers as they might think the patch is still in a not-ready state.

Harbormaster completed remote builds in B173277: Diff 441747. Jul 1 2022, 12:36 PM

This revision was landed with ongoing or failed builds. Jul 1 2022, 6:13 PM

Closed by commit rGd2d8b0aa4f80: [llvm-objdump] Add support for dumping embedded offloading data (authored by jhuber6).

This revision was automatically updated to reflect the committed changes.

jhuber6 added a commit: rGd2d8b0aa4f80: [llvm-objdump] Add support for dumping embedded offloading data.

jhuber6 mentioned this in rG080022d8ed6c: [LinkerWrapper] Embed OffloadBinaries for OpenMP offloading images. Jul 21 2022, 10:20 AM

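Revision Contents

Path                                                              Size
llvm/test/tools/llvm-objdump/Offloading/Inputs/binary.yaml        30 lines
llvm/test/tools/llvm-objdump/Offloading/Inputs/malformed.yaml     12 lines
llvm/test/tools/llvm-objdump/Offloading/binary.test               40 lines
llvm/test/tools/llvm-objdump/Offloading/content-failure.test      18 lines
llvm/test/tools/llvm-objdump/Offloading/failure.test              17 lines
llvm/test/tools/llvm-objdump/Offloading/warning.test              16 lines
llvm/tools/llvm-objdump/CMakeLists.txt                            1 line
llvm/tools/llvm-objdump/ObjdumpOpts.td                            3 lines
llvm/tools/llvm-objdump/OffloadDump.h                             22 lines
llvm/tools/llvm-objdump/OffloadDump.cpp                           102 lines
llvm/tools/llvm-objdump/llvm-objdump.cpp                          10 lines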

Diff 441835

llvm/test/tools/llvm-objdump/Offloading/Inputs/binary.yaml

This file was added.

!Offload
Members:
  - ImageKind: IMG_Bitcode
    OffloadKind: OFK_OpenMP
    String:
      - Key: "triple"
        Value: "amdgcn-amd-amdhsa"
      - Key: "arch"
        Value: "gfx908"
  - ImageKind: IMG_Bitcode
    OffloadKind: OFK_OpenMP
    String:
      - Key: "triple"
        Value: "amdgcn-amd-amdhsa"
      - Key: "arch"
        Value: "gfx90a"
  - ImageKind: IMG_Cubin
    OffloadKind: OFK_OpenMP
    String:
      - Key: "triple"
        Value: "nvptx64-nvidia-cuda"
      - Key: "arch"
        Value: "sm_52"
  - ImageKind: IMG_None
    OffloadKind: OFK_None
    String:
      - Key: "triple"
        Value: "nvptx64-nvidia-cuda"
      - Key: "arch"
        Value: "sm_70"

jhenderson (Not Done): Nit: Looks like one too many blank lines at EOF.

llvm/test/tools/llvm-objdump/Offloading/Inputs/malformed.yaml

This file was added.

!Offload
EntryOffset: 999999999

jhenderson (Not Done): Is this file just meant to be the YAML? It's in an Inputs directory, so the RUN and CHECK lines aren't going to do anything...

Members:
  - ImageKind: IMG_Cubin
    OffloadKind: OFK_OpenMP
    Flags: 0
    String:
      - Key: "triple"
        Value: "nvptx64-nvidia-cuda"
      - Key: "arch"
        Value: "sm_70"
    Content: "deadbeef"

llvm/test/tools/llvm-objdump/Offloading/binary.test

This file was added.

## Check that we can dump an offloading binary directly.

# RUN: yaml2obj %S/Inputs/binary.yaml -o %t.bin

jhenderson (Done):
In line 3, you run yaml2obj to create the .bin file again. I suggest you simply rename this %t to %t.bin and then delete the later yaml2obj invocation.

It'd probably be useful to have a brief comment at the start of this individual test case, e.g. "Show can dump offloading binaries directly." and an equivalent one at the start of the wrapped-in-ELF case, e.g. "Show can dump offloading binaries embedded in ELF.".

jhenderson (Not Done):
> It'd probably be useful to have a brief comment at the start of this individual test case, e.g. "Show can dump offloading binaries directly." and an equivalent one at the start of the wrapped-in-ELF case, e.g. "Show can dump offloading binaries embedded in ELF.".

Looks like this bit hasn't been addressed?

jhenderson (Not Done):
Not addressed.

jhuber6 (Author, Done):
Sure, I'll add a comment.

jhenderson (Not Done):
Nit: in newer tests in the binary tools at least, we use `##` for test comments to help them stand out from the RUN and CHECK directives. Same throughout the new comments.

MaskRay (Not Done):
`to see` conveys no information and should be deleted. Ditto elsewhere.

# RUN: llvm-objdump --offloading %t.bin | FileCheck %s --match-full-lines --strict-whitespace --implicit-check-not={{.}}

jhenderson (Done):
You need `--match-full-lines` too to ensure the whitespace indentation is strictly enforced at the start and end of the lines too. This will require you to reformat your check patterns slightly, because the space after the ":" is then a part of the pattern:

#      CHECK:OFFLOADING IMAGE [0]:
# CHECK-NEXT:kind            llvm ir

etc. (I've added extra whitespace in the # CHECK: directive to make the colons line up nicely, enhancing the readability and emphasising the indentation.)

## Check that we can dump an offloading binary inside of an ELF section.

jhenderson (Done):
Nit: Perhaps worth renaming %t to %t.elf for this test case to help make it clearer what this code is doing in contrast to the previous case.

Also, I'd add a single blank line (followed by the suggested comment above) between the lines to do with the raw binary and ELF cases.

jhenderson (Not Done):
Again, the second half of this comment hasn't been addressed.

jhenderson (Not Done):
Not addressed.

jhuber6 (Author, Done):
Sure I'll add a comment.

# RUN: yaml2obj %s -o %t.elf

jhenderson (Not Done):
Ditto. Also typo: "insode" -> "inside".

# RUN: llvm-objcopy --add-section .llvm.offloading=%t.bin %t.elf

jhenderson (Not Done):
Nit: It's more common in newer tests to use --check-prefixes instead of multiple --check-prefix options.

# RUN: llvm-objdump --offloading %t.elf | FileCheck %s --check-prefixes=CHECK,ELF --match-full-lines --strict-whitespace --implicit-check-not={{.}}

!ELF
FileHeader:
  Class: ELFCLASS64
  Data: ELFDATA2LSB
  Type: ET_EXEC

# ELF:{{.*}}file format elf64-unknown
# ELF-EMPTY:

jhenderson (Done):
  # ELF-EMPTY:
- # CHECK:OFFLOADING IMAGE [0]:
+ #      CHECK:OFFLOADING IMAGE [0]:
  # CHECK-NEXT:kind            llvm ir

Nit: by adding this suggested spacing, all the lines of your output then line up, which I think improves test readability.

#      CHECK:OFFLOADING IMAGE [0]:
# CHECK-NEXT:kind            llvm ir
# CHECK-NEXT:arch            gfx908
# CHECK-NEXT:triple          amdgcn-amd-amdhsa
# CHECK-NEXT:producer        openmp
# CHECK-EMPTY:
# CHECK-NEXT:OFFLOADING IMAGE [1]:
# CHECK-NEXT:kind            llvm ir
# CHECK-NEXT:arch            gfx90a
# CHECK-NEXT:triple          amdgcn-amd-amdhsa
# CHECK-NEXT:producer        openmp
# CHECK-EMPTY:
# CHECK-NEXT:OFFLOADING IMAGE [2]:
# CHECK-NEXT:kind            cubin
# CHECK-NEXT:arch            sm_52
# CHECK-NEXT:triple          nvptx64-nvidia-cuda
# CHECK-NEXT:producer        openmp
# CHECK-EMPTY:
# CHECK-NEXT:OFFLOADING IMAGE [3]:
# CHECK-NEXT:kind            <none>
# CHECK-NEXT:arch            sm_70
# CHECK-NEXT:triple          nvptx64-nvidia-cuda
# CHECK-NEXT:producer        none

llvm/test/tools/llvm-objdump/Offloading/content-failure.test

This file was added.

# Test to check if we fail to get the section contents.

# RUN: yaml2obj %s -o %t

MaskRay (Not Done):
- # Test to check if we failto get the section contents.
+ # Test to check if we fail to get the section contents.
  # RUN: yaml2obj %s -o %t

# RUN: not llvm-objdump --offloading %t 2>&1 | FileCheck -DFILENAME=%t %s

!ELF
FileHeader:
  Class: ELFCLASS64
  Data: ELFDATA2LSB
  Type: ET_EXEC
Sections:
  - Name: .llvm.offloading
    Type: SHT_PROGBITS
    Flags: [ SHF_EXCLUDE ]
    Address: 0x0
    ShOffset: 0x99999
    AddressAlign: 0x0000000000000008

# CHECK: error: '[[FILENAME]]': The end of the file was unexpectedly encountered

llvm/test/tools/llvm-objdump/Offloading/failure.test

This file was added.

# RUN: yaml2obj %s -o %t
# RUN: not llvm-objdump --offloading %t 2>&1 | FileCheck -DFILENAME=%t %s

!ELF
FileHeader:
  Class: ELFCLASS64
  Data: ELFDATA2LSB
  Type: ET_EXEC
Sections:
  - Name: .llvm.offloading
    Type: SHT_PROGBITS
    Flags: [ SHF_EXCLUDE ]
    Address: 0x0
    AddressAlign: 0x0000000000000008
    Content: "10ffb0ad"

# CHECK: error: '[[FILENAME]]': while extracting offloading files: Invalid data was encountered while parsing the file

jhenderson (Not Done): This doesn't tell us whether the message is a warning or error, so please include the "warning:" or "error:" prefix in the check. Additionally, the error message as checked doesn't give sufficient context as to what has gone wrong. For example, it doesn't mention the input file or that it is the offloading code that has failed. See https://llvm.org/docs/CodingStandards.html#error-and-warning-messages. A couple of options include modifying the underlying code to give better error messages, or to "catch" the error and rewrap it with some additional context in the message. llvm-readobj has a number of examples of where this is done that I know of, for example (llvm-objdump may do too). Finally, in dumping tools, especially in newer code, we try to avoid having hard errors. The reason for this is because it is often useful to be able to see the remainder of the output from other files and/or options, whereas reporting an error is usually a hard end to the program (NB: I haven't double-checked the llvm-objdump behaviour to confirm that the error reporting ends the program, so apologies if this is a bit of misdirection). I made the second and third of these points in previous inline comments, but they don't seem to have been addressed.

jhuber6 (Author, Done): `ReportError` does indeed exit the program. There are a lot of other examples of `llvm-objdump` exiting on malformed input. I think it's reasonable to exit if the user requested `--offloading` and it's malformed. I will specify the checks and try to improve the message however.

jhenderson (Not Done): A lot of those exits are from older code, but the general preference is to move away from the exit-immediately-on-malformed. Imagine the case where you have 4 different offloading binaries you wish to dump (e.g. `llvm-objdump --offloading 1.bin 2.bin 3.bin 4.bin`), and all 4 of these are malformed for different reasons. You'd end up getting an error on 1.bin and not knowing that the other 3 need fixing too, meaning you'd have to keep rerunning your code until you'd eventually flushed out all of the errors. Similarly, imagine the binary was wrapped in an ELF, and you wanted to dump other parts of the ELF too. You'd end up only getting as far as the offloading dump before erroring, and not getting any other information you wanted.

jhenderson (Not Done): Unrelated to my other comments, but why is this string in single quotes?

jhenderson (Not Done): To be clear, I don't think the explanation needs to be in quotes of any variety - only the file name.

llvm/test/tools/llvm-objdump/Offloading/warning.test

This file was added.

## Ensure we give a warning on bad input following good input.

# RUN: yaml2obj %S/Inputs/binary.yaml -o %t-good.bin

MaskRay (Not Done):
- ## Check to ensure we give a warning on bad input following good input.
+ ## Ensure we give a warning on bad input following good input.
  # RUN: yaml2obj %S/Inputs/binary.yaml -o %t-good.bin

# RUN: yaml2obj %S/Inputs/malformed.yaml -o %t-bad.bin

# RUN: cat %t-bad.bin >> %t-good.bin

# RUN: yaml2obj %s -o %t.elf

# RUN: llvm-objcopy --add-section .llvm.offloading=%t-good.bin %t.elf

# RUN: llvm-objdump --offloading %t.elf 2>&1 | FileCheck %s -DFILENAME=%t.elf

jhenderson (Not Done):
Is it worth checking that the good binary was dumped successfully?

jhenderson (Not Done):
Not addressed.

jhuber6 (Author, Done):
No, this is the same test checks as the other file and it bloats the test. We already know that it will print the good one.

jhenderson (Not Done):
My thinking was that it would be useful, to show that good binaries are dumped despite the warning. The question is really, do you consider it a guarantee that all good binaries will be dumped even if a later one is bad, and if not, will users be concerned if the behaviour ever changed (by accident or otherwise) to check all the binaries are good before dumping any of them?

!ELF

jhenderson (Not Done):
These should be a single invocation and just check that the warning appears in the right place in the output with respect to the regular output. Sorry if that wasn't clear from my earlier comments.

If you want, you can also abbreviate your other checks to e.g. just the `OFFLOADING IMAGE` line - as you rightly point out, we test that the dumping works properly elsewhere.

FileHeader:
  Class: ELFCLASS64
  Data: ELFDATA2LSB
  Type: ET_EXEC

# CHECK: OFFLOADING IMAGE [0]:

jhenderson (Not Done):
You can use FileCheck's -D option to check the exact filename:

# RUN: ... | FileCheck -DFILE=%t.elf ...
# CHECK: warning: '[[FILE]]': ...

# CHECK: warning: '[[FILENAME]]': while parsing offloading files: The end of the file was unexpectedly encountered

jhenderson (Not Done):
Nit: too many blank lines at EOF.

llvm/tools/llvm-objdump/CMakeLists.txt

  add_public_tablegen_target(OtoolOptsTableGen)

  add_llvm_tool(llvm-objdump
    llvm-objdump.cpp
    SourcePrinter.cpp
    COFFDump.cpp
    ELFDump.cpp
    MachODump.cpp
+   OffloadDump.cpp
    WasmDump.cpp
    XCOFFDump.cpp
    DEPENDS
    ObjdumpOptsTableGen
    OtoolOptsTableGen
    )

  if(LLVM_HAVE_LIBXAR)

llvm/tools/llvm-objdump/ObjdumpOpts.td

  def dwarf_EQ : Joined<["--"], "dwarf=">,
    HelpText<"Dump the specified DWARF debug sections. The "
             "only supported value is 'frames'">,
    Values<"frames">;

  def fault_map_section : Flag<["--"], "fault-map-section">,
    HelpText<"Display the content of the fault map section">;

+ def offloading : Flag<["--"], "offloading">,

tra (Not Done): `def offload`? The name of the option usually matches its spelling.

jhuber6 (Author, Done): I changed one but not the other, I'll fix it.

+   HelpText<"Display the content of the offloading section">;
+

tra (Not Done): Single-letter options are a limited resource. I'd rather not create a new one unless there's a specific need for it (e.g. compatibility with existing tool). Using it for a niche option like `--offload` does not seem to be a good use for it, however convenient that may be for the few of us who may care.

jhuber6 (Author, Done): I would definitely like to have a nice single-letter shorthand for this, but you're right that it's a somewhat niche option. Most of the other dumping options have a single letter alias, but someone may want the `-O` in the future, or another tool introduces a `-O` and we can't be compatible. It'd be nice to get some other opinions on this, but I'm not super attached.

MaskRay (Not Done): Single-letter options can easily conflict with GNU objdump. Please drop it. See https://github.com/llvm/llvm-project/issues/55297 that we spent many collaboration efforts with GNU.

  def file_headers : Flag<["--"], "file-headers">,
    HelpText<"Display the contents of the overall file header">;
  def : Flag<["-"], "f">, Alias<file_headers>,
    HelpText<"Alias for --file-headers">;

  def full_contents : Flag<["--"], "full-contents">,
    HelpText<"Display the content of each section">;
  def : Flag<["-"], "s">, Alias<full_contents>,

llvm/tools/llvm-objdump/OffloadDump.h

This file was added.

				//===-- OffloadDump.h -------------------------------------------- C++ --===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_TOOLS_LLVM_OBJDUMP_OFFLOADDUMP_H
				#define LLVM_TOOLS_LLVM_OBJDUMP_OFFLOADDUMP_H

				#include "llvm/Object/ObjectFile.h"
				#include "llvm/Object/OffloadBinary.h"

				namespace llvm {

				void dumpOffloadSections(const object::OffloadBinary &OB);
				void dumpOffloadBinary(const object::ObjectFile &O);

				MaskRay (Unsubmitted, Not Done): Use const reference if non-null
				} // namespace llvm

				#endif

llvm/tools/llvm-objdump/OffloadDump.cpp

This file was added.

//===-- OffloadDump.cpp - Offloading dumper ---------------------*- C++ -*-===//

jhendersonUnsubmitted

Done

Missing license header.

// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.

// See https://llvm.org/LICENSE.txt for license information.

// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception

//===----------------------------------------------------------------------===//

///

/// \file

/// This file implements the offloading-specific dumper for llvm-objdump.

///

//===----------------------------------------------------------------------===//

#include "OffloadDump.h"

#include "llvm-objdump.h"

using namespace llvm;

using namespace llvm::object;

using namespace llvm::objdump;

constexpr const char OffloadSectionString[] = ".llvm.offloading";

jhendersonUnsubmitted

Done

Why is this a macro rather than just a static const char *?

MaskRayUnsubmitted

Not Done

using namespace llvm::objdump;

- static constexpr const char *OffloadSectionString = ".llvm.offloading";

+ constexpr const char OffloadSectionString[] = ".llvm.offloading";

/// Get the printable name of the image kind.

The variable is writable.

static is unneeded for const variables (internal linkage by default).
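
The difference between the pointer and array forms can be sketched outside LLVM. This is a minimal illustration, not code from the patch: the names `SectionNamePtr`, `SectionNameArr`, `canRepointPointerForm` and `arrayFormLength` are hypothetical. Note that the reassignment hazard shown here applies to a plain `const char *`; a `constexpr` pointer is itself const.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Pointer form (hypothetical name): the characters are const, but the pointer
// object itself is not, so any code in this file can re-point it.
static const char *SectionNamePtr = ".llvm.offloading";

// Array form (hypothetical name): the object is the character array itself.
// It cannot be re-pointed, its length is known at compile time, and a
// namespace-scope const object has internal linkage without `static`.
constexpr const char SectionNameArr[] = ".llvm.offloading";

static_assert(sizeof(SectionNameArr) - 1 == 16,
              "array length is a compile-time constant");

// Returns true if the pointer form could be re-pointed at another string.
// The original value is restored before returning.
bool canRepointPointerForm() {
  const char *Saved = SectionNamePtr;
  SectionNamePtr = ".something.else";
  bool Changed = std::strcmp(SectionNamePtr, ".llvm.offloading") != 0;
  SectionNamePtr = Saved;
  return Changed;
}

// Length of the array form, excluding the terminating NUL.
std::size_t arrayFormLength() { return sizeof(SectionNameArr) - 1; }
```

The array form also avoids a `strlen` call when the name is converted to a length-carrying string view.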

/// Get the printable name of the image kind.

static StringRef getImageName(const OffloadBinary &OB) {

switch (OB.getImageKind()) {

MaskRayUnsubmitted

Not Done

Use const reference if non-null

case IMG_Object:

return "elf";

case IMG_Bitcode:

return "llvm ir";

case IMG_Cubin:

traUnsubmitted

Not Done

Nit: I would prefer to split it into a function or lambda doing a printout for one binary and the main function which does the iteration over the container of those binaries.
While recursion is cool, and will likely be optimized into a loop in this case, it makes the code a bit harder to read and understand.

In this case things are simple enough so it does not make too much of a difference, so I'll leave it up to you.

jhuber6AuthorUnsubmitted

Done

Fair enough, I pretty much just made the function to print it out and realized that I could just keep calling it. Should be easy enough to split out.

return "cubin";

case IMG_Fatbinary:

return "fatbinary";

case IMG_PTX:

return "ptx";

default:

return "<none>";

}

jhendersonUnsubmitted

Done

I think it would be sensible to test the default case (it's a common behaviour pattern for the default case to result in an error, so it seems sensible to demonstrate that it isn't in this case).

jhendersonUnsubmitted

Not Done

This comment was marked as done, but I don't see a test case for it?

jhuber6AuthorUnsubmitted

Done

Because I added some validation to obj2yaml it became impossible to test it by creating one without a valid value, but I'm going to get rid of that to simplify this and add it back in.

jhendersonUnsubmitted

Done

As a general rule, yaml2obj should be as lax as possible in what it allows, as it allows testing these corner cases.

jhendersonUnsubmitted

Done

With the changes made to yaml2obj, can we now test this default case?

jhuber6AuthorUnsubmitted

Done

It's already being tested now. The binary.yaml file has the fourth entry with the None type.

}

traUnsubmitted

Not Done

extract is not quite what we're doing here. printAllBinaries ? But then we already have printOffloadBinary below, so it's a bit confusing.

IIUC the hierarchy looks roughly like this:

Object file
- N offload sections
  - M offload binaries

So, this function prints content of one section (i.e. a collection of offload binaries), while the function below prints contents of the object file (i.e. a collection of offload sections) and the names currently do not reflect that.

jhuber6AuthorUnsubmitted

Done

How about this?

printOffloadBinary()
printOffloadBinaries()
dumpOffloadSections()
dumpOffloadBinaries()

traUnsubmitted

Not Done

What's the distinction between dump and print here? It would do for the time being, I guess.

If we were to implement iterators over sections and offload binaries within, we would not need distinct names for them and then this whole code would look like this:

void printOffloadBinaries(const ObjectFile *O) {
  llvm::for_each(O->sections(), [](auto Section) {
    llvm::for_each(Section->binaries(), printBinary);
  });
}

That would be expressive enough without having to split multiple levels of iteration into different functions, along with the associated hassle of having to come up with adequate names for them. :-)

Error handling will likely throw a monkey wrench into this neat ideal scenario, so I'm not sure if it's worth it.
I think general purpose iterators over sections and offload binaries would come handy when we get to extend the functionality.
It does not need to be done in this patch.
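
The shape of that two-level traversal can be sketched without the LLVM types. This is a minimal model under stated assumptions: `ModelBinary`, `ModelSection`, `ModelObject` and `listImages` are hypothetical stand-ins, not part of `llvm::object`, and error handling is left out entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for an offload binary, an offload section, and an
// object file. None of these are LLVM types.
struct ModelBinary {
  std::string Arch;
};
struct ModelSection {
  std::vector<ModelBinary> Binaries;
};
struct ModelObject {
  std::vector<ModelSection> Sections;
};

// Walks every binary in every section and returns one line per image, with an
// index that runs across section boundaries.
std::vector<std::string> listImages(const ModelObject &O) {
  std::vector<std::string> Lines;
  std::size_t Index = 0;
  for (const ModelSection &S : O.Sections)
    for (const ModelBinary &B : S.Binaries)
      Lines.push_back("OFFLOADING IMAGE [" + std::to_string(Index++) +
                      "]: " + B.Arch);
  return Lines;
}
```

With containers like these the nesting needs no separately named helper per level, which is the point made above. The model sidesteps the harder part: a real iterator would have to surface parse failures.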

jhuber6AuthorUnsubmitted

Done

I just used dump since it's more in-line with the vocabulary of the rest of the functions in llvm-objdump. Also in the future if we want to print out these sections we'd just have a flag somewhere that if enabled dumps the contents of the image rather than just the metadata. So that's the difference between just dump and print in my mind.

As you mentioned, we can get errors here, so we'd need some kind of iterator over Expected values. But I'm not sure we would extend the section class for this, since this binary format is just a blob that happens to be contained in a section.

static void printBinary(const OffloadBinary &OB, uint64_t Index) {

outs() << "\nOFFLOADING IMAGE [" << Index << "]:\n";

outs() << left_justify("kind", 16) << getImageName(OB) << "\n";

outs() << left_justify("arch", 16) << OB.getArch() << "\n";

MaskRayUnsubmitted

Not Done

Append [ to OFFLOAD IMAGE

jhuber6AuthorUnsubmitted

Done

Could you elaborate on this? Do you want it formatted like OFFLOADING IMAGE[0]?

MaskRayUnsubmitted

Not Done

outs() << "\nOFFLOADING IMAGE [" << Index << "]:\n"

outs() << left_justify("triple", 16) << OB.getTriple() << "\n";

outs() << left_justify("producer", 16)

<< getOffloadKindName(OB.getOffloadKind()) << "\n";

}

jhendersonUnsubmitted

Done

Test case for the error path?

static Error visitAllBinaries(const OffloadBinary &OB) {

uint64_t Offset = 0;

uint64_t Index = 0;

while (Offset < OB.getMemoryBufferRef().getBufferSize()) {

MemoryBufferRef Buffer =

MemoryBufferRef(OB.getData().drop_front(Offset), OB.getFileName());

auto BinaryOrErr = OffloadBinary::create(Buffer);

if (!BinaryOrErr)

return BinaryOrErr.takeError();

OffloadBinary &Binary = **BinaryOrErr;

printBinary(Binary, Index++);

Offset += Binary.getSize();

MaskRayUnsubmitted

Not Done

delete unneeded blank line

}

jhendersonUnsubmitted

Not Done

It's extremely unfortunate that this relies on section names rather than section types. In my opinion, it would be far more appropriate to have a SHT_LLVM_OFFLOAD section type (or similar), so that section names don't need looking up and comparing to find the relevant section. ELF is designed really to use types for section comparison, not really names.

jhuber6AuthorUnsubmitted

Done

I wanted to keep it somewhat generic in case we ever want to get this to work on COFF / Mach-O, and names were the easiest solution at the time. I can try to make a change for that in the future; it would definitely be a better solution than checking a magic string.

jhendersonUnsubmitted

Not Done

I'm not familiar with Mach-O, although I know that COFF does rely on strings, but it's quite normal to vary this somewhat between file formats, due to the different features available in those formats.

Changing to a specific type for ELF may be a prerequisite for a proper yaml2obj implementation anyway, so that yaml2obj knows how to parse the YAML for such sections.

jhendersonUnsubmitted

Not Done

Could I clarify when you said making a change in the future for the name -> type bit that you mean in a future patch?

jhendersonUnsubmitted

Not Done

Not answered.

jhuber6AuthorUnsubmitted

Done

Yes, I was saying we could do this in a future patch as I didn't think it was a blocking issue for the functionality of this patch. The main reason I did this is just because it was the easiest common solution between extracting these from LLVM-IR and an ELF, that is the section's string will be the same. I was just planning on adding it to the list of existing ones in ELF.h, but as this worked overall I thought it was sufficient to land this patch.

jhendersonUnsubmitted

Not Done

Okay.

return Error::success();

}

/// Print the embedded offloading contents of an ObjectFile \p O.

jhendersonUnsubmitted

Not Done

Error path test?

jhuber6AuthorUnsubmitted

Done

Since I needed to use prepackaged binaries it was a little hard to make the test cases contain errors. This path specifically is just for an ELF that's broken somehow and can't get the section contents so I don't think it's relevant to the feature.

jhendersonUnsubmitted

Not Done

The fact that you've had to add code to handle the error shows that it is relevant, otherwise if the error handling is broken, you won't know. It should be fairly easy to test this using yaml2obj, which has the ability to overwrite the sh_offset field of the section header (using a "SHOffset" field name, if I remember rightly - look for examples).

void llvm::dumpOffloadBinary(const ObjectFile &O) {

for (SectionRef Sec : O.sections()) {

Expected<StringRef> Name = Sec.getName();

if (!Name || !Name->startswith(OffloadSectionString))

MaskRayUnsubmitted

Not Done

Use const reference if non-null

continue;

jhendersonUnsubmitted

Not Done

Error path test?

jhuber6AuthorUnsubmitted

Done

This basically only happens if the section doesn't have the necessary magic bytes or is too small, I'll try to add a test for that.

jhendersonUnsubmitted

Not Done

Usually it's a good idea to add more context to error messages that come out of the low-level libraries (see https://llvm.org/docs/CodingStandards.html#error-and-warning-messages). There are a number of examples of how this is done elsewhere in places like llvm-objdump and especially the llvm-readobj code. For example, the existing test case indicates simply that there's a problem with some encoding somewhere, but it's not clear which section that applies to (imagine if you were dumping multiple different sections at the same time).

jhendersonUnsubmitted

Done

Nit: SectionRef is designed to be lightweight and copyable (like llvm::StringRef) so there's no particular need to use const & here.

Expected<StringRef> Contents = Sec.getContents();

if (!Contents)

reportError(Contents.takeError(), O.getFileName());

jhendersonUnsubmitted

Not Done

This seems to be just throwing away all errors. Are you sure that's what you meant to do, and not to report them with reportError? If so, did you mean to use consumeError (or possibly even cantFail)?

jhuber6AuthorUnsubmitted

Done

Those are probably better options, thanks. Yes, throwing away the errors is the intended solution here. These binaries are individual files stored in a big blob, when we parse one we check the rest of the buffer to see if there are more. I did it this way so when sections get merged via ld -r foo.o bar.o we can still find the files. It leads to a little weirdness with parsing however. Basically I only check errors for the first file in the section, if we fail to find any after that one we shouldn't treat it as an error.

jhendersonUnsubmitted

Not Done

The usual pattern in more up-to-date dumping tools is to report warnings rather than errors, and then to continue parsing as best as possible (or to bail out of the section if that is impossible). This allows us to get the maximum possible information out.

MemoryBufferRef Buffer = MemoryBufferRef(*Contents, O.getFileName());

auto BinaryOrErr = OffloadBinary::create(Buffer);

traUnsubmitted

Not Done

I don't think the 'single' part of this assertion is true. AFAICT, extractAllBinaries will happily print all subsequent binaries if it finds them in the buffer. I think this should call printBinary instead.

jhuber6AuthorUnsubmitted

Done

Yeah, I meant it more like to print on the single file that was already extracted or something. But it can definitely contain multiple. The reason I chose this method is because I wanted something that worked even if these sections were concatenated through a relocatable link or something. So whenever we parse one of these we just check the sizes to make sure there's not another one concatenated to it. I can make the comment less confusing.

traUnsubmitted

Not Done

I think the root of the problem here is that we're treating OffloadBinary as both the pointer to the binary itself and as a pointer to collection of such binaries.

I think it's not a good API -- extractAllBinaries gets to look under the hood of the implementation -- check if the containing buffer has extra space beyond the OffloadBinary it's been passed. What if the user places something else in the memory buffer right behind the OffloadBinary object passed to printOffloadBinary? They would be within their rights to do so, as the function would be expected to care about the content of *OB only.

I think we should be a bit more pedantic about such things. If we expect to operate on a collection, the API should reflect that. E.g. use SmallVector<OffloadBinary*>.
I think implementing ObjectFile::offload_sections() and OffloadSection::offload_binaries() would help both here and above. Or, possibly, just ObjectFile::offload_binaries() if we don't need to care about how binaries are stored in the object file and just want the offload binaries themselves.

jhuber6AuthorUnsubmitted

Done

So the problem is we don't know how many of these are in here until we parse it. This requires getting the size field within the OffloadBinary. So even if we abstracted it to this iterator, it would still need some parsing like this behind the scenes. I could have made the binary format contain many within a single binary image, but like I said I wanted this to be stable under arbitrary concatenation by the linker. I'm not sure if we could have a different API considering the parsing requirements.

This can definitely be problematic, depending on usage. I'm assuming if a user initialized an object on a memory buffer containing a bunch of junk it would probably be fine and just stop once the file is fully parsed. We could probably just ignore a parsing error, basically just stop trying to read things if we don't catch the magic bytes or there's not enough space left over, but that's probably not ideal.

It's definitely a little obtuse, but I'm not sure if there's a good way to make it work better considering how we parse them.

jhendersonUnsubmitted

Not Done

> I said I wanted this to be stable under arbitrary concatenation by the linker

Have you looked at how DWARF debug sections like .debug_line or .debug_aranges are structured? Typically, these sections have a header which contains information like total size of that section (or number of entries in the section) and version information. These sections are still concatenated, with the length simply representing the contribution from a single CU.

jhuber6AuthorUnsubmitted

Done

Right now I have a binary that knows its own size, and if the size of the buffer is greater than the size of that binary we look for another one. Forgive me if I'm misunderstanding here, but the linker will only concatenate sections right? Do these sections simply work as some kind of buffer whose size indicated how many sections were concatenated? That is, for every .llvm.offloading section I'd have some other reference section that just contains a single byte whose size I can check? Otherwise I'm not sure how we could figure out how many of these sections have been concatenated without parsing them first.

jhendersonUnsubmitted

Not Done

To be clear, I know very little about the new section type, how it is used and so on, so what I'm suggesting may not make much sense. Linkers concatenate sections blindly (in general). As such, if you had 1.o and 2.o each with a .llvm.offloading section, and you combine them into out.elf, you'd end up with a single output section containing the concatenation of the two. Presumably this means you'll end up with something that looks a bit like this?

.llvm.offloading
  .llvm.offloading(1.o) - size field
  .llvm.offloading(1.o) - rest of section
  .llvm.offloading(2.o) - size field
  .llvm.offloading(2.o) - rest of section

Is that correct? If so, I don't think there's anything to do here, assuming the section size is not guaranteed to be the same for all input sections.
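
The layout described here (self-sized entries laid back to back) can be walked as in the following sketch. It is a simplified model, not the real `OffloadBinary` format: `ModelHeader`, `ModelMagic`, `makeEntry` and `findEntries` are hypothetical names, and the header is reduced to a magic value plus a total size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical, simplified header. The real format carries more fields.
struct ModelHeader {
  std::uint32_t Magic;
  std::uint32_t Size; // Total size of this entry, header included.
};

constexpr std::uint32_t ModelMagic = 0x10FF10AD; // Arbitrary value for the model.

// Builds one entry with PayloadBytes of zeroed payload after the header.
std::vector<std::uint8_t> makeEntry(std::uint32_t PayloadBytes) {
  ModelHeader H{ModelMagic,
                static_cast<std::uint32_t>(sizeof(ModelHeader) + PayloadBytes)};
  std::vector<std::uint8_t> Out(H.Size, 0);
  std::memcpy(Out.data(), &H, sizeof(H));
  return Out;
}

// Records the start offset of each entry in a concatenated section. Returns
// false if a header is truncated, has the wrong magic, or claims a size that
// does not fit in the remaining bytes. Offsets found before the failure are
// kept so a caller can still print them and warn about the rest.
bool findEntries(const std::vector<std::uint8_t> &Section,
                 std::vector<std::size_t> &Offsets) {
  std::size_t Offset = 0;
  while (Offset < Section.size()) {
    std::size_t Remaining = Section.size() - Offset;
    if (Remaining < sizeof(ModelHeader))
      return false;
    ModelHeader H;
    std::memcpy(&H, Section.data() + Offset, sizeof(H));
    if (H.Magic != ModelMagic || H.Size < sizeof(ModelHeader) ||
        H.Size > Remaining)
      return false;
    Offsets.push_back(Offset);
    Offset += H.Size; // The next entry, if any, starts right after this one.
  }
  return true;
}
```

Because each entry carries its own size, the walk needs no count of how many input sections the linker merged; it only needs the entry sizes to add up to the section size.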

jhuber6AuthorUnsubmitted

Done

Yeah, that's more or less how it's set up. I'm assuming there's not much we can do to make parsing this easier.

jhendersonUnsubmitted

Done

You should exercise this error path by adding a SHOffset key to your ELF YAML with an invalid value.

jhendersonUnsubmitted

Not Done

Marked as done but I don't see it?

jhendersonUnsubmitted

Not Done

This needs addressing because this specific code path in llvm-objdump is otherwise untested. The test is needed to show that llvm-objdump properly handles errors under these specific circumstances, i.e. when Contents is in an error state. This is important because, without such a test, a change to the reportError call whereby Contents isn't used in the message could result in unchecked errors that would only manifest under real usage, rather than under testing as they should.

jhuber6AuthorUnsubmitted

Done

There's already usage of getContents() like this in llvm-objdump, I don't see why this is a special case. If someone changed the reportError function to not report errors it should show up somewhere. The point of this patch is not about checking if the ELF works, and we know from similar usage in llvm-objdump that this pattern reports errors if it's malformed. I can add a test if you really want me to, but I fail to see the point even with your hypothetical situation.

jhendersonUnsubmitted

Not Done

Imagine if the code were changed to the following by somebody:

if (!Contents)
  reportError("failed to get offloading section contents", O->getFileName());

There would be no test failure under any situation, because there is no test. You would get a crash though if someone were to try to use llvm-objdump with the enhanced error checks enabled, and ran into a malformed binary.

This isn't really a contrived example either: in an earlier revision of this patch, there was a similar situation, where the error was thrown away without getting its message, so I don't think it's unreasonable to assume that it could occur in a later revision of this code.

I'm not asking for testing that errors are reported when getting malformed contents, I'm asking for testing that the error returned by the lower-level function is handled by this higher-level one.
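
The failure mode in that hypothetical can be modelled in a few lines. This is a toy sketch, not LLVM's `Error` class: `ModelError`, `UncheckedErrors`, `reportWithContext` and `reportLiteralOnly` are all hypothetical, and the model counts unchecked errors instead of aborting so that the outcome can be observed.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Number of error objects destroyed without being inspected. The model counts
// them; a build with the enhanced checks mentioned above would abort instead.
static int UncheckedErrors = 0;

// Toy must-check error: it has to be consumed via take() before destruction.
class ModelError {
  std::string Msg;
  bool Checked;

public:
  explicit ModelError(std::string M) : Msg(std::move(M)), Checked(false) {}
  // Moving transfers the obligation to check; the source becomes inert.
  ModelError(ModelError &&Other)
      : Msg(std::move(Other.Msg)), Checked(Other.Checked) {
    Other.Checked = true;
  }
  ~ModelError() {
    if (!Checked)
      ++UncheckedErrors;
  }
  // Consumes the error and returns its message.
  std::string take() {
    Checked = true;
    return Msg;
  }
};

// Wraps the lower-level message in extra context; the error is consumed.
std::string reportWithContext(ModelError E, const std::string &File) {
  return File + ": while extracting offloading files: " + E.take();
}

// Reports a fixed string and never touches the error object, which is then
// destroyed unchecked when the call returns.
std::string reportLiteralOnly(ModelError, const std::string &File) {
  return File + ": failed to get offloading section contents";
}
```

Both functions produce a plausible-looking message, so only a test that actually drives a malformed input through the handler would reveal that the second one drops the underlying error.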

if (!BinaryOrErr)

reportError(O.getFileName(), "while extracting offloading files: " +

toString(BinaryOrErr.takeError()));

OffloadBinary &Binary = **BinaryOrErr;

jhendersonUnsubmitted

Done

Ditto: should this be consumeError?

jhendersonUnsubmitted

Not Done

I believe you'll end up with an assertion under the ABI breaking checks config (I think that's the name of it anyway), as you don't actually use the error within BinaryOrErr. I think it would still be good to include the message as reported by the underlying code, but wrapped in the additional context you've added. Also, would be good to include the input file name. Rough idea (uncertain on the exact invocation needed to get the string, as I'm too lazy to look it up right now!):

reportError("while extracting offloading files from \"" + O->getFileName() + "\": " + toString(BinaryOrErr));

// Print out all the binaries that are contained in this buffer. If we fail

// to parse a binary before reaching the end of the buffer emit a warning.

if (Error Err = visitAllBinaries(Binary))

jhendersonUnsubmitted

Not Done

OffloadBinary &Binary = **BinaryOrErr;

- // Print out all the binaries that are contained at this buffer. We want to

+ // Print out all the binaries that are contained in this buffer. We want to

// visit each offloading binary we can find, so failing to find one is not

jhendersonUnsubmitted

Not Done

Not addressed yet.

jhendersonUnsubmitted

Not Done

Not addressed.

jhuber6AuthorUnsubmitted

Done

I'll fix the typo.

reportWarning("while parsing offloading files: " +

jhendersonUnsubmitted

Done

Generally, we add a comment where we're deliberately throwing away errors, to explain why this is a good idea.

jhendersonUnsubmitted

Not Done

The question this comment needs to answer is WHY we should give up, rather than reporting the error (as a warning) and ending printing? Same goes below.

jhuber6AuthorUnsubmitted

Done

I'll try to clarify it.

jhendersonUnsubmitted

Not Done

The clarification is clear enough, but it doesn't explain why you can't at least print a warning. This wouldn't prevent the code continuing.

toString(std::move(Err)),

O.getFileName());

}

jhendersonUnsubmitted

Not Done

Same comment as above.

/// Print the contents of an offload binary file \p OB. This may contain

/// multiple binaries stored in the same buffer.

void llvm::dumpOffloadSections(const OffloadBinary &OB) {

jhendersonUnsubmitted

Done

Ditto.

// Print out all the binaries that are contained at this buffer. If we fail to

// parse a binary before reaching the end of the buffer emit a warning.

if (Error Err = visitAllBinaries(OB))

MaskRayUnsubmitted

Not Done

Use const reference if non-null

reportWarning("while parsing offloading files: " + toString(std::move(Err)),

OB.getFileName());

}

jhendersonUnsubmitted

Not Done

Ditto.

llvm/tools/llvm-objdump/llvm-objdump.cpp

Show All 14 Lines
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "llvm-objdump.h"		#include "llvm-objdump.h"
#include "COFFDump.h"		#include "COFFDump.h"
#include "ELFDump.h"		#include "ELFDump.h"
#include "MachODump.h"		#include "MachODump.h"
#include "ObjdumpOptID.h"		#include "ObjdumpOptID.h"
		#include "OffloadDump.h"
#include "SourcePrinter.h"		#include "SourcePrinter.h"
#include "WasmDump.h"		#include "WasmDump.h"
#include "XCOFFDump.h"		#include "XCOFFDump.h"
#include "llvm/ADT/IndexedMap.h"		#include "llvm/ADT/IndexedMap.h"
#include "llvm/ADT/Optional.h"		#include "llvm/ADT/Optional.h"
#include "llvm/ADT/STLExtras.h"		#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/SetOperations.h"		#include "llvm/ADT/SetOperations.h"
#include "llvm/ADT/SmallSet.h"		#include "llvm/ADT/SmallSet.h"
Show All 22 Lines
#include "llvm/Object/COFF.h"		#include "llvm/Object/COFF.h"
#include "llvm/Object/COFFImportFile.h"		#include "llvm/Object/COFFImportFile.h"
#include "llvm/Object/ELFObjectFile.h"		#include "llvm/Object/ELFObjectFile.h"
#include "llvm/Object/ELFTypes.h"		#include "llvm/Object/ELFTypes.h"
#include "llvm/Object/FaultMapParser.h"		#include "llvm/Object/FaultMapParser.h"
#include "llvm/Object/MachO.h"		#include "llvm/Object/MachO.h"
#include "llvm/Object/MachOUniversal.h"		#include "llvm/Object/MachOUniversal.h"
#include "llvm/Object/ObjectFile.h"		#include "llvm/Object/ObjectFile.h"
		#include "llvm/Object/OffloadBinary.h"
#include "llvm/Object/Wasm.h"		#include "llvm/Object/Wasm.h"
#include "llvm/Option/Arg.h"		#include "llvm/Option/Arg.h"
#include "llvm/Option/ArgList.h"		#include "llvm/Option/ArgList.h"
#include "llvm/Option/Option.h"		#include "llvm/Option/Option.h"
#include "llvm/Support/Casting.h"		#include "llvm/Support/Casting.h"
#include "llvm/Support/Debug.h"		#include "llvm/Support/Debug.h"
#include "llvm/Support/Errc.h"		#include "llvm/Support/Errc.h"
#include "llvm/Support/FileSystem.h"		#include "llvm/Support/FileSystem.h"
▲ Show 20 Lines • Show All 115 Lines • ▼ Show 20 Lines
bool objdump::DisassembleAll;		bool objdump::DisassembleAll;
bool objdump::SymbolDescription;		bool objdump::SymbolDescription;
static std::vector<std::string> DisassembleSymbols;		static std::vector<std::string> DisassembleSymbols;
static bool DisassembleZeroes;		static bool DisassembleZeroes;
static std::vector<std::string> DisassemblerOptions;		static std::vector<std::string> DisassemblerOptions;
DIDumpType objdump::DwarfDumpType;		DIDumpType objdump::DwarfDumpType;
static bool DynamicRelocations;		static bool DynamicRelocations;
static bool FaultMapSection;		static bool FaultMapSection;
static bool FileHeaders;		static bool FileHeaders;
		jhenderson (done): Nit: it looks like there's a half-hearted attempt to have these fields in some form of alphabetical order - it's certainly not 100%, but I feel like this option probably belongs close to the `RawClangAST` option below.
bool objdump::SectionContents;		bool objdump::SectionContents;
static std::vector<std::string> InputFilenames;		static std::vector<std::string> InputFilenames;
bool objdump::PrintLines;		bool objdump::PrintLines;
static bool MachOOpt;		static bool MachOOpt;
std::string objdump::MCPU;		std::string objdump::MCPU;
std::vector<std::string> objdump::MAttrs;		std::vector<std::string> objdump::MAttrs;
bool objdump::ShowRawInsn;		bool objdump::ShowRawInsn;
bool objdump::LeadingAddr;		bool objdump::LeadingAddr;
		static bool Offloading;
static bool RawClangAST;		static bool RawClangAST;
bool objdump::Relocations;		bool objdump::Relocations;
		jhenderson (done): Put it on the line before `RawClangAST`, since `Offloading` appears before `RawClangAST` lexicographically.
bool objdump::PrintImmHex;		bool objdump::PrintImmHex;
bool objdump::PrivateHeaders;		bool objdump::PrivateHeaders;
std::vector<std::string> objdump::FilterSections;		std::vector<std::string> objdump::FilterSections;
bool objdump::SectionHeaders;		bool objdump::SectionHeaders;
static bool ShowLMA;		static bool ShowLMA;
bool objdump::PrintSource;		bool objdump::PrintSource;

static uint64_t StartAddress;		static uint64_t StartAddress;
[...]	static void dumpObject(ObjectFile *O, const Archive *A = nullptr,
if (WeakBind)		if (WeakBind)
printWeakBindTable(O);		printWeakBindTable(O);

// Other special sections:		// Other special sections:
if (RawClangAST)		if (RawClangAST)
printRawClangAST(O);		printRawClangAST(O);
if (FaultMapSection)		if (FaultMapSection)
printFaultMaps(O);		printFaultMaps(O);
		if (Offloading)
		dumpOffloadBinary(*O);
}		}

static void dumpObject(const COFFImportFile *I, const Archive *A,		static void dumpObject(const COFFImportFile *I, const Archive *A,
const Archive::Child *C = nullptr) {		const Archive::Child *C = nullptr) {
StringRef ArchiveName = A ? A->getFileName() : "";		StringRef ArchiveName = A ? A->getFileName() : "";

// Avoid other output when using a raw option.		// Avoid other output when using a raw option.
if (!RawClangAST)		if (!RawClangAST)
[...]	static void dumpInput(StringRef file) {
Binary &Binary = *OBinary.getBinary();		Binary &Binary = *OBinary.getBinary();

if (Archive *A = dyn_cast<Archive>(&Binary))		if (Archive *A = dyn_cast<Archive>(&Binary))
dumpArchive(A);		dumpArchive(A);
else if (ObjectFile *O = dyn_cast<ObjectFile>(&Binary))		else if (ObjectFile *O = dyn_cast<ObjectFile>(&Binary))
dumpObject(O);		dumpObject(O);
else if (MachOUniversalBinary *UB = dyn_cast<MachOUniversalBinary>(&Binary))		else if (MachOUniversalBinary *UB = dyn_cast<MachOUniversalBinary>(&Binary))
parseInputMachO(UB);		parseInputMachO(UB);
		else if (OffloadBinary *OB = dyn_cast<OffloadBinary>(&Binary))
		dumpOffloadSections(*OB);
else		else
reportError(errorCodeToError(object_error::invalid_file_type), file);		reportError(errorCodeToError(object_error::invalid_file_type), file);
}		}
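The chain in `dumpInput` tries each binary kind in turn and reports an error when none matches; this patch adds `OffloadBinary` as one more case. The following is a minimal, self-contained sketch of that ordering, not the tool's code: the classes are empty stand-ins for the LLVM types, `classifyBinary` is a hypothetical helper, and standard `dynamic_cast` is swapped in for LLVM's `dyn_cast`, which relies on `classof` rather than RTTI.

```cpp
#include <string>

// Empty stand-ins for the LLVM object classes (assumption: the real
// hierarchy is richer; only the dispatch order is modeled here).
struct Binary {
  virtual ~Binary() = default;
};
struct Archive : Binary {};
struct ObjectFile : Binary {};
struct MachOUniversalBinary : Binary {};
struct OffloadBinary : Binary {};
struct UnknownBinary : Binary {};

// Hypothetical helper: returns the name of the handler that the chain in
// dumpInput would pick, or "invalid_file_type" if no case matches.
std::string classifyBinary(const Binary &B) {
  if (dynamic_cast<const Archive *>(&B))
    return "dumpArchive";
  if (dynamic_cast<const ObjectFile *>(&B))
    return "dumpObject";
  if (dynamic_cast<const MachOUniversalBinary *>(&B))
    return "parseInputMachO";
  if (dynamic_cast<const OffloadBinary *>(&B))
    return "dumpOffloadSections";
  return "invalid_file_type";
}
```

In this model a standalone offload binary reaches `dumpOffloadSections` directly, while one embedded in an object file would be reached through the `dumpObject` path instead.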

template <typename T>		template <typename T>
static void parseIntArg(const llvm::opt::InputArgList &InputArgs, int ID,		static void parseIntArg(const llvm::opt::InputArgList &InputArgs, int ID,
T &Value) {		T &Value) {
if (const opt::Arg *A = InputArgs.getLastArg(ID)) {		if (const opt::Arg *A = InputArgs.getLastArg(ID)) {
[...]	if (const opt::Arg *A = InputArgs.getLastArg(OBJDUMP_dwarf_EQ)) {
DwarfDumpType = StringSwitch<DIDumpType>(A->getValue())		DwarfDumpType = StringSwitch<DIDumpType>(A->getValue())
.Case("frames", DIDT_DebugFrame)		.Case("frames", DIDT_DebugFrame)
.Default(DIDT_Null);		.Default(DIDT_Null);
if (DwarfDumpType == DIDT_Null)		if (DwarfDumpType == DIDT_Null)
invalidArgValue(A);		invalidArgValue(A);
}		}
DynamicRelocations = InputArgs.hasArg(OBJDUMP_dynamic_reloc);		DynamicRelocations = InputArgs.hasArg(OBJDUMP_dynamic_reloc);
FaultMapSection = InputArgs.hasArg(OBJDUMP_fault_map_section);		FaultMapSection = InputArgs.hasArg(OBJDUMP_fault_map_section);
		Offloading = InputArgs.hasArg(OBJDUMP_offloading);
FileHeaders = InputArgs.hasArg(OBJDUMP_file_headers);		FileHeaders = InputArgs.hasArg(OBJDUMP_file_headers);
SectionContents = InputArgs.hasArg(OBJDUMP_full_contents);		SectionContents = InputArgs.hasArg(OBJDUMP_full_contents);
PrintLines = InputArgs.hasArg(OBJDUMP_line_numbers);		PrintLines = InputArgs.hasArg(OBJDUMP_line_numbers);
InputFilenames = InputArgs.getAllArgValues(OBJDUMP_INPUT);		InputFilenames = InputArgs.getAllArgValues(OBJDUMP_INPUT);
MachOOpt = InputArgs.hasArg(OBJDUMP_macho);		MachOOpt = InputArgs.hasArg(OBJDUMP_macho);
MCPU = InputArgs.getLastArgValue(OBJDUMP_mcpu_EQ).str();		MCPU = InputArgs.getLastArgValue(OBJDUMP_mcpu_EQ).str();
MAttrs = commaSeparatedValues(InputArgs, OBJDUMP_mattr_EQ);		MAttrs = commaSeparatedValues(InputArgs, OBJDUMP_mattr_EQ);
ShowRawInsn = !InputArgs.hasArg(OBJDUMP_no_show_raw_insn);		ShowRawInsn = !InputArgs.hasArg(OBJDUMP_no_show_raw_insn);
[...]	if (StartAddress >= StopAddress)
reportCmdLineError("start address should be less than stop address");		reportCmdLineError("start address should be less than stop address");

// Removes trailing separators from prefix.		// Removes trailing separators from prefix.
while (!Prefix.empty() && sys::path::is_separator(Prefix.back()))		while (!Prefix.empty() && sys::path::is_separator(Prefix.back()))
Prefix.pop_back();		Prefix.pop_back();

if (AllHeaders)		if (AllHeaders)
ArchiveHeaders = FileHeaders = PrivateHeaders = Relocations =		ArchiveHeaders = FileHeaders = PrivateHeaders = Relocations =
SectionHeaders = SymbolTable = true;		SectionHeaders = SymbolTable = true;
		tra (not done): Should `Offloading` be added here so that it is included in the output when `--all-headers` is in effect?
		jhuber6 (author, done): I'm not sure. I would say no, since this is more for all the `ELF` contents, and the offloading stuff is simply data contained in there.

if (DisassembleAll || PrintSource || PrintLines ||		if (DisassembleAll || PrintSource || PrintLines ||
!DisassembleSymbols.empty())		!DisassembleSymbols.empty())
Disassemble = true;		Disassemble = true;

if (!ArchiveHeaders && !Disassemble && DwarfDumpType == DIDT_Null &&		if (!ArchiveHeaders && !Disassemble && DwarfDumpType == DIDT_Null &&
!DynamicRelocations && !FileHeaders && !PrivateHeaders && !RawClangAST &&		!DynamicRelocations && !FileHeaders && !PrivateHeaders && !RawClangAST &&
!Relocations && !SectionHeaders && !SectionContents && !SymbolTable &&		!Relocations && !SectionHeaders && !SectionContents && !SymbolTable &&
!DynamicSymbolTable && !UnwindInfo && !FaultMapSection &&		!DynamicSymbolTable && !UnwindInfo && !FaultMapSection && !Offloading &&
!(MachOOpt && (Bind || DataInCode || DyldInfo || DylibId || DylibsUsed ||		!(MachOOpt && (Bind || DataInCode || DyldInfo || DylibId || DylibsUsed ||
ExportsTrie || FirstPrivateHeader || FunctionStarts ||		ExportsTrie || FirstPrivateHeader || FunctionStarts ||
IndirectSymbols || InfoPlist || LazyBind || LinkOptHints ||		IndirectSymbols || InfoPlist || LazyBind || LinkOptHints ||
ObjcMetaData || Rebase || Rpaths || UniversalHeaders ||		ObjcMetaData || Rebase || Rpaths || UniversalHeaders ||
WeakBind || !FilterSections.empty()))) {		WeakBind || !FilterSections.empty()))) {
T->printHelp(ToolName);		T->printHelp(ToolName);
return 2;		return 2;
}		}
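The option handling in this hunk can be modeled in isolation. The sketch below is a reduced illustration under stated assumptions, not the tool's implementation: `DumpFlags`, `parseFlags`, and `hasAction` are hypothetical names, only a handful of flags are modeled, and the flag spellings are assumed. It shows the two behaviors discussed above: `--all-headers` does not turn on `Offloading`, and `Offloading` on its own counts as a requested action, so the help text is not printed.

```cpp
#include <string>
#include <vector>

// Hypothetical, reduced option state; the real tool tracks many more flags.
struct DumpFlags {
  bool AllHeaders = false;
  bool FileHeaders = false;
  bool SectionHeaders = false;
  bool SymbolTable = false;
  bool Offloading = false;
};

// Parse a simplified argument list (flag spellings are assumptions).
DumpFlags parseFlags(const std::vector<std::string> &Args) {
  DumpFlags F;
  for (const std::string &Arg : Args) {
    if (Arg == "--all-headers")
      F.AllHeaders = true;
    else if (Arg == "--file-headers")
      F.FileHeaders = true;
    else if (Arg == "--section-headers")
      F.SectionHeaders = true;
    else if (Arg == "--syms")
      F.SymbolTable = true;
    else if (Arg == "--offloading")
      F.Offloading = true;
  }
  // --all-headers expands to the header-style flags only; Offloading is
  // left untouched, matching the author's reply in the review thread.
  if (F.AllHeaders)
    F.FileHeaders = F.SectionHeaders = F.SymbolTable = true;
  return F;
}

// True when at least one dump action was requested. When this is false,
// the tool prints its help text and returns 2.
bool hasAction(const DumpFlags &F) {
  return F.FileHeaders || F.SectionHeaders || F.SymbolTable || F.Offloading;
}
```

Keeping `Offloading` out of the `--all-headers` expansion while including it in the action check mirrors the two separate edits in the diff.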
[...]

Diff 441835