This is an archive of the discontinued LLVM Phabricator instance.

[ObjectYAML] Add offloading binary implementations for obj2yaml and yaml2obj
ClosedPublic

Authored by jhuber6 on Jun 14 2022, 12:41 PM.

Details

Summary

This patchs adds the necessary code for inspecting or creating offloading
binaries using the standing obj2yaml and yaml2obj features in LLVM.

Depends on D127774

Diff Detail

Event Timeline

jhuber6 created this revision.Jun 14 2022, 12:41 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 14 2022, 12:41 PM
jhuber6 requested review of this revision.Jun 14 2022, 12:41 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 14 2022, 12:41 PM

Have you got any reference material that I can read for the format of offloading binaries? I can't really review that this implementation does the right thing without knowing the format!

llvm/lib/ObjectYAML/OffloadYAML.cpp
27

Here and with all the other fields, you need test cases that show that, if they're optional, the fields can be omitted (and what their default values are if so), and if they're required, an error is emitted if the field is missing.

52

Here and below, these error messages should start with a lower-case letter, I suspect (see the Coding Standards for error message format).

That being said, in ELF yaml2obj, we allow specification of values that may not be permitted by the spec but could theoretically appear in the field, if something wrote the wrong value. This allows for testing of error paths more effectively. I can't give further pointers though without knowing the format and what makes sense to allow to handcraft.

llvm/test/ObjectYAML/Offload/multiple_members.yaml
2

I don't think you need --strict-whitespace as you're not really interested in the whitespace in this test.

I'm not sure there's much benefit in having a single and a multiple entry test case. I'd just have the multiple test case, and drop the single one.

13

Let's use a different Content for the two members. It avoids any risk of a content field being cached from one member to the other.

llvm/tools/obj2yaml/offload2yaml.cpp
2

This file isn't in "utils", so the header tag here is wrong.

jhuber6 marked 2 inline comments as done.Jun 15 2022, 4:38 AM

Have you got any reference material that I can read for the format of offloading binaries? I can't really review that this implementation does the right thing without knowing the format!

Sure, it's documented here https://clang.llvm.org/docs/ClangOffloadPackager.html. More or less, it's just a few integers followed by a list of offsets into an Elf-like string table that forms a string map.

llvm/lib/ObjectYAML/OffloadYAML.cpp
27

It's not really important that these be optional since there's not that many of them. I may just make them all required to avoid the hassle.

52

Honestly this error message isn't strictly necessary. In the linker wrapper where we actually use these binaries we handle unknown cases, so I may just remove these validation steps.

llvm/tools/obj2yaml/offload2yaml.cpp
2

Copied the header from archive2yaml.cpp which should probably be fixed as well.

jhuber6 updated this revision to Diff 437138.Jun 15 2022, 6:50 AM

Addressing comments.

jhenderson added inline comments.Jun 17 2022, 1:05 AM
llvm/include/llvm/ObjectYAML/OffloadYAML.h
32–33

I'd rather expect these two to be enums that correspond to the IMG and OFK values in the spec, with the option of specifying an arbitrary uint16_t value so that you can provide arbitrary values that don't have a defined mapping (this is a common pattern in ELF yaml2obj, for example).

39

The spec discsusses a wider range of fields than just the members themselves. In particular, I'd expect to see an optional Version field, so that you can test using an arbitrary version number. Having optional size and offset fields would also allow you to override the defaults of these values, allowing you to test error paths better (e.g. to show that if the offset is past the end of the file, an error message is reported). The one field you probably don't need is the Magic field, in my opinion.

llvm/test/ObjectYAML/Offload/default.yaml
13

Nit: looks like you've got multiple blank lines at EOF. The file should end in exactly one '\n', no more, no less.

jhuber6 marked 2 inline comments as done.Jun 17 2022, 6:27 AM
jhuber6 added inline comments.
llvm/include/llvm/ObjectYAML/OffloadYAML.h
39

Those field are internal to the binary format, and testing them is done by even being able to extract the members correctly in obj2yaml. I'd prefer not to add in ways to change it as it would massively clutter up the API for creating them. I don't see the utility in changing any of the offsets because there's no way to know if they are valid without crashing or printing out garbage, so we just assume every binary with a valid magic number is valid. The version only exists to change the ABI in the future if we ever need to, it's a constant as far as I'm concerned and if it's not 1 we just hard error immediately. I could add it in if you really, really, really think it's worth testing these paths. But I would rather not.

jhuber6 updated this revision to Diff 437882.Jun 17 2022, 6:54 AM

Moving to enumeration values.

jhenderson added inline comments.Jun 22 2022, 12:54 AM
llvm/include/llvm/ObjectYAML/OffloadYAML.h
39

The purpose of having yaml2obj support for file formats is primarily to be able to create test cases for both common use-cases and edge cases, particularly malformed input cases. For example, in the ELF YAML format, we allow overriding the sh_offset and other section header fields with values that wouldn't make sense. This in turn allows us to write tests for things like "llvm-readobj emits an error (instead of crashing) if the section header offset points past the end of the file)". So, the utility for changing the offsets is to be able to craft test cases that show that tools don't crash and instead give a sensible error if being given invalid values (clearly we can't usually tell the difference between a valid offset and another offset that points within the file, but we can at least make sure the offset is not past the end of the file). Similarly, how do you test that the tool correctly emits an error if a version > 1 is detected without having the ability to change the version field? Certainly in tools like llvm-readobj and llvm-objdump, we expect test cases for every code path (including error paths). This is important because we have to be able to rely on these low-level tools because they are so widely used in testing other tools.

obj2yaml support is not all that important typically, but just sort of falls out of the wash typically because of how it interacts with yaml2obj.

jhuber6 updated this revision to Diff 439417.Jun 23 2022, 8:39 AM

Add tweaking of specific binary features and add a test to make sure we handle it explicitly.

jhenderson added inline comments.Jun 23 2022, 11:11 PM
llvm/test/ObjectYAML/Offload/malformed.yaml
1

Thanks for the update, mostly looks good, but can I request this be split into a sequence of different test cases within the same file, where you only set one of the fields in each case? In other words, one with Version 2, another with just Size set etc.

jhuber6 updated this revision to Diff 439737.Jun 24 2022, 7:08 AM

Splitting tests

jhenderson accepted this revision.Jun 27 2022, 1:25 AM

LGTM.

llvm/test/ObjectYAML/Offload/malformed-offset.yaml
15 ↗(On Diff #439737)

I assume the body of this error message comes from code outside of this patch?

This revision is now accepted and ready to land.Jun 27 2022, 1:25 AM
This revision was landed with ongoing or failed builds.Jul 1 2022, 6:13 PM
This revision was automatically updated to reflect the committed changes.