This is an archive of the discontinued LLVM Phabricator instance.

[Object] Add binary format for bundling offloading metadata
ClosedPublic

Authored by jhuber6 on Mar 19 2022, 8:20 AM.

Details

Summary

We need to embed certain metadata along with a binary image when we wish
to perform a device-linking job on it. Currently this metadata was
embedded in the section name of the data itself. This worked, but made
adding new metadata very difficult and didn't work if the user did any
sort of section linking.

This patch introduces a custom binary format for bundling offloading
metadata with a device object file. This binary format is fundamentally
a simple string map table with some additional data and an embedded
image. I decided to use a custom format rather than using an existing
format (ELF, JSON, etc) because of the specialty use-case of this. We
need a simple binary format that can be concatenated without requiring
other external dependencies.

This extension will make it easier to extend the linker wrapper's
capabilties with whatever data is necessary. Eventually this will allow
us to remove all the external arguments passed to the linker wrapper and
embed it directly in the host's linker so device linking behaves exactly
like host linking.

Diff Detail

Event Timeline

jhuber6 created this revision.Mar 19 2022, 8:20 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 19 2022, 8:20 AM
jhuber6 requested review of this revision.Mar 19 2022, 8:20 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptMar 19 2022, 8:20 AM
jhuber6 added inline comments.Mar 19 2022, 8:21 AM
clang/include/clang/Basic/Offloading.h
1 ↗(On Diff #416702)

Not sure if this should live in Clang or LLVM.

jhuber6 updated this revision to Diff 416713.Mar 19 2022, 11:21 AM

Update alignment to worse-case padding of 16. I'm not sure if there's a bette solution for preventing the linker from adding padding between these sections if they are combined.

jhuber6 updated this revision to Diff 416719.Mar 19 2022, 12:50 PM

Fix test after increasing alignment.

jhuber6 updated this revision to Diff 416749.Mar 19 2022, 5:09 PM

Changing to add alignment to the global variable. Should ensure that the section
alignment is correct.

tra added a subscriber: tra.Mar 21 2022, 12:31 PM

I decided to use a custom format rather than using an existing format (ELF, JSON, etc) because of the specialty use-case of this.

Will we ever need to process the files with tools built with a different LLVM version? E.g clang and lld may be built separately from different LLVM trees.
If so, then maintaining compatibility of the binary format will become more painful over time, compared to using json.

Considering that we already have json parsing and serialization in LLVM, I don't see much benefit growing yet another on-disk format. AFAICT, JSON should do the job encoding relevant data just fine and would need less maintenance long term.

clang/include/clang/Basic/Offloading.h
87 ↗(On Diff #416749)

On-disk format could use more comments.

87–104 ↗(On Diff #416749)

Given that it's an on-disk format I think these should be explicitly packed and carry a comment that it's an on-disk format which needs extra caution during future changes.

90 ↗(On Diff #416749)

What does Size cover and what units it uses -- bytes, number of Entries, something else?

95–96 ↗(On Diff #416749)

Should we use the matching enum types here?
We're using 16-bit enums, so it would save us on casts when we use those fields and would make it harder to set a wrong value by mistake.

97 ↗(On Diff #416749)

Do these flags have defined meaning?

98 ↗(On Diff #416749)

What are the offsets relative to?

jhuber6 added a comment.EditedMar 21 2022, 1:21 PM

I decided to use a custom format rather than using an existing format (ELF, JSON, etc) because of the specialty use-case of this.

Will we ever need to process the files with tools built with a different LLVM version? E.g clang and lld may be built separately from different LLVM trees.
If so, then maintaining compatibility of the binary format will become more painful over time, compared to using json.

As it stands now, this is only used by the linker-wrapper which is all clang. LLD is only called as part of the host link job so it's impossible to mix versions. That being said, eventually I want this functionality to be performed by the linker itself without needing the extra wrapper so it'll be possible in the future. I added the "Version" field specifically to emit a warning or error in the future if we change this. Even if we used JSON or some other form it would be a similar story if these get de-synced because we'd have different entries and keys. We could add to the struct without breaking the ABI since everything uses absolute offsets.

Considering that we already have json parsing and serialization in LLVM, I don't see much benefit growing yet another on-disk format. AFAICT, JSON should do the job encoding relevant data just fine and would need less maintenance long term.

I was definitely thinking about a lot of alternatives to making yet another on-disk binary format. I had a few discussions with @JonChesterfield on the subject. There were two things that put me off of using JSON primarily, correct me if I have any misconceptions.

  1. This is fundamentally a very thin metadata wrapper around a binary image. AFAIK if you want to stream binary data to a JSON you need to use a base64 or similar encoding, which makes the files bigger and adds some extra work. It's not a deal-breaker, but it's somewhat of a turn-off.
  2. There could be multiple of these contained in a single section, I primarily wanted something with a known size and easy to identify magic bytes to find where these live in the section. I wasn't sure how nicely this would interact with the JSON parser. It could also be done, but I wasn't sure how much utility there was.
clang/include/clang/Basic/Offloading.h
87 ↗(On Diff #416749)

Noted, will add more if we settle on this format.

87–104 ↗(On Diff #416749)

What do you mean by explicitly packed? And I added the "Version" field in the header so we can warn on old versions.

90 ↗(On Diff #416749)

Size of the entire file, so you can do Data[Header->Size] and potentially get the next one in memory.

95–96 ↗(On Diff #416749)

Yeah, the casts are annoying but I elected to be explicit with the size constraints in the header itself. Wasn't really sure which was better.

97 ↗(On Diff #416749)

They're unused for now, I was planning on making it a bitfield so we could pass things like whether or not this file contains debug information or is optimized, kind of like fatbinary.

98 ↗(On Diff #416749)

Absolute from the start of the file, can comment.

tra added a comment.Mar 21 2022, 2:42 PM

I was definitely thinking about a lot of alternatives to making yet another on-disk binary format. I had a few discussions with @JonChesterfield on the subject. There were two things that put me off of using JSON primarily, correct me if I have any misconceptions.

  1. This is fundamentally a very thin metadata wrapper around a binary image. AFAIK if you want to stream binary data to a JSON you need to use a base64 or similar encoding, which makes the files bigger and adds some extra work.

I didn't realize that content of the file is part of your packagin scheme. I've interpreted embed certain metadata along with a binary image as literally keeping the binaries as binaries and just adding a small blob with additional metadata. It was that data I meant to encode as JSON. Encoding the offload binaries themselves as JSON would indeed be wasteful.

We could keep the header as a binary (never changes on-disk format) and use JSON representation for the array of the entries (0-terminated string or string + length stored in the header) which also has fixed (as in version-agnostic) on-disk format, though of variable length.
Versioning still has to be dealt with, but now it would be independent of the on-disk format. The variable length of JSON is both a plus and a minus. On the positive side is that content is open. Some tool may add whatever is relevant for its own use, comments, provenance info, checksum, etc.
The downside is that it has variable length, so it would have to be written after the image binary. We would also need to deal with potential errors parsing JSON.

I don't have a strong preference either way. I think JSON may have few minor benefits, but the proposed binary format has the advantage of simplicity. We can always switch to json-encoded entries later by bumping the header version.

clang/include/clang/Basic/Offloading.h
87–104 ↗(On Diff #416749)

__attribute__((packed)). Otherwise you depend on assumed natural alignment and that is target-dependent, IIRC.

I was definitely thinking about a lot of alternatives to making yet another on-disk binary format. I had a few discussions with @JonChesterfield on the subject. There were two things that put me off of using JSON primarily, correct me if I have any misconceptions.

  1. This is fundamentally a very thin metadata wrapper around a binary image. AFAIK if you want to stream binary data to a JSON you need to use a base64 or similar encoding, which makes the files bigger and adds some extra work.

I didn't realize that content of the file is part of your packagin scheme. I've interpreted embed certain metadata along with a binary image as literally keeping the binaries as binaries and just adding a small blob with additional metadata. It was that data I meant to encode as JSON. Encoding the offload binaries themselves as JSON would indeed be wasteful.

We could keep the header as a binary (never changes on-disk format) and use JSON representation for the array of the entries (0-terminated string or string + length stored in the header) which also has fixed (as in version-agnostic) on-disk format, though of variable length.
Versioning still has to be dealt with, but now it would be independent of the on-disk format. The variable length of JSON is both a plus and a minus. On the positive side is that content is open. Some tool may add whatever is relevant for its own use, comments, provenance info, checksum, etc.
The downside is that it has variable length, so it would have to be written after the image binary. We would also need to deal with potential errors parsing JSON.

I don't have a strong preference either way. I think JSON may have few minor benefits, but the proposed binary format has the advantage of simplicity. We can always switch to json-encoded entries later by bumping the header version.

I like the idea of keeping the header, we could add an additional field to the header for the size of the entry and I feel like we'd be pretty future-proof if we want to change stuff. I think using this binary format for now is sufficient as long as we keep upgrading it to something more complex an open possibility.

I'm also not sure if I should extend this as a binary format inheriting from LLVM's Binary class. It would be a minimal amount of work but I'm not sure if this use-case warrants extending this to broader LLVM.

jhuber6 updated this revision to Diff 417129.Mar 21 2022, 4:11 PM

Add more comments and an entry size field to the header.

jhuber6 updated this revision to Diff 417685.Mar 23 2022, 10:36 AM

Updating to use path instead of generic cmdline, makes it a lot easier to pass it. Also just adding a reserved field in case I want to add the cmdline back.

jhuber6 updated this revision to Diff 418227.Mar 25 2022, 8:06 AM

Splitting this out into a patch for the format. Adding a unit test and changing
strings to now be an arbitrary string map. Hopefully the move to LLVM proper
won't draw ire for creating another binary format in LLVM.

jhuber6 retitled this revision from [Clang] Add binary format for bundling offloading metadata to [Object] Add binary format for bundling offloading metadata.Mar 25 2022, 8:07 AM
jhuber6 edited the summary of this revision. (Show Details)
jhuber6 edited the summary of this revision. (Show Details)
jhuber6 updated this revision to Diff 418255.Mar 25 2022, 9:39 AM

Changing test, uniform_int_distribution doesn't support char or uint8_t according to the standard.

Added some reviewers. I'd much prefer this used an existing binary format, DIY is prone to errors and maintenance hassles down the road. Don't care as much about which format as about it being one with an existing, tested implementation and ideally existing inspection tools.

Added some reviewers. I'd much prefer this used an existing binary format, DIY is prone to errors and maintenance hassles down the road. Don't care as much about which format as about it being one with an existing, tested implementation and ideally existing inspection tools.

I'm not married to the idea, and worst case scenario we can replace it with something else in the future. I'd like to get something like this working so I can finally make the new driver the default, so I'll just outline the problem and some of the potential solutions.

The issue is that we need to store some metadata along with a binary image so we know how to handle it at a later date. Currently we shove this in the name of the ELF section and parse it there, but that's not idea because more metadata will be needed in the future and it prevents us from doing things like relocatable linking or other merging (e.g. we want to store both an sm_70 and sm_35 image in the same file). So we want a binary format that can store some strings, other data. I'll just go over a brief overview of the options:

YAML / JSON

+ Ubiquitous simple text format for encoding object data
+ In-tree implementation
x Requires encoding to store the device image
x Will need binary padding and size calculation to make sure these merge properly in a section

Protocol Buffers

+ Well-tested
+ Implicit appending, no additional code required to handle merged sections.
x Out-of-tree, requires external dependencies to build and maintain in the future. No other use in Clang / LLVM

ELF

+ Ubiquitous tooling. Object extraction and copying for free
+ Simple key-value storage
x Difficult to calculate size, will need to figure out the size of the buffer and write it in later so we can read multiple appended sections.
x Difficult to create. The Elf object writer is completely tied to the MC backend. YAML2ELF would require base64 or similar again

MSGPACK

+ Exists in-tree in some form and well tested
+ Supports key-value storage
x Doesn't know its size, will need to add padding and a size field

Custom format

+ Relatively simple implementation that solves this specific problem
x No existing tooling support, more error prone

I decided to go with the custom format because it was the simplest to get working for a proof of concept to solve the problem I was immediately facing. I think ELF would be the next best if someones could suggest a way to write the data and get the size. MSGPACK seems to be @JonChesterfield's preferred method because it has a lot of use at AMD, it would work as long as we can figure out its size and get alignment. Let me know what suggestions you have because I really want to move forward with this.

Hey @jhuber6 , as discussed in multi-company meeting, I think that we will need at least an arch field somewhere in this. We would like to create multi-arch binaries so that runtime can load the compatible one on its own.
You may even consider using TargetID Format to store the list of archs.

Hey @jhuber6 , as discussed in multi-company meeting, I think that we will need at least an arch field somewhere in this. We would like to create multi-arch binaries so that runtime can load the compatible one on its own.
You may even consider using TargetID Format to store the list of archs.

The binary format contains a string map along with some integer fields. I have the getArch() function in the binary that just extracts the value get the "arch" key. This makes it easy to add some arbitrary data so I was planning on simply adding a "features" key as well. Then we can extract the associated image features and decide what to do with the image. I haven't thought of a good solution for allowing multiple compatible architectures, maybe a comma separated list of architectures. You can see the proposed usage right now in D122683 but more will be added.

Ping, I'd like to finalize the new driver in time for the GPU newsletter and the LLVM Performance Workshop at CGO.

JonChesterfield accepted this revision.Apr 13 2022, 7:27 AM

Couple of nits above but basically I'm convinced. The gnarly part of binary formats is string tables and I'm delighted that part of MC was readily reusable. Wrapping the string table in different bytes to align with the elf format may still be a good idea but it's not an obvious correctness hazard.

I like msgpack as a format but the writing machinery in llvm is not very reusable. Likewise elf is a great format but quite interwoven with MC. Protobuf seems to have the nice property of concatenating objects yielding a valid protobuf but the cost of codegen that isn't presently part of the llvm core, and is a slightly hairy dependency to pull in.

Medium term, factoring out parts of the elf handling for use here (and in lld?) is probably reasonable, but the leading magic bytes here are sufficient that we could detect that in backwards-compat fashion if the release gets ahead of us. The format here is essentially a string map which is likely to meet future requirements from other platforms adequately.

Thanks for sticking with this!

llvm/include/llvm/Object/OffloadBinary.h
75

these should probably be returning the enum types

98

enums here as well? They have uint16_t specified in the type so layout is stable

llvm/lib/Object/OffloadBinary.cpp
21

Not sure Expected<> helps hugely here - stuff only goes wrong as 'parse_failed' or failed to allocate, which is kind of the same thing - so we could return a default-initialized (null) unique_ptr on failure without loss of information

42

this is good, string table building is by far the most tedious part of formats like this

This revision is now accepted and ready to land.Apr 13 2022, 7:27 AM
jhuber6 updated this revision to Diff 422544.Apr 13 2022, 9:45 AM

Maxing suggested changes.

This revision was landed with ongoing or failed builds.Apr 14 2022, 7:51 AM
This revision was automatically updated to reflect the committed changes.