This is an archive of the discontinued LLVM Phabricator instance.

[Libomptarget] Add utility functions for loading an ELF symbol by name
ClosedPublic

Authored by jhuber6 on Aug 5 2022, 7:38 PM.

Details

Summary

The SHT_HASH sections in an ELF are used to look up a symbol in the
symbol table using a symbol's name. This is done by obtaining the
SHT_HASH section and using its sh_link attribute to access the
associated symbol table, from which we can access the string table
containing the associated name. We can then search for the symbol using
the hash of the name and the buckets and chains in the hash table
itself

This patch adds utility functions that allow us to look up a symbol in
an ELF file by name. It will first attempt to look through the hash
tables, and then search the section tables manually if failed. This
allows us to pull out constants necessary for setting up offloading
without first loading the object.

Diff Detail

Event Timeline

jhuber6 created this revision.Aug 5 2022, 7:38 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 5 2022, 7:38 PM
Herald added a subscriber: StephenFan. · View Herald Transcript
jhuber6 requested review of this revision.Aug 5 2022, 7:38 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 5 2022, 7:38 PM

Could you call this function from the amdgpu plugin in the same patch? In particular the sketchy looking hashtable logic is likely to be a copy from there with modifications to fit the llvm libs and it'll be easier to review with that in the diff scope.

Basically I can't remember the details of that lookup but remember writing it carefully at the time + it's been in use in that plugin for a year or so. I'd like to verify that the logic is unchanged by the move between files.

jhuber6 added a comment.EditedAug 8 2022, 4:45 AM

Could you call this function from the amdgpu plugin in the same patch? In particular the sketchy looking hashtable logic is likely to be a copy from there with modifications to fit the llvm libs and it'll be easier to review with that in the diff scope.

Basically I can't remember the details of that lookup but remember writing it carefully at the time + it's been in use in that plugin for a year or so. I'd like to verify that the logic is unchanged by the move between files.

I'd prefer not to make this patch really messy with all the required changes in Libomptarget, I can make a new patch for that and put it on the stack for this one. Here's the relevant file if you'd rather look at what's in-source first https://github.com/llvm/llvm-project/blob/main/openmp/libomptarget/plugins/amdgpu/src/rtl.cpp#L1630. For the functionality I figured the runtime test I added was sufficient, though I could make it more complicated.

I think you might be reinventing the wheel here. llvm-readobj already has code for printing the symbols derived from the hash table - see printHashTableSymbols. Can you reuse that code? Obviously, this might require refactoring some of the llvm-readobj logic, so that it is in libObject rather than the tool specifically.

I think you might be reinventing the wheel here. llvm-readobj already has code for printing the symbols derived from the hash table - see printHashTableSymbols. Can you reuse that code? Obviously, this might require refactoring some of the llvm-readobj logic, so that it is in libObject rather than the tool specifically.

The usage is somewhat different, as far as I can tell the llvm-readobj implementation is only concerned with printing all the symbols in the hash table. This function is intended to be used to look up a specific symbol if we already know the name.

I think you might be reinventing the wheel here. llvm-readobj already has code for printing the symbols derived from the hash table - see printHashTableSymbols. Can you reuse that code? Obviously, this might require refactoring some of the llvm-readobj logic, so that it is in libObject rather than the tool specifically.

The usage is somewhat different, as far as I can tell the llvm-readobj implementation is only concerned with printing all the symbols in the hash table. This function is intended to be used to look up a specific symbol if we already know the name.

Oops, yes, I see. Not sure how I missed that.

Could you clarify the use-case a bit more, please? The hash table is primarily for dynamic linking purposes, and consequently section headers aren't guaranteed to exist when doing this kind of lookup, with the dynamic tags like DT_HASH being used instead.

llvm/include/llvm/Object/ELF.h
518–519

You need to make sure that this value is within the ELF, and that there's enough size for the hash table. You should also validate that the sh_size is large enough to fit a valid hash table in, so that you don't try to look for e.g. the nbuckets field if the size is 0 or the Bucket or Chain arrays if the size is too small.

520–521

Similarly, you should validate that the symbol table fits within the ELF.

529

I'd use STN_UNDEF rather than 0, since that's what the spec uses.

530

I needs validating to ensure it doesn't fall outside the range of SymTab.

llvm/unittests/Object/ELFObjectFileTest.cpp
812

Your testing needs to be expanded to cover the various failure cases that you check for in your function.

jhuber6 marked 4 inline comments as done.Aug 12 2022, 4:52 AM

Oops, yes, I see. Not sure how I missed that.

Could you clarify the use-case a bit more, please? The hash table is primarily for dynamic linking purposes, and consequently section headers aren't guaranteed to exist when doing this kind of lookup, with the dynamic tags like DT_HASH being used instead.

Offloading execution to an external accelerator is very similar to loading a shared library via dlopen. The images we execute on an external device are basically dynamic objects we load at runtime. When loading these images there are certain symbols and constants we emit inside the ELF that are used to control how the image itself is executed (how many blocks and threads to launch for example). This method is currently used to look up these known constants by-name so we can read with without first needing to load the whole image. This method is supposed to be faster than iterating the entire symbol table, checking each one.

Interesting point about DT_HASH. Would this just require an extra lookup from the dynamic table if we don't have section headers? I'm assuming they'd both point to the same symbol / string table, or .

yaxunl added inline comments.Aug 12 2022, 7:57 AM
llvm/unittests/Object/ELFObjectFileTest.cpp
878

I am wondering whether it is useful to add a test for getting symbol values here.

jhuber6 updated this revision to Diff 452194.Aug 12 2022, 8:43 AM

Addressing comments and adding more error handling.

For the tests I had to manually edit the values with a const_cast, the ELF parsing was changing some things I think. I think this is fine for now since the interface requires a section header, if in the future we need to be able to do this without section headers we could probably make another interface to get the same pointers.

jhenderson added inline comments.Aug 15 2022, 1:25 AM
llvm/include/llvm/Object/ELF.h
505–516

All these checks still need test cases.

524–526

This is certainly part of what is needed, but it doesn't address the issue of if the sh_size is too small for the hash table to be valid (e.g. it is too small for the nchain or nbucket fields, or the size derived from this information is different to sh_size).

530

Not sure I see where this check is made, even though it's marked as done?

534–535

I would very much prefer not to use auto here. It's not obvious what the type of Bucket and Chain are.

llvm/unittests/Object/ELFObjectFileTest.cpp
823–825

SHF_ALLOC flags and Link: .dynsym are implicit for SHT_HASH sections in YAML, so you can omit those two fields for simplicity.

826

I think the AddressAlign field is unnecessary? (Also for the other sections)

831–833

Similar to above, SHF_ALLOC and Link: .dynstr are implicit, so don't need specifying.

842–845

I think it's probably irrelevant to the test what the contents/address of the section is: I don't believe anything validates that the symbols appear in the claimed section. You could even just use SHN_ABS dynamic symbols, if you wanted to avoid the question entirely. Removing these details would improve the signal to noise ratio in the setup data.

864

Nit: the preferred style for these comments is as inline - clang-format recognises this specific syntax and formats it slightly differently to other inline comments. Applies below too.

867

There's a lot of unhelpful auto in this test, which I don't think really conforms to LLVM's style guide.

871

Seems simpler?

878

I'm in two minds about it. However, if preferring to test just the symbol names, the other symbol properties can be omitted to simplify the YAML.

940

Rather than const_cast here, you should be able to use an StName field in the YAML. See https://github.com/llvm/llvm-project/blob/main/llvm/test/tools/yaml2obj/ELF/symbol-name.yaml for examples.

943

Use FailedWithMessage to check the error message contents. Same elsewhere.

945
946–947

You can use the .symtab's ShOffset field in the YAML to force this without needing the const_cast.

952
955

You can use the .hash's ShOffset field in the YAML to force this without needing the const_cast.

jhuber6 updated this revision to Diff 452649.Aug 15 2022, 6:30 AM

Addressing comments.

jhuber6 marked 16 inline comments as done.Aug 15 2022, 6:32 AM
jhuber6 added inline comments.
llvm/include/llvm/Object/ELF.h
505–516

Checking these errors explicitly is excessive. This isn't any new logic so it should already be covered by existing tests. I added one for calling this with an incorrect section but the other error paths are already covered as far as I'm aware.

llvm/unittests/Object/ELFObjectFileTest.cpp
867

ELF uses a lot of type-aliases depending on the input. It's much easier to just use auto rather than importing the names and using Expected<typename ELF64LE::xyz> for everything.

jhenderson added inline comments.Aug 16 2022, 2:20 AM
llvm/include/llvm/Object/ELF.h
505–516

I'm not suggesting the called code itself needs additional testing, merely testing that shows that if the called code reports an error, the error is correctly handled and returned up the call stack. The new if statements are new logic. As things stand, there is zero testing for those statements.

506

I'm looking at this error and wondering if it would be useful to print the actual type (possibly just as a number), so that if a user encounters this area, it might be slightly easier to find the problem. Up to you whether you think it's useful or not though.

527–528

If HashTab's size is 0, then you can't read the nbucket or nchain field, so this will result in reading outside the section. You'll need to split this into a check to make sure it's big enough for the nchain/nbucket fields, and then a separate one to make sure the arrays fit.

543–544

It's not clear in the final message what the bit after the colon means. I'd add some additional words clarifying that it is the number of symbols.

llvm/unittests/Object/ELFObjectFileTest.cpp
916

You might be able to avoid duplicating this string between some of the test cases, by having a factory function that constructs the string by having a prefix (which ends at the end of the section you want to modify) and a suffix (which starts immediately after it), and then an optional field that can be inserted in the middle. Essentially you could have lines like the following then in the different tests:

makeYAMLWithInvalidDynsym("ShOffset: 9999");
makeYAMLWithInvalidHash("ShSize: 1");

or perhaps:

makeYAML(DynsymPrefix, "ShOffset: 9999", DynsymSuffix);
makeYAML(HashPrefix, "ShSize: 1", HashSuffix);

Potentially, you could have that function construct the ELF and return the hash table too, to reduce the boiler plate.

995

I was slightly confused by first reading of this comment, so I'd add the "index" to be specific.

1035

"is" -> "it" here and in all the other comments.

Not sure which "table" is referred to in this comment. Probably should be "check that it fails if a symbol has an invalid st_name."

jhuber6 marked 6 inline comments as done.Aug 16 2022, 9:10 AM
jhuber6 added inline comments.
llvm/include/llvm/Object/ELF.h
505–516

I'll add a test for an invalid st_link. That should be good enough to assert that this part functions properly. I think it's safe to assert that getting a section from an index is sufficiently tested.

llvm/unittests/Object/ELFObjectFileTest.cpp
916

The solution I had previously that modified the ELF in-place was more compact because I could iteratively make different field invalid using the same input. I could go back to that method, otherwise I think it's too much work to make some factory generator just for saving a few string constants in a test.

jhuber6 updated this revision to Diff 453037.Aug 16 2022, 9:11 AM
jhuber6 marked an inline comment as done.

Addressing comments.

jhenderson added inline comments.Aug 18 2022, 12:48 AM
llvm/include/llvm/Object/ELF.h
505–516

(I assume you meant sh_link). That will cover the getSection if statement, but it doesn't cover the sections() if statement or the getStringTableForSymtab: every conditional should have at least one test case that triggers both the true and false cases, or you haven't fully tested the if.

Putting it in other terms, if things were easier to mock, I'd suggest mocking out the return values of each of those functions, so that in one case they produced an error and in the other case they produced a valid value that can be used later on, and then having test cases that used both mocked versions. In practice, what you have to do instead is craft some invalid ELF to trigger the error. I don't care how you achieve that, as long as the respective functions trigger errors.

506

Just pinging this point in case you missed it, as you've not responded one way or another (even with just a "no I don't think it's useful).

527–528

Thanks for the fix, although there are now two failure cases here, both of which need separate testing (i.e. one with a very small/nearly empty SHT_HASH and another where it has enough space for the header fields, but not the full bucket array).

llvm/unittests/Object/ELFObjectFileTest.cpp
916

Yes, I see your point. I do have a strong aversion to const_cast, but I don't really have much justification to it in this particular context. I see there is limited usage within LLVM unit tests of it too. I'd like a second opinion from a regular contributor in this area:

@MaskRay, do you have any thoughts?

1080

sh_link not st_link

jhuber6 marked an inline comment as done.Aug 18 2022, 4:24 AM
jhuber6 added inline comments.
llvm/include/llvm/Object/ELF.h
505–516

At this point half of the existing ELF runtime test is going to be this one feature. I personally don't see the utility in testing errors that already have coverage somewhere else. If we already know calling some function with an invalid ELF will return the correct error I don't see why repeating that same pattern elsewhere warrants extra tests since none of this logic is unique or new.

506

I don't think it's useful and this error message is very similar to other messages.

jhenderson added inline comments.Aug 18 2022, 4:50 AM
llvm/include/llvm/Object/ELF.h
505–516

Most of ELF.h is tested via lit tests rather than unittests, so it's a bit of a mischaracterisation that this feature will be half the runtime tests.

If you put a code coverage tool on this function, you wouldn't get 100% coverage without testing both halves of the ifs in question. A lot of work has gone into making sure that many/most/all of these sorts of functions have 100% coverage, both for the positive and the negative cases. That means testing needs to trigger every case where an error is handled (not just where it is reported) to show that the error handling is correct and that the error is appropriately propagated up the call stack. Note that I'm not arguing this for the sake of it: bugs can and do happen even in simple looking code around error handling.

I'm not budging on this point, because I am not willing to accept only partial test coverage, when it is relatively straightforward to add said coverage. I think you'll find that this is the same for others who regularly work on and review code in this area.

jhuber6 updated this revision to Diff 453626.Aug 18 2022, 5:21 AM

Adding a test for an invalid ELF header failing to get sections, an invalid link index, and an invalid hash size greater than the size of the header.

jhuber6 marked 3 inline comments as done.Aug 18 2022, 5:21 AM
MaskRay added inline comments.Aug 18 2022, 12:04 PM
llvm/unittests/Object/ELFObjectFileTest.cpp
916

I prefer avoiding const_cast, but for this case it's fine if an alternative needs too much boilerplate.

SmallString<0> Storage; and the toBinary<ELF64LE>(Storage, R"( use make it pretty clear the underlying object is actually writable.

MaskRay added a comment.EditedAug 18 2022, 12:12 PM

Many Linux distributions (see clang/lib/Driver/ToolChains/Linux.cpp:240 for Clang behavior; the distros typically patch their system GCC to do the same thing) use the linker option --hash-style=gnu, so most of their executabls/shared objects do not have SHT_HASH/DT_HASH. You may need to add SHT_GNU_HASH/DT_GNU_HASH support in a separate patch.

Actually, on Linux SHT_GNU_HASH occurs more than SHT_HASH. For symbol search's purpose, SHT_HASH is actually legacy now. If I am to implement the functionality, I may not even implement the SHT_HASH part. (I have a SHT_GNU_HASH state summary for many ELF based OSes https://groups.google.com/g/generic-abi/c/9L03yrxXPBc/m/WKuUjZshAQAJ)

llvm/include/llvm/Object/ELF.h
75

s/libelf/generic ABI/

546

You can save sh_size / sizeof(Elf_Sym) as a local variable, then you can replace I * sizeof(Elf_Sym) >= (*SymTabOrErr)->sh_size with a simpler check.

551

Prefer substr which is also available in std::string_view and is the unaliased function in StringRef.

jhuber6 marked 2 inline comments as done.Aug 18 2022, 12:23 PM

Many Linux distributions (clang/lib/Driver/ToolChains/Linux.cpp:240; they typically patch their system GCC to do the same thing) use the linker option --hash-style=gnu, so most of their executabls/shared objects do not have SHT_HASH/DT_HASH. You may need to add SHT_GNU_HASH/DT_GNU_HASH support.

Right now we use this for the offloading runtime library and the generated objects use SHT_HASH so it's sufficient for our purposes. Do we currently do anything with SHT_GNU_HASH? My understanding is that SHT_HASH is standard while SHT_GNU_HASH isn't officially documented so we would need to import some GNU code verbatim to LLVM, but I'm hardly an expert.

llvm/include/llvm/Object/ELF.h
75

I just moved this, but I can fix it while I'm here.

MaskRay added a comment.EditedAug 18 2022, 12:35 PM

Many Linux distributions (clang/lib/Driver/ToolChains/Linux.cpp:240; they typically patch their system GCC to do the same thing) use the linker option --hash-style=gnu, so most of their executabls/shared objects do not have SHT_HASH/DT_HASH. You may need to add SHT_GNU_HASH/DT_GNU_HASH support.

Right now we use this for the offloading runtime library and the generated objects use SHT_HASH so it's sufficient for our purposes. Do we currently do anything with SHT_GNU_HASH? My understanding is that SHT_HASH is standard while SHT_GNU_HASH isn't officially documented so we would need to import some GNU code verbatim to LLVM, but I'm hardly an expert.

I added " (I have a SHT_GNU_HASH state summary for many ELF based OSes https://groups.google.com/g/generic-abi/c/9L03yrxXPBc/m/WKuUjZshAQAJ)" perhaps after you were composing the reply.

SHT_GNU_HASH is quite universal now. At least all OS with serious LLVM support (except Solaris) support the format, with some OSes for a long time (almost 10 years; and glibc for 16 years).
SHT_HASH will probably just benefit Solaris and illumnos (I am not sure how the whole GPU stack works there.) (Also, Solaris rtld may be quite different anyway. I am uncertain we can support them with generic code.)

jhuber6 marked an inline comment as done.Aug 18 2022, 12:44 PM

I added " (I have a SHT_GNU_HASH state summary for many ELF based OSes https://groups.google.com/g/generic-abi/c/9L03yrxXPBc/m/WKuUjZshAQAJ)" perhaps after you were composing the reply.

SHT_GNU_HASH is quite universal now. At least all OS with serious LLVM support (except Solaris) support it, with some OSes for a long time (almost 10 years; and glibc for 16 years).
SHT_HASH will probably just benefit Solaris and illumnos (I am not sure how the whole GPU stack works there.)

I see, thanks for the information. It seems we have a GnuHash struct to represent the values in LLVM. Do we have the GNU hash function somewhere in the LLVM source? I don't see it used anywhere.

As for this patch, I'd say since this patch is pretty much done and it meets our (LLVM offloading's) immediate needs we may as well support it. Then I can make a follow-up patch that adds the GNU hash support. Since it'll use most of the same logic it should be relatively easy to add-on to this patch.

I added " (I have a SHT_GNU_HASH state summary for many ELF based OSes https://groups.google.com/g/generic-abi/c/9L03yrxXPBc/m/WKuUjZshAQAJ)" perhaps after you were composing the reply.

SHT_GNU_HASH is quite universal now. At least all OS with serious LLVM support (except Solaris) support it, with some OSes for a long time (almost 10 years; and glibc for 16 years).
SHT_HASH will probably just benefit Solaris and illumnos (I am not sure how the whole GPU stack works there.)

I see, thanks for the information. It seems we have a GnuHash struct to represent the values in LLVM. Do we have the GNU hash function somewhere in the LLVM source? I don't see it used anywhere.

As for this patch, I'd say since this patch is pretty much done and it meets our (LLVM offloading's) immediate needs we may as well support it. Then I can make a follow-up patch that adds the GNU hash support. Since it'll use most of the same logic it should be relatively easy to add-on to this patch.

I am sorry that I made the comment so late. I am travelling and will have more limited bandwidth next week.
I am quite sure the SHT_HASH will benefit nearly no user. If the SHT_HASH patch is created 3 years ago, it'd probably benefit some *BSD, but now, it probably doesn't add much value.
I agree this patch is pretty much done with decent tests now. Perhaps this can be changed to use SHT_GNU_HASH instead? Quite a large portion of the code can be reused and you will not waste your effort :)

https://flapenguin.me/elf-dt-gnu-hash has a great introduction of the format.

I added " (I have a SHT_GNU_HASH state summary for many ELF based OSes https://groups.google.com/g/generic-abi/c/9L03yrxXPBc/m/WKuUjZshAQAJ)" perhaps after you were composing the reply.

SHT_GNU_HASH is quite universal now. At least all OS with serious LLVM support (except Solaris) support it, with some OSes for a long time (almost 10 years; and glibc for 16 years).
SHT_HASH will probably just benefit Solaris and illumnos (I am not sure how the whole GPU stack works there.)

I see, thanks for the information. It seems we have a GnuHash struct to represent the values in LLVM. Do we have the GNU hash function somewhere in the LLVM source? I don't see it used anywhere.

As for this patch, I'd say since this patch is pretty much done and it meets our (LLVM offloading's) immediate needs we may as well support it. Then I can make a follow-up patch that adds the GNU hash support. Since it'll use most of the same logic it should be relatively easy to add-on to this patch.

I am sorry that I made the comment so late. I am travelling and will have more limited bandwidth next week.
I am quite sure the SHT_HASH will benefit nearly no user. If the SHT_HASH patch is created 3 years ago, it'd probably benefit some *BSD, but now, it probably doesn't add much value.
I agree this patch is pretty much done with decent tests now. Perhaps this can be changed to use SHT_GNU_HASH instead? Quite a large portion of the code can be reused and you will not waste your effort :)

https://flapenguin.me/elf-dt-gnu-hash has a great introduction of the format.

There's a patch in the stack view that's waiting on this one to land, so it meets the needs of that currently. I'm definitely on board with adding SHT_GNU_HASH support afterwards since it should be relatively simple given what's already here. Is there no utility in supporting both?

OK. I don't understand OpenMP and GPU enough to comment on that patch.
The DT_HASH support will have some utilitization if mips is to be supported. How good is LLVM's support for OpenMP/GPU on mips?

OK. I don't understand OpenMP and GPU enough to comment on that patch.
The DT_HASH support will have some utilitization if mips is to be supported. How good is LLVM's support for OpenMP/GPU on mips?

I think libomp should work, but not for GPU offloading with libomptarget as none of the offloading plugins target MIPS. I guess my understanding is that the compiler will typically emit both HASH and GNU_HASH sections and we previously used the old DT_HASH because it was simpler. We could probably switch over to using the GNU version and it would still work. I still feel like if this works already we may as well support this function for both DT_HASH and DT_GNU_HASH, but you're obviously more qualified to make that judgement.

Thanks for the information about mips.
For Linux, I think for x years non-mips distributions have shipped executables with both DT_HASH and DT_GNU_HASH,
for 16-x years, these distributions ship executables with only DT_GNU_HASH.

For example, Go vdso used to support DT_HASH but in 2017 they noticed that DT_GNU_HASH had to be supported: https://go-review.git.corp.google.com/c/go/+/45511/

Nowadays, there are some --hash-style=sysv uses which are all mips specific or ancient tech debt.
AIUI your patch does not need to deal with such ancient objects.
mips is in a weird situation as it uses a technique from last century to sort .dynsym. The .dynsym sorting is incompatible with DT_GNU_HASH.
(https://maskray.me/blog/2021-08-29-all-about-global-offset-table#dt_mips_local_gotno-and-dt_mips_symtabno-dt_mips_gotsym)

Mostly just nits remaining from me, plus a couple of "normal behaviour" test cases I missed earlier.

FWIW, I think implementing SHT_HASH support first, as this patch does, should definitely happen. This is because the SHT_HASH is the thing that is in the gABI, and not SHT_GNU_HASH, however widely the GNU version is supported. Last I checked my company's system didn't use GNU hash tables yet, for example. I wouldn't be opposed to follow-up work for the GNU hash table though.

Presumably LLD has the GNU hash alogrithm in its source code, since it can generate such tables? Could that code be moved to libObject if needed?

llvm/unittests/Object/ELFObjectFileTest.cpp
835–840

If I follow this correctly, y is in the same bucket and ahead of x, such that we see y, check its name, and then move onto x. Assuming that is right, it might be worth a brief comment stating why we have y (I think we should have it FWIW, to show that the bucket/chain logic is correct).

We definitely also want a test case where the symbol is not found. I think it would also be worth a test case involving multiple buckets, to show that we calculate the bucket correctly (this test case only currently has one bucket, so there's no risk that the wrong bucket would be selected).

916

Okay, thanks @MaskRay. @jhuber6, if you would prefer to revert to const_cast style, I'm not going to oppose it, although I'd ask that you keep the test cases separate still, with a helper function for the boiler plate to set up the ELF. The reasoning for this is mostly to keep the errors isolated (i.e. "the symbol would be found/nullptr returned except for this one thing").

923

I believe nchain is supposed to always match the size of the .dynsym. Not sure that it matters that much, but might be less confusing/risk of causing problems if it did in these test cases.

1064

I'm not sure "an invalid table" is clear as to the meaning. I'd go with "an invalid symbol name offset."

1106

Probably change this comment to "check that it fails when not referencing a dynamic symbol table" or something to that effect. "invalid sh_link" sounds like the value is outside the range of the section indexes, which is what the next case does.

1164–1165
jhuber6 updated this revision to Diff 453980.Aug 19 2022, 6:47 AM
jhuber6 marked 3 inline comments as done.

Updating tests

jhenderson added inline comments.Aug 22 2022, 12:02 AM
llvm/unittests/Object/ELFObjectFileTest.cpp
1106

You've updated the wrong comment in response to my suggestion here. This test case is the case where the sh_link is valid, but points at a .dynstr. The one you updated is the one that has a sh_link that is not within the section count range.

1164–1165

Not addressed?

jhuber6 updated this revision to Diff 454843.Aug 23 2022, 7:50 AM

Addressing comments. Hopefully this covers all nits so we can move forward.

jhuber6 marked 3 inline comments as done.Aug 23 2022, 7:50 AM
jhenderson accepted this revision.Aug 24 2022, 12:31 AM

LGTM, from my point of view, but I note a couple of @MaskRay's comments seem to not have been addressed. Please wait for him to confirm he is happy.

This revision is now accepted and ready to land.Aug 24 2022, 12:31 AM

LGTM, from my point of view, but I note a couple of @MaskRay's comments seem to not have been addressed. Please wait for him to confirm he is happy.

The only one I didn't address was using substr because we don't know the size I rely on the const char * being implicitly converted to a StringRef again.

MaskRay added a comment.EditedAug 24 2022, 11:54 PM

Mostly just nits remaining from me, plus a couple of "normal behaviour" test cases I missed earlier.

FWIW, I think implementing SHT_HASH support first, as this patch does, should definitely happen. This is because the SHT_HASH is the thing that is in the gABI, and not SHT_GNU_HASH, however widely the GNU version is supported. Last I checked my company's system didn't use GNU hash tables yet, for example. I wouldn't be opposed to follow-up work for the GNU hash table though.

OK, I did not know you have internal users with DT_HASH. https://maskray.me/blog/2022-08-21-glibc-and-dt-gnu-hash#what-is-dt_gnu_hash My investigation suggests that multiple major Linux distributions have switched to --hash-style=gnu before 2010...
Nowadays DT_HASH is unlikely to be used on non-mips Linux.
My mild objection is because such a feature would nearly assuredly benefit nobody, but I think I'll not hold a strong objection.
In case related, FreeBSD rtld has supported DT_GNU_HASH since 2012.
As of today, their clang driver does not pass --hash-style= to ld, so the default --hash-style=both is used, but this does not mean that DT_HASH is mandatory.

Note that the bloom filter in a GNU hash table is optional. @jhuber6 can just change the hash function, then the existing cost is almost a DT_GNU_HASH implementation.

Presumably LLD has the GNU hash alogrithm in its source code, since it can generate such tables? Could that code be moved to libObject if needed?

Mostly just nits remaining from me, plus a couple of "normal behaviour" test cases I missed earlier.

FWIW, I think implementing SHT_HASH support first, as this patch does, should definitely happen. This is because the SHT_HASH is the thing that is in the gABI, and not SHT_GNU_HASH, however widely the GNU version is supported. Last I checked my company's system didn't use GNU hash tables yet, for example. I wouldn't be opposed to follow-up work for the GNU hash table though.

OK, I did not know you have internal users with DT_HASH. https://maskray.me/blog/2022-08-21-glibc-and-dt-gnu-hash#what-is-dt_gnu_hash My investigation suggests that multiple major Linux distributions have switched to --hash-style=gnu before 2010...
Nowadays DT_HASH is unlikely to be used on non-mips Linux.
My mild objection is because such a feature would nearly assuredly benefit nobody, but I think I'll not hold a strong objection.
In case related, FreeBSD rtld has supported DT_GNU_HASH since 2012.
As of today, their clang driver does not pass --hash-style= to ld, so the default --hash-style=both is used, but this does not mean that DT_HASH is mandatory.

Note that the bloom filter in a GNU hash table is optional. @jhuber6 can just change the hash function, then the existing cost is almost a DT_GNU_HASH implementation.

Presumably LLD has the GNU hash alogrithm in its source code, since it can generate such tables? Could that code be moved to libObject if needed?

Are you okay with me landing this patch now and adding the GNU hash support later?

Mostly just nits remaining from me, plus a couple of "normal behaviour" test cases I missed earlier.

FWIW, I think implementing SHT_HASH support first, as this patch does, should definitely happen. This is because the SHT_HASH is the thing that is in the gABI, and not SHT_GNU_HASH, however widely the GNU version is supported. Last I checked my company's system didn't use GNU hash tables yet, for example. I wouldn't be opposed to follow-up work for the GNU hash table though.

OK, I did not know you have internal users with DT_HASH. https://maskray.me/blog/2022-08-21-glibc-and-dt-gnu-hash#what-is-dt_gnu_hash My investigation suggests that multiple major Linux distributions have switched to --hash-style=gnu before 2010...
Nowadays DT_HASH is unlikely to be used on non-mips Linux.
My mild objection is because such a feature would nearly assuredly benefit nobody, but I think I'll not hold a strong objection.
In case related, FreeBSD rtld has supported DT_GNU_HASH since 2012.
As of today, their clang driver does not pass --hash-style= to ld, so the default --hash-style=both is used, but this does not mean that DT_HASH is mandatory.

Note that the bloom filter in a GNU hash table is optional. @jhuber6 can just change the hash function, then the existing cost is almost a DT_GNU_HASH implementation.

Presumably LLD has the GNU hash alogrithm in its source code, since it can generate such tables? Could that code be moved to libObject if needed?

Are you okay with me landing this patch now and adding the GNU hash support later?

I wish that we don't add unneeded functionality...

MaskRay requested changes to this revision.Aug 25 2022, 6:53 PM
This revision now requires changes to proceed.Aug 25 2022, 6:53 PM

I wish that we don't add unneeded functionality...

It won't be strictly unneeded as the whole point of this was to factor out something done specifically for AMDGPU offloading into a proper ELF helper function. The question then is if we should switch that handling over to use GNU_HASH instead. That's a little out of scope of the original patch series I've been trying to land so I was hoping to get this in first and work on that later. We already support other functions using DT_HASH in Object/Elf.h so I figured it would be fine to support both as many applications provide both as you said.

OK, if openmp offloading already uses findOnlyShtHash and doesn't support mips, you may refactor the openmp code to switch to DT_GNU_HASH instead.

While DT_HASH has been mostly distinct in Linux distributions, D85742 added it. The support should be replaced with DT_GNU_HASH.

Changing the hash table amdgpu uses would be an ABI break needing work in the HSA loader and possibly the windows stack as well. I'm not aware of any inclination to do that. Glibc's deprecation decisions have no direct bearing on amdgpu as it doesn't use glibc.

This patch wants to move code from the openmp runtime into llvm. I'm not convinced that has any value relative to leaving it where it already is, perhaps Joseph would be better off applying these review induced changes to the runtime instead.

This patch wants to move code from the openmp runtime into llvm. I'm not convinced that has any value relative to leaving it where it already is, perhaps Joseph would be better off applying these review induced changes to the runtime instead.

That's probably the best road forward if this is a blocker, as it no longer provides the functionality we need. Most of the code can be directly put in the runtime instead, but it would be nice to have all the runtime and correctness tests.

jhuber6 updated this revision to Diff 455901.Aug 26 2022, 7:14 AM

Use sections() instead of rolling the array manually and duplicating an error check. Still handled by existing tests but the error message changed.

@MaskRay is this feature not going to be accepted? The next patch in this series fixes a bug in AMD's support and fixing it has been long overdue now. If you don't agree with this feature let me know so I can abandon the patch and put it in the runtime instead so we can move forward.

Changing the hash table amdgpu uses would be an ABI break needing work in the HSA loader and possibly the windows stack as well. I'm not aware of any inclination to do that. Glibc's deprecation decisions have no direct bearing on amdgpu as it doesn't use glibc.

This patch wants to move code from the openmp runtime into llvm. I'm not convinced that has any value relative to leaving it where it already is, perhaps Joseph would be better off applying these review induced changes to the runtime instead.

The host binaries do not use DT_HASH for many years. If AMDGPU wants to stick with DT_HASH, it's fine, but deviating from the host is probably a poor decision.

I do not object to using DT_HASH if AMDGPU needs it, but why does the patch expose getSymbolFromHashTable as an lib/Object API while it works well in the current setup that it stays as a private API in openmp/libomptarget/plugins/amdgpu/src/rtl.cpp?

getSymbolFromHashTable is not suitable if it only supports DT_HASH, as this is not useful for most Linux binaries:

  • most have only DT_GNU_HASH and no DT_HASH
  • for some having both, DT_GNU_HASH is used and DT_HASH is ignored
  • the API does some dynamic loader work which I am not sure is suitable to lib/Object. Reloation resolving and dlsym have more rules than implemented here and the implementation does not handle symbol versioning. How fast do you want? If the object only has a couple of dynamic symbols, just iterate over the whole DT_SYMTAB.

I do not object to using DT_HASH if AMDGPU needs it, but why does the patch expose getSymbolFromHashTable as an lib/Object API while it works well in the current setup that it stays as a private API in openmp/libomptarget/plugins/amdgpu/src/rtl.cpp?

In the process of transitioning libomptarget from libelf to lib/Object I figured this functionality was tied directly to ELF such that it justified pulling it out and testing directly. If you think it's not a proper fit for lib/Object you obviously have seniority here, where would this functionality go if not here? Or is this generally out of LLVM's scope.

getSymbolFromHashTable is not suitable if it only supports DT_HASH, as this is not useful for most Linux binaries:

  • most have only DT_GNU_HASH and no DT_HASH
  • for some having both, DT_GNU_HASH is used and DT_HASH is ignored

I made a patch in D132743 that adds the DT_GNU_HASH implementation, AMDGPU uses DT_HASH so I figured it was acceptable to support both given that DT_HASH isn't officially deprecated as far as I know.

  • the API does some dynamic loader work which I am not sure is suitable to lib/Object. Relocation resolving and dlsym have more rules than implemented here and the implementation does not handle symbol versioning. How fast do you want? If the object only has a couple of dynamic symbols, just iterate over the whole DT_SYMTAB.

I believe we simply iterated the dynamic symbol table at some point, the number of symbols depends on the user. Large applications have a sufficiently large number of device functions / variables to justify a faster lookup.

Yes, I believe the shortcomings of this implementation as it relates to dynamic loading is that we require section headers and do not handle symbol versions as you said. A better interface just for this could potentially have a getSymbolFromGnuHashTable and getSymbolFromSysVHashTable that don't use any section headers. Then have a function that attempts to find the relevant sections and call either one of those functions based on the ELF with optional versioning information. But as you mentioned this is pretty much re-implementing dlsym, which I don't know if LLVM has any desire to support that kind of functionality outside of our use-case.

if you're against including this functionality we can simply merge it back into libomptarget. Personally I feel there is utility in having ELF hash table traversal present in LLVM, as is making the gnu_hash function readily available as in D132696. If you feel like this shouldn't be provided by LLVM we can easily move it back into libomptarget. Alternatively, if you provide a path forward to provide something that you find useful from this we can move forward with that.

I do not object to using DT_HASH if AMDGPU needs it, but why does the patch expose getSymbolFromHashTable as an lib/Object API while it works well in the current setup that it stays as a private API in openmp/libomptarget/plugins/amdgpu/src/rtl.cpp?

In the process of transitioning libomptarget from libelf to lib/Object I figured this functionality was tied directly to ELF such that it justified pulling it out and testing directly. If you think it's not a proper fit for lib/Object you obviously have seniority here, where would this functionality go if not here? Or is this generally out of LLVM's scope.

getSymbolFromHashTable is not suitable if it only supports DT_HASH, as this is not useful for most Linux binaries:

  • most have only DT_GNU_HASH and no DT_HASH
  • for some having both, DT_GNU_HASH is used and DT_HASH is ignored

I made a patch in D132743 that adds the DT_GNU_HASH implementation, AMDGPU uses DT_HASH so I figured it was acceptable to support both given that DT_HASH isn't officially deprecated as far as I know.

  • the API does some dynamic loader work which I am not sure is suitable to lib/Object. Relocation resolving and dlsym have more rules than implemented here and the implementation does not handle symbol versioning. How fast do you want? If the object only has a couple of dynamic symbols, just iterate over the whole DT_SYMTAB.

I believe we simply iterated the dynamic symbol table at some point, the number of symbols depends on the user. Large applications have a sufficiently large number of device functions / variables to justify a faster lookup.

Yes, I believe the shortcomings of this implementation as it relates to dynamic loading is that we require section headers and do not handle symbol versions as you said. A better interface just for this could potentially have a getSymbolFromGnuHashTable and getSymbolFromSysVHashTable that don't use any section headers. Then have a function that attempts to find the relevant sections and call either one of those functions based on the ELF with optional versioning information. But as you mentioned this is pretty much re-implementing dlsym, which I don't know if LLVM has any desire to support that kind of functionality outside of our use-case.

if you're against including this functionality we can simply merge it back into libomptarget. Personally I feel there is utility in having ELF hash table traversal present in LLVM, as is making the gnu_hash function readily available as in D132696. If you feel like this shouldn't be provided by LLVM we can easily move it back into libomptarget. Alternatively, if you provide a path forward to provide something that you find useful from this we can move forward with that.

Having the public API may lure users to improperly use it and what it does (DT_HASH) is very different from the practice.
I think the DT_HASH code should reside in libomptarget

Having the public API may lure users to improperly use it and what it does (DT_HASH) is very different from the practice.
I think the DT_HASH code should reside in libomptarget

What if we changed the interface to instead search the ELF for the hash section rather than passing it in. Then we would only use the DT_HASH section if it is the only one present.

Having the public API may lure users to improperly use it and what it does (DT_HASH) is very different from the practice.
I think the DT_HASH code should reside in libomptarget

What if we changed the interface to instead search the ELF for the hash section rather than passing it in. Then we would only use the DT_HASH section if it is the only one present.

Can you elaborate? When adding a public API, it should be useful. If this is libomptarget specific (this DT_HASH case pretty much is), keep it there.

Can you elaborate? When adding a public API, it should be useful. If this is libomptarget specific (this DT_HASH case pretty much is), keep it there.

I'm asking because you seemed open to this patch earlier if only GNU_HASH was supported. Your main concern was that DT_HASH being present may lure users into using the wrong format. If the interface were changed users would not be able to use the wrong format. If instead you think that this is not useful at all then that's the end of it and I'll abandon this patch. There are other public API functions with single users so I figured it wasn't an issue, but it's your call.

Can you elaborate? When adding a public API, it should be useful. If this is libomptarget specific (this DT_HASH case pretty much is), keep it there.

I'm asking because you seemed open to this patch earlier if only GNU_HASH was supported. Your main concern was that DT_HASH being present may lure users into using the wrong format. If the interface were changed users would not be able to use the wrong format. If instead you think that this is not useful at all then that's the end of it and I'll abandon this patch. There are other public API functions with single users so I figured it wasn't an issue, but it's your call.

It's not a wrong format, but it is a not useful format, nearly not in use. So having the functionality as a public API is not useful. I am somewhat concerned of people using this to do dlsym like things, too, but it is minor.
If the amdgpu offloading stuff can implement the functionality without exposing an API in LLVMObject, that'll be great.

It's not a wrong format, but it is a not useful format, nearly not in use. So having the functionality as a public API is not useful. I am somewhat concerned of people using this to do dlsym like things, too, but it is minor.
If the amdgpu offloading stuff can implement the functionality without exposing an API in LLVMObject, that'll be great.

Okay, I'll change this patch to implement an API in libomptarget for getting symbols either via GNU_HASH, DT_HASH, or by a linear search of symbols in that order. I already wrote this patch and D132743 with all the testing necessary so I can at least be confident that the implementation works without a runtime test.

Side note, I am curious if there is ever a plan for LLVM to have dlsym features given that we already have the llvm-libc project.

Side note, I am curious if there is ever a plan for LLVM to have dlsym features given that we already have the llvm-libc project.

That probably wants bringing up on the relevant part of Discourse. If there are going to be multiple users of this ultimately, we don't want to have multiple implementations, if we can avoid it.

jhuber6 retitled this revision from [ELF] Add ability to get a symbol by name from the hash table section to [Libomptarget] Add utility functions for loading an ELF symbol by name.Sep 5 2022, 9:07 AM
jhuber6 edited the summary of this revision. (Show Details)
jhuber6 updated this revision to Diff 458028.Sep 5 2022, 9:08 AM
jhuber6 edited the summary of this revision. (Show Details)

Moving code into libomptarget.

Herald added a project: Restricted Project. · View Herald TranscriptSep 5 2022, 9:08 AM

I think we've got two users in libomptarget - the amdgpu plugin where the code was originally, and the cuda plugin where we'd like to call it from. That seems a good reason to put it under plugins/common, and we can move it to llvm proper if a user outside of libomptarget shows up. It's not a disaster if we end up with a couple of implementations of this by accident given it's a hashtable lookup.

jhuber6 updated this revision to Diff 458422.Sep 7 2022, 4:48 AM

Pulling in user of this function so it's easier to see what it's replacing.

jhuber6 updated this revision to Diff 458426.Sep 7 2022, 5:00 AM

Replacing REPORT with DP.

Implementation looks reasonable. Extra const annotations are fine. I'm not totally convinced implementation complexity has gone down - we're down 70 lines and up 210 - what're the users of the second format of hashtable, and of the linear scan? I'm guessing cuda?

openmp/libomptarget/plugins/amdgpu/src/rtl.cpp
1617 ↗(On Diff #458422)

REPORT -> DP for consistency with the rest of the plugin I think

openmp/libomptarget/plugins/common/elf_common/ELFSymbols.cpp
135 ↗(On Diff #458422)

So one of these is presumably dead, since the only caller is amdgpu. Does cuda use the other sort of hash table? If not, suggest we delete the unused code path for now

JonChesterfield accepted this revision.Sep 7 2022, 5:03 AM
JonChesterfield added inline comments.
openmp/libomptarget/plugins/amdgpu/src/rtl.cpp
1613 ↗(On Diff #458426)

Possible follow up is to have this return an Expected<SymbolInfo> but that's not worth doing inline with this patch

This revision is now accepted and ready to land.Sep 7 2022, 5:03 AM
jhuber6 added inline comments.Sep 7 2022, 5:09 AM
openmp/libomptarget/plugins/amdgpu/src/rtl.cpp
1613 ↗(On Diff #458426)

Once the plugins are rewritten we'll probably switch over.

openmp/libomptarget/plugins/common/elf_common/ELFSymbols.cpp
135 ↗(On Diff #458422)

I don't think CUDA even uses a hash table as it's some weird executable, for that we will need the exhaustive search below. I mostly kept this because we may have more users in the future and it was easier to just include it all since I already wrote it.

JonChesterfield added inline comments.Sep 7 2022, 5:11 AM
openmp/libomptarget/plugins/common/elf_common/ELFSymbols.cpp
135 ↗(On Diff #458422)

Yeah, none of that. Delete it please.