The current implementation stops dumping in case of a single error
it handles, though we can continue dumping.
This patch refines it: it adds a few warnings and a few test cases.
Details
Diff Detail
Event Timeline
llvm/test/tools/llvm-readobj/ELF/gnuhash.test | ||
---|---|---|
205 |
we emit a warning when the dynamic symbol table is empty. A valid dynamic symbol table should have at least one symbol: the symbol with index 0. | |
llvm/tools/llvm-readobj/ELFDumper.cpp | ||
2525 | GnuHashTable->symndx == 1 is probably sufficient. The other conditions are redundant. |
llvm/tools/llvm-readobj/ELFDumper.cpp | ||
---|---|---|
2525 |
It is not, because it can be > 1. For example when we have an object with an 1.s: as 1.s -o 1.o lld 1.o -o 1.so -shared --hash-style=gnu llvm-readelf --gnu-hash-table 1.so GnuHashTable { Num Buckets: 1 First Hashed Symbol Index: 2 Num Mask Words: 1 Shift Count: 26 Bloom Filter: [0x0] Buckets: [0] Values: [] } For GNU ld the output is different, btw (but it does not seem to be critical to fix LLD here, I guess): ld.bfd 1.o -o 1.so -shared --hash-style=gnu llvm-readelf --gnu-hash-table 1.so GnuHashTable { Num Buckets: 1 First Hashed Symbol Index: 1 Num Mask Words: 1 Shift Count: 0 Bloom Filter: [0x0] Buckets: [0] Values: [0x0] } |
llvm/test/tools/llvm-readobj/ELF/gnuhash.test | ||
---|---|---|
259 | Linker -> either "Linkers" or "A linker" | |
264–267 | I'm not sure I follow why we shouldn't report this in this specific case? Surely the linker should just set the index to 0, not 1? | |
llvm/tools/llvm-readobj/ELFDumper.cpp | ||
2506 | I was going to suggest that this should be reportUniqueWarning, but I guess that it can only ever be hit once per file, right? | |
2507 | I'd move "found" to after "section". I might also change section -> table here, since the data does not technically need a section header, but follow whatever we say elsewhere. | |
2512 | unsigned -> size_t? | |
2521–2524 | Similar comment to the test. Why is it okay for linkers to do this? At least explain this. | |
2530–2531 | I'm not sure I agree with this statement. The first symbol is the symbol with index 0. Saying we have 0 dynamic symbols is incorrect when there is just that symbol. After all the size of dynamic_symbols() will be 1. |
llvm/tools/llvm-readobj/ELFDumper.cpp | ||
---|---|---|
2525 | OK, !Buckets.empty() && Buckets[0] == 0 is sufficient. The number of buckets does not matter. The size of the bloom filter does not matter (it can preclude some symbols. Since there is no defined dynamic symbol, it shouldn't matter if the bloom filter precludes anything). |
llvm/tools/llvm-readobj/ELFDumper.cpp | ||
---|---|---|
2525 | As far I understand, the first thing that any dynamic loader tries to do during a symbol lookup is that it checks agains the Bloom filter, Then it means that what is really important for an empty gnu hash table is that the Bloom filter is zeroed. The code to check against the filter can be: /* Test against the Bloom filter */ c = sizeof (BloomWord) * 8; n = (h1 / c) & os->os_maskwords_bm; bitmask = (1 << (h1 % c)) | (1 << (h2 % c)); if ((os->os_bloom[n] & bitmask) != bitmask) return (NULL); So it uses maskwords and bloom filter words. Hence, I'd simplify this to: bool IsEmptyHashTable = `GnuHashTable->maskwords == 1 && BloomFilter.size() == 1 && BloomFilter.front() == 0`; Are you OK with this? |
llvm/tools/llvm-readobj/ELFDumper.cpp | ||
---|---|---|
2525 | The bloom filter saying "true" is a necessary condition. It the first thing to check, but it is not required to return "false" when there is a question "can symbol foo exist". It can say "true", and let the bucket/chain filter to say "false". All buckets being zeros (so buckets[hash % num_buckets] = 0) is a sufficient condition to say "false". This is another way to make .gnu.hash reject all input. (Now I think the warning is probably not useful.) |
llvm/tools/llvm-readobj/ELFDumper.cpp | ||
---|---|---|
2525 |
LLD creates a single dummy bucket just because of an issue in the android loader I think: // Note that we don't want to create a zero-sized hash table because // Android loader as of 2018 doesn't like a .gnu.hash containing such // table. If that's the case, we create a hash table with one unused // dummy slot. nBuckets = std::max<size_t>((v.end() - mid) / 4, 1); I see no other reason to create a bucket. I think checking the Bloom filter is a more stable way?
I've introduced it because of which has a dangerous implementation: ArrayRef<Elf_Word> values(unsigned DynamicSymCount) const { return ArrayRef<Elf_Word>(buckets().end(), DynamicSymCount - symndx); } Unfortunately we can't know the GNU hash table size, we might have no section headers and there is no dynamic tag that holds such a value. So it might overflow even with this patch. For example when we have N dynamic symbols and that N is larger than the number of values. I do not think there is a way to detect this. So, this warning itself not needed, but the whole branch that exits early might help a bit. |
llvm/tools/llvm-readobj/ELFDumper.cpp | ||
---|---|---|
2525 |
Maybe we can use all_of(Buckets, [](... V) { return !V; }); |
- Addressed review comments.
llvm/test/tools/llvm-readobj/ELF/gnuhash.test | ||
---|---|---|
264–267 | In real world they do not do that. as test.s -o test.o ld.bfd test.o -o test.so -shared --hash-style=gnu Then you'll see that GNU ld sets index to 1: GnuHashTable { Num Buckets: 1 First Hashed Symbol Index: 1 Num Mask Words: 1 Shift Count: 0 Bloom Filter: [0x0] Buckets: [0] Values: [] } While we have buckets empty or a Bloom filter empy it is enough for And I am not sure we even can think about this as about a bug: | |
llvm/tools/llvm-readobj/ELFDumper.cpp | ||
2506 | Yes. We take the GnuHashTable from the DT_GNU_HASH entry: case ELF::DT_GNU_HASH: GnuHashTable = reinterpret_cast<const Elf_GnuHash *>( toMappedAddr(Dyn.getTag(), Dyn.getPtr())); and even if there are more than one of them (a possible broken case), we still will print only the last one. Perhaps we want to change all of reportWarning to reportUniqueWarning. The benefit I see is that people | |
2507 | We use "table" usually it seems, so I've switched to it. | |
2530–2531 | OK |
llvm/test/tools/llvm-readobj/ELF/gnuhash.test | ||
---|---|---|
266–267 | what -> which What is "it" that we should normally report? I think that needs clarifying a little. I'd split the whole part from "and we..." into a new sentence, which more explicitly explains what we normally report, and say something like "For empty tables however, this value is unimportant and can be ignored." | |
llvm/tools/llvm-readobj/ELFDumper.cpp | ||
2506 | All sounds reasonable to me. I'd actually make reportWarning report unique warnings if possible, with a possible extra argument (defaulted to true) that says whether to make them unique or not. | |
2525 | when they finds that the -> when the | |
2526 | skips -> skip |
we emit a warning when the dynamic symbol table is empty. A valid dynamic symbol table should have at least one symbol: the symbol with index 0.