This is an archive of the discontinued LLVM Phabricator instance.

[lld][WebAssembly] Prefer objdump -d over obj2yaml for tests. NFC
ClosedPublic

Authored by sbc100 on Jul 27 2021, 11:08 AM.

Details

Summary

Now that we have https://reviews.llvm.org/D105539 we can
use objdump -d to actually check for instruction sequences
rather than binary blobs.

This is just an example of how to do that; we should follow up
with a wider-ranging conversion of existing tests.

Event Timeline

sbc100 created this revision. Jul 27 2021, 11:08 AM
sbc100 requested review of this revision. Jul 27 2021, 11:08 AM
Herald added a project: Restricted Project. Jul 27 2021, 11:08 AM

I guess depending on what the test is looking for, objdump might also be able to replace obj2yaml for non-code-section stuff too. At least data section maybe?

I think other sections are less readable with objdump. e.g.:

$ llvm-objdump -s ...
....
Contents of section DATA:
 0000 01004180 080b1003 00000004 0000002a  ..A............*
 0010 0000002b 000000    

The problem here is that the section header is displayed along with the actual data, so I think it's less clear.

Oh, interesting. So obj2yaml doesn't show the section header. Do you know if ELF objdump shows the header too? If not, maybe we should make wasm objdump match.

dschuff accepted this revision. Jul 27 2021, 3:29 PM

but anyway this change is fine.
I guess one advantage of using objdump over obj2yaml when they are equivalent might be if we can get the whole test in one tool invocation instead of 2.

This revision is now accepted and ready to land. Jul 27 2021, 3:29 PM

I think the difference is that ELF sections don't have headers... I think all the headers are stored centrally for ELF.

Most wasm sections contain all kinds of structured data, so it's not clear exactly which bytes should be considered payload. Think of a data section with individual data segments within it: in that case you can have data and metadata interspersed.

I'd say from the llvm-objdump perspective (which only thinks of sections and not anything more fine-grained other than instructions), anything other than the section header would be payload, including subsections etc. Unless we want llvm-objdump to be even smarter, a la wasm-objdump. I'm not sure if there's precedent for that in other formats or not.

Even if we do that, it doesn't really help our use case here, since the DATA section still has some seemingly random bytes that describe the segments; these follow the section header and precede any actual user data.
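
To make the data/metadata interleaving concrete, here is a rough Python sketch that decodes the 23 bytes from the llvm-objdump -s DATA dump quoted earlier in this thread. It is only an illustration, not anything llvm-objdump does. It assumes those bytes are the section body (a segment count followed by a single active segment in memory 0 with an i32.const offset), and the helper names (read_uleb128, read_sleb128, split_data_section) are made up for this example. Reading the user data as four little-endian 32-bit words is also a guess.

```python
import struct

# Bytes copied from the "Contents of section DATA" dump quoted in this thread.
DUMP = bytes.fromhex("01004180080b1003000000040000002a0000002b000000")


def read_uleb128(buf, pos):
    """Decode an unsigned LEB128 value; return (value, next_position)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return result, pos


def read_sleb128(buf, pos):
    """Decode a signed LEB128 value; return (value, next_position)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            if byte & 0x40:  # sign bit of the final byte
                result -= 1 << shift
            return result, pos


def split_data_section(buf):
    """Split a DATA section body into (offset, user_bytes) pairs.

    Hypothetical helper: handles only active segments in memory 0 whose
    offset expression is a single i32.const, which is all this dump needs.
    """
    count, pos = read_uleb128(buf, 0)
    segments = []
    for _ in range(count):
        flags, pos = read_uleb128(buf, pos)
        if flags != 0:
            raise ValueError("only active segments in memory 0 are handled")
        if buf[pos] != 0x41:  # i32.const opcode
            raise ValueError("expected an i32.const offset expression")
        offset, pos = read_sleb128(buf, pos + 1)
        if buf[pos] != 0x0B:  # end opcode
            raise ValueError("expected end of offset expression")
        size, pos = read_uleb128(buf, pos + 1)
        segments.append((offset, buf[pos:pos + size]))
        pos += size
    return segments, pos


segments, end = split_data_section(DUMP)
offset, payload = segments[0]
print("metadata bytes:", end - len(payload))     # 7
print("segment offset:", offset)                 # 1024
print("user data:", struct.unpack("<4I", payload))  # (3, 4, 42, 43)
```

Under these assumptions, 7 of the 23 displayed bytes are segment bookkeeping and only the last 16 are user data, which is why the raw dump is harder to match in a test than the obj2yaml output.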

wingo added a comment. Jul 28 2021, 1:25 AM

Neat :) LGTM as well of course.

An idle thought -- I think it may be in the spirit of llvm-objdump to allow llvm-objdump --disassemble-all to produce some kind of structured representation for non-code sections, perhaps in the grammar accepted for .S files. As it is, it tries to parse the bodies of non-code sections as bytecode, which obviously produces garbage and may run into similar asserts as in https://bugs.llvm.org/show_bug.cgi?id=50957. That would allow you to use one tool for more things.

wingo accepted this revision. Jul 28 2021, 1:25 AM