This is an archive of the discontinued LLVM Phabricator instance.

[CSSPGO][llvm-profgen] Continue disassembling after illegal instruction is seen.
ClosedPublic

Authored by hoy on Mar 2 2021, 8:57 AM.

Details

Summary

Previously we errored out when disassembling illegal instructions and there would be no profile generated. In fact illegal instructions are not uncommon and we'd better skip them and print "unknown" instead of erroring out. This matches the behavior of llvm-objdump (see disassembleObject in llvm-objdump.cpp).

Diff Detail

Event Timeline

hoy created this revision.Mar 2 2021, 8:57 AM
hoy requested review of this revision.Mar 2 2021, 8:57 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 2 2021, 8:57 AM
wenlei added a comment.Mar 2 2021, 9:36 AM

I'm wondering if we need a warning when running into such cases. It's possible with data in code for assembly code, but for normal c++ compilation, this could be a red flag that something might be off?

The consequence of such cases for llvm-objdump is less important and also visible in the output, so we don't need warning there. But llvm-profgen could be different - clues like warning may help us detect subtle profile issues.

llvm/tools/llvm-profgen/ProfiledBinary.cpp
241

Does llvm-objdump also prints one <unknown> for each byte of illegal instruction, or one <unknown> for entire range of illegal instructions?

hoy added a comment.Mar 2 2021, 9:58 AM

I'm wondering if we need a warning when running into such cases. It's possible with data in code for assembly code, but for normal c++ compilation, this could be a red flag that something might be off?

The consequence of such cases for llvm-objdump is less important and also visible in the output, so we don't need warning there. But llvm-profgen could be different - clues like warning may help us detect subtle profile issues.

Sure, a warning makes sense.

llvm/tools/llvm-profgen/ProfiledBinary.cpp
241

llvm-objdump prints <unknown> for each bad instruction: https://github.com/llvm/llvm-project/blob/eecbb1c77655d38c06e47cf32e2dcc72da45c517/llvm/tools/llvm-objdump/llvm-objdump.cpp#L1158

objdump prints <bad> instead.

wenlei added inline comments.Mar 2 2021, 10:02 AM
llvm/tools/llvm-profgen/ProfiledBinary.cpp
241

what's the length of a bad instruction? one byte? not a big deal, but thought nice to be consistent.

hoy added inline comments.Mar 2 2021, 10:08 AM
llvm/tools/llvm-profgen/ProfiledBinary.cpp
241

The logic in llvm-objdump is, if the dissembler doesn't tell the length of the bad instruction, moves forward byte-by-byte until a good instruction is seen. This is typically with X86 or variant sized ISA. For Arm, I think it'll skip 2 bytes or 4 bytes which is the length of fixed sized instructions.

hoy updated this revision to Diff 327506.Mar 2 2021, 10:09 AM

Warning when disassembling bad instructions.

wlei accepted this revision.Mar 2 2021, 12:20 PM

Thanks for working on this! LGTM

llvm/tools/llvm-profgen/ProfiledBinary.cpp
228

Nit: how about also use Disassembled as the condition and merge this to the Line274, like:

if (Disassembled)
  Offset += Size;
else
  Offset += 1;
This revision is now accepted and ready to land.Mar 2 2021, 12:20 PM
hoy added inline comments.Mar 2 2021, 1:06 PM
llvm/tools/llvm-profgen/ProfiledBinary.cpp
228

Size can still be a valid value even the disassembling fails on some targets. For example, on arm64, instruction size is always four byte. When the current disassembling fails, it might skip 4 bytes to the next instruction.

wlei added inline comments.Mar 2 2021, 1:10 PM
llvm/tools/llvm-profgen/ProfiledBinary.cpp
228

I see, thanks for the clarification.

wenlei added inline comments.Mar 2 2021, 6:56 PM
llvm/tools/llvm-profgen/ProfiledBinary.cpp
270

It would be good to have one warning for a consecutive range of invalid instructions, instead of one warning for each byte. Otherwise it can be noisy.

nit: Bad disassembling -> Invalid instructions at <start> - <end>

hoy added inline comments.Mar 2 2021, 10:19 PM
llvm/tools/llvm-profgen/ProfiledBinary.cpp
270

Good point, fixed.

hoy updated this revision to Diff 327673.Mar 2 2021, 10:19 PM

Addressing Wenlei's feedbacks.

wenlei accepted this revision.Mar 2 2021, 10:44 PM

lgtm.

llvm/tools/llvm-profgen/ProfiledBinary.cpp
223

nit: InvalidInstLength

hoy updated this revision to Diff 327678.Mar 2 2021, 10:53 PM

Renaming InvalidInstRange to InvalidInstLength.