This is an archive of the discontinued LLVM Phabricator instance.

[MC][COFF][ELF] Reject instructions in IMAGE_SCN_CNT_UNINITIALIZED_DATA/SHT_NOBITS sections
ClosedPublic

Authored by MaskRay on Apr 14 2020, 11:49 AM.

Details

Summary

For .bss; nop, MC inappropriately calls abort() (via report_fatal_error()) with a message
cannot have fixups in virtual section!
It is a bug to crash for invalid user input. Fix it by erroring out early in EmitInstToData().

Similarly, emitIntValue() in a virtual section (SHT_NOBITS in ELF) can crash with the mssage
non-zero initializer found in section '.bss' (see D4199)
It'd be nice to report the location but so many directives can call emitIntValue()
and it is difficult to track every location.
Note, COFF does not crash because MCAssembler::writeSectionData() is not
called for an IMAGE_SCN_CNT_UNINITIALIZED_DATA section.

Note, GNU as' arm64 backend reports `Error: attempt to store non-zero value in section .bss'``
for a non-zero .inst but fails to do so for other instructions.
We simply reject all instructions, even if the encoding is all zeros.

The Mach-O counterpart is D48517 (see test/MC/MachO/zerofill-text.s)

Diff Detail

Event Timeline

MaskRay created this revision.Apr 14 2020, 11:49 AM
skan added inline comments.Apr 14 2020, 7:09 PM
llvm/lib/MC/MCELFStreamer.cpp
514–519 ↗(On Diff #257425)

Do you think MCObjectStreamer::emitInstruction is a better place to check Sec.isVirtualSection()?

skan added inline comments.Apr 14 2020, 8:39 PM
llvm/lib/MC/MCELFStreamer.cpp
540–541 ↗(On Diff #257425)

This change is not related.

MaskRay marked an inline comment as done.Apr 14 2020, 8:44 PM
MaskRay added inline comments.
llvm/lib/MC/MCELFStreamer.cpp
514–519 ↗(On Diff #257425)

For an error message specific to the binary format (e.g. ELF), EmitInstToData is probably the best choice.

For the COFF error message, the error message may mention IMAGE_SCN_CNT_UNINITIALIZED_DATA.

MaskRay marked 2 inline comments as done.Apr 14 2020, 8:45 PM
MaskRay added inline comments.
llvm/lib/MC/MCELFStreamer.cpp
540–541 ↗(On Diff #257425)

I moved the variable to the top.

skan added inline comments.Apr 14 2020, 10:12 PM
llvm/lib/MC/MCELFStreamer.cpp
514–519 ↗(On Diff #257425)

But EmitInstToData can not check if a relaxable instruction is emitted into a virtual section.

MaskRay updated this revision to Diff 257601.Apr 14 2020, 11:03 PM
MaskRay marked an inline comment as done.
MaskRay edited the summary of this revision. (Show Details)

Move the error to MCObjectStreamer::emitInstruction to make MCRelaxableFragment work.

skan added inline comments.Apr 14 2020, 11:54 PM
llvm/include/llvm/MC/MCSection.h
192–193

I think "llvm::StringRef" is better than const char * here.

https://llvm.org/doxygen/classllvm_1_1StringRef.html#details

llvm/lib/MC/MCObjectStreamer.cpp
372–373

const MCSection &?

MaskRay updated this revision to Diff 257716.Apr 15 2020, 7:49 AM

Address comments

rnk added a comment.Apr 15 2020, 8:25 AM

Sounds good. Do you mind implementing this for COFF? I don't think "virtual section" is standard terminology. It seems like something someone invented here in MC. The COFF impl of isVirtualSection is straightforward:

bool MCSectionCOFF::isVirtualSection() const {
  return getCharacteristics() & COFF::IMAGE_SCN_CNT_UNINITIALIZED_DATA;
}

This suggests we can return "IMAGE_SCN_CNT_UNINITIALIZED_DATA" instead of "virtual".

The .section flag for this seems to be b for the test.

llvm/include/llvm/MC/MCSection.h
192

This appears to be the first non-pure virtual method, i.e. the key method. Please define it out of line to avoid duplicate debug info & vtables.

MaskRay updated this revision to Diff 257756.Apr 15 2020, 9:59 AM
MaskRay retitled this revision from [MC][ELF] Reject instructions in SHT_NOBITS sections to [MC][COFF][ELF] Reject instructions in IMAGE_SCN_CNT_UNINITIALIZED_DATA/SHT_NOBITS sections.
MaskRay edited the summary of this revision. (Show Details)

Check COFF errors and add test/MC/COFF/bss-text.s

MaskRay marked 4 inline comments as done.Apr 15 2020, 9:59 AM
MaskRay added inline comments.
llvm/include/llvm/MC/MCSection.h
192

Ah, thanks for noting this.

MaskRay marked an inline comment as done.Apr 15 2020, 11:38 AM
rnk accepted this revision.Apr 15 2020, 11:42 AM

lgtm with suggestion to improve the diagnostic at the old location

llvm/lib/MC/MCAssembler.cpp
686–688

I wonder if there are other creative ways to trigger this assertion, something like:

.section ... # stuff to make it bss-y
.uleb .LtmpN - .LtmpM

Might be worth transitioning it to reportError.

690–691

This can also be unified to use the new method, right?

This revision is now accepted and ready to land.Apr 15 2020, 11:42 AM
MaskRay updated this revision to Diff 257804.Apr 15 2020, 12:32 PM
MaskRay marked 2 inline comments as done.

Address comments

llvm/lib/MC/MCAssembler.cpp
690–691

This would be really difficult... so difficult that I have to mention it in the description

It'd be nice to report the location but so many directives can call emitIntValue()
and it is difficult to track every location.
rnk added inline comments.Apr 15 2020, 1:03 PM
llvm/lib/MC/MCAssembler.cpp
690–691

I didn't mean to ask about the source location, I was wondering if we could avoid the special ELF dyn_cast. The message could be:

"non-zero initializer found in " + Sec->getVirtualSectionKind() + " section " + Sec->getSectionName()
skan accepted this revision.Apr 15 2020, 6:44 PM

LGTM

llvm/lib/MC/MCAssembler.cpp
690–691

I am aslo wondering if we can avoid the special ELF dyn_cast?

MaskRay updated this revision to Diff 257937.Apr 15 2020, 7:01 PM

Rebase after MCSection::getName() is available

This revision was automatically updated to reflect the committed changes.
llvm/test/MC/ELF/ARM/bss-non-zero-value.s