This is an archive of the discontinued LLVM Phabricator instance.

BitstreamWriter: Better assertion when a block's abbrev code size is too small
Needs ReviewPublic

Authored by jordan_rose on Oct 11 2017, 5:34 PM.

Details

Summary

Before: ((Val & ~(~0U >> (32-NumBits))) == 0 && "High bits set!")
After: (Abbrev <= (~0U >> (32-CurCodeSize)) && "Block code length is too small")

The same check, but earlier and with a better error message for this specific use case. (I was originally going to put this in EmitAbbrev instead, but realized that it's possible for some overly clever soul to have a block that just never uses high-abbrev records.)

Diff Detail

Repository
rL LLVM

Event Timeline

jordan_rose created this revision.Oct 11 2017, 5:34 PM

Reviving this old review. I'd like *someone* to tell me I didn't miss something.

See inline notes.
Also, the edge case with CurCodeSize == 32, if that is possible, should be added.

include/llvm/Bitcode/BitstreamWriter.h
303

What is the expected range of CurCodeSize values?
This has UB if CurCodeSize == 32.
It would be much safer to just use the variant with subtraction:

assert(Abbrev <= (~(~0U >> (32-CurCodeSize))) && "Block code length is too small");
unittests/Bitcode/BitstreamWriterTest.cpp
65

I'd expect to see three tests.
One last valid one, the invalid one, and the next after the first invalid one.
So 3U, 4U, 5U.

jordan_rose added inline comments.Jun 13 2018, 2:06 PM
include/llvm/Bitcode/BitstreamWriter.h
303

Good point. I don't think anyone's trying to use 32-bit record codes, but the spec doesn't seem to impose a limit. I'll switch it.

unittests/Bitcode/BitstreamWriterTest.cpp
65

This is just an arbitrary data value. What makes this too big is the CodeLen of 2 above. But I can certainly test overflowing a CodeLen of 3 as well.

jordan_rose added inline comments.Jun 13 2018, 2:07 PM
unittests/Bitcode/BitstreamWriterTest.cpp
65

Oh, I see what you mean. Values 0-3 have a special meaning in bitstream mode, so I can't actually do that test, but I'll add one for overflowing 3.

lebedev.ri added inline comments.Jun 13 2018, 2:14 PM
unittests/Bitcode/BitstreamWriterTest.cpp
65

Values 0-3 have a special meaning in bitstream mode

Right, so do that for CodeLen = 3, it will still be 3 values :)

jordan_rose edited the summary of this revision. (Show Details)
jordan_rose added a reviewer: lebedev.ri.

Took recommendation about the form of the check, pumped up the tests.

lebedev.ri added inline comments.Jun 13 2018, 11:29 PM
unittests/Bitcode/BitstreamWriterTest.cpp
104–211

This looks rather monstrous, with a lot of duplication.
I was suggesting to just do:

TEST(BitstreamWriterTest, trapOnSmallAbbrevUsingBlockInfo_CodeLength3_Ok) {
  SmallString<64> Buffer;
  BitstreamWriter W(Buffer);
  W.EnterBlockInfoBlock();
  auto Abbrev = std::make_shared<BitCodeAbbrev>();
  Abbrev->Add(BitCodeAbbrevOp(2U);
  (void)W.EmitBlockInfoAbbrev(bitc::FIRST_APPLICATION_BLOCKID,
                              std::move(Abbrev));
  W.ExitBlock();
  W.EnterSubblock(bitc::FIRST_APPLICATION_BLOCKID, /*CodeLen*/3);
  W.EmitRecordWithAbbrev(Abbrev, makeArrayRef(2U));
  W.ExitBlock();
}

TEST(BitstreamWriterTest, trapOnSmallAbbrevUsingBlockInfo_CodeLength3_LastOk) {
  SmallString<64> Buffer;
  BitstreamWriter W(Buffer);
  W.EnterBlockInfoBlock();
  auto Abbrev = std::make_shared<BitCodeAbbrev>();
  Abbrev->Add(BitCodeAbbrevOp(3U);
  (void)W.EmitBlockInfoAbbrev(bitc::FIRST_APPLICATION_BLOCKID,
                              std::move(Abbrev));
  W.ExitBlock();
  W.EnterSubblock(bitc::FIRST_APPLICATION_BLOCKID, /*CodeLen*/3);
  W.EmitRecordWithAbbrev(Abbrev, makeArrayRef(3U));
  W.ExitBlock();
}

TEST(BitstreamWriterTest, trapOnSmallAbbrevUsingBlockInfo_CodeLength3_NotOk) {
  SmallString<64> Buffer;
  BitstreamWriter W(Buffer);
  W.EnterBlockInfoBlock();
  auto Abbrev = std::make_shared<BitCodeAbbrev>();
  Abbrev->Add(BitCodeAbbrevOp(4U);
  (void)W.EmitBlockInfoAbbrev(bitc::FIRST_APPLICATION_BLOCKID,
                              std::move(Abbrev));
  W.ExitBlock();
  W.EnterSubblock(bitc::FIRST_APPLICATION_BLOCKID, /*CodeLen*/3);
  W.EmitRecordWithAbbrev(Abbrev, makeArrayRef(4U));
  W.ExitBlock();

TEST(BitstreamWriterTest, trapOnSmallAbbrevUsingBlockInfo_CodeLength3_NoOverFlow) {
  SmallString<64> Buffer;
  BitstreamWriter W(Buffer);
  W.EnterBlockInfoBlock();
  auto Abbrev = std::make_shared<BitCodeAbbrev>();
  Abbrev->Add(BitCodeAbbrevOp(4294967295U); // i32 -1 
  (void)W.EmitBlockInfoAbbrev(bitc::FIRST_APPLICATION_BLOCKID,
                              std::move(Abbrev));
  W.ExitBlock();
  W.EnterSubblock(bitc::FIRST_APPLICATION_BLOCKID, /*CodeLen*/32);
  W.EmitRecordWithAbbrev(Abbrev, makeArrayRef(4294967295U));
  W.ExitBlock();
}
jordan_rose added inline comments.Jun 14 2018, 9:30 AM
unittests/Bitcode/BitstreamWriterTest.cpp
104–211

The values in the abbreviation have nothing to do with the abbreviation codes themselves; in order to get to code 8 you have to actually register 5 abbreviations. (0-3 are reserved.) I could probably pull out some of this into helper functions but I didn't feel like it was going to be super important.

lebedev.ri accepted this revision.Jun 16 2018, 1:42 AM

I would still like to see a test with CodeLen=32, and even a single call to EmitRecordWithAbbrevImpl()
(ubsan would then have caught the ub in the initial approach), but otherwise this seems reasonable to me.
I have no expertise in this code, so *please* wait for someone else to review this, too.

include/llvm/Bitcode/BitstreamWriter.h
303

I think you want to move this assert to before the first use of Abbrev (before line 301)

unittests/Bitcode/BitstreamWriterTest.cpp
104–211

Oh duh, i knew that, but kinda forgot :)

This revision is now accepted and ready to land.Jun 16 2018, 1:42 AM
jordan_rose requested review of this revision.Jun 18 2018, 8:47 AM

Thanks, Roman. Too bad this is such a stable piece of LLVM, or else I'd have more people qualified to review!

include/llvm/Bitcode/BitstreamWriter.h
303

It shouldn't matter in this case. The first use (and the above assertion) is for indexing into a write-time data structure; the second is for the actual representation in the bitstream.

lebedev.ri added inline comments.Jun 18 2018, 9:22 AM
include/llvm/Bitcode/BitstreamWriter.h
303

Point being, if you don't move it, you won't be able to add a test for 32-bit CurCodeSize overflow.

jordan_rose added inline comments.Jun 18 2018, 9:54 AM
include/llvm/Bitcode/BitstreamWriter.h
303

I have to admit I'm not really interested in that. It's worth writing code that won't silently invoke undefined behavior, but in practice no one is actually going to have billions of abbreviations, and it's not interesting to me which assertion fires in that case.

lebedev.ri resigned from this revision.Jun 21 2019, 10:45 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 21 2019, 10:45 AM