Previously, when a metadata string contained non-ASCII (e.g. Unicode)
characters, it would be incorrectly placed in the Record array. Plain
char is signed by default on most targets, so characters with the high
bit set would get sign-extended when widened, while the bitcode writer
was attempting to write only the lowest 8 bits of the now sign-extended
value. This caused an assertion failure later on. The fix is just to
cast the pointer to uint8_t* first to prevent sign extension.
This came up in the context of metadata strings, but I did a
quick pass and changed the other instances of this pattern in
the file as well.
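To illustrate the pattern outside of the bitcode writer, here is a minimal standalone sketch. The function names widenNaive and widenUnsigned are made up for this example and are not taken from the patch, and std::vector stands in for the actual Record container.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Problematic pattern: each element is widened from plain char. Where plain
// char is signed, a byte such as 0xC3 is negative and sign-extends, so the
// upper 56 bits of the resulting uint64_t are all ones.
std::vector<uint64_t> widenNaive(const std::string &Str) {
  std::vector<uint64_t> Record;
  for (char C : Str)
    Record.push_back(C);
  return Record;
}

// Fixed pattern: view the bytes through a uint8_t pointer first, so each
// element is zero-extended and always lies in the range [0, 255].
std::vector<uint64_t> widenUnsigned(const std::string &Str) {
  const uint8_t *Begin = reinterpret_cast<const uint8_t *>(Str.data());
  return std::vector<uint64_t>(Begin, Begin + Str.size());
}
```

For pure ASCII input both functions agree. They differ only for bytes with the high bit set, and only on targets where plain char is signed, which is why the problem did not show up until a metadata string contained such bytes.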
Details
- Reviewers
- None
Diff Detail
Event Timeline
Comment Actions
I'm a bit confused why we're widening chars into 64-bit values - is there a quick explanation for that? (It seems inefficient to put one char in each 64-bit entry in Record, rather than putting 8 of them in there.)
If Record is actually bytes, it has the wrong type, doesn't it - it should be SmallString, or SmallVector<uint8_t>, etc...
Comment Actions
As far as I understand, the bitcode format supports arbitrarily sized fields defined at runtime, so everything goes through uint64_t.
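As a rough sketch of why a sign-extended value is a problem for such a scheme: if every record element is a uint64_t that later gets written into a field of some runtime-chosen width, the writer can only succeed when the value actually fits in that width. The helper below is hypothetical (fitsInField is not an LLVM function) and only models the kind of check that would fire.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns true if Value can be stored in a field that
// is Width bits wide, i.e. no bits at or above position Width are set.
bool fitsInField(uint64_t Value, unsigned Width) {
  assert(Width >= 1 && Width <= 64 && "unsupported field width");
  if (Width == 64)
    return true; // Every uint64_t fits; also avoids shifting by 64.
  return (Value >> Width) == 0;
}
```

With this model, fitsInField(0xC3, 8) holds, while the sign-extended form 0xFFFFFFFFFFFFFFC3 does not fit in 8 bits, matching the assertion failure described in the summary.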
Comment Actions
Seems a bit strange but certainly getting out of my depth - thanks for the explanation :)