Page MenuHomePhabricator

Fix usage of APInt.getRawData for big-endian systems

Authored by uweigand on Apr 11 2016, 11:40 AM.



The Scalar implementation and a few other places in LLDB directly
access the internal implementation of APInt values using the
getRawData method. Unfortunately, pretty much all of these places
do not handle big-endian systems correctly. While on little-endian
machines, the pointer returned by getRawData can simply be used as
a pointer to the integer value in its natural format, no matter
what size, this is not true on big-endian systems: getRawData
actually points to an array of type uint64_t, with the first element
of the array always containing the least-significant word of the
integer. This means that if the bitsize of that integer is smaller
than 64, we need to add an offset to the pointer returned by
getRawData in order to access the value in its natural type, and
if the bitsize is *larger* than 64, we actually have to swap the
constituent words before we can access the value in its natural type.

This patch fixes every incorrect use of getRawData in the code base.
For the most part, this is done by simply removing uses of getRawData
in the first place, and using other APInt member functions to operate
on the integer data.

This can be done in many member functions of Scalar itself, as well
as in Symbol/Type.h and in IRInterpreter::Interpret. For the latter,
I've had to add a Scalar::MakeUnsigned routine to parallel the existing
Scalar::MakeSigned, e.g. in order to implement an unsigned divide.

The Scalar::RawUInt, Scalar::RawULong, and Scalar::RawULongLong
were already unused and can be simply removed. I've also removed
the Scalar::GetRawBits64 function and its few users.

The one remaining user of getRawData in Scalar.cpp is GetBytes.
I've implemented all the cases described above to correctly
implement access to the underlying integer data on big-endian
systems. GetData now simply calls GetBytes instead of reimplementing
its contents.

Finally, two places in the clang interface code were also accessing
APInt.getRawData in order to actually construct a byte representation
of an integer. I've changes those to make use of a Scalar instead,
to avoid having to re-implement the logic there.

Diff Detail


Event Timeline

uweigand updated this revision to Diff 53297.Apr 11 2016, 11:40 AM
uweigand retitled this revision from to Fix usage of APInt.getRawData for big-endian systems.
uweigand updated this object.
uweigand added a subscriber: lldb-commits.
clayborg accepted this revision.Apr 11 2016, 1:52 PM
clayborg edited edge metadata.

Looks good.

This revision is now accepted and ready to land.Apr 11 2016, 1:52 PM
labath edited edge metadata.Apr 12 2016, 4:20 AM

It would be great if you could add a some unit tests for your modifications to the Scalar class. You don't have to be too exhaustive, but a couple of tests would go a long way.

uweigand updated this revision to Diff 53622.Apr 13 2016, 2:49 PM
uweigand edited edge metadata.

This adds some unit tests verifying correct operation of the GetBytes routine as well as the conversions (SChar, UChar, ...). Those tests actually exposed more problems in the original Scalar code: the SetValueFromData routine didn't work correctly for 128- and 256-bit data types, and the SChar routine should have an explicit "signed char" return type to work correctly on platforms where char defaults to unsigned.

Regression testing on Intel revealed a bug in my original patch: since Scalar only accepts APInt initializers with a bit size that is a power of 2, we may need to extend some values before converting them to Scalar. This occured for values representing 10-byte Intel long double data.

Tested on System z and Intel.

labath accepted this revision.Apr 14 2016, 1:56 AM
labath edited edge metadata.

I am glad that the unit tests are finding real problems and not just being a nuisance. Thanks a lot.

2655 ↗(On Diff #53622)


mamai added a subscriber: mamai.Apr 14 2016, 6:57 AM
This revision was automatically updated to reflect the committed changes.