This is an archive of the discontinued LLVM Phabricator instance.

[libc] Capture floating point encoding and arrange it sequentially in memory
ClosedPublic

Authored by hedingarcia on Jul 7 2021, 10:12 AM.

Details

Summary

Redefined FPBits.h and LongDoubleBitsX86 so its implementation works for the Windows
and Linux platform while maintaining a packed memory alignment of the precision floating
point numbers. For its size in memory to be the same as the data type of the float point number.
This change was necessary because the previous attribute((packed)) specification in the struct was not working
for Windows like it was for Linux and consequently static_asserts in the FPBits.h file were failing.

Diff Detail

Event Timeline

hedingarcia created this revision.Jul 7 2021, 10:12 AM
hedingarcia requested review of this revision.Jul 7 2021, 10:12 AM
hedingarcia edited the summary of this revision. (Show Details)Jul 7 2021, 10:19 AM
hedingarcia added a subscriber: sivachandra.

Fixed error signaled when running the tests for LdExpTest.h by ensuring that the value passed to set mantissa
did not lead to overload to the exponent value in FPBits.h and LongDoubleBitsX86.h.
Added asserts to detect those unintentional overflows in setMantissa() and setExponent().

the title should say something about creating a custom packed int, the getter and setter part is a small detail that shouldn't be in the title

libc/utils/FPUtil/FPBits.h
22–23

since we're basically just forwarding MantissaWidth to FloatProperties::mantissaWidth, we can delete all the specializations below and do

template <typename T> struct MantissaWidth {
    static constexpr unsigned value = FloatProperties<T>::mantissaWidth;
};

Or even better, delete these and and only use FloatProperties::mantissaWidth. But maybe that can be done in a follow up patch.

53

I think we don't need the __llvm_libc::fputil:: since we're arleady inside that namespace

56

I don't really like valueFP, how about something like bits

62

these comments should probably be deleted, they're not super useful

87

this is very confusing, can we just use mantissaWidth?

93–94

we should replace all uses of integer with encoding.valueFP

99

bool?

hedingarcia retitled this revision from [libc] (WIP) Modify the struct that captures floating point encoding to have setters and getters to [libc] Creating a struct that captures floating point encoding and manually arranges it sequentially in memory .Jul 8 2021, 9:37 AM
hedingarcia edited the summary of this revision. (Show Details)
hedingarcia added reviewers: sivachandra, lntue.
hedingarcia removed a subscriber: sivachandra.
hedingarcia retitled this revision from [libc] Creating a struct that captures floating point encoding and manually arranges it sequentially in memory to [libc] Creating a struct that captures floating point encoding and manually arranges it sequentially in memory.

[libc] Fix of the patch, refactored FPBits.h to have template structs

Made one template struct for MantissaWidth, one for ExponentWidth, and another for FPUIntType.
Removed namespaced calls, comments and changed variable name valueFP to bits in FPBits.h and LongDoubleBitsX86.h.
Changed the return data type of getSign() from bool to uint8_t.

the commit message is a bit verbose:

  • no need to say that the patch was fixed
  • no need to say that you updated all getters/setters of exponent/mantissa/sign, that is a direct and obvious consequence of FPBits work
  • no need to say that we're using FloatProperties.h, that's obvious if you look at the patch
  • the ultimate reason for this patch isn't that static_asserts are failing, it's that attribute((packed)) isn't portable and doesn't work on Windows

in general the commit message should be more of a "why" rather than a "what are the steps I took to create this patch"; try to figure out exactly what a reader of this patch needs to know and don't over-explain details since that clutters the message
if something tricky is being done then that should be explained of course, but most of this patch is self-explanatory if you know the reasoning behind the change

libc/utils/FPUtil/FPBits.h
64

@sivachandra is this the right way to assert?

91

not sure if this should be a bool or a uint8_t...
let's see what the others say

hedingarcia marked 6 inline comments as done.Jul 8 2021, 2:28 PM
hedingarcia added inline comments.
libc/utils/FPUtil/FPBits.h
87

The reason why we cannot use mantissaWidth is because in the 80-bit long double implementation the value for that field is 63 instead of 64. An explicit bit is considered in the 63 bit as the integer part of the significand (https://en.wikipedia.org/wiki/Extended_precision ). This exception leads to that long expression in getExponent(), because with only the value of mantissaWidth the shift will still require to move one bit in order to return all the bits of the exponent.

aeubanks added inline comments.Jul 8 2021, 2:32 PM
libc/utils/FPUtil/FPBits.h
87

I thought that only applied to the version in LongDoubleBitsX86

hedingarcia added inline comments.Jul 8 2021, 2:44 PM
libc/utils/FPUtil/FPBits.h
87

Yes, however should the same implementation in LongDoubleBitsX86.h be kept the same even in FPBits.h?

hedingarcia edited the summary of this revision. (Show Details)Jul 8 2021, 2:55 PM
hedingarcia edited the summary of this revision. (Show Details)Jul 8 2021, 3:01 PM
hedingarcia edited the summary of this revision. (Show Details)Jul 9 2021, 6:19 AM

[libc] Fixing shift in getExponent() and setExponent()

Refactoring the previous shift to retrieve and set the exponent
with the mantissaWidth value in FPBits.h

hedingarcia marked 2 inline comments as done.Jul 9 2021, 6:36 AM
hedingarcia added inline comments.
libc/utils/FPUtil/FPBits.h
87

I see what you mean now, yes since LongDoubleBitsX86.h takes care of the long double implementation, this file only should take in consideration that explicit bit. FPBits.h does not have to work around this case since it is used for single and double. Thank you for catching that detail.

hedingarcia retitled this revision from [libc] Creating a struct that captures floating point encoding and manually arranges it sequentially in memory to [libc] Capture floating point encoding and arrange it sequentially in memory with structs.Jul 9 2021, 8:23 AM

mostly looks good

libc/utils/FPUtil/FPBits.h
50

FloatProp::BitsType seems more consistent (you can move the using FloatProp here)

63

the next assert will never fail with this line (same for setExponent())

[libc] Rearranging alias declarations and removing asserts

hedingarcia marked 2 inline comments as done.Jul 9 2021, 10:57 AM

Sorry for the delay in the review. I have gone through the history and it mostly looks good. I have a few comments about the cosmetics.

libc/utils/FPUtil/FPBits.h
30–31

Can this struct be removed now?

51

For consistency, I would name UIntType as BitsType here as well. But, you can choose to add a TODO here and do that "cleanup" in a follow up change.

59

Why are these methods required? Probably because you got rid of the integer field of FPBits? If you add the rest of the methods to FPBits directly (see below), this should not be required as the integer / bits value can be directly set or read.

64

Not sure what line this pertains to. But, after this change, the only assert that would be required is something like:

assert(sizeof(T) == sizeof(UIntVal), ...);
75

You should probably name this method unbiasedExponent to distinguish it from the real exponent value returned by getExponent().

91

To which line was this comment originally for?

93–94

I would actually prefer if we got rid of encoding completely [1]. Adding the new methods to FPBits directly would reduce the verbosity when calling them. The integer value is currently called integer, but as @aeubanks suggests elsewhere, bits is more appropriate may be. However, to reduce the churn, you can choose to keep it as integer or bits depending on whats convenient to you.

[1] - We had a separate encoding field because it was essentially that when we could use bit-fields. Moreover, it helped us distinguish it with the other fields. With that gone, we can choose to remove it and eliminate that "complexity" and verbosity.

aeubanks added inline comments.Jul 12 2021, 9:21 AM
libc/utils/FPUtil/FPBits.h
30–31

I'd say clean this, along with Exponent/MantissaWidth up in a later patch, there are lots of uses of these

64

oh, I think the code was deleted for some reason

I spent a while debugging a failure Hedin was running into. Turns out sometimes when setting the mantissa/exponent, it was actually larger than the max value (IIRC at the very bottom of NormalFloat.h). With the struct bitfield it automatically took care of that, but here we needed to mask out the extra bits. The question is whether to do that in the setter or do that in the caller. If we do that in the setter we preserve the existing behavior. But I'm slightly in favor of forcing callers to handle that and adding an assert in the setter that the bits don't overflow. This is probably a bit faster since we don't have to do the bitmask on every setter, and may catch future issues.

The question about the assert was, is assert(cond); the right way to assert? As in in debug modes it'll run the assert and crash, and in release modes the assert won't be there for perf reasons. Is <assert.h> the right thing to include?

91

this was for getSign()

sivachandra added inline comments.Jul 12 2021, 10:23 AM
libc/utils/FPUtil/FPBits.h
64

Ah, OK! I should have said static_assert in my comment. The general rule we follow is:

  1. Include freestanding C headers if required.
  2. Include the header file corresponding to the implementation if required. So, in fputils, we can include math.h.
  3. Do not include other libc public headers unless they are related.

Some of the above are checked by the libc lint rules implemented as part of clang-tidy. They are only run on the full build builders: https://lab.llvm.org/buildbot/#/workers/120

So, we cannot include assert.h or use assert. Whether the value should be checked in the caller or the setter, there are a few algorithms which assume the setters are doing it (there is a deliberate intention to overflow the mantissa.) So, I would say do the masking in the setter to retain functionality. In practice, it isn't a big runtime penalty as the setting usually happens at the very end of a complex algorithm.

hedingarcia retitled this revision from [libc] Capture floating point encoding and arrange it sequentially in memory with structs to [libc] Capture floating point encoding and arrange it sequentially in memory.Jul 12 2021, 11:39 AM
hedingarcia edited the summary of this revision. (Show Details)
hedingarcia edited the summary of this revision. (Show Details)

[libc] Removed encoding/FPUIntType struct and renamed getExponent()

Eliminated encoding struct and moved its data and member fields into FPBits.
Removed the FPUIntType struct since it was not being referenced.

hedingarcia marked 10 inline comments as done.Jul 12 2021, 11:58 AM
aeubanks accepted this revision.Jul 12 2021, 12:54 PM

lgtm besides the question for Siva of uint8_t vs bool for getSign()

This revision is now accepted and ready to land.Jul 12 2021, 12:54 PM
sivachandra accepted this revision.Jul 12 2021, 1:35 PM

lgtm besides the question for Siva of uint8_t vs bool for getSign()

Logically, it doesn't matter. Are there benefits wrt compiler driven optimizations? If yes, then suggest the one which should be preferred?

Otherwise, LGTM as well. Thanks for patiently resolving the comments.

lgtm besides the question for Siva of uint8_t vs bool for getSign()

Logically, it doesn't matter. Are there benefits wrt compiler driven optimizations? If yes, then suggest the one which should be preferred?

Otherwise, LGTM as well. Thanks for patiently resolving the comments.

Probably bool is better since the compiler can assume it's only 0 or 1.

Probably bool is better since the compiler can assume it's only 0 or 1.

Okay, I will change the return type to bool.

[libc] Changed the return type of getSign()

hedingarcia marked 3 inline comments as done.Jul 12 2021, 2:07 PM
lntue added inline comments.Jul 12 2021, 10:32 PM
libc/utils/FPUtil/FPBits.h
124

Sorry for late comments! Since you already updated to get* and set*, can you also add getBits and setBits? I think they would be a better than the current uintval() (and technically no equivalent of setBits).

libc/utils/FPUtil/NextAfterLongDoubleX86.h
54

I'm a bit curious about generated assembly of this line. @sivachandra : can we check if there is any regression with this one? I think no regression for O2 and O3 would be good enough.

hedingarcia added inline comments.Jul 13 2021, 9:42 AM
libc/utils/FPUtil/FPBits.h
124

These get/setBits members were used previously with a different name but now they are not required anymore since bits value can be directly set or read from FPBits.

libc/utils/FPUtil/NextAfterLongDoubleX86.h
54

The object files generated before and after this patch are the same, at least after running the check there was no difference between the files at O2 optimization level.

This revision was landed with ongoing or failed builds.Jul 13 2021, 1:44 PM
This revision was automatically updated to reflect the committed changes.
amccarth added inline comments.
libc/utils/FPUtil/FPBits.h
97

I'm rather surprised that this change was necessary. When working with a specific bit layout, using a unsigned integer type and bitwise operations can be a good idea--I don't think the language standards give many guarantees about how the bits are allocated (e.g., MSB first or LSB first?). But I'm not sure why the compiler (we are talking about clang here, right?) didn't pack the structures the same way on all platforms.

Perhaps when targeting Windows, clang is trying too hard to be compatible with MSVC. With MSVC, when defining bit fields, they will pack by default--but only if the underlying types are all the same. Since the old types were UIntType,, uint16_t, and uint8_t--all of which have different alignment requirements--MSVC doesn't pack those. But if they had all been defined as uint32_t, MSVC would certainly have done the right thing.

And thus, maybe clang would have as well.