This is an archive of the discontinued LLVM Phabricator instance.

[Fixed Point Arithmetic] Fixed Point Constant
ClosedPublic

Authored by leonardchan on Jun 27 2018, 10:34 AM.

Details

Summary

This patch proposes an abstract type that represents fixed point numbers, similar to APInt or APSInt that was discussed in https://reviews.llvm.org/D48456#inline-425585. This type holds a value, scale, and saturation and is meant to perform intermediate calculations on constant fixed point values.

Currently this class is used as a way for handling the conversions between fixed point numbers with different sizes and radixes. For example, if I'm casting from a signed _Accum to a saturated unsigned short _Accum, I will need to check the value of the signed _Accum to see if it fits into the short _Accum which involves getting and comparing against the max/min values of the short _Accum. The FixedPointNumber class currently handles the radix shifting and extension when converting to a signed _Accum.

Diff Detail

Repository
rC Clang

Event Timeline

rjmccall added inline comments.Jun 27 2018, 11:05 AM
include/clang/Basic/FixedPoint.h
12

Referencing the standard here would be good, and maybe even excerpting the key parts.

24

The established naming convention here — as seen in APInt, APFloat, APValue, etc. — would call this APFixedPoint. Maybe that's not a great convention, but we should at least discuss deviating from it.

You might also want a type which encapsulates the details of a fixed-point type, i.e. the semantic width, scale, and saturating-ness. (Like the "float semantics" of APFloat.)

I think a key question here is whether you want a FixedPointNumber to exactly represent the bit-pattern or just the semantic value. I think it would eliminate a non-trivial source of bugs in this type if it just represents the semantic value, i.e. if a 16-bit unsigned fract value on a target where that uses a padded representation did not explicitly represent the padding bit, and then just added it back in in some accessor that asks for the bit-pattern. Regardless, that should be clearly documented.

27

This probably shouldn't be optional, since it's a really important semantic aspect of the type whether it's saturating or not.

84

LLVM convention does not include the trailing underscore; they're just capitalized. No, I don't like it, either.

include/clang/Basic/TargetInfo.h
318

You can cut "false otherwise".

319

The convention is that predicates like this start with a verb, so this should be something like doUnsignedFixedPointTypesHavePadding.

lib/AST/ASTContext.cpp
10387

Please put at least the member functions on FixedPointNumber in their own .cpp file.

ebevhan added inline comments.Jun 28 2018, 3:07 AM
include/clang/Basic/FixedPoint.h
24

a 16-bit unsigned fract value on a target where that uses a padded representation did not explicitly represent the padding bit

So does that mean that the underlying APInt in this type would be 15 bits instead of 16, to avoid representing the padding? It feels a bit scary to throw around values with different internal representation than what other parts of the code (say, the target specification) have specified them to be.

46

I'm not so sure that extension and truncation on their own are particularly meaningful operations on fixed-point values. I think you need to provide both a width and a scale for these kinds of operations so you can both resize and rescale the value simultaneously. Perhaps you can even provide a 'saturation width' that the value should be saturating on when converting.

You have the convert method, but it only takes a QualType, so if you need to convert to something that doesn't exist as a type, it's not enough. Maybe I'm overdesigning.

58

As with the integer conversion in the earlier patch, this does not round toward zero. This might lead to surprising results.

lib/AST/ASTContext.cpp
10311

It is possible to perform saturation without extending and comparing, by looking at the bits above the 'imagined' sign bit in the resulting type. If they are all the same, then the value is in range, otherwise you must saturate.

Explicit comparison probably gets the point across better.

10328

If you do rescaling before and after resizing, you can use extOrTrunc instead.

10350

This width and scale commoning looks odd. If you have two types for which the width is the same but the scale is different, you will discard bits from the value with lower scale and the comparison might be wrong.

You need to find a width and scale that can fit all of the values of both.

rjmccall added inline comments.Jun 28 2018, 9:24 AM
include/clang/Basic/FixedPoint.h
24

Well, you can encapsulate it so that the 15-bit APInt is never exposed to users of the type. That's a relatively small number (probably just 1) of places to check for correctness, vs. having to have logic to handle the padding bit in basically every operation on the type in addition to all the other configuration-explosions that are actually mandatory.

leonardchan marked 8 inline comments as done.
  • Renamed to APFixedPoint
  • Added FixedPointSemantics to represent saturation and whether or not padding is involved. Similar to APFloatSemantics, this indicated how the underlying APSInt passed to this will be used (ie. is the MSB padding or not).
leonardchan added inline comments.Jun 28 2018, 4:46 PM
include/clang/Basic/FixedPoint.h
24

I definitely think the class needs to know at least of either the number of integral bits, or if this padding exists since the initial task I wanted it to do was handle conversion between unsigned and signed types while taking care of this padding.

I think a good solution would be to pass an additional semantics type such that we can still pass a literal APSInt to represent the underlying integer and know stuff like saturation and padding. I don't think it should represent the scale or width though since those are configured by the target and can have different values.

This semantic type I think would also help discern how to treat the underlying APSInt.

46

This makes sense. I had the extend and trunc methods so I could match the width of another fixed type when comparing, but if that's the only use for it, it makes sense to have a function that just accepts the width and scale.

lib/AST/ASTContext.cpp
10328

I think having this line just be Val_ = Val_.extOrTrunc(DstWidth); would allow for truncation to occur first before potentially right shifting.

rjmccall added inline comments.Jun 28 2018, 7:51 PM
include/clang/Basic/FixedPoint.h
31

I figured you'd want this to be a struct which include the scale, width, signed-ness, and saturating-ness; and then APFixedPoint can just store one of these next to a bare APInt.

41

Why the elaborated-type-specifier?

Ka-Ka added a subscriber: Ka-Ka.Jun 29 2018, 4:34 AM

I also think it would be good with some unit tests for this class once the functionality and interface is nailed down.

include/clang/Basic/FixedPoint.h
31

That's what I was expecting too, similar to the APFloat version.

What width would it contain? The width with or without padding? Probably the one with padding as the APInt itself has the width without padding.

53

occurred

lib/AST/ASTContext.cpp
10328

True, but I meant something more along the lines of

if(DstScale < Scale)
  Val = Val >> (Scale - DstScale);
Val = Val.extOrTrunc(DstWidth);
if(DstScale > Scale)
  Val = Val << (DstScale - Scale);

But I think you've changed it a bit so I need to have a look at the revised version.

lib/Basic/FixedPoint.cpp
20

I'm unsure if this operation is useful for anything but converting fixed-point to integer. Make this a member of APFixedPoint, drop the arguments and give it a name that makes it clearer that it is a round-to-zero integer conversion? toIntegerRoundToZero?

53

I think saturation can be modeled a bit better. It should be possible to do the overflow check/saturation once instead of twice, and also have it fit in better with the other conversions.

Saturation on a value is essentially checking whether all bits above and including a certain bit are all the same, and 'clamping' to either the minimum (the mask) or maximum (inverse of the mask) if not.

// Saturate Value to SatWidth. This will clamp Value at the signed min/max of a value that is SatWidth long.
Saturate(SatWidth):
  Mask = highBits(Width, SatWidth + 1)
  Masked = Value & Mask
  Result = Value
  if (Masked == Mask || Masked == 0) {
    if (Masked.isNegative())
      Result = Mask
    else
      Result = ~Mask
  }

This cannot be done on the value in the source scale, since upscaling after clamping would produce an incorrect value - we would shift in 0's from the right even though we should have saturated all bits completely. Also, we cannot upscale without first extending or we might shift out bits on the left that could affect saturation.

One thing to note is that (I'm *pretty sure* this is the case) fractional bits cannot affect saturation. This means that we can find a common scale, then perform the saturation operation, and then resize, and the value should just fall into place. Basically:

  1. Upscale if necessary, but extend first to ensure that we don't drop any bits on the left. Do this by resizing to SrcWidth - SrcScale + std::max(SrcScale, DstScale) and then upscaling normally by DstScale - SrcScale.
  2. Downscale if necessary by SrcScale - DstScale. (I think; in our downstream, we saturate first and then downscale, but we also assume that saturation can only occur on _Fract, and doing saturation first makes the saturation width calculation a bit messier because it will be a max expression. I'm unsure if the order can be changed.)
  3. At this point, the scale of the value should be DstScale. Saturate with Saturate(DstWidth).
  4. Now the value should be in range for the new width, and will be at the right scale as well. Resize with extOrTrunc(DstWidth).
  5. (Also; if the result was negative and the dest type is unsigned, just make the result zero here instead.)

Note that the order of operations is different than what I've mentioned before with non-saturated conversion (downscale if needed, resize, upscale if needed), but it works because we only do the upscale as a resize+upscale. Somewhere in here you also need to change the signedness of the value, but I'm not entirely certain where. Likely after 4.

Everything I've mentioned here is semi-conjectural, since our implementation has different type widths than the defaults in this one, we can only saturate on _Fract, and our unsigned types have padding. It's possible that there are some assumptions that cause this method to fail for unsigned without padding, or perhaps for some other type conversion, but I haven't come up with a counterexample yet.

61

This is a bit tricky. The spec does indeed say that conversion to integer must round toward zero, but does not say that conversion between fixed-point types must. If we have these routines perform conversion like this, we would likely have to emit extra code (either an addition or the two negations) during fixed-to-fixed conversion to match the consteval behavior.

I think fixed-to-fixed conversion should keep using a normal shift, and reserve the round-to-zero shift for conversion to integer.

89

You can use getMax and getMin here to simplify the logic, I think?

As I mention in my other comment I think that the saturation logic can be more strongly integrated into the conversion.

114
if (Scale != OtherScale)
  CommonWidth += std::abs(OtherScale - Scale)

They need to be int, of course.

122

Are the if's needed? If CommonScale == Scale or OtherScale, shl will do nothing.

ebevhan added inline comments.Jun 29 2018, 7:12 AM
lib/Basic/FixedPoint.cpp
53

Now that I think about it a bit more, it's clear that the Saturate routine does not work for unsigned fixed-point without padding. That would need to be taken into consideration, but the rest should work.

leonardchan marked 10 inline comments as done.
leonardchan edited the summary of this revision. (Show Details)
  • Added tests
  • Moved all conversion logic into convert
  • Saturation is checked by checking the bits above the sign bit in the destination type
leonardchan added inline comments.Jul 2 2018, 4:13 PM
include/clang/Basic/FixedPoint.h
31

Woops my bad, I thought you were referring to the APFloatSemantics. I actually didn't know about fltSemantics until now.

lib/Basic/FixedPoint.cpp
20

You're right. Removed it altogether.

53

Thanks for the deeper explanation. The implementation is essentially the same and catching the unsigned fixed point without padding is just checking if the value is negative and we are casting to an unsigned.

53

Also I wasn't sure if this was a typo, but I have

if (!(Masked == Mask || Masked == 0)) {

instead of

if (Masked == Mask || Masked == 0) {

because I think the mask comparison checks if the bits above the sign changed which indicates saturation.

61

You're right. The spec also says that shifting falls under the typical rounding behavior which is implementation-defined.

89

Removed saturatedRescaleAndResize and rescaleAndResize altogether since the logic in both can be performed in convert.

ebevhan added inline comments.Jul 3 2018, 2:02 AM
include/clang/Basic/FixedPoint.h
68

If this class is supposed to be used like APInt and APSInt, perhaps this function should return a copy instead of modifying in place.

The reason that the APFloat convert function modifies in place is because it needs to return a status code, but I don't think that is necessary for this.

lib/Basic/FixedPoint.cpp
26

Why the early exit? What does the logic below not handle?

If this value has padding but the other doesn't (or vice versa), the scales will be different and the code below should work just fine.

41

It should be possible to replace this with extOrTrunc and move it below the saturation.

45

Combine the above block with the below one.

50

Val is an APSInt so it should be possible to just do >> and get the right behavior, provided that the signedness is correct.

53

Yes, that was a typo! Sorry! Good thing you realized it.

59

If the value is unsigned and negative, we always zero it?

Also, this code always saturates the value regardless of whether the destination semantics call for it, no?

74

When we say 'width' do we mean the width - padding or the width including padding? What information does the semantics structure enshrine?

94

Use extOrTrunc and you won't need the ifs.

All of those accessors are a bit misnamed, it should be called extOrTruncOrSelf...

leonardchan marked 7 inline comments as done.
leonardchan added inline comments.
lib/Basic/FixedPoint.cpp
41

I don't think we can move the extension below saturation since saturation requires checking the bits above the dst width which may require extending and shifting beforehand.

Still leaving it above saturation may require checking for the width over using extOrTrunc to prevent truncating early before right shifting.

59

You're right. I did not test for this since I thought this would fall under undefined behavior converting a signed negative number to an unsigned number. I'll add a check for saturation since I imagine most people would expect modular wrap around, but do you think I should add a test case for this?

74

This width includes the padding. This is commented on over struct fixedPointSema where this width represents that of the underlying APInt, so if the sema is unsigned and hasUnsignedPadding is true, then in a provided APInt with this width, the width-1 bit is padding.

rjmccall added inline comments.Jul 3 2018, 10:49 AM
include/clang/Basic/FixedPoint.h
31

Thanks, this looks a lot better to me. Please do capitalize FixedPointSemantics, though; fltSemantics is a bizarre deviation from LLVM's style guidelines that should be fixed, not something to emulate.

41

Similarly, this should be renamed to getIntegralBits to follow the normal LLVM method-naming guidelines.

Also, please go ahead and hide all the members here behind getters and setters. It's better to reserve flexibility in advance than to have to rewrite a bunch of code later.

leonardchan marked 2 inline comments as done.
  • Renamed fixedPointSemantics to FixedPointSemantics and hid the members behind getters

Thanks, much appreciated. A couple more style changes that I noticed and this will be LGTM, although you should also make sure you have Bevin Hansson's approval.

include/clang/AST/ASTContext.h
33

Please forward-declare FixedPointSemantics and APFixedPoint instead of including them here. This file is #included into almost the entire project, but relatively few files will actually need to use any of the functionality of APFixedPoint. Among other things, it will make your life much happier bringing up this feature if every tiny change to the APFixedPoint interface doesn't force a full recompile.

1953

Please spell out "semantics" here.

include/clang/Basic/FixedPoint.h
41

All of these inlines are unnecessary.

ebevhan added inline comments.Jul 4 2018, 1:32 AM
lib/Basic/FixedPoint.cpp
41

I think the cases where saturation takes effect are limited to simply truncation, since when we upscale we automatically extend first to avoid shifting out bits. This means that the only situation where we need to check for saturation is right before we truncate, and we don't have to worry about anything happening when we upscale.

Do you have an example of when the extension is needed? (In case it wasn't clear, I mean the NewVal = NewVal.extend(DstWidth); extension, not the extension that is part of the upscale)

51

If you move the !DstSema.isSigned() && NewVal.isNegative() block after the mask check, you can avoid doing = 0 twice.

59

Yes, even though it's undefined behavior to wrap, I think it's a good idea to do as little work as possible for each operation and keep every operation as simple as possible. This is to make it easier to codegen later, since it will mean fewer operations, it will be simpler to understand the connection between the semantics of the operations and the code emission and there will be better 'alignment' between generated code and constant evaluation results.

Obviously, if you invoke undefined behavior in runtime anything can happen, but I think that the operations should behave "sensibly" in isolation.

59

The signedness of NewVal is not being set here.

unittests/Basic/FixedPointTest.cpp
380 ↗(On Diff #154020)

Technically, neither CheckSaturatedConversion function checks whether or not the result was correct, only that it saturated.

506 ↗(On Diff #154020)

Acc?

leonardchan marked 4 inline comments as done.
leonardchan added inline comments.
lib/Basic/FixedPoint.cpp
41

Ooooooh I see. Sorry this confused me for a bit. Thanks.

unittests/Basic/FixedPointTest.cpp
380 ↗(On Diff #154020)

Could you elaborate on this? It should be correct that the value saturates to the respective min/max for the destination type right?

506 ↗(On Diff #154020)

Short for Accum. Changed to Accum since it wasn't clear at first.

@ebevhan Any followup on the patch/my previous comments?

rjmccall added inline comments.Jul 17 2018, 3:34 PM
include/clang/AST/ASTContext.h
1953

You missed this request.

include/clang/Basic/FixedPoint.h
65

This should get a documentation comment describing the class. You should mention that, like APSInt, it carries all of the semantic information about the value's type (width, signed-ness, saturating-ness, etc.), but by design it does not carry the original C type — among other reasons, so that it can more easily be moved to LLVM should fixed-point types gain native IR support.

leonardchan marked 3 inline comments as done.
rjmccall added inline comments.Jul 18 2018, 9:06 PM
include/clang/Basic/FixedPoint.h
65

Documentation comments start with ///.

Otherwise, these changes look good; LGTM, but of course you should settle Bevin's review comments before committing.

leonardchan marked an inline comment as done.

Sorry for the late response, I've been away.

LGTM, although my browser doesn't want to let me set Accepted on this.

unittests/Basic/FixedPointTest.cpp
380 ↗(On Diff #154020)

It should... I'm not sure what I was thinking of when I wrote this comment, and I can't think of a counterexample so it's fine as it is.

506 ↗(On Diff #154020)

Seems like you got a couple Accumum from that.

leonardchan marked an inline comment as done.
  • Fixed Accumum names
This revision was not accepted when it landed; it landed in state Needs Review.Aug 6 2018, 9:43 AM
This revision was automatically updated to reflect the committed changes.
bricci added a subscriber: bricci.EditedAug 6 2018, 9:53 AM

Just a nit but could you please add new-lines to your commit messages.
Also the 3 bools in FixedPointSemantics are a little bit wasteful.
I don't know here if it matter but I have been spending the last weeks
cleaning up this kind of thing and this already cut the time of an fsyntax-only
by ~3.5%.

And using the name Sema is a bad idea IMHO since this has a totally different meaning in clang.

Just a nit but could you please add new-lines to your commit messages.

My bad, will remember this for future commits.

Also the 3 bools in FixedPointSemantics are a little bit wasteful.
I don't know here if it matter but I have been spending the last weeks
cleaning up this kind of thing and this already cut the time of an fsyntax-only
by ~3.5%.

Would you recommend something along the lines of having a bitmask instead of 3 bools? What should I keep in mind to avoid making fsyntax-only longer?

And using the name Sema is a bad idea IMHO since this has a totally different meaning in clang.

I'll make another small patch where I rename this.

bricci added a comment.Aug 6 2018, 2:50 PM

Added some comments inline.
However I have not looked at the logic too closely and
have not looked at the tests at all.

cfe/trunk/include/clang/AST/ASTContext.h
1966 ↗(On Diff #159321)

could you add some comments for these functions ?
What are they doing ?

cfe/trunk/include/clang/Basic/FixedPoint.h
24 ↗(On Diff #159321)

This should be removed if unused. It is very strange to have
something basic like a fixed point type depending on ASTContext...

31 ↗(On Diff #159321)

s/treaded/treated

54 ↗(On Diff #159321)

Do you really need the branch here?
Why not just return
Width - Scale - (IsSigned || (!IsSigned && HasUnsignedPadding))

Also doc

62 ↗(On Diff #159321)

My point is that you have two unsigned and then 3 bools.
This means that your class has now sizeof == 3*sizeof(unsigned)
This is annoying on 64 bits archs since pretty much everything
in clang is aligned to 8 bytes (because of pointer alignment requirements)
Just stuff your bits into Width or Scale. Moreover I think that
you could just use a single unsigned to store all of this.

For example you could here use a bit-field like so:
unsigned Width : 15; (or any other appropriate split with Scale)
unsigned Scale : 14; (or any other appropriate split with Width)
unsigned IsSigned : 1;
unsigned IsSaturated : 1;
unsigned HasUnsignedPadding : 1;

Now I do not now if this matters or not in this case but it seems
to me that this is easy to do.

85 ↗(On Diff #159321)

same comment about "sema"

99 ↗(On Diff #159321)

Maybe a bit more documentation here...

Also use /// for comments explaining something to an user:
eg:

/// Create a new Frob blablabla
Frob makeFrobinator()

so that doxygen picks up these comments.

Use // for comments which are internal only
eg:

...
// TODO this is broken because blablabla
...

110 ↗(On Diff #159321)

doc

120 ↗(On Diff #159321)

does this fit into 80 cols ?

130 ↗(On Diff #159321)

doc

cfe/trunk/lib/AST/ASTContext.cpp
10439 ↗(On Diff #159321)

assert( ... && "Ty is not a fixed point type!")

10448 ↗(On Diff #159321)

same

10453 ↗(On Diff #159321)

same

cfe/trunk/lib/Basic/FixedPoint.cpp
15 ↗(On Diff #159321)

Basic depends on ASTContext ? huh

98 ↗(On Diff #159321)

Would this work: do a test for equality and return 0 if true,
then it seems to me that you can drop every else if.
Also maybe use the above trick to remove some branches.

bricci added a comment.Aug 7 2018, 8:35 AM

Just to provide a bit of additional context:

My comment about fsyntax-only is not specific to fsyntax-only.
This is just an handy way to benchmark the frontend without noise
from LLVM. It is important to keep all frequently used structures small
(within reason) even if this result in a little bit of additional work to pack/unpack
(again, within reason) since the front-end is not doing a lot of computations
but is doing a lot of memory accesses.

On my machine when I do an fsyntax-only on all of Boost
~25% of the cycles are spent with execution stalled because of
a cache miss.

Now this basic fixed point type will potentially ends up in
the Type/Stmt/Decl hierarchy -> it *must* be small.

For another example of what not to do see APSInt which
stores a single bool, thus wasting 8 bytes on 64 bits archs
just for this single bit. I actually have a patch stuffing it into APInt
but have not bothered to upstream it since in this case it do not
really matter for clang.