This is an archive of the discontinued LLVM Phabricator instance.

[compiler-rt][builtins] Scaled Integer log10()
Needs ReviewPublic

Authored by leonardchan on May 17 2019, 2:37 PM.

Details

Summary

This patch proposes adding a log10() function that works on scaled integers. The only 2 arguments are the integer itself and its scale. The result is also a scaled integer with the same scale as the input. If the true result cannot be precisely represented, it is rounded down towards negative infinity.

I think this would be a good function to complement fixed point types, although we use scaled integers here since the types are still hidden behind a flag. This function also isn't something specified in the Embedded C spec, though we would like to use it in Fuchsia with fixed point types.

Diff Detail

Event Timeline

leonardchan created this revision.May 17 2019, 2:37 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptMay 17 2019, 2:37 PM
Herald added subscribers: Restricted Project, mgorny, dberris. · View Herald Transcript

I don't know much about compiler-rt, and neither what the procedure is for adding new builtins.
Maybe try to find some more reviewers (someone that has been doing/reviewing this kind of additions before).

I don't review compiler-rt patches generally. Can you explain what would actually use this function? Is there supposed to be a new Clang builtin for it?

I can't speak to how useful this would be and for whom, but I'm not opposed to including it if Fuschia has a use for it.

Typically, compiler-support libraryfunctions are all lowercase, and just use a single letter for each type--"i" for integer--prefixed by a "size of type" character: 's' for "single", 'd' for double length as in this example (it's a little archaic), I don't know of a common one for "scaled" though.

Then each one is also numbered. The exact mechanism to choose the number escapes me at the moment, but I'm consulting with someone who should know.

So this should probably be named something more like __log10scaleddiX, while I look for what X should be.

Also, has this been run through clang-format? I can never remember the rules on line-length.

The number is the argument count plus one for return value. So this function should be named

__log10scaleddi3

I don't review compiler-rt patches generally. Can you explain what would actually use this function? Is there supposed to be a new Clang builtin for it?

We will be using this as part of our WLAN library which represents energy units (decibels and watts) as scaled integers that we have custom classes for. Converting between these units and adding decibels mathematically involves using log10(). Currently, we just use float log10 and casting, but would like to use this for scaled integers and eventually fixed point types. The idea is that this would be a new clang builtin that we could use instead of a floating point log10().

I don't review compiler-rt patches generally. Can you explain what would actually use this function? Is there supposed to be a new Clang builtin for it?

We will be using this as part of our WLAN library which represents energy units (decibels and watts) as scaled integers that we have custom classes for. Converting between these units and adding decibels mathematically involves using log10(). Currently, we just use float log10 and casting, but would like to use this for scaled integers and eventually fixed point types. The idea is that this would be a new clang builtin that we could use instead of a floating point log10().

Okay. Will you need a signed variant of this? Is using a 64-bit argument reasonable for all targets, or should there be 32-bit and 64-bit (and maybe eventually 128-bit) variants?

leonardchan removed a subscriber: Restricted Project.

The number is the argument count plus one for return value. So this function should be named

__log10scaleddi3

Done and clang-formated

I don't review compiler-rt patches generally. Can you explain what would actually use this function? Is there supposed to be a new Clang builtin for it?

We will be using this as part of our WLAN library which represents energy units (decibels and watts) as scaled integers that we have custom classes for. Converting between these units and adding decibels mathematically involves using log10(). Currently, we just use float log10 and casting, but would like to use this for scaled integers and eventually fixed point types. The idea is that this would be a new clang builtin that we could use instead of a floating point log10().

Okay. Will you need a signed variant of this? Is using a 64-bit argument reasonable for all targets, or should there be 32-bit and 64-bit (and maybe eventually 128-bit) variants?

I don't think we need a signed version since ideally all inputs to log functions should be non-negative, otherwise the result is imaginary. This could also be reasonably extended to support other sizes. I only chose a 64 bit size for now since it's the largest type we use and any other types could be resized/rescaled into this. For 128-bit, this would be a little more tricky since the algorithm depends on wide multiplication (which we kinda do through x = (__uint128_t)x * x >> scale;), although I can't seem to find an existing compiler-rt multiplication function that does this.

The number is the argument count plus one for return value. So this function should be named

__log10scaleddi3

Done and clang-formated

I don't review compiler-rt patches generally. Can you explain what would actually use this function? Is there supposed to be a new Clang builtin for it?

We will be using this as part of our WLAN library which represents energy units (decibels and watts) as scaled integers that we have custom classes for. Converting between these units and adding decibels mathematically involves using log10(). Currently, we just use float log10 and casting, but would like to use this for scaled integers and eventually fixed point types. The idea is that this would be a new clang builtin that we could use instead of a floating point log10().

Okay. Will you need a signed variant of this? Is using a 64-bit argument reasonable for all targets, or should there be 32-bit and 64-bit (and maybe eventually 128-bit) variants?

I don't think we need a signed version since ideally all inputs to log functions should be non-negative, otherwise the result is imaginary.

Well, I think the builtin has to be able to operate on signed scaled-integer / fixed-point types, and you'll have to specify the behavior for negative numbers. If you're comfortable implementing those semantics inline in the caller, then I agree you don't need signed functions in compiler-rt.

This could also be reasonably extended to support other sizes. I only chose a 64 bit size for now since it's the largest type we use and any other types could be resized/rescaled into this. For 128-bit, this would be a little more tricky since the algorithm depends on wide multiplication (which we kinda do through x = (__uint128_t)x * x >> scale;), although I can't seem to find an existing compiler-rt multiplication function that does this.

My concern about that is that it's going to limit this builtin to 64-bit targets — most 32-bit targets don't support __uint128_t (as an implementation concern), and there might be problems with using a 64-bit type in the parameter list.

As a separate concern, 128-bit multiplication might itself be implemented with a compiler-rt function on some targets, and I'm not sure how acceptable it is for compiler-rt functions to have this kind of cross-dependency.

Well, I think the builtin has to be able to operate on signed scaled-integer / fixed-point types, and you'll have to specify the behavior for negative numbers. If you're comfortable implementing those semantics inline in the caller, then I agree you don't need signed functions in compiler-rt.

Ok. I'll add it to the caller when adding the builtin clang function. We can handle usage with different types there.

This could also be reasonably extended to support other sizes. I only chose a 64 bit size for now since it's the largest type we use and any other types could be resized/rescaled into this. For 128-bit, this would be a little more tricky since the algorithm depends on wide multiplication (which we kinda do through x = (__uint128_t)x * x >> scale;), although I can't seem to find an existing compiler-rt multiplication function that does this.

My concern about that is that it's going to limit this builtin to 64-bit targets — most 32-bit targets don't support __uint128_t (as an implementation concern), and there might be problems with using a 64-bit type in the parameter list.

As a separate concern, 128-bit multiplication might itself be implemented with a compiler-rt function on some targets, and I'm not sure how acceptable it is for compiler-rt functions to have this kind of cross-dependency.

I updated this patch and made a separate 32 bit version for this. The 64 bit version now uses __uint128_t only if compiler rt supports it, otherwise it uses a wide multiplication that I borrowed from fp_lib.hand a custom funnel shift right.

Well, I think the builtin has to be able to operate on signed scaled-integer / fixed-point types, and you'll have to specify the behavior for negative numbers. If you're comfortable implementing those semantics inline in the caller, then I agree you don't need signed functions in compiler-rt.

Ok. I'll add it to the caller when adding the builtin clang function. We can handle usage with different types there.

Okay. Please mention the behavior for 0 in the documentation, since this function is not otherwise defined for 0.

This could also be reasonably extended to support other sizes. I only chose a 64 bit size for now since it's the largest type we use and any other types could be resized/rescaled into this. For 128-bit, this would be a little more tricky since the algorithm depends on wide multiplication (which we kinda do through x = (__uint128_t)x * x >> scale;), although I can't seem to find an existing compiler-rt multiplication function that does this.

My concern about that is that it's going to limit this builtin to 64-bit targets — most 32-bit targets don't support __uint128_t (as an implementation concern), and there might be problems with using a 64-bit type in the parameter list.

As a separate concern, 128-bit multiplication might itself be implemented with a compiler-rt function on some targets, and I'm not sure how acceptable it is for compiler-rt functions to have this kind of cross-dependency.

I updated this patch and made a separate 32 bit version for this. The 64 bit version now uses __uint128_t only if compiler rt supports it, otherwise it uses a wide multiplication that I borrowed from fp_lib.hand a custom funnel shift right.

Thanks. That sounds right to me, but I still don't feel qualified to actually review the approach. :)

Well, I think the builtin has to be able to operate on signed scaled-integer / fixed-point types, and you'll have to specify the behavior for negative numbers. If you're comfortable implementing those semantics inline in the caller, then I agree you don't need signed functions in compiler-rt.

Ok. I'll add it to the caller when adding the builtin clang function. We can handle usage with different types there.

Okay. Please mention the behavior for 0 in the documentation, since this function is not otherwise defined for 0.

Done. We just return INT64/32_MIN for input of 0.

This could also be reasonably extended to support other sizes. I only chose a 64 bit size for now since it's the largest type we use and any other types could be resized/rescaled into this. For 128-bit, this would be a little more tricky since the algorithm depends on wide multiplication (which we kinda do through x = (__uint128_t)x * x >> scale;), although I can't seem to find an existing compiler-rt multiplication function that does this.

My concern about that is that it's going to limit this builtin to 64-bit targets — most 32-bit targets don't support __uint128_t (as an implementation concern), and there might be problems with using a 64-bit type in the parameter list.

As a separate concern, 128-bit multiplication might itself be implemented with a compiler-rt function on some targets, and I'm not sure how acceptable it is for compiler-rt functions to have this kind of cross-dependency.

I updated this patch and made a separate 32 bit version for this. The 64 bit version now uses __uint128_t only if compiler rt supports it, otherwise it uses a wide multiplication that I borrowed from fp_lib.hand a custom funnel shift right.

Thanks. That sounds right to me, but I still don't feel qualified to actually review the approach. :)

No problem. @saugustine , would you feel comfortable reviewing this, or is there someone else you would recommend for compiler-rt?

I am absolutely not the right person to review the math. I think we need to trust that the author has it correct. The logic looks reasonable and the tests look basically thorough. I am inclined to accept it.

Please add test cases for scale=0 and scale=width as I assume those need special handling (UB right now?).
And if scale=0 and scale=width needs special handling, then I guess scale=1 and scale=width-1 are new boundary values so I maybe it would be nice to have tests for those scales as well.

compiler-rt/lib/builtins/log10scaleddi3.c
64 ↗(On Diff #203939)

So with scale=0 being a valid input this results in a negative shift count (UB).

66 ↗(On Diff #203939)

So with scale=64 being a valid input this results in a too big shift count (UB).

leonardchan marked 2 inline comments as done.

Please add test cases for scale=0 and scale=width as I assume those need special handling (UB right now?).
And if scale=0 and scale=width needs special handling, then I guess scale=1 and scale=width-1 are new boundary values so I maybe it would be nice to have tests for those scales as well.

Added. For a scale of 0 we only return the integer result. For widths > 60 or 28, we return INT_MAX to represent an error since we need to represent 10 << scale in 64/32 bits for this to work. So the scale boundaries are [0,60] for the 64 bit version and [0,28] for the 32 bit version. We could technically increase this to go up to 63/31, although to get the precise result would require a bigger buffer (using more 128 bit ints) which would complicate things further.

For the 64 bit function, I realized that in order to get the precise result without a 128 bit int would require implementing division by 10 in 2 64 bit ints. I eventually found a way to do this and get the precise result, but am not sure if it should be included in this patch since it would be adding another large function. Perhaps it could be readded in a followup? For now, I removed the functions we fallback to if 128 bit ints aren't supported and only define the function if 128 bits are enabled.

Please add test cases for scale=0 and scale=width as I assume those need special handling (UB right now?).
And if scale=0 and scale=width needs special handling, then I guess scale=1 and scale=width-1 are new boundary values so I maybe it would be nice to have tests for those scales as well.

Added. For a scale of 0 we only return the integer result. For widths > 60 or 28, we return INT_MAX to represent an error since we need to represent 10 << scale in 64/32 bits for this to work. So the scale boundaries are [0,60] for the 64 bit version and [0,28] for the 32 bit version. We could technically increase this to go up to 63/31, although to get the precise result would require a bigger buffer (using more 128 bit ints) which would complicate things further.

For the 64 bit function, I realized that in order to get the precise result without a 128 bit int would require implementing division by 10 in 2 64 bit ints. I eventually found a way to do this and get the precise result, but am not sure if it should be included in this patch since it would be adding another large function. Perhaps it could be readded in a followup? For now, I removed the functions we fallback to if 128 bit ints aren't supported and only define the function if 128 bits are enabled.

Is there not a function for that already in compiler-rt? Or is the problem that it doesn't necessarily exist, e.g. if we're supporting a 64-bit scaled integer type but not a 128-bit integer type?

At any rate, I'd say it's better to have the operation with a big code-size cost than to not reliably have it.

Is there not a function for that already in compiler-rt? Or is the problem that it doesn't necessarily exist, e.g. if we're supporting a 64-bit scaled integer type but not a 128-bit integer type?

Yes, the extra functions are for if 64 bit scaled ints are supported but we aren't able to use 128 bit ints. There is __udivmodti4 which does 128 bit division, but only if 128 bit ints are enabled.

At any rate, I'd say it's better to have the operation with a big code-size cost than to not reliably have it.

Added.

Thanks! Still don't feel qualified to actually approve compiler-rt changes, though.

Hmm is there usually a protocol/etiquette for getting a new built-in submitted to compiler-rt?

Perhaps I could email the original creator and request a rubber stamp LGTM if he approves?

(my email reply might not have been sent)

My understanding is that Tim Northover sometimes reviews patches to compiler-rt, even if he's not an official owner; CC'ing him.

@t.p.northover Would you feel ok reviewing this?

Been a while since the last update, but would anyone still be comfortable reviewing this?