# llvm/docs/LangRef.rst


12571 | 12571 | | |||

12572 | .. code-block:: llvm | 12572 | .. code-block:: llvm | ||

12573 | 12573 | | |||

12574 | %res = call {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b) | 12574 | %res = call {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b) | ||

12575 | %sum = extractvalue {i32, i1} %res, 0 | 12575 | %sum = extractvalue {i32, i1} %res, 0 | ||

12576 | %obit = extractvalue {i32, i1} %res, 1 | 12576 | %obit = extractvalue {i32, i1} %res, 1 | ||

12577 | br i1 %obit, label %overflow, label %normal | 12577 | br i1 %obit, label %overflow, label %normal | ||

12578 | 12578 | | |||

12579 | Fixed Point Arithmetic Intrinsics | ||||

12580 | --------------------------------- | ||||

12581 | | ||||

12582 | A fixed point number represents a real data type for a number that has a fixed | ||||

12583 | number of digits after a radix point (equivalent to the decimal point '.'). | ||||

12584 | The number of digits after the radix point is referred as the ``scale``. These | ||||

12585 | are useful for representing fractional values to a specific precision. The | ||||

12586 | following intrinsics perform fixed point arithmetic operations on 2 operands | ||||

12587 | of the same scale, specified as the third argument. | ||||

12588 | | ||||

12589 | | ||||

12590 | '``llvm.smul.fix.*``' Intrinsics | ||||

12591 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||||

12592 | | ||||

12593 | Syntax | ||||

12594 | """"""" | ||||

12595 | | ||||

12596 | This is an overloaded intrinsic. You can use ``llvm.smul.fix`` | ||||

12597 | on any integer bit width or vectors of integers. | ||||

12598 | | ||||

12599 | :: | ||||

12600 | | ||||

12601 | declare i16 @llvm.smul.fix.i16(i16 %a, i16 %b, i32 %scale) | ||||

12602 | declare i32 @llvm.smul.fix.i32(i32 %a, i32 %b, i32 %scale) | ||||

12603 | declare i64 @llvm.smul.fix.i64(i64 %a, i64 %b, i32 %scale) | ||||

12604 | declare <4 x i32> @llvm.smul.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale) | ||||

bjope: I'm not sure if/how the auto-vectorizers are aware that only the first two operands should be… | |||||

12605 | | ||||

12606 | Overview | ||||

12607 | """"""""" | ||||

12608 | | ||||

12609 | The '``llvm.smul.fix``' family of intrinsic functions perform signed | ||||

12610 | fixed point multiplication on 2 arguments of the same scale. | ||||

12611 | | ||||

12612 | Arguments | ||||

12613 | """""""""" | ||||

12614 | | ||||

12615 | The arguments (%a and %b) and the result may be of integer types of any bit | ||||

12616 | width, but they must have the same bit width. ``%a`` and ``%b`` are the two | ||||

12617 | values that will undergo signed fixed point multiplication. The argument | ||||

12618 | ``%scale`` represents the scale of both operands, and must be a constant | ||||

12619 | integer. | ||||

12620 | | ||||

12621 | Semantics: | ||||

12622 | """""""""" | ||||

12623 | | ||||

12624 | This operation performs fixed point multiplication on the 2 arguments of a | ||||

12625 | specified scale. The result will also be returned in the same scale specified | ||||

12626 | in the third argument. In the event a value cannot be precisely represented in | ||||

12627 | this scale, the value is either rounded down to the closest fixed point value | ||||

12628 | less than the source value, or rounded ip to the closest fixed point value | ||||

ebevhan: typo "up"

12629 | greater than the source value. This rounding is dependent on the target. | ||||

ebevhan: This doc doesn't say what happens on overflow. Truncation/wraparound? Or should we consider it undefined?

If it says that the rounding is dependent on the target, will the rounding mode be target-configurable and exposed through TLI/TTI? We essentially can't touch these intrinsics (constant folding, optimization, even legalization) otherwise.

Looking at the legalization sequence, it will definitely lower these into a round-down form. If a target has some legal operations which round up and some non-legal operations, then the legal ones will round up and the non-legal ones will round down. That's sort of messy.

It might be safer to say that the rounding is indeterminate, but that's even worse for optimization.

leonardchan (author): Overflow I was going to leave as undefined behavior. Added this into the docs.

By target dependent, I meant that whatever legal operation a target could replace this intrinsic with has the choice of rounding up or down. I'm not sure how bad this would be for optimization, but I imagine rounding isn't something that needs to be touched immediately since with this intrinsic, I'm just attempting to match what the standard says. I think one of the conclusions we also came up to in the long thread on llvm-dev was that, for now, we don't need to do more than what's necessary to implement the spec.

I changed the docs though to say rounding is indeterminable, as the spec says.

ebevhan: Okay. If we claim that it's indeterminate I guess we can still fold however we like, but the results would seem a bit inconsistent.

12630 | | ||||

efriedma: "The rounding direction is unspecified" seems scary to me... is there really no standard for rounding these operations?

ebevhan: The Embedded-C standard says that the rounding direction is implementation-defined, and is allowed to be indeterminable.

I also think it's a bit unfortunate that we can't express the rounding direction, since it inhibits optimization, but locking it down to either rounding up or down will make things annoying for any target with operations that round the other direction. Saying that it's unspecified lets us properly legalize this to something sensible while allowing targets to implement it with legal operations if they have them.

For what it's worth, the legalization will make it round down (toward negative infinity).

bjope: Maybe it should say "target specific" instead? And then the legalization (and any constant folding etc) could use some target transform hook to decide in which direction to do the rounding (I guess it could be set to up/down/any).

Does "unspecified" mean that ConstantFolding use one strategy for rounding and legalization another strategy? Or should we only allow constant folding when rounding isn't a problem (such as multiply by zero)? Things could still be annoying for a target that does care about rounding. Or maybe it isn't allowed for a target to decide what the implementation-defined behavior should be? Or should it be the same for all targets?

A first implementation could of course go for only supporting "unspecified" (if it is clear what kind of constant folding and other optimizations that are allowed). If a target needs a "target specific" behavior, then I assume the code could be sprinkled with hooks at a later stage.

leonardchan:

> Maybe it should say "target specific" instead? And then the legalization (and any constant…

I would be ok with adding this later. For now I was focusing on laying down the basic code necessary for just using this intrinsic.

I initially had it as "target specific" but changed it to "unspecified" since the rest of the LangRef doc seems to use this term for describing either implementation-defined, or undefined, behavior.

The hooks are a good idea. I figure for optimizations, depending on what's supported (up/down/any), InstCombine and others can use the hook to determine rounding direction.

efriedma: Would it make sense to make the rounding mode a parameter to the intrinsic? That way, frontends can choose the semantics they want, and the semantics are clear throughout the optimization pipeline. clang could choose the rounding mode in a target-specific manner if necessary.

leonardchan: The only thing that I think would be of concern is just having too many parameters, or having an extra intrinsic to dictate rounding direction. Other than this, I have no other strong opinions regarding this. I imagine though that having an extra parameter/intrinsic to specify rounding would imply that a particular target has the option of specifying rounding in both directions, when really a target would likely just have native rounding in one direction (although this is just an assumption).

leonardchan: @ebevhan @bjope Do you think an extra parameter/intrinsic is worth adding also in this patch?

ebevhan: It's definitely handy to be able to express the rounding of the operations in the intrinsic, but I think some of the downsides are:

- The number of parameters grows, making the intrinsic more unwieldy.
- There are other roundings than just up and down; rounding towards or away from zero (the former of which I think the fixed-point division ought to try and always follow) are also possibilities.
- We need to support legalization of every rounding mode we add to the intrinsic. Then again, if we add a hook we need to do the same anyway.
- The legalization checks need to check both rounding and scale.

And as you say, it's more likely that the rounding of a specific target should be the same across the board rather than be configurable per operation, if it has support for fixed-point.

ebevhan: Ignore what I said about the division rounding; I was totally off mark there.

I think we can just keep it unspecified for now and if the need arises, add rounding mode with a hook.
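
The rounding behaviour discussed in this thread can be sketched with a small Python model. This is only an illustration of the arithmetic, not LLVM code: the function name `smul_fix`, the `bits` and `round_up` parameters, and the use of `OverflowError` as a stand-in for undefined behaviour on overflow are all assumptions made for the example. The scale is taken to be a count of fractional bits, as in the LangRef examples.

```python
def smul_fix(a: int, b: int, scale: int, bits: int, round_up: bool = False) -> int:
    """Model signed fixed-point multiplication on raw integer values.

    a, b     -- raw operands, both interpreted with `scale` fractional bits
    scale    -- number of fractional bits (hypothetical model of %scale)
    bits     -- bit width of the integer type, e.g. 4 for i4
    round_up -- False models round-down (toward negative infinity), the
                direction the thread says legalization produces; True models
                a target that rounds up instead
    """
    product = a * b  # exact product, carries 2 * scale fractional bits
    if round_up:
        result = -((-product) >> scale)  # ceiling: toward positive infinity
    else:
        result = product >> scale        # floor: toward negative infinity

    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lo <= result <= hi:
        # The thread treats overflow as undefined behaviour; raising an
        # exception is only a stand-in so the model does not return garbage.
        raise OverflowError(f"{result} does not fit in i{bits}")
    return result
```

With the i4 operands from the LangRef examples, `smul_fix(3, -3, 1, 4)` gives -5 and `smul_fix(3, -3, 1, 4, round_up=True)` gives -4, the two results that the example comment allows for 1.5 x -1.5.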

12631 | | ||||

12632 | Examples | ||||

12633 | """"""""" | ||||

12634 | | ||||

12635 | .. code-block:: llvm | ||||

ebevhan: I think "operation" might be clearer than "conversion". Or don't mention it at all and just say "It is undefined behavior if the source value does not fit within the range of the fixed point type."

The rounding section is also a bit wordy. "If the result value cannot be precisely represented in the given scale, the value is rounded up or down to the closest representable value. The rounding direction is unspecified."

Also, my choice of the word "indeterminate" was a bit unfortunate, I think the rest of the LangRef uses "unspecified".

12636 | | ||||

12637 | %res = call i4 @llvm.smul.fix.i4(i4 3, i4 2, i32 0) ; %res = 6 (2 x 3 = 6) | ||||

12638 | %res = call i4 @llvm.smul.fix.i4(i4 3, i4 2, i32 1) ; %res = 3 (1.5 x 1 = 1.5) | ||||

12639 | %res = call i4 @llvm.smul.fix.i4(i4 3, i4 -2, i32 1) ; %res = -3 (1.5 x -1 = -1.5) | ||||

12640 | | ||||

12641 | ; The result in the following could be rounded up to -2 or down to -2.5 depending on the system | ||||

12642 | %res = call i4 @llvm.smul.fix.i4(i4 3, i4 -3, i32 1) ; %res = -5 (or -4) (1.5 x -1.5 = -2.25) | ||||

12643 | | ||||

12644 | | ||||

12579 | Specialised Arithmetic Intrinsics | 12645 | Specialised Arithmetic Intrinsics | ||

12580 | --------------------------------- | 12646 | --------------------------------- | ||

12581 | 12647 | | |||

12582 | '``llvm.canonicalize.*``' Intrinsic | 12648 | '``llvm.canonicalize.*``' Intrinsic | ||

12583 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 12649 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||

12584 | 12650 | | |||

12585 | Syntax: | 12651 | Syntax: | ||

12586 | """"""" | 12652 | """"""" | ||


I'm not sure if/how the auto-vectorizers are aware that only the first two operands should be vectorized and that the third operand should stay scalar. So I guess it is unlikely that we ever get any vector operands at the moment (except for handwritten lit tests).

So I guess this should work for now (after all it is simpler than having a vector of scales). But perhaps we need to allow the scale to be given as a vector as well in the future to support vectorization (`shl` is for example similar, but afaik it will get the shift counts as a vector when being vectorized).
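
The difference between the two shapes mentioned above can be sketched in Python. Both function names are hypothetical, round-down is assumed for the rounding direction, and overflow checking is omitted. The first function mirrors the signature in this patch (vector operands, one scalar scale). The second is only the possible future variant raised in the comment and is not part of the patch.

```python
from typing import List


def smul_fix_vector(a: List[int], b: List[int], scale: int) -> List[int]:
    """Vector operands with one scalar scale shared by every lane."""
    if len(a) != len(b):
        raise ValueError("operand vectors must have the same length")
    # Each lane multiplies exactly, then drops `scale` fractional bits (floor).
    return [(x * y) >> scale for x, y in zip(a, b)]


def smul_fix_vector_scales(a: List[int], b: List[int], scales: List[int]) -> List[int]:
    """Hypothetical variant: one scale per lane, like vectorized shift counts."""
    if not (len(a) == len(b) == len(scales)):
        raise ValueError("operand and scale vectors must have the same length")
    return [(x * y) >> s for x, y, s in zip(a, b, scales)]
```

A vectorizer that widens four scalar calls sharing the same constant scale only needs the first form; the second form matters only if lanes with different scales were ever combined.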