This is an archive of the discontinued LLVM Phabricator instance.

[compiler-rt][BF16] "bfloat -> float -> bfloat" round-trip conversions
Closed, Public

Authored by vdonaldson on Aug 28 2023, 9:57 AM.


Details

Reviewers

klausler
PeteSteinfeld
vzakhari

Summary

Invoking compiler-rt function __truncsfbf2 to convert a zero 32-bit float
0x00000000 to a 16-bit bfloat value currently generates the denormal value
0x0040, rather than value 0x0000. Negative zero 0x80000000 is converted
to denormal 0x8040 rather than 0x8000.

This behavior is seen in flang code under development (not yet integrated)
that converts bfloat/REAL(KIND=3) argument values to float/REAL(KIND=4)
values and then converts those values back to bfloat/REAL(KIND=3). There
are other instances of the problem. A round-trip type conversion using
__truncsfbf2 of a denormal generates a different denormal, and an sNaN
is converted to a qNaN.
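The round trip is expected to be exact because widening a bfloat to a float only appends zero bits. The following is a minimal sketch of that property working on raw bit patterns; the helper names are hypothetical and are not part of compiler-rt or flang.

```c
#include <stdint.h>

/* Hypothetical helper: widen a bfloat16 bit pattern to the float bit pattern
   with the same value. bfloat16 has the same sign and exponent layout as an
   IEEE-754 binary32 and keeps only the top 7 significand bits, so widening
   appends 16 zero bits and never changes the value. */
static uint32_t bf16_bits_to_f32_bits(uint16_t b) {
  return (uint32_t)b << 16;
}

/* Hypothetical helper: true when the low 16 bits of a float bit pattern are
   all zero, i.e. the value is exactly representable as a bfloat16. */
static int f32_tail_is_zero(uint32_t f) {
  return (f & 0xFFFFu) == 0;
}
```

Every pattern produced by the widening helper satisfies the tail check, including zero (0x0000), negative zero (0x8000), the denormal 0x0040, and an sNaN such as 0x7F81, so converting back should reproduce the original 16 bits.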

The problem is addressed in the generic conversion code in fp_trunc_impl.inc
by removing trailing 0 significand bits when the source and destination
type formats are identical except for the significand size. This condition
is met only for float -> bfloat conversions.
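As a rough standalone illustration of the check described above, specialized to float -> bfloat, the sketch below works on raw bit patterns. The function name and the constants are assumptions made for this example; the patch itself uses the generic srcBits, dstBits, srcExpBits, and dstExpBits values inside __truncXfYf2__.

```c
#include <stdint.h>

/* Assumed widths for the float -> bfloat case only. */
enum { SRC_BITS = 32, DST_BITS = 16, TAIL_BITS = SRC_BITS - DST_BITS };

/* Hypothetical helper: if the tail bits of a_rep are all zero, store the
   truncated pattern in *out and return 1. Otherwise return 0, which stands
   for falling through to the existing rounding path. The sign bit is part of
   a_rep, so it is carried over by the shift. */
static int try_exact_trunc(uint32_t a_rep, uint16_t *out) {
  if (((a_rep >> TAIL_BITS) << TAIL_BITS) != a_rep)
    return 0;
  *out = (uint16_t)(a_rep >> TAIL_BITS);
  return 1;
}
```

With this check, 0x00000000 maps to 0x0000, 0x80000000 to 0x8000, and 0x7F810000 stays the sNaN 0x7F81, while an input such as 0x3F800001 has a nonzero tail and is left to the rounding logic.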

Round-trip conversions for at least some other type pairs have the same
problem. A solution in those cases would need to account for exponent
size differences. Those cases are not relevant to flang compilations
and are not addressed here. A broader solution might subsume this fix,
or this fix might remain useful as is.
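To see why dropping tail bits is not enough when exponent sizes differ, consider float -> half as an assumed example pair (the summary does not say which pairs are affected). The sketch below decodes normal IEEE-754 binary16 patterns only; both helper names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: decode a normal binary16 pattern (1 sign bit,
   5 exponent bits with bias 15, 10 fraction bits) to a double.
   Zero, subnormals, infinities, and NaNs are deliberately not handled. */
static double half_normal_to_double(uint16_t h) {
  int exp = (h >> 10) & 0x1F;
  int frac = h & 0x3FF;
  double v = 1.0 + frac / 1024.0;
  int e = exp - 15;
  while (e > 0) { v *= 2.0; e--; }
  while (e < 0) { v /= 2.0; e++; }
  return (h & 0x8000) ? -v : v;
}

/* Hypothetical helper: the tail-dropping shift applied to a float pattern,
   ignoring the fact that half has a narrower exponent field. */
static uint16_t naive_tail_drop(uint32_t f) {
  return (uint16_t)(f >> 16);
}
```

The float 1.0 is 0x3F800000, and the naive shift yields 0x3F80, which decodes as 1.875 in binary16; the correct half encoding of 1.0 is 0x3C00. This is why the shortcut in the patch is guarded by an equal-exponent-width condition.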

There are no existing tests of bfloat conversion functionality in the
compiler-rt test directory. Tests for other conversions use a common
infrastructure that does not currently have support for bfloat conversions.
This patch does not attempt to add that infrastructure for this new case.
CodeGen test bfloat.ll checks bfloat adds and other operations that invoke
__truncsfbf2.

Diff Detail

Event Timeline

vdonaldson created this revision. Aug 28 2023, 9:57 AM

Herald added a project: Restricted Project. Aug 28 2023, 9:57 AM

Herald added subscribers: Enna1, jdoerfert, dberris.

vdonaldson requested review of this revision. Aug 28 2023, 9:57 AM

vdonaldson added reviewers: klausler, PeteSteinfeld, vzakhari.

klausler accepted this revision. Aug 28 2023, 10:02 AM

This revision is now accepted and ready to land. Aug 28 2023, 10:02 AM

Harbormaster completed remote builds in B255251: Diff 553969. Aug 28 2023, 10:43 AM

Everything builds and tests correctly, and looks good.

https://reviews.llvm.org/D159005

Revision Contents

Path: compiler-rt/lib/builtins/fp_trunc_impl.inc
Size: 7 lines

Diff 553969

compiler-rt/lib/builtins/fp_trunc_impl.inc

[69 lines not shown; context: static __inline dst_t __truncXfYf2__(src_t a) {]
   const dst_rep_t dstNaNCode = dstQNaN - 1;

   // Break a into a sign and representation of the absolute value.
   const src_rep_t aRep = srcToRep(a);
   const src_rep_t aAbs = aRep & srcAbsMask;
   const src_rep_t sign = aRep & srcSignMask;
   dst_rep_t absResult;

+  const int tailBits = srcBits - dstBits;
+  if (srcExpBits == dstExpBits && ((aRep >> tailBits) << tailBits) == aRep) {
+    // Same size exponents and a's significand tail is 0. Remove tail.
+    dst_rep_t result = aRep >> tailBits;
+    return dstFromRep(result);
+  }
+
   if (aAbs - underflow < aAbs - overflow) {
     // The exponent of a is within the range of normal numbers in the
     // destination format. We can convert by simply right-shifting with
     // rounding and adjusting the exponent.
     absResult = aAbs >> (srcSigBits - dstSigBits);
     absResult -= (dst_rep_t)(srcExpBias - dstExpBias) << dstSigBits;

     const src_rep_t roundBits = aAbs & roundMask;
[47 lines not shown]