scanon (Steve Canon)
User

Projects

User does not belong to any projects.

User Details

User Since
Jun 10 2014, 9:32 AM (214 w, 2 d)

Recent Activity

Today

scanon committed rL337523: Add x86_64-unkown triple to llc for x86 test..
Add x86_64-unkown triple to llc for x86 test.
Thu, Jul 19, 8:56 PM
scanon committed rL337506: Skip out of SimplifyDemandedBits for BITCAST of f16 to i16.
Skip out of SimplifyDemandedBits for BITCAST of f16 to i16
Thu, Jul 19, 3:51 PM
scanon closed D49524: Avoid generating FGETSIGN for f16.
Thu, Jul 19, 3:51 PM
scanon added inline comments to D49561: AMDGPU: Try to make isKnownNeverSNan more accurate.
Thu, Jul 19, 2:42 PM
scanon added inline comments to D49561: AMDGPU: Try to make isKnownNeverSNan more accurate.
Thu, Jul 19, 2:41 PM
scanon added a comment to D49524: Avoid generating FGETSIGN for f16.
In D49524#1168723, @ab wrote:

I wouldn't be surprised if it failed on some target that can't even handle half parameters

Thu, Jul 19, 1:47 PM

Yesterday

scanon updated the diff for D49524: Avoid generating FGETSIGN for f16.
Wed, Jul 18, 5:22 PM
scanon created D49524: Avoid generating FGETSIGN for f16.
Wed, Jul 18, 4:59 PM

May 22 2018

scanon requested changes to D46926: [Fixed Point Arithmetic] Conversion between Fixed Point and Floating Point Numbers.

IIRC the optimization of divide-by-power-of-two --> multiply-by-inverse does not occur at -O0, so it would be better to multiply by 2^(-fbits) instead.

May 22 2018, 7:06 AM · Restricted Project

May 10 2018

scanon accepted D46499: [Support] Add docs for 'openFileFor{Write,Read}'.

LGTM, thanks!

May 10 2018, 2:19 PM

May 9 2018

scanon added a comment to D46499: [Support] Add docs for 'openFileFor{Write,Read}'.

One more question: the caller is responsible for closing the file when they're done, right?

May 9 2018, 8:57 AM
scanon added a comment to D46499: [Support] Add docs for 'openFileFor{Write,Read}'.

Two quick notes as someone who has never used these functions:

  • Name is really the *path* of the file to open, right?
  • What happens to ResultFD if the function fails?
May 9 2018, 8:52 AM

May 8 2018

scanon accepted D41868: APFloat/x87: Fix string conversion for "unnormal" values (pr35860).

LGTM

May 8 2018, 8:34 AM

May 4 2018

scanon added a comment to D46042: Cap vector alignment at 16 for all Darwin platforms.
In D46042#1088044, @ab wrote:

So, this makes sense to me, but on x86, should we also be worried about the fact that the calling convention is based on which features are available? (>128bit ext_vector_types are passed in AVX/AVX-512 registers, if available). Presumably swift is also affected, no?

May 4 2018, 10:50 AM

Apr 26 2018

scanon added a comment to D46135: [Driver, CodeGen] add options to enable/disable an FP cast optimization.

I like Chandler's wording. Something like:

Apr 26 2018, 1:51 PM

Apr 25 2018

scanon updated subscribers of D30527: Replacing float with new class Fraction for LSR alternative way of resolving complex solution.

@gottesmm can you take a look at this? You're more familiar with the APFloat API than I am.

Apr 25 2018, 7:28 AM

Apr 20 2018

scanon added a comment to rL330437: [DAGCombine] (float)((int) f) --> ftrunc (PR36617).

Tangential question: Do we have an intrinsic floating -> integer conversion with defined semantics for out-of-range values?

Apr 20 2018, 8:43 AM

Apr 6 2018

scanon added a comment to D45336: Apply accumulator to fadd/fmul experimental vector reductions (PR36734).

it should really be the default in both LLVM and source languages except where there's an explicit constraint that forces linear reduction.

Tree reductions is what Rust portable packed SIMD RFC [0] currently specifies.

Apr 6 2018, 10:06 AM
scanon added a comment to D45336: Apply accumulator to fadd/fmul experimental vector reductions (PR36734).

I also wonder whether requiring fast-math to allow tree reductions is overkill. Tree reductions can be implemented reasonably efficiently in many architectures, while linearly ordered reduction appear to me to be more of a niche.

Apr 6 2018, 8:19 AM

Mar 26 2018

scanon added a comment to D44909: [DAGCombine] (float)((int) f) --> ftrunc (PR36617).

Two questions, to which I do not know the answer:

(a) Are the semantics of ISD::FP_TO_[US]INT for out-of-range values specified anywhere? The do not seem to be, but maybe I'm just missing it.

No, I don't see anything either. We just have:

/// FP_TO_[US]INT - Convert a floating point value to a signed or unsigned
/// integer.

I was assuming these nodes follow the same rules as the IR instructions since they are mapped 1-to-1 in SelectionDAGBuilder::visitFPToSI() etc.

Mar 26 2018, 5:36 PM
scanon added a comment to D44909: [DAGCombine] (float)((int) f) --> ftrunc (PR36617).

Two questions, to which I do not know the answer:

Mar 26 2018, 5:08 PM
scanon added a comment to D44909: [DAGCombine] (float)((int) f) --> ftrunc (PR36617).

This is the class of optimizations that I would call "formally allowed by the standard, but extremely likely to break things and surprise people." Which isn't to say that we shouldn't do it, just ... be prepared.

Mar 26 2018, 2:57 PM

Mar 20 2018

scanon accepted D44318: [LangRef] describe the default FP environment.

I'm OK with this.

Mar 20 2018, 8:08 AM

Mar 16 2018

scanon added inline comments to D44521: [InstSimplify] fp_binop X, NaN --> NaN.
Mar 16 2018, 9:37 AM

Mar 5 2018

scanon added a comment to D43765: [InstSimplify] loosen FMF for sqrt(X) * sqrt(X) --> X.

IIRC Intrinsic::sqrt is undef for negative inputs (unlike the sqrt libcall), so we don't need FMF.noNaNs to license this transformation.

Mar 5 2018, 8:56 AM

Mar 2 2018

scanon accepted D44038: [InstCombine] Rewrite the binary op shrinking in visitFPTrunc to avoid creating overly small ConstantFPs that we'll just need to extend again..

Works for me.

Mar 2 2018, 12:07 PM
scanon accepted D43970: [InstCombine] Allow fptrunc (fpext X)) to be reduced to a single fpext/ftrunc.

LGTM.

Mar 2 2018, 10:05 AM
scanon added inline comments to D43970: [InstCombine] Allow fptrunc (fpext X)) to be reduced to a single fpext/ftrunc.
Mar 2 2018, 8:45 AM

Mar 1 2018

scanon added a comment to D43970: [InstCombine] Allow fptrunc (fpext X)) to be reduced to a single fpext/ftrunc.

We should be able to go farther and just do a fptrunc if SrcSize > DstSize.

Mar 1 2018, 2:02 PM

Feb 7 2018

scanon added a comment to D42879: InstCombine: 1./x >= 0. -> x >= 0..

In that case, you still have trouble because even 1/x can produce zero if someone is running with flush-to-zero enabled.

IIUC, we also have out-of-tree targets with no option; they always operate with FTZ behavior.

I think it's still possible to allow this kind of transform in instcombine with more fast-math-flags. Clang/gcc's -fassociative-math translates indirectly to 'reassoc' in IR FMF and says it may "reorder floating-point comparisons".

Feb 7 2018, 8:37 AM

Feb 6 2018

scanon added a comment to D42879: InstCombine: 1./x >= 0. -> x >= 0..

Underflow or overflow doesn't change sign, so 0 < C < inf && X >= 0 --> C/X >= 0.

It doesn't change the sign. However we have to differentiate between three cases here: negative, null (or minus null), and positive.

Underflow can change a value from positive or negative to null.
My understanding is that in case of underflow of large positive X the expression C/X <= 0 may be true while X <= 0 is not.

Feb 6 2018, 4:26 PM
scanon added a comment to D42879: InstCombine: 1./x >= 0. -> x >= 0..

Underflow or overflow doesn't change sign, so 0 < C < inf && X >= 0 --> C/X >= 0.

Feb 6 2018, 4:08 PM

Jan 25 2018

scanon added a comment to D41868: APFloat/x87: Fix string conversion for "unnormal" values (pr35860).

(After checking the architecture manual)

Jan 25 2018, 7:16 AM
scanon added a comment to D41868: APFloat/x87: Fix string conversion for "unnormal" values (pr35860).

My recollection is that on 386 and later unnormals with a zero significand are treated as an invalid operand; should these print as 0 or as NaN?

Jan 25 2018, 6:00 AM

Dec 4 2017

scanon accepted D40792: DAG: Match truncated rotation (PR35487).

Yup, there's likely something more general that we could match, but it's also worth taking this as is.

Dec 4 2017, 11:09 AM

Nov 27 2017

scanon added a comment to D37989: InstCombine: Insert missing canonicalizes.

IEEE 754 rules are that everything canonicalizes except bitwise operations (copy, abs, negate, copysign) and decimal re-encoding operations (which you don't care about).

Does this mean that we need to make all other float optimizations in LLVM do the same?

for example, we cannot optimize fmul(x, 1) to x, we must optimize it to fcanonicalize(x), right? thus preventing pretty much all float optimizations, since presumably that will act as a barrier.

Nov 27 2017, 11:23 AM
scanon added a comment to D37989: InstCombine: Insert missing canonicalizes.

IEEE 754 rules are that everything canonicalizes except bitwise operations (copy, abs, negate, copysign) and decimal re-encoding operations (which you don't care about).

Nov 27 2017, 11:05 AM

Oct 30 2017

scanon accepted D39225: Fix APFloat mod sign.

mod is not bound to the IEEE 754 remainder operation. It binds the C fmod operation. You're looking for the remainder operation.

Oct 30 2017, 9:32 AM

Sep 13 2017

scanon accepted D34695: _Float16 preprocessor macro definitions.

LGTM as well.

Sep 13 2017, 8:18 AM

Sep 12 2017

scanon added inline comments to D34695: _Float16 preprocessor macro definitions.
Sep 12 2017, 8:59 AM
scanon added inline comments to D34695: _Float16 preprocessor macro definitions.
Sep 12 2017, 8:36 AM

Sep 6 2017

scanon accepted D35222: InstSimplify: canonicalize is idempotent.

LGTM. Sorry that I didn't see this earlier.

Sep 6 2017, 8:30 AM

Aug 21 2017

scanon requested changes to D36961: [APFloat] Fix IsInteger() for DoubleAPFloat.

... except, please add another test-case where the other component is not an integer as well.

Aug 21 2017, 9:37 AM
scanon accepted D36961: [APFloat] Fix IsInteger() for DoubleAPFloat.

This looks fine.

Aug 21 2017, 9:35 AM

Aug 4 2017

scanon added a comment to D28335: [WIP] [RFC] Don't lower floating point intrinsics to libcalls which modify errno.

Huh, I guess we do use the x87 sin/cos. It's probably a terrible idea (the microcoded routines are both slow and wildly inaccurate for large inputs), but we do it.

Aug 4 2017, 4:08 PM

Mar 31 2017

scanon committed rL299257: Fix 80-column violation in previous commit..
Fix 80-column violation in previous commit.
Mar 31 2017, 1:47 PM
scanon committed rL299256: Fix APFloat mod (committing for simonbyrne).
Fix APFloat mod (committing for simonbyrne)
Mar 31 2017, 1:44 PM
scanon closed D29346: Fix APFloat mod by committing rL299256: Fix APFloat mod (committing for simonbyrne).
Mar 31 2017, 1:44 PM
scanon added a comment to D31533: Do not translate rint into nearbyint, but truncate it like nearbyint.

If rint() is faster than nearbyint(), should we canonicalize nearbyint()->rint()?

Mar 31 2017, 11:34 AM
scanon added a reviewer for D31398: [X86][X86 intrinsics]Folding cmp(sub(a,b),0) into cmp(a,b) optimization: hfinkel.

(x-y) == 0 --> x == y does not require nsz (zeros of any sign compare equal), nor does it require nnan (if x or y is NaN, both comparisons are false). It *does* require ninf (because inf-inf = NaN), and it also requires that subnormals are not flushed by the CPU. There's been some churn around flushing recently, Hal may have thoughts (+Hal).

Mar 31 2017, 7:44 AM
scanon accepted D31533: Do not translate rint into nearbyint, but truncate it like nearbyint.

LGTM.

Mar 31 2017, 6:56 AM
scanon added a comment to D31182: [InstCombine] fadd double (sitofp x), y check that the promotion is valid .

This addresses my concerns. Gentle ping to the other reviewers.

Mar 31 2017, 6:55 AM

Mar 21 2017

scanon added inline comments to D31164: [IR] Add AllowContract to FastMathFlags.
Mar 21 2017, 9:42 AM
scanon added inline comments to D31182: [InstCombine] fadd double (sitofp x), y check that the promotion is valid .
Mar 21 2017, 9:38 AM
scanon added a comment to D31182: [InstCombine] fadd double (sitofp x), y check that the promotion is valid .

The assumption is that an integer add is more canonical and/or cheaper than an FP add. Is that universally true?

Mar 21 2017, 9:31 AM

Mar 8 2017

scanon added a comment to D30527: Replacing float with new class Fraction for LSR alternative way of resolving complex solution.

What's the rationale for using rationals here instead of a (simpler) fixed-point representation, if we want to get rid of float?

Mar 8 2017, 8:35 AM

Feb 7 2017

scanon added a comment to D29346: Fix APFloat mod.

Great, thanks Simon! This looks fine numerically. @gottesmm, can you double-check for style, etc?

Feb 7 2017, 9:49 AM

Feb 1 2017

scanon added a comment to D29346: Fix APFloat mod.

Numerically this is sound, and something that I've wanted to fix for a while but haven't had time to deal with. Thanks for doing it! Please add some test cases that would detect the old error (@gottesmm can likely point you in the right direction).

Feb 1 2017, 5:58 AM

Jan 19 2017

scanon added a comment to D28862: [compiler-rt] [test] Use approximate comparison on float types.

These tests should either be exact, or should have a tolerance that's mathematically sound. +/-1ulp is not sound; the allowed error should be proportional to the magnitude of the larger of the real and imaginary components of the result -- e.g. if one component is very small compared to the other, the smaller one may have *no* "correct" bits and still be acceptable.

Jan 19 2017, 7:01 AM

Jan 6 2017

scanon accepted D27898: [compiler-rt] [builtins] Implement __floattitf() & __floatuntitf().

OK, I'm satisfied with that.

Jan 6 2017, 10:00 AM
scanon added inline comments to D27898: [compiler-rt] [builtins] Implement __floattitf() & __floatuntitf().
Jan 6 2017, 7:54 AM

Jan 4 2017

scanon added inline comments to D27932: InstSimplify: Eliminate fabs on known positive.
Jan 4 2017, 11:28 AM
scanon added inline comments to D27028: Add intrinsics for constrained floating point operations.
Jan 4 2017, 9:32 AM
scanon added inline comments to D27932: InstSimplify: Eliminate fabs on known positive.
Jan 4 2017, 9:28 AM

Dec 23 2016

scanon requested changes to D27898: [compiler-rt] [builtins] Implement __floattitf() & __floatuntitf().

I don't particularly care about the shift, since the code is completely equivalent either way, though it would be nice to clean up. However, please replace "mantissa" with "significand" before committing.

Dec 23 2016, 4:54 AM
scanon added inline comments to D27898: [compiler-rt] [builtins] Implement __floattitf() & __floatuntitf().
Dec 23 2016, 4:51 AM

Dec 22 2016

scanon added inline comments to D27898: [compiler-rt] [builtins] Implement __floattitf() & __floatuntitf().
Dec 22 2016, 10:41 AM
scanon added inline comments to D27898: [compiler-rt] [builtins] Implement __floattitf() & __floatuntitf().
Dec 22 2016, 10:36 AM
scanon added a comment to D27898: [compiler-rt] [builtins] Implement __floattitf() & __floatuntitf().

Please s/mantissa/significand/. I know that "mantissa" is used in a number of places in llvm sources, but it's incorrect terminology. Significand is the IEEE-754 nomenclature, which any new work should follow.

Dec 22 2016, 10:20 AM

Sep 26 2016

scanon added a comment to D24925: [testsuite] turn fp-contract off because we can check those results more precisely.

The tests in question are broken. They blindly apply a relative error tolerance that is not motivated by sound numerical analysis, despite the existence of well-known and well-founded error bounds for these problems, in every basic numerical analysis textbook.

Sep 26 2016, 10:56 AM

Sep 22 2016

scanon accepted D24481: make “#pragma STDC FP_CONTRACT” on by default.

LGTM

Sep 22 2016, 10:44 AM

Sep 12 2016

scanon added a comment to D24481: make “#pragma STDC FP_CONTRACT” on by default.

Abe, I had a patch to do the same thing last year (that we ended up backing out because it exposed a bug which has since been fixed). Your approach looks a bit different from the one that I took originally, and you seem to be missing some of the arm64 test changes that I had to put in:

Sep 12 2016, 4:40 PM
scanon added a reviewer for D24481: make “#pragma STDC FP_CONTRACT” on by default: hfinkel.
Sep 12 2016, 4:37 PM

Jul 15 2016

scanon added a comment to D18639: Use __builtin_isnan/isinf/isfinite in complex.

I am not the right person to review the C++ template details, but everything else seems OK to me.

Jul 15 2016, 8:02 AM

Jul 1 2016

scanon added a comment to D18639: Use __builtin_isnan/isinf/isfinite in complex.

I agree with Marshall. Aside from the points he raises, this approach looks fine.

Jul 1 2016, 8:49 AM

May 26 2016

scanon added a comment to D19391: transform obscured FP sign bit ops into a fabs/fneg using TLI hook.

Numerically, I'm fine with this. Someone else can address the approach and style, etc.

May 26 2016, 9:20 AM

May 23 2016

scanon added inline comments to D19178: Broaden FoldItoFPtoI to try and establish whether the integer value fits into the float type.
May 23 2016, 7:38 AM

May 17 2016

scanon added a comment to D20341: [CUDA] Enable fusing FP ops for CUDA by default..

-ffp-contract=on obeys the semantics of C's FP_CONTRACT pragma. In particular, it will not fuse:

float m = x*y;
float a = m + z;

Whereas you probably want that to fuse for your purposes. -ffp-contract=fast seems more in line with your needs.

May 17 2016, 4:29 PM

May 12 2016

scanon added inline comments to D19178: Broaden FoldItoFPtoI to try and establish whether the integer value fits into the float type.
May 12 2016, 8:33 AM

Apr 8 2016

scanon added a comment to D18874: InstCombine optimization to convert floating-point sign bit XORs to fsubs from -0.0.

The input code is denorm preserving on architectures that flush denorms, but the output may not be.

Apr 8 2016, 10:41 AM
scanon added a comment to D18874: InstCombine optimization to convert floating-point sign bit XORs to fsubs from -0.0.

There are a few niche floating-point formats that use a representation of the form [ exponent | 2s complement significand ], so the signbit ends up sitting in the middle of the number. Does LLVM claim to support arbitrary oddball formats for float/double? Ideally we would specify that LLVM assumes IEEE formats, even if we don't require fully conformant operations in hardware; that would give us some flexibility.

Apr 8 2016, 9:53 AM

Apr 7 2016

scanon added a comment to D18513: Simplify isfinite/isnan/isinf in finite-math-only mode.

Breaking isnan, isinf, etc is a non-starter. I know it's appealing from a consistent formal model viewpoint, but in practice it breaks a lot of code (this is why we have the call fallbacks for iOS/OSX).

Apr 7 2016, 10:58 AM

Mar 22 2016

scanon accepted D18320: APFloat: Fix signalling nans for scalbn.

OK. Let's take this in as is, then, and someone can look at normalize( ) when they have time.

Mar 22 2016, 10:19 AM

Mar 21 2016

scanon added a reviewer for D18320: APFloat: Fix signalling nans for scalbn: gottesmm.
Mar 21 2016, 10:04 AM
scanon added inline comments to D18320: APFloat: Fix signalling nans for scalbn.
Mar 21 2016, 10:03 AM
scanon accepted D18161: APFloat: Add frexp.

LGTM, thanks.

Mar 21 2016, 9:24 AM

Mar 15 2016

scanon added inline comments to D18161: APFloat: Add frexp.
Mar 15 2016, 11:54 AM

Mar 12 2016

scanon accepted D18118: APFloat: Fix ilogb for denormals.

Thanks.

Mar 12 2016, 8:47 PM
scanon accepted D18117: APFloat: Fix scalbn handling of denormals.

LGTM, thanks.

Mar 12 2016, 8:46 PM
scanon added inline comments to D18118: APFloat: Fix ilogb for denormals.
Mar 12 2016, 4:48 AM
scanon added inline comments to D18117: APFloat: Fix scalbn handling of denormals.
Mar 12 2016, 4:37 AM

Jan 29 2016

scanon accepted D16696: InstCombine: fabs(x) * fabs(x) -> x * x.

The signbit of NaN explicitly has no meaning, so there's no concern there. LGTM.

Jan 29 2016, 2:31 PM

Jan 11 2016

scanon accepted D15937: [LibCallSimplifier] use instruction-level fast-math-flags to transform sqrt calls.

LGTM.

Jan 11 2016, 11:32 AM

Dec 4 2015

scanon accepted D14066: [FPEnv Core 01/14] Add flags and command-line switches for FPEnv.

LGTM.

Dec 4 2015, 12:12 PM
scanon added a comment to D14067: [FPEnv Core 02/14] Add FPEnv access flags to fast-math flags.

Thanks!

Dec 4 2015, 12:02 PM

Dec 2 2015

scanon abandoned D14891: Replace assert with early-out in tryEmitFMulAdd.

Abandoned for D15165.

Dec 2 2015, 5:17 PM
scanon added a comment to D15165: change an assert when generating fmuladd to an ordinary 'if' check (PR25719).

LGTM.

Dec 2 2015, 5:17 PM
scanon added a comment to D15165: change an assert when generating fmuladd to an ordinary 'if' check (PR25719).

This is mostly http://reviews.llvm.org/D14891 with a test case added, but D14891 also fixed a second very minor issue: that the "else if" should just be "if". Also, majnemer made a few style suggestions there that it would be nice to adopt. Either way, we should merge the two patches. This can be the canonical one if you want to update it.

Dec 2 2015, 4:11 PM

Nov 30 2015

scanon added a comment to D14909: [X86][FMA] Optimize FNEG(FMUL) Patterns.
Nov 30 2015, 12:06 PM
scanon added a comment to D14909: [X86][FMA] Optimize FNEG(FMUL) Patterns.

Can we assume default rounding for now, change the operand to X86::FNMSUB, remove the nsz check, and leave a 'TODO' comment to revisit this after rounding mode support is added? I think it would be better to have this transform be more general if we can.

Nov 30 2015, 11:31 AM