andrew.w.kaylor (Andy Kaylor)
User

Projects

User does not belong to any projects.

User Details

User Since
Jan 2 2013, 1:50 PM (241 w, 4 d)

Recent Activity

Fri, Aug 18

andrew.w.kaylor added a comment to D36335: Add ‘llvm.experimental.constrained.fma‘ Intrinsic.

Do we want to give the target any chance to use FMSUB/FNMADD/FNMSUB if any of the arguments are negated?

Fri, Aug 18, 3:12 PM
andrew.w.kaylor added inline comments to D36335: Add ‘llvm.experimental.constrained.fma‘ Intrinsic.
Fri, Aug 18, 2:54 PM

Thu, Aug 17

andrew.w.kaylor added inline comments to D36335: Add ‘llvm.experimental.constrained.fma‘ Intrinsic.
Thu, Aug 17, 11:17 AM

Tue, Aug 15

andrew.w.kaylor added inline comments to D36335: Add ‘llvm.experimental.constrained.fma‘ Intrinsic.
Tue, Aug 15, 3:58 PM

Mon, Aug 14

andrew.w.kaylor added inline comments to D36335: Add ‘llvm.experimental.constrained.fma‘ Intrinsic.
Mon, Aug 14, 10:22 AM

Fri, Aug 4

andrew.w.kaylor added a comment to D36335: Add ‘llvm.experimental.constrained.fma‘ Intrinsic.

Could you also add a use of this new intrinsic to llvm/test/Verifier/fp-intrinsics.ll?

Fri, Aug 4, 3:13 PM
andrew.w.kaylor requested changes to D36335: Add ‘llvm.experimental.constrained.fma‘ Intrinsic.
Fri, Aug 4, 3:09 PM

Jul 9 2017

andrew.w.kaylor updated the diff for D34163: Add strictfp attribute to prevent unwanted optimizations of libm calls.

-Addressed review feedback
-Rebased

Jul 9 2017, 6:13 AM

Jul 6 2017

andrew.w.kaylor added inline comments to D34163: Add strictfp attribute to prevent unwanted optimizations of libm calls.
Jul 6 2017, 3:22 PM

Jun 22 2017

andrew.w.kaylor updated the diff for D34487: Restrict the definition of loop preheader to avoid special blocks.

Addressed initial review feedback

Jun 22 2017, 1:45 PM
andrew.w.kaylor added inline comments to D34487: Restrict the definition of loop preheader to avoid special blocks.
Jun 22 2017, 1:30 PM

Jun 21 2017

andrew.w.kaylor created D34487: Restrict the definition of loop preheader to avoid special blocks.
Jun 21 2017, 4:55 PM
andrew.w.kaylor added a reviewer for D34163: Add strictfp attribute to prevent unwanted optimizations of libm calls: craig.topper.
Jun 21 2017, 4:38 PM

Jun 13 2017

andrew.w.kaylor created D34163: Add strictfp attribute to prevent unwanted optimizations of libm calls.
Jun 13 2017, 1:07 PM

Jun 7 2017

andrew.w.kaylor requested changes to D34006: [LoopSimplify] Add opt-bisect support to LoopSimplify pass.

Oops. I didn't mean to accept the revision yet.

Jun 7 2017, 12:10 PM
andrew.w.kaylor accepted D34006: [LoopSimplify] Add opt-bisect support to LoopSimplify pass.

It looks like a number of passes have this as a required analysis pass, and some assert that loops are in the simplified form. Can you verify that all passes which depend on this pass can also be skipped by opt-bisect?

Jun 7 2017, 12:10 PM

Jun 1 2017

andrew.w.kaylor updated the diff for D33737: [InstSimplify] Don't constant fold or DCE calls that are marked nobuiltin.

Re-wrote the patch to move the isNoBuiltin checks into canConstantFoldCallTo and ConstantFoldCall.

Jun 1 2017, 12:36 PM
andrew.w.kaylor added a comment to D33751: Add opt-bisect support for region passes.

Thanks for doing this!

Jun 1 2017, 9:15 AM

May 31 2017

andrew.w.kaylor added a comment to D33737: [InstSimplify] Don't constant fold or DCE calls that are marked nobuiltin.

The other alternative is that you could pass in an AttributeList (the result of getAttributes()) to the functions... that makes it more clear what exactly is getting used, but it's more complicated (and probably involves a bunch of refactoring to expose a function to check isNoBuiltin given a Function/AttributeList pair).

But either way, I would prefer a solution that involves checking for isNoBuiltin() in as few places as possible.

May 31 2017, 4:10 PM
andrew.w.kaylor added a comment to D33737: [InstSimplify] Don't constant fold or DCE calls that are marked nobuiltin.

Would it be possible to pass in a CallSite to canConstantFoldCallTo, ConstantFoldCall, and SimplifyCall, to reduce the number of places we check this?

May 31 2017, 3:06 PM
andrew.w.kaylor created D33737: [InstSimplify] Don't constant fold or DCE calls that are marked nobuiltin.
May 31 2017, 1:27 PM

May 25 2017

andrew.w.kaylor updated the diff for D32319: Add constrained intrinsics for some libm-equivalent operations.

Addressed review feedback

May 25 2017, 10:01 AM

May 22 2017

andrew.w.kaylor updated the diff for D32319: Add constrained intrinsics for some libm-equivalent operations.

Fixed LangRef spacing issues in more places

May 22 2017, 1:58 PM
andrew.w.kaylor updated the diff for D32319: Add constrained intrinsics for some libm-equivalent operations.

Addressed review comments
Updated to latest source base

May 22 2017, 1:54 PM

May 12 2017

andrew.w.kaylor added inline comments to D32319: Add constrained intrinsics for some libm-equivalent operations.
May 12 2017, 11:45 AM
andrew.w.kaylor accepted D31789: [TLI] Add mapping for various '__<func>_finite' forms of the math routines to SVML routines.

lgtm

May 12 2017, 10:17 AM

May 11 2017

andrew.w.kaylor added inline comments to D31789: [TLI] Add mapping for various '__<func>_finite' forms of the math routines to SVML routines.
May 11 2017, 2:49 PM
andrew.w.kaylor added inline comments to D31788: [ConstantFolding] Add folding for various math '__<func>_finite' routines generated from -ffast-math.
May 11 2017, 1:35 PM
andrew.w.kaylor added a comment to D31788: [ConstantFolding] Add folding for various math '__<func>_finite' routines generated from -ffast-math.

A couple of things came up as I was preparing this for commit. They're pretty straightforward issues, so I'll just fix them in my sandbox before committing.

May 11 2017, 1:15 PM

May 10 2017

andrew.w.kaylor accepted D31788: [ConstantFolding] Add folding for various math '__<func>_finite' routines generated from -ffast-math.

lgtm

May 10 2017, 2:58 PM
andrew.w.kaylor added a comment to D31787: [TLI] Add declarations for various math header file routines from math-finite.h that create '__<func>_finite as functions.

Still approved.

May 10 2017, 2:58 PM

May 9 2017

andrew.w.kaylor added inline comments to D31788: [ConstantFolding] Add folding for various math '__<func>_finite' routines generated from -ffast-math.
May 9 2017, 5:03 PM
andrew.w.kaylor added inline comments to D31788: [ConstantFolding] Add folding for various math '__<func>_finite' routines generated from -ffast-math.
May 9 2017, 4:40 PM

May 8 2017

andrew.w.kaylor added inline comments to D31787: [TLI] Add declarations for various math header file routines from math-finite.h that create '__<func>_finite as functions.
May 8 2017, 12:28 PM
andrew.w.kaylor added inline comments to D31788: [ConstantFolding] Add folding for various math '__<func>_finite' routines generated from -ffast-math.
May 8 2017, 12:26 PM
andrew.w.kaylor accepted D31787: [TLI] Add declarations for various math header file routines from math-finite.h that create '__<func>_finite as functions.

Looks good to me.

May 8 2017, 11:54 AM

May 5 2017

andrew.w.kaylor accepted D32837: TargetLibraryInfo: Introduce wcslen.

Looks good to me

May 5 2017, 8:48 AM

May 2 2017

andrew.w.kaylor added inline comments to D31787: [TLI] Add declarations for various math header file routines from math-finite.h that create '__<func>_finite as functions.
May 2 2017, 2:23 PM

May 1 2017

andrew.w.kaylor added a comment to D31787: [TLI] Add declarations for various math header file routines from math-finite.h that create '__<func>_finite as functions.

Can you add tests for these changes? I think the attribute.ll and no-proto.ll tests in test/Transforms/InferFunctionAttrs would be the right place to add these. I'm not sure if the existing tests check all of the LibFunc calls or just a representative sample.

May 1 2017, 12:10 PM

Apr 28 2017

andrew.w.kaylor added a comment to D32319: Add constrained intrinsics for some libm-equivalent operations.

Ping.

Apr 28 2017, 10:03 AM

Apr 24 2017

andrew.w.kaylor updated subscribers of D31789: [TLI] Add mapping for various '__<func>_finite' forms of the math routines to SVML routines.
Apr 24 2017, 12:33 PM
andrew.w.kaylor updated subscribers of D31788: [ConstantFolding] Add folding for various math '__<func>_finite' routines generated from -ffast-math.
Apr 24 2017, 12:32 PM
andrew.w.kaylor updated subscribers of D31787: [TLI] Add declarations for various math header file routines from math-finite.h that create '__<func>_finite as functions.
Apr 24 2017, 12:32 PM

Apr 21 2017

andrew.w.kaylor accepted D31182: [InstCombine] fadd double (sitofp x), y check that the promotion is valid .

I'm happy with this. Thanks for the improvements!

Apr 21 2017, 10:32 AM

Apr 20 2017

andrew.w.kaylor created D32319: Add constrained intrinsics for some libm-equivalent operations.
Apr 20 2017, 3:54 PM

Mar 23 2017

andrew.w.kaylor added inline comments to D31182: [InstCombine] fadd double (sitofp x), y check that the promotion is valid .
Mar 23 2017, 3:32 PM

Mar 21 2017

andrew.w.kaylor added inline comments to D31182: [InstCombine] fadd double (sitofp x), y check that the promotion is valid .
Mar 21 2017, 1:07 PM
andrew.w.kaylor added inline comments to D31182: [InstCombine] fadd double (sitofp x), y check that the promotion is valid .
Mar 21 2017, 1:00 PM

Mar 13 2017

andrew.w.kaylor abandoned D30662: Update clang filtering for mxcsr.
Mar 13 2017, 11:50 AM
andrew.w.kaylor abandoned D30661: [x86] Split MXCSR into two pseudo-registers.

It looks like I need to rethink this.

Mar 13 2017, 11:49 AM

Mar 7 2017

andrew.w.kaylor added a comment to D30661: [x86] Split MXCSR into two pseudo-registers.

OK, so maybe that puts me back in the position of needing to find a way to conditionally add the MXCSR use/def information only when the strict semantics are required, in which case there would be no significant advantage to splitting the register as I'm proposing here.

Mar 7 2017, 12:39 PM

Mar 6 2017

andrew.w.kaylor added a comment to D30661: [x86] Split MXCSR into two pseudo-registers.

It's not just DCE which is problematic... we could also sink a floating-point operation past a read from the status register. I suppose you could prevent that particular problem by making reads from the status register write to the control bits, but that causes its own problems.

Mar 6 2017, 10:21 PM
andrew.w.kaylor added a comment to D30661: [x86] Split MXCSR into two pseudo-registers.

A subsequent patch will update floating point operations to add an implicit use of the control bits and an implicit def of the status bits

This seems kind of confusing... strict floating-point ops need to implicitly use and def the status bits, because the new value depends on the previous value. You can think of an FP operation as a logical OR acting on the status register. Many kinds of code motion are legal (e.g. you can reorder FP operations with each other, or hoist them out of loops). But if you omit the use, other optimizations won't work correctly; for example, dead code elimination will eliminate FP operations which have a visible effect on the status register.

Given that, I'm not sure what splitting the status register buys you; I guess it becomes easier to check whether an instruction modifies the control bits?
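
The point about sticky status bits can be illustrated with a minimal C++ sketch. This is a toy model, not LLVM or x86 code: the flag constants and the runOps helper are hypothetical, and each FP operation is reduced to the set of flags it would raise. Every operation ORs its flags into the status word, so the new status depends on the previous one.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical sticky exception flags; the names and bit values are
// illustrative and are not taken from any real register definition.
constexpr std::uint32_t kInvalid   = 1u << 0;
constexpr std::uint32_t kDivByZero = 1u << 2;
constexpr std::uint32_t kInexact   = 1u << 5;

// Each element of opFlags is the set of flags one FP operation raises.
// Executing an operation ORs its flags into the running status word,
// which is why the operation both uses and defines the status bits.
std::uint32_t runOps(const std::vector<std::uint32_t> &opFlags,
                     std::uint32_t status = 0) {
  for (std::uint32_t flags : opFlags)
    status |= flags;
  return status;
}
```

Under this model, runOps({kInexact, kDivByZero}) and runOps({kDivByZero, kInexact}) produce the same status word, so reordering is harmless. Dropping the divide-by-zero operation produces a different status word, which is the visible effect that dead code elimination would have.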

Mar 6 2017, 4:12 PM
andrew.w.kaylor added a comment to D30662: Update clang filtering for mxcsr.

Is it possible to add a test case (possibly in CodeGen)?

Mar 6 2017, 1:36 PM
andrew.w.kaylor added a dependency for D30662: Update clang filtering for mxcsr: D30661: [x86] Split MXCSR into two pseudo-registers.
Mar 6 2017, 9:44 AM
andrew.w.kaylor added a dependent revision for D30661: [x86] Split MXCSR into two pseudo-registers: D30662: Update clang filtering for mxcsr.
Mar 6 2017, 9:44 AM
andrew.w.kaylor created D30662: Update clang filtering for mxcsr.
Mar 6 2017, 9:44 AM
andrew.w.kaylor created D30661: [x86] Split MXCSR into two pseudo-registers.
Mar 6 2017, 9:41 AM

Feb 13 2017

andrew.w.kaylor created D29903: [X86] Add MXCSR register.
Feb 13 2017, 12:16 PM

Jan 17 2017

andrew.w.kaylor added a comment to D27028: Add intrinsics for constrained floating point operations.

Is there anything left blocking this patch?

Jan 17 2017, 11:10 AM

Jan 11 2017

andrew.w.kaylor updated the diff for D27028: Add intrinsics for constrained floating point operations.

-Consolidated examination of StrictFP node opcodes.

Jan 11 2017, 12:47 PM

Jan 10 2017

andrew.w.kaylor updated the diff for D27028: Add intrinsics for constrained floating point operations.

-Combined uses of string literals for FP rounding mode and exception behavior.
-Removed extension to StringSwitch since it is no longer needed.
-Added documentation explaining that the rounding mode argument isn't used by the frem intrinsic.

Jan 10 2017, 5:32 PM

Jan 5 2017

andrew.w.kaylor added a comment to D28363: [LICM] Small update to note changes made in hoistRegion.

(Out of curiosity - you mean r290726 actually has an observable effect on compile time?)

Jan 5 2017, 11:01 AM
andrew.w.kaylor retitled D28363: [LICM] Small update to note changes made in hoistRegion (previous title was empty).
Jan 5 2017, 9:32 AM

Jan 4 2017

andrew.w.kaylor added a comment to D27028: Add intrinsics for constrained floating point operations.

When frem has a meaningful result, it is always exact, so perhaps we should omit the rounding behavior argument? I think we still need a constrained frem intrinsic, though, to handle the exceptional behavior in cases such as "frem inf, x" and "frem x, 0".
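
A small C++ sketch of the cases mentioned above, using std::fmod as a stand-in for frem (an assumption for illustration; this is ordinary C++, not LLVM IR, and remainderOf is a hypothetical helper). Ordinary inputs give an exact remainder, so no rounding is involved, while an infinite dividend or a zero divisor yields NaN.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper: computes the floating-point remainder of x / y.
// The volatile copies discourage the compiler from folding the call
// at compile time, so the library routine is actually evaluated.
double remainderOf(double x, double y) {
  volatile double vx = x;
  volatile double vy = y;
  return std::fmod(vx, vy);
}
```

For example, remainderOf(5.5, 2.0) is exactly 1.5, whereas remainderOf(infinity, 2.0) and remainderOf(1.0, 0.0) both return NaN. Those are the exceptional cases a constrained frem would still need to model, even without a rounding argument.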

Jan 4 2017, 5:40 PM

Jan 3 2017

andrew.w.kaylor added a comment to D27028: Add intrinsics for constrained floating point operations.

FWIW, rounding controls are needed for llvm.fma.*, llvm.fmuladd.*, and llvm.sqrt.*

Jan 3 2017, 10:59 AM
andrew.w.kaylor added a comment to D27028: Add intrinsics for constrained floating point operations.

Ping.

Jan 3 2017, 10:06 AM

Dec 21 2016

andrew.w.kaylor accepted D27965: Update mailing list post URL.

It looks to me like you are probably correct about the intended mailing list post, but it's still not particularly helpful. It would be nice to have a reference to a primary source, but this is the best I could find:

Dec 21 2016, 12:37 PM

Dec 16 2016

andrew.w.kaylor accepted D27508: [CodeGen] Make MachineInstr::isIdenticalTo() symmetric..

I have one nitpick about a typo in a comment, but otherwise this looks good to me.

Dec 16 2016, 10:59 AM

Dec 15 2016

andrew.w.kaylor added a comment to D27028: Add intrinsics for constrained floating point operations.

Ping

Dec 15 2016, 2:45 PM
andrew.w.kaylor added a comment to D25848: [PM/OptBisect] Don't crash with some particular values of -opt-bisect-limit=.

I consider this reasonable, but I don't feel qualified enough to review as I'm not a LICM expert, so I'm CC:ing @danielcdh. I'm slightly worried about other passes hitting a similar issue (expecting some state to be freed and asserting in doFinalization()).

Dec 15 2016, 2:43 PM
andrew.w.kaylor added a comment to D25848: [PM/OptBisect] Don't crash with some particular values of -opt-bisect-limit=.

What do you think of this change instead?

Index: lib/Transforms/Scalar/LICM.cpp
===================================================================
--- lib/Transforms/Scalar/LICM.cpp	(revision 289480)
+++ lib/Transforms/Scalar/LICM.cpp	(working copy)
@@ -124,8 +124,13 @@
   }
Dec 15 2016, 12:34 PM

Dec 14 2016

andrew.w.kaylor added a comment to D25848: [PM/OptBisect] Don't crash with some particular values of -opt-bisect-limit=.

Sorry this is moving so slowly. I just reproduced the problem with the test file from your last comment. I'll take a closer look and see if I can understand what's going on.

Dec 14 2016, 12:29 PM
andrew.w.kaylor requested changes to D27508: [CodeGen] Make MachineInstr::isIdenticalTo() symmetric..
Dec 14 2016, 12:22 PM

Dec 12 2016

andrew.w.kaylor retitled D27693: [WinEH] Avoid holding references to BlockColor (DenseMap) entries while inserting new elements (previous title was empty).
Dec 12 2016, 5:50 PM

Dec 9 2016

andrew.w.kaylor updated the diff for D27582: Avoid infinite loops in branch folding.

I moved the isEHPad() check, as suggested.

Dec 9 2016, 4:46 PM
andrew.w.kaylor added inline comments to D27582: Avoid infinite loops in branch folding.
Dec 9 2016, 12:03 PM

Dec 8 2016

andrew.w.kaylor updated subscribers of D27028: Add intrinsics for constrained floating point operations.
Dec 8 2016, 11:56 AM
andrew.w.kaylor retitled D27582: Avoid infinite loops in branch folding (previous title was empty).
Dec 8 2016, 11:02 AM

Dec 5 2016

andrew.w.kaylor added a comment to D27028: Add intrinsics for constrained floating point operations.

Correct handling of FTZ is required for correctness, so it isn't appropriate to treat it as an optimization hint placed with the fast-math flags.

Dec 5 2016, 5:50 PM
andrew.w.kaylor updated the diff for D27028: Add intrinsics for constrained floating point operations.

Fixed typos and style issues.
Added FIXME comments as requested.

Dec 5 2016, 5:39 PM
andrew.w.kaylor added a comment to D27028: Add intrinsics for constrained floating point operations.

Having the llvm_unreachable right after the StringSwitch should achieve the same thing.

I'm not sure I understand what you're suggesting here. If I do this:

return StringSwitch<RoundingMode>(RoundingArg)
  .Case("round.dynamic",    rmDynamic)
  .Case("round.tonearest",  rmToNearest)
  .Case("round.downward",   rmDownward)
  .Case("round.upward",     rmUpward)
  .Case("round.towardzero", rmTowardZero);
llvm_unreachable("Unexpected rounding mode argument in FP intrinsic!");

the llvm_unreachable statement is purely unreachable because the implicit default from StringSwitch will assert or dereference a null pointer. The llvm_unreachable in this case effectively becomes a comment.

Not really: in an optimized build it means that the default case is unreachable. The assertion does not exist there, and the optimizer can drop the nullptr dereference.
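
One way to avoid relying on a statement placed after the return is to handle the unmatched case inside the lookup itself. The sketch below does this in plain standard C++ rather than with StringSwitch. The parseRoundingMode function and its std::optional return type are assumptions for illustration; the enumerator names and strings come from the snippet above.

```cpp
#include <optional>
#include <string_view>

enum RoundingMode {
  rmDynamic,
  rmToNearest,
  rmDownward,
  rmUpward,
  rmTowardZero
};

// Hypothetical stand-in for the StringSwitch chain. The fallback is an
// explicit, reachable return, so the caller decides how to treat an
// unexpected argument instead of depending on an implicit default.
std::optional<RoundingMode> parseRoundingMode(std::string_view arg) {
  if (arg == "round.dynamic")    return rmDynamic;
  if (arg == "round.tonearest")  return rmToNearest;
  if (arg == "round.downward")   return rmDownward;
  if (arg == "round.upward")     return rmUpward;
  if (arg == "round.towardzero") return rmTowardZero;
  return std::nullopt; // explicit default for unrecognized strings
}
```

A caller that has already validated the argument elsewhere (for example in a verifier) can treat an empty result as a programming error at the call site.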

Dec 5 2016, 5:30 PM
andrew.w.kaylor added inline comments to D27028: Add intrinsics for constrained floating point operations.
Dec 5 2016, 5:19 PM
andrew.w.kaylor added a comment to D27028: Add intrinsics for constrained floating point operations.

Having the llvm_unreachable right after the StringSwitch should achieve the same thing.

Dec 5 2016, 3:43 PM
andrew.w.kaylor added a comment to D27028: Add intrinsics for constrained floating point operations.

I see. I still think it's worth adding UnreachableDefault() instead of just a comment explaining that other values will assert/crash. I'm checking the legal values in the verifier, so this shouldn't be an issue.

Dec 5 2016, 2:51 PM
andrew.w.kaylor added inline comments to D27028: Add intrinsics for constrained floating point operations.
Dec 5 2016, 2:41 PM
andrew.w.kaylor added inline comments to D27028: Add intrinsics for constrained floating point operations.
Dec 5 2016, 12:03 PM
andrew.w.kaylor added inline comments to D27028: Add intrinsics for constrained floating point operations.
Dec 5 2016, 11:08 AM

Nov 28 2016

andrew.w.kaylor updated the diff for D27028: Add intrinsics for constrained floating point operations.

I re-wrote the ISel code to introduce a pseudo-instruction for the strict variants of FP operations, which is mutated directly to a normal FP node just before instruction selection. I removed the SDNodeFlag extensions I had added in my original patch because they aren't needed in this implementation. Some variant of that code will likely need to be re-introduced at some point, particularly to handle FTZ rounding modes.

Nov 28 2016, 1:46 PM

Nov 23 2016

andrew.w.kaylor added a comment to D27028: Add intrinsics for constrained floating point operations.

It's not handled in general llvm. We currently have a subtarget feature for f32/f64 denormal support in the default rounding mode

Nov 23 2016, 1:10 PM
andrew.w.kaylor added a comment to D27028: Add intrinsics for constrained floating point operations.

Is flush-to-zero currently handled with function attributes?

Nov 23 2016, 12:58 PM
andrew.w.kaylor added a comment to D27028: Add intrinsics for constrained floating point operations.

As I mentioned at the meeting I think these need a way to control whether denormals are flushed or not

You mean that there should be a way to control this on a per-operation basis, or there should be some way to represent that the user might be changing some thread state that controls how this is done?

Nov 23 2016, 11:25 AM
andrew.w.kaylor added a comment to D27028: Add intrinsics for constrained floating point operations.

Based on our conversation at the dev meeting, here's how I thought this would work:

  1. Introduce target-independent chain-carrying nodes to represent these operations. For argument's sake, STRICT_FADD, etc.
  2. Since nothing in the SDAG knows about what these nodes do, there's no problem with optimizations doing bad things.
  3. You'd do something minimal in SelectionDAGISel::DoInstructionSelection() around the call to:

    Select(Node);

    so that it would become:

    bool IsStrictFPOp = isStrictFPOp(Node);
    if (IsStrictFPOp)
      mutateStrictFPToFP(Node); // STRICT_FADD -> FADD, etc.

    Select(Node);

    if (IsStrictFPOp && !TLI->addStrictFPRegDeps(Node))
      report_fatal_error("Could not add strict FP reg deps");

    and then you'd be done. Obviously this is somewhat hand wavy, but if there are complexities I'm overlooking, I'd like to understand them.
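
The three steps above can be sketched as a self-contained toy in C++. Nothing here uses the real SelectionDAG types: Opcode, Node, select, and the free-function addStrictFPRegDeps are hypothetical stand-ins, and the exception replaces report_fatal_error so the example compiles on its own.

```cpp
#include <stdexcept>

// Toy opcodes; only enough to show the STRICT_* -> normal mapping.
enum class Opcode { FADD, FSUB, STRICT_FADD, STRICT_FSUB };

// Toy node: an opcode plus a marker for "strict register deps added".
struct Node {
  Opcode op;
  bool hasStrictRegDeps = false;
};

bool isStrictFPOp(const Node &n) {
  return n.op == Opcode::STRICT_FADD || n.op == Opcode::STRICT_FSUB;
}

// Rewrites a strict opcode to its ordinary counterpart in place.
void mutateStrictFPToFP(Node &n) {
  switch (n.op) {
  case Opcode::STRICT_FADD: n.op = Opcode::FADD; break;
  case Opcode::STRICT_FSUB: n.op = Opcode::FSUB; break;
  default: break; // non-strict opcodes are left untouched
  }
}

// Placeholder for target-specific selection; does nothing in this toy.
void select(Node &) {}

// Stand-in for the target hook; here it always succeeds.
bool addStrictFPRegDeps(Node &n) {
  n.hasStrictRegDeps = true;
  return true;
}

// Mirrors the proposed flow: remember strictness, mutate, select,
// then attach the extra dependencies only for formerly strict nodes.
void selectNode(Node &n) {
  const bool isStrict = isStrictFPOp(n);
  if (isStrict)
    mutateStrictFPToFP(n);
  select(n);
  if (isStrict && !addStrictFPRegDeps(n))
    throw std::runtime_error("Could not add strict FP reg deps");
}
```

The strictness check has to be captured before the mutation, because afterwards the node is indistinguishable from an ordinary FADD.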
Nov 23 2016, 10:49 AM

Nov 22 2016

andrew.w.kaylor retitled D27028: Add intrinsics for constrained floating point operations (previous title was empty).
Nov 22 2016, 5:32 PM

Nov 21 2016

andrew.w.kaylor updated the diff for D26485: Add IntrInaccessibleMemOnly property for intrinsics.

Removed IntrInaccessibleMemOnly from llvm.assume.

Nov 21 2016, 5:09 PM
andrew.w.kaylor added a comment to D26485: Add IntrInaccessibleMemOnly property for intrinsics.

Ping.

Nov 21 2016, 11:02 AM

Nov 10 2016

andrew.w.kaylor updated the diff for D26485: Add IntrInaccessibleMemOnly property for intrinsics.

I added the IntrInaccessibleMemOrArgMemOnly property, and updated the llvm.assume intrinsic to use IntrInaccessibleMemOnly.

Nov 10 2016, 10:48 AM

Nov 9 2016

andrew.w.kaylor retitled D26485: Add IntrInaccessibleMemOnly property for intrinsics (previous title was empty).
Nov 9 2016, 4:53 PM

Nov 8 2016

andrew.w.kaylor added a comment to D26382: [BasicAA] Teach BasicAA to handle the inaccessiblememonly and inaccessiblemem_or_argmemonly attributes.

This will work, but we might want a separate tracking state for this (because otherwise we won't be able to infer readonly for any functions with FP operations, which is going to be very unfortunate). We might do this as a later enhancement, however.

Nov 8 2016, 1:26 PM

Nov 7 2016

andrew.w.kaylor retitled D26382: [BasicAA] Teach BasicAA to handle the inaccessiblememonly and inaccessiblemem_or_argmemonly attributes (previous title was empty).
Nov 7 2016, 6:12 PM