andrew.w.kaylor (Andy Kaylor)
User

Projects

User does not belong to any projects.

User Details

User Since
Jan 2 2013, 1:50 PM (496 w, 16 h)

Recent Activity

Tue, Jul 5

andrew.w.kaylor requested review of D129154: Use TuningFastScalarFSQRT for default X86 tuning.
Tue, Jul 5, 10:54 AM · Restricted Project, Restricted Project

Apr 12 2022

andrew.w.kaylor added inline comments to D123531: [GlobalsModRef][FIX] Ensure we honor synchronizing effects of intrinsics.
Apr 12 2022, 5:29 PM · Restricted Project, Restricted Project

Mar 29 2022

andrew.w.kaylor accepted D69798: Implement inlining of strictfp functions.

lgtm

Mar 29 2022, 9:39 AM · Restricted Project, Restricted Project
andrew.w.kaylor accepted D69562: Mapping of FP operations to constrained intrinsics.

This looks good. I have just a couple of minor nits.

Mar 29 2022, 9:23 AM · Restricted Project, Restricted Project

Mar 25 2022

andrew.w.kaylor added inline comments to D118426: [InstCombine] Remove side effect of replaced constrained intrinsics.
Mar 25 2022, 5:23 PM · Restricted Project, Restricted Project
andrew.w.kaylor added a comment to D69798: Implement inlining of strictfp functions.

@sepavloff I apologize for having lost track of this for so long. Do you have time to rebase this and the dependent patch?

Mar 25 2022, 2:20 PM · Restricted Project, Restricted Project
Herald added a project to D69798: Implement inlining of strictfp functions: Restricted Project.

I had forgotten that this patch never landed, but I was investigating a bug yesterday that I think this will help with (https://github.com/llvm/llvm-project/issues/48669).

Mar 25 2022, 11:00 AM · Restricted Project, Restricted Project

Mar 22 2022

andrew.w.kaylor added inline comments to D122155: Add warning when eval-method is set in the presence of value unsafe floating-point calculations..
Mar 22 2022, 1:32 PM · Restricted Project, Restricted Project

Mar 11 2022

andrew.w.kaylor accepted D121410: Have cpu-specific variants set 'tune-cpu' as an optimization hint.

This looks good to me. Thanks for the patch!

Mar 11 2022, 11:02 AM · Restricted Project, Restricted Project, Restricted Project

Mar 10 2022

andrew.w.kaylor added inline comments to D121410: Have cpu-specific variants set 'tune-cpu' as an optimization hint.
Mar 10 2022, 3:21 PM · Restricted Project, Restricted Project, Restricted Project
andrew.w.kaylor added a comment to D121410: Have cpu-specific variants set 'tune-cpu' as an optimization hint.

This example illustrates the problem this patch intends to fix: https://godbolt.org/z/j445sxPMc

Mar 10 2022, 2:53 PM · Restricted Project, Restricted Project, Restricted Project

Mar 8 2022

andrew.w.kaylor added a comment to D121122: Set FLT_EVAL_METHOD to -1 when fast-math is enabled..

The fix for the eval_method crash should be moved to a separate patch. Otherwise, this looks good. I have only minor comments.

Mar 8 2022, 10:21 AM · Restricted Project, Restricted Project

Mar 7 2022

andrew.w.kaylor added a comment to D120395: [X86] Prohibit arithmetic operations on type `__bfloat16`.

I don't agree. Unlike __fp16, __bf16 is simply an ARM-specific type.

Mar 7 2022, 2:41 PM · Restricted Project, Restricted Project, Restricted Project
andrew.w.kaylor added a comment to D120395: [X86] Prohibit arithmetic operations on type `__bfloat16`.

__m256bh should not have been a new type. It should have been an alias of __m256i. We don't have load/store intrinsics for __m256bh, so if you can even get the __m256bh type in and out of memory using load/store intrinsics, it is only because we allow lax vector conversions by default. -fno-lax-vector-conversions will probably break any code trying to load/store it using a load/store intrinsic. If __m256bh had been made a struct, as was proposed at one point, this would have been broken.

If we want __m256bh to be a unique type using __bf16, we must define load, store, and cast intrinsics for it. We would probably want insert/extract element intrinsics as well.

Mar 7 2022, 2:30 PM · Restricted Project, Restricted Project, Restricted Project

Mar 3 2022

andrew.w.kaylor added a comment to D120395: [X86] Prohibit arithmetic operations on type `__bfloat16`.

Good question! This actually falls within the scope of the ABI. Unfortunately, we don't have a BF16 ABI at present. We can't assume which physical registers the arguments are passed and returned in before we have such hardware. For example, ARM has a soft FP ABI that supports FP arithmetic operations and passes and returns arguments in integer registers. When we enabled an ISA set whose type doesn't have an ABI representation, e.g., F16C, we borrowed that concept. As a trade-off, we used an integer type rather than introducing a new IR type, since we don't need to support the arithmetic operations.

Mar 3 2022, 3:49 PM · Restricted Project, Restricted Project, Restricted Project

Mar 2 2022

Herald added a project to D120395: [X86] Prohibit arithmetic operations on type `__bfloat16`: Restricted Project.

There's a lot of churn around proposed "solutions" on this and related PR, but not a very clear analysis of what the problem we're trying to solve is.

Mar 2 2022, 6:21 PM · Restricted Project, Restricted Project, Restricted Project

Feb 25 2022

andrew.w.kaylor requested changes to D120411: [X86] Replace __m[128|256|512]bh with __m[128|256|512]i and mark the former deprecated.

Replacing __m128bh with __m128i does not prevent arithmetic operations on the type.

Feb 25 2022, 1:05 PM · Restricted Project
andrew.w.kaylor added inline comments to D120395: [X86] Prohibit arithmetic operations on type `__bfloat16`.
Feb 25 2022, 12:57 PM · Restricted Project, Restricted Project, Restricted Project
andrew.w.kaylor added a comment to D120395: [X86] Prohibit arithmetic operations on type `__bfloat16`.

Discussed with GCC folks. We think it's better to take the same approach as D120411 and replace it with short int.

Feb 25 2022, 11:07 AM · Restricted Project, Restricted Project, Restricted Project

Feb 24 2022

andrew.w.kaylor added a comment to D120395: [X86] Prohibit arithmetic operations on type `__bfloat16`.

These intrinsics pre-date the existence of the bfloat type in LLVM. To use bfloat we have to make __bf16 a legal type in C. This means we need to support loads, stores, and arguments of that type. I think that would create a bunch of backend complexity because we don't have good 16-bit load/store support to XMM registers. I think we only have a load that inserts into a specific element. It's doable, but I'm not sure what we gain from it.

Feb 24 2022, 11:29 AM · Restricted Project, Restricted Project, Restricted Project

Feb 23 2022

andrew.w.kaylor added a comment to D120395: [X86] Prohibit arithmetic operations on type `__bfloat16`.

Update LangRef. We use i16 type to represent bfloat16.

Feb 23 2022, 10:26 AM · Restricted Project, Restricted Project, Restricted Project

Jan 18 2022

andrew.w.kaylor accepted D114564: Fix the use of -fno-approx-func along with -Ofast or -ffast-math.

lgtm

Jan 18 2022, 10:46 AM · Restricted Project

Dec 13 2021

andrew.w.kaylor added reviewers for D115657: [Nomination] Adding Intel representatives to security group: ab, apilipenko, dim, emaste, george.burgess.iv, kristof.beyls, mattdr, nikhgupt, ojhunt, probinson, peter.smith, pietroalbini, serge-sans-paille, Shayne, steveklabnik, tpenge.
Dec 13 2021, 11:59 AM · Restricted Project
andrew.w.kaylor requested review of D115657: [Nomination] Adding Intel representatives to security group.
Dec 13 2021, 11:55 AM · Restricted Project

Dec 3 2021

andrew.w.kaylor added a comment to D96646: [NFC] update LangRef for D88645.

I'm not trying to be difficult, but I genuinely still don't understand the additional arguments pointer. Is it intended to allow proprietary extensions? Is there an example somewhere?

If these intrinsics are meant as a general mechanism to enable arbitrary communication between custom front ends and custom optimization passes, that's fine. I'd just like to see something explicitly explaining that.

The pointer is to a global struct containing the provided arguments. I use it to communicate from source code to custom passes, but the front end doesn't need to be custom, since the code to generate these extra arguments is in clang trunk.
Here is a basic example: https://godbolt.org/z/Tavcv9.
How the passes use this information is not defined; it can be used for many things.
The major benefit I see is that it can pass arbitrary information and works well with template and constexpr programming, but the front end does not do any semantic checking on the meaning of the information.

Dec 3 2021, 12:13 PM · Restricted Project

Nov 24 2021

andrew.w.kaylor added a comment to D114564: Fix the use of -fno-approx-func along with -Ofast or -ffast-math.

Thanks for the patch! This looks mostly good. I have just a few suggestions.

Nov 24 2021, 3:08 PM · Restricted Project

Nov 2 2021

andrew.w.kaylor added a comment to D112760: Require 'contract' fast-math flag for FMA generation.

This attribute has never been handled consistently in the backend. We have attributes approximately corresponding to each of the individual fast math flags, so the mirror of the IR controls would be to just set all of those attributes. I've never been sure how unsafe-fp-math fits in here, other than just a convenience to avoid setting all the others. I would also like to get rid of it, since you should be checking the individual properties anyway.

Nov 2 2021, 3:39 PM · Restricted Project
andrew.w.kaylor added inline comments to D112760: Require 'contract' fast-math flag for FMA generation.
Nov 2 2021, 3:02 PM · Restricted Project
andrew.w.kaylor added inline comments to D112760: Require 'contract' fast-math flag for FMA generation.
Nov 2 2021, 11:34 AM · Restricted Project

Nov 1 2021

andrew.w.kaylor added a comment to D112760: Require 'contract' fast-math flag for FMA generation.

I would rephrase the description as removing the global flag for contraction

Nov 1 2021, 2:59 PM · Restricted Project

Oct 28 2021

andrew.w.kaylor requested review of D112760: Require 'contract' fast-math flag for FMA generation.
Oct 28 2021, 2:35 PM · Restricted Project

Sep 29 2021

andrew.w.kaylor abandoned D110589: [IntelJITListener] Generalize JIT listener test.

Instead of generalizing the test, I re-ordered the checks to match what's currently happening. It appears to be stable and the test was broken by an explicit change to the object symbol ordering.

Sep 29 2021, 4:53 PM · Restricted Project
andrew.w.kaylor committed rGe49c0c5100b9: [IntelJITListener] Fix order in JitListener/multiple.ll (authored by andrew.w.kaylor).
[IntelJITListener] Fix order in JitListener/multiple.ll
Sep 29 2021, 4:49 PM

Sep 27 2021

andrew.w.kaylor requested review of D110589: [IntelJITListener] Generalize JIT listener test.
Sep 27 2021, 2:14 PM · Restricted Project

Sep 7 2021

andrew.w.kaylor committed rG34528c32d23f: Copy Elementtype Attribute to IR at Link step (authored by andrew.w.kaylor).
Copy Elementtype Attribute to IR at Link step
Sep 7 2021, 11:47 AM
andrew.w.kaylor closed D108796: Copy Elementtype Attribute to IR at Link step.
Sep 7 2021, 11:47 AM · Restricted Project

Aug 23 2021

andrew.w.kaylor added a comment to D105895: [FPEnv][EarlyCSE] Add support for CSE of constrained FP intrinsics.

This patch may be overkill. If the reading of the floating point status register is restricted to "fpexcept.strict" then perhaps we don't need all of this code. But if we want to allow checking the status register after "fpexcept.maytrap" instructions then we do need all this. The LangRef restricts those reads to "fpexcept.strict". Are we certain we want to stick with that language?

Aug 23 2021, 3:06 PM · Restricted Project
andrew.w.kaylor added inline comments to D106362: [FPEnv][InstSimplify] Enable more folds for constrained fadd.
Aug 23 2021, 1:05 PM · Restricted Project

Aug 5 2021

andrew.w.kaylor added a comment to D74436: Change clang option -ffp-model=precise to select ffp-contract=on.

FWIW, fp-contract=on has been the documented default for clang since version 5.

Aug 5 2021, 10:13 AM · Restricted Project

Aug 4 2021

andrew.w.kaylor accepted D104551: Delay initialization of OptBisect.

lgtm

Aug 4 2021, 10:13 AM · Restricted Project

Jul 29 2021

andrew.w.kaylor added a comment to D104551: Delay initialization of OptBisect.

Yes, that's exactly what happens. The project uses LLVM as a code generator for multiple targets. The OptBisector object is constructed the first time it's queried by shouldRunPass, which happened during the code generation for the first target. We set up the second target by passing some extra flags to ParseCommandLineOptions, but passing -opt-bisect-limit at this time no longer has any effect (i.e. the setting from the construction time remains).

Jul 29 2021, 6:11 PM · Restricted Project
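
The initialization-order problem described in the D104551 comment above can be modeled with a small toy sketch. This is not LLVM's OptBisect code: the `CommandLine` and `LazyBisector` classes and all of their members are hypothetical stand-ins, and the only assumption carried over from the comment is that the limit is captured the first time the bisector is queried.

```python
class CommandLine:
    """Hypothetical stand-in for a global command-line option registry."""

    def __init__(self):
        # -1 means "no limit", i.e. every pass runs.
        self.options = {"opt-bisect-limit": -1}

    def parse(self, args):
        # Accepts arguments of the form "-name=value".
        for arg in args:
            name, _, value = arg.lstrip("-").partition("=")
            self.options[name] = int(value)


class LazyBisector:
    """Toy bisector that reads its limit only on the first query."""

    def __init__(self, command_line):
        self._command_line = command_line
        self._limit = None  # not yet captured
        self._count = 0

    def should_run_pass(self):
        if self._limit is None:
            # The limit is captured once; later option changes are not seen.
            self._limit = self._command_line.options["opt-bisect-limit"]
        self._count += 1
        return self._limit < 0 or self._count <= self._limit


cl = CommandLine()
bisector = LazyBisector(cl)

# First target: the first query captures the "no limit" setting.
first_target_ran = bisector.should_run_pass()

# Second target: the option is parsed again, but the captured value remains.
cl.parse(["-opt-bisect-limit=0"])
second_target_ran = bisector.should_run_pass()
```

In this model the pass still runs for the second target even though a limit of 0 was requested, which mirrors the "no longer has any effect" behavior reported in the comment.
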
andrew.w.kaylor committed rGb4d945bacdaf: Fixing an infinite loop problem in InstCombine (authored by andrew.w.kaylor).
Fixing an infinite loop problem in InstCombine
Jul 29 2021, 1:05 PM
andrew.w.kaylor closed D106950: Fixing an infinite loop problem in InstCombine.
Jul 29 2021, 1:05 PM · Restricted Project

Jul 27 2021

andrew.w.kaylor added a comment to D104551: Delay initialization of OptBisect.

Sorry for the non-response, I hadn't noticed this review earlier.

Jul 27 2021, 2:14 PM · Restricted Project
andrew.w.kaylor committed rGb373b5990d59: Enabling the copy-constant-to-alloca optimization in more instances (authored by andrew.w.kaylor).
Enabling the copy-constant-to-alloca optimization in more instances
Jul 27 2021, 10:23 AM
andrew.w.kaylor closed D106573: Enabling the copy-constant-to-alloca optimization in more instances.
Jul 27 2021, 10:23 AM · Restricted Project

Jul 26 2021

andrew.w.kaylor accepted D106573: Enabling the copy-constant-to-alloca optimization in more instances.

lgtm

Jul 26 2021, 3:35 PM · Restricted Project

Jun 30 2021

andrew.w.kaylor added a comment to D104935: Update the Polybench tests to check relative error.

I think having a more robust library for FP comparisons in the test suite would be great. In fact, I thought about moving even the simple comparison I did in this patch to some common location where it could be included by other tests. Maybe that's a good starting point for an incremental change in the direction you're suggesting. A robust library of such checks is beyond the scope of what I have time to do at the moment, but I would certainly support that direction.

Jun 30 2021, 7:37 AM
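
As a rough illustration of the kind of relative-error check discussed in the D104935 comment above, here is a minimal sketch. It is not the code from the patch: the function name, the default tolerance, and the handling of NaN, infinity, and zero are all assumptions made for the example.

```python
import math


def within_relative_error(expected, actual, rel_tol=1e-6):
    """Return True if `actual` matches `expected` within a relative tolerance.

    Hypothetical helper; the default tolerance is an arbitrary assumption.
    """
    # NaN never matches anything, including another NaN.
    if math.isnan(expected) or math.isnan(actual):
        return False
    # Infinities must match exactly (same sign).
    if math.isinf(expected) or math.isinf(actual):
        return expected == actual
    # Scale the tolerance by the larger magnitude of the two values.
    scale = max(abs(expected), abs(actual))
    if scale == 0.0:
        return True  # both values are zero
    return abs(expected - actual) <= rel_tol * scale
```

A check of this shape accepts small result differences, such as those introduced by FMA formation or reassociation, while still rejecting results that are clearly wrong.
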

Jun 25 2021

andrew.w.kaylor set the repository for D104935: Update the Polybench tests to check relative error to rT test-suite.
Jun 25 2021, 1:14 PM
andrew.w.kaylor requested review of D104935: Update the Polybench tests to check relative error.
Jun 25 2021, 11:17 AM

Jun 10 2021

andrew.w.kaylor committed rG41555eaf65b1: Preserve more MD_mem_parallel_loop_access and MD_access_group in SROA (authored by andrew.w.kaylor).
Preserve more MD_mem_parallel_loop_access and MD_access_group in SROA
Jun 10 2021, 3:47 PM
andrew.w.kaylor closed D103254: Preserve more MD_mem_parallel_loop_access and MD_access_group in SROA.
Jun 10 2021, 3:47 PM · Restricted Project

May 21 2021

andrew.w.kaylor added a comment to D102861: Suppress FP_CONTRACT due to planned command line changes.

This seems OK as a short term solution, but it is also problematic in that it prevents FMA from being used in these performance measurements. I understand that it's not trivial to allow FP-related error tolerance in the test, but (as a future patch) it would be nice to at least have a way to turn off the exact result checking on FP tests so that the performance could be measured with fast-math and fp-contract enabled, and that would require having a way to re-enable FMA for these tests.

May 21 2021, 11:49 AM

May 10 2021

andrew.w.kaylor committed rG7086025d6567: [Dependence Analysis] Enable delinearization of fixed sized arrays (authored by andrew.w.kaylor).
[Dependence Analysis] Enable delinearization of fixed sized arrays
May 10 2021, 10:31 AM
andrew.w.kaylor closed D101486: [Dependence Analysis] Enable delinearization of fixed sized arrays.
May 10 2021, 10:31 AM · Restricted Project

Apr 27 2021

andrew.w.kaylor committed rG0a82d885a4fc: [Dependence Analysis] Fix ExactSIV producing wrong analysis (authored by andrew.w.kaylor).
[Dependence Analysis] Fix ExactSIV producing wrong analysis
Apr 27 2021, 12:25 PM
andrew.w.kaylor closed D100331: [Dependence Analysis] Fix ExactSIV producing wrong analysis.
Apr 27 2021, 12:25 PM · Restricted Project

Apr 23 2021

andrew.w.kaylor added a comment to D100091: [X86] Fix wrong handle with "-mno-x87".

What is the usage model you’re trying to enable? This has been broken for a long time. Why is it important now?

I don't know any specific case for now. This was found by @andrew.w.kaylor. I will check with him.

Apr 23 2021, 11:02 AM · Restricted Project

Apr 6 2021

andrew.w.kaylor added a comment to D99675: [llvm][clang] Create new intrinsic llvm.arithmetic.fence to control FP optimization at expression level.

The expression “llvm.arith.fence(a * b) + c” means that “a * b” must happen before “+ c” and FMA guarantees that, but to prevent later optimizations from unpacking the FMA the correct transformation needs to be:

llvm.arith.fence(a * b) + c → llvm.arith.fence(FMA(a, b, c))

Does this actually block later transforms from unpacking the FMA? Maybe if the FMA isn't marked "fast"...

Apr 6 2021, 1:15 PM · Restricted Project
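
To show why it matters whether a later transform unpacks an FMA (the question raised in the D99675 comment above), the sketch below compares a multiply followed by an add (two roundings) with a fused multiply-add (one rounding). It does not model llvm.arith.fence itself. The function names are hypothetical, the fused result is emulated with exact rational arithmetic rather than a hardware FMA, and the emulation only handles finite inputs.

```python
from fractions import Fraction


def unfused_multiply_add(a, b, c):
    # The product is rounded to double, then the sum is rounded again.
    product = a * b
    return product + c


def fused_multiply_add(a, b, c):
    # Emulated FMA: compute a*b + c exactly, then round once to double.
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


# Inputs chosen so that a*b = 1 - 2**-54 exactly, which is not representable
# as a double and rounds to 1.0 under round-to-nearest-even.
a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27
c = -1.0

unfused = unfused_multiply_add(a, b, c)
fused = fused_multiply_add(a, b, c)
```

With these inputs the unfused form loses the low-order bits of the product before the addition, so the two forms produce different values. An optimization that silently converts one form into the other therefore changes the numeric result.
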

Mar 1 2021

andrew.w.kaylor added a comment to D96646: [NFC] update LangRef for D88645.

I'm not trying to be difficult, but I genuinely still don't understand the additional arguments pointer. Is it intended to allow proprietary extensions? Is there an example somewhere?

Mar 1 2021, 10:38 AM · Restricted Project

Feb 22 2021

andrew.w.kaylor committed rG9a827906cb95: Add auto-upgrade support for annotation intrinsics (authored by andrew.w.kaylor).
Add auto-upgrade support for annotation intrinsics
Feb 22 2021, 3:43 PM
andrew.w.kaylor closed D95993: Add auto-upgrade support for annotation intrinsics.
Feb 22 2021, 3:42 PM · Restricted Project
andrew.w.kaylor updated the diff for D95993: Add auto-upgrade support for annotation intrinsics.

Addressed review feedback

Feb 22 2021, 2:32 PM · Restricted Project
andrew.w.kaylor added a comment to D95993: Add auto-upgrade support for annotation intrinsics.

I just checked, and it appears the langref still only documents the 4 argument variants (https://llvm.org/docs/LangRef.html#llvm-ptr-annotation-intrinsic). I think those need to be updated as well?

Feb 22 2021, 1:06 PM · Restricted Project
andrew.w.kaylor added a comment to D95993: Add auto-upgrade support for annotation intrinsics.

Ping

Feb 22 2021, 11:39 AM · Restricted Project

Feb 17 2021

andrew.w.kaylor updated the diff for D95993: Add auto-upgrade support for annotation intrinsics.

Removed unnecessary bitcast

Feb 17 2021, 5:41 PM · Restricted Project
andrew.w.kaylor added inline comments to D95993: Add auto-upgrade support for annotation intrinsics.
Feb 17 2021, 5:37 PM · Restricted Project

Feb 16 2021

andrew.w.kaylor added inline comments to D96646: [NFC] update LangRef for D88645.
Feb 16 2021, 10:49 AM · Restricted Project

Feb 12 2021

andrew.w.kaylor added a comment to D95993: Add auto-upgrade support for annotation intrinsics.

@pcc CODE_OWNERS.txt says that you own the bitcode. Can you help with this review?

Feb 12 2021, 5:38 PM · Restricted Project
andrew.w.kaylor added a reviewer for D95993: Add auto-upgrade support for annotation intrinsics: pcc.
Feb 12 2021, 5:37 PM · Restricted Project

Feb 3 2021

andrew.w.kaylor requested review of D95993: Add auto-upgrade support for annotation intrinsics.
Feb 3 2021, 6:22 PM · Restricted Project

Jun 12 2020

andrew.w.kaylor accepted D70096: [strictfp] Replace dangling strictfp attrs with nobuiltin.

lgtm

Jun 12 2020, 12:02 PM · Restricted Project

Jun 9 2020

andrew.w.kaylor added a comment to D72425: [OptRemark] RFC: Introduce a message table for OptRemarks.

Sorry for having let this drop for so long. Some other priorities came up, but I am still interested in seeing this through.

Jun 9 2020, 1:14 PM · Restricted Project

May 19 2020

andrew.w.kaylor added inline comments to D64193: [PowerPC] Add exception constraint to FP rounding operations.
May 19 2020, 9:49 AM · Restricted Project

May 13 2020

andrew.w.kaylor added a comment to D79760: [WinEH64] Fix a crush issue when c++ exception nested in a particular form..

@andrew.w.kaylor I tested the 3 cases below and got the runtime results for them with different compilers.

May 13 2020, 11:25 AM · Restricted Project

May 12 2020

andrew.w.kaylor added a reviewer for D79760: [WinEH64] Fix a crush issue when c++ exception nested in a particular form.: JosephTremoulet.
May 12 2020, 6:20 PM · Restricted Project
andrew.w.kaylor added a comment to D79760: [WinEH64] Fix a crush issue when c++ exception nested in a particular form..

I don't understand why this difference exists. Beyond just trying to reproduce what MSVC does, can you explain the difference?

May 12 2020, 6:20 PM · Restricted Project

Apr 22 2020

andrew.w.kaylor added a comment to D27028: Add intrinsics for constrained floating point operations.

@andrew.w.kaylor I went through the mailing list thread regarding this change and saw "Eventually, we’ll want to go back and teach specific optimizations to understand the intrinsics so that where possible optimizations can be performed in a manner consistent with dynamic rounding modes and strict exception handling.".
Do you have any references or plans for teaching specific optimizations about these intrinsics?

Apr 22 2020, 2:09 PM

Mar 30 2020

andrew.w.kaylor added inline comments to D75670: [FPEnv] Intrinsic llvm.roundeven.
Mar 30 2020, 10:16 AM · Restricted Project

Mar 26 2020

andrew.w.kaylor accepted D72930: [FEnv] Constfold some unary constrained operations.

Sorry for the delay in approval.

Mar 26 2020, 2:08 PM · Restricted Project

Mar 12 2020

andrew.w.kaylor added a comment to D72675: [Clang][Driver] Fix -ffast-math/-ffp-contract interaction.

I may be wrong, but I suspect those failures aren't actually due to the fact
that we pessimize optimizations with this change, but that the whole execution
just fails. Can you try running test-suite locally? Do tests themselves actually pass,
ignoring the question of their performance?

Mar 12 2020, 3:12 PM · Restricted Project
andrew.w.kaylor accepted D69891: [VP,Integer,#1] Vector-predicated integer intrinsics.

OK. Since the behavior for out-of-range evl is target-dependent, undefined makes sense.

Mar 12 2020, 10:51 AM · Restricted Project

Mar 11 2020

andrew.w.kaylor added a comment to D75935: Add RET-hardening Support to X86 to mitigate Load Value Injection (LVI) [3/6].

Can you use the "Edit Related Revisions" link to set the parent/child relationships of these patches and put "[1/5]" in the titles?

Mar 11 2020, 12:29 PM · Restricted Project
andrew.w.kaylor added a comment to D75939: [x86][seses] Introduce SESES pass for LVI.

What is the intention of this set of patches in relation to D75938? It was unclear to me whether you intended to commit this implementation or were just offering it as an alternative for discussion.

Mar 11 2020, 11:56 AM · Restricted Project
andrew.w.kaylor added a comment to D69891: [VP,Integer,#1] Vector-predicated integer intrinsics.

I'm satisfied with the functionality, but I'm not sure about the intrinsics having undefined behavior outside the [0, W] range. The way you've implemented it, it seems like the behavior would be predictable. If the evl argument is outside that range, it is ignored. Applying an unsigned value greater than W using the "%mask AND %EVLmask" also has this effect. Why not just make that the defined behavior?

Mar 11 2020, 10:09 AM · Restricted Project
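
The "%mask AND %EVLmask" behavior suggested in the D69891 comment above can be sketched as a lane-by-lane model. This illustrates the defined behavior proposed in the comment, not the semantics the patch adopted. The function names are hypothetical, and treating evl as an unsigned 32-bit value is an assumption made for the example.

```python
from typing import List


def evl_mask(evl: int, width: int) -> List[bool]:
    """Lane i is enabled when i < evl, with evl read as unsigned 32-bit."""
    unsigned_evl = evl & 0xFFFFFFFF
    return [lane < unsigned_evl for lane in range(width)]


def effective_mask(mask: List[bool], evl: int) -> List[bool]:
    """Combine the explicit mask with the EVL-derived mask (mask AND EVLmask)."""
    width = len(mask)
    return [m and e for m, e in zip(mask, evl_mask(evl, width))]


mask = [True, False, True, True]

in_range = effective_mask(mask, 2)      # only the first two lanes can be active
too_large = effective_mask(mask, 100)   # evl > W: the EVL mask enables all lanes
negative = effective_mask(mask, -1)     # wraps to a large unsigned value
```

Under this model an evl greater than the vector width W leaves the explicit mask unchanged, which is the "ignored" behavior the comment describes.
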

Mar 10 2020

andrew.w.kaylor added inline comments to D75932: Move RDF from Hexagon to Codegen [1/6].
Mar 10 2020, 1:39 PM · Restricted Project

Mar 4 2020

andrew.w.kaylor accepted D74500: clang: Treat ieee mode as the default for denormal-fp-math.

lgtm

Mar 4 2020, 5:59 PM · Restricted Project
andrew.w.kaylor added inline comments to D69891: [VP,Integer,#1] Vector-predicated integer intrinsics.
Mar 4 2020, 4:20 PM · Restricted Project

Feb 27 2020

andrew.w.kaylor added inline comments to D72930: [FEnv] Constfold some unary constrained operations.
Feb 27 2020, 1:47 PM · Restricted Project

Feb 25 2020

andrew.w.kaylor added inline comments to D69891: [VP,Integer,#1] Vector-predicated integer intrinsics.
Feb 25 2020, 3:14 PM · Restricted Project
andrew.w.kaylor added a comment to D72930: [FEnv] Constfold some unary constrained operations.

I may have overstated. The exception behavior of the

Feb 25 2020, 11:24 AM · Restricted Project

Feb 24 2020

andrew.w.kaylor added a comment to D72425: [OptRemark] RFC: Introduce a message table for OptRemarks.

Any more feedback? Should I proceed in this direction?

Feb 24 2020, 4:27 PM · Restricted Project
andrew.w.kaylor added inline comments to D72930: [FEnv] Constfold some unary constrained operations.
Feb 24 2020, 10:48 AM · Restricted Project

Feb 21 2020

andrew.w.kaylor added a comment to D63916: [PowerPC] Add exception constraint to FP arithmetic.

I don't know enough about PowerPC to review the check in the tests, but the basic constrained intrinsic handling looks correct.

Feb 21 2020, 2:30 PM · Restricted Project
andrew.w.kaylor added a comment to D74712: Remove *_finite library support, following upstream.

It's a little weird that we're still recognizing these as LibFuncs, but I guess that's necessary to keep the optimizations like constant folding working with them. I see that ConstantFolding already calls the non-finite-restricted versions of the functions when it's doing compile-time evaluation. It would be good to have a comment somewhere (probably in TargetLibraryInfo.cpp below the note about math-finite.h) explaining the handling of these functions.

Feb 21 2020, 10:23 AM · Restricted Project

Feb 19 2020

andrew.w.kaylor added a comment to D69891: [VP,Integer,#1] Vector-predicated integer intrinsics.

After I made that suggestion @craig.topper pointed out to me that the "experimental" qualifier has a tendency to never go away. See, for example, the vector add/reduce intrinsics. All this is to say that my suggestion is just a suggestion and I could be convinced to drop it if that is the consensus.

How about we stick with llvm.vp.fadd and go for llvm.vp.fadd.v2, etc when/if the intrinsics are updated?

Feb 19 2020, 3:59 PM · Restricted Project

Feb 14 2020

andrew.w.kaylor added a comment to D74436: Change clang option -ffp-model=precise to select ffp-contract=on.

You're right, -O0 shouldn't generate FMA. I'm preparing to revert this now -- just verifying the build.

Perhaps this should be
off with no optimization
on with -O1/-O2/-O3/-Os/-Oz
fast with fast math

Just a suggestion, I'm not sure whether that would be the best breakdown. Perhaps we can also see what the defaults are for GCC and unify with those?

Feb 14 2020, 3:07 PM · Restricted Project
andrew.w.kaylor added a comment to D57504: RFC: Prototype & Roadmap for vector predication in LLVM.

Perhaps this would be cleared up if I had a better understanding of what you were saying.

appreciated. if it's ok, can we schedule that for when it's part of a (new) proposal?

Feb 14 2020, 2:57 PM · Restricted Project, Restricted Project
andrew.w.kaylor added a comment to D57504: RFC: Prototype & Roadmap for vector predication in LLVM.

MMX does use the X87 FP register file, but they can't coexist at the same time. The first use of MMX marks the X87 register stack as occupied. I can't remember if it alters the data or not. An explicit emms instruction has to be executed at the end of the MMX code to erase the MMX data and make the registers usable for X87 again.

craig, thank you for correcting me. that makes a lot of sense as i can just imagine the x87 designers going "argh, how are we going to avoid a pipeline clash / mess, here" :)

you get the principle i am sure, even though MMX is not a suitable example.

Feb 14 2020, 2:20 PM · Restricted Project, Restricted Project
andrew.w.kaylor updated subscribers of D69891: [VP,Integer,#1] Vector-predicated integer intrinsics.
  1. Rename to llvm.experimental.vp.* - Inserting experimental is the preferred way to introduce new intrinsics until they are stable (for technical reasons brought up by @chandlerc - https://reviews.llvm.org/D69891#1852795 ).
Feb 14 2020, 11:26 AM · Restricted Project