Page MenuHomePhabricator

Please use GitHub pull requests for new patches. Avoid migrating existing patches. Phabricator shutdown timeline

jgreenhalgh (James Greenhalgh)
Principal Engineer at Arm

Projects

User does not belong to any projects.

User Details

User Since
Apr 13 2016, 3:58 AM (399 w, 4 d)

Hardware, Software, and Toolchain optimisation at Arm. Principal engineer, GCC maintainer, Optimisation Team lead.

Recent Activity

Jan 19 2021

jgreenhalgh added inline comments to D93585: [AArch64][Clang][Linux] Enable out-of-line atomics by default..
Jan 19 2021, 1:05 PM · Restricted Project, Restricted Project

May 28 2020

jgreenhalgh created D80736: [AArch64] Add native CPU detection for Neoverse N1.
May 28 2020, 9:50 AM · Restricted Project

Mar 9 2020

jgreenhalgh updated the diff for D75248: [Arm] Do not lower vmax/vmin to Neon instructions.

Update to also cover float16 as per David's request.

Mar 9 2020, 10:15 AM · Restricted Project

Feb 27 2020

jgreenhalgh created D75248: [Arm] Do not lower vmax/vmin to Neon instructions.
Feb 27 2020, 5:53 AM · Restricted Project

Sep 6 2018

jgreenhalgh added inline comments to D51683: Fix arm_neon.h and arm_fp16.h generation for compiling with std=c89.
Sep 6 2018, 7:51 AM

Apr 18 2018

jgreenhalgh added a comment to D45668: [NEON] Define vget_high_f16() and vget_low_f16() intrinsics in AArch64 mode only.

Thanks, and I am going to try to get some clarity on this doc issue. But looks like it should be "ARMv7, ARMv8", as it used to be. Make sense to comment on this in the commit message, if that's what you mean.

Apr 18 2018, 7:03 AM · Restricted Project

Mar 26 2018

jgreenhalgh added a comment to D44389: [LNT] Error accessing URL /db_default/v4/nts/profile/ajax/getFunctions; NOT FOUND.

For what it is worth; this is the fix I would have suggested for the issue.

Mar 26 2018, 6:56 AM

Mar 2 2018

jgreenhalgh added a comment to D43862: [LNT] Error accessing URL /db_default/v4/nts/profile/ajax/getFunctions; NOT FOUND.

Sorry, I should have caught this when we were reviewing this patch internally; this obviously leaves the lookups in:

v4_profile_ajax_getTopLevelCounters
v4_profile_ajax_getCodeForFunction
v4_profile_fwd2
Mar 2 2018, 5:45 AM

Dec 18 2017

jgreenhalgh added a comment to D40673: Add _Float128 as alias to __float128 to enable compilations on Fedora27/glibc2-26.

If this patch unconditionally defines _Float128, then I think it will conflict with the typedef for _Float128 for IEEE754 128-bit long double systems in glibc:

Dec 18 2017, 6:26 AM · Restricted Project

Nov 24 2017

jgreenhalgh added a comment to D40423: [ARM][AArch64] Workaround ARM/AArch64 percularity in clearing icache..

...

So the change makes sense, but the references to Armv8.2-A look wrong to me.

I've checked the docs once again, and, indeed, they don't mention Armv8.2, so I'll remove the reference. However, judging from cursory read of https://patchwork.kernel.org/patch/9275721/ , it appears only Armv8.2 kernels would set EXECUTE_ONLY bits on the pages, kernels for Armv8.[01] would set READ permissions on EXEC pages as well.

Nov 24 2017, 5:50 AM
jgreenhalgh added a comment to D40423: [ARM][AArch64] Workaround ARM/AArch64 percularity in clearing icache..

I'm confused by the title and rationale for this change. I don't think this has anything to do with Armv8.2-A and is instead a feature of Armv8-A.

Nov 24 2017, 4:32 AM

Oct 17 2017

jgreenhalgh added inline comments to D38942: [DAG] Promote ADDCARRY / SUBCARRY.
Oct 17 2017, 9:28 AM

Sep 4 2017

jgreenhalgh updated subscribers of D37083: LNT Make machine selection/update more flexible.

Thanks for this. I don't know LNT well enough to review the code, but the top-level summary makes sense to me.

Sep 4 2017, 10:04 AM

Aug 14 2017

jgreenhalgh added a comment to D35598: Rework machine creation strategy.

Hi James,

Aug 14 2017, 4:01 AM

Aug 11 2017

jgreenhalgh added a comment to D35598: Rework machine creation strategy.

Hi Matze,

I did not check the implementation in detail, but this makes total sense to me. From my perspective this is a clear improvement and should go in.

Aug 11 2017, 2:00 AM

Jul 11 2017

jgreenhalgh added a comment to rL306596: New REST API format.

Hi Chris, this commit causes servers running from the wsgi script to fail. To reproduce:

Jul 11 2017, 9:30 AM

Feb 22 2017

jgreenhalgh added a comment to D29639: [SelectionDAG] Add a signed integer absolute ISD node.

Hi,

Having asked around: The way we define this is that the VABS instruction takes a signed integer and outputs an unsigned integer, getting around this problem.

However, I believe the output of VABS(INT_MIN) is indeed bit-identical to INT_MIN.

+ @jgreenhalgh to confirm I haven't mangled his explanation.

Feb 22 2017, 2:05 AM

Jan 20 2017

jgreenhalgh added a comment to D28891: add support for Cavium ThunderX ARM64 processors.

The feature bits set match my (limited) understanding of the various ThunderX variants, and their implementation in GCC. From that perspective this patch looks good to me.

Jan 20 2017, 4:11 AM

Jan 12 2017

jgreenhalgh added inline comments to D28152: Cortex-A57 scheduling model for ARM backend (AArch32).
Jan 12 2017, 8:01 AM

Oct 18 2016

jgreenhalgh added inline comments to D24540: [AArch64] Add Cavium ThunderX subtarget support..
Oct 18 2016, 8:00 AM
jgreenhalgh added inline comments to D24540: [AArch64] Add Cavium ThunderX subtarget support..
Oct 18 2016, 7:02 AM
jgreenhalgh added inline comments to D24540: [AArch64] Add Cavium ThunderX subtarget support..
Oct 18 2016, 6:21 AM

Sep 22 2016

jgreenhalgh added inline comments to D24661: More processors support under AARch64 state for auto detection..
Sep 22 2016, 3:24 AM

Sep 20 2016

jgreenhalgh added inline comments to D24661: More processors support under AARch64 state for auto detection..
Sep 20 2016, 9:14 AM

Aug 18 2016

jgreenhalgh added a comment to D23583: [AArch64] Add feature has-fast-fma.

Presumably this is where the "faster than an FADD" comes from. This transform is FMUL + FADD + [use of FMUL] -> FMA + FMUL + [use of FMUL].

There are other cases, such as FADD + FMUL + FMA -> FMA + FMA. Probably a better way to describe the use of enableAggressiveFMAFusion is the relative cost of FMA to FADD and FMUL.

Aug 18 2016, 10:03 AM · Restricted Project, Restricted Project
jgreenhalgh added a comment to D23583: [AArch64] Add feature has-fast-fma.

Hi,

I also have concerns here. The TargetLowering hook states:

/// Return true if target always beneficiates from combining into FMA for a
/// given value type. This must typically return false on targets where FMA
/// takes more cycles to execute than *FADD*.

Whereas you say:

In spite of what the original author intended, I observed that the extra folds are worth it if FMA is as quick *FMUL* instead.

Which is correct? Or are you using this hook in a way the hook users don't intend? The wording used is vague and I really think we need to have more detail about what property of Exynos-M1 makes this good for Exynos but not for any other microarchitecture.

Aug 18 2016, 6:34 AM · Restricted Project, Restricted Project

Apr 26 2016

jgreenhalgh added a comment to D19061: [ARM] Add support for the X asm constraint.

Thanks, that looks perfect now.

Apr 26 2016, 1:31 AM

Apr 25 2016

jgreenhalgh added a comment to D19061: [ARM] Add support for the X asm constraint.

It'd be good if James could have a final look, but I'm ok with the change.

Apr 25 2016, 4:03 AM

Apr 18 2016

jgreenhalgh added a comment to D18701: [ARM] Adding IEEE-754 SIMD detection to loop vectorizer.

Sorry to have missed this before commit.

Apr 18 2016, 2:48 AM

Apr 15 2016

jgreenhalgh added a comment to D19061: [ARM] Add support for the X asm constraint.

It is true that GCC would be more efficient in some cases (one example would be FP constants), but we would still fit into the definition of "no constraint whatsoever" and therefore correct - which is an improvement from the current situation, where we'll simply crash on this constraint.

I agree bad codegen trumps ICE crashes, but James mentioned it "might well break use cases". I'm interested in those...

James, do you have some pointers on the expected usage of this constraint in the wild? The more the merrier!

Apr 15 2016, 9:48 AM

Apr 14 2016

jgreenhalgh added a comment to D19061: [ARM] Add support for the X asm constraint.

Decaying this to "w" or "r" would potentially pessimize code generation, and might well break use cases. The whole point of "X" is to prevent the compiler from having to reload an operand you don't actually care about using in the output assembly (a scratch, or in that blog post, a hidden dependency). As GCC isn't going to put any effort in to forcing the form of the operand, I'd expect most uses that actually try to print out an "X" constrained register to be using it as a shorthand for getting constants or labels out. There are normally more expressive constraints if that is what you need.

Apr 14 2016, 3:44 AM

Apr 13 2016

jgreenhalgh added inline comments to D19053: [LNT] Some more docs for the LNT json file format..
Apr 13 2016, 4:29 AM