Page MenuHomePhabricator

SjoerdMeijer (Sjoerd Meijer)
User

Projects

User does not belong to any projects.

User Details

User Since
Jan 26 2016, 2:17 AM (169 w, 1 d)

Recent Activity

Today

SjoerdMeijer updated the diff for D59787: [ARM] Implement TTI::getMemcpyCost.

Removed the change in findOptimalMemOpLowering, because that belongs to D59766 which I have just updated.

Wed, Apr 24, 7:29 AM
SjoerdMeijer updated the diff for D59766: [TargetLowering] findOptimalMemOpLowering. NFC..

Changed the assert into an early return from function findOptimalMemOpLowering.
We might run into this assert when we actually start using this in D59787.

Wed, Apr 24, 6:38 AM

Mon, Apr 15

SjoerdMeijer added a comment to D60698: [ARM] add target arch definitions for 8.1-M and MVE..

This adds:

  • LLVM subtarget features to make all the new instructions conditional on
  • CPU and FPU names for use on armclang's command line, with default FPUs set so that "armv8.1-m.main+fp" and "armv8.1-m.main+fp.dp" will select the right FPU features
  • architecture extension names "mve" and "mve.fp"
  • ABI build attribute support for v8.1-M (a new value for Tag_CPU_arch) and MVE (a new actual tag)
Mon, Apr 15, 7:23 AM · Restricted Project
SjoerdMeijer added a reviewer for D60691: [ARM] Replace fp-only-sp and d16 with fp64 and d32.: olista01.
Mon, Apr 15, 6:55 AM · Restricted Project, Restricted Project
SjoerdMeijer accepted D60703: [ARM] Add an MVE execution domain..

looks reasonable

Mon, Apr 15, 6:46 AM · Restricted Project
SjoerdMeijer added inline comments to D60699: [ARM] add CLI support for 8.1-M and MVE..
Mon, Apr 15, 6:09 AM · Restricted Project

Fri, Apr 5

SjoerdMeijer updated the diff for D59787: [ARM] Implement TTI::getMemcpyCost.

Run code-size benchmark CSiBe, triggered an assert in function FindOptimalMemOpLowering, and fixed that.

Fri, Apr 5, 8:45 AM

Wed, Apr 3

SjoerdMeijer accepted D60118: [AArch64][AsmParser] Fix .arch_extension directive parsing.

Nice one, thanks for fixing this!

Wed, Apr 3, 8:34 AM · Restricted Project
SjoerdMeijer added a comment to D60118: [AArch64][AsmParser] Fix .arch_extension directive parsing.

It appears the .cpu, .arch and .arch_extension directives can only enable features defined in ExtensionMap in the AArch64AsmParser.cpp and ignores them otherwise. ExtensionMap seems incomplete, in the patch you mentioned only tlb-rmi, pan-rwv and ccpp features were added.

Wed, Apr 3, 5:41 AM · Restricted Project

Tue, Apr 2

SjoerdMeijer added a comment to D53633: [AArch64] Implement FP16FML intrinsics.

FYI: a new ACLE version has been published, please find it here: https://developer.arm.com/architectures/system-architectures/software-standards/acle

Tue, Apr 2, 9:17 AM · Restricted Project
SjoerdMeijer added a comment to D60118: [AArch64][AsmParser] Fix .arch_extension directive parsing.

This change looks also fine to me.
From a quick grep in the test directory, looks like we don't have many tests for the .arch_extension directive. I.e., I see only one, for the simd directive. I think that confirms my suspicion that we tend to forget to add tests/support for the .arch_extension directive when we upstream new architecture support. So, sorry to be a pain here, but while you're add it, my request would be to add tests for some other architecture extensions that we are missing. For example, rasv8_4 might be an interesting one because it has an underscore in it. Looks like the refactoring in D54633 introduced new extensions that we could test.

Tue, Apr 2, 5:27 AM · Restricted Project

Mon, Apr 1

SjoerdMeijer added a comment to D59785: [TargetLowering] Change getOptimalMemOpType to take a Function instead of a MachineFunction.

Hi Florian, just to clarify my previous message. I would be more than happy to change it, but was only curious if just passing function attributes was too restrictive somehow. Let me know what you think. Would you mind having a quick look at D59787 too? :-) Cheers.

Mon, Apr 1, 11:55 AM

Fri, Mar 29

SjoerdMeijer updated the diff for D59787: [ARM] Implement TTI::getMemcpyCost.
  • cleaned up the diff,
  • tweaked the cost for generating a library call.
Fri, Mar 29, 4:31 AM
SjoerdMeijer added a comment to D59787: [ARM] Implement TTI::getMemcpyCost.

Ah, oops, that was a experiment in my local branch. I will get rid of this, and upload a new diff soon. So, thanks for spotting this, you haven't missed anything!

Fri, Mar 29, 3:37 AM

Thu, Mar 28

SjoerdMeijer added a comment to D59787: [ARM] Implement TTI::getMemcpyCost.

Ignoring for a moment how I would like to use this, I think this is a nice self-contained improvement (or perhaps even bug fix) so that getIntstructionCost/getIntrinsicCost returns something an awful lot more reasonable for memcpy's. I think the dependent refactoring patches are general improvements too. So, a friendly ping, any opinions on this?

Thu, Mar 28, 1:09 PM
SjoerdMeijer added a reviewer for D59787: [ARM] Implement TTI::getMemcpyCost: olista01.
Thu, Mar 28, 10:09 AM

Tue, Mar 26

SjoerdMeijer added a comment to D59785: [TargetLowering] Change getOptimalMemOpType to take a Function instead of a MachineFunction.

Hi Florian, thanks for taking a look!

Tue, Mar 26, 4:10 PM
SjoerdMeijer updated the diff for D59787: [ARM] Implement TTI::getMemcpyCost.
Tue, Mar 26, 1:49 PM
SjoerdMeijer added a child revision for D59766: [TargetLowering] findOptimalMemOpLowering. NFC.: D59787: [ARM] Implement TTI::getMemcpyCost.
Tue, Mar 26, 1:49 PM
SjoerdMeijer added a child revision for D59785: [TargetLowering] Change getOptimalMemOpType to take a Function instead of a MachineFunction: D59787: [ARM] Implement TTI::getMemcpyCost.
Tue, Mar 26, 1:49 PM

Mar 25 2019

SjoerdMeijer updated the diff for D59766: [TargetLowering] findOptimalMemOpLowering. NFC..

Restored the comments.

Mar 25 2019, 11:26 AM
SjoerdMeijer added a comment to D59766: [TargetLowering] findOptimalMemOpLowering. NFC..

Summarising, these are the patch sets that we need, i.e. the patches that show the whole flow:

Mar 25 2019, 11:19 AM
SjoerdMeijer updated the diff for D59766: [TargetLowering] findOptimalMemOpLowering. NFC..

With the prep work done in D59785, findOptimalMemOpLowering takes a Function as a last argument so that it can actually be used as shown in the work-in-progress patch D59787.

Mar 25 2019, 11:15 AM
SjoerdMeijer created D59787: [ARM] Implement TTI::getMemcpyCost.
Mar 25 2019, 11:12 AM
SjoerdMeijer created D59785: [TargetLowering] Change getOptimalMemOpType to take a Function instead of a MachineFunction.
Mar 25 2019, 11:07 AM
SjoerdMeijer added inline comments to D59766: [TargetLowering] findOptimalMemOpLowering. NFC..
Mar 25 2019, 9:21 AM
SjoerdMeijer updated the diff for D59766: [TargetLowering] findOptimalMemOpLowering. NFC..
Mar 25 2019, 9:13 AM
SjoerdMeijer created D59766: [TargetLowering] findOptimalMemOpLowering. NFC..
Mar 25 2019, 6:49 AM
SjoerdMeijer committed rG65584d38112d: [TTI] Move getIntrinsicCost to allow functions to be overridden. NFC (authored by SjoerdMeijer).
[TTI] Move getIntrinsicCost to allow functions to be overridden. NFC
Mar 25 2019, 1:55 AM
SjoerdMeijer committed rL356873: [TTI] Move getIntrinsicCost to allow functions to be overridden. NFC.
[TTI] Move getIntrinsicCost to allow functions to be overridden. NFC
Mar 25 2019, 1:55 AM
SjoerdMeijer closed D59706: [TTI] Move getIntrinsicCost to allow functions to be overridden. NFC..
Mar 25 2019, 1:55 AM · Restricted Project

Mar 23 2019

SjoerdMeijer accepted D59713: [ARM] Add missing memory operands to a bunch of instructions..

LGTM, just a question inline.

Mar 23 2019, 2:50 AM · Restricted Project

Mar 22 2019

SjoerdMeijer created D59706: [TTI] Move getIntrinsicCost to allow functions to be overridden. NFC..
Mar 22 2019, 10:24 AM · Restricted Project

Mar 20 2019

SjoerdMeijer committed rG7bb785cbc3b7: Follow up of rL356555 (authored by SjoerdMeijer).
Follow up of rL356555
Mar 20 2019, 7:34 AM
SjoerdMeijer committed rL356557: Follow up of rL356555.
Follow up of rL356555
Mar 20 2019, 7:33 AM
SjoerdMeijer committed rG633fb0f266f9: [TTI] getMemcpyCost (authored by SjoerdMeijer).
[TTI] getMemcpyCost
Mar 20 2019, 7:18 AM
SjoerdMeijer committed rL356555: [TTI] getMemcpyCost.
[TTI] getMemcpyCost
Mar 20 2019, 7:18 AM
SjoerdMeijer closed D59252: [TTI] getMemcpyCost.
Mar 20 2019, 7:18 AM · Restricted Project

Mar 19 2019

SjoerdMeijer added a comment to D59252: [TTI] getMemcpyCost.

Thanks for looking Sam!
I will wait a day with committing this, just in case there are more comments.

Mar 19 2019, 6:56 AM · Restricted Project

Mar 14 2019

SjoerdMeijer accepted D59383: [ARM] Add MachineVerifier logic for some Thumb1 instructions..

LGTM

Mar 14 2019, 1:10 PM · Restricted Project
SjoerdMeijer accepted D59368: [ARM][ParallelDSP] Disable for big-endian.

Oops, also forgot about BE.
Looks like a sensible workaround to me.

Mar 14 2019, 9:00 AM · Restricted Project
SjoerdMeijer added a comment to D59252: [TTI] getMemcpyCost.

Apologies for already asking, but I was curious if @t.p.northover or @arsenm had more feedback/concerns or if this looks somewhat reasonable.

Mar 14 2019, 6:36 AM · Restricted Project
SjoerdMeijer accepted D59257: [ARM] Run ARMParallelDSP in the IRPasses phase.

Thanks, this looks good to me.

Mar 14 2019, 3:30 AM · Restricted Project

Mar 13 2019

SjoerdMeijer added inline comments to D59257: [ARM] Run ARMParallelDSP in the IRPasses phase.
Mar 13 2019, 10:00 AM · Restricted Project
SjoerdMeijer updated the diff for D59252: [TTI] getMemcpyCost.

In my previous comment, I commented on a possible default cost calculation for a memcpy, which approximated the cost by calculating the number of load/stores pairs. I probably could see this working as a default estimation, because I think when this would be used, the result will be that memcpy's are considered expensive relative to other instructions, which is probably correct. However, I appreciate that this is an approximation, and that there many things are missing here, like source/destination alignment, available load/store instruction, etc., which really shows this is target dependent decision. So, I've changed getMemcpyCost to just simply return TCC_Expensive, so that target can override this method and implement their decision making there, like is done for many/most functions here in TTI.

Mar 13 2019, 9:39 AM · Restricted Project
SjoerdMeijer accepted D59215: [ARM][ParallelDSP] Enable multiple uses of loads.

Looks very reasonable to me

Mar 13 2019, 9:19 AM · Restricted Project

Mar 12 2019

SjoerdMeijer added inline comments to D59252: [TTI] getMemcpyCost.
Mar 12 2019, 9:19 AM · Restricted Project
SjoerdMeijer added inline comments to D59257: [ARM] Run ARMParallelDSP in the IRPasses phase.
Mar 12 2019, 8:25 AM · Restricted Project
SjoerdMeijer added a comment to D59215: [ARM][ParallelDSP] Enable multiple uses of loads.

Mainly nits, one question about the tests.

Mar 12 2019, 8:00 AM · Restricted Project
SjoerdMeijer created D59252: [TTI] getMemcpyCost.
Mar 12 2019, 6:59 AM · Restricted Project
SjoerdMeijer committed rG31ff647c1d26: [TTI] Enable analysis of clib functions in getIntrinsicCosts. NFCI. (authored by SjoerdMeijer).
[TTI] Enable analysis of clib functions in getIntrinsicCosts. NFCI.
Mar 12 2019, 2:50 AM
SjoerdMeijer committed rL355901: [TTI] Enable analysis of clib functions in getIntrinsicCosts. NFCI..
[TTI] Enable analysis of clib functions in getIntrinsicCosts. NFCI.
Mar 12 2019, 2:50 AM
SjoerdMeijer closed D59014: [TTI] Enable analysis of clib functions in getIntrinsicCosts. NFCI..
Mar 12 2019, 2:50 AM · Restricted Project
SjoerdMeijer added a comment to D59129: [SROA] WIP: Lowering alloca is not always beneficial.

The transform is more powerful working the other way: it could catch cases that were't originally written as "memcpy" in the source code. Not sure that means we shouldn't do something in SROA, though. Ideally, I think we want the "canonical" form to be the promoted version, and transform it > to the compact version as part of lowering. But I'm not sure how much work it would be to make the reverse transform reliable, and I don't want to block an incremental improvement.

MemCpyOpt doesn't actually merge memcpys at the moment, I think. It does merge memsets, and it has a lot of memcpy-related transforms, but that one in particular is missing. But it's the right sort of logic, at least...

Mar 12 2019, 2:18 AM

Mar 11 2019

SjoerdMeijer added a comment to D59129: [SROA] WIP: Lowering alloca is not always beneficial.

I was sort of thinking you could write the reverse transform in MemCpyOpt, since it does a similar sorts of analysis, but maybe we'd have to do it later, or patch SROA anyway, to avoid two transforms fighting each other.

Mar 11 2019, 1:21 PM
SjoerdMeijer added a comment to D59129: [SROA] WIP: Lowering alloca is not always beneficial.

Hi @RKSimon, @efriedma, thank you very much for looking at this!

Mar 11 2019, 9:04 AM

Mar 8 2019

SjoerdMeijer added a comment to D59014: [TTI] Enable analysis of clib functions in getIntrinsicCosts. NFCI..

Just for your information, how I would like to use this patch: I've uploaded work-in-progress patch D59129. This is using getInstructionCost to decide if alloca's need to be lowered. The missing piece is the memcpy instruction cost decision making that I stripped out (the intial plumbing) from this patch, that I will start working on now.

Mar 8 2019, 4:07 AM · Restricted Project
SjoerdMeijer created D59129: [SROA] WIP: Lowering alloca is not always beneficial.
Mar 8 2019, 4:02 AM
SjoerdMeijer added inline comments to D59014: [TTI] Enable analysis of clib functions in getIntrinsicCosts. NFCI..
Mar 8 2019, 2:28 AM · Restricted Project
SjoerdMeijer updated the diff for D59014: [TTI] Enable analysis of clib functions in getIntrinsicCosts. NFCI..

Thanks for the reviews!

  • removed the memcpy business from this patch.
  • and we do have more const pointers now.
Mar 8 2019, 12:53 AM · Restricted Project

Mar 7 2019

SjoerdMeijer added inline comments to D59014: [TTI] Enable analysis of clib functions in getIntrinsicCosts. NFCI..
Mar 7 2019, 8:50 AM · Restricted Project
SjoerdMeijer added reviewers for D59014: [TTI] Enable analysis of clib functions in getIntrinsicCosts. NFCI.: efriedma, sanjoy.
Mar 7 2019, 8:02 AM · Restricted Project

Mar 6 2019

SjoerdMeijer created D59014: [TTI] Enable analysis of clib functions in getIntrinsicCosts. NFCI..
Mar 6 2019, 12:58 AM · Restricted Project

Mar 5 2019

SjoerdMeijer accepted D58855: [AArch64] Improve FP16 instruction selection for vector round and vector convert from half instructions.

LGTM

Mar 5 2019, 1:10 AM · Restricted Project
SjoerdMeijer closed D58563: [AArch64] Improve FP16 vector convert from short instructions.

Committed in rL355134

Mar 5 2019, 12:57 AM · Restricted Project
SjoerdMeijer closed D58561: [AArch64] Generate FP16 vector compare instructions.

Committed in rL355050

Mar 5 2019, 12:56 AM · Restricted Project
SjoerdMeijer accepted D58813: [ARM] Fix select_cc lowering for fp16.
Mar 5 2019, 12:45 AM · Restricted Project

Mar 1 2019

SjoerdMeijer added inline comments to D58813: [ARM] Fix select_cc lowering for fp16.
Mar 1 2019, 4:10 AM · Restricted Project
SjoerdMeijer accepted D58812: [ARM] Consider undefined-on-NaN conditions in checkVSELConstraints.

Looks like a good fix to me

Mar 1 2019, 3:56 AM · Restricted Project
SjoerdMeijer accepted D58816: [ARM] Fix FP16 stack loads/stores for Thumb2 with frame pointer.

Looks like a straightforward fix to me.

Mar 1 2019, 3:56 AM · Restricted Project

Feb 25 2019

SjoerdMeijer accepted D58563: [AArch64] Improve FP16 vector convert from short instructions.

LGTM

Feb 25 2019, 1:06 AM · Restricted Project
SjoerdMeijer accepted D58561: [AArch64] Generate FP16 vector compare instructions.

LGTM, with one nit inline.

Feb 25 2019, 1:01 AM · Restricted Project

Feb 21 2019

SjoerdMeijer accepted D58452: [ARM] Negative constants mishandled in ARM CGP.

LGTM

Feb 21 2019, 12:44 AM · Restricted Project

Feb 19 2019

SjoerdMeijer accepted D58306: [AArch64] Change size suffix for FP16FML intrinsics..

The ACLE has been updated and a new version with change included will be released soon.

Feb 19 2019, 5:08 AM · Restricted Project, Restricted Project

Feb 17 2019

SjoerdMeijer added a comment to D58306: [AArch64] Change size suffix for FP16FML intrinsics..

I am discussing this with our GCC team as we would like both Clang/GCC implementation to be the same. But you're right that _f16 looks like to be the more consistent choice. I will let you know as soon I know more.

Feb 17 2019, 12:11 PM · Restricted Project, Restricted Project

Feb 15 2019

SjoerdMeijer added inline comments to D53633: [AArch64] Implement FP16FML intrinsics.
Feb 15 2019, 6:57 AM · Restricted Project
SjoerdMeijer added inline comments to D53633: [AArch64] Implement FP16FML intrinsics.
Feb 15 2019, 2:37 AM · Restricted Project

Feb 14 2019

SjoerdMeijer accepted D57686: [ARM CGP] Fix ConvertTruncs.

Thanks, LGTM

Feb 14 2019, 5:45 AM · Restricted Project

Feb 13 2019

SjoerdMeijer accepted D58176: [ARM] Ensure we update the correct flags in the peephole optimiser.

LGTM, with one nit inline.
Perhaps wait a day in case someone else has comments.

Feb 13 2019, 6:45 AM · Restricted Project
SjoerdMeijer added inline comments to D57686: [ARM CGP] Fix ConvertTruncs.
Feb 13 2019, 6:06 AM · Restricted Project

Feb 11 2019

SjoerdMeijer committed rG150ccb889e9e: [ARM] LoadStoreOptimizer: reoder limit (authored by SjoerdMeijer).
[ARM] LoadStoreOptimizer: reoder limit
Feb 11 2019, 1:38 AM
SjoerdMeijer committed rL353678: [ARM] LoadStoreOptimizer: reoder limit.
[ARM] LoadStoreOptimizer: reoder limit
Feb 11 2019, 1:37 AM
SjoerdMeijer closed D57954: [ARM] LoadStoreOptimizer: reoder limit.
Feb 11 2019, 1:37 AM · Restricted Project
SjoerdMeijer added a comment to D57954: [ARM] LoadStoreOptimizer: reoder limit.

Thanks Eli, also for your thoughts on this:

Feb 11 2019, 1:27 AM · Restricted Project
SjoerdMeijer committed rG0cc50c6b87c4: [ARM] LoadStoreOptimizer: just a clean-up. NFC. (authored by SjoerdMeijer).
[ARM] LoadStoreOptimizer: just a clean-up. NFC.
Feb 11 2019, 12:48 AM
SjoerdMeijer committed rL353670: [ARM] LoadStoreOptimizer: just a clean-up. NFC..
[ARM] LoadStoreOptimizer: just a clean-up. NFC.
Feb 11 2019, 12:48 AM
SjoerdMeijer closed D57955: [ARM] LoadStoreOptimizer: just a clean-up. NFC..
Feb 11 2019, 12:47 AM · Restricted Project

Feb 8 2019

SjoerdMeijer updated the summary of D57954: [ARM] LoadStoreOptimizer: reoder limit.
Feb 8 2019, 8:45 AM · Restricted Project
SjoerdMeijer created D57955: [ARM] LoadStoreOptimizer: just a clean-up. NFC..
Feb 8 2019, 8:43 AM · Restricted Project
SjoerdMeijer created D57954: [ARM] LoadStoreOptimizer: reoder limit.
Feb 8 2019, 8:41 AM · Restricted Project
SjoerdMeijer accepted D57577: Make predefined FLT16 macros conditional on support for the type.

Looks okay to me, with one nit inline.

Feb 8 2019, 5:38 AM · Restricted Project, Restricted Project

Jan 31 2019

SjoerdMeijer committed rL352737: [ARM] Thumb2: ConstantMaterializationCost.
[ARM] Thumb2: ConstantMaterializationCost
Jan 31 2019, 12:40 AM
SjoerdMeijer closed D57327: [ARM] Thumb2: ConstantMaterializationCost.
Jan 31 2019, 12:40 AM
SjoerdMeijer added inline comments to D57327: [ARM] Thumb2: ConstantMaterializationCost.
Jan 31 2019, 12:19 AM
SjoerdMeijer committed rL352736: [SelectionDAG] Codesize: don't expand SHIFT to SHIFT_PARTS.
[SelectionDAG] Codesize: don't expand SHIFT to SHIFT_PARTS
Jan 31 2019, 12:09 AM
SjoerdMeijer closed D57386: [SelectionDAG] Codesize: don't expand SHIFT to SHIFT_PARTS.
Jan 31 2019, 12:09 AM · Restricted Project

Jan 30 2019

SjoerdMeijer updated the diff for D57386: [SelectionDAG] Codesize: don't expand SHIFT to SHIFT_PARTS.

It probably should be shouldExpandShift or something?

Jan 30 2019, 5:38 AM · Restricted Project
SjoerdMeijer updated the diff for D57386: [SelectionDAG] Codesize: don't expand SHIFT to SHIFT_PARTS.

Thanks for reviewing!

Jan 30 2019, 5:13 AM · Restricted Project
SjoerdMeijer updated the diff for D57327: [ARM] Thumb2: ConstantMaterializationCost.

Maybe put the tests into test/CodeGen/ARM/subtarget-no-movt.ll? We have some similar tests there from https://reviews.llvm.org/D23090 .

Jan 30 2019, 2:52 AM

Jan 29 2019

SjoerdMeijer created D57386: [SelectionDAG] Codesize: don't expand SHIFT to SHIFT_PARTS.
Jan 29 2019, 7:27 AM · Restricted Project