
Jun 23 2020

labrinea accepted D82372: [ARM][BFloat] Legalize bf16 type even without fullfp16..

LGTM with a nit. Can you also remove FPRegs16Pat from ARMInstrFormats.td now that it is no longer used?

Jun 23 2020, 7:26 AM · Restricted Project
labrinea added a comment to D82372: [ARM][BFloat] Legalize bf16 type even without fullfp16..

Hi Simon, thanks for working on this. Looks good overall. A few remarks inline.

Jun 23 2020, 6:21 AM · Restricted Project
labrinea added a reviewer for D82372: [ARM][BFloat] Legalize bf16 type even without fullfp16.: labrinea.
Jun 23 2020, 6:21 AM · Restricted Project

Jun 18 2020

labrinea committed rGecdf48f15bd2: [ARM] Basic bfloat support (authored by labrinea).
[ARM] Basic bfloat support
Jun 18 2020, 9:47 AM
labrinea closed D81373: [ARM] Basic bfloat support.
Jun 18 2020, 9:46 AM · Restricted Project
labrinea updated the diff for D81373: [ARM] Basic bfloat support.

Rebased

Jun 18 2020, 8:06 AM · Restricted Project
labrinea updated the diff for D81373: [ARM] Basic bfloat support.

Addressed last round's review comments.

Jun 18 2020, 7:34 AM · Restricted Project
labrinea added inline comments to D81373: [ARM] Basic bfloat support.
Jun 18 2020, 2:08 AM · Restricted Project

Jun 17 2020

labrinea updated the diff for D81373: [ARM] Basic bfloat support.

Changes from last revision:

  • the code generation relies on fullfp16 being present,
  • the unit test also checks the codegen for soft float abi
Jun 17 2020, 11:18 AM · Restricted Project
labrinea added inline comments to D81373: [ARM] Basic bfloat support.
Jun 17 2020, 10:45 AM · Restricted Project

Jun 16 2020

labrinea committed rGf6189da93816: [ARM][NFC] Explicitly specify the fp16 value type in codegen patterns. (authored by labrinea).
[ARM][NFC] Explicitly specify the fp16 value type in codegen patterns.
Jun 16 2020, 3:51 AM
labrinea closed D81505: [ARM][NFC] Explicitly specify the fp16 value type in codegen patterns..
Jun 16 2020, 3:51 AM · Restricted Project

Jun 11 2020

labrinea added inline comments to D81373: [ARM] Basic bfloat support.
Jun 11 2020, 11:00 AM · Restricted Project
labrinea updated the diff for D81373: [ARM] Basic bfloat support.
Jun 11 2020, 10:28 AM · Restricted Project

Jun 9 2020

labrinea created D81505: [ARM][NFC] Explicitly specify the fp16 value type in codegen patterns..
Jun 9 2020, 2:21 PM · Restricted Project

Jun 8 2020

labrinea added a comment to D81373: [ARM] Basic bfloat support.

Hey Oliver, thanks for looking at this.

Jun 8 2020, 7:05 AM · Restricted Project
labrinea added a comment to D81373: [ARM] Basic bfloat support.

I believe the codegen patterns for vmov and load/store half are incorrect on the bf16 type. Can someone suggest what is the right approach?

Jun 8 2020, 4:19 AM · Restricted Project
labrinea created D81373: [ARM] Basic bfloat support.
Jun 8 2020, 4:19 AM · Restricted Project

Jun 4 2020

labrinea added inline comments to D76077: [ARM] Add __bf16 as new Bfloat16 C Type.
Jun 4 2020, 3:13 AM · Restricted Project

Jun 3 2020

labrinea accepted D79710: [clang][BFloat] Add create/set/get/dup intrinsics.
Jun 3 2020, 6:33 AM · Restricted Project

Jun 1 2020

labrinea added inline comments to D79710: [clang][BFloat] Add create/set/get/dup intrinsics.
Jun 1 2020, 9:06 AM · Restricted Project
labrinea created D80928: [BFloat] Add convert/copy instrinsic support.
Jun 1 2020, 8:33 AM · Restricted Project, Restricted Project
labrinea added inline comments to D76077: [ARM] Add __bf16 as new Bfloat16 C Type.
Jun 1 2020, 1:33 AM · Restricted Project

May 28 2020

labrinea added inline comments to D80716: [AArch64]: BFloat Load/Store Intrinsics&CodeGen.
May 28 2020, 6:30 AM · Restricted Project, Restricted Project

May 26 2020

labrinea added a comment to D79711: [ARM] Add poly64_t on AArch32..

Should poly128_t be available on AArch32 too? I don't see anything in the ACLE version you linked restricting it to AArch64 only, and the intrinsics reference has a number of intrinsics available for both ISAs using it.

It should but it is not that simple. The reason it is not available is that __int128_t is not supported in AArch32. I think that is future work, since this patch unblocks the bfloat reinterpret_cast patch, which btw is annotated with TODO comments regarding the poly128_t type for AArch32.

May 26 2020, 7:00 AM · Restricted Project

May 20 2020

labrinea added inline comments to D79710: [clang][BFloat] Add create/set/get/dup intrinsics.
May 20 2020, 9:48 AM · Restricted Project

Jan 20 2020

labrinea accepted D72762: [ARM][TargetParser] Improve handling of dependencies between target features.
Jan 20 2020, 1:21 AM · Restricted Project, Restricted Project

Sep 29 2019

labrinea added a comment to D68050: WIP Make attribute target work better with AArch64.

ARM and AArch64 have a way to list the implied target features using the TargetParser but we can't directly use that in CodeGenModule because it's tied to the backend.

Sep 29 2019, 3:55 PM · Restricted Project

Sep 27 2019

labrinea added a comment to D68050: WIP Make attribute target work better with AArch64.

However, passing the AArch64 architecture names in target-cpu isn't supported by LLVM

The Clang documentation suggests that arch is used to override the CPU, not the Architecture (which is rather confusing if you ask me). GCC makes more sense having separate target attributes for CPU and Architecture (see the equivalent GCC documentation). I think target-cpu should remain generic when it is not explicitly specified either on the command line (-mcpu) or as a function attribute (e.g. target("arch=cortex-a57")). However, if the function attribute specifies an Architecture (e.g. target("arch=armv8.4a")), I agree we should favor the subtarget features corresponding to armv8.4 over those of the command line. Similarly we should favor the subtarget features corresponding to cortex-a57 (not sure if we do so at the moment; I think we don't). ARM and AArch64 have a way to list the implied target features using the TargetParser but we can't directly use that in CodeGenModule because it's tied to the backend.

Sep 27 2019, 10:30 AM · Restricted Project
labrinea committed rGc006b6f4cb80: [MC][ARM] vscclrm disassembles as vldmia (authored by labrinea).
[MC][ARM] vscclrm disassembles as vldmia
Sep 27 2019, 1:23 AM

Sep 26 2019

labrinea updated the diff for D68025: [MC][ARM] vscclrm disassembles as vldmia.

Updated the FileCheck labels as suggested.

Sep 26 2019, 10:15 AM · Restricted Project
labrinea added inline comments to D68025: [MC][ARM] vscclrm disassembles as vldmia.
Sep 26 2019, 10:06 AM · Restricted Project

Sep 25 2019

labrinea created D68025: [MC][ARM] vscclrm disassembles as vldmia.
Sep 25 2019, 7:24 AM · Restricted Project

Sep 23 2019

labrinea added a reviewer for D67485: AArch64: use ldp/stp for atomic & volatile 128-bit where appropriate.: labrinea.
Sep 23 2019, 9:29 AM · Restricted Project
labrinea added a comment to D67485: AArch64: use ldp/stp for atomic & volatile 128-bit where appropriate..

I think Clang is involved there too, in horribly non-obvious ways (for example I think that's the only way to get the actual libcalls you want rather than legacy ones). Either way, that's a change that would need pretty careful coordination. Since all of our CPUs are Cyclone or above we could probably just skip the libcalls entirely at Apple without ABI breakage (which, unintentionally, is what this patch does).

I am not sure I am following here. According to https://llvm.org/docs/Atomics.html the AtomicExpandPass will translate atomic operations on data sizes above MaxAtomicSizeInBitsSupported into calls to atomic libcalls. The docs say that even though the libcalls share the same names with clang builtins they are not directly related to them. Indeed, I hacked the AArch64 backend to disallow codegen for 128-bit atomics and as a result LLVM emitted calls to __atomic_store_16 and __atomic_load_16. Are those legacy names? I also tried emitting IR for the clang builtins and I saw atomic load/store IR instructions (like those in your tests), no libcalls. Anyhow, my concern here is that if sometime in the future we replace the broken CAS loop with a libcall, the current patch will break ABI compatibility between v8.4 objects with atomic ldp/stp and v8.X objects without the extension. Moreover, this ABI incompatibility already exists between objects built with LLVM and GCC. Any thoughts?

Sep 23 2019, 9:29 AM · Restricted Project

Sep 20 2019

labrinea added a comment to D67485: AArch64: use ldp/stp for atomic & volatile 128-bit where appropriate..

Hi Tim, thanks for looking into this optimization opportunity. I have a few remarks regarding this change:

  • First, it appears that the current codegen (CAS loop) for 128-bit atomic accesses is broken based on this comment: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70814#c3. There are two problematic cases as far as I understand: (1) const and (2) volatile atomic objects. Const objects disallow write access to the underlying memory; volatile objects mandate that each byte of the underlying memory shall be accessed exactly once according to the AAPCS. The CAS loop violates both.
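
The problem described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the AArch64 lowering: `Cell`, `atomic_load_via_store_back` and `plain_atomic_load` are hypothetical names, and a read-only `Cell` stands in for a const object placed in read-only memory. The point is that a load built from a read followed by a store-back of the same value performs a write, which a const object forbids and which a volatile object would observe as an extra access.

```python
class Cell:
    """Toy memory location; writable=False models a const object in read-only memory."""

    def __init__(self, value, writable=True):
        self.value = value
        self.writable = writable
        self.writes = 0  # number of stores that reached this location

    def load(self):
        return self.value

    def store(self, value):
        if not self.writable:
            raise PermissionError("store to read-only memory")
        self.value = value
        self.writes += 1


def atomic_load_via_store_back(cell):
    """Model of a load implemented as read plus store-back of the same value."""
    value = cell.load()
    cell.store(value)  # the extra store is what breaks const and volatile objects
    return value


def plain_atomic_load(cell):
    """Model of a true atomic load: one read and no store."""
    return cell.load()
```

In this model, calling `atomic_load_via_store_back` on a writable cell returns the value but bumps `writes`, while calling it on a read-only cell raises `PermissionError`; `plain_atomic_load` works in both cases.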
Sep 20 2019, 5:28 AM · Restricted Project

Jul 14 2019

labrinea committed rG951bb68ce262: [TargetParser][ARM] Account dependencies when processing target features (authored by labrinea).
[TargetParser][ARM] Account dependencies when processing target features
Jul 14 2019, 1:35 PM
labrinea committed rG24cacf9c56f0: [clang][Driver][ARM] Favor -mfpu over default CPU features (authored by labrinea).
[clang][Driver][ARM] Favor -mfpu over default CPU features
Jul 14 2019, 11:34 AM

Jul 4 2019

labrinea requested review of D64048: [TargetParser][ARM] Account dependencies when processing target features.
Jul 4 2019, 7:57 AM · Restricted Project, Restricted Project
labrinea updated the diff for D64048: [TargetParser][ARM] Account dependencies when processing target features.

Added the dependency of mve on dsp and some missing tests to cover those cases.
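
As a rough illustration of what accounting for dependencies can mean when processing a feature list, here is a minimal sketch. It is not the TargetParser implementation: `DEPENDS_ON` and `apply_features` are hypothetical names, and the only dependency taken from the comment above is that mve depends on dsp.

```python
# Hypothetical dependency table; only "mve depends on dsp" comes from the review comment.
DEPENDS_ON = {"mve": {"dsp"}}


def apply_features(requests, depends_on=DEPENDS_ON):
    """Process "+feat" / "-feat" requests in order, keeping dependencies consistent."""
    enabled = set()
    for request in requests:
        sign, feature = request[0], request[1:]
        if sign == "+":
            # Enabling a feature also enables everything it depends on.
            pending = [feature]
            while pending:
                current = pending.pop()
                if current not in enabled:
                    enabled.add(current)
                    pending.extend(depends_on.get(current, ()))
        elif sign == "-":
            # Disabling a feature also disables everything that depends on it.
            pending = [feature]
            while pending:
                current = pending.pop()
                enabled.discard(current)
                pending.extend(
                    f for f in enabled if current in depends_on.get(f, ())
                )
        else:
            raise ValueError(f"expected '+' or '-' prefix: {request!r}")
    return enabled
```

With this table, requesting `+mve` alone also enables `dsp`, and a later `-dsp` removes `mve` as well, whereas `-mve` leaves `dsp` untouched.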

Jul 4 2019, 3:29 AM · Restricted Project, Restricted Project
labrinea added inline comments to D63936: [clang][Driver][ARM] Favor -mfpu over default CPU features.
Jul 4 2019, 3:07 AM · Restricted Project, Restricted Project

Jul 3 2019

labrinea added inline comments to D63936: [clang][Driver][ARM] Favor -mfpu over default CPU features.
Jul 3 2019, 5:58 AM · Restricted Project, Restricted Project

Jul 2 2019

labrinea committed rG9fcf5dadd7cc: [clang][Driver][ARM] NFC: Remove unused function parameter (authored by labrinea).
[clang][Driver][ARM] NFC: Remove unused function parameter
Jul 2 2019, 2:47 AM

Jul 1 2019

labrinea created D64048: [TargetParser][ARM] Account dependencies when processing target features.
Jul 1 2019, 5:08 PM · Restricted Project, Restricted Project
labrinea updated the diff for D63936: [clang][Driver][ARM] Favor -mfpu over default CPU features.

I've split the patch.

Jul 1 2019, 4:27 PM · Restricted Project, Restricted Project
labrinea retitled D63936: [clang][Driver][ARM] Favor -mfpu over default CPU features from [ARM] Minor fixes in command line option parsing to [clang][Driver][ARM] Favor -mfpu over default CPU features.
Jul 1 2019, 4:25 PM · Restricted Project, Restricted Project
labrinea created D64044: [clang][Driver][ARM] NFC: Remove unused function parameter.
Jul 1 2019, 3:49 PM · Restricted Project, Restricted Project
labrinea added a comment to D63936: [clang][Driver][ARM] Favor -mfpu over default CPU features.

@simon_tatham, thanks for clarifying. I think my change is doing the right thing then: favors the -mfpu option over the default CPU features. I will split the patch as @ostannard suggested.

Jul 1 2019, 9:40 AM · Restricted Project, Restricted Project
labrinea added a comment to D63936: [clang][Driver][ARM] Favor -mfpu over default CPU features.

The second change this patch makes

Could this be split into two patches?

Jul 1 2019, 9:02 AM · Restricted Project, Restricted Project
labrinea added reviewers for D63936: [clang][Driver][ARM] Favor -mfpu over default CPU features: llvm-commits, cfe-commits.
Jul 1 2019, 2:17 AM · Restricted Project, Restricted Project

Jun 28 2019

labrinea created D63936: [clang][Driver][ARM] Favor -mfpu over default CPU features.
Jun 28 2019, 9:13 AM · Restricted Project, Restricted Project

Dec 17 2018

labrinea closed D55108: [AArch64] Re-run load/store optimizer after aggressive tail duplication.

Committed as https://reviews.llvm.org/rL349338

Dec 17 2018, 2:57 AM
labrinea added a comment to D55108: [AArch64] Re-run load/store optimizer after aggressive tail duplication.

IIRC, there is a test for the pass pipeline I would expect needs updating.

Dec 17 2018, 2:48 AM

Dec 12 2018

labrinea added a comment to D55108: [AArch64] Re-run load/store optimizer after aggressive tail duplication.

I've tested the patch with native builds of the llvm-test-suite on an AArch64 Cortex-A72 and couldn't spot anything interesting in terms of compilation time.

Dec 12 2018, 7:23 AM

Dec 11 2018

labrinea added a comment to D55108: [AArch64] Re-run load/store optimizer after aggressive tail duplication.

Ping

Dec 11 2018, 9:50 AM

Dec 4 2018

labrinea added inline comments to D55009: [GVN] Don't perform scalar PRE on GEPs.
Dec 4 2018, 9:07 AM

Nov 30 2018

labrinea updated subscribers of D55108: [AArch64] Re-run load/store optimizer after aggressive tail duplication.
Nov 30 2018, 7:09 AM
labrinea updated subscribers of D55009: [GVN] Don't perform scalar PRE on GEPs.
Nov 30 2018, 7:09 AM
labrinea accepted D55112: [ARM] FP16: select vld1.16 for vector loads with post-increment.

Looks fine. Thanks!

Nov 30 2018, 6:16 AM
labrinea edited reviewers for D55009: [GVN] Don't perform scalar PRE on GEPs, added: t.p.northover, john.brawn; removed: llvm-commits.
Nov 30 2018, 3:55 AM
labrinea added a comment to D55009: [GVN] Don't perform scalar PRE on GEPs.

It may be worthwhile allowing scalar PRE on GEPs that we know won't be combined into the addressing mode of a load/store, i.e. those where TargetTransformInfo::isLegalAddressingMode returns false.

Nov 30 2018, 3:50 AM
labrinea created D55108: [AArch64] Re-run load/store optimizer after aggressive tail duplication.
Nov 30 2018, 1:55 AM

Nov 28 2018

labrinea created D55009: [GVN] Don't perform scalar PRE on GEPs.
Nov 28 2018, 9:31 AM

Nov 6 2018

labrinea added a comment to D54170: [InstCombine][SelectionDAG][AArch64] fold gep into select to enable speculation of load.

This looks like a bunch of separate changes which should be split into multiple patches. Especially the changes to DAGCombine and InstCombiner::visitZExt .

Nov 6 2018, 12:54 PM
labrinea added a comment to D53236: [SelectionDAG] swap select_cc operands to enable folding.

LGTM

(For reference: I was wondering why x86 doesn't show any diffs for this change; it looks like there's custom code in X86ISelLowering that already does the same thing.)

Nov 6 2018, 12:43 PM
labrinea created D54170: [InstCombine][SelectionDAG][AArch64] fold gep into select to enable speculation of load.
Nov 6 2018, 11:40 AM

Nov 2 2018

labrinea updated the diff for D53236: [SelectionDAG] swap select_cc operands to enable folding.

Rebased and clang-formatted.

Nov 2 2018, 8:01 AM

Oct 29 2018

labrinea added inline comments to D53236: [SelectionDAG] swap select_cc operands to enable folding.
Oct 29 2018, 5:36 AM
labrinea updated the diff for D53236: [SelectionDAG] swap select_cc operands to enable folding.

I've autogenerated the FileCheck lines to show the diff compared to the trunk codegen. To make sure we never fall through to the next block having changed the CC but not swapped (N2, N3), I've moved all the preconditions to the beginning of the block (instead of moving the block into a helper function).
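
The invariant mentioned here, that the condition code and the two result operands must change together, can be sketched with a small model. This is an illustration only, restricted to integer comparisons (floating-point inverses also have to account for unordered operands); `select_cc`, `swap_select_cc`, `COMPARE` and `INVERSE` are hypothetical names, and treating (N2, N3) as the true and false result operands is an assumption.

```python
import operator

# Integer condition codes and their inverses (names are illustrative only).
COMPARE = {
    "eq": operator.eq, "ne": operator.ne,
    "lt": operator.lt, "ge": operator.ge,
    "gt": operator.gt, "le": operator.le,
}
INVERSE = {"eq": "ne", "ne": "eq", "lt": "ge", "ge": "lt", "gt": "le", "le": "gt"}


def select_cc(lhs, rhs, true_val, false_val, cc):
    """Return true_val if (lhs cc rhs) holds, otherwise false_val."""
    return true_val if COMPARE[cc](lhs, rhs) else false_val


def swap_select_cc(lhs, rhs, true_val, false_val, cc):
    """Return the operands of an equivalent select_cc: inverted CC, swapped results."""
    return lhs, rhs, false_val, true_val, INVERSE[cc]
```

Applying `swap_select_cc` and evaluating the result gives the same value as the original `select_cc`; inverting the condition code without swapping the result operands selects the other value instead.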

Oct 29 2018, 5:32 AM

Oct 24 2018

labrinea added a comment to D53236: [SelectionDAG] swap select_cc operands to enable folding.

Is the motivating case integer or FP?
I'm asking because we have a canonicalization for integer cmp+sel for the IR in these tests, but we're missing the corresponding FP transform.
If we add the FP canonicalization in IR, would there still be a need for this backend patch? Ie, is something generating this select code in the DAG itself?

Oct 24 2018, 6:10 PM

Oct 15 2018

labrinea added a reviewer for D53236: [SelectionDAG] swap select_cc operands to enable folding: spatel.
Oct 15 2018, 2:49 AM

Oct 12 2018

labrinea created D53236: [SelectionDAG] swap select_cc operands to enable folding.
Oct 12 2018, 6:40 PM

Sep 26 2018

labrinea planned changes to D52568: [InstCombine] Delay Phi operand folding in the pass manager.

Not ready for review. Using this as a reference for an RFC on llvm-dev.

Sep 26 2018, 11:52 AM
labrinea created D52568: [InstCombine] Delay Phi operand folding in the pass manager.
Sep 26 2018, 11:49 AM

Sep 12 2018

labrinea created D51980: [GVNHoist] computeInsertionPoints() miscalculates the Iterated Dominance Frontiers.
Sep 12 2018, 5:52 AM

Sep 7 2018

labrinea added reviewers for D51801: [MemorySSAUpdater] Avoid creating self-referencing MemoryDefs: efriedma, george.burgess.iv.
Sep 7 2018, 11:03 AM
labrinea added inline comments to D51801: [MemorySSAUpdater] Avoid creating self-referencing MemoryDefs.
Sep 7 2018, 10:49 AM
labrinea created D51801: [MemorySSAUpdater] Avoid creating self-referencing MemoryDefs.
Sep 7 2018, 10:37 AM

Aug 28 2018

labrinea requested review of D49858: [RFC] re-enable GVNHoist by default.
Aug 28 2018, 6:29 AM
labrinea updated the diff for D49858: [RFC] re-enable GVNHoist by default.

Rebase rL338240 since the excessive memory usage observed when using GVNHoist with UBSan has been fixed by rL340818 (https://reviews.llvm.org/D50323).

Aug 28 2018, 6:29 AM
labrinea added a comment to D50323: [GVNHoist] Prune out useless CHI insertions.

Do you need help with pushing the changes?

Apologies for delaying this, I was out of office. I'll rebase and push it asap.

Aug 28 2018, 2:07 AM

Aug 10 2018

labrinea added inline comments to D50323: [GVNHoist] Prune out useless CHI insertions.
Aug 10 2018, 5:42 AM
labrinea added inline comments to D50323: [GVNHoist] Prune out useless CHI insertions.
Aug 10 2018, 5:30 AM

Aug 9 2018

labrinea added a comment to D50323: [GVNHoist] Prune out useless CHI insertions.

So, is everyone happy with this change?

Aug 9 2018, 1:36 AM

Aug 7 2018

labrinea added inline comments to D50323: [GVNHoist] Prune out useless CHI insertions.
Aug 7 2018, 3:07 AM

Aug 6 2018

labrinea added inline comments to D50323: [GVNHoist] Prune out useless CHI insertions.
Aug 6 2018, 6:28 AM
labrinea planned changes to D49858: [RFC] re-enable GVNHoist by default.
Aug 6 2018, 3:11 AM
labrinea reopened D49858: [RFC] re-enable GVNHoist by default.

This got reverted because of an out-of-memory error on a UBSan buildbot. Details and fix here -> https://reviews.llvm.org/D50323. I'll update the tests upon rebase.

Aug 6 2018, 3:10 AM
labrinea created D50323: [GVNHoist] Prune out useless CHI insertions.
Aug 6 2018, 1:46 AM

Jul 30 2018

labrinea added a comment to D49858: [RFC] re-enable GVNHoist by default.

Did you test it with some benchmarks? Results?

I am running lnt, spec2000 and spec2006 on AArch64 at the moment. I'll post results soon.

Jul 30 2018, 3:14 AM

Jul 26 2018

labrinea created D49858: [RFC] re-enable GVNHoist by default.
Jul 26 2018, 8:30 AM

Jul 23 2018

labrinea added inline comments to D49229: [AggressiveInstCombine] Fold redundant masking operations of shifted value.
Jul 23 2018, 7:38 AM
labrinea added a reviewer for D49229: [AggressiveInstCombine] Fold redundant masking operations of shifted value: labrinea.
Jul 23 2018, 7:11 AM
labrinea added inline comments to D49229: [AggressiveInstCombine] Fold redundant masking operations of shifted value.
Jul 23 2018, 7:10 AM

Jul 20 2018

labrinea updated the diff for D49425: [MemorySSAUpdater] Update Phi operands after trivial Phi elimination.

Changes to prior revision.

  • Removed the update loop for PhiOps and used TrackingVH<MemoryAccess> instead.
  • Replaced the Bitcode reproducer with IR using -preserve-ll-uselistorder.
Jul 20 2018, 3:59 AM

Jul 19 2018

labrinea added a comment to D49425: [MemorySSAUpdater] Update Phi operands after trivial Phi elimination.

If the bitcode is crashing but the textual IR isn't, you're probably getting bitten by use-list ordering. You can use the preserve-ll-uselistorder option for "opt" to preserve it in IR.

Jul 19 2018, 9:46 AM
labrinea created D49555: [GVNHoist] safeToHoistLdSt incorrectly checks whether a defining access dominates the insertion point.
Jul 19 2018, 9:40 AM

Jul 18 2018

labrinea added a comment to D49425: [MemorySSAUpdater] Update Phi operands after trivial Phi elimination.

Does the original test-case crash reliably as IR for you? If so, please use that instead. (Phab won't let me download the attached bitcode, but with asan, I see use-after-free crashes 100% of the time in the original repro).

It does, but using opt -S -O3 ./tc_memphi_gvnhoist.ll -enable-gvn-hoist. Using bugpoint on that command you get the bitcode I uploaded.

Jul 18 2018, 3:05 AM

Jul 17 2018

labrinea added a comment to D49425: [MemorySSAUpdater] Update Phi operands after trivial Phi elimination.

A few remarks:

  • SmallVector<WeakVH, 8> PhiOps fixes the bug on its own (without the rest of the changes) and I am wondering why.
  • When we mark a block as visited why do we cache it? When the recursion ends we might trivially remove the Phi. In that case the second cache insertion for the same key block should fail, no?
  • Do we ever reach the PHIExistsButNeedsUpdate case? Is it when a Phi existed beforehand, meaning we did not create it? I can't think of another way to reach that state.
  • Interestingly enough the reproducer only made opt crash in bitcode form and not in IR form.
Jul 17 2018, 7:16 AM
labrinea created D49425: [MemorySSAUpdater] Update Phi operands after trivial Phi elimination.
Jul 17 2018, 7:08 AM