
bsaleil (Baptiste Saleil)
User

Projects

User does not belong to any projects.

User Details

User Since
Oct 10 2019, 12:42 PM (142 w, 5 d)

Recent Activity

Thu, Jun 23

bsaleil committed rG79e77a9f39f0: [AMDGPU] Flush the vmcnt counter in loop preheaders when necessary (authored by bsaleil).
[AMDGPU] Flush the vmcnt counter in loop preheaders when necessary
Thu, Jun 23, 7:54 AM · Restricted Project, Restricted Project
bsaleil closed D115747: [AMDGPU] Flush the vmcnt counter in loop preheader when necessary.
Thu, Jun 23, 7:53 AM · Restricted Project, Unknown Object (Project), Restricted Project

Wed, Jun 22

bsaleil updated the diff for D115747: [AMDGPU] Flush the vmcnt counter in loop preheader when necessary.

Address review comments

Wed, Jun 22, 7:49 AM · Restricted Project, Unknown Object (Project), Restricted Project
bsaleil added inline comments to D115747: [AMDGPU] Flush the vmcnt counter in loop preheader when necessary.
Wed, Jun 22, 6:42 AM · Restricted Project, Unknown Object (Project), Restricted Project

Tue, Jun 21

bsaleil updated the diff for D115747: [AMDGPU] Flush the vmcnt counter in loop preheader when necessary.

Addressed review comments: Use ranges in for loop and split generateWaitcntInstBefore to reuse its second half for generateWaitcntBlockEnd.
Also changed applyPreexistingWaitcnt to take a MachineBasicBlock::instr_iterator instead of a MachineBasicBlock::iterator to match the caller.

Tue, Jun 21, 2:23 PM · Restricted Project, Unknown Object (Project), Restricted Project

Jun 1 2022

bsaleil added inline comments to D115747: [AMDGPU] Flush the vmcnt counter in loop preheader when necessary.
Jun 1 2022, 9:37 AM · Restricted Project, Unknown Object (Project), Restricted Project

May 30 2022

bsaleil added inline comments to D115747: [AMDGPU] Flush the vmcnt counter in loop preheader when necessary.
May 30 2022, 2:07 PM · Restricted Project, Unknown Object (Project), Restricted Project
bsaleil updated the diff for D115747: [AMDGPU] Flush the vmcnt counter in loop preheader when necessary.

Addressed review comments and fixed debug build.

May 30 2022, 1:52 PM · Restricted Project, Unknown Object (Project), Restricted Project

May 26 2022

bsaleil updated the diff for D115747: [AMDGPU] Flush the vmcnt counter in loop preheader when necessary.

Add support for nested loops, and fix the case where there is no terminator but there are preexisting s_waitcnt instructions. I still need to address @foad's comments.

May 26 2022, 2:21 PM · Restricted Project, Unknown Object (Project), Restricted Project

Mar 22 2022

bsaleil added a comment to D115747: [AMDGPU] Flush the vmcnt counter in loop preheader when necessary.

@piotr I ran compile time testing and the patch has no significant impact. Worst case is 1.1% and is the only one above 1%. Average is below 0.1%.

Mar 22 2022, 10:40 AM · Restricted Project, Unknown Object (Project), Restricted Project

Mar 21 2022

bsaleil added a comment to D115747: [AMDGPU] Flush the vmcnt counter in loop preheader when necessary.

Gentle ping @foad @nhaehnle

Mar 21 2022, 12:18 PM · Restricted Project, Unknown Object (Project), Restricted Project

Mar 10 2022

bsaleil updated the diff for D115747: [AMDGPU] Flush the vmcnt counter in loop preheader when necessary.

Rebased and addressed @foad comment.

Mar 10 2022, 12:24 PM · Restricted Project, Unknown Object (Project), Restricted Project

Mar 3 2022

bsaleil updated the diff for D115747: [AMDGPU] Flush the vmcnt counter in loop preheader when necessary.

@foad I updated the patch. It is a lot simpler than the previous one, and it fixes both the GFX9 and GFX10 cases. But it may still have a significant impact on compile time; I cannot think of another way to do that without visiting all the instructions in the loops :(

Mar 3 2022, 10:50 AM · Restricted Project, Unknown Object (Project), Restricted Project

Feb 17 2022

bsaleil added a comment to D115747: [AMDGPU] Flush the vmcnt counter in loop preheader when necessary.

I agree that the code is a lot more complicated with this patch than it was with the previous patch. I think this improvement cannot be implemented simply by looking at the WaitcntBrackets without requiring all this refactoring.
So this means we have two choices:

  1. Original patch: Before inserting the waitcnts, visit all the loop instructions a single time (no fixed-point) until we visit all the instructions, or until we find an instruction that invalidates the optimization. Depending on what we found, flush in preheaders or not before generating the waitcnts. This is a lot simpler.
  2. New patch with refactoring: No need to visit the instructions before inserting the waitcnt, but we need to compute two brackets for each block, and keep two waitcnt lists until we decide which one we want to generate. This is a lot more complicated.

The original motivation for working on 2. was concern about the compile-time impact of 1., but because we need to compute two brackets for each block, I actually don't think that 2. is more efficient. Both the GFX9 and GFX10 improvements can be implemented with either 1. or 2. So for the sake of simplicity, I think I should go back to the original patch.
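
Option 1 can be sketched as a single early-exit scan. This is only an illustration in Python, not the patch itself: `should_flush_in_preheader`, the `invalidates` predicate, and the instruction names are hypothetical, and the sketch assumes that finding an invalidating instruction means the preheader is not flushed.

```python
from typing import Callable, Iterable, Tuple


def should_flush_in_preheader(
    loop_instructions: Iterable[str],
    invalidates: Callable[[str], bool],
) -> Tuple[bool, int]:
    """Scan the loop body once (no fixed point).

    Returns (flush, visited): whether to flush in the preheader and how
    many instructions were looked at before the scan stopped.
    """
    visited = 0
    for inst in loop_instructions:
        visited += 1
        if invalidates(inst):
            # Assumption: one invalidating instruction is enough to give up.
            return False, visited
    return True, visited


# Hypothetical instruction names and predicate, for illustration only.
body = ["use_vgpr", "valu_op", "vmem_load", "valu_op"]
is_vmem_load = lambda inst: inst == "vmem_load"
```

In the worst case the scan still visits every instruction of the loop once, which is the compile-time concern raised about option 1.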

Feb 17 2022, 9:31 AM · Restricted Project, Unknown Object (Project), Restricted Project

Feb 16 2022

bsaleil added reviewers for D115747: [AMDGPU] Flush the vmcnt counter in loop preheader when necessary: foad, nhaehnle, Joe_Nash.
Feb 16 2022, 1:28 PM · Restricted Project, Unknown Object (Project), Restricted Project

Feb 15 2022

bsaleil updated the summary of D115747: [AMDGPU] Flush the vmcnt counter in loop preheader when necessary.
Feb 15 2022, 2:44 PM · Restricted Project, Unknown Object (Project), Restricted Project
bsaleil updated the summary of D115747: [AMDGPU] Flush the vmcnt counter in loop preheader when necessary.
Feb 15 2022, 2:42 PM · Restricted Project, Unknown Object (Project), Restricted Project
bsaleil updated the diff for D115747: [AMDGPU] Flush the vmcnt counter in loop preheader when necessary.

Refactoring of the pass. We compute the brackets for both the flushed and non-flushed versions of each outer loop until we finish visiting the loop or until we decide it is not worth flushing in the preheader. Instead of generating the waitcnts, we add them to two separate lists (for the flushed and non-flushed versions). After all the blocks are visited, we generate one of the two lists depending on the decision we made.

Feb 15 2022, 2:36 PM · Restricted Project, Unknown Object (Project), Restricted Project

Dec 20 2021

bsaleil added a comment to D115747: [AMDGPU] Flush the vmcnt counter in loop preheader when necessary.

@arsenm the idea is to rotate the processing loop (runOnMachineFunction) of the SIInsertWaitcnt pass, so we can make decisions from predecessors, not the actual machine IR loop containing the waitcnt:

Note that the loop over basic blocks in runOnMachineFunction currently looks like:

for each block:
  get saved state for this block
  process block
  merge new state into saved state for each successor

But if necessary we could change it to:

for each block:
  merge saved state from each predecessor
  process block
  save state for this block
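
The two loop shapes above can be compared on a toy control-flow graph. The following Python sketch is an assumption-laden illustration, not code from SIInsertWaitcnts: the CFG, the set-union merge, and the `visit` function are hypothetical stand-ins, and it performs a single pass without modelling any re-iteration.

```python
from typing import Callable, Dict, List, Set

State = Set[str]

# Hypothetical toy CFG: block name -> successors, in visiting order.
CFG: Dict[str, List[str]] = {
    "entry": ["loop"],
    "loop": ["loop", "exit"],
    "exit": [],
}


def predecessors(cfg: Dict[str, List[str]]) -> Dict[str, List[str]]:
    preds: Dict[str, List[str]] = {b: [] for b in cfg}
    for block, succs in cfg.items():
        for succ in succs:
            preds[succ].append(block)
    return preds


def push_style(cfg, process: Callable[[str, State], State]) -> Dict[str, State]:
    """Current shape: saved[b] is the state on entry to b."""
    saved: Dict[str, State] = {b: set() for b in cfg}
    for block in cfg:
        out_state = process(block, set(saved[block]))
        for succ in cfg[block]:
            saved[succ] |= out_state  # merge into each successor
    return saved


def pull_style(cfg, process: Callable[[str, State], State]) -> Dict[str, State]:
    """Rotated shape: saved[b] is the state on exit from b."""
    preds = predecessors(cfg)
    saved: Dict[str, State] = {b: set() for b in cfg}
    for block in cfg:
        in_state: State = set()
        for pred in preds[block]:
            in_state |= saved[pred]  # merge from each predecessor
        saved[block] = process(block, in_state)
    return saved


def visit(block: str, state: State) -> State:
    # Toy transfer function: record that this block was seen.
    return state | {block}
```

In the rotated form, the predecessors' saved states are all available at the point where a block is about to be processed, which is what allows a decision to be made from them.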
Dec 20 2021, 12:02 PM · Restricted Project, Unknown Object (Project), Restricted Project
bsaleil added a comment to D115747: [AMDGPU] Flush the vmcnt counter in loop preheader when necessary.

That makes sense, thanks for the comments. So maybe I can start by trying to implement the idea of rotating the processing loop and making a decision from the predecessors and see if that solution would be enough to fix the most common cases we observed. Then, if control flow inside loops is really an issue with that solution, I can try the other solution based on GPR liveness, but I also don't like the idea of having another fixed-point loop that could significantly degrade compilation time.

Dec 20 2021, 11:46 AM · Restricted Project, Unknown Object (Project), Restricted Project

Dec 14 2021

bsaleil requested review of D115747: [AMDGPU] Flush the vmcnt counter in loop preheader when necessary.
Dec 14 2021, 12:03 PM · Restricted Project, Unknown Object (Project), Restricted Project

May 12 2021

bsaleil committed rG5885f1a4cb0b: [AMDGPU] Disable the SIFormMemoryClauses pass at -O1 (authored by bsaleil).
[AMDGPU] Disable the SIFormMemoryClauses pass at -O1
May 12 2021, 8:53 AM
bsaleil closed D101939: [AMDGPU] Disable the SIFormMemoryClauses pass at -O1.
May 12 2021, 8:52 AM · Unknown Object (Project), Restricted Project

May 11 2021

bsaleil updated the diff for D101939: [AMDGPU] Disable the SIFormMemoryClauses pass at -O1.

Revert the changes improving the pass implementation; they will be addressed in another patch.

May 11 2021, 1:40 PM · Unknown Object (Project), Restricted Project

May 7 2021

bsaleil added a comment to D101939: [AMDGPU] Disable the SIFormMemoryClauses pass at -O1.

Can we address it being slow instead of just turning off everything that is? There's no reason it should be slow, but the way it uses liveness information is horrible

On small functions I think the run time of this pass is dominated by the two calls to TRI->getAllocatableSet, which means two identical calls to SIRegisterInfo::getReservedRegs, which has a very large fixed cost. Is there some way the Reserved register set can be cached (in the SIMachineFunctionInfo object?)?

May 7 2021, 8:24 AM · Unknown Object (Project), Restricted Project
bsaleil updated the diff for D101939: [AMDGPU] Disable the SIFormMemoryClauses pass at -O1.

Add the TargetRegisterInfo::getAllocatableSets method to avoid calling getReservedRegs multiple times when we want allocatable sets for multiple register classes.
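
The idea of sharing one reserved-register query across several register classes can be sketched as follows. This is a hypothetical Python model rather than the LLVM interface: `ToyRegisterInfo`, its method names, and the register names are assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List


class ToyRegisterInfo:
    """Hypothetical stand-in for a register-info object; not the LLVM API."""

    def __init__(self, reserved: FrozenSet[str]) -> None:
        self._reserved = reserved
        self.reserved_calls = 0  # counts how often the costly query runs

    def get_reserved_regs(self) -> FrozenSet[str]:
        # Stands in for a query with a large fixed cost.
        self.reserved_calls += 1
        return self._reserved

    def get_allocatable_set(self, reg_class: List[str]) -> List[str]:
        # One reserved-register query per class.
        reserved = self.get_reserved_regs()
        return [r for r in reg_class if r not in reserved]

    def get_allocatable_sets(
        self, reg_classes: Dict[str, List[str]]
    ) -> Dict[str, List[str]]:
        # One reserved-register query shared by all classes.
        reserved = self.get_reserved_regs()
        return {
            name: [r for r in regs if r not in reserved]
            for name, regs in reg_classes.items()
        }
```

With two register classes, the batched call runs the reserved-register query once, where two separate per-class calls would run it twice.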

May 7 2021, 8:18 AM · Unknown Object (Project), Restricted Project

May 5 2021

bsaleil added a comment to D101939: [AMDGPU] Disable the SIFormMemoryClauses pass at -O1.

Can we address it being slow instead of just turning off everything that is? There's no reason it should be slow, but the way it uses liveness information is horrible

May 5 2021, 1:26 PM · Unknown Object (Project), Restricted Project
bsaleil requested review of D101939: [AMDGPU] Disable the SIFormMemoryClauses pass at -O1.
May 5 2021, 1:09 PM · Unknown Object (Project), Restricted Project
bsaleil committed rG83646f60a8a4: [AMDGPU] Fix llc pipeline lit test for bots enabling expensive checks (authored by bsaleil).
[AMDGPU] Fix llc pipeline lit test for bots enabling expensive checks
May 5 2021, 7:58 AM
bsaleil added a comment to D101414: [AMDGPU] Disable the scalar IR, SDWA and load store vectorizer passes at -O1.

llc-pipeline.ll has problems with -DLLVM_ENABLE_EXPENSIVE_CHECKS=1 as instruction verification seems to still take place.

test/CodeGen/X86/opt-pipeline.ll uses grep -v to filter these lines out.

May 5 2021, 7:41 AM · Unknown Object (Project), Restricted Project

May 4 2021

bsaleil committed rG845c8a60e9f3: [AMDGPU] Add rm line to lit test to cleanup bots (authored by bsaleil).
[AMDGPU] Add rm line to lit test to cleanup bots
May 4 2021, 3:29 PM
bsaleil added a comment to D101414: [AMDGPU] Disable the scalar IR, SDWA and load store vectorizer passes at -O1.

This breaks incremental builders: http://45.33.8.238/linux/45830/step_12.txt

Ptal. You'll need a -o /dev/null and an rm -f for the temp file to clean up bots (can remove the latter after a bit)

May 4 2021, 3:16 PM · Unknown Object (Project), Restricted Project
bsaleil committed rGa018bd51998d: [AMDGPU] Fix lit failure introduced by 6a17609157196878b9cd9aa9ce71bde247ca14db (authored by bsaleil).
[AMDGPU] Fix lit failure introduced by 6a17609157196878b9cd9aa9ce71bde247ca14db
May 4 2021, 2:26 PM
bsaleil committed rG6a1760915719: [AMDGPU] Disable the scalar IR, SDWA and load store vectorizer passes at -O1 (authored by bsaleil).
[AMDGPU] Disable the scalar IR, SDWA and load store vectorizer passes at -O1
May 4 2021, 1:45 PM
bsaleil closed D101414: [AMDGPU] Disable the scalar IR, SDWA and load store vectorizer passes at -O1.
May 4 2021, 1:45 PM · Unknown Object (Project), Restricted Project

Apr 30 2021

bsaleil added inline comments to D101414: [AMDGPU] Disable the scalar IR, SDWA and load store vectorizer passes at -O1.
Apr 30 2021, 1:23 PM · Unknown Object (Project), Restricted Project
bsaleil updated the diff for D101414: [AMDGPU] Disable the scalar IR, SDWA and load store vectorizer passes at -O1.

Enable the passes only at -O2 and above, or when explicitly enabled with flags. Also update the test case to ensure the passes are run at -O1 when explicitly enabled.

Apr 30 2021, 1:22 PM · Unknown Object (Project), Restricted Project

Apr 29 2021

bsaleil updated the diff for D101414: [AMDGPU] Disable the scalar IR, SDWA and load store vectorizer passes at -O1.

Remove checks for the "Pass Arguments:" lines in the test case.

Apr 29 2021, 1:09 PM · Unknown Object (Project), Restricted Project

Apr 28 2021

bsaleil added a comment to D101414: [AMDGPU] Disable the scalar IR, SDWA and load store vectorizer passes at -O1.

Needs a testcase. I think we have some pass pipeline tests already to show what's run (maybe not for -O1, I didn't know anyone actually used it)

Apr 28 2021, 10:58 AM · Unknown Object (Project), Restricted Project
bsaleil updated the diff for D101414: [AMDGPU] Disable the scalar IR, SDWA and load store vectorizer passes at -O1.

Add test case

Apr 28 2021, 10:56 AM · Unknown Object (Project), Restricted Project

Apr 27 2021

bsaleil requested review of D101414: [AMDGPU] Disable the scalar IR, SDWA and load store vectorizer passes at -O1.
Apr 27 2021, 4:55 PM · Unknown Object (Project), Restricted Project

Apr 26 2021

bsaleil committed rGcaf1294d9578: [AMDGPU] Experiments show that the GCNRegBankReassign pass significantly impacts (authored by bsaleil).
[AMDGPU] Experiments show that the GCNRegBankReassign pass significantly impacts
Apr 26 2021, 2:22 PM
bsaleil closed D101313: [AMDGPU] Drop the GCNRegBankReassign pass.
Apr 26 2021, 2:22 PM · Unknown Object (Project), Restricted Project
bsaleil retitled D101313: [AMDGPU] Drop the GCNRegBankReassign pass from Drop the GCNRegBankReassign pass to [AMDGPU] Drop the GCNRegBankReassign pass.
Apr 26 2021, 2:01 PM · Unknown Object (Project), Restricted Project
bsaleil updated the diff for D101313: [AMDGPU] Drop the GCNRegBankReassign pass.

Update patch so it applies upstream.

Apr 26 2021, 11:41 AM · Unknown Object (Project), Restricted Project
bsaleil added reviewers for D101313: [AMDGPU] Drop the GCNRegBankReassign pass: piotr, foad, rampitec.
Apr 26 2021, 11:17 AM · Unknown Object (Project), Restricted Project
bsaleil requested review of D101313: [AMDGPU] Drop the GCNRegBankReassign pass.
Apr 26 2021, 11:14 AM · Unknown Object (Project), Restricted Project

Feb 11 2021

bsaleil added inline comments to D96346: [NFC][PPC] Refactor TOC representation to allow several entries for the same symbol.
Feb 11 2021, 9:43 AM · Restricted Project, Restricted Project
bsaleil updated the diff for D96346: [NFC][PPC] Refactor TOC representation to allow several entries for the same symbol.

Specialize DenseMapInfo directly with the PPC TOC key type instead of VariantKind.

Feb 11 2021, 9:41 AM · Restricted Project, Restricted Project

Feb 10 2021

bsaleil added inline comments to D96346: [NFC][PPC] Refactor TOC representation to allow several entries for the same symbol.
Feb 10 2021, 3:16 PM · Restricted Project, Restricted Project
bsaleil updated the summary of D96184: [AIX][TLS] Generate TLS variables in assembly files.
Feb 10 2021, 3:07 PM · Restricted Project, Restricted Project
bsaleil updated the diff for D96184: [AIX][TLS] Generate TLS variables in assembly files.

Generate uninitialized external TLS data with the .csect directive instead of .comm

Feb 10 2021, 3:06 PM · Restricted Project, Restricted Project
bsaleil added inline comments to D96346: [NFC][PPC] Refactor TOC representation to allow several entries for the same symbol.
Feb 10 2021, 7:57 AM · Restricted Project, Restricted Project

Feb 9 2021

bsaleil closed D96073: Add Python binary path to CMake arguments for the clang-ppc64le-linux builder.

I forgot to link the commit with this differential, but the patch has been committed: https://github.com/llvm/llvm-zorg/commit/45c4f238dc6fd9855b8578aa3ca3b8db336efb7e

Feb 9 2021, 10:41 AM · Restricted Project, Restricted Project
bsaleil committed rZORG45c4f238dc6f: Add Python binary path to CMake arguments for the clang-ppc64le-linux builder (authored by bsaleil).
Add Python binary path to CMake arguments for the clang-ppc64le-linux builder
Feb 9 2021, 10:38 AM
bsaleil requested review of D96346: [NFC][PPC] Refactor TOC representation to allow several entries for the same symbol.
Feb 9 2021, 7:52 AM · Restricted Project, Restricted Project

Feb 8 2021

bsaleil added a reviewer for D96184: [AIX][TLS] Generate TLS variables in assembly files: hubert.reinterpretcast.
Feb 8 2021, 1:56 PM · Restricted Project, Restricted Project
bsaleil updated the diff for D96184: [AIX][TLS] Generate TLS variables in assembly files.

Add support for variables with internal linkage.

Feb 8 2021, 1:31 PM · Restricted Project, Restricted Project
bsaleil added inline comments to D92405: [VirtRegRewriter] Insert missing killed flags when tracking subregister liveness.
Feb 8 2021, 7:06 AM · Restricted Project

Feb 5 2021

bsaleil requested review of D96184: [AIX][TLS] Generate TLS variables in assembly files.
Feb 5 2021, 2:51 PM · Restricted Project, Restricted Project

Feb 4 2021

bsaleil requested review of D96073: Add Python binary path to CMake arguments for the clang-ppc64le-linux builder.
Feb 4 2021, 12:32 PM · Restricted Project, Restricted Project

Jan 25 2021

bsaleil updated the diff for D92405: [VirtRegRewriter] Insert missing killed flags when tracking subregister liveness.

Rebase and reduce PPC test case.

Jan 25 2021, 8:32 PM · Restricted Project

Jan 22 2021

bsaleil accepted D95116: [PowerPC] Update PC-Relative Load/Store Patterns to use the refactored Load/Store Implementation.

LGTM, I only have a minor comment.

Jan 22 2021, 3:44 PM · Restricted Project, Restricted Project
bsaleil accepted D93370: [PowerPC] Add new infrastructure to select load/store instructions, update P8/P9 load/store patterns..

Thanks for addressing the comments, LGTM now.

Jan 22 2021, 3:10 PM · Restricted Project, Restricted Project
bsaleil accepted D94980: [PowerPC] Do not emit HW loop with half precision operations.

LGTM

Jan 22 2021, 2:27 PM · Restricted Project

Jan 20 2021

bsaleil added a comment to D92405: [VirtRegRewriter] Insert missing killed flags when tracking subregister liveness.

Gentle ping :)

Jan 20 2021, 7:26 AM · Restricted Project

Jan 15 2021

bsaleil added inline comments to D94454: [PowerPC] Exploit the vinsw, vinsd, and vins[wd][lr]x instructions on P10.
Jan 15 2021, 2:42 PM · Restricted Project, Restricted Project
bsaleil updated the diff for D94454: [PowerPC] Exploit the vinsw, vinsd, and vins[wd][lr]x instructions on P10.

Use isISA3_1 instead of hasP10Vector and add P9 run line in test case.

Jan 15 2021, 2:41 PM · Restricted Project, Restricted Project
bsaleil accepted D94498: [PowerPC][NFC] Update atomic patterns to use the refactored load/store implementation.

LGTM

Jan 15 2021, 1:45 PM · Restricted Project, Restricted Project

Jan 12 2021

bsaleil added a comment to D93370: [PowerPC] Add new infrastructure to select load/store instructions, update P8/P9 load/store patterns..

Thanks a lot for working on that, Amy! I have some comments on the patch.

Jan 12 2021, 1:55 PM · Restricted Project, Restricted Project

Jan 11 2021

bsaleil added reviewers for D94454: [PowerPC] Exploit the vinsw, vinsd, and vins[wd][lr]x instructions on P10: nemanjai, stefanp.
Jan 11 2021, 2:37 PM · Restricted Project, Restricted Project
bsaleil requested review of D94454: [PowerPC] Exploit the vinsw, vinsd, and vins[wd][lr]x instructions on P10.
Jan 11 2021, 2:36 PM · Restricted Project, Restricted Project

Jan 5 2021

bsaleil added inline comments to D92405: [VirtRegRewriter] Insert missing killed flags when tracking subregister liveness.
Jan 5 2021, 9:01 PM · Restricted Project
bsaleil updated the diff for D92405: [VirtRegRewriter] Insert missing killed flags when tracking subregister liveness.

Remove unrelated NFC change, simplify lane mask computation and MIR code

Jan 5 2021, 8:56 PM · Restricted Project
bsaleil added inline comments to D94058: [PowerPC] Sign extend comparison operand for signed atomic comparisons.
Jan 5 2021, 8:48 AM · Restricted Project

Dec 16 2020

bsaleil updated the diff for D91974: [PowerPC] Rename the vector pair intrinsics and builtins to replace the _mma_ prefix by _vsx_.

Rebase and fix comment

Dec 16 2020, 9:27 AM · Restricted Project, Restricted Project, Restricted Project

Dec 15 2020

bsaleil committed rG57d83c3a90c4: [PowerPC] Enable paired vector type and intrinsics when MMA is disabled (authored by bsaleil).
[PowerPC] Enable paired vector type and intrinsics when MMA is disabled
Dec 15 2020, 1:15 PM
bsaleil closed D91819: [PowerPC] Enable paired vector type and intrinsics when MMA is disabled.
Dec 15 2020, 1:14 PM · Restricted Project, Restricted Project, Restricted Project
bsaleil added inline comments to D91819: [PowerPC] Enable paired vector type and intrinsics when MMA is disabled.
Dec 15 2020, 10:26 AM · Restricted Project, Restricted Project, Restricted Project

Dec 8 2020

bsaleil accepted D92864: [PowerPC] Set SubRegIndex offset for sub_vsx1/sub_pair1.

Good catch, thanks for fixing that.

Dec 8 2020, 1:38 PM · Restricted Project

Dec 4 2020

bsaleil accepted D92420: [PowerPC] Exploitation of xxeval instruction for AND and NAND.

LGTM, just a minor comment regarding the multiclass.

Dec 4 2020, 7:55 AM · Restricted Project

Dec 3 2020

bsaleil committed rG45ec3a37b0a5: [PowerPC] Fix for excessive ACC copies due to PHI nodes (authored by bsaleil).
[PowerPC] Fix for excessive ACC copies due to PHI nodes
Dec 3 2020, 7:52 AM
bsaleil closed D91391: [PowerPC] Fix for excessive ACC copies due to PHI nodes.
Dec 3 2020, 7:52 AM · Restricted Project

Dec 2 2020

bsaleil updated the diff for D91391: [PowerPC] Fix for excessive ACC copies due to PHI nodes.

Update patch so it applies to ToT

Dec 2 2020, 3:06 PM · Restricted Project
bsaleil updated the diff for D91391: [PowerPC] Fix for excessive ACC copies due to PHI nodes.

Fix comments

Dec 2 2020, 2:46 PM · Restricted Project

Dec 1 2020

bsaleil added inline comments to D92420: [PowerPC] Exploitation of xxeval instruction for AND and NAND.
Dec 1 2020, 1:28 PM · Restricted Project
bsaleil requested review of D92405: [VirtRegRewriter] Insert missing killed flags when tracking subregister liveness.
Dec 1 2020, 10:18 AM · Restricted Project

Nov 26 2020

bsaleil added a comment to D92139: [PowerPC] Add `hasSideEffects=0` for PLXVP and PSTXVP instructions definition .

Why do we need these flags for PLXVP and PSTXVP? Please add a test case showing why this is required.

Nov 26 2020, 1:10 PM · Restricted Project

Nov 24 2020

bsaleil accepted D91420: [PowerPC][PCRelative] Add new pseudo instructions for PCRel TLS to fix R2 clobber issue.

LGTM, thanks.

Nov 24 2020, 8:24 AM · Restricted Project

Nov 23 2020

bsaleil updated the summary of D91974: [PowerPC] Rename the vector pair intrinsics and builtins to replace the _mma_ prefix by _vsx_.
Nov 23 2020, 9:24 AM · Restricted Project, Restricted Project, Restricted Project
bsaleil retitled D91974: [PowerPC] Rename the vector pair intrinsics and builtins to replace the _mma_ prefix by _vsx_ from [PowerPC] Rename the pair intrinsics and builtins to replace the _mma_ prefix by _vsx_ to [PowerPC] Rename the vector pair intrinsics and builtins to replace the _mma_ prefix by _vsx_.
Nov 23 2020, 9:23 AM · Restricted Project, Restricted Project, Restricted Project
bsaleil requested review of D91974: [PowerPC] Rename the vector pair intrinsics and builtins to replace the _mma_ prefix by _vsx_.
Nov 23 2020, 9:23 AM · Restricted Project, Restricted Project, Restricted Project

Nov 19 2020

bsaleil added a reviewer for D91819: [PowerPC] Enable paired vector type and intrinsics when MMA is disabled: Restricted Project.
Nov 19 2020, 1:29 PM · Restricted Project, Restricted Project, Restricted Project
bsaleil requested review of D91819: [PowerPC] Enable paired vector type and intrinsics when MMA is disabled.
Nov 19 2020, 1:27 PM · Restricted Project, Restricted Project, Restricted Project

Nov 18 2020

bsaleil committed rG18db29ea6fb6: [PowerPC] Add peephole to remove redundant accumulator prime/unprime… (authored by bsaleil).
[PowerPC] Add peephole to remove redundant accumulator prime/unprime…
Nov 18 2020, 1:01 PM
bsaleil closed D91386: [PowerPC] Add peephole to remove redundant accumulator prime/unprime instructions.
Nov 18 2020, 1:01 PM · Restricted Project, Restricted Project

Nov 16 2020

bsaleil updated the diff for D91386: [PowerPC] Add peephole to remove redundant accumulator prime/unprime instructions.

Update comments

Nov 16 2020, 7:57 AM · Restricted Project, Restricted Project

Nov 13 2020

bsaleil committed rG3f78605a8cb1: [PowerPC] Add paired vector load and store builtins and intrinsics (authored by bsaleil).
[PowerPC] Add paired vector load and store builtins and intrinsics
Nov 13 2020, 10:36 AM
bsaleil closed D90799: [PowerPC] Add paired vector load and store builtins and intrinsics.
Nov 13 2020, 10:35 AM · Restricted Project, Restricted Project, Restricted Project
bsaleil added inline comments to D91420: [PowerPC][PCRelative] Add new pseudo instructions for PCRel TLS to fix R2 clobber issue.
Nov 13 2020, 8:45 AM · Restricted Project