Apr 5 2023

OutOfCache committed rG04317d4da78e: [AMDGPU][GISel] Add inverse ballot intrinsic (authored by OutOfCache).
[AMDGPU][GISel] Add inverse ballot intrinsic
Apr 5 2023, 10:47 PM · Restricted Project, Restricted Project
OutOfCache closed D146287: [AMDGPU][GISel] Add inverse ballot intrinsic.
Apr 5 2023, 10:47 PM · Unknown Object (Project), Restricted Project, Restricted Project

Apr 3 2023

OutOfCache updated the diff for D146287: [AMDGPU][GISel] Add inverse ballot intrinsic.

I decided to keep the error output since it is more
descriptive than the Invalid Opcode error we
would otherwise output.

Apr 3 2023, 9:27 AM · Unknown Object (Project), Restricted Project, Restricted Project

Mar 31 2023

OutOfCache updated the diff for D146287: [AMDGPU][GISel] Add inverse ballot intrinsic.
  • add internal error if the mask size is not 32 or 64
Mar 31 2023, 5:04 AM · Unknown Object (Project), Restricted Project, Restricted Project

Mar 30 2023

OutOfCache added inline comments to D146287: [AMDGPU][GISel] Add inverse ballot intrinsic.
Mar 30 2023, 4:12 AM · Unknown Object (Project), Restricted Project, Restricted Project
OutOfCache updated the diff for D146287: [AMDGPU][GISel] Add inverse ballot intrinsic.
  • Simplify SelDAG with call to SIInstrInfo::readlaneVGPRToSGPR
  • Add comment to clarify that current GISel test is incorrect
  • remove [VCC] form pseudo instruction def
Mar 30 2023, 4:00 AM · Unknown Object (Project), Restricted Project, Restricted Project
OutOfCache added inline comments to D146287: [AMDGPU][GISel] Add inverse ballot intrinsic.
Mar 30 2023, 2:28 AM · Unknown Object (Project), Restricted Project, Restricted Project

Mar 28 2023

OutOfCache abandoned D146829: [AMDGPU] Remove unnecessary waitcnts.

After further discussion, it turns out that @mareko is right and the waitcnts are necessary. Thanks for bringing that up!

Mar 28 2023, 9:01 AM · Unknown Object (Project), Restricted Project, Restricted Project
OutOfCache added a comment to D146829: [AMDGPU] Remove unnecessary waitcnts.

The waitcnts serve two purposes. They signal that the result of the operation is available to the thread that requested it, and they ensure that the effect of the operation is visible to other threads before this thread continues with other operations. The latter purpose is what ensures the happens-before relationship in the memory model. For example, if a VMEM release atomic is performed at workgroup scope, should these operations be visible to other threads before the value that is store-released to VMEM?

If these operations go down the LDS queues (even if they are not performed in the LDS itself), then there are 2 queues for the waves of a workgroup, but a single L1 shared by all waves of a workgroup for VMEM. So, to ensure visibility to all waves in the workgroup, the LDS operation must complete before the VMEM operation starts if a happens-before relation is required. That wait is achieved by a waitcnt on LGKM before the VMEM instruction executes.

Mar 28 2023, 6:02 AM · Unknown Object (Project), Restricted Project, Restricted Project
OutOfCache added inline comments to D146287: [AMDGPU][GISel] Add inverse ballot intrinsic.
Mar 28 2023, 3:57 AM · Unknown Object (Project), Restricted Project, Restricted Project

Mar 27 2023

OutOfCache added a comment to D146829: [AMDGPU] Remove unnecessary waitcnts.

Are you sure about this? lgkmcnt(0) isn't about accessing LDS memory, but about waiting for the result to be received from the LDS block.

Mar 27 2023, 8:38 AM · Unknown Object (Project), Restricted Project, Restricted Project
OutOfCache added inline comments to D146829: [AMDGPU] Remove unnecessary waitcnts.
Mar 27 2023, 3:29 AM · Unknown Object (Project), Restricted Project, Restricted Project
OutOfCache updated the summary of D146829: [AMDGPU] Remove unnecessary waitcnts.
Mar 27 2023, 3:27 AM · Unknown Object (Project), Restricted Project, Restricted Project
OutOfCache updated the diff for D146829: [AMDGPU] Remove unnecessary waitcnts.
  • editing DSInstructions.td instead of SIInsertWaitcnts.cpp
Mar 27 2023, 3:26 AM · Unknown Object (Project), Restricted Project, Restricted Project

Mar 24 2023

OutOfCache updated the summary of D146829: [AMDGPU] Remove unnecessary waitcnts.
Mar 24 2023, 10:44 AM · Unknown Object (Project), Restricted Project, Restricted Project
OutOfCache requested review of D146829: [AMDGPU] Remove unnecessary waitcnts.
Mar 24 2023, 10:39 AM · Unknown Object (Project), Restricted Project, Restricted Project

Mar 21 2023

OutOfCache updated the diff for D146287: [AMDGPU][GISel] Add inverse ballot intrinsic.
  • Fix typos
  • Rename variable
Mar 21 2023, 10:42 AM · Unknown Object (Project), Restricted Project, Restricted Project
OutOfCache added inline comments to D146287: [AMDGPU][GISel] Add inverse ballot intrinsic.
Mar 21 2023, 10:35 AM · Unknown Object (Project), Restricted Project, Restricted Project
OutOfCache updated the diff for D146287: [AMDGPU][GISel] Add inverse ballot intrinsic.
  • Simplify GISel Implementation by using legalizeOperands
  • Remove unnecessary checks.
  • Remove redundant tests and move GISel ones to SDAG tests
  • Increase readability.
Mar 21 2023, 10:25 AM · Unknown Object (Project), Restricted Project, Restricted Project

Mar 17 2023

OutOfCache added inline comments to D146287: [AMDGPU][GISel] Add inverse ballot intrinsic.
Mar 17 2023, 5:25 AM · Unknown Object (Project), Restricted Project, Restricted Project
OutOfCache updated the summary of D146287: [AMDGPU][GISel] Add inverse ballot intrinsic.
Mar 17 2023, 5:22 AM · Unknown Object (Project), Restricted Project, Restricted Project
OutOfCache added reviewers for D146287: [AMDGPU][GISel] Add inverse ballot intrinsic: tsymalla, nhaehnle, arsenm, critson.
Mar 17 2023, 5:16 AM · Unknown Object (Project), Restricted Project, Restricted Project
OutOfCache requested review of D146287: [AMDGPU][GISel] Add inverse ballot intrinsic.
Mar 17 2023, 4:58 AM · Unknown Object (Project), Restricted Project, Restricted Project

Feb 22 2023

OutOfCache committed rGfc672b6a8b48: [AMDGPU] Improved wide multiplies (authored by OutOfCache).
[AMDGPU] Improved wide multiplies
Feb 22 2023, 7:41 AM · Restricted Project, Restricted Project
OutOfCache closed D140208: [AMDGPU] Improved wide multiplies.
Feb 22 2023, 7:40 AM · Restricted Project, Restricted Project

Feb 21 2023

OutOfCache updated the diff for D140208: [AMDGPU] Improved wide multiplies.

Remove gfx11 from MIR tests

Feb 21 2023, 6:42 AM · Restricted Project, Restricted Project
OutOfCache committed rGc9fd858172d0: [AMDGPU] MIR-Tests for Multiplication using KBA (authored by OutOfCache).
[AMDGPU] MIR-Tests for Multiplication using KBA
Feb 21 2023, 5:48 AM · Restricted Project, Restricted Project
OutOfCache updated the diff for D140208: [AMDGPU] Improved wide multiplies.

Added MIR Tests for gfx10 and gfx11.

Feb 21 2023, 1:19 AM · Restricted Project, Restricted Project

Feb 20 2023

OutOfCache committed rG959216f9b1f1: [AMDGPU] MIR-Tests for Multiplication using KBA (authored by OutOfCache).
[AMDGPU] MIR-Tests for Multiplication using KBA
Feb 20 2023, 11:42 PM · Restricted Project, Restricted Project

Feb 14 2023

OutOfCache added inline comments to D140208: [AMDGPU] Improved wide multiplies.
Feb 14 2023, 7:34 AM · Restricted Project, Restricted Project

Jan 23 2023

OutOfCache added inline comments to D140208: [AMDGPU] Improved wide multiplies.
Jan 23 2023, 7:15 AM · Restricted Project, Restricted Project
OutOfCache added inline comments to D140208: [AMDGPU] Improved wide multiplies.
Jan 23 2023, 7:09 AM · Restricted Project, Restricted Project
OutOfCache updated the diff for D140208: [AMDGPU] Improved wide multiplies.
  • [AMDGPU] reverting separation of for-loops
Jan 23 2023, 6:57 AM · Restricted Project, Restricted Project

Jan 22 2023

OutOfCache updated the diff for D140208: [AMDGPU] Improved wide multiplies.
  • [AMDGPU] reverting separation of for-loops
Jan 22 2023, 11:35 PM · Restricted Project, Restricted Project

Jan 20 2023

OutOfCache updated the diff for D140208: [AMDGPU] Improved wide multiplies.
  • [AMDGPU] reverting separation of for-loops
Jan 20 2023, 7:38 AM · Restricted Project, Restricted Project

Jan 16 2023

OutOfCache added inline comments to D140208: [AMDGPU] Improved wide multiplies.
Jan 16 2023, 7:17 AM · Restricted Project, Restricted Project

Jan 11 2023

OutOfCache added inline comments to D140208: [AMDGPU] Improved wide multiplies.
Jan 11 2023, 8:38 AM · Restricted Project, Restricted Project
OutOfCache updated the diff for D140208: [AMDGPU] Improved wide multiplies.
  • [AMDGPU] Removing redundant zero-check for mults
Jan 11 2023, 8:21 AM · Restricted Project, Restricted Project

Jan 10 2023

OutOfCache added inline comments to D140208: [AMDGPU] Improved wide multiplies.
Jan 10 2023, 4:32 AM · Restricted Project, Restricted Project
OutOfCache updated the diff for D140208: [AMDGPU] Improved wide multiplies.
  • [GISel] Adding KnownBitsAnalysis to Legalizer
  • [AMDGPU] Wide multiplies with Known Bits Analysis
  • [AMDGPU] Improved wide multiplies tests
  • [GISel/AMDGPU] caching results of isZero()
  • [AMDGPU] Inlining conditions in buildMultiply
  • [AMDGPU] Removing redundant zero-check for mults
Jan 10 2023, 4:21 AM · Restricted Project, Restricted Project
OutOfCache committed rGf33633f51243: [AMDGPU] adding test for partially masked operands (authored by OutOfCache).
[AMDGPU] adding test for partially masked operands
Jan 10 2023, 2:06 AM · Restricted Project, Restricted Project

Jan 5 2023

OutOfCache added inline comments to D140907: [GlobalISel] New combine to commute constant operands to the RHS.
Jan 5 2023, 12:55 AM · Restricted Project, Restricted Project

Dec 27 2022

OutOfCache updated the diff for D140208: [AMDGPU] Improved wide multiplies.
  • [AMDGPU] Wide multiplies with Known Bits Analysis
  • [AMDGPU] Improved wide multiplies tests
Dec 27 2022, 4:13 AM · Restricted Project, Restricted Project
OutOfCache updated the diff for D140208: [AMDGPU] Improved wide multiplies.
  • [AMDGPU] Improved wide multiplies tests
Dec 27 2022, 1:52 AM · Restricted Project, Restricted Project

Dec 26 2022

OutOfCache updated the diff for D140208: [AMDGPU] Improved wide multiplies.
  • [GISel] Adding KnownBitsAnalysis to Legalizer
  • [AMDGPU] Wide multiplies with Known Bits Analysis
  • [AMDGPU] Improved wide multiplies tests
Dec 26 2022, 11:20 PM · Restricted Project, Restricted Project
OutOfCache updated the diff for D140208: [AMDGPU] Improved wide multiplies.

Fixing constructor issues by moving the default parameter to the header.

Dec 26 2022, 8:29 AM · Restricted Project, Restricted Project
OutOfCache updated the diff for D140208: [AMDGPU] Improved wide multiplies.

Fixing constructor issues by moving the default parameter to the header.

Dec 26 2022, 8:28 AM · Restricted Project, Restricted Project

Dec 24 2022

OutOfCache updated the diff for D140208: [AMDGPU] Improved wide multiplies.

Improved tests with better naming.
Better variable naming in LegalizerInfo.
Added constructor again because that was causing issues.

Dec 24 2022, 1:50 AM · Restricted Project, Restricted Project
OutOfCache updated the diff for D140208: [AMDGPU] Improved wide multiplies.
  • [AMDGPU] Improved naming of mul-known-bits tests
  • [AMDGPU] Better variable naming in LegalizerInfo
Dec 24 2022, 1:48 AM · Restricted Project, Restricted Project

Dec 23 2022

OutOfCache updated the diff for D140208: [AMDGPU] Improved wide multiplies.
Dec 23 2022, 8:46 AM · Restricted Project, Restricted Project
OutOfCache updated the diff for D140208: [AMDGPU] Improved wide multiplies.

Improved tests with better naming.
Better variable naming in LegalizerInfo.
Added constructor again because that was causing issues.

Dec 23 2022, 8:43 AM · Restricted Project, Restricted Project

Dec 20 2022

OutOfCache committed rG5ee13e6c6527: [AMDGPU] Wide multiplies tests for D140208 (authored by OutOfCache).
[AMDGPU] Wide multiplies tests for D140208
Dec 20 2022, 3:09 AM · Restricted Project, Restricted Project

Dec 16 2022

OutOfCache updated the diff for D140208: [AMDGPU] Improved wide multiplies.

Adding the new test file

Dec 16 2022, 10:48 AM · Restricted Project, Restricted Project
OutOfCache updated the diff for D140208: [AMDGPU] Improved wide multiplies.

Addressing the redundant constructor and minor format issues

Dec 16 2022, 7:45 AM · Restricted Project, Restricted Project
OutOfCache updated the diff for D140208: [AMDGPU] Improved wide multiplies.

Adding missing commits.

Dec 16 2022, 5:32 AM · Restricted Project, Restricted Project
OutOfCache updated the diff for D140208: [AMDGPU] Improved wide multiplies.

Fixing indentation

Dec 16 2022, 5:31 AM · Restricted Project, Restricted Project
OutOfCache retitled D140208: [AMDGPU] Improved wide multiplies from [AMDGPU] Improved wide multiplies tests to [AMDGPU] Improved wide multiplies.
Dec 16 2022, 5:14 AM · Restricted Project, Restricted Project
OutOfCache updated the diff for D140208: [AMDGPU] Improved wide multiplies.

Added missing commits

Dec 16 2022, 5:06 AM · Restricted Project, Restricted Project
OutOfCache requested review of D140208: [AMDGPU] Improved wide multiplies.
Dec 16 2022, 5:00 AM · Restricted Project, Restricted Project