
Flakebi (Sebastian Neubauer)
User

Projects

User does not belong to any projects.

User Details

User Since
Oct 13 2017, 7:01 AM (137 w, 2 d)

Recent Activity

Wed, May 20

Flakebi updated the diff for D76836: [AMDGPU] Add G16 support to image instructions.

As Nicolai suggested, I added an operand which encodes A16- and G16-ness.

Wed, May 20, 9:16 AM · Unknown Object (Project), Restricted Project

Apr 27 2020

Flakebi added a comment to D76278: [AMDGPU] Don't mark the .note section as ALLOC.

ping

Apr 27 2020, 5:19 AM · Restricted Project
Flakebi created D78913: [CMake] Fix cross-compiling with LLVM as CMake subproject.
Apr 27 2020, 4:15 AM · Restricted Project

Apr 9 2020

Flakebi added a comment to D76278: [AMDGPU] Don't mark the .note section as ALLOC.

ping

Apr 9 2020, 2:41 AM · Restricted Project

Apr 2 2020

Flakebi added a comment to D76836: [AMDGPU] Add G16 support to image instructions.

What is the best way to handle A16 and G16 in instruction selection for GlobalISel?

Apr 2 2020, 3:13 AM · Unknown Object (Project), Restricted Project

Mar 31 2020

Flakebi accepted D77090: AMDGPU/GlobalISel: Change intrinsic ID for _L to _LZ opt.

I think any operand changes should be done during legalization, and selection should be relatively simple. We already do more of this type of conversion during legalize, so I think it makes sense to consolidate it there.

Mar 31 2020, 6:36 AM
Flakebi added a comment to D77090: AMDGPU/GlobalISel: Change intrinsic ID for _L to _LZ opt.

What is the reason to do this during legalization?
Could we look for a constant zero argument during instruction selection?

Mar 31 2020, 2:45 AM

Mar 30 2020

Flakebi added inline comments to D65088: [AMDGPU][RFC] New llvm.amdgcn.ballot intrinsic.
Mar 30 2020, 10:16 AM · Restricted Project
Flakebi updated the diff for D65088: [AMDGPU][RFC] New llvm.amdgcn.ballot intrinsic.

Add uniformity test

Mar 30 2020, 10:16 AM · Restricted Project
Flakebi updated the diff for D65088: [AMDGPU][RFC] New llvm.amdgcn.ballot intrinsic.

return instead of report_fatal_error

Mar 30 2020, 5:55 AM · Restricted Project
Flakebi added inline comments to D74316: AMDGPU/GlobalISel: Start selecting image intrinsics.
Mar 30 2020, 2:40 AM · Restricted Project

Mar 26 2020

Flakebi updated the diff for D76836: [AMDGPU] Add G16 support to image instructions.

Thank you for the fast review!

Mar 26 2020, 10:50 AM · Unknown Object (Project), Restricted Project
Flakebi added a comment to D76278: [AMDGPU] Don't mark the .note section as ALLOC.

Anyone willing to review this?

Mar 26 2020, 4:17 AM · Restricted Project
Flakebi updated the diff for D76836: [AMDGPU] Add G16 support to image instructions.

Forgot to run formatter

Mar 26 2020, 4:17 AM · Unknown Object (Project), Restricted Project
Flakebi created D76836: [AMDGPU] Add G16 support to image instructions.
Mar 26 2020, 4:17 AM · Unknown Object (Project), Restricted Project

Mar 23 2020

Flakebi updated the diff for D76278: [AMDGPU] Don't mark the .note section as ALLOC.

Fix formatting issue

Mar 23 2020, 9:14 AM · Restricted Project
Flakebi updated the diff for D65088: [AMDGPU][RFC] New llvm.amdgcn.ballot intrinsic.

Move the code to lowering again; I’m back where Jay started.
Report a fatal error if the size is neither i32 nor i64.

Mar 23 2020, 9:14 AM · Restricted Project

Mar 20 2020

Flakebi updated the diff for D65088: [AMDGPU][RFC] New llvm.amdgcn.ballot intrinsic.

I removed the COPY_TO_REGCLASS; it looked flaky and does not work with GlobalISel.
Instead, the ballot intrinsic is morphed into an AMDGPUISD::SETCC. Compares are the only reasonable way to get a boolean value into the wavefront form as an i32/i64 and use it in LLVM.

Mar 20 2020, 3:13 AM · Restricted Project
Flakebi updated the diff for D65088: [AMDGPU][RFC] New llvm.amdgcn.ballot intrinsic.
Mar 20 2020, 3:13 AM · Restricted Project

Mar 19 2020

Flakebi created D76427: [AMDGPU][RFC] Use default value for PrivateLabelPrefix.
Mar 19 2020, 5:21 AM · Restricted Project

Mar 18 2020

Flakebi updated the diff for D65088: [AMDGPU][RFC] New llvm.amdgcn.ballot intrinsic.

Add missing wave32 instcombining

Mar 18 2020, 10:20 AM · Restricted Project
Flakebi added inline comments to D65088: [AMDGPU][RFC] New llvm.amdgcn.ballot intrinsic.
Mar 18 2020, 9:47 AM · Restricted Project
Flakebi updated the diff for D65088: [AMDGPU][RFC] New llvm.amdgcn.ballot intrinsic.

Use getCopyFromReg(exec) and rebase on fixed whole-wave-mode.

Mar 18 2020, 3:13 AM · Restricted Project
Flakebi closed D75976: [AMDGPU] Optimize AtomicOptimizer.

Unfortunately this does not work anymore with the updated ballot intrinsic. I’ll leave this for later, see also D65088.

Mar 18 2020, 3:13 AM · Restricted Project

Mar 17 2020

Flakebi created D76278: [AMDGPU] Don't mark the .note section as ALLOC.
Mar 17 2020, 5:43 AM · Restricted Project
Flakebi updated the diff for D76232: [AMDGPU] Fix whole wavefront mode.

Use mayReadEXEC as suggested by Matt.

Mar 17 2020, 3:00 AM · Restricted Project

Mar 16 2020

Flakebi created D76232: [AMDGPU] Fix whole wavefront mode.
Mar 16 2020, 9:14 AM · Restricted Project

Mar 13 2020

Flakebi abandoned D75857: [AMDGPU] Fix using physical registers in vector instructions.

Thanks Matt, I’ll use CopyFromReg.

Mar 13 2020, 9:06 AM · Restricted Project

Mar 12 2020

Flakebi added inline comments to D65088: [AMDGPU][RFC] New llvm.amdgcn.ballot intrinsic.
Mar 12 2020, 2:19 AM · Restricted Project
Flakebi added a comment to D75857: [AMDGPU] Fix using physical registers in vector instructions.

I still think these should not be seeing physical register operands, and it would be better to fix this by avoiding that situation.

Mar 12 2020, 12:52 AM · Restricted Project

Mar 11 2020

Flakebi updated the diff for D75857: [AMDGPU] Fix using physical registers in vector instructions.

Add test

Mar 11 2020, 9:04 AM · Restricted Project
Flakebi added an edge to rG2f857eadf5d4: [AMDGPU] Use script to generate atomic optimizations test: D75855: [AMDGPU] Use script to generate atomic optimizations test.
Mar 11 2020, 3:47 AM
Flakebi closed D75855: [AMDGPU] Use script to generate atomic optimizations test.
Mar 11 2020, 3:47 AM · Restricted Project
Flakebi added 1 commit(s) for D75855: [AMDGPU] Use script to generate atomic optimizations test: rG2f857eadf5d4: [AMDGPU] Use script to generate atomic optimizations test.
Mar 11 2020, 3:47 AM · Restricted Project
Flakebi added a comment to D65088: [AMDGPU][RFC] New llvm.amdgcn.ballot intrinsic.

The code generation for test2 is currently not optimal:

%trunc = trunc i32 %x to i1
%ballot = call i64 @llvm.amdgcn.ballot.i64(i1 %trunc)

generates

v_and_b32_e32 v0, 1, v0
v_cmp_eq_u32_e32 vcc, 1, v0
s_and_b64 s[4:5], vcc, exec

where the first compare stems from the truncate.

I'm confused by this. What is the optimal code generation?

Mar 11 2020, 3:47 AM · Restricted Project
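
The instruction sequence quoted in the comment above (mask the low bit, compare against 1, AND with exec) can be modelled in a few lines. This is only a simplified sketch of the ballot semantics as the comment describes them, not the backend's implementation: the function name `ballot`, the list-of-integers representation of per-lane values, and the four-lane example are all assumptions made for illustration.

```python
def ballot(predicates, exec_mask):
    """Simplified model of a wavefront ballot.

    Bit i of the result is set when lane i is active in exec_mask and the
    low bit of that lane's value is 1. The names here are hypothetical.
    """
    vcc = 0
    for lane, value in enumerate(predicates):
        truncated = value & 1          # v_and_b32: trunc i32 -> i1
        bit = int(truncated == 1)      # v_cmp_eq_u32: per-lane compare
        vcc |= bit << lane             # vcc holds one bit per lane
    return vcc & exec_mask             # s_and_b64: drop inactive lanes


# Four lanes for brevity; lane 2 is inactive in the exec mask.
mask = ballot([1, 0, 3, 5], exec_mask=0b1011)
print(bin(mask))  # 0b1001
```

Lanes 0, 2 and 3 have an odd value, but lane 2 is masked out by exec, so only bits 0 and 3 remain set.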
Flakebi added inline comments to D65088: [AMDGPU][RFC] New llvm.amdgcn.ballot intrinsic.
Mar 11 2020, 3:47 AM · Restricted Project
Flakebi added a parent revision for D75976: [AMDGPU] Optimize AtomicOptimizer: D65088: [AMDGPU][RFC] New llvm.amdgcn.ballot intrinsic.
Mar 11 2020, 2:19 AM · Restricted Project
Flakebi added a child revision for D65088: [AMDGPU][RFC] New llvm.amdgcn.ballot intrinsic: D75976: [AMDGPU] Optimize AtomicOptimizer.
Mar 11 2020, 2:19 AM · Restricted Project
Flakebi created D75976: [AMDGPU] Optimize AtomicOptimizer.
Mar 11 2020, 2:19 AM · Restricted Project
Flakebi added parent revisions for D65088: [AMDGPU][RFC] New llvm.amdgcn.ballot intrinsic: D75855: [AMDGPU] Use script to generate atomic optimizations test, D75857: [AMDGPU] Fix using physical registers in vector instructions.
Mar 11 2020, 1:35 AM · Restricted Project
Flakebi updated the diff for D65088: [AMDGPU][RFC] New llvm.amdgcn.ballot intrinsic.

Address Nicolai’s comments and implement this as DAG combines and TableGen patterns.

Mar 11 2020, 1:35 AM · Restricted Project
Flakebi added a child revision for D75855: [AMDGPU] Use script to generate atomic optimizations test: D65088: [AMDGPU][RFC] New llvm.amdgcn.ballot intrinsic.
Mar 11 2020, 1:35 AM · Restricted Project
Flakebi added a child revision for D75857: [AMDGPU] Fix using physical registers in vector instructions: D65088: [AMDGPU][RFC] New llvm.amdgcn.ballot intrinsic.
Mar 11 2020, 1:35 AM · Restricted Project
Flakebi commandeered D65088: [AMDGPU][RFC] New llvm.amdgcn.ballot intrinsic.
Mar 11 2020, 1:35 AM · Restricted Project

Mar 10 2020

Flakebi created D75913: [AMDGPU] Use progbits type for .AMDGPU.disasm section.
Mar 10 2020, 5:48 AM · Restricted Project

Mar 9 2020

Flakebi added a comment to D75857: [AMDGPU] Fix using physical registers in vector instructions.

I’ll add a test case. Yes, this is related to the atomic optimizer and ballot intrinsic. There we get e.g. %0:vgpr_32 = V_MBCNT_LO_U32_B32_e64 $exec_lo, 0, implicit $exec.

Mar 9 2020, 10:46 AM · Restricted Project
Flakebi added a comment to D75855: [AMDGPU] Use script to generate atomic optimizations test.

No specific reason. I fixed the other tests manually, but this one had many operations with exec, so I wanted to make sure I didn’t break anything there.

Mar 9 2020, 10:46 AM · Restricted Project
Flakebi updated the diff for D75857: [AMDGPU] Fix using physical registers in vector instructions.

Remove accidentally added empty line

Mar 9 2020, 9:41 AM · Restricted Project
Flakebi created D75857: [AMDGPU] Fix using physical registers in vector instructions.
Mar 9 2020, 9:41 AM · Restricted Project
Flakebi created D75855: [AMDGPU] Use script to generate atomic optimizations test.
Mar 9 2020, 9:41 AM · Restricted Project

Feb 17 2020

Flakebi updated the diff for D74600: [AMDGPU] Don’t marke the .note section as ALLOC.

Fix test, I hope the existing test suffices.

Feb 17 2020, 1:52 AM · Restricted Project
Flakebi commandeered D74600: [AMDGPU] Don’t marke the .note section as ALLOC.
Feb 17 2020, 1:52 AM · Restricted Project

Oct 14 2019

Flakebi added a comment to D44077: Clear the stack protector after checking it.

Is there anything missing for this pull request?

Oct 14 2019, 9:20 AM · Restricted Project

May 18 2019

Flakebi updated the diff for D44077: Clear the stack protector after checking it.

The ssp is now set to zero in the SDAG variant and when a stack guard check function is used. I am not familiar with the selection DAG so please correct me if it can be improved.
I also added a test as requested.

May 18 2019, 5:33 AM · Restricted Project

Sep 6 2018

Flakebi added a comment to D44077: Clear the stack protector after checking it.

It took a while, but the code, figures, and a more detailed explanation are now online: https://flakebi.de/uni-items/ba/#clear-ssp
Even with this microbenchmark, I was not able to measure a performance difference compared to LLVM without this patch.

Sep 6 2018, 11:06 PM · Restricted Project

Mar 4 2018

Flakebi created D44077: Clear the stack protector after checking it.
Mar 4 2018, 7:11 AM · Restricted Project