
Flakebi (Sebastian Neubauer)
User

Projects

User does not belong to any projects.

User Details

User Since
Oct 13 2017, 7:01 AM (216 w, 2 d)

Recent Activity

Mon, Nov 22

Flakebi abandoned D76427: [AMDGPU][RFC] Use default value for PrivateLabelPrefix.

The change of PrivateLabelPrefix was pushed in D114273.

Mon, Nov 22, 1:24 AM · Restricted Project

May 13 2021

Flakebi accepted D99038: AMDGPU/GlobalISel: Implement tail calls.
May 13 2021, 2:27 AM · Restricted Project

May 12 2021

Flakebi added a comment to D99038: AMDGPU/GlobalISel: Implement tail calls.

I left a nit inline. Apart from that, LGTM.

May 12 2021, 9:04 AM · Restricted Project

Mar 22 2021

Flakebi added a comment to D99038: AMDGPU/GlobalISel: Implement tail calls.

It would be nice if we could put the code shared with AArch64 in some generic place instead of duplicating it.
Other than that, looks good.

Mar 22 2021, 7:25 AM · Restricted Project
Flakebi accepted D98630: AMDGPU: Allow tail calls for amdgpu_gfx functions.

LGTM, thanks

Mar 22 2021, 4:36 AM · Restricted Project

Feb 24 2021

Flakebi accepted D95619: [AMDGPU] Add more PAL metadata register names.

If these registers keep changing, we probably want to switch between architectures at some point. But it’s easier to look at (slightly off) names than at hex numbers, so LGTM.

Feb 24 2021, 5:30 AM · Restricted Project

Jan 29 2021

Flakebi added a comment to D95619: [AMDGPU] Add more PAL metadata register names.

We have slight problems when it comes to differences between hardware versions.
COMPUTE_USER_ACCUM_1 on gfx9 has the same number as COMPUTE_SHADER_CHKSUM on gfx10.
Also, some have slightly different names. E.g. IA_MULTI_VGT_PARAM for gfx9 is called IA_MULTI_VGT_PARAM_PIPED for gfx10.
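
One way to picture the "switch between architectures" idea is a name table keyed by hardware generation. The following is only a minimal sketch: the offsets are made-up placeholders rather than real register numbers, `register_name` is a hypothetical helper, and it assumes the renamed register keeps the same offset on both generations.

```python
# Hypothetical per-generation register name tables.
# The offsets are placeholders for illustration, NOT real hardware values.
SHARED_OFFSET = 0x1000      # placeholder: one number, two meanings
VGT_PARAM_OFFSET = 0x2000   # placeholder: same register, renamed on gfx10

REGISTER_NAMES = {
    "gfx9": {
        SHARED_OFFSET: "COMPUTE_USER_ACCUM_1",
        VGT_PARAM_OFFSET: "IA_MULTI_VGT_PARAM",
    },
    "gfx10": {
        SHARED_OFFSET: "COMPUTE_SHADER_CHKSUM",
        VGT_PARAM_OFFSET: "IA_MULTI_VGT_PARAM_PIPED",
    },
}


def register_name(arch: str, offset: int) -> str:
    """Return the register name for this generation, or the hex number if unknown."""
    names = REGISTER_NAMES.get(arch, {})
    return names.get(offset, f"{offset:#x}")
```

Falling back to the hex number keeps the output usable for registers that have no entry for the selected generation.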

Jan 29 2021, 9:12 AM · Restricted Project

Jan 27 2021

Flakebi accepted D94429: AMDGPU: Move handling of allocation of fixed ABI inputs.
Jan 27 2021, 12:11 AM · Restricted Project

Jan 26 2021

Flakebi added inline comments to D94429: AMDGPU: Move handling of allocation of fixed ABI inputs.
Jan 26 2021, 8:04 AM · Restricted Project

Jan 12 2021

Flakebi added a comment to D94429: AMDGPU: Move handling of allocation of fixed ABI inputs.

Also stop passing them for amdgpu_gfx, since the DAG path seems to skip these. I'm unclear on what amdgpu_gfx's expectations are.

Jan 12 2021, 5:24 AM · Restricted Project

Dec 23 2020

Flakebi added reviewers for D93708: [AMDGPU] Add a new Clamp Pattern to the GlobalISel Path.: arsenm, foad.

Looks good to me, I left some nits inline.
Someone who is more familiar with GlobalISel should review this.

Dec 23 2020, 2:12 AM · Restricted Project, Unknown Object (Project)

Nov 3 2020

Flakebi added a comment to D88540: [AMDGPU] Add amdgpu_gfx calling convention.

ping :)

Nov 3 2020, 6:18 AM · Restricted Project

Nov 2 2020

Flakebi added a comment to D89399: [AMDGPU] Set rsrc1 flags for graphics shaders.

ping

Nov 2 2020, 1:05 AM · Restricted Project

Oct 30 2020

Flakebi added inline comments to D90036: [AMDGPU] Emit stack frame size in metadata.
Oct 30 2020, 7:23 AM · Restricted Project
Flakebi added inline comments to D90036: [AMDGPU] Emit stack frame size in metadata.
Oct 30 2020, 6:39 AM · Restricted Project
Flakebi updated the diff for D90036: [AMDGPU] Emit stack frame size in metadata.

Add test with loop

Oct 30 2020, 6:38 AM · Restricted Project
Flakebi added inline comments to D90036: [AMDGPU] Emit stack frame size in metadata.
Oct 30 2020, 2:22 AM · Restricted Project

Oct 28 2020

Flakebi updated the diff for D89399: [AMDGPU] Set rsrc1 flags for graphics shaders.

Rebase

Oct 28 2020, 9:37 AM · Restricted Project
Flakebi updated the diff for D89388: [AMDGPU] Fix ieee mode default value.

Add pre-committed tests.

Oct 28 2020, 9:25 AM · Restricted Project

Oct 27 2020

Flakebi updated the diff for D88540: [AMDGPU] Add amdgpu_gfx calling convention.

Fix lld test failure

Oct 27 2020, 4:53 AM · Restricted Project
Flakebi updated the diff for D90036: [AMDGPU] Emit stack frame size in metadata.

Ah, the dynamically sized alloca provoked a message that it is unsupported.
An alloca in a branch works fine.

Oct 27 2020, 4:41 AM · Restricted Project

Oct 26 2020

Flakebi updated the diff for D90036: [AMDGPU] Emit stack frame size in metadata.

Fix code and add more tests.

Oct 26 2020, 9:28 AM · Restricted Project
Flakebi retitled D88540: [AMDGPU] Add amdgpu_gfx calling convention from [AMDGPU] Add amdgpu_gfx_callable calling convention to [AMDGPU] Add amdgpu_gfx calling convention.
Oct 26 2020, 9:14 AM · Restricted Project
Flakebi updated the diff for D88540: [AMDGPU] Add amdgpu_gfx calling convention.

Rebased

Oct 26 2020, 9:14 AM · Restricted Project
Flakebi accepted D89170: [AMDGPU] Use flat scratch instructions where available.

Looks good to me.
I tested it with the AMDVLK Vulkan driver (needs a PAL-specific patch) and a short Vulkan CTS test ran through fine (except for PAL-related failures).

Oct 26 2020, 8:30 AM · Restricted Project
Flakebi updated the diff for D89399: [AMDGPU] Set rsrc1 flags for graphics shaders.

Also set MEM_ORDERED and WGP_MODE for supported PGMRSrc1 registers.

Oct 26 2020, 3:21 AM · Restricted Project

Oct 23 2020

Flakebi requested review of D90036: [AMDGPU] Emit stack frame size in metadata.
Oct 23 2020, 6:15 AM · Restricted Project
Flakebi requested review of D90035: [AMDGPU] Emit new pal metadata by default.
Oct 23 2020, 5:56 AM · Restricted Project
Flakebi added inline comments to D88540: [AMDGPU] Add amdgpu_gfx calling convention.
Oct 23 2020, 2:54 AM · Restricted Project
Flakebi updated the diff for D88540: [AMDGPU] Add amdgpu_gfx calling convention.

Fix comments

Oct 23 2020, 2:53 AM · Restricted Project

Oct 21 2020

Flakebi updated the diff for D89804: [AMDGPU] Fix off by one in assert.

I don’t see a way to add a test case. It fails an assertion on Windows when compiling with msvc.

Oct 21 2020, 1:27 AM · Restricted Project

Oct 20 2020

Flakebi requested review of D89804: [AMDGPU] Fix off by one in assert.
Oct 20 2020, 9:28 AM · Restricted Project
Flakebi added inline comments to D89170: [AMDGPU] Use flat scratch instructions where available.
Oct 20 2020, 7:59 AM · Restricted Project
Flakebi added inline comments to D88540: [AMDGPU] Add amdgpu_gfx calling convention.
Oct 20 2020, 5:38 AM · Restricted Project
Flakebi updated the diff for D88540: [AMDGPU] Add amdgpu_gfx calling convention.

Disallow calls with C calling convention from shaders

Oct 20 2020, 5:36 AM · Restricted Project
Flakebi updated the diff for D88540: [AMDGPU] Add amdgpu_gfx calling convention.

Update from internal review comments.

Oct 20 2020, 4:54 AM · Restricted Project

Oct 19 2020

Flakebi added inline comments to D89217: [AMDGPU] Base getSubRegFromChannel on TableGen data.
Oct 19 2020, 5:46 AM · Restricted Project

Oct 16 2020

Flakebi added inline comments to D89170: [AMDGPU] Use flat scratch instructions where available.
Oct 16 2020, 3:57 AM · Restricted Project

Oct 14 2020

Flakebi requested review of D89399: [AMDGPU] Set rsrc1 flags for graphics shaders.
Oct 14 2020, 9:02 AM · Restricted Project
Flakebi requested review of D89388: [AMDGPU] Fix ieee mode default value.
Oct 14 2020, 5:18 AM · Restricted Project
Flakebi requested review of D89375: [AMDGPU] Add objdump invalid metadata testcase.
Oct 14 2020, 2:15 AM · Restricted Project

Oct 12 2020

Flakebi requested review of D89243: [AMDGPU] Print metadata on error.
Oct 12 2020, 8:02 AM · Restricted Project
Flakebi added a comment to D89217: [AMDGPU] Base getSubRegFromChannel on TableGen data.

Looks good, I have one comment.

Oct 12 2020, 7:10 AM · Restricted Project

Oct 8 2020

Flakebi updated the diff for D88291: [AMDGPU] Insert waterfall loops for divergent calls.

Fix return value handling; the COPYs following the call also need to be copied.

Oct 8 2020, 5:15 AM · Restricted Project

Oct 7 2020

Flakebi requested review of D88961: [AMDGPU] Use isLegalMUBUFImmOffset more.
Oct 7 2020, 5:49 AM · Restricted Project

Oct 6 2020

Flakebi requested review of D88904: [AMDGPU] Remove SIInstrInfo::calculateLDSSpillAddress.
Oct 6 2020, 8:09 AM · Restricted Project
Flakebi requested review of D88876: [AMDGPU] Fix gcc warnings.
Oct 6 2020, 1:48 AM · Restricted Project

Oct 2 2020

Flakebi added inline comments to D87704: [AMDGPU] Reduce stack pointer alignment.
Oct 2 2020, 2:31 AM · Restricted Project

Sep 30 2020

Flakebi updated the diff for D86270: [AMDGPU] Use tablegen for argument indices.

So someone has a preference :)

Sep 30 2020, 5:19 AM · Restricted Project
Flakebi added a comment to D86270: [AMDGPU] Use tablegen for argument indices.

friendly ping for review

Sep 30 2020, 4:58 AM · Restricted Project
Flakebi requested review of D88540: [AMDGPU] Add amdgpu_gfx calling convention.
Sep 30 2020, 1:37 AM · Restricted Project

Sep 29 2020

Flakebi added inline comments to D88291: [AMDGPU] Insert waterfall loops for divergent calls.
Sep 29 2020, 2:48 AM · Restricted Project
Flakebi updated the diff for D88291: [AMDGPU] Insert waterfall loops for divergent calls.

Remove debug dumps

Sep 29 2020, 2:42 AM · Restricted Project

Sep 25 2020

Flakebi requested review of D88291: [AMDGPU] Insert waterfall loops for divergent calls.
Sep 25 2020, 3:54 AM · Restricted Project

Sep 24 2020

Flakebi requested review of D88206: [AMDGPU] Fix v3f16 handling for getresinfo.
Sep 24 2020, 2:41 AM · Restricted Project

Sep 23 2020

Flakebi added a comment to D87674: [AMDGPU] Insert waitcnt after returning from call.

Thanks for the heads-up, I reverted it for now.

Sep 23 2020, 8:19 AM · Restricted Project

Sep 16 2020

Flakebi updated the diff for D84420: [AMDGPU] Add v3f16/v3i16 support to SDag.

Fix formatting

Sep 16 2020, 8:20 AM · Restricted Project
Flakebi updated the diff for D87674: [AMDGPU] Insert waitcnt after returning from call.

Add fixme that wait on function entry/return should depend on calling convention.

Sep 16 2020, 7:39 AM · Restricted Project
Flakebi updated the diff for D84420: [AMDGPU] Add v3f16/v3i16 support to SDag.

Improve is-widened check in CustomWidenLowerNode to determine if a value was widened or not.

Sep 16 2020, 5:05 AM · Restricted Project
Flakebi updated the diff for D84420: [AMDGPU] Add v3f16/v3i16 support to SDag.

Having this one use a different multiclass is weird looking. Why can't it directly use the same multiclass as the other cases?

I would expect this to look something like

class MTBUF_LoadIntrinsicPta<SDPatternOperator node, ValueType memvt, ValueType vt = memvt>
and then only override the vt in the weird v3 cases

Sep 16 2020, 4:47 AM · Restricted Project
Flakebi added inline comments to D86270: [AMDGPU] Use tablegen for argument indices.
Sep 16 2020, 1:33 AM · Restricted Project
Flakebi added inline comments to D87674: [AMDGPU] Insert waitcnt after returning from call.
Sep 16 2020, 1:14 AM · Restricted Project
Flakebi updated the diff for D87674: [AMDGPU] Insert waitcnt after returning from call.

Move the is-gs-done check to its own function.

Sep 16 2020, 1:12 AM · Restricted Project

Sep 15 2020

Flakebi added inline comments to D87674: [AMDGPU] Insert waitcnt after returning from call.
Sep 15 2020, 9:03 AM · Restricted Project
Flakebi updated the diff for D87674: [AMDGPU] Insert waitcnt after returning from call.

Improve comment as suggested

Sep 15 2020, 9:03 AM · Restricted Project
Flakebi added inline comments to D87704: [AMDGPU] Reduce stack pointer alignment.
Sep 15 2020, 8:48 AM · Restricted Project
Flakebi requested review of D87704: [AMDGPU] Reduce stack pointer alignment.
Sep 15 2020, 8:35 AM · Restricted Project
Flakebi added inline comments to D87674: [AMDGPU] Insert waitcnt after returning from call.
Sep 15 2020, 8:29 AM · Restricted Project
Flakebi updated the diff for D87674: [AMDGPU] Insert waitcnt after returning from call.

Use DebugLoc from call for waitcnt and return early.

Sep 15 2020, 8:29 AM · Restricted Project
Flakebi requested review of D87674: [AMDGPU] Insert waitcnt after returning from call.
Sep 15 2020, 1:15 AM · Restricted Project

Sep 1 2020

Flakebi added a comment to D86938: [AMDGPU] Fix offset for REL32_HI relocs.

Wow, good catch. Looks good to me.

Sep 1 2020, 7:55 AM · Restricted Project

Aug 21 2020

Flakebi accepted D86340: [AMDGPU, docs] Fix typos.

Looks good, thanks!

Aug 21 2020, 9:28 AM · Restricted Project
Flakebi updated the diff for D86270: [AMDGPU] Use tablegen for argument indices.

Right, changed unsigned to uint8_t for offsets in ImageDimIntrinsicInfo.

Aug 21 2020, 1:16 AM · Restricted Project

Aug 20 2020

Flakebi requested review of D86270: [AMDGPU] Use tablegen for argument indices.
Aug 20 2020, 2:12 AM · Restricted Project

Aug 18 2020

Flakebi accepted D84638: AMDGPU/GlobalISel: Select llvm.amdgcn.groupstaticsize.

LGTM

Aug 18 2020, 12:38 AM · Restricted Project

Aug 14 2020

Flakebi added inline comments to D84420: [AMDGPU] Add v3f16/v3i16 support to SDag.
Aug 14 2020, 2:20 AM · Restricted Project
Flakebi updated the diff for D84420: [AMDGPU] Add v3f16/v3i16 support to SDag.

Address review comments: Move patterns to SIInstrInfo.td and use MemoryVT.

Aug 14 2020, 2:14 AM · Restricted Project
Flakebi updated the diff for D85887: [AMDGPU] Add A16/G16 to InstCombine.

Preserve fast-math flags and add test that ensures a16 combining is not done on gfx8.

Aug 14 2020, 12:36 AM · Restricted Project

Aug 13 2020

Flakebi updated the diff for D85887: [AMDGPU] Add A16/G16 to InstCombine.

Fix review comments

Aug 13 2020, 8:45 AM · Restricted Project
Flakebi requested review of D85895: [AMDGPU] Enable .rodata for amdpal os.
Aug 13 2020, 5:28 AM · Restricted Project
Flakebi requested review of D85887: [AMDGPU] Add A16/G16 to InstCombine.
Aug 13 2020, 3:03 AM · Restricted Project

Jul 23 2020

Herald added a project to D84420: [AMDGPU] Add v3f16/v3i16 support to SDag: Restricted Project.
Jul 23 2020, 8:21 AM · Restricted Project
Flakebi added a comment to D81728: [InstCombine] Add target-specific inst combining.

Thanks for the notification @davezarzycki, an auto-bisecting bot is cool!

Jul 23 2020, 12:47 AM · Restricted Project, Unknown Object (Project), Restricted Project

Jul 21 2020

Flakebi added a comment to D84223: [AMDGPU] Don't combine memory intrs to v3i16.

I’m also trying to get it working properly (currently for SDag). I think I got the legalization/widening part working but I’m still trying to figure out how to select the right instruction patterns.

Jul 21 2020, 9:27 AM · Restricted Project
Flakebi updated the diff for D81728: [InstCombine] Add target-specific inst combining.

Rebased and fixed the triple for the Thumb2 tests as suggested.

Jul 21 2020, 3:14 AM · Restricted Project, Unknown Object (Project), Restricted Project
Herald added a project to D84223: [AMDGPU] Don't combine memory intrs to v3i16: Restricted Project.
Jul 21 2020, 2:14 AM · Restricted Project

Jul 17 2020

Flakebi updated the diff for D81728: [InstCombine] Add target-specific inst combining.

Here you go.

Jul 17 2020, 5:34 AM · Restricted Project, Unknown Object (Project), Restricted Project
Flakebi updated the diff for D81728: [InstCombine] Add target-specific inst combining.

Rebased and added some docs.

Jul 17 2020, 3:40 AM · Restricted Project, Unknown Object (Project), Restricted Project

Jul 13 2020

Flakebi added inline comments to D81728: [InstCombine] Add target-specific inst combining.
Jul 13 2020, 3:04 AM · Restricted Project, Unknown Object (Project), Restricted Project

Jul 10 2020

Flakebi added inline comments to D81728: [InstCombine] Add target-specific inst combining.
Jul 10 2020, 12:22 PM · Restricted Project, Unknown Object (Project), Restricted Project
Flakebi updated the diff for D81728: [InstCombine] Add target-specific inst combining.

Rebased (no conflicts this time).

Jul 10 2020, 4:18 AM · Restricted Project, Unknown Object (Project), Restricted Project

Jul 6 2020

Flakebi updated the diff for D81728: [InstCombine] Add target-specific inst combining.

Rebased and removed a few includes as suggested.
Make the TargetTransformInfo a private member of InstCombiner because it should not be used in general inst combines.
Move CreateOverflowTuple out of InstCombiner and make CreateNonTerminatorUnreachable static.

Jul 6 2020, 2:30 AM · Restricted Project, Unknown Object (Project), Restricted Project

Jun 30 2020

Flakebi updated the diff for D81728: [InstCombine] Add target-specific inst combining.

Rebased and call target-specific combining only for target-specific intrinsics as suggested.
Add Function::isTargetIntrinsic() for this purpose.

Jun 30 2020, 5:55 AM · Restricted Project, Unknown Object (Project), Restricted Project

Jun 25 2020

Flakebi updated the diff for D81728: [InstCombine] Add target-specific inst combining.

Rebased, so the automatic builds can run

Jun 25 2020, 11:20 AM · Restricted Project, Unknown Object (Project), Restricted Project
Flakebi updated the diff for D81728: [InstCombine] Add target-specific inst combining.

Adjust failing clang test, TargetIRAnalysis is run earlier now

Jun 25 2020, 10:13 AM · Restricted Project, Unknown Object (Project), Restricted Project

Jun 24 2020

Flakebi updated the diff for D81728: [InstCombine] Add target-specific inst combining.

Moved most target specific InstCombine parts to their respective targets.
The largest left-over part in InstCombineCalls.cpp is the code shared between ARM and AArch64. Is there a place where code for these targets is shared?

Jun 24 2020, 9:09 AM · Restricted Project, Unknown Object (Project), Restricted Project

Jun 22 2020

Flakebi added a comment to D81074: [TableGen] Add error messages.

friendly ping

Jun 22 2020, 3:43 AM · Restricted Project

Jun 18 2020

Flakebi added a comment to D81728: [InstCombine] Add target-specific inst combining.

Summarizing the comments, the important points are

  1. Everyone agrees on moving target specific stuff out of Transforms/InstCombine into target specific folders
  2. Keep running the instruction combining in the InstCombine pass, so the fixed-point iteration works
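
The second point can be illustrated with a toy rewriter. This is a sketch only, not how InstCombine is implemented (the real pass uses a worklist over IR instructions); the tuple expressions, the rule functions, and the `tgt.id` operation are all invented for illustration.

```python
def fold_add_zero(expr):
    """Generic rule: ("add", x, 0) simplifies to x."""
    if isinstance(expr, tuple) and len(expr) == 3 and expr[0] == "add" and expr[2] == 0:
        return expr[1]
    return expr


def fold_target_identity(expr):
    """Hypothetical target rule: ("tgt.id", x) simplifies to x."""
    if isinstance(expr, tuple) and len(expr) == 2 and expr[0] == "tgt.id":
        return expr[1]
    return expr


def run_to_fixed_point(expr, rules):
    """Apply the rules at the root until none of them changes the expression."""
    changed = True
    while changed:
        changed = False
        for rule in rules:
            new_expr = rule(expr)
            if new_expr != expr:
                expr = new_expr
                changed = True
    return expr


expr = ("tgt.id", ("add", ("tgt.id", "a"), 0))

# Both rule sets in one loop: each rewrite exposes work for the other.
combined = run_to_fixed_point(expr, [fold_add_zero, fold_target_identity])

# Generic rules first, target rules in a separate later run: work is left over.
separate = run_to_fixed_point(
    run_to_fixed_point(expr, [fold_add_zero]), [fold_target_identity]
)
```

In this toy setup the combined loop reduces the expression fully, while the separate runs stop early because the generic rule never gets to see what the target rule uncovered.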
Jun 18 2020, 3:14 AM · Restricted Project, Unknown Object (Project), Restricted Project

Jun 12 2020

Flakebi added a comment to D81728: [InstCombine] Add target-specific inst combining.

To add more context to this, the problem I am facing is that AMDGPU image intrinsics are usually called with float arguments. However, on some subtargets/hardware generations it is possible to call them with half arguments.
If LLVM is compiling for such a subtarget, it is beneficial to combine

%s32 = fpext half %s to float
call <4 x float> @llvm.amdgcn.image.sample.2d.v4f32.f32(…, float %s32, …)

into

call <4 x float> @llvm.amdgcn.image.sample.2d.v4f32.f16(…, half %s, …)
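
The shape of this fold can be modelled on a toy IR. The sketch below is not the LLVM implementation: the classes, the `has_a16` flag, and the single-coordinate call are simplifications invented for illustration, and the elided operands of the real intrinsic are not modelled.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Value:
    name: str
    type: str  # "half" or "float"


@dataclass(frozen=True)
class FPExt:
    src: Value  # the value being extended to float


@dataclass(frozen=True)
class ImageSample:
    intrinsic: str               # intrinsic name including the coordinate type suffix
    coord: Union[Value, FPExt]   # toy model: a single coordinate operand


def fold_fpext_into_sample(call: ImageSample, has_a16: bool) -> ImageSample:
    """Replace an fpext'ed half coordinate with the half value itself.

    `has_a16` is a hypothetical stand-in for "the subtarget accepts half
    arguments"; without it the call is returned unchanged.
    """
    coord = call.coord
    if (
        has_a16
        and isinstance(coord, FPExt)
        and coord.src.type == "half"
        and call.intrinsic.endswith(".f32")
    ):
        f16_name = call.intrinsic[: -len(".f32")] + ".f16"
        return ImageSample(f16_name, coord.src)
    return call


s = Value("%s", "half")
call = ImageSample("llvm.amdgcn.image.sample.2d.v4f32.f32", FPExt(s))
folded = fold_fpext_into_sample(call, has_a16=True)
unfolded = fold_fpext_into_sample(call, has_a16=False)
```

Gating the rewrite on a subtarget property is what makes it target-specific: a generic combine has no way to know whether the half variant is legal.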
Jun 12 2020, 5:53 AM · Restricted Project, Restricted Project, Restricted Project
Flakebi created D81728: [InstCombine] Add target-specific inst combining.
Jun 12 2020, 3:12 AM · Restricted Project, Restricted Project, Restricted Project