dfukalov (Daniil Fukalov)
Compiler Engineer at AMD

Projects

User does not belong to any projects.

User Details

User Since
Mar 27 2014, 8:40 AM (269 w, 3 d)

Recent Activity

Sep 25 2018

dfukalov committed rL343004: [RegAllocGreedy] avoid using physreg candidates that cannot be correctly spilled.
Sep 25 2018, 11:41 AM
dfukalov closed D52052: [RegAllocGreedy] avoid using physreg candidates that cannot be correctly spilled.
Sep 25 2018, 11:41 AM · Restricted Project

Sep 17 2018

dfukalov added a comment to D52052: [RegAllocGreedy] avoid using physreg candidates that cannot be correctly spilled.

My understanding is that you don't want split code to occur right before this instruction, but I would like to understand exactly what is not allowed.

The AMDGPU target has a special exec mask register that specifies which lanes execute the current vector instruction. A basic block created from an "if/else" construct with a vector condition may therefore contain exec mask restore code in its preamble.
For example, an S_OR_SAVEEXEC_B64 instruction at the start of a BB sets (restores) the exec mask register value so that the "else" block executes correctly. Spill code that SplitKit tries to insert before such an instruction may be lowered to a memory operation, which can then execute incorrectly with the wrong exec mask.
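
To make the hazard concrete, here is a minimal toy model in Python, not AMDGPU semantics. It assumes a 4-lane wavefront, a per-lane spill slot, and a store that writes only the lanes enabled in the exec mask. The names `masked_store`, `stale_mask`, and `restored_mask` are hypothetical and exist only for this illustration.

```python
def masked_store(slot, values, exec_mask):
    """Write values[lane] into slot[lane] only for lanes enabled in exec_mask."""
    for lane in range(len(values)):
        if exec_mask & (1 << lane):
            slot[lane] = values[lane]


LANES = 4
stale_mask = 0b0011     # assumed mask still active at the very start of the block
restored_mask = 0b1111  # assumed mask after the preamble restores exec

vgpr = [10, 11, 12, 13]  # per-lane register contents that must be spilled

# Spill placed before the exec restore: only the stale lanes are written.
slot_before = [0] * LANES
masked_store(slot_before, vgpr, stale_mask)

# Spill placed after the exec restore: every lane is written.
slot_after = [0] * LANES
masked_store(slot_after, vgpr, restored_mask)

print(slot_before)  # [10, 11, 0, 0]
print(slot_after)   # [10, 11, 12, 13]
```

In this sketch a later reload under the restored mask would read zeros for lanes 2 and 3 from `slot_before`, which is the kind of wrong result the comment describes.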

Sep 17 2018, 12:57 PM · Restricted Project

Sep 14 2018

dfukalov added a comment to D52052: [RegAllocGreedy] avoid using physreg candidates that cannot be correctly spilled.

Hi,

Could you elaborate on that part?

But it is not possible since it can generate incorrect code in terms of exec mask.

I don't understand the constraints and thus what you are trying to achieve.

Cheers,
-Quentin

While splitting a region, SplitEditor can try to insert a COPY spill instruction before S_OR_SAVEEXEC_B64, since it interferes with the physreg candidate that is already assigned to the destination of this exec mask restore instruction. Unfortunately, such a COPY may be lowered to a memory operation, which will not be correct because the exec mask is not restored until the preamble has executed.

Sep 14 2018, 10:50 AM · Restricted Project
dfukalov updated the diff for D52052: [RegAllocGreedy] avoid using physreg candidates that cannot be correctly spilled.
Sep 14 2018, 10:37 AM · Restricted Project

Sep 13 2018

dfukalov created D52052: [RegAllocGreedy] avoid using physreg candidates that cannot be correctly spilled.
Sep 13 2018, 12:51 PM · Restricted Project

Aug 6 2018

dfukalov committed rL339029: Fix typo in the MSVC Visualizer for SmallVector class.
Aug 6 2018, 9:47 AM

Jun 26 2018

dfukalov abandoned D48558: [AMDGPU] fix for register coalescer.

The testcase compiles fine with ToT at r335473.

Jun 26 2018, 5:27 AM · Restricted Project

Jun 25 2018

dfukalov created D48558: [AMDGPU] fix for register coalescer.
Jun 25 2018, 12:31 PM · Restricted Project

Jun 21 2018

dfukalov added a comment to D48416: [StackSlotColoring] Fixed handling of StackID.

I have a fairly small .ll file that triggers an assertion in SIFrameLowering.cpp, if you need it.

Jun 21 2018, 3:47 AM

Jun 20 2018

dfukalov added a comment to D45968: StackSlotColoring: Decide colors per stack ID.

The patch fixes the issue I found in the Blender application.

Jun 20 2018, 10:47 AM

Jun 8 2018

dfukalov committed rL334301: [AMDGPU] Inline asm - added i16, half and i128 types support.
Jun 8 2018, 9:33 AM
dfukalov closed D44920: [AMDGPU] Inline asm - added i16, half and i128 types support.
Jun 8 2018, 9:33 AM · Restricted Project
dfukalov committed rL334300: reapply r334209 with fixes for harfbuzz in Chromium.
Jun 8 2018, 9:27 AM

Jun 7 2018

dfukalov committed rL334209: [LSR] Check yet more intrinsic pointer operands.
Jun 7 2018, 10:35 AM
dfukalov closed D47794: [LSR] Check yet more intrinsic pointer operands.
Jun 7 2018, 10:35 AM
dfukalov updated the diff for D47794: [LSR] Check yet more intrinsic pointer operands.

test updated as requested

Jun 7 2018, 7:10 AM

Jun 5 2018

dfukalov created D47794: [LSR] Check yet more intrinsic pointer operands.
Jun 5 2018, 12:16 PM

May 21 2018

dfukalov committed rL332848: [AMDGPU] fixes for lds f32 builtins.
May 21 2018, 9:25 AM
dfukalov committed rC332848: [AMDGPU] fixes for lds f32 builtins.
May 21 2018, 9:25 AM
dfukalov closed D43281: [AMDGPU] fixes for lds f32 builtins.
May 21 2018, 9:24 AM · Restricted Project

Apr 13 2018

dfukalov updated the diff for D43281: [AMDGPU] fixes for lds f32 builtins.
Apr 13 2018, 8:08 AM · Restricted Project

Apr 12 2018

dfukalov added a comment to D44920: [AMDGPU] Inline asm - added i16, half and i128 types support.

ping...

Apr 12 2018, 11:40 AM · Restricted Project

Apr 3 2018

dfukalov added a comment to D43281: [AMDGPU] fixes for lds f32 builtins.

ping...

Apr 3 2018, 8:09 AM · Restricted Project

Mar 30 2018

dfukalov added inline comments to D44920: [AMDGPU] Inline asm - added i16, half and i128 types support.
Mar 30 2018, 3:54 AM · Restricted Project
dfukalov updated the diff for D44920: [AMDGPU] Inline asm - added i16, half and i128 types support.

diff updated as requested

Mar 30 2018, 3:54 AM · Restricted Project

Mar 27 2018

dfukalov created D44920: [AMDGPU] Inline asm - added i16, half and i128 types support.
Mar 27 2018, 3:30 AM · Restricted Project
dfukalov added a comment to D43281: [AMDGPU] fixes for lds f32 builtins.

ping...

Mar 27 2018, 3:16 AM · Restricted Project

Mar 19 2018

dfukalov added a comment to D43281: [AMDGPU] fixes for lds f32 builtins.

My real question was what happens if you put 11 in the description string?

In this case, CanT.getAddressSpace() returns the target addrspace value "20" (again shifted in the enum by 9 == LangAS::FirstTargetAddressSpace).

Mar 19 2018, 4:32 AM · Restricted Project

Mar 9 2018

dfukalov added a comment to D43281: [AMDGPU] fixes for lds f32 builtins.

ping...

Mar 9 2018, 1:40 PM · Restricted Project

Mar 2 2018

dfukalov updated the diff for D43281: [AMDGPU] fixes for lds f32 builtins.

addrspace specifications are kept in description strings

Mar 2 2018, 7:44 AM · Restricted Project

Mar 1 2018

dfukalov added a comment to D43281: [AMDGPU] fixes for lds f32 builtins.

The problem is that if I set addrspace "2" in the description string, CanT.getAddressSpace() returns the target addrspace value "11" (shifted in the enum), which is then compared with the input LangAS addrspace ("2", "opencl_local" in our case).
So I cannot put any number in the description string that will be equal to the LangAS addrspace "opencl_local".
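
The arithmetic can be checked with a small Python sketch. It is an illustration only, not Clang's code. It assumes the mapping is a plain offset of 9, as the 2 → 11 figure above implies, and that "opencl_local" has the LangAS value 2 quoted in the comment. The names `FIRST_TARGET_ADDRESS_SPACE` and `lang_as_from_description` are hypothetical.

```python
FIRST_TARGET_ADDRESS_SPACE = 9  # assumed offset, inferred from 2 -> 11
OPENCL_LOCAL = 2                # LangAS value quoted in the comment


def lang_as_from_description(number):
    """Map a description-string address space number to its shifted enum value."""
    return number + FIRST_TARGET_ADDRESS_SPACE


# Every non-negative number in a description string lands at 9 or above,
# so none of them can compare equal to OPENCL_LOCAL.
reachable = {lang_as_from_description(n) for n in range(64)}

print(lang_as_from_description(2))  # 11
print(OPENCL_LOCAL in reachable)    # False
```

Under these assumptions the smallest reachable value is 9, so no number written in the description string can match the language-level value 2.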

Mar 1 2018, 9:56 AM · Restricted Project

Feb 26 2018

dfukalov added a comment to D43281: [AMDGPU] fixes for lds f32 builtins.

Can’t you just change the description to be the LangAS value? I also thought these happened to be the same already

Feb 26 2018, 7:29 AM · Restricted Project

Feb 24 2018

dfukalov added a comment to D43281: [AMDGPU] fixes for lds f32 builtins.

ping...

Feb 24 2018, 3:06 PM · Restricted Project

Feb 15 2018

dfukalov added inline comments to D43281: [AMDGPU] fixes for lds f32 builtins.
Feb 15 2018, 1:59 PM · Restricted Project
dfukalov updated the diff for D43281: [AMDGPU] fixes for lds f32 builtins.

diff updated as requested by reviewer

Feb 15 2018, 1:56 PM · Restricted Project

Feb 14 2018

dfukalov created D43281: [AMDGPU] fixes for lds f32 builtins.
Feb 14 2018, 3:16 AM · Restricted Project

Feb 6 2018

dfukalov added a comment to D42689: [SCEV] Fix threshold limit check.

any test and/or example of the case?

Feb 6 2018, 8:41 AM

Feb 4 2018

dfukalov committed rC324201: Recommit rL323890: [AMDGPU] Add ds_fadd, ds_fmin, ds_fmax builtins functions.
Feb 4 2018, 2:34 PM
dfukalov committed rL324201: Recommit rL323890: [AMDGPU] Add ds_fadd, ds_fmin, ds_fmax builtins functions.
Feb 4 2018, 2:34 PM

Jan 31 2018

dfukalov committed rL323896: Revert "[AMDGPU] Add ds_fadd, ds_fmin, ds_fmax builtins functions".
Jan 31 2018, 10:52 AM
dfukalov committed rC323896: Revert "[AMDGPU] Add ds_fadd, ds_fmin, ds_fmax builtins functions".
Jan 31 2018, 10:51 AM
dfukalov committed rL323890: [AMDGPU] Add ds_fadd, ds_fmin, ds_fmax builtins functions.
Jan 31 2018, 8:59 AM
dfukalov committed rC323890: [AMDGPU] Add ds_fadd, ds_fmin, ds_fmax builtins functions.
Jan 31 2018, 8:59 AM
dfukalov closed D42578: [AMDGPU] Add ds_fadd, ds_fmin, ds_fmax builtins functions.
Jan 31 2018, 8:59 AM · Restricted Project

Jan 29 2018

dfukalov updated the diff for D42578: [AMDGPU] Add ds_fadd, ds_fmin, ds_fmax builtins functions.

fixed builtins descriptions

Jan 29 2018, 9:48 AM · Restricted Project

Jan 26 2018

dfukalov updated the diff for D42578: [AMDGPU] Add ds_fadd, ds_fmin, ds_fmax builtins functions.

Sorry, missed them

Jan 26 2018, 8:31 AM · Restricted Project
dfukalov created D42578: [AMDGPU] Add ds_fadd, ds_fmin, ds_fmax builtins functions.
Jan 26 2018, 7:25 AM · Restricted Project
dfukalov committed rL323516: [AMDGPU] fix LDS f32 intrinsics.
Jan 26 2018, 3:11 AM
dfukalov closed D42383: [AMDGPU] fix LDS f32 intrinsics.
Jan 26 2018, 3:11 AM · Restricted Project

Jan 22 2018

dfukalov created D42383: [AMDGPU] fix LDS f32 intrinsics.
Jan 22 2018, 9:51 AM · Restricted Project

Jan 17 2018

dfukalov committed rL322656: [AMDGPU] add LDS f32 intrinsics.
Jan 17 2018, 6:06 AM
dfukalov closed D37985: [AMDGPU] add LDS f32 intrinsics.
Jan 17 2018, 6:06 AM
dfukalov set the repository for D37985: [AMDGPU] add LDS f32 intrinsics to rL LLVM.
Jan 17 2018, 6:02 AM

Jan 15 2018

dfukalov updated the diff for D37985: [AMDGPU] add LDS f32 intrinsics.

diff updated according to latest comments

Jan 15 2018, 4:02 AM

Dec 20 2017

dfukalov added a comment to D37985: [AMDGPU] add LDS f32 intrinsics.

ping

Dec 20 2017, 3:54 AM

Dec 13 2017

dfukalov updated the diff for D37985: [AMDGPU] add LDS f32 intrinsics.

updates as requested by reviewers

Dec 13 2017, 5:49 AM

Oct 25 2017

dfukalov closed D39125: [inlineasm] Fix crash when number of matched input constraint operands overflows signed char.
Oct 25 2017, 6:31 AM · Restricted Project
dfukalov added an edge to rL316574: [inlineasm] Fix crash when number of matched input constraint operands…: D39125: [inlineasm] Fix crash when number of matched input constraint operands overflows signed char.
Oct 25 2017, 6:19 AM
dfukalov added 1 commit(s) for D39125: [inlineasm] Fix crash when number of matched input constraint operands overflows signed char: rL316574: [inlineasm] Fix crash when number of matched input constraint operands….
Oct 25 2017, 6:19 AM · Restricted Project
dfukalov set the repository for D39125: [inlineasm] Fix crash when number of matched input constraint operands overflows signed char to rL LLVM.
Oct 25 2017, 6:18 AM · Restricted Project
dfukalov committed rL316574: [inlineasm] Fix crash when number of matched input constraint operands….
Oct 25 2017, 5:51 AM

Oct 23 2017

dfukalov updated the diff for D39125: [inlineasm] Fix crash when number of matched input constraint operands overflows signed char.
Oct 23 2017, 9:20 AM · Restricted Project
dfukalov updated the diff for D39125: [inlineasm] Fix crash when number of matched input constraint operands overflows signed char.

Sorry for incomplete diff

Oct 23 2017, 9:18 AM · Restricted Project

Oct 21 2017

dfukalov updated the diff for D39125: [inlineasm] Fix crash when number of matched input constraint operands overflows signed char.
Oct 21 2017, 2:28 PM · Restricted Project

Oct 20 2017

dfukalov created D39125: [inlineasm] Fix crash when number of matched input constraint operands overflows signed char.
Oct 20 2017, 8:29 AM · Restricted Project

Sep 29 2017

dfukalov accepted D38325: [AMDGPU] Set fast-math flags on functions given the options.
Sep 29 2017, 3:51 PM
dfukalov added a comment to D38325: [AMDGPU] Set fast-math flags on functions given the options.

otherwise LGTM

Sep 29 2017, 3:24 PM

Sep 20 2017

dfukalov added a comment to D37985: [AMDGPU] add LDS f32 intrinsics.

Am I right that, since we should have almost the same processing as for the atomic inc intrinsics,
it would be a better idea to define the ds_add/min/max intrinsics the same way as AMDGPUAtomicIncIntrin (or unify them),
and then update the AMDGPU BE to process these ds_ intrinsics the same way as atomic inc/dec?

Sep 20 2017, 12:14 PM

Sep 18 2017

dfukalov added inline comments to D37985: [AMDGPU] add LDS f32 intrinsics.
Sep 18 2017, 11:42 AM
dfukalov updated the diff for D37985: [AMDGPU] add LDS f32 intrinsics.
Sep 18 2017, 11:40 AM
dfukalov created D37985: [AMDGPU] add LDS f32 intrinsics.
Sep 18 2017, 11:17 AM

Sep 4 2017

dfukalov created D37438: Fix segfault in FlattenCFG.
Sep 4 2017, 8:22 AM

Feb 6 2017

dfukalov committed rL294181: [SCEV] limit recursion depth and operands number in getAddExpr.
Feb 6 2017, 4:49 AM
dfukalov closed D28158: [SCEV] limit recursion depth and operands number in getAddExpr by committing rL294181: [SCEV] limit recursion depth and operands number in getAddExpr.
Feb 6 2017, 4:49 AM

Feb 3 2017

dfukalov added a comment to D28158: [SCEV] limit recursion depth and operands number in getAddExpr.

Hi Sanjoy,

Feb 3 2017, 6:15 AM

Jan 27 2017

dfukalov added a comment to D28158: [SCEV] limit recursion depth and operands number in getAddExpr.

ping

Jan 27 2017, 1:53 AM

Jan 26 2017

dfukalov committed rL293176: [SCEV] Introduce add operation inlining limit.
Jan 26 2017, 5:44 AM
dfukalov closed D28812: [SCEV] Introduce add operation inlining limit by committing rL293176: [SCEV] Introduce add operation inlining limit.
Jan 26 2017, 5:44 AM

Jan 24 2017

dfukalov added a comment to D28812: [SCEV] Introduce add operation inlining limit.

ping...

Jan 24 2017, 6:13 AM

Jan 19 2017

dfukalov updated the diff for D28158: [SCEV] limit recursion depth and operands number in getAddExpr.
  1. add expr inlining limit moved to separate change https://reviews.llvm.org/D28812
  2. removed depth parameter from wrapper functions
  3. added helper function to early return
  4. refined test to use func params instead of undef
Jan 19 2017, 10:57 AM

Jan 18 2017

dfukalov retitled D28812: [SCEV] Introduce add operation inlining limit from [SCEV] Add add operation inlining limit to [SCEV] Introduce add operation inlining limit.
Jan 18 2017, 2:44 AM

Jan 17 2017

dfukalov created D28812: [SCEV] Introduce add operation inlining limit.
Jan 17 2017, 9:53 AM

Jan 11 2017

dfukalov added a comment to D28158: [SCEV] limit recursion depth and operands number in getAddExpr.

ping...

Jan 11 2017, 5:49 AM

Jan 9 2017

dfukalov added inline comments to D28158: [SCEV] limit recursion depth and operands number in getAddExpr.
Jan 9 2017, 3:54 AM
dfukalov added a comment to D28158: [SCEV] limit recursion depth and operands number in getAddExpr.

What do you mean exactly? getMulExpr does not have a depth parameter like this. We have a comparison depth limit and a depth limit on SimplifyICmpOperands.

The second point is about the new AddOpsInlineThreshold parameter. It is almost the same as MulOpsInlineThreshold used in getMulExpr (line 2608). This part of the change is independent of the recursion depth limit, but it is proposed for the same purpose: to cut the nearly unbounded time spent processing very long expressions.
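
The general idea of an operand inlining limit can be sketched in Python. This is a toy model, not the getAddExpr implementation. It assumes an add expression is a tuple `("add", [operands])`, and that a nested add is spliced into its parent only while both operand lists are within the threshold. The function name `flatten_add` is hypothetical.

```python
def flatten_add(ops, inline_threshold):
    """Inline nested ("add", [...]) operands until a list exceeds inline_threshold."""
    out = list(ops)
    i = 0
    while i < len(out):
        op = out[i]
        is_add = isinstance(op, tuple) and op[0] == "add"
        within_limit = (len(out) <= inline_threshold
                        and is_add and len(op[1]) <= inline_threshold)
        if within_limit:
            # Splice the nested operands in place and re-examine position i.
            out[i:i + 1] = op[1]
        else:
            i += 1
    return out


nested = ["a", ("add", ["b", ("add", ["c", "d"])])]

print(flatten_add(nested, 500))  # ['a', 'b', 'c', 'd']
print(flatten_add(nested, 1))    # nested add is left as-is
```

With a generous threshold the nested adds are fully flattened. With a small one the nested expression is kept as a single operand, which bounds the work done on very long operand lists.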

Jan 9 2017, 3:50 AM

Dec 29 2016

dfukalov retitled D28158: [SCEV] limit recursion depth and operands number in getAddExpr from to [SCEV] limit recursion depth and operands number in getAddExpr.
Dec 29 2016, 7:42 AM

Nov 28 2016

dfukalov committed rL288042: [CMAKE] fix LLVM_OPTIMIZED_TABLEGEN for Visual Studio.
Nov 28 2016, 9:22 AM
dfukalov closed D27135: [CMAKE] fix LLVM_OPTIMIZED_TABLEGEN for Visual Studio by committing rL288042: [CMAKE] fix LLVM_OPTIMIZED_TABLEGEN for Visual Studio.
Nov 28 2016, 9:22 AM
dfukalov added inline comments to D27135: [CMAKE] fix LLVM_OPTIMIZED_TABLEGEN for Visual Studio.
Nov 28 2016, 7:11 AM
dfukalov updated the diff for D27135: [CMAKE] fix LLVM_OPTIMIZED_TABLEGEN for Visual Studio.

refined with Chris's suggestion

Nov 28 2016, 7:10 AM

Nov 25 2016

dfukalov updated subscribers of D27135: [CMAKE] fix LLVM_OPTIMIZED_TABLEGEN for Visual Studio.
Nov 25 2016, 9:01 AM
dfukalov retitled D27135: [CMAKE] fix LLVM_OPTIMIZED_TABLEGEN for Visual Studio from to [CMAKE] fix LLVM_OPTIMIZED_TABLEGEN for Visual Studio.
Nov 25 2016, 9:00 AM

Nov 17 2016

dfukalov committed rL287232: [SCEV] limit recursion depth of CompareSCEVComplexity.
Nov 17 2016, 8:17 AM
dfukalov closed D26389: [SCEV] limit recursion depth of CompareSCEVComplexity by committing rL287232: [SCEV] limit recursion depth of CompareSCEVComplexity.
Nov 17 2016, 8:17 AM

Nov 16 2016

dfukalov committed rL287116: test commit, changed tab to spaces, NFC.
Nov 16 2016, 8:51 AM

Nov 14 2016

dfukalov added inline comments to D26389: [SCEV] limit recursion depth of CompareSCEVComplexity.
Nov 14 2016, 3:31 PM
dfukalov added inline comments to D26389: [SCEV] limit recursion depth of CompareSCEVComplexity.
Nov 14 2016, 2:56 PM
dfukalov added a comment to D26389: [SCEV] limit recursion depth of CompareSCEVComplexity.

Hi Sanjoy, would you please check the updated diff and test?

Nov 14 2016, 7:54 AM